Sharing our Machine Learning Model for YouTube Video Similarity

Today we are pleased to announce that the YouTube video similarity models used in our recent RegretsReporter research are now available on Hugging Face.

The models are highly effective (~93% accuracy) at identifying whether two videos are similar as defined by our policy. We hope that these models will prove valuable to other researchers investigating YouTube.

This release includes two models: a model for cases in which video transcripts are available and a model for videos without transcripts.

Additionally, we have provided a Hugging Face Spaces demo that allows anyone to easily try the models. Just provide two YouTube video links to see how similar the two videos are according to the models. For your convenience, the demo also includes a few predefined example video pairs.

If these models are useful to you, we’d love to hear from you. Please write to [email protected].