Looking for new ways to play with Common Voice 8? The most diverse multilingual open speech corpus in the world, Common Voice 8 contains 87 languages and 200,000 different voices. You can now train an AI model just for your voice using CV and Coqui STT.
Previously, you could only download data by language. Thanks to our incredible open source contributor Alexandre Macabies, you can now download your own voice clips! All you need to do is make sure you have a profile set up, record yourself (and make voice AI more diverse and fairer in the process!), then hit the download button.
Why is this new feature important? You can now use your own data to make AI models understand you better. Our friends over at Coqui have shared a tutorial that takes you through creating your own personal Speech-to-Text model.
Check out the tutorial here! Coqui co-founder Josh Meyer explains how to use it:
'At Coqui 🐸, we use Common Voice all the time! We love that there’s so many diverse voices from folks all over the world, because it allows us to create machine learning models that understand all kinds of people. Diverse data teaches models to be less biased, and we want AI to understand everyone.
But what if you don’t need a model that understands everyone? What if you just want your own AI model – a model that understands you better than anyone else? If you’re a DIY hacker, a machine learning enthusiast, or if you just love open source, you’re in the right place 💚
You can easily teach a model to know your voice by “fine-tuning” a general model to your data. Now, with Common Voice and 🐸, this is possible for everyone!
If you follow the steps in this notebook, you too can create your own, personal 🐸 Speech-to-Text model.
Then take your new model (it’s open source!) and plug it into whatever application you like… the sky’s the limit!'
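Curious what preparing your clips for fine-tuning might look like? Here is a minimal sketch, not the notebook's own code. It assumes your download includes a Common Voice-style TSV with `path` and `sentence` columns, and that you have already converted the clips to WAV files with matching names; both are assumptions about your setup. It writes a CSV with the `wav_filename`, `wav_filesize`, and `transcript` columns that Coqui STT training reads. The function name `build_manifest` is made up for illustration.

```python
import csv
import os


def build_manifest(tsv_path, clips_dir, out_csv):
    """Write a training CSV (wav_filename, wav_filesize, transcript).

    Assumes tsv_path has 'path' and 'sentence' columns and that each clip
    has already been converted to a .wav file inside clips_dir.
    Returns the number of clips written.
    """
    written = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(out_csv, "w", newline="", encoding="utf-8") as dst:
        # QUOTE_NONE keeps quotation marks inside sentences intact.
        reader = csv.DictReader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        writer = csv.writer(dst)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for row in reader:
            # Swap the original extension (e.g. .mp3) for .wav.
            stem = os.path.splitext(row["path"])[0]
            wav_path = os.path.join(clips_dir, stem + ".wav")
            if not os.path.isfile(wav_path):
                continue  # skip clips that have not been converted
            writer.writerow(
                [wav_path, os.path.getsize(wav_path), row["sentence"].strip()]
            )
            written += 1
    return written
```

You could call it as `build_manifest("clips.tsv", "wavs", "train.csv")`, where all three paths are placeholders for your own files, and then follow the notebook for the training steps themselves.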
We’re excited to see how you use this! You can download the entire CV dataset here, download your own clips in your profile, and use Coqui’s notebook here.