As technology makes massive shift to voice-enabled products, NVIDIA invests $1.5 million in Mozilla Common Voice to transform the voice recognition landscape.
Over the next decade, speech is expected to become the primary way people interact with devices — from laptops and phones to digital assistants and retail kiosks. Today’s voice-enabled devices, however, are inaccessible to much of humanity because they cannot understand vast swaths of the world’s languages, accents, and speech patterns.
To help ensure that people everywhere benefit from this massive technological shift, Mozilla is partnering with NVIDIA, which is investing $1.5 million in Mozilla Common Voice, an ambitious, open-source initiative aimed at democratizing and diversifying voice technology development.
Most of the voice data currently used to train machine learning algorithms is held by a handful of major companies. This poses challenges for others seeking to develop high-quality speech recognition technologies, while also exacerbating the voice recognition divide between English speakers and the rest of the world.
Launched in 2017, Common Voice aims to level the playing field while mitigating AI bias. It enables anyone to donate their voices to a free, publicly available database that startups, researchers, and developers can use to train voice-enabled apps, products, and services. Today, it represents the world’s largest multi-language public domain voice data set, with more than 9,000 hours of voice data in 60 different languages, including widely spoken languages and less used ones like Welsh and Kinyarwanda, which is spoken in Rwanda. More than 164,000 people worldwide have contributed to the project thus far.
This investment will accelerate the growth of Common Voice’s data set, engage more communities and volunteers in the project, and support the hiring of new staff.
To support the expansion, Common Voice will now operate under the umbrella of the Mozilla Foundation as part of its initiatives focused on making artificial intelligence more trustworthy. According to the Foundation’s Executive Director, Mark Surman, Common Voice is poised to pioneer data donation as an effective tool the public can use to shape the future of technology for the better.
“Language is a powerful part of who we are, and people, not profit-making companies, are the right guardians of how language appears in our digital lives,” said Surman. “By making it easy to donate voice data, Common Voice empowers people to play a direct role in creating technology that helps rather than harms humanity. Mozilla and NVIDIA both see voice as a prime opportunity where people can take back control of technology and unlock its full potential.”
“The demand for conversational AI is growing, with chatbots and virtual assistants impacting nearly every industry,” said Kari Briski, senior director of accelerated computing product management at NVIDIA. “With Common Voice’s large and open datasets, we’re able to develop pre-trained models and offer them back to the community for free. Together, we’re working toward a shared goal of supporting and building communities — particularly for under-resourced and under-served languages.”