Four projects won Mozilla’s call to fight bias and broaden inclusion in voice technology.
In June 2022, Mozilla’s Common Voice announced a global competition, ‘Our Voices’, to incentivize voice technologists to design models that tackle bias and further diversity and inclusion. After a thorough review of submissions from around the world, the selection panel is pleased to announce the four entries that took first place in their categories.
Competitors submitted entries in four categories: Language, Variant and Accent, for speech-to-text (STT) models for any under-served language, variant (e.g. dialect), or accent; Gender, for models for an under-resourced language that have been optimised for any marginalized gender community; Methodologies, for any method aimed at tackling bias in speech tech, for example through a dataset audit or a benchmark bias corpus; and Open Call, for any other DEI-relevant prototypes or tools. Entrants were required to use Mozilla’s Common Voice dataset but could also use any other open-source tools they liked, such as NVIDIA NeMo.
“Building an open model ecosystem is part of our collaboration with NVIDIA, who helped support this work,” says EM Lewis-Jong, Common Voice Product Lead. “Together we're supporting the creation of AI models with transparent lifecycles: from datasets developed through meaningful community contribution, to models and use cases opened up to fight monopolies.”
Moreover, “there are very few projects in the voice technology ecosystem that embed a diversity, equity, and inclusion focus when designing and training their models. The winners presented great pieces of work which show that inclusivity in voice technology is attainable!” she adds.
Here are the winning projects:
Davud Kakaie - Kurdish
Kakaie developed a high-performing speech-to-text model for Kurdish, the first open model for the language.
WEDO Team - Thai
The team built a speech-to-text model that performs well for female Thai speakers. They are building voice-activated smart home appliances in Thai and are thinking about how these can support people with disabilities and pathologies.
Bülent Özden - Community toolbox (Global)
His winning project is a toolbox that, amongst other things, helps users visualize and detect bias in datasets. The tools provide insights into the data and make it easier for developers to use.
Preben Vangberg and Leena Farhat - Romansh
A speech-to-text model for the Swiss minority language Romansh and for the two major Romansh dialects, Sursilvan and Vallader. The models performed well, with a low character error rate.
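For readers unfamiliar with the metric, character error rate (CER) is commonly defined as the character-level edit distance between a model's transcript and the reference transcript, divided by the length of the reference. The sketch below illustrates that general definition only; it is not the winners' evaluation code, and the function names and the sample strings are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn hypothesis into reference."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # reference char missing from hypothesis
                current[j - 1] + 1,                   # extra char in hypothesis
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a four-character reference.
print(character_error_rate("abcd", "abed"))
```

A lower value is better: 0.0 means the transcript matches the reference exactly, and values can exceed 1.0 when the hypothesis is much longer than the reference.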
About
Mozilla Common Voice is an open-source initiative to make voice technology more inclusive. People can donate their voices to an open-source dataset, and technologists can then use that data to create new products. To date, Common Voice has collected more than 20,000 hours of speech in 100 languages.
Each week we will publish a feature on the winning projects.