Announcing ‘Our Voices,’ a New Competition by Mozilla to Fight Bias in Voice Technology

The competition, led by Common Voice, seeks more diverse and inclusive speech recognition tools

Technologists around the world can enter, win from $20,000 prize pool

(WORLDWIDE | THURSDAY, JUNE 30, 2022) – Mozilla is launching a series of ambitious, global competitions titled “Our Voices,” with the goal of making voice technology less biased and more inclusive.

Hosted by Mozilla Common Voice, “Our Voices” challenges technologists around the world to build speech recognition systems that understand and work for everyone, no matter their gender, accent, or language. The competitions coincide with the release of Common Voice’s 10th dataset.

Says Em Lewis-Jong, Mozilla’s Product Lead for Common Voice: “Currently, the voice technology ecosystem is riddled with biases and big gaps. As a result, millions of people — and countless languages and accents — can’t benefit from an essential technology. ‘Our Voices’ rewards technologists who confront this problem head on.”

The competition features a total of $20,000 USD in cash prizes across four categories:

Gender | For example, a speech-to-text (STT) model for an under-resourced language that performs as well for female speakers as for male speakers
Variant and Accent | For example, accent classifiers by, and for, a community to support their needs
Methodologies | For example, a benchmark corpus, or dataset audit methodology for identifying bias
Open Call | Any other creative and DEI-relevant voice prototypes or tools

Entrants must primarily use Mozilla Common Voice data, but may also use other open-source tools like NVIDIA NeMo.

Explains Lewis-Jong: “To make the competition as accessible as possible, we are welcoming entries from people who don’t have a ton of data or GPU access at their disposal. We still want to see proof of concept entries with small, toy corpora, and early-stage methodology outlines. We are also awarding prizes in each language resource band, so that low-resource languages with much smaller datasets aren’t competing with languages with a ton of resources at their disposal.”

Mozilla Common Voice is an open-source initiative to make voice technology more inclusive. People can donate their voices to an open-source dataset, and technologists can then use that data to create new products. To date, Common Voice has collected more than 20,000 hours of speech in over 90 languages, from Abkhaz to Vietnamese.

Participants have from July to September to gather any additional data they require for their ideas. Common Voice’s eleventh dataset will be released on September 14, and participants will have until October 12 to build and submit their entries. Winners will receive $2,000 each and demo their work at a Speech Summit in October.

Entries will be judged holistically on criteria like word error rate, social need, deployability, and environmental impact. The judges are nine speech technology experts from NVIDIA, Mozilla, universities, and other institutions and organizations. The panel is 50% women, 60% multilingual, and 30% people of colour.

For further rules, entrance details, and judge biographies, click here.

Press contact: Kevin Zawacki | [email protected]