Over the next decade, speech is expected to become the primary way people interact with devices, from phones and laptops to digital assistants. Today’s voice-enabled devices, however, do not serve vast swaths of the planet’s languages, accents, and speech patterns. Currently, none of Amazon’s Alexa, Apple’s Siri, or Google Home supports a single native African language. Most of the voice data used to train machine learning algorithms is held by a handful of major companies. This poses challenges for other companies seeking to develop high-quality speech recognition technologies, and it widens the voice recognition divide between English speakers and the rest of the world.
Mozilla aims to change this through Common Voice, an open-source initiative that makes it easy for anyone to donate their voice to a publicly available database that anyone can then use to train voice-enabled devices. Over the past two years, Rwandans have donated over 1,700 hours of voice data in Kinyarwanda, a widely spoken language with over 12 million speakers in Rwanda, and a pilot test is currently using the data to train a voice-enabled chatbot for COVID-19 information.
Based on the early success of the Kinyarwanda project, Mozilla is expanding the project to Kiswahili. The expansion is made possible by a $3.4 million investment from groups including the Bill & Melinda Gates Foundation, the Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) GmbH (German Development Cooperation), and the UK’s Foreign, Commonwealth & Development Office (FCDO).
Today we are very pleased to announce that Britone Mwasaru, Kathleen Siminyu and Rebecca Ryakitimbo Mwimbi have joined as three new Mozilla Common Voice Fellows dedicated to this project.
Britone Mwasaru will be working on voice technology with a focus on the Kiswahili language, in order to address the exclusion of those whose first or preferred language is Kiswahili. Before joining Mozilla, Britone was Director of Technology at Swahilipot Hub, where he led the use and adoption of technology in the technology and arts community in Mombasa and the Coast region of Kenya.
Kathleen Siminyu is an AI Researcher who has focused on Natural Language Processing for African Languages. She will be joining Mozilla Foundation as a Machine Learning Fellow to support the development of a Kiswahili Common Voice dataset and to build speech transcription models for end use cases in the agricultural and financial domains. In her NLP research, Kathleen has previously worked on speech transcription for Luhya languages and contributed to machine translation for Kenyan languages as part of Masakhane. Before joining Mozilla, Kathleen was Regional Coordinator of AI4D Africa, where she worked with ML and AI communities in Africa to run various programs.
Rebecca Ryakitimbo Mwimbi will be working on establishing and supporting diverse Kiswahili language and tech communities along axes of gender, age, regional origin, accent and vernacular usage, towards building an open voice dataset in Kiswahili. She will work to ensure that the dataset accurately represents the Kiswahili-speaking population, with the goal of encouraging adoption and implementation of voice technology. Before joining Mozilla, Rebecca was an Internet Society fellow, an AfriSIG fellow, a Google Policy fellow, a National Geographic Explorer and a digital rights program officer at Paradigm Initiative.
A key goal of the project is to explore whether voice recognition for the languages of underserved communities can be developed as a platform. With this data available as an open-source digital public good, local innovators in emerging markets could develop products and services that serve marginalized communities. Common Voice will be collaborating with African companies, start-ups and universities to develop locally suitable, voice-enabled technology solutions that are relevant to the Sustainable Development Goals (SDGs).