Maendeleo. That’s the Kiswahili word for “progress.” And over the past three months, Mozilla’s Kiswahili Common Voice work has been a whirlwind of maendeleo.
First, some background: Mozilla Common Voice is an open-source initiative to make voice technology more inclusive. And some of our most urgent work is taking place on the African continent. Right now, neither Amazon’s Alexa, Apple’s Siri, nor Google Home supports a single native African language. This means that millions of people who speak Kinyarwanda, Kiswahili, and other African languages can’t use voice technology to do something as simple as checking the weather, or something as important as checking for COVID updates.
Mozilla formally launched its Kiswahili work three months ago, with the goal of adding more Kiswahili speech to the Common Voice data set. We sought to build on the amazing work our staff and volunteers had already accomplished with African languages, like collecting hundreds of hours of voice data in Kinyarwanda, a language with over 12 million speakers in Rwanda.
To set the Kiswahili work in motion, Mozilla added three fellows to our team — Britone Mwasaru, Kathleen Siminyu, and Rebecca Ryakitimbo — with deep experience in voice AI and community engagement. Fellows develop new thinking on how to address emerging threats and challenges facing a healthy internet. The Fellows’ first task? Get the word out about our Kiswahili work, and begin generating contributions.
Now, three months later, we’ve done just that — and more. Maendeleo.
The Kiswahili Common Voice platform is now live, and already has 2 hours of speech from over 40 contributors. Individuals who wish to contribute Kiswahili speech data can do so by visiting https://commonvoice.mozilla.org/sw.
(And it’s not just Kiswahili speech data that we’ve collected recently. Our most recent data set, released in August, featured 16 new languages and 4,600 new hours of speech.)
Meanwhile, our Fellows have launched a blog series (written in Kiswahili) encouraging Kiswahili contributions and explaining the positive impact that these voice contributions can have. In his first blog post, Britone explains how he was attracted to this work, and why it matters. He writes:
Nafasi hii niliyo nayo hapa Mozilla itaniwezesha kuhakikisha teknolojia itaweza kuelewa jinsi sisi tunavyoongea na kuwezesha uvumbuzi utakaotumia lugha ya Kiswahili. Lugha huwa kizuizi kikuu cha utumizi wa teknolojia. Sio kila mtu mwenye uwezo wa kusoma na kuelewa Kiingereza, hii ni fursa nzuri ya kutumia sauti kuwezesha teknolojia kutumikia kila mtu.
In English, that reads:
This opportunity I have here at Mozilla will enable me to ensure that technology will be able to understand how we speak and facilitate innovation that will use the Kiswahili language. Language becomes a major barrier to the use of technology. Not everyone is able to read and understand English, so this is a great opportunity to use voice to enable technology to serve everyone.
In addition to our blog, we’ve run online community sessions to build Common Voice’s Kiswahili community. Contributors can listen on YouTube or AirMozilla. One of these sessions was tailored for the Arusha Women School of Internet Governance, an important community for us: Our work aims not just to make voice data more inclusive across languages, but also across genders. When gender is not intentionally taken into account, we run the risk of developing a technology that does not work for women, girls and gender diverse people.
Of course, collecting Kiswahili speech data is a means to an end: this data is only valuable if people build with it. Our focus for this particular work is to have an impact in the agriculture and financial sectors, building technology that can improve access to information and participation in these sectors. We conducted community research, including surveys and interviews with people already working in these sectors, to identify how the data can best be deployed, gathering feedback from farmers, developers, and feminist organizations who speak Kiswahili. We’ve been inspired by the real-world impact our Kinyarwanda work is having: Mozilla Fellow Remy Muhire and several others helped launch Mbaza, a Common Voice-powered chatbot that provides timely and accurate information about the pandemic in the Kinyarwanda, French, and English languages.
While we do this, we’re striving to ensure the technology will work for all Kiswahili speakers, since the language is spoken differently from place to place. We are working with linguists to ensure that variations in Kiswahili dialects and accents are accounted for.
The Kiswahili Common Voice team is proud of the work we’ve accomplished over the past three months. But in many ways, it’s just the start — and we need your help for this work to continue and thrive. Here are four ways to get involved:
- Take five minutes of your time to contribute your voice and/or validate Kiswahili sentences we already have: https://commonvoice.mozilla.org/sw
- We are seeking CC0 sources for sentences. Let us know if you have ideas: [email protected]
- If you work in fintech or agriculture, reach out to us so we can learn from you about use cases: [email protected]
- Follow our Fellows’ writing at foundation.mozilla.org/blog
Thank you for reading / Asante kwa kusoma.