A picture showing the Common Voice chat bot with information about the Kiswahili Festival, taking place on February 24 and 25, 2023, in Mombasa, Kenya.

Attendees will have the opportunity to design voice models using the Kiswahili dataset and compete for cash prizes totaling KES 100,000

(MOMBASA, KENYA | FRIDAY, FEBRUARY 24) - Mozilla Common Voice Fellows are calling on all Kiswahili speakers, voice technologists, and data scientists to join them on February 24-25 for the Kiswahili Festival. This two-day event will be hosted in partnership with Swahilipot Hub in Mombasa, Kenya, and will bring together the most important ingredient for growing the Kiswahili dataset: community.

Common Voice is the most multilingually diverse crowdsourced dataset in the world, powered by the voices of volunteer contributors worldwide. Participants can contribute to a multi-language voice dataset furthering the development of inclusive machine learning models for voice applications. Technologists who want to build voice applications can use the dataset to train machine learning models.

“Language is an important part of digital inclusion,” says Mozilla Fellow and event host Britone Mwasaru. “This event is part of a community approach towards building the open voice dataset for the Kiswahili language on the Common Voice platform. It’s about lowering barriers to building and reducing bias in tools/products created. But it’s also about developing a language dataset by and for us. And I am very excited to see what that enables.”

The festival will help grow the dataset and build awareness around Common Voice. The flow of activities will proceed as follows:

  • Day 1 - Awareness and engagement day focused on dataset growth - voice validation and contributions

    On the first day, participants will learn about Common Voice and how to contribute to the project. They will be able to contribute their voices and validate recordings contributed by others. There will be prizes for the top contributors.
  • Day 2 - Model competition open to all across Kenya.

    This coding challenge will kick off on Wednesday, February 15, 2023, and run until Friday, February 24, at 3:00 pm EAT. The challenge will show a typical workflow for training and testing a speech-to-text model on Mozilla’s Common Voice Kiswahili data. Interested individuals can participate in person or virtually. The Common Voice team will rank submitted solutions, and the top projects will be awarded a total cash prize of KES 100,000 during the event.

Everyone across Kenya is welcome to participate in the competition. Once registration is completed here, a Zoom link for the session will be sent to participants along with information on how to join the session, participate, and make submissions.

We look forward to seeing the community come together to build a diverse voice dataset for the Kiswahili language.
