Data Futures Lab

The Data Futures Lab is an experimental space for instigating new approaches to data stewardship challenges. It provides funding, scaffolding for collaboration, convening around emerging ideas, and a place to workshop approaches to data stewardship which give greater control and agency to people.

Funded Projects

2023 Cohort

  • 2023 Cohort

    DataKind

    Working with the Black Wealth Data Center, they are creating an interactive data tool to understand broadband inequalities in the U.S. Leveraging existing but dispersed data and a network of collaborators, they seek to shine a light on where ‘digital deserts’ exist, and how different communities are impacted by them.

  • 2023 Cohort

    OONI

    Developing a new version of OONI Run, a tool that allows coordinated testing of website blocking using OONI's existing infrastructure and contributing to its dataset. They seek to deploy this tool in the field through their established partnerships with 41 digital rights organizations around the world.

  • 2023 Cohort

    Posmo

    Consolidating their Posmo Data Market, a trustworthy data space sharing detailed personal mobility and socioeconomic data with third parties under ethical terms while providing data subjects with control over the use of their data.

  • 2023 Cohort

    Tattle

    Developing Uli, a browser plugin that gives social media users tools for a collective response to online gender-based violence. They are also seeking new crowdsourcing tools for annotation and reporting.

  • 2023 Cohort

    Tidepool

    Maintaining its Big Data Donation Project, which enables people to donate data from their diabetes devices to help fuel the next generation of research. They will develop a blanket real-world evidence protocol to expand the potential impact of the donated data and conduct interviews to help measure and share the impact of the data sets donated.

2022 Cohort

  • 2022 Cohort

    Driver’s Seat Cooperative

    Driver’s Seat is a cooperative owned by rideshare and delivery drivers, with the mission of transforming the gig economy through shared data ownership. Drivers themselves are at the forefront of the organization, the mobile app, and website that they use to collect data and share insights. This allows them to make more money and have more control over their work.

  • 2022 Cohort

    PLACE

    PLACE Trust is a non-profit organization making high-resolution mapping images accessible under a public interest legal trust. Currently, all value in mapping data is in the hands of large tech companies. PLACE provides an opportunity for these data sets to be pooled and collectively owned by local governments and organizations.

  • 2022 Cohort

    Drivers Coop

    Drivers Coop is a cooperatively owned rideshare app with over 5,000 drivers and 40,000 riders. The organization is working to revolutionize the rideshare industry through worker ownership and by providing alternatives to mainstream platforms.

  • 2022 Cohort

    Digital Democracy

    Digital Democracy works with local communities and individuals around the world to collect and share evidence of human and environmental rights abuses through an app called Mapeo. Users can download the app and use it offline while tracking remote terrains. It’s available in local languages, and records location points, photographs of landmarks, and events.

2021 Cohort

  • 2021 Cohort

    Signalboost

    Signalboost is a messaging application that provides encrypted broadcasts and hotlines to activists, organizers, and other vulnerable populations. It doesn’t rely on SMS, which is vulnerable to surveillance, and doesn’t require users to expose their phone number.

  • 2021 Cohort

    Consumer Reports Digital Lab

    Consumer Reports’ Digital Lab enables consumers to better exercise their data rights. They are building Permission Slip, a mobile app that lets consumers easily set permissions for what companies can do with their data, and also facilitates data-related requests on their behalf.

  • 2021 Cohort

    Worker Info Exchange

    Worker Info Exchange is a nonprofit trade union for drivers whose livelihoods depend on app-based services like Uber. The organization helps gig workers access, analyze, and act on insights from their personal data collected and processed at work.

Infrastructure Grants

  • 2024 Grant

    Data Provenance Initiative

    The Data Provenance Initiative has mapped 2,000+ popular, text-to-text fine-tuning datasets from origin to creation, cataloging their data sources, licenses, creators, and other metadata, for researchers and developers to explore. This work improves transparency, documentation, and informed use of datasets in AI.

  • 2024 Grant

    The Data Science Law Lab

    The Data Science Law Lab at the University of Pretoria will conduct research that addresses the shortcomings of using Creative Commons licenses in certain contexts (such as reinforcing extractive practices and digital colonialism) and create a prototype for a new data license based on their findings.

  • 2024 Grant

    FLAIR (First Languages AI Reality) Initiative

    FLAIR will work with an indigenous language community using their software and methodology to collect the necessary corpus data to develop Automatic Speech Recognition (ASR) for the community’s language. They will publish the source code and methodology, which will enable further indigenous language communities to revitalize their languages more rapidly and effectively — using their own data, and on their own terms.

  • 2024 Grant

    Imperial College London

    The Computational Privacy Group at Imperial College London will build on their initial research on detecting privacy risk in AI-generated synthetic datasets and publish an open-source toolkit that enables builders to evaluate the privacy risk of AI-generated synthetic data before releasing it.

  • 2024 Grant

    Fundación Vía Libre

    Building on their existing toolset, Fundación Vía Libre will use community-centered methods to build a language dataset that represents stereotypes in Argentina, publish programming libraries to integrate the dataset in audit processes, and publish materials so that others can replicate their methods for other languages and contexts.

  • 2021 Grant

    Consumer Reports Digital Lab

    Consumer Reports’ Digital Lab created the Data Rights Protocol (DRP), a technical standard for the delivery of data requested through Subject Access Requests. The goal of DRP is to enable quicker turnaround on requests, set expectations for the type and amount of data that is delivered by complying organizations and, finally, use the standard and accompanying tools to show (in court) what compliance looks like or should look like.

  • 2022 Grant

    Te Hiku Media

    Te Hiku Media will complete documentation for the Kaitiakitanga indigenous data license, which enables communities to maintain guardianship over their data.

  • 2023 Grant

    Data Nutrition Project

    The Data Nutrition Project creates tools and practices that encourage responsible AI development, partners across disciplines to drive broader change, and builds inclusion and equity into their work. They will host a convening to surface best practices and current challenges, and build community around dataset auditing.

Data Futures Lab Showcase

  • May 2023 Virtual Showcase

    Associação Data Labe (Brazil)

    Presented Cocozap, a citizen-generated data, mapping, training, and advocacy project on basic sanitation in the favelas of Rio de Janeiro. Using a WhatsApp channel, residents of the Favela da Maré can report cases of open sewage, accumulated garbage, and lack of water.

  • Showcase @ MozFest House Kenya 2023

    Kounkuey Design Initiative (Kenya)

    Presented their Living Data Hubs project, a community-developed and community-owned internet hotspot and data-collection tool providing internet services and digital literacy to marginalized residents of the informal settlements of Kibera, Nairobi.

  • Showcase @ MozFest House Kenya 2023

    Makerere AI Lab (Uganda)

    Presented their machine learning tool for localized and targeted agricultural advisory to smallholder farmers in Uganda.

  • Showcase @ MozFest House Kenya 2023

    Masakhane Research Foundation (Kenya)

    Presented Masakhane Web, a platform that hosts the community's already-trained machine translation models for African languages and allows users to contribute new data for retraining.

  • Showcase @ MozFest House Kenya 2023

    Media Monitoring Africa (South Africa)

    Presented Real 411, an independent public complaints platform where citizens can report misinformation, hate speech, incitement to violence and attacks on journalists online.

  • Showcase @ MozFest House Kenya 2023

    GhanaNLP (Ghana)

    Presented Khaya AI, a Natural Language Processing tool for Ghanaian languages, including automatic speech recognition and machine translation. It is currently being applied to solve local problems.

Related Projects

Mozilla’s efforts to make the data economy more fair aren’t limited to the Data Futures Lab. Our related projects include:

Creative Media Awards

Supporting people and projects that reimagine the way data is governed, from researchers in South Africa, to sex workers in the U.S., to educators in Italy, and beyond.

Learn more

Common Voice

Common Voice is an open-source initiative to make voice technology more inclusive. People can donate their voices to an open-source dataset, and technologists can then use that data to create new products. To date, Common Voice has collected more than 20,000 hours of speech in over 90 languages.

Learn more

Regrets Reporter

Regrets Reporter is Mozilla’s crowdsourced research project into YouTube’s recommendation AI. The open-source browser extension turns everyday YouTube users into YouTube watchdogs, who donate their data to shed light on one of the internet’s most powerful AI curation systems.

Learn more

*Privacy Not Included

*Privacy Not Included is Mozilla’s consumer tech guide that focuses not on price or performance, but on privacy. The guide highlights which gadgets and apps respect users’ personal data (and which don’t), allowing shoppers to make informed decisions.

Learn more
