December DFL Speaker Series: What Can I Do If the Data Representing My Part of the World is Biased?
Award-winning Senegalese artist Linda Dounia Rebeiz will discuss how she addresses bias embedded in mainstream AI tools by using her own datasets to train AI that more accurately reflects her reality.
Have an idea for a topic or want to host a community call? Email us and let us know.
Watch previous events
-
Speaker Series
Dr. Chijioke Okorie: Can New Data Licenses Address Major Issues? (South Africa)
Dr. Chijioke Okorie, Founder and Leader of the Data Science Law Lab at the University of Pretoria, presented their work to create a new data license, given the known drawbacks of using Creative Commons licenses in certain contexts (reinforcing extractive practices and digital colonialism, for example).
-
DFL Showcase
Live at MozFest House Amsterdam 2024
The Data Futures Lab (DFL) Showcase highlights builders around the world developing data tools and platforms that prioritize the needs and interests of their communities. In the fast-paced world of AI and data, the popular discourse and priorities are set by a small but vocal minority. In June 2024, we brought our Showcase to MozFest House Amsterdam. This edition focused on shifting the narrative around what’s possible in data stewardship and Trustworthy AI. We presented a selection of projects from the Data Futures Lab’s network that illustrate what could be done to give more power to people and communities in the age of Artificial Intelligence.
-
Speaker Series
Common Voice: Is There a Better Way to Govern All This Data?
Common Voice is the world’s largest crowd-sourced multilingual open speech corpus. To date, all the data has been released under a CC0 data license, but the team is going through a collaborative process with data creators to understand how they might offer alternative governance pathways in the future. Product Director EM Lewis-Jong and Community Coordinator Gina Moape talked about their experience and findings thus far.
-
Speaker Series
Copyright Panel: Is Using that Data Even Legal?
Christopher Bavitz, of Berkman Klein Center for Internet & Society, Beatriz Busaniche, of Vía Libre Foundation and Creative Commons Argentina, and Dr. Melissa Omino of the Center for Intellectual Property and Information Technology Law (CIPIT) at Strathmore University, discussed the challenges of existing laws and licenses in the context of data acquisition for training generative AI systems.
-
Speaker Series
Cullen Miller: Spawning (US)
Cullen Miller, VP of Policy at Spawning, presented their work building some of the only tools that enable creatives to determine whether their work is part of a training dataset (Have I Been Trained), opt out (ai.txt), and identify active web scrapers and reject or misdirect all of their requests (Kudurru).
-
Speaker Series
Shayne Longpre, Naana Obeng-Marnu, and William Brannon: The Data Provenance Initiative (US)
Shayne Longpre, Naana Obeng-Marnu, and William Brannon, three core contributors to The Data Provenance Initiative, presented their work mapping 2000+ popular text-to-text finetuning datasets from origin to creation, cataloging their data sources, licenses, creators, and other metadata for researchers and builders to explore.
-
Community Call
Anne Lee Steele and Jennifer Ding: The Alan Turing Institute (UK)
Anne Lee Steele and Jennifer Ding from The Alan Turing Institute talked about open, participatory approaches to data stewardship. They focused on case studies drawn from their experiences working with projects like BigScience, BigCode, and The Turing Way.
-
Community Call
Luiz Neves: StopClub (Brazil)
StopClub is a Brazilian app that centers drivers' needs and labor rights by providing them with tools that enable them to make more informed decisions about which rides they accept.
-
DFL Showcase
Live at MozFest House Kenya 2023
The Data Futures Lab Showcase highlights local builders around the world developing data tools and platforms that prioritize the needs and interests of their communities. In September 2023, we hosted a showcase in Nairobi, Kenya, of five innovative projects from Kenya, Uganda, South Africa and Ghana.
-
Community Call
Kasia Chmielinski, Matt Taylor & Sarah Newman: The Data Nutrition Project (US)
The Data Nutrition Project aims to build labels that highlight the key ingredients in a dataset, such as metadata and populations, as well as unique or anomalous features regarding distributions, missing data, and comparisons to other ‘ground truth’ datasets.
-
Community Call
Vía Libre: EDIA - Stereotypes and Discrimination in AI (Argentina)
Researchers from Fundación Vía Libre (Argentina) presented EDIA, a project that allows social scientists and domain experts in Latin America to explore biases and discriminatory stereotypes present in word embeddings and language models.
-
Community Call
Anne Kim & Katherine Paseman: Secure AI Labs (US)
SAIL helps patient advocacy groups gather, manage, and permission their patient data for ethical research. They presented SAIL's work, including their recent research and insights from working with "Super Patients" and their communities on social media.
-
DFL Showcase
Regina Opondo: Kounkuey Design Initiative (Kenya)
Presenting Living Data Hubs, a community-developed and community-owned internet hotspot and data-collection tool providing internet services and digital literacy to marginalized residents of the informal settlements of Kibera in Nairobi.
-
DFL Showcase
Gilberto Vieira: Data Labe (Brazil)
Presenting Cocôzap / Poopoozap, a citizen-generated data, mapping, training and advocacy project on basic sanitation in the favelas of Rio de Janeiro.
-
Community Call
Maximilian Gahntz: Is that even legal? A guide for builders experimenting with data governance
Presenting "Is that even legal? A guide for builders experimenting with data governance." The series discusses practical research on regulations in the U.S., India, Germany, and Kenya, and explores alternative data management avenues.
-
MozFest 2023
2023 Prototype Fund Cohort: Co-Created A Better Tech Future
The 2023 Data Futures Lab Prototype Fund cohort held a discussion of their unique approaches to data donations in a range of sectors including healthcare, mobility, internet measurement, and content moderation.
-
MozFest 2023
Centering Communities in the Data Economy
This panel will explore that spectrum through the lens of indigenous language data, while drawing on a broader range of examples from other fields. Guests: Kathleen Siminyu, Mozilla Common Voice Fellow; Keoni Mahelona, CTO, Te Hiku Media; Timnit Gebru, Founder and Executive Director, DAIR. Moderated by J. Bob Alotta, Mozilla Foundation.
-
Community Call
Denise McKenzie: PLACE Data Trust (US)
Denise McKenzie, Community and Ethics Partner at PLACE, presents their work making high-resolution mapping images accessible under a public interest legal trust.