Mozilla's Internet Health Report is a podcast this year! We meet AI builders and policy folks who are making AI more trustworthy. This is an extended cut of an interview from the IRL podcast episode "AI from Above" that has been edited for ease of reading.
Astha Kapoor is the co-founder of Aapti Institute, a Bangalore based tech and society research organization. Her work explores how data can be used to benefit society while safeguarding rights. Astha previously worked as a consultant on financial inclusion, development policy, and government planning.
We are a public research firm based in Bangalore that looks at the intersection of tech and society. We examine how people are negotiating and accessing technology, both online and offline. So for example, the government has been digitizing a lot of its services, and the assumption is that once you digitize something, access becomes easier, more equitable. But we know that that’s not the case: technology can increase marginalization, vulnerability, and complicate access further. So what does it mean to make access to technology more inclusive and equitable with the increased datafication of society around the world? We’re trying to understand the new mechanisms to govern data, such that individuals and communities have more say in how their data is used, collected, etc. We look at how to make data accessible while safeguarding rights, focusing on human governance mechanisms within organizations, and the policy questions needed to make democratic, participative data sharing more possible.
Geospatial data is interesting and tricky. People believe that spatial data isn’t about people, because it isn’t generated from human action. But it does impact people. It’s air pollution data, for instance, or data about a set of buildings. If you think of optical imagery of some cities, or where water bodies are, those are questions that impact communities. Therefore communities should also have a greater say in how that data is being collected, used, and managed, because there may also be harms. For instance, if there’s a dense part of the city versus a more open one, you could have different sets of rents or access to schools. Construction companies could use it to choose where to invest.
What we know about data is that it’s not equal for everybody. Every set of data opens up an opportunity, and someone could exploit that opportunity.
People who use smartphones generate a lot of data, whereas communities that are not on smartphones may not. These communities may not be mapped, in a certain sense, and may not be able to get the same information or services. On the other end, you may have communities, like religious minorities or others who the state or other service providers want to discriminate against. The objective could be to collect a ton of data about them, to target them. We have so many examples of this happening. We’ve learned that data is political, and should be considered as such.
It’s a complex thing. There are, for instance, communities in Bangalore, where I live, that are not reflected in Google Maps. So these communities do not experience the benefits of maps, and we know that maps offer huge community value — whether for navigation or to reveal small businesses. And that is a burden certain parts of the city have to carry, because, to other people, they’re undiscoverable. The map is dark there.
This should be corrected. In some communities, people have gone and photographed and created geospatial data about their own scenarios, where others were unable to. If you, as a community, are not visible to either the government, private sector, or community providers, you may not have lamp posts or sidewalks, or the conditions of your roads may be terrible. So members of a community may be able to collect data digitally or through surveys, but it requires significant mobilization. You need institutional mechanisms, potentially nonprofits, who can mobilize people to volunteer to start collecting data. There are apps and software that allow communities to do this, which you can then plug into existing datasets.
In India, we are contemplating data protection regulation that looks at protecting individual rights. What’s interesting is that geospatial data is not personal data, right? It doesn’t identify individuals or contain individually identifiable data. So I think it’s still a gray area for regulation in India. The government has started to think about how we should treat non-personal data, and how we can empower communities to draw value from it while preventing harm. This is also work that we’ve been doing at the Aapti Institute, exploring models for “data stewardship”. The question is how to involve communities through an intermediary that can represent their interests, while holding the government or the private sector accountable. Institutions for this don’t yet exist in any meaningful, regulated way, but we need representative bodies that reflect the concerns of citizens and communities at different levels. We need intermediaries that represent and safeguard the interests of, say, a neighborhood, or women, or religious minorities.
I’m totally bought into the idea of stewardship. We need to build these stewards within communities, so that more data in data-dark communities is collected and provides a platform from which to negotiate with the government and with tech companies. We need to create this intermediary layer which has a fiduciary responsibility to the community to represent the will and interest of the community on questions of data.
I do think we need to think more critically about questions of harm, exclusion, and bias that come from geospatial data — and about how we share it with governments. There is immense value in sharing data, we just need to share it in a way that is controlled. That is regulated. That involves communities and is being directed by communities in a certain way. Communities should have much more say than they do now, in how data is being collected, and for what.
Portrait photo of Astha Kapoor is by Hannah Yoon (CC-BY) 2022
Mozilla has taken reasonable steps to ensure the accuracy of the statements made during the interview, but the words and opinions presented here are ascribed entirely to the interviewee.