Educational data collection and analysis can both improve and introduce risks to India’s sprawling education system

Findings follow a nine-month investigation by Mozilla and Aapti Institute, supported by USAID

(INDIA | JUNE 10, 2024) -- India’s sprawling education system can benefit greatly from data collection and analysis — but also faces related challenges and risks like poor data privacy, outdated technology, and data fragmentation, according to new research conducted by Aapti Institute and Mozilla and funded by USAID.

The research, titled “Strengthening Data Ecosystems in Indian Schools,” explores how to responsibly leverage data like enrollment rates, student-teacher ratios, attendance, assessment results, and socio-economic information to shape education policies and interventions.

The investigation was carried out over the course of nine months by more than a dozen local experts, and spanned 10 states. The Indian school education system includes over 1.49 million institutions, from grade schools to universities, and serves more than 265 million students.

The research identifies eight major challenges in India's educational data ecosystem, like outdated processes and over-burdened teachers. It also proposes eight recommendations for addressing those challenges, like deploying data specialists in rural areas and promoting open data principles.

Says Mehan Jayasuriya, Senior Program Officer at Mozilla: “India has one of the largest education systems in the world, and data plays an increasingly pivotal role in shaping its policies and interventions. When handled responsibly, this data can transform educational outcomes for the better. But there are also significant risks, from mismanagement to misuse. Our investigation explores how to unlock positive transformation while mitigating harms.”

Says Astha Kapoor, Co-Founder of Aapti: “India's educational institutions are generating enormous amounts of data, which is being used to track student performance and teacher efficiency. However, in this vast and growing data ecosystem, key stakeholders — students (often under 18), teachers, and parents — do not have the means to understand or challenge how their data is being collected and used by government and private education providers. Additionally, teachers are increasingly burdened with data collection responsibilities, adding to their enormous workload. Therefore, the need to critically examine the education data ecosystem arises: Who collects the data, who uses it, for what purpose, and what agency do people have in this process?”

Says Chris Burns, Chief Digital Development Officer at USAID: “Globally, we’ve seen a rapid increase in the use of digital tools to drive development, but not as much emphasis on the reliability of the data underpinning such tools. Ensuring access to clean, participatory data is an essential component of our work to foster open, inclusive, secure and rights-respecting digital ecosystems that enable people to thrive.”

Key Challenges

Data fragmentation. Data collection efforts are fragmented, with multiple bodies and authorities approaching schools at different times for data. This burdens schools and delays the data they provide.

Data rich, information poor. Schools often struggle with the practical application of data. This is compounded by a limited understanding of data's potential beyond mere numbers, highlighting the need for a more data-informed mindset.

Technical barriers. The digital divide poses a significant barrier, especially in rural and underprivileged regions, where poor ICT facilities can lead to data gaps that skew policy decisions.

Poor coordination. There is no engagement between the organisations behind India’s two largest educational surveys, the Ministry of Education’s National Achievement Survey (NAS) and Pratham’s Annual Status of Education Report (ASER).

Outdated processes. The current data collection model is labour-intensive and fraught with challenges, from manual data entry to the logistical nightmare of paper-based records management.

Teacher burdens. Data entry is an administrative burden that teachers take on without any additional monetary compensation.

Decentralisation. Both the states and the central government have legislative power over the education sector, which contributes to implementation challenges.

Privacy and security risks. The involvement of multiple stakeholders, each with different levels of data access and handling capabilities, creates serious data privacy and security risks.

Key Recommendations

Establish a centralised data collection agency. Mozilla and Aapti recommend the formation of a central agency responsible for the annual collection of educational data through a standardised and rigorous process, coupled with the promotion of open data principles.

Incentivise teachers. Teachers undertaking data entry tasks should receive financial incentives for the additional workload.

Deploy data specialists in rural areas. Mozilla and Aapti recommend having data specialists in each administrative block. These specialists will provide essential support to schools by facilitating the use of technology in data collection processes.

Strengthen infrastructure and capacity. Mozilla and Aapti recommend reallocating budgets more effectively towards annual training of IT staff and teachers and ensuring reliable internet connections.

Provide feedback and iterative development. Mozilla and Aapti advise establishing a robust, bottom-up feedback loop. This mechanism should comprehensively assess the relevance and utilisation of collected data, alongside operational metrics such as data entry frequency and regional update discrepancies.

Address funding issues. NGOs and private organisations should consider a pivot in their funding strategies, focusing on the impact of their data (i.e. its ability to influence systemic change over the long term) and not just the outputs (i.e. immediate results).

Incentivise data standard compliance. Linking financial incentives, particularly through schemes like POSHAN, can motivate state governments to align with central guidelines.

Create school leaderboards. Mozilla and Aapti recommend developing a standardised scoring system for benchmarking schools. This would allow schools to recognise their effectiveness in utilising the collected data.


The report follows several months of careful research and convenings. Research was carried out by five working groups across India, composed of academics, education advocates, and data scientists from public and private schools. Each working group, well-versed in India's educational ecosystem and current regulatory frameworks, initiated a project to explore the practical challenges associated with data in the sector. Working groups also convened to share emerging insights and solutions at the Strengthening Data Ecosystems in Indian Schools workshop, held from November 27 to 30, 2023, at the India Habitat Centre in New Delhi.