The Ministry of Statistics and Programme Implementation (MoSPI) has significantly upgraded its official data portal, making it directly readable by AI models to ensure credible government data usage and enhance public service delivery across India.
Key Points
- MoSPI has upgraded its official data portal to be directly readable by Large Language Models (LLMs) to ensure AI models use credible government data.
- The ministry is undertaking a data harmonisation exercise, standardising 288 priority datasets of economic and social importance across ministries.
- A foundational challenge for AI in India is semantic interoperability, ensuring AI systems understand data context and classifications across different departments.
- The government is standardising metadata for the 288 datasets using 38 types of identifiers and 88 international classifications to make the data FAIR (Findable, Accessible, Interoperable, Reusable).
- The ultimate aim is to improve public service delivery by enabling faster and more efficient rollout of welfare programmes and significantly reducing leakages.
To ensure that AI models do not rely on non-credible sources for government data, the Ministry of Statistics and Programme Implementation (MoSPI) has upgraded its official data portal to be directly readable by large language models (LLMs), a senior official said on Friday.
Enhancing AI Access To Credible Government Data
MoSPI Secretary Saurabh Garg said the government is undertaking a data harmonisation exercise, standardising 288 priority datasets of economic and social importance across ministries.
Speaking on the transition towards an “intelligence infrastructure”, he said the ministry has recently added a Model Context Protocol (MCP) wrapper layer around its portal. This upgrade allows LLMs to directly access and process official statistics.
“If the models don’t get easy access to credible data, there’ll be some other data filling up the gap,” Garg said at an NCAER event, noting that the ministry is among the first globally to implement MCP on government data so that AI models have access to trustworthy information.
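For readers unfamiliar with MCP, the protocol exchanges JSON-RPC 2.0 messages, and a model's client asks a server to run a named tool through a "tools/call" request. The sketch below only builds such a request in Python. The tool name "get_dataset" and its arguments are hypothetical placeholders and do not describe the tools MoSPI's portal actually exposes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 'tools/call' request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1, "get_dataset", {"dataset": "example_indicator", "year": 2023}
)

# Serialise the request as it would be sent over the transport.
payload = json.dumps(request)
print(payload)
```

The point of the wrapper is that the model receives structured results from the official source instead of relying on whatever text it finds elsewhere.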
Addressing Semantic Interoperability Challenges
However, Garg highlighted that the foundational challenge for AI in India is semantic interoperability — ensuring that AI systems can understand the context and classifications of data across different departments.
Illustrating the issue of siloed information, he pointed out that five different ministries have five definitions of what constitutes a “pakka” house.
“I think where we need to work more is on the semantic interoperability, so that AI systems can understand the context of the definitions and the classifications. And this is extremely important because if a definition of any concept in two systems is different, then those two systems cannot talk to each other,” Garg explained.
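The problem Garg describes can be illustrated with a small Python sketch. The two departmental definitions, the material lists, and the shared definition below are all invented for illustration and are not taken from any ministry's actual rules. The same house is counted as "pakka" by one department and not by the other, so their figures cannot be compared until both adopt one definition.

```python
from dataclasses import dataclass

# Hypothetical material lists, for illustration only.
DURABLE_WALLS = {"brick", "stone", "concrete"}
DURABLE_ROOFS = {"concrete", "tile", "metal_sheet"}


@dataclass(frozen=True)
class House:
    wall: str
    roof: str


def is_pakka_dept_a(house):
    """Hypothetical department A: only the wall material matters."""
    return house.wall in DURABLE_WALLS


def is_pakka_dept_b(house):
    """Hypothetical department B: wall and roof must both be durable."""
    return house.wall in DURABLE_WALLS and house.roof in DURABLE_ROOFS


def is_pakka_shared(house):
    """A single harmonised definition that every department would apply."""
    return house.wall in DURABLE_WALLS and house.roof in DURABLE_ROOFS


house = House(wall="brick", roof="thatch")

# The two departments disagree about the same house.
disagree = is_pakka_dept_a(house) != is_pakka_dept_b(house)
print("Departments disagree:", disagree)
print("Shared definition says pakka:", is_pakka_shared(house))
```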
Standardising Datasets For Improved Public Services
To resolve these discrepancies, the government has identified 288 datasets across ministries and is standardising their metadata. Officials are utilising 38 different types of identifiers and 88 international classifications to ensure the data is FAIR — Findable, Accessible, Interoperable, and Reusable.
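As a rough sketch of what standardised metadata makes possible, the Python snippet below defines a metadata record and reports which fields are empty. The field names are assumptions chosen to loosely echo the FAIR goals (an identifier to find the dataset, a URL to access it, a classification to interoperate, a licence to reuse it). They are not MoSPI's actual schema, and the example values are placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    # All field names are hypothetical, not an official schema.
    identifier: str      # findable: a stable dataset ID
    title: str
    access_url: str      # accessible: where the data can be retrieved
    classification: str  # interoperable: a shared classification code
    licence: str         # reusable: terms of reuse


def missing_fields(meta):
    """Return the names of metadata fields that are empty or whitespace."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name).strip()]


record = DatasetMetadata(
    identifier="example-001",
    title="Example indicator",
    access_url="https://example.org/data/example-001",
    classification="",  # left blank to show what the check reports
    licence="example-open-licence",
)

gaps = missing_fields(record)
print(gaps)
```

A check like this can run across all records in a catalogue to flag datasets whose metadata is incomplete before they are published.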
Garg emphasised that the ultimate aim of putting harmonised government data in the public domain is to improve public service delivery. He noted that integrated datasets are already enabling state governments to identify beneficiaries and roll out welfare programmes within weeks of announcement, a process that previously took a year or more, while significantly reducing leakages.