SOCIETY | 13:47

Uzbekistan to develop national AI language model to preserve cultural identity and ensure digital sovereignty

Uzbekistan has launched the development of a national artificial intelligence (AI) language model, a project that experts believe will reinforce digital sovereignty, preserve cultural identity, and boost AI integration in fields such as healthcare and education.

Photo: Freepik

The initiative is part of the national Strategy for the Development of Artificial Intelligence Technologies through 2030, approved in October 2024. Senior officials from the Ministry of Digital Technologies and the Presidential Administration have outlined the vision behind the project and its anticipated outcomes.

Data collection underway

According to Sarvar Sadullaev, Head of Department at the Ministry of Digital Technologies, Uzbekistan took its first steps toward AI in 2021, and full-scale activity gained momentum in 2024. As a first step in implementing the national AI strategy, digitized ministries and agencies were tasked with submitting datasets to the Ministry of Digital Technologies.

“This initial phase involves collecting data in the Uzbek language – literary and analytical texts, images, and anonymized medical data such as MRI and PET/CT scans – for use in early disease diagnosis projects,” Sadullaev said. The collected data will be stored in raw format and later categorized into datasets suitable for machine learning.

Negotiations are also underway with experts to label the data, a process that will follow collection. The ultimate goal is to build a comprehensive national language model that can understand and generate Uzbek text, interpret images, and work with specialized data.

More than just language processing

Sadullaev noted that more than 20 AI-based healthcare projects are already in progress. A small GPU cluster has been deployed to support them, and a large-scale cluster planned for 2026 is expected to enable up to 100 projects across various sectors.

A core mission of the national model, he said, is to ensure historical and cultural accuracy in digital content.

“In some countries, Amir Temur is seen as a hero, in others – a villain. Global language models like ChatGPT might not have access to accurate information about Uzbekistan or may present distorted facts. A national model would correct such gaps by embedding knowledge aligned with Uzbekistan’s perspective – from history and traditions to notable figures and language,” Sadullaev explained.

He emphasized that training the model will involve unique cultural and historical materials that are native to Uzbekistan. These locally sourced data will also be shared with major LLM (large language model) developers so that future global models can include Uzbekistan’s viewpoint.

Addressing bias and stereotypes

Hikmatilla Ubaydullaev, Deputy Head of the Department for Financial Technologies, Digitalization and Artificial Intelligence at the Presidential Administration, pointed out that interest in AI surged after ChatGPT became accessible in Uzbekistan in 2023.

“Even now, issues persist when using existing tools. For example, when asked to generate an image of an Uzbek person, models often produce stereotypical or inaccurate visuals – a bearded man in a skullcap or a veiled woman. This happens because there isn’t enough data about Uzbek people and their way of life,” he noted.

Ubaydullaev stressed the need for a national dataset that reflects the diversity of appearances and lifestyles across Uzbekistan. “For example, if we enter ‘Samarkand, Registan’ into an AI tool, it’s beginning to get it right. But we’ve had issues in the past – that’s why we’re building our own dataset,” he said.

A tool for all sectors

Sadullaev added that the national model will have broad applications – from translation and speech recognition to drafting medical protocols, generating call center scripts, and customer service in banking. It is intended to become a universal AI tool to support development across all sectors.

Ubaydullaev emphasized that the government is creating infrastructure that will reduce AI adoption costs for the private sector. “Unlike renting foreign servers, which can cost $10,000–20,000 per month, local computing will be cheaper thanks to domestic TAS-IX traffic. This is particularly beneficial for startups and small businesses,” he said.

He also highlighted that a local model is crucial for processing confidential information in government bodies, reducing dependence on external service providers.

“The more people in Uzbekistan interact with AI in their native language, the faster the model will improve,” Ubaydullaev said. “Uzbekistan isn’t just catching up – it’s building a sustainable and independent position in the global AI market by shaping a framework for digital independence based on national interests.”

Similar initiatives to develop local language models are underway in other countries in the region, including Kazakhstan and Tajikistan.
