The Rise of African-Built AI Models Trained in Local Languages and Why It Matters


Early this year, Google collaborated with African partners to launch WAXAL, a large-scale, openly accessible speech dataset aimed at addressing the persistent digital divide affecting African languages in artificial intelligence systems.

Some of the newly supported languages include Kiswahili, Somali, Afrikaans, Akan, Amharic, Hausa, Kinyarwanda, Afaan Oromo, Sesotho, Wolof, Yoruba, and IsiZulu.

Despite this progress, the development of African AI models continues to face significant challenges, particularly the scarcity of language data. Many African languages have limited digital and written resources, as they are predominantly used in spoken communication. As a result, the lack of standardized datasets makes it difficult for AI systems to accurately capture grammatical structures, dialectal variations, and linguistic nuances.

As technology advances worldwide, ensuring inclusive communication has become increasingly important. Many digital tools, including chatbots and virtual assistants, are trained mainly on widely spoken international languages. This leaves millions of people whose main language is a local African one underserved and unable to participate fully in the digital economy.

Additionally, African research institutions and global tech companies such as Google and Meta need to keep collaborating on language technology: creating datasets, building translation models, and supporting research on languages spoken across Africa. Such collaboration helps researchers and developers across the continent use datasets like WAXAL to build more inclusive voice technologies, so that African languages are not left out of global AI adoption.

The impact of developing African-built AI models extends far beyond technological advancement, with the potential to transform multiple sectors. In education, these models can promote greater inclusivity by enabling learners to access information in their native languages. This is particularly beneficial for students who are more comfortable communicating in local languages than in widely used international languages such as English or French.

Other sectors that stand to benefit significantly include financial services and government. In many cases, customers and citizens may not be fluent in widely used international languages, while service providers may have limited understanding of local languages. AI models capable of understanding and communicating in indigenous languages can help bridge this gap, improving access to financial services and public resources while promoting greater inclusion.

Furthermore, the development of AI models in local languages marks an important step toward digital inclusion and cultural preservation. It demonstrates that technological innovation can be designed to serve diverse communities and make digital tools accessible to everyone.

“The ultimate impact of WAXAL is the empowerment of people in Africa,” said Aisha Walcott-Bryant, Head of Google Research Africa. “This dataset provides the critical foundation for students, researchers, and entrepreneurs to build technology on their own terms, in their own languages, finally reaching over 100 million people. We look forward to seeing African innovators use this data to create everything from new educational tools to voice-enabled services that create tangible economic opportunities across the continent.”

By Tawheda Ali

Covering innovation, startups, and digital trends across Africa. Send scoops to tawheda@techtrendsmedia.co.ke