Across West and Central Africa, hundreds of millions of people speak languages that barely exist in the digital world. There is no smartphone operating system, no autocorrect and no translation tool that works reliably in Fulfulde, Lingala or Dioula. To understand the potential for solutions, UNICEF commissioned CLEAR Global to carry out a comprehensive mapping of the language technology landscape. They asked us to look into 23 countries in the region, and we delivered 34 language profiles, assessments of 32 actors working in the field, benchmark data drawn from 19 peer-reviewed sources and a public website built for anyone who wants to understand where things stand.

The commission was not born from abstract curiosity. For UNICEF, language is an operational reality. Public health campaigns, educational materials, child protection resources and community engagement tools still circulate primarily in a handful of languages, leaving rural and marginalized communities without access to information that can directly affect their lives and the lives of their children. Before the project, UNICEF had a broad sense of the challenge and knew of promising local efforts, like Moore language models integrated into a youth engagement app in Burkina Faso. What it lacked was a systematic, comparable picture: which models and datasets exist for a given language, which tasks they support, what independent benchmarks say about performance, and who is building what.

The landscape mapping was designed to fill exactly that gap.

A region of thousands of languages with digital tools that speak just a few

What emerged was a picture of a field growing rapidly but unevenly. Some of the 34 profiled languages had active model development, published datasets and measurable benchmarks. Others, including Koyraboro Senni, Gourmanché and Maasina Fulfulde, had no feasible evaluation path with today’s publicly available tools. That absence is itself a finding: it reveals how far some communities are from benefiting from the AI revolution, not because of a lack of speakers, but because the data infrastructure and investment simply have not been there.

The 32 actors assessed across the region reflect a diverse and fragmented ecosystem: university research labs, startups deploying products, volunteer communities building open models, and government-backed AI centres. Collaboration exists, particularly on open platforms where models, datasets and benchmarks are shared publicly. But the fragmentation is real, and it shapes how UNICEF can engage. The actor profiles in the landscape include dimensions like openness and alignment with UNICEF’s mission, because open resources reduce vendor lock-in and make independent assessment possible. For an organisation that needs to test, iterate and improve tools over time, and not just procure them once, that distinction matters enormously.

Building infrastructure for better decisions

One of the more important shifts the project aims to produce is a change in how country offices frame their decisions. As Niccolo Cirone, UNICEF Data & AI Specialist in West and Central Africa, put it, the goal is to help teams “shift the conversation from ‘Can we use language AI?’ to ‘Which language AI is fit-for-purpose, and what evidence do we have?’” That means going beyond the existence of a model to asking what it actually does in context: how it handles dialectal variation, whether it performs appropriately on child-sensitive topics, and whether the communities it is meant to serve have had any role in shaping it.

The benchmarking work reflects that discipline. Where published scores existed from peer-reviewed sources, they were compiled and made available. Where they did not, manual evaluations were conducted for priority languages so that UNICEF teams could distinguish between a model that technically exists and one that performs acceptably for a real use case.

The tension at the heart of this work is not easily resolved. AI can reduce the time and cost of producing local-language content at scale, which is genuinely promising. But deployment without care carries real risks, particularly in communities where data is sparse, dialectal variation is high, and connectivity is still unreliable. Cirone is direct about this: “We should embrace language AI with guardrails, invest in the data and evaluation that marginalized languages need, prioritize openness and accountability, and treat inclusion as a design requirement, not an afterthought.”

The website is built as a living resource. Language profiles and actor data are maintained as structured files and can be updated as the field moves. It is open to anyone: UNICEF country offices making procurement decisions, NGO partners scoping new programmes, researchers looking for gaps, and startups wanting to understand where investment is most needed.

For CLEAR Global, the collaboration is a model for what landscape research can do when it is grounded in operational need. This is not a snapshot for a shelf. It is infrastructure for better decisions, in a region where the stakes of those decisions are very high.
