Starting 4 Billion Conversations with inclusive language technology
Advancing language technology for social good starts with understanding and collaboration
Each person’s language represents their unique culture and identity. Language technology enables us to get and share vital information from one side of the globe to the other. Yet the growing digital language divide leaves out half the world’s population.
Four billion people speak a language that is under-represented in the digital space. Although there are an estimated 7,000 languages spoken worldwide, just 5% are online. And while AI-powered chatbots like ChatGPT have surged, they work best in English, French, and other languages that are common on the internet and have large existing datasets. Vital language data doesn’t exist for the languages that the world’s most marginalized communities speak, and too many people are increasingly excluded from critical conversations. If we want to make inclusive, sustainable progress, it’s time we start to listen.
Through funding from the German Federal Ministry for Economic Cooperation and Development (BMZ), represented by the GIZ “FAIR Forward – Artificial Intelligence for All” project, we had the opportunity to build the foundations of the Four Billion Conversations movement (4BC). The goal was to spark multilingual conversations that give people the tools they need to enable fairer access to information and ensure everyone can be heard.
Building the Four Billion Conversations (4BC) movement: leveraging language technology to address exclusion
The Four Billion Conversations (4BC) movement unlocks the power of language technology to address exclusion. By promoting inclusivity for marginalized communities, we can create sustainable change globally. With impactful AI solutions that scale, we can enable more people to participate in conversations that affect them. In social development and humanitarian contexts, where people are excluded from access to crucial information, services and decision-making, multilingual tools and resources can support their needs. It all starts with understanding what local communities and individual service users want, and how we can multiply our impact to reach more people in their language. CLEAR Global’s language AI solutions so far include chatbots, machine translation, and automatic speech recognition, particularly for low-resource languages.
Increasing awareness and reach of language AI
With this GIZ-funded project, CLEAR Global focused on building partner networks, and on advancing and raising awareness of language AI in regions where people may have limited connectivity or low literacy. We explored multilingual channels suited to those contexts, like TILES (Touch Interface for Language Enabled Services). We researched and developed resources that can support local communities with critical information needs on climate change resilience and sexual and reproductive health. Our overall objectives were clear:
1. Make language AI and information about it accessible to non-specialists, so those without a technical background can use language tech easily.
2. Develop a playbook: a guide for social impact partners to integrate language AI into their programs.
3. Build dialogue: engage with program and tech partners across East Africa, West Africa, and South Asia, on technology for social good.
4. Mobilize communities with our translation platform: extend the TWB Platform’s capability for language data collection, validation and engagement with our global community of language volunteers.
5. Identify use cases: explore potential applications of language AI in climate change resilience for farmers and sexual and reproductive health.
Understanding social impact organizations’ needs and challenges
First, we sought to learn about the needs of potential partners and users of language technology in the humanitarian and development sectors. We know from experience we must engage with local communities to understand their needs. So the team conducted structured interviews with social impact organizations and technology developers based across East Africa and South Asia. This would help us understand the existing landscape of language technology in low-resource contexts.
What we found:
– Social impact organizations say language technology has potential: language barriers hinder their programs. Only some of those we spoke to already use limited technology to communicate with service users.
– Adopting language technology requires understanding and support: organizations need funding for development and infrastructure, as well as help building staff capacity and addressing privacy concerns.
– Organizations that use technology to interact with users and to address language issues enhance their programs’ reach.
– Technology developers need support to build communities and networks that improve datasets, and financial resources to sustain those initiatives in low-resource languages.
As part of this project, we collaborated with partners who focus on climate change resilience and sexual and reproductive health, including Data Science Nigeria, Digital Umuganda, Families Fit for Children, Gram Vaani, Kali, Karya, Lesan, Malaica, Masakhane, Reach a Hand Uganda and Urukundo.
Mobilizing communities to engage with language data collection
In this exploratory project, we developed a design mockup of a data collection and validation tool that would allow community members to contribute, collect and validate language data easily on the TWB Platform. This software would enable new community members to contribute to datasets that power language technology solutions. One of our aims is to create better platforms for mobilizing language communities to contribute to inclusive, low-bias data creation and validation. So, we collaborated with TWB community members and linguists globally to test out the user journey of our prototype. Additionally, experts in the GLG Social Impact Program’s network shared their experience and insights to help inform our strategic considerations around open-sourcing the data collection solution.
We expect to leverage the insights gathered during the mockup testing phase to enhance the Minimum Viable Product (MVP) tool, evaluate the development efforts and ascertain the resources needed for creating the final tool. As for the next steps, we will refine the infrastructure and the tool development plan based on the feedback collected, before initiating the implementation phase.
One of the most important considerations is usability.
If we want to make vital language data collection more inclusive, we need to consider diverse user needs and experiences. During the process, researching and mapping initiatives across Africa offered useful benchmarks and insights into user-friendly data collection tools to improve community engagement. Notably, we observed that the majority of existing solutions tended to prioritize data collection, often neglecting the crucial aspect of user experience. The tools we researched, such as Tatoeba, Sentence Society, Mozilla Common Voice and Pontoon, were technical and less intuitive, so we mapped out the user journey across various scenarios. Our approach focuses on human-centered design principles to create a more inclusive data collection tool, designed with users’ perspectives in mind.
Deploying TILES, an AI-powered information device, in collaboration with Gram Vaani.
Unlocking the power of language technology for climate resilience
CLEAR Global set out to test a solution that addresses a critical lack of access to information about climate change resilience in low-connectivity and low-literacy contexts. In close collaboration with India-based social impact tech company Gram Vaani, we deployed and evaluated an AI-powered information device to enable farmers to get critical answers to their questions about climate change and adaptation. That device is TILES, the Touch Interface for Language Enabled Services. This language technology solution responds to voice commands and gives spoken answers to questions in the target language, specifically the Hindi dialect as spoken in Bihar, India.
Growing real-world impact
Farmers in Bihar, India, need information on agricultural seasons, weather forecasts, expected rainfall and its timing, and the agricultural inputs and practices that can help them combat the dangerous effects of climate change. The community is concerned about their livelihood and wants to know how to become resilient.
“The machine should tell us agriculture-related information such as weather-based information, seed to use and protect crops from pests. This information will help improve our agriculture and help us combat climate change.”
– TILES user, Gidhaur
Considering the local community’s concerns, we first had to develop targeted, localized content that could answer some of their most pressing questions. We started from a Hindi automatic speech recognition (ASR) model and trained it on relevant topics. We developed question/answer capability with speech-to-text (STT) and text-to-speech (TTS) functions to create a multilingual, two-way communication tool that works offline. Our team fine-tuned the model to adapt to local dialects and specific climate terminology used in the Bihar region to ensure the messages could be understood and delivered effectively. Importantly, this innovative tool would help inform and serve people who have little or no access to reliable, accurate and on-demand information on vital topics such as climate change resilience, adaptation and farming practices.
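For readers who want to picture that flow, here is a minimal, purely illustrative sketch in Python of an offline question-and-answer loop: speech in, text question, matched answer, speech out. It is not the TILES implementation. The function names, the keyword matching and the placeholder answers are all assumptions made for this example, and the speech steps are stand-ins that simply pass text through so the sketch runs without any speech model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    keywords: frozenset  # words that signal this answer is relevant
    text: str            # the message to speak back to the user


# Hypothetical placeholder content; real answers would be localized and reviewed.
ANSWERS = [
    Answer(frozenset({"rain", "rainfall", "weather"}),
           "Placeholder: local rainfall forecast message."),
    Answer(frozenset({"seed", "seeds"}),
           "Placeholder: seed selection advice."),
    Answer(frozenset({"pest", "pests"}),
           "Placeholder: pest protection advice."),
]
FALLBACK = "Placeholder: no answer is available for that question yet."


def transcribe(audio: bytes) -> str:
    """Stand-in for speech-to-text: treats the 'audio' as UTF-8 text."""
    return audio.decode("utf-8")


def find_answer(question: str) -> str:
    """Pick the stored answer that shares the most keywords with the question."""
    words = set(question.lower().replace("?", " ").split())
    best = max(ANSWERS, key=lambda a: len(a.keywords & words))
    return best.text if best.keywords & words else FALLBACK


def synthesize(text: str) -> bytes:
    """Stand-in for text-to-speech: returns the text as bytes."""
    return text.encode("utf-8")


def handle_question(audio: bytes) -> bytes:
    """Full loop: spoken question in, spoken answer out, no network needed."""
    return synthesize(find_answer(transcribe(audio)))


if __name__ == "__main__":
    print(handle_question(b"When will the rain come?").decode("utf-8"))
```

Because every step runs locally against stored content, a loop of this shape needs no internet connection, which is the property that matters in low-connectivity settings. A real system would replace the two stand-ins with actual speech models and the keyword match with something more robust.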
“I like the fact that the information could be received by speaking with the machine.”
– TILES user, Jamui district
With the help of GIZ, we were able to test the device out with communities of farmers affected by climate change. The team deployed TILES in a local seeds and pesticides shop. The goal was to discover how useful users found the tool for exchanging critical climate resilience messages. Could it support farmers to adapt and cope with the impacts of climate change in Bihar? TILES successfully captured 467 user interactions. The initiative gave us valuable insights and evidence on the potential impact of language AI to transform information sharing with, and listening to, marginalized communities.
We put humans at the center of solutions designed to support them
So we asked the local farmers what they thought:
- 95% said they’d use the system again
- 87% said TILES answered their questions correctly
- 70% said they’d spread the word
- 61% found the information relevant
(Results from Gram Vaani’s survey of people using our TILES device)
Language technology solutions such as TILES can provide opportunities for marginalized communities. In contexts with low internet connectivity and low literacy, accessible and inclusive tools in the languages people speak can help users access valuable information and help organizations listen to their needs. To learn more about the TILES project, read the CLEAR Global blog on using AI to support farmers in adapting to climate change. We’re excited to expand and scale this work to help reach more people in more contexts.
The Language AI Playbook: how to use technology to engage communities
CLEAR Global developed the Language AI Playbook to help social good partners integrate language AI technology into their programs. It’s a comprehensive guide for both program and technology partners. It attempts to make language AI comprehensible, and information about using language technologies accessible to those without a technical background.
How the playbook helps navigate language AI
The Language AI Playbook aims to empower organizations to leverage language technology effectively to scale their impact. By understanding how to identify impactful use cases and deploy language technology, NGO partners can drive more effective communication between service users and program providers. The playbook helps people with various backgrounds in programming understand how and when to use language technology to amplify communication and listen to people’s needs in low-resource multilingual contexts.
What’s inside the Language AI Playbook?
Packed with actionable insights and support for implementing language technology, the Language AI Playbook helps users understand relevant terms, identify impactful use cases, improve communication and collaboration, manage data effectively, implement language tech, and deploy machine translation solutions. It emphasizes real-world examples and encourages collaborative learning among partners.
– Introduction to language technology: learn how language technology works, with examples of projects that leverage its impact.
– Overview of language technology: understand the basic ideas, concepts and tools used in the field including various language AI technologies such as machine translation (MT), ASR, and chatbots/NLP.
– Opportunities for partners: discover real-life examples that give a sense of how language technology can solve practical problems, especially in reaching marginalized communities in languages that don’t have a lot of resources.
– Identifying impactful use cases: a guide to finding and understanding situations where language technology can make an impact. We’ll help you determine if an idea is realistic and what results you can expect.
– Communication and collaboration: deals with understanding local community issues, collaborating with partners or local communities to solve problems, and sharing the impact of these initiatives among these communities.
– Language technology implementation: guidelines, best practices and steps for using, integrating, designing and deploying language AI systems effectively, with practical examples.
– Development and deployment guidelines: provides practical knowledge in deploying solutions with a focus on chatbots and machine translation. This chapter dives into building data, training models, deploying, monitoring and measuring progress and the impact of language AI projects.
Addressing an information gap
As part of the process, the Uganda-based organization Reach a Hand Uganda reviewed our playbook. Their feedback confirmed that the Language AI Playbook addresses an information gap: it provides valuable insights, understanding and guidance for organizations in multilingual contexts to support critical conversations. At the same time, it promotes capacity building around language AI to amplify marginalized voices.
“One of the significant strengths of the Language Technology Playbook is its emphasis on fostering collaborations. The playbook recognizes the importance of partnerships and cooperation among various stakeholders to ensure the effective implementation of language technology.” – Key informant interview
“For some organizations, even if they want to design tools, they may not have the full scope of what they want to achieve. The playbook gives you full scope so that even if you want to deploy in phases, you know what you will require to reach the end goal.” – Key informant interview
“The system is really good because as a developer, I see it can benefit very many people. Like when I was on campus I developed a system that was all about AI […] the system was recognizing the person according to his or her voice whether that person speaks in a Luganda, whether he speaks in a Runyoro, it could recognize him. So then I saw this documentation. It was kind of related because it’s going to help people on how to develop and use any language.” – Focus group discussion report
In the rapidly evolving domain of language technology, our Language AI Playbook has to be a living document, with the potential to continuously adapt: to identify, share and build new, relevant resources, solutions and approaches.
Learning and future potential for language technology
The 4BC journey so far reflects CLEAR Global and GIZ’s joint commitment to leveraging collaborative language technology for inclusive social impact. The challenges we encountered along the way have fueled ideas for future adaptations, improvements, expansion and scaling to support more people. Our Language AI Playbook is a valuable resource for organizations that find themselves navigating the complexities of language AI in low-resource contexts. And our use cases show great potential for more inclusive initiatives in the future.
Since we launched the Four Billion Conversations initiative, the generative AI revolution has energized the AI research community and the wider public. Powered by large language models (LLMs), it has opened up new possibilities. But it is also widening the divide in available technology between speakers of dominant and marginalized languages. The Four Billion Conversations movement’s goal to make information available to speakers of more languages has never been more relevant.
The CLEAR Global team hopes to continue building the 4BC movement, explore further use cases, develop data tools, update the Language AI Playbook, and expand our reach.
Discover how we can collaborate to better listen to communities and scale our impact.
Become a partner.
About CLEAR Global
CLEAR Global helps people get vital information and be heard, whatever language they speak. We help our partner organizations to listen to and communicate effectively with the communities they support. CLEAR Tech helps organizations identify potential use cases where language technology can drive user engagement and scale communication efforts. We develop language AI solutions such as chatbots, machine translation and speech recognition for low-resource languages. CLEAR Global’s UX team can support user research and UX design, and advise on human-centered design approaches to technology interventions.
Our Language services team can translate messages and documents into local languages, support audio translations and pictorial information, train staff and volunteers, and advise on two-way communication. We work with partners to field test and revise materials to improve comprehension and impact. This work is informed by CLEAR Insights research, language mapping and assessments of target populations’ communication needs.
For more information visit clearglobal.org
About GIZ FAIR Forward
On behalf of the German Federal Ministry for Economic Cooperation and Development (BMZ), the Deutsche Gesellschaft für Internationale Zusammenarbeit (“GIZ”) implements the project “FAIR Forward – Artificial Intelligence for All” which strives to create a more open, inclusive and sustainable approach to AI on the international level, and more specifically, to develop artificial intelligence ecosystems locally across its seven partner countries (Rwanda, Uganda, Kenya, South Africa, Ghana, India and Indonesia).
FAIR Forward pursues three main goals:
1. Remove entry barriers to AI – Access to training data and AI technologies for local innovation: FAIR Forward facilitates the provision of open, non-discriminatory and inclusive training data and open-source AI applications. Open access to African and Asian language data is a key priority to enable the development of AI-based voice interaction in local languages to empower marginalized groups.
2. Strengthen local technical know-how on AI – Capacity development in Africa and Asia: FAIR Forward supports digital learning and training for the development and use of AI and fosters cooperation with German and European research institutions and businesses.
3. Develop policy frameworks ready for AI – Ethical AI, data protection and privacy: FAIR Forward advocates for value-based AI that is rooted in human rights, international norms such as accountability, transparency of decision-making and privacy, and draws on European experiences such as the EU General Data Protection Regulation (GDPR). Therefore, the project supports the development of effective political and regulatory frameworks in Africa and Asia.
For more information, visit FAIR Forward – Open data for AI (bmz-digital.global)
– Written by Danielle Moore, Communications and Engagement Officer, CLEAR Global