Resource Library

Annual Report 2024

March 6, 2026

Read the 2024 Annual Report now. 2024 was a year of maturity for the organization. We are moving toward more sustainable income sources and […]

Shuwa Arabic voice dataset

November 21, 2025

Voice datasets are structured collections of audio recordings paired with corresponding text transcriptions, metadata, and annotations. These datasets serve as the foundation for training […]

Kanuri TTS and ASR models

November 18, 2025

Voice datasets are structured collections of audio recordings paired with corresponding text transcriptions, metadata, and annotations. These datasets serve as the foundation for training […]

Indigenous participation in early warning systems in Bolivia

October 8, 2025

Indigenous communities in Bolivia want access to practical, actionable and timely early warning systems in a language that they can understand. They are increasingly […]

Annual Report 2023

October 1, 2025

Changing direction is never very easy; in 2022 we developed a new Direction of Travel, focusing more on developing partnerships and language technology to […]

Toward more scalable speech technology for African languages

August 8, 2025

Learn how we are exploring the potential of synthetic data to improve automatic speech recognition for low-resource African languages. Africa is home to over […]

Playbook for voice data collection for low-resource languages

August 2, 2025

The TWB Voice Playbook is a practical guide to planning and managing voice data collection projects for low-resource languages. It is aimed at both […]

Chichewa synthetic voice dataset, TTS models, ASR models

July 30, 2025

Below is a curated collection of open resources for text-to-speech (TTS), automatic speech recognition (ASR), and synthetic voice datasets in the Chichewa language. Text-to-Speech […]

Hausa synthetic voice dataset, TTS models, ASR models

July 30, 2025

Below is a curated collection of open resources for text-to-speech (TTS), automatic speech recognition (ASR), and synthetic voice datasets in the Hausa language. Text-to-Speech […]

Dholuo synthetic voice dataset, TTS models, ASR models

July 30, 2025

Below is a curated collection of open resources for text-to-speech (TTS), automatic speech recognition (ASR), and synthetic voice datasets in the Dholuo language. Text-to-Speech […]

From words to impact: an evaluation of CLEAR Global’s work in northeast Nigeria from 2017 to 2025

June 20, 2025

This report presents an evaluation of our northeast Nigeria program between 2017 and 2025. The program is on standby as of June 2025 due […]

Marma TTS and text data resources

June 18, 2025

Text data: This dataset contains sentences in the Marma language (ISO code: rmz), with both original and normalized forms. The dataset is designed to […]