Partner Information Sheet

Technology Innovation with Open Voice Data

 

Speech is becoming a preferred way to interact with personal electronics, and the future of human-machine interaction lies in voice control. However, developers, researchers and startups around the globe working on voice-recognition technology all face the same problem: a lack of freely available voice data in their respective languages to train speech-to-text engines. Although speech recognition engines like Mozilla’s Deep Speech are open source, training data is limited.

 

Most of the voice data used by large corporations is not available to the majority of people, is expensive to obtain, or simply does not exist for languages that are not widely spoken around the world. The innovative potential of this technology remains largely untapped. This is particularly true for Africa, even though the continent is home to a rich diversity of languages.

 

Today’s Internet – and consequently the technologies and services built on top of it – is heavily skewed towards English, in a world where English is spoken by only 20% of the global population, and by only 5% natively. If we want technology to become more inclusive, we need to make sure that future technologies – especially doors to access information and services – exist in many languages. We believe voice recognition technology can and must be one of these doors.

 

By providing open datasets, three partners – Digital Umuganda, GIZ (Deutsche Gesellschaft für Internationale Zusammenarbeit GmbH) and Mozilla – aim to take away the onerous task of collecting and annotating data. This lowers one of the main barriers to voice-based technologies and, the partners hope, makes front-runner innovations accessible to more entrepreneurs.

 

Creating a level playing field with open Common Voice data is just the first step. In the near future, we hope this will inspire demand for innovative voice-based solutions, allow for local value creation and help to build an ecosystem around voice data as an open resource, from which everyone eventually benefits, not only large international companies.

With voice interaction available in their own languages, millions of people could gain access to information. This would make technologies more inclusive and ultimately foster a just, locally rooted yet global digital transformation.

 

Digital Umuganda

Digital Umuganda is an artificial intelligence and common digital infrastructure company currently focusing on voice technologies to democratize access to information and services, thereby reducing the digital divide. It does this by serving as a platform for international commons initiatives such as Common Voice, linking global efforts to local communities and contexts. Digital Umuganda’s projects align with the national digital smart master plans, with a focus on projects that have a sustainable development impact.

In Rwanda, Digital Umuganda is working in partnership with Mozilla and GIZ to build a Kinyarwanda voice dataset. With a Kinyarwanda dataset, many Rwandans will have access to information and services in their local language, which reduces barriers to access. It will also be an opportunity for local developers to build solutions using an open infrastructure they would otherwise not have access to. With this open infrastructure in place, duplication of resources will be avoided and a voice ecosystem will be created.

Contact

Audace Niyonkuru

Chief Executive Officer

audace804@gmail.com

+250788498484

 

GIZ – Deutsche Gesellschaft für Internationale Zusammenarbeit 


As a service provider in the field of international cooperation for sustainable development and international education work, GIZ is dedicated to shaping a future worth living around the world. GIZ has over 50 years of experience in a wide variety of areas, including economic development and employment promotion, energy and the environment, and peace and security. The diverse expertise of GIZ as a federal enterprise is in demand around the globe: from the German Government, European Union institutions, the United Nations, the private sector and governments of other countries. GIZ works with businesses, civil society actors and research institutions, fostering successful interaction between development policy and other policy fields and areas of activity. GIZ’s main commissioning party is the German Federal Ministry for Economic Cooperation and Development (BMZ).

In the context of the GIZ Innovation Fund, GIZ has launched a project to actively support the crowdsourced collection of open voice data in under-represented languages. By partnering with both the Mozilla Foundation and local organizations, GIZ aims to support local value creation and context-relevant innovation in the field of artificial intelligence. In Rwanda, the Digital Solutions for Sustainable Development (DSSD) program will support open data collection in Kinyarwanda. With this initiative, DSSD aims both to support the development of voice-based solutions and to strengthen local organizations and the capacities of the Rwandan innovation ecosystem at large. GIZ support will encompass funding of data collection activities as well as mentoring for network and community building and for business development.

Contact

Germany

Lea Gimpel

lea.gimpel@giz.de

+49 1577 892 1709

Rwanda

Jan Krewer

jan.krewer@giz.de

+250 782 751 686

 

Mozilla

 

Mozilla, the non-profit organization behind the popular web browser Firefox, has been a pioneer and advocate of the Open Web for more than 20 years. Core to its mission is that the Internet is a global public resource, open and accessible to all. Mozilla works to ensure it stays open by building products, technologies and programs that put people in control of their online lives and contribute to a healthier Internet. Today, hundreds of millions of people worldwide use Mozilla Firefox to experience the Web on computers, tablets and mobile devices. For more information, visit www.mozilla.org.

Project Common Voice

 

Launched in June 2017, Mozilla’s project “Common Voice” is taking a multifaceted approach to open innovation to democratize speech technologies. The project wants to build open and publicly available datasets of labelled audio — up to 10,000 hours of speech for as many different languages as possible, representing a broad diversity of accents, ages, gender, etc. — that anyone can use to train voice-enabled applications. 

 

Today, the latest Common Voice release represents the largest public domain transcribed voice dataset, with more than 2,400 hours of voice data and 28 languages represented, including English, French, German and Mandarin Chinese (Traditional), as well as, for example, Welsh and Kabyle. To put that in perspective, the public collection of TED talks constitutes about 200 hours, while LibriSpeech, which is essentially public domain Books on Tape, represents about 1,000 hours. The dataset is downloaded hundreds of times a month.

Since the project enabled multi-language support in June 2018, Common Voice has grown to be more global and more inclusive. Over the past months, communities have enthusiastically rallied around the project, launching data collection efforts in 30+ languages with 70+ more in progress on the Common Voice site. Eventually, Mozilla wants Common Voice to be a tool for any community to make speech technology available in their own language. 

Project Deep Speech

Common Voice complements Mozilla’s work in the field of speech recognition, which runs under the project name “Deep Speech” (https://github.com/mozilla/DeepSpeech), an open-source speech recognition engine, with an English model first released in November 2017. 

 

Together with a community of like-minded developers, companies and researchers, Mozilla’s Machine Learning Group has applied sophisticated machine learning techniques and a variety of innovations to build an open-source speech-to-text engine that approaches human accuracy.

 

Project Deep Speech is at an early stage, laying the groundwork for future adoption. Initial traction is strongest with researchers and startups. Deep Speech is built on a framework that allows for rapid prototyping, and we will provide access to diverse training datasets.

 

Mozilla believes that this technology, together with the growing Common Voice dataset, can and will enable a wave of innovative products and services, and that it should be available to everyone.

Contact

Technical collaborations:
George Roter, Project Lead
Email: Groter@mozilla.com

Media requests: 

Alexander Klepel, Communications Lead

Email: aklepel@mozilla.com