The CMU_INDIC databases were constructed at the
Language Technologies Institute
at Carnegie Mellon University as phonetically balanced,
single-speaker databases designed for corpus-based speech synthesis
research. They cover major languages spoken in the Indian subcontinent.
The distributions include the raw waveform files, with transcriptions
in the language's native script (the etc/txt.done.data file), as well as complete
synthesis voices built from these databases using the CMU Clustergen statistical
parametric speech synthesizer.
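
As a rough illustration of how the transcriptions might be read, the sketch below parses text in the conventional Festvox prompt layout, where each line has the form ( utterance_id "text" ). It assumes the etc/txt.done.data files in these distributions follow that layout and are UTF-8 encoded. The function name, the utterance IDs, and the sample sentences are hypothetical and are not taken from the databases.

```python
import re

# Assumed line shape: ( utt_id "text" ), as in conventional Festvox prompt files.
_LINE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_txt_done_data(text):
    """Return a list of (utterance_id, transcription) pairs.

    Hypothetical helper: raises ValueError on any non-empty line that
    does not match the assumed layout.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognized line: {line!r}")
        utt_id, transcript = match.groups()
        # Undo backslash escapes (e.g. \" inside the quoted text).
        transcript = re.sub(r"\\(.)", r"\1", transcript)
        entries.append((utt_id, transcript))
    return entries


# Made-up sample data; IDs and sentences are illustrative only.
sample = (
    '( hin_0001 "नमस्ते दुनिया" )\n'
    '( hin_0002 "यह एक \\"परीक्षण\\" है" )\n'
)

for utt_id, transcript in parse_txt_done_data(sample):
    print(utt_id, transcript)
```

To apply this to a real file, read it with open(path, encoding="utf-8") and pass the contents to parse_txt_done_data; the encoding is an assumption and should be checked against the downloaded distribution.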
Complete Android voices for CMU Flite
built from these databases are available in the Google
Play store. You can hear voices built from these databases
CMU INDIC Databases
These packed versions contain only the waveform files, and the
- All 13 voices are available from packed
- do_indic: a script to download and build a full voice from these databases (assuming the FestVox build tools are all installed).
These datasets were collected and developed with help from Hear2Read. We acknowledge their contributions to making these languages practical for FestVox. Special thanks to Suresh Bazaj.