

14 Resources

In this chapter we will try to list some of the important resources available that you may need when building a voice in Festival. This list cannot be complete and comprehensive, but we will try to give references to meta-resources as well as direct references to information, code and data that may be of use to you.

This document itself will be updated occasionally and it is worth checking to ensure that you have the latest copy.

Updates, new databases, new language support etc will happen intermittently, and new voices will be released which may help you develop your own.

http://www.festvox.org

has been set up as a resource center for voices in Festival, offering databases, examples and a repository for voice distribution. Checking that site regularly is a good thing to do.

Specifically

`http://www.festvox.org/examples/cmu_us_kal_diphone/'
Offers a complete example US English diphone database as built using the walkthrough in section 5.10 US/UK English Walkthrough. The originally recorded diphone database is also available as is, at http://www.festvox.org/databases/cmu_us_kal_diphone/.
`http://www.festvox.org/examples/cmu_time_awb_ldom/'
Offers a complete example limited domain synthesis database as built using the walkthrough in section 7 Limited domain synthesis.

Other databases, lexicons etc will be installed on festvox.org as they become available.

There is also a mailing-list `festvox-talk@festvox.org' for discussing aspects of building voices. See http://www.festvox.org/maillist.html for details of joining it and the archive of messages already sent. Also, while traffic is low, feel free to mail the authors `awb@cs.cmu.edu' or `lenzo@cs.cmu.edu' and we will try to help where we can.

14.1 Festival resources

The Festival home page is http://www.cstr.ed.ac.uk/projects/festival.html. It is updated regularly as new developments happen.

The Festival Speech Synthesis System code and the Edinburgh Speech Tools library and related programs are available from

ftp://ftp.cstr.ed.ac.uk/pub/festival/

or in the US at

http://www.speech.cs.cmu.edu/festival/download.html

Note that precompiled versions of the system are also available from that site, though at time of writing only Linux binaries are available.

Festival comes with its own manual in html, postscript and GNU info formats. It and a less comprehensive Speech Tools manual are pre-built in `festdoc-1.4.1.tar.gz'. The manuals are also available online at

http://www.speech.cs.cmu.edu/festival/manual-1.4.1/festival_toc.html
http://www.cstr.ed.ac.uk/projects/speech_tools/manual-1.2.0/speechtools_toc.html

You will likely need to reference these manuals often.

It will also be useful to have access to other voices developed in Festival, as seeing how others solve problems may make things clearer.

In addition to Festival itself a number of other projects throughout the world use Festival and have also released resources. The `Related Projects' links give urls to other organizations which you may find useful.

It is worth mentioning the Oregon Graduate Institute here, who have done a lot of work with the system and have released other voices for it (US English and Mexican Spanish). See http://cslu.cse.ogi.edu/tts/ for more details.

A second project worth mentioning is the MBROLA project dutoit96, http://tcts.fpms.ac.be/synthesis/mbrola.html. They offer a waveform synthesis technique dutoit93 and a number of diphone databases for many different languages. MBROLA itself doesn't offer a front end, just waveform synthesis from phones, durations and F0 targets. (However, they do offer a full French TTS system too.) Their diphone databases complement Festival well, and a number of projects use MBROLA databases for their waveform synthesis and Festival as the front end. If you lack the resources to record and build diphone databases, this is a good place to check for existing diphone databases for your language. Most of their databases have some use/distribution restrictions, but they usually allow any non-commercial use.
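As a rough illustration of what "phone, duration and F0 target" input looks like, the sketch below formats a list of segments in the plain-text `.pho' style that MBROLA reads: one phone per line, its duration in milliseconds, then optional pairs giving a position (as a percentage of the phone's duration) and an F0 value in Hz. Lines starting with `;' are comments. The helper names (`pho_line', `to_pho') and the example segments are hypothetical and are not part of Festival or MBROLA; the phone names in particular are only illustrative, since the real phone set depends on the diphone database being used.

```python
from typing import Iterable, Sequence, Tuple

# One F0 target: (position in % of the phone's duration, F0 in Hz)
PitchTarget = Tuple[int, int]
# One segment: (phone name, duration in ms, F0 targets)
Segment = Tuple[str, int, Sequence[PitchTarget]]


def pho_line(phone: str, duration_ms: int,
             targets: Sequence[PitchTarget] = ()) -> str:
    """Format one phone as a line: name, duration, then position/F0 pairs."""
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    fields = [phone, str(duration_ms)]
    for position, f0 in targets:
        if not 0 <= position <= 100:
            raise ValueError("F0 target position must be within 0..100")
        fields += [str(position), str(f0)]
    return " ".join(fields)


def to_pho(segments: Iterable[Segment], comment: str = "") -> str:
    """Join segments into the text of a .pho file, with an optional comment."""
    lines = []
    if comment:
        lines.append("; " + comment)
    lines.extend(pho_line(*segment) for segment in segments)
    return "\n".join(lines) + "\n"


# Hypothetical segments; "_" is used here for silence, and the other
# phone names are assumptions that a real database may not share.
EXAMPLE = [
    ("_", 100, []),
    ("h", 60, [(0, 120)]),
    ("@", 80, [(50, 125)]),
    ("l", 70, []),
    ("@U", 200, [(10, 130), (90, 100)]),
    ("_", 100, []),
]
EXAMPLE_PHO = to_pho(EXAMPLE, comment="illustrative only")
```

The resulting text could be written to a file and passed to MBROLA together with one of its diphone databases; the MBROLA documentation describes the exact invocation and the phone set each database expects.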

14.2 General speech resources

The network is a vast resource of information but it is not always easy to find what you are looking for.

Indexes to speech related information are available. The comp.speech frequently asked questions list, maintained by Andrew Hunt, is an excellent, constantly updated list of information and resources available for speech recognition and synthesis. It is available in html format from

Australia: http://www.speech.su.oz.au/comp.speech/
UK: http://svr-www.eng.cam.ac.uk/comp.speech/
Japan: http://www.itl.atr.co.jp/comp.speech/
USA: http://www.speech.cs.cmu.edu/comp.speech/

The Linguistic Data Consortium (LDC), although expensive, offers many speech resources including lexicons and databases suitable for synthesis work. Their web page is www.ldc.upenn.edu. A similar organization, based in Europe, is the European Language Resources Association, http://www.icp.grenet.fr/ELRA/home.html. Both these home pages have links to other potential resources.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
to be added:
  recording and EGG information
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

