In this chapter we work through a full example of creating a voice, given that most of the basic construction work (model building) has been done. In particular, it discusses the Scheme files, the conventions for keeping a voice together, and how you can go about packaging it for general use.
Ultimately a voice in Festival will consist of a diphone database, a lexicon (and LTS rules) and a number of Scheme files that together make up the complete voice. When people other than the developer wish to use a newly developed voice, it is only that small set of files that is required and needs to be distributed (freely or otherwise). By convention we have distributed diphone group files (a single file holding both the index and the diphone data itself) and a set of Scheme files that describe the voice (and its necessary models).
Although there are many ways to do this, the following technique
is perhaps the easiest for distributed voices (and for development too).
By default, Festival at start-up will search the directories listed
in the variable voice-path. Each directory in that list will
be searched for subdirectories that define new voices. By default,
voice-path is set to Festival's `lib/voices/' directory.
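As a minimal sketch, the search list can be inspected, and in principle extended, from Festival's Scheme interpreter. The directory name below is hypothetical, and it is an assumption that the assignment is made before Festival performs its search for voices (for instance in a site-wide initialization file); check the start-up files of your installation before relying on it.

```scheme
;; Show the directories currently searched for voices.
(print voice-path)

;; Hypothetical: also search a private directory laid out like lib/voices/.
;; "/usr/local/share/my_voices/" is an illustrative path, not a Festival default.
(set! voice-path (cons "/usr/local/share/my_voices/" voice-path))
```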
Voices are expected to be in the following format.
festival/lib/voices/LANGUAGE/VOX_NAME/festvox/VOX_NAME.scm
where LANGUAGE is the language name and VOX_NAME is the
name of the voice. Further, by convention, voice names consist of
three-letter initials, an underscore, and then the type of synthesis
used by the voice (e.g. diphone or cluster). (However, this naming
perhaps should also encode the language, where a single speaker has been
used for multiple languages.) Thus the basic definition of the
kal_diphone voice is standardly found in
festival/lib/voices/english/kal_diphone/festvox/kal_diphone.scm
When voices are defined thus, Festival will automatically define
a function at start-up time that, when called, will select the voice.
For a voice VOX_NAME the function voice_VOX_NAME will
be defined which, when called, will load the voice's definition
file and then call the function voice_VOX_NAME again, which
is expected to be defined within the voice definition file. (This
technique is analogous to autoload as used in Emacs and other
Lisps.)
This means new voices may be added to Festival automatically without any change to the existing system, except that the new voice must be unpacked in the appropriate position. The autoload mechanism means that a voice will only actually be loaded if it is called, thus not taking up any run-time resources until it is actually needed.
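For instance, with the kal_diphone voice unpacked in the position described above, selecting it from the Festival prompt is a single call. The sketch below uses SayText, Festival's standard command for speaking a string; the (voice.list) call is included on the assumption that your Festival version provides it.

```scheme
;; List the voices Festival found at start-up (assumes voice.list is available).
(voice.list)

;; The first call loads kal_diphone.scm through the autoload stub,
;; then runs the real selection function defined in that file.
(voice_kal_diphone)

;; Subsequent synthesis uses the selected voice.
(SayText "Hello from the kal diphone voice.")
```

Until voice_kal_diphone is called, none of the voice's files are read, which is the point of the autoload arrangement.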
The voice definition file itself should load in any other necessary
files, models etc. and define the voice selection function. As there
may be a number of files in a voice's `festvox/' directory, the
first thing normally done in a voice definition file is to
add its `festvox/' directory to Festival's library path so that
the other files may be automatically loaded without having
to explicitly include a pathname in the file. We do this
by finding out where the voice is defined (with respect to the
current implementation of Festival) and adding that path to
the Festival variable load-path. For example, for
the voice kal_diphone, in our voice definition file
`lib/voices/english/kal_diphone/festvox/kal_diphone.scm'
we have
(defvar kal_diphone_dir (cdr (assoc 'kal_diphone voice-locations))
  "kal_diphone_dir
  The default directory for the kal diphone database.")
(set! load-path (cons (path-append kal_diphone_dir "festvox/") load-path))
We define the variable kal_diphone_dir as it will be useful
for any other pathname-dependent file we wish to access for this voice.
The next part of the voice description file loads any other files it needs, whether from the voice's `festvox/' directory or from the standard Festival lib.
(require 'radio_phones)
(require 'kaldurtreeZ)
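As a rough sketch of the voice selection function that such a definition file goes on to provide, the skeleton below shows only the overall shape. It is illustrative rather than the actual kal_diphone definition: it assumes the phone set defined by radio_phones is named radio, and it deliberately omits the lexicon, duration, intonation and diphone database setup that a real voice requires.

```scheme
;; Illustrative skeleton only; a real definition sets up many more modules.
(define (voice_kal_diphone)
  "(voice_kal_diphone)
  Select the kal diphone voice."
  ;; Undo global settings left behind by the previously selected voice.
  (voice_reset)
  ;; Phone set, assumed here to be the one defined by radio_phones.
  (Parameter.set 'PhoneSet 'radio)
  (PhoneSet.select 'radio)
  ;; ... lexicon, duration, intonation and database setup omitted ...
  ;; Record which voice is now current.
  (set! current-voice 'kal_diphone))

;; Mark the file as loaded so (require 'kal_diphone) does not load it twice.
(provide 'kal_diphone)
```

Because this function has the same name as the autoload stub, loading the file replaces the stub, and the second call described earlier reaches this definition.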
%%%%%%%%%%%%%%%%%%%%%%% Continuation to be added %%%%%%%%%%%%%%%%%%%%%%%