

Executable Programs

This section gives a brief description of the executable programs available with the speech tools. Most of these programs are simple main() wrappers around library routines.

Many of these programs have man pages. Please consult the man pages for more detailed information. Most programs print a summary of their command line options when given the -help flag. Some programs are "finished", while others are still "in progress". The finished programs should be well documented and stable. The "in progress" programs are near completion but typically still require some work regarding user interfaces and documentation.

Data manipulation programs

ch_wave

Changes waveform file formats, performs re-sampling and scaling, prints information on waveform headers etc.

ch_track

Changes track file formats, converts track files into label files, smooths tracks, re-samples tracks. Tracks are used for F0, LPC coefficients, cepstra and the like.

ch_lab

Changes label file formats, converts label files into track files, performs one-to-one mapping of labels from one set to another, performs context sensitive label re-writing.

Audio Playback

na_play

Plays arbitrary waveform files on a variety of hardware audio devices. Can perform re-sampling to match audio device capability. `na_play' has support for a number of audio devices. Compile time options specify which devices are supported. Note you must actually have these devices on your machine before `na_play' can play any waveform.

The audio device is selected with the `-p' option; which devices are available depends on the compile-time options.

The default audio device is netaudio if it is supported. If not, the platform-specific audio mode is the default (e.g. sun16audio, linux16audio, freebsd16audio or mplayeraudio). If none of these is supported, sunaudio is the default. The Audio_Command method is always an option.

Signal Processing

pda

Pitch tracker based on super resolution pitch determination (srpd). Takes waveforms (of any type) as input and produces F0 contours.

icda

Pitch tracker with smoothing based on super resolution pitch determination (srpd). Takes waveforms (of any type) as input and produces F0 contours. Smoothing involves median smoothing of the pda output and interpolation through unvoiced regions.
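The two smoothing steps named above can be illustrated with a minimal sketch. This is not the icda implementation: the function names, the 3-frame window and the convention that 0.0 marks an unvoiced frame are all assumptions made for the example.

```python
from statistics import median

def median_smooth(f0, width=3):
    """Median-smooth voiced frames; 0.0 is assumed to mark an unvoiced frame."""
    half = width // 2
    out = []
    for i, value in enumerate(f0):
        if value == 0.0:
            out.append(0.0)  # leave unvoiced frames untouched
            continue
        # Only voiced neighbours take part in the median.
        window = [v for v in f0[max(0, i - half): i + half + 1] if v > 0.0]
        out.append(median(window))
    return out

def interpolate_unvoiced(f0):
    """Linearly interpolate F0 across unvoiced (0.0) frames."""
    out = list(f0)
    voiced = [i for i, v in enumerate(f0) if v > 0.0]
    if not voiced:
        return out
    # Hold the first and last voiced values out to the edges.
    for i in range(voiced[0]):
        out[i] = f0[voiced[0]]
    for i in range(voiced[-1] + 1, len(f0)):
        out[i] = f0[voiced[-1]]
    # Fill each gap between neighbouring voiced frames with a straight line.
    for left, right in zip(voiced, voiced[1:]):
        gap = right - left
        for k in range(1, gap):
            out[left + k] = f0[left] + (f0[right] - f0[left]) * k / gap
    return out

# Hypothetical contour in Hz with one outlier (180.0) and unvoiced gaps.
contour = [0.0, 100.0, 102.0, 180.0, 104.0, 106.0, 0.0, 0.0, 112.0, 0.0]
smoothed = median_smooth(contour)
filled = interpolate_unvoiced(smoothed)
```

In this sketch the outlier at index 3 is replaced by the median of its voiced neighbours, and the interpolation step leaves no zero-valued frames in the contour.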

sig2fv

Basic signal processing functions allowing generation of LPC coefficients, cepstra, mel cepstra etc. at pitch synchronous and fixed intervals. Also allows generation of delta and delta-delta coefficients.

sigfilter

Signal filter, used for generating LPC residuals, among other things.

Speech Recognition

viterbi (in progress)

A straightforward Viterbi decoder, using an ngram language model (which can be estimated using build_ngram) and a sequence of observation probability vectors.
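As an illustration of the general technique, and not of the viterbi program's own code or file formats, the following sketch decodes a sequence of observation probability vectors with a bigram model. The function name, the dictionary layout and the toy probabilities are assumptions made for the example; all probabilities are assumed to be non-zero so that logarithms can be taken.

```python
import math

def viterbi(obs_probs, bigram, initial):
    """Return the most probable state sequence.

    obs_probs: list over time of {state: P(observation_t | state)}
    bigram:    {(prev_state, state): P(state | prev_state)}
    initial:   {state: P(state at time 0)}
    """
    states = list(initial)
    # best[s] holds the log probability of the best path ending in state s.
    best = {s: math.log(initial[s]) + math.log(obs_probs[0][s]) for s in states}
    backpointers = []
    for frame in obs_probs[1:]:
        new_best, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + math.log(bigram[(p, s)]))
            new_best[s] = (best[prev] + math.log(bigram[(prev, s)])
                           + math.log(frame[s]))
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy two-state example (hypothetical numbers).
initial = {"a": 0.6, "b": 0.4}
bigram = {("a", "a"): 0.7, ("a", "b"): 0.3, ("b", "a"): 0.4, ("b", "b"): 0.6}
obs_probs = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}, {"a": 0.1, "b": 0.9}]
path = viterbi(obs_probs, bigram, initial)
```

Working in log probabilities avoids numerical underflow on long observation sequences.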

build_ngram

Build ngram language models.
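For readers unfamiliar with ngram estimation, here is a minimal sketch of a maximum-likelihood bigram model. It does not reproduce build_ngram or its file formats: the function name, the `<s>` and `</s>` boundary markers and the absence of smoothing are assumptions made for the example.

```python
from collections import Counter

def build_bigram(sentences):
    """Estimate P(word | previous word) by relative frequency (no smoothing)."""
    pair_counts = Counter()
    history_counts = Counter()
    for words in sentences:
        # Assumed sentence boundary markers.
        padded = ["<s>"] + list(words) + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            pair_counts[(prev, word)] += 1
            history_counts[prev] += 1
    return {pair: count / history_counts[pair[0]]
            for pair, count in pair_counts.items()}

# Hypothetical two-sentence training set.
model = build_bigram([["a", "b"], ["a", "c"]])
```

Unseen word pairs are simply absent from the returned dictionary; a practical model would need some form of smoothing or backoff.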

test_ngram

Test an ngram on text data.

dp

A general dynamic programming aligner.
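Dynamic programming alignment can be sketched as a minimum edit cost computation with traceback. This is an illustration of the technique only, not the dp program's algorithm or options: the function name, the unit costs and the use of None to mark an insertion or deletion are assumptions.

```python
def align(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Return (total cost, aligned pairs) for two symbol sequences."""
    rows, cols = len(source) + 1, len(target) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * del_cost
    for j in range(1, cols):
        cost[0][j] = j * ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            match = 0 if source[i - 1] == target[j - 1] else sub_cost
            cost[i][j] = min(cost[i - 1][j - 1] + match,   # match or substitute
                             cost[i - 1][j] + del_cost,    # delete from source
                             cost[i][j - 1] + ins_cost)    # insert from target
    # Trace back from the bottom-right cell to recover the alignment.
    pairs = []
    i, j = len(source), len(target)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            match = 0 if source[i - 1] == target[j - 1] else sub_cost
            if cost[i][j] == cost[i - 1][j - 1] + match:
                pairs.append((source[i - 1], target[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + del_cost:
            pairs.append((source[i - 1], None))
            i -= 1
        else:
            pairs.append((None, target[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[-1][-1], pairs

# Hypothetical phone sequences.
distance, pairs = align(["k", "ae", "t"], ["k", "ae", "t", "s"])
```

Here the extra final symbol in the target is aligned against None, which marks it as an insertion.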

ch_ngram (in progress)

Modify ngrams, e.g. interpolation of two ngram models.

Statistical Analysis

wagon

A classification and regression tree building program following the techniques described in breiman84. See section Wagon.

wagon_test

Program for testing CART trees and for predicting from them.

ols

Ordinary least squares analysis (linear regression).
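As a reminder of what ordinary least squares computes, the sketch below fits a line to a single predictor using the closed-form solution. It is a simplification for illustration and says nothing about how the ols program is implemented or what input it expects; the function name and the data are assumptions.

```python
def ols_fit(xs, ys):
    """Fit y = intercept + slope * x by minimising the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

With several predictors the same idea generalises to solving the normal equations for a coefficient vector.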

ols_test

Testing for OLS output.

scfg_parse, scfg_train, scfg_test, scfg_make

Suite of programs to build, train, test and parse stochastic context free grammars.

wfst_build and wfst_run

Suite of programs for building and running weighted finite state machines.

Intonation

tilt_analysis

Generate tilt descriptions of F0 contours.

tilt_synthesis

Generate F0 contours from tilt descriptions.

