Edinburgh Speech Tools  2.4-release
viterbi

Combine n-gram model and likelihoods to estimate posterior probabilities

Synopsis

viterbi [observations file] -o [output file] [-ngram string] [-given string] [-vocab string] [-ob_type string] [-lm_floor float] [-lm_scale float] [-ob_floor float] [-ob_scale float] [-prev_tag string] [-prev_prev_tag string] [-last_tag string] [-default_tags ] [-observes2 string] [-ob_floor2 float] [-ob_scale2 float] [-ob_prune float] [-n_prune int] [-prune float] [-trace ]

viterbi is a simple time-synchronous Viterbi decoder. It finds the most likely sequence of items drawn from a fixed vocabulary, given frame-by-frame observation probabilities for each item in that vocabulary and an ngram grammar. Possible uses include:

  • Simple speech recogniser back end
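
As a rough illustration of time-synchronous Viterbi decoding, the following Python sketch combines per-frame observation log probabilities with a bigram language model. It is not the tool's implementation: the function name viterbi_decode, the bigram_logprob callback, and the restriction to a bigram context are assumptions made to keep the example short.

```python
def viterbi_decode(obs_logprobs, vocab, bigram_logprob, start_tag="!ENTER"):
    """Return the most likely item sequence (illustrative sketch only).

    obs_logprobs   : list of frames; each frame is a list of log
                     probabilities, one per vocabulary item, in vocab order.
    vocab          : list of item names.
    bigram_logprob : hypothetical callback (previous, current) -> log P.
    start_tag      : context used before the first frame.
    """
    if not obs_logprobs:
        return []

    # best[j] is the score of the best partial path ending in vocab[j].
    best = [bigram_logprob(start_tag, w) + obs_logprobs[0][j]
            for j, w in enumerate(vocab)]
    backpointers = []

    # Process one frame at a time (time-synchronous search).
    for frame in obs_logprobs[1:]:
        new_best, pointers = [], []
        for j, w in enumerate(vocab):
            # Pick the predecessor that maximises path score + LM score.
            i = max(range(len(vocab)),
                    key=lambda k: best[k] + bigram_logprob(vocab[k], w))
            new_best.append(best[i] + bigram_logprob(vocab[i], w) + frame[j])
            pointers.append(i)
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final item to recover the whole path.
    j = max(range(len(vocab)), key=lambda k: best[k])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [vocab[j] for j in path]
```

With a uniform language model the result is just the best item at each frame. A language model that strongly prefers repeated items can override a weakly preferred observation.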

viterbi can optionally use two sets of frame-by-frame observation probabilities, combined as a weighted sum. The ngram language model is also not restricted to the conventional sliding-window type, in which the previous n-1 items form the ngram context. Instead, the items in the ngram context at each frame may be given. In this case, the user must provide a file containing the ngram context: one (n-1)-tuple per line. To include items from the partial Viterbi path so far (i.e. found at recognition time, not given), the special notation <-N> is used, where N is the distance back to the required item. For example, <-1> indicates the item on the partial Viterbi path at the previous frame. See Examples.

Pruning

Three types of pruning are available to reduce the size of the search space and therefore speed up the search:

  • Observation pruning
  • Top-N pruning at each frame
  • Fixed width beam pruning
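
The general idea behind top-N and beam pruning can be sketched in Python as follows. This is a generic illustration, not the tool's code: the function names, the dictionary of log scores, and the exact keep/discard criteria are assumptions.

```python
def top_n_prune(scores, n):
    """Keep only the n highest-scoring candidates.

    scores: dict mapping candidate name -> log score (hypothetical layout).
    """
    kept = sorted(scores, key=scores.get, reverse=True)[:n]
    return {item: scores[item] for item in kept}


def beam_prune(scores, beam_width):
    """Keep candidates whose log score is within beam_width of the best."""
    if not scores:
        return {}
    best = max(scores.values())
    return {item: s for item, s in scores.items() if s >= best - beam_width}


# Example: three candidates with log scores.
candidates = {"a": -1.0, "b": -2.5, "c": -7.0}
survivors_top2 = top_n_prune(candidates, 2)     # keeps "a" and "b"
survivors_beam = beam_prune(candidates, 2.0)    # keeps "a" and "b"
```

Top-N pruning bounds the number of surviving candidates regardless of their scores. Beam pruning keeps a variable number, depending on how close the scores are to the best one.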

Options

  • -ngram: string Grammar file, required
  • -given: string ngram left contexts, per frame
  • -vocab: string File with names of vocabulary, this must be same number as width of observations, required
  • -ob_type: string Observation type: "probs" or "logs" (default is "logs")

Floor values and scaling (scaling is applied after the floor value):

  • -lm_floor: float LM floor probability
  • -lm_scale: float LM scale factor (applied to log prob)
  • -ob_floor: float Observations floor probability
  • -ob_scale: float Observation scale factor (applied to prob or log prob, depending on -ob_type)
  • -prev_tag: string tag before sentence start
  • -prev_prev_tag: string all words before 'prev_tag'
  • -last_tag: string tag after sentence end
  • -default_tags: use default tags of !ENTER, !EXIT and !EXIT for -prev_tag, -prev_prev_tag and -last_tag respectively
  • -observes2: string second observations (overlays first, ob_type must be same)
  • -ob_floor2: float floor probability for second observations
  • -ob_scale2: float scale factor for second observations
  • -ob_prune: float observation pruning beam width (log) probability
  • -n_prune: int top-n pruning of observations
  • -prune: float pruning beam width (log) probability
  • -trace: show details of search as it proceeds
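
The options text states that scaling is applied after the floor value. The Python sketch below shows that ordering for a single observation score. It rests on two assumptions not confirmed by the text: that the scale factor is multiplicative, and that for "logs" observations the floor probability is converted to the log domain before comparison. The function name floor_then_scale is hypothetical.

```python
import math


def floor_then_scale(value, floor_prob, scale, ob_type="logs"):
    """Apply a floor, then a scale factor, to one observation score.

    value      : a probability ("probs") or a log probability ("logs").
    floor_prob : floor expressed as a probability (assumed).
    scale      : multiplicative scale factor (assumed).
    """
    if ob_type == "probs":
        floored = max(value, floor_prob)
    elif ob_type == "logs":
        # Assumption: compare against the floor in the log domain.
        floored = max(value, math.log(floor_prob))
    else:
        raise ValueError('ob_type must be "probs" or "logs"')
    # The scale is applied only after the floor has been enforced.
    return floored * scale
```

Because the floor comes first, a very small observation is raised to the floor and then scaled, so the scaled floor is the lowest score an item can receive.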

Examples

Example 'given' file (items f and g are in the vocabulary; the ngram is a 4-gram, so each line holds a 3-item context):

<-2> g g
<-1> g f
<-1> f g
<-2> g g
<-3> g g
<-1> g f
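
To make the <-N> notation concrete, the following Python sketch resolves one line of a 'given' file against a partial path. It is only an illustration: resolve_context and the list-based partial path are hypothetical. The text does not say how the tool handles a back-reference that reaches before the first frame, so the sketch simply raises an error there.

```python
import re

# Matches the special back-reference notation, e.g. "<-2>".
BACKREF = re.compile(r"^<-(\d+)>$")


def resolve_context(context_line, partial_path):
    """Replace each <-N> token with the item N frames back on the path.

    context_line : one line of the 'given' file, e.g. "<-2> g g".
    partial_path : items chosen so far, oldest first (assumed layout).
    """
    resolved = []
    for token in context_line.split():
        match = BACKREF.match(token)
        if match is None:
            # An ordinary vocabulary item: use it as given.
            resolved.append(token)
            continue
        distance = int(match.group(1))
        if distance < 1 or distance > len(partial_path):
            # Sentence-start handling is not modelled in this sketch.
            raise ValueError("back-reference outside the partial path")
        resolved.append(partial_path[-distance])
    return tuple(resolved)
```

For instance, with a partial path of f then g, the line "<-2> g g" resolves to the context (f, g, g), and "<-1> g f" resolves to (g, g, f).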