Edinburgh Speech Tools  2.4-release


Make spectrograms


spectgen [input file] -o [output file] [-h ] [-itype string] [-n int] [-f int] [-ibo string] [-iswap ] [-istype string] [-c string] [-start float] [-end float] [-from int] [-to int] [-otype string] [-S float] [-o ofile] [-shift float] [-length float] [-sr float] [-slow ] [-w float] [-b float] [-raw ] [-order int]

spectgen is used to create spectrograms, which are 3D plots of amplitude against time and frequency. spectgen takes a waveform and produces a track in which each channel represents one frequency bin.
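The idea of a track whose channels are frequency bins can be illustrated with a minimal Python sketch. This is not spectgen's implementation: it uses a naive DFT with no windowing or scaling, and the function name magnitude_spectrogram and the test signal are assumptions made purely for illustration.

```python
import cmath
import math

def magnitude_spectrogram(samples, frame_length, frame_shift):
    """Return one row per frame; each row holds one magnitude per frequency bin.

    Hypothetical helper for illustration only, not part of the Speech Tools.
    frame_length and frame_shift are given in samples.
    """
    n_bins = frame_length // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(samples) - frame_length + 1, frame_shift):
        frame = samples[start:start + frame_length]
        row = []
        for k in range(n_bins):
            # Naive DFT: correlate the frame with a complex sinusoid at bin k.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_length)
                      for n, x in enumerate(frame))
            row.append(abs(acc))
        frames.append(row)
    return frames

# A 1 kHz sine sampled at 8 kHz. With 16-sample frames the bins are
# 500 Hz apart, so the energy is concentrated in bin 2.
sample_rate = 8000
wave = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(64)]
track = magnitude_spectrogram(wave, frame_length=16, frame_shift=8)
peak_bin = max(range(len(track[0])), key=lambda k: track[0][k])
```

Each row of track corresponds to one analysis frame (time), and each column to one frequency bin, which is the layout the paragraph above describes.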

By default spectgen produces a "wide-band" spectrogram, that is, one with high time resolution and low frequency resolution. "Narrow-band" spectrograms can be produced by using the -shift and -length options.
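The trade-off between time and frequency resolution can be sketched numerically. The following Python fragment is a rough approximation only: it assumes frequency resolution is about the reciprocal of the frame length and that the frame rate is the reciprocal of the frame shift. The function name resolution and the numbers used are arbitrary illustrations, not spectgen defaults or recommended settings.

```python
def resolution(frame_length_s, frame_shift_s):
    """Approximate resolution of a fixed-frame analysis.

    Hypothetical helper; both arguments are in seconds.
    """
    return {
        # A longer frame separates frequencies that are closer together.
        "frequency_resolution_hz": 1.0 / frame_length_s,
        # A shorter shift gives more frames per second of signal.
        "frames_per_second": 1.0 / frame_shift_s,
    }

# Illustrative values only: a short frame with a short shift (wide-band
# style) against a long frame with a long shift (narrow-band style).
short_window = resolution(frame_length_s=0.005, frame_shift_s=0.001)
long_window = resolution(frame_length_s=0.040, frame_shift_s=0.010)
```

Under these assumptions the short window resolves frequencies only to roughly 200 Hz but yields about 1000 frames per second, while the long window resolves to roughly 25 Hz at about 100 frames per second.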

Typical values for -shift and -length are:


  • -h: Options help
  • -itype: string Input file type (optional). If set to raw, the input file is treated as having no header. Other file types can be specified, but this is rarely necessary because the type of every supported headered format is determined automatically from the file's header. Samples in an unheadered file are assumed to be 16-bit shorts. Supported types are nist, est, esps, snd, riff, aiff, audlab, raw, ascii
  • -n: int Number of channels in an unheadered input file
  • -f: int Sample rate in Hertz for an unheadered input file
  • -ibo: string Input byte order in an unheadered input file. Possibilities are MSB, LSB, native or nonnative. Suns, HP, SGI Mips and M68000 are MSB (big endian); Intel, Alpha, DEC Mips and Vax are LSB (little endian)
  • -iswap: Swap bytes. (For use on an unheadered input file)
  • -istype: string Sample type in an unheadered input file: short, mulaw, byte, ascii
  • -c: string Select a single channel (starts from 0). Waveforms can have multiple channels. This option extracts a single channel for processing and discards the rest.
  • -start: float Extract sub-wave starting at this time, specified in seconds
  • -end: float Extract sub-wave ending at this time, specified in seconds
  • -from: int Extract sub-wave starting at this sample point
  • -to: int Extract sub-wave ending at this sample point
  • -otype: string Output file type (default: ascii). Types are: none, esps, est, est_binary, htk, htk_fbank, htk_mfcc, htk_mfcc_e, htk_user, htk_discrete, ssff, xmg, xgraph, ema, ema_swapped, ascii, label
  • -S: float Frame spacing of output in seconds. If this is different from the internal spacing, the contour is resampled at this spacing
  • -o: ofile Output filename, defaults to stdout
  • -shift: float Frame spacing in seconds for fixed frame analysis. This doesn't have to be the same as the output file spacing; the -S option can be used to resample the track before saving. Default: 0.001
  • -length: float input frame length in milliseconds
  • -sr: float range in which output values should lie
  • -slow: slow FFT code
  • -w: float white cut off (0.0 to 1.0)
  • -b: float black cut off (0.0 to 1.0)
  • -raw: Don't perform any scaling
  • -order: int cepstral order
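
As a rough illustration of how the options above combine, the following Python sketch assembles a spectgen command line as an argument list. It uses only option names from the list above. The helper spectgen_command, the file names utt.wav and utt.spec, and the numeric values are hypothetical and chosen purely for illustration.

```python
import shlex

def spectgen_command(input_file, output_file, shift=None, length=None,
                     otype=None, channel=None):
    """Build an argument list for spectgen (hypothetical helper).

    Only options from the list above are emitted; values are passed
    through unchanged, in whatever units the option expects.
    """
    args = ["spectgen", input_file, "-o", output_file]
    if shift is not None:
        args += ["-shift", str(shift)]
    if length is not None:
        args += ["-length", str(length)]
    if otype is not None:
        args += ["-otype", otype]
    if channel is not None:
        args += ["-c", str(channel)]
    return args

# Illustrative values only; they are not recommended settings.
cmd = spectgen_command("utt.wav", "utt.spec", shift=0.005, length=20,
                       otype="est")
print(shlex.join(cmd))
```

On a system where spectgen is installed, a list built this way could be passed to subprocess.run(cmd, check=True). Building a list rather than a single string avoids shell quoting problems with unusual file names.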