Edinburgh Speech Tools  2.4-release
EST_TokenStream Class Reference

#include <include/EST_Token.h>

Public Member Functions

 ~EST_TokenStream ()
 will close file if appropriate for type
 
EST_Token get_upto (const EST_String &s)
 get up to s in stream as a single token.
 
EST_Token get_upto_eoln (void)
 get up to end of line as a single token.
 
EST_Token & peek (void)
 peek at next token
 
int fread (void *buff, int size, int nitems)
 read binary data (don't use peek() immediately beforehand)
 
int open (const EST_String &filename)
 open an EST_TokenStream for a file.
 
int open (FILE *ofp, int close_when_finished)
 open an EST_TokenStream for an already opened file
 
int open (istream &newis)
 open an EST_TokenStream for an already open istream
 
int open_string (const EST_String &newbuffer)
 open an EST_TokenStream for a string rather than a file
 
void close (void)
 Close stream.
 
stream access functions
EST_TokenStream & get (EST_Token &t)
 get next token in stream
 
EST_Token & get ()
 get next token in stream
 
get the next token which must be the argument.
EST_Token & must_get (EST_String expected, bool *ok)
 
EST_Token & must_get (EST_String expected, bool &ok)
 
EST_Token & must_get (EST_String expected)
 
stream initialization functions
void set_WhiteSpaceChars (const EST_String &ws)
 set which characters are to be treated as whitespace
 
void set_SingleCharSymbols (const EST_String &sc)
 set which characters are to be treated as single character symbols
 
void set_PunctuationSymbols (const EST_String &ps)
 set which characters are to be treated as (post) punctuation
 
void set_PrePunctuationSymbols (const EST_String &ps)
 set which characters are to be treated as (pre) punctuation
 
void set_quotes (char q, char e)
 set characters to be used as quotes and escape, and set quote mode
 
int quoted_mode (void)
 query quote mode
 

miscellaneous

int linenum (void) const
 returns line number of EST_TokenStream
 
int eof ()
 end of file
 
int eoln ()
 end of line
 
int filepos (void) const
 current file position in EST_TokenStream
 
int tell (void) const
 tell, synonym for filepos
 
int seek (int position)
 seek, reposition file pointer
 
int seek_end ()
 
int restart (void)
 Reset to start of file/string.
 
const EST_String pos_description ()
 A string describing current position, suitable for error messages.
 
const EST_String filename () const
 The originating filename (if there is one)
 
FILE * filedescriptor ()
 For the people who need the actual descriptor (if possible)
 
EST_TokenStream & operator>> (EST_Token &p)
 
EST_TokenStream & operator>> (EST_String &p)
 
ostream & operator<< (ostream &s, EST_TokenStream &p)
 

Detailed Description

A class that allows the reading of EST_Tokens from a file stream, pipe or string. It automatically tokenizes its input based on user-definable whitespace and punctuation.

Whitespace and punctuation are both user-definable, and single-character symbols are also supported. Single-character symbols are always treated as individual tokens, irrespective of their whitespace context. A quote mode can also be used to read quoted tokens.
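
The splitting rules just described can be illustrated with a small standalone model. This is only a sketch of the idea, not the library's implementation: ToyToken, ToySettings and toy_tokenize are hypothetical names invented here, quote mode is left out, and the handling of a token made up entirely of punctuation is a choice of this sketch rather than confirmed EST behaviour.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a token: its name plus the material attached to it.
struct ToyToken {
    std::string whitespace;      // whitespace that preceded the token
    std::string prepunctuation;  // leading punctuation stripped from the name
    std::string name;            // the token itself
    std::string punctuation;     // trailing punctuation stripped from the name
};

// Hypothetical settings, loosely mirroring the four set_* member functions.
struct ToySettings {
    std::string whitespace = " \t\n\r";
    std::string single_char_symbols;
    std::string prepunctuation;
    std::string punctuation;
};

static bool in_set(const std::string &set, char c) {
    return set.find(c) != std::string::npos;
}

std::vector<ToyToken> toy_tokenize(const std::string &text,
                                   const ToySettings &cfg) {
    std::vector<ToyToken> out;
    std::size_t i = 0;
    while (i < text.size()) {
        ToyToken tok;
        // Collect the whitespace that precedes the token.
        while (i < text.size() && in_set(cfg.whitespace, text[i]))
            tok.whitespace += text[i++];
        if (i >= text.size())
            break;  // this sketch simply drops trailing whitespace

        if (in_set(cfg.single_char_symbols, text[i])) {
            // A single-character symbol is always a token of its own.
            tok.name = std::string(1, text[i++]);
        } else {
            // Read up to the next whitespace or single-character symbol.
            std::string raw;
            while (i < text.size() && !in_set(cfg.whitespace, text[i]) &&
                   !in_set(cfg.single_char_symbols, text[i]))
                raw += text[i++];
            // Strip pre-punctuation from the front, punctuation from the back.
            std::size_t b = 0, e = raw.size();
            while (b < e && in_set(cfg.prepunctuation, raw[b])) ++b;
            while (e > b && in_set(cfg.punctuation, raw[e - 1])) --e;
            if (b == e) {
                tok.name = raw;  // all punctuation: kept as the name (sketch's choice)
            } else {
                tok.prepunctuation = raw.substr(0, b);
                tok.name = raw.substr(b, e - b);
                tok.punctuation = raw.substr(e);
            }
        }
        out.push_back(tok);
    }
    return out;
}
```

With "+" as a single-character symbol, "(" as pre-punctuation and ",)" as punctuation, the input "(hello, world)  a+b" yields five tokens in this model: hello, world, a, + and b, with the brackets and comma stored alongside the names rather than inside them.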

The whitespace, pre- and post-punctuation, single-character symbols and quote mode must be set immediately after opening the stream.

There is no unget but peek provides look ahead of one token.
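
One-token lookahead without unget can be modelled with a single buffered slot. The class below is a hypothetical standalone sketch (ToyStream is not part of the library, and it works on a pre-split list of strings rather than a real stream); it only shows why peek followed by get returns the same token, and how an end-of-stream check can be built on top of peek by treating an empty token name as the end.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a token source with one token of lookahead.
class ToyStream {
public:
    explicit ToyStream(std::vector<std::string> tokens)
        : tokens_(std::move(tokens)) {}

    // Look at the next token without consuming it.
    const std::string &peek() {
        if (!peeked_) {
            current_ = read_next();
            peeked_ = true;
        }
        return current_;
    }

    // Consume and return the next token, reusing a peeked one if present.
    const std::string &get() {
        if (peeked_)
            peeked_ = false;
        else
            current_ = read_next();
        return current_;
    }

    // In this model the stream is at its end when the next name is empty.
    bool eof() { return peek().empty(); }

private:
    // An empty string stands in for "no more tokens".
    std::string read_next() {
        return pos_ < tokens_.size() ? tokens_[pos_++] : std::string();
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
    std::string current_;
    bool peeked_ = false;
};
```

Because there is only one slot, calling peek twice in a row returns the same token, and there is no way to push a consumed token back.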

Note that there is an interesting issue about what to do with the last whitespace in the file: should it be ignored, or should it be attached to a token with a name string of length zero? In unquoted mode, eof() returns TRUE if the next token name is empty (the mythical last token). In quoted mode, the last token must be returned, so eof will not be raised.

Author
Alan W Black (awb@cstr.ed.ac.uk): April 1996

Definition at line 235 of file EST_Token.h.


The documentation for this class was generated from the following files: