#include <include/EST_SCFG.h>
Public Member Functions | |
void | test_corpus () |
void | test_crossbrackets () |
void | load_corpus (const EST_String &filename) |
void | train_inout (int passes, int startpass, int checkpoint, int spread, const EST_String &outfile) |
Public Member Functions inherited from EST_SCFG | |
EST_SCFG () | |
EST_SCFG (LISP rules) | |
Initialize from a set of rules. | |
~EST_SCFG () | |
EST_read_status | load (const EST_String &filename) |
Load grammar from named file. | |
EST_write_status | save (const EST_String &filename) |
Save current grammar to named file. | |
void | set_rules (LISP rules) |
Set (or reset) rules from external source after construction. | |
LISP | get_rules () |
Return rules as LISP list. | |
int | distinguished_symbol () const |
void | find_terms_nonterms (EST_StrList &nt, EST_StrList &t, LISP rules) |
EST_String | nonterminal (int p) const |
Convert nonterminal index to string form. | |
EST_String | terminal (int m) const |
Convert terminal index to string form. | |
int | nonterminal (const EST_String &p) const |
Convert nonterminal string to index. | |
int | terminal (const EST_String &m) const |
Convert terminal string to index. | |
int | num_nonterminals () const |
Number of nonterminals. | |
int | num_terminals () const |
Number of terminals. | |
double | prob_B (int p, int q, int r) const |
The rule probability of given binary rule. | |
double | prob_U (int p, int m) const |
The rule probability of given unary rule. | |
void | set_rule_prob_cache () |
(re-)set rule probability caches | |
Additional Inherited Members | |
Public Attributes inherited from EST_SCFG | |
SCFGRuleList | rules |
The rules themselves. | |
A class used to train (and test) SCFGs; it is an extension of EST_SCFG.
This offers an implementation of Pereira and Schabes, "Inside-Outside reestimation from partially bracketed corpora," ACL 1992.
An SCFG may be trained from a corpus (optionally containing brackets) over a series of passes, re-estimating the grammar probabilities after each pass. This class extends EST_SCFG by adding support for a bracketed corpus and various indexes for efficient use of the grammar.
Definition at line 254 of file EST_SCFG.h.
void EST_SCFG_traintest::test_corpus ()
Test the current grammar against the current corpus and print a summary.
Only the cross-entropy measure is reported.
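To make the reported measure concrete, the following standalone sketch computes a per-word cross entropy from sentence probabilities. It is not taken from EST_SCFG_inout.cc: the `SentenceScore` struct and `cross_entropy_bits` function are hypothetical names, and normalising by word count with base-2 logarithms is an assumption about how such a figure is commonly defined.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical record: the probability a grammar assigns to one sentence
// and the number of words in that sentence.
struct SentenceScore {
    double prob;        // P(sentence | grammar), must be in (0, 1]
    std::size_t words;  // sentence length in words
};

// Per-word cross entropy in bits: -(sum of log2 P(s)) / (total words).
// Normalising by word count is an assumption of this sketch.
double cross_entropy_bits(const std::vector<SentenceScore> &scores) {
    double log_prob_sum = 0.0;
    std::size_t total_words = 0;
    for (const SentenceScore &s : scores) {
        assert(s.prob > 0.0 && s.prob <= 1.0);
        log_prob_sum += std::log2(s.prob);
        total_words += s.words;
    }
    assert(total_words > 0);
    return -log_prob_sum / static_cast<double>(total_words);
}
```

Lower values mean the grammar assigns higher probability to the corpus, so the figure can be compared across training passes.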
Definition at line 559 of file EST_SCFG_inout.cc.
void EST_SCFG_traintest::test_crossbrackets ()
Test the current grammar against the current corpus.
Summary includes percentage of cross bracketing accuracy and percentage of fully correct parses.
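The notion of a crossing bracket can be sketched as follows. This is an illustrative, standalone example rather than the code in EST_SCFG_Chart.cc: `Span`, `spans_cross`, `count_non_crossing` and `cross_bracket_accuracy` are hypothetical names, and representing brackets as half-open word ranges is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A bracket covering words [start, end) of a sentence (assumed layout).
struct Span {
    std::size_t start;
    std::size_t end;
};

// Two spans cross when they overlap but neither contains the other.
bool spans_cross(const Span &a, const Span &b) {
    return (a.start < b.start && b.start < a.end && a.end < b.end) ||
           (b.start < a.start && a.start < b.end && b.end < a.end);
}

// Count parser brackets that cross none of the reference brackets.
std::size_t count_non_crossing(const std::vector<Span> &parsed,
                               const std::vector<Span> &reference) {
    std::size_t ok = 0;
    for (const Span &p : parsed) {
        assert(p.start < p.end);
        bool crosses = false;
        for (const Span &r : reference) {
            if (spans_cross(p, r)) {
                crosses = true;
                break;
            }
        }
        if (!crosses) ++ok;
    }
    return ok;
}

// Percentage of parser brackets that cross no reference bracket.
double cross_bracket_accuracy(const std::vector<Span> &parsed,
                              const std::vector<Span> &reference) {
    if (parsed.empty()) return 100.0;
    return 100.0 * count_non_crossing(parsed, reference) / parsed.size();
}
```

For a four-word sentence with reference brackets [0,2), [2,4) and [0,4), a parsed bracket [1,3) crosses [0,2), whereas [0,4) and [2,4) cross nothing.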
Definition at line 509 of file EST_SCFG_Chart.cc.
void EST_SCFG_traintest::load_corpus (const EST_String &filename)
Load a corpus from the given file.
Each sentence in the corpus should be contained in parentheses. Additional parentheses may be used to denote phrasing within a sentence. The corpus is read using the LISP reader, so LISP conventions apply; notably, single quotes should appear within double quotes.
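As a rough illustration of the expected layout, the sketch below counts the top-level parenthesised sentences in a piece of corpus text and rejects unbalanced input. It is not the LISP reader used by the library: `count_sentences` and `demo_corpus_check` are hypothetical names, the sample sentences are invented, and escape sequences and comments are deliberately not handled.

```cpp
#include <cassert>
#include <string>

// Count top-level parenthesised sentences in corpus text.
// Returns -1 if parentheses are unbalanced or a string is unterminated.
// Simplification: escape sequences and comments are not handled.
int count_sentences(const std::string &text) {
    int depth = 0;
    int sentences = 0;
    bool in_string = false;
    for (char c : text) {
        if (c == '"') {
            in_string = !in_string;
        } else if (in_string) {
            continue;  // brackets and single quotes inside "..." are plain text
        } else if (c == '(') {
            ++depth;
        } else if (c == ')') {
            if (depth == 0) return -1;  // closing bracket with no opener
            if (--depth == 0) ++sentences;
        }
    }
    return (depth == 0 && !in_string) ? sentences : -1;
}

// Invented sample: one sentence with phrasing brackets, one without,
// and one whose word contains a single quote inside double quotes.
void demo_corpus_check() {
    const std::string corpus =
        "((the cat) (sat (on (the mat))))\n"
        "(time flies)\n"
        "(\"don't\" stop)\n";
    assert(count_sentences(corpus) == 3);
    assert(count_sentences("((a b)") == -1);
}
```

A check of this kind can catch a missing closing parenthesis before the file is handed to `load_corpus`.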
Definition at line 207 of file EST_SCFG_inout.cc.
void EST_SCFG_traintest::train_inout (int passes, int startpass, int checkpoint, int spread, const EST_String &outfile)
Train a grammar using the loaded corpus.
passes | The number of training passes desired. |
startpass | The pass from which to start. |
checkpoint | Save the grammar every n passes. |
spread | Percentage of the corpus to use on each pass; the portion used cycles through the corpus from pass to pass. |
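One way the `spread` and `checkpoint` parameters could interact with the pass counter is sketched below. The selection rule (a window over the sentence index modulo 100 that shifts by `spread` each pass) and the checkpoint rule (save after every n-th pass, with non-positive values disabling checkpoints) are assumptions made for illustration; they are not confirmed to match EST_SCFG_inout.cc, and all function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Assumed rule: with spread in (0, 100), a sentence is used on a pass
// when (index + pass * spread) mod 100 falls below spread. Other
// values of spread select the whole corpus.
bool uses_sentence(int sentence_index, int pass, int spread) {
    assert(sentence_index >= 0 && pass >= 0);
    if (spread <= 0 || spread >= 100) return true;
    return (sentence_index + pass * spread) % 100 < spread;
}

// Assumed rule: checkpoint after every n-th pass (passes counted from 0);
// a non-positive checkpoint value disables checkpointing.
bool is_checkpoint_pass(int pass, int checkpoint) {
    assert(pass >= 0);
    return checkpoint > 0 && (pass + 1) % checkpoint == 0;
}

// Indices of the sentences that would be visited on the given pass.
std::vector<int> sentences_for_pass(int corpus_size, int pass, int spread) {
    std::vector<int> selected;
    for (int i = 0; i < corpus_size; ++i) {
        if (uses_sentence(i, pass, spread)) selected.push_back(i);
    }
    return selected;
}
```

Under these assumptions, a spread of 25 on a 100-sentence corpus visits 25 sentences per pass and covers every sentence exactly once over four consecutive passes.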
Definition at line 482 of file EST_SCFG_inout.cc.