#include <include/EST_SCFG.h>


Public Member Functions | |
| void | test_corpus () |
| void | test_crossbrackets () |
| void | load_corpus (const EST_String &filename) |
| void | train_inout (int passes, int startpass, int checkpoint, int spread, const EST_String &outfile) |
Public Member Functions inherited from EST_SCFG | |
| EST_SCFG () | |
| EST_SCFG (LISP rules) | |
| Initialize from a set of rules. | |
| ~EST_SCFG () | |
| EST_read_status | load (const EST_String &filename) |
| Load grammar from named file. | |
| EST_write_status | save (const EST_String &filename) |
| Save current grammar to named file. | |
| void | set_rules (LISP rules) |
| Set (or reset) rules from external source after construction. | |
| LISP | get_rules () |
| Return rules as LISP list. | |
| int | distinguished_symbol () const |
| void | find_terms_nonterms (EST_StrList &nt, EST_StrList &t, LISP rules) |
| EST_String | nonterminal (int p) const |
| Convert nonterminal index to string form. | |
| EST_String | terminal (int m) const |
| Convert terminal index to string form. | |
| int | nonterminal (const EST_String &p) const |
| Convert nonterminal string to index. | |
| int | terminal (const EST_String &m) const |
| Convert terminal string to index. | |
| int | num_nonterminals () const |
| Number of nonterminals. | |
| int | num_terminals () const |
| Number of terminals. | |
| double | prob_B (int p, int q, int r) const |
| The rule probability of given binary rule. | |
| double | prob_U (int p, int m) const |
| The rule probability of given unary rule. | |
| void | set_rule_prob_cache () |
| (re-)set rule probability caches | |
Additional Inherited Members | |
Public Attributes inherited from EST_SCFG | |
| SCFGRuleList | rules |
| The rules themselves. | |
A class used to train (and test) SCFGs, implemented as an extension of EST_SCFG.
This offers an implementation of Pereira and Schabes, "Inside-Outside reestimation from partially bracketed corpora," ACL 1992.
An SCFG may be trained from a corpus (optionally containing brackets) over a series of passes, with the grammar probabilities reestimated after each pass. The class extends EST_SCFG by adding support for a bracketed corpus and various indexes for efficient use of the grammar.
Definition at line 254 of file EST_SCFG.h.
| void EST_SCFG_traintest::test_corpus | ( | ) |
Test the current grammar against the current corpus and print a summary.
Only the cross-entropy measure is given.
Definition at line 559 of file EST_SCFG_inout.cc.
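For orientation, a minimal sketch of one common way to compute a per-word cross entropy from sentence probabilities. This is not taken from EST_SCFG_inout.cc: the `SentenceScore` struct, the `cross_entropy_bits` function, the base-2 logarithm and the per-word normalization are all assumptions made for illustration, and the figure printed by test_corpus may be defined differently.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-sentence record; not part of EST_SCFG_traintest.
struct SentenceScore {
    double prob;   // probability the grammar assigns to the sentence
    int    length; // number of terminals (words) in the sentence
};

// Per-word cross entropy in bits (assumed definition):
//   -(sum over sentences of log2 P(sentence)) / (total number of words)
double cross_entropy_bits(const std::vector<SentenceScore> &scores)
{
    double log_prob_sum = 0.0;
    long   word_count = 0;
    for (const SentenceScore &s : scores) {
        assert(s.prob > 0.0 && s.length > 0); // unparsable sentences are not handled here
        log_prob_sum += std::log2(s.prob);
        word_count += s.length;
    }
    assert(word_count > 0);
    return -log_prob_sum / static_cast<double>(word_count);
}
```

Lower values mean the grammar assigns higher probability to the corpus, which is why the measure is useful for comparing grammars across training passes.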
| void EST_SCFG_traintest::test_crossbrackets | ( | ) |
Test the current grammar against the current corpus.
The summary includes the cross-bracketing accuracy as a percentage and the percentage of fully correct parses.
Definition at line 509 of file EST_SCFG_Chart.cc.
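The notion of crossing brackets can be sketched as follows. The `Span` type and both functions are hypothetical and written only for illustration; the exact accuracy formula used in EST_SCFG_Chart.cc is not stated here, so the percentage below is one plausible reading rather than the library's definition.

```cpp
#include <cassert>
#include <vector>

// Hypothetical span type: a half-open word range [start, end).
struct Span { int start; int end; };

// Two spans cross when they overlap but neither contains the other.
bool spans_cross(const Span &a, const Span &b)
{
    assert(a.start < a.end && b.start < b.end);
    return (a.start < b.start && b.start < a.end && a.end < b.end) ||
           (b.start < a.start && a.start < b.end && b.end < a.end);
}

// Percentage of predicted spans that cross no reference span
// (assumed definition of "cross-bracketing accuracy").
double percent_non_crossing(const std::vector<Span> &predicted,
                            const std::vector<Span> &reference)
{
    if (predicted.empty()) return 100.0;
    int ok = 0;
    for (const Span &p : predicted) {
        bool crosses = false;
        for (const Span &r : reference) {
            if (spans_cross(p, r)) { crosses = true; break; }
        }
        if (!crosses) ++ok;
    }
    return 100.0 * ok / static_cast<double>(predicted.size());
}
```

Nested and adjacent spans do not count as crossing, so a parse can differ from the reference bracketing in detail and still score well on this measure.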
| void EST_SCFG_traintest::load_corpus | ( | const EST_String & | filename | ) |
Load a corpus from the given file.
Each sentence in the corpus should be contained in parentheses. Additional parentheses may be used to denote phrasing within a sentence. The corpus is read using the LISP reader, so LISP conventions apply; notably, single quotes should appear within double quotes.
Definition at line 207 of file EST_SCFG_inout.cc.
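As an illustration of the format described above, a small sketch that counts the top-level parenthesised sentences in a corpus string. `count_sentences` is a hypothetical helper, not part of the library, and it is far simpler than the LISP reader: it only tracks parentheses and double-quoted strings.

```cpp
#include <string>

// Count top-level parenthesised sentences in a corpus string.
// Returns -1 if parentheses or double quotes are unbalanced.
// Simplification: only parentheses and double-quoted strings are
// recognised; no other LISP reader syntax is handled.
int count_sentences(const std::string &corpus)
{
    int depth = 0;
    int sentences = 0;
    bool in_string = false;
    for (char ch : corpus) {
        if (ch == '"') { in_string = !in_string; continue; }
        if (in_string) continue;          // ignore everything inside "..."
        if (ch == '(') {
            ++depth;
        } else if (ch == ')') {
            if (--depth < 0) return -1;   // closing bracket with no opener
            if (depth == 0) ++sentences;  // a top-level sentence just closed
        }
    }
    return (depth == 0 && !in_string) ? sentences : -1;
}
```

Under these assumptions, a made-up corpus such as `((the cat) (sat down))` followed by `(dogs bark)` holds two sentences, the first with internal phrasing brackets and the second without.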
| void EST_SCFG_traintest::train_inout | ( | int | passes, |
| int | startpass, | ||
| int | checkpoint, | ||
| int | spread, | ||
| const EST_String & | outfile | ||
| ) |
Train a grammar using the loaded corpus.
| passes | The number of training passes desired. |
| startpass | The pass from which to start. |
| checkpoint | Save the grammar every n passes. |
| spread | Percentage of the corpus to use on each pass. The selected portion cycles through the corpus from pass to pass. |
Definition at line 482 of file EST_SCFG_inout.cc.
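How `checkpoint` and `spread` might interact with the pass counter can be sketched as below. Every function here is hypothetical, and the specific rules (a window of sentence indices taken modulo 100 that shifts by `spread` each pass, a non-positive `spread` meaning the whole corpus, and a save after every `checkpoint`-th pass) are assumptions for illustration, not a description of the code in EST_SCFG_inout.cc.

```cpp
#include <cassert>
#include <vector>

// Assumed selection rule: treat sentence indices modulo 100 as percent
// slots and shift the selected window by `spread` slots on each pass.
bool sentence_selected(int sentence_index, int pass, int spread)
{
    assert(sentence_index >= 0 && pass >= 0 && spread <= 100);
    if (spread <= 0) return true; // assumption: use the whole corpus
    return ((sentence_index + pass * spread) % 100) < spread;
}

// Assumed checkpoint rule: save after every `checkpoint`-th pass.
bool is_checkpoint_pass(int pass, int checkpoint)
{
    return checkpoint > 0 && (pass + 1) % checkpoint == 0;
}

// List the passes in [startpass, passes) on which a save would occur.
std::vector<int> checkpoint_passes(int passes, int startpass, int checkpoint)
{
    std::vector<int> out;
    for (int pass = startpass; pass < passes; ++pass) {
        if (is_checkpoint_pass(pass, checkpoint)) out.push_back(pass);
    }
    return out;
}
```

With these assumed rules, a `spread` of 25 uses a quarter of the sentences on each pass and returns to the same quarter every four passes, which trades accuracy per pass for faster passes.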