Overview
This is a reimplementation of the Berkeley Parser. Some of the class files
are copied from the Berkeley Parser with possible changes to class
names and contents. The original purpose of the reimplementation was
to develop ideas that cannot be easily implemented on top of the
original Berkeley Parser. The parser has subsequently evolved with
several enhancements to improve parsing performance. Here are some
highlights of the parser:
- Training is parallelized to support multi-threading;
- Parsing is parallelized and supports n-best extraction, constrained and unconstrained Viterbi decoding, parsing score reporting, and parsing with multiple grammars through a product model;
- A featured lexical model is included to 1) alleviate over-fitting via regularization, 2) handle out-of-vocabulary (OOV) words using a featured OOV model P(POS_tag|word), and 3) exploit lexical features for grammar induction using a featured latent lexical model P(latent_tag|word,POS_tag).
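
To illustrate the product-model idea from the highlights above, here is a minimal sketch in Java. It is not taken from the parser's source: the class and method names (ProductModelSketch, combine, bestCandidate) are hypothetical, and it makes the simplifying assumption that every grammar assigns a log score to each tree in a shared candidate list. The parser's real product model may combine grammars differently, for example inside the chart rather than over a candidate list. Multiplying probabilities across grammars is the same as summing their log scores, so the sketch sums in log space.

```java
/**
 * Hypothetical sketch of combining several grammars through a product model.
 * Assumption: logScores[g][c] is the log score that grammar g assigns to
 * candidate tree c, and all grammars score the same candidate list.
 */
public final class ProductModelSketch {
    private ProductModelSketch() {}

    /** Sums the per-grammar log scores of each candidate (a product in probability space). */
    public static double[] combine(double[][] logScores) {
        if (logScores.length == 0) {
            throw new IllegalArgumentException("need at least one grammar");
        }
        int numCandidates = logScores[0].length;
        double[] combined = new double[numCandidates];
        for (double[] grammarScores : logScores) {
            if (grammarScores.length != numCandidates) {
                throw new IllegalArgumentException("all grammars must score the same candidates");
            }
            for (int c = 0; c < numCandidates; c++) {
                combined[c] += grammarScores[c];
            }
        }
        return combined;
    }

    /** Returns the index of the candidate with the highest combined log score. */
    public static int bestCandidate(double[][] logScores) {
        double[] combined = combine(logScores);
        if (combined.length == 0) {
            throw new IllegalArgumentException("need at least one candidate");
        }
        int best = 0;
        for (int c = 1; c < combined.length; c++) {
            if (combined[c] > combined[best]) {
                best = c;
            }
        }
        return best;
    }
}
```

With two grammars and three candidates, a candidate that neither grammar ranks worst can win over each grammar's individual favorite, which is the intuition behind combining multiple grammars.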