Parsing Engine

Dan Bikel’s Parsing Engine

This collection of packages provides a framework for an extensible statistical parsing engine.


Packages
danbikel.lisp Provides classes to create, read and manipulate symbolic expressions (S-expressions), including interned symbols.
danbikel.parser Provides the core framework of this extensible statistical parsing engine.
danbikel.parser.arabic Provides language-specific classes necessary to parse Arabic.
danbikel.parser.chinese Provides language-specific classes necessary to parse Chinese.
danbikel.parser.constraints Provides interfaces and classes to allow constrained parsing.
danbikel.parser.english Provides language-specific classes necessary to parse English.
danbikel.parser.lang Provides default abstract base classes for the required interfaces of a language package.
danbikel.parser.ms Default package for model structure classes (subclasses of ProbabilityStructure).
danbikel.parser.util Utility classes for displaying and manipulating parse trees.
danbikel.switchboard Provides classes to implement a distributed client-server environment, with a central switchboard responsible for assigning clients to servers and for doling out objects to clients for processing.
danbikel.util Provides some basic utility classes.
danbikel.util.proxy Contains various InvocationHandler objects with static factory methods to provide proxy instances.


This collection of packages provides a framework for an extensible statistical parsing engine. Unlike previous approaches, the framework encapsulates all language- and Treebank-specific information in a separate module that is specified at run-time, which makes the engine itself largely language-independent.
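The idea of a language-specific module chosen at run-time can be illustrated with a minimal Java sketch. This is not the engine's actual API: the `LanguagePackage` interface, the `EnglishPackage` class, and the `demo.language.package` property are hypothetical names invented for illustration. The sketch relies only on standard reflection (`Class.forName` and `getDeclaredConstructor().newInstance()`) to show how code can use a class it never names at compile-time.

```java
public class LanguagePackageDemo {

    /** Hypothetical contract that every language package would fulfil. */
    public interface LanguagePackage {
        String language();
        boolean isPunctuation(String tag);
    }

    /** Hypothetical English package; a real one could live in a separate jar. */
    public static class EnglishPackage implements LanguagePackage {
        public String language() { return "english"; }
        public boolean isPunctuation(String tag) {
            return tag.equals(",") || tag.equals(".") || tag.equals(":");
        }
    }

    /**
     * Loads a language package by class name. The caller only knows the
     * interface; the concrete class is resolved at run-time.
     */
    public static LanguagePackage load(String className)
            throws ReflectiveOperationException {
        Class<? extends LanguagePackage> cls =
            Class.forName(className).asSubclass(LanguagePackage.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // Hypothetical property name; falls back to the bundled English package.
        String name = System.getProperty("demo.language.package",
                                         EnglishPackage.class.getName());
        LanguagePackage pkg = load(name);
        System.out.println(pkg.language() + " " + pkg.isPunctuation(","));
    }
}
```

In this sketch, supporting another language means supplying another class that implements the interface and passing its name with `-Ddemo.language.package=...`; `load` and `main` stay unchanged and need no recompilation.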

A quick note on extensible versus portable: this parser is extensible to new Treebanks and/or languages, as opposed to being portable. The difference lies in what the developer must do to make the parsing engine work with a new Treebank and/or language, and in whether the changes are recognized at compile-time or run-time. A portable engine implies that the developer would need to change the engine itself so that it works in the new domain, and then recompile the entire engine to get a language- or Treebank-specific version. This extensible engine, on the other hand, requires only that the developer provide an additional set of classes (a language package) to be loaded at run-time, without having to touch the source code of the engine itself or recompile it. Other ways in which this framework is extensible include, but are not limited to,



Author: Dan Bikel.