Parsing Engine

danbikel.parser
Class Settings

java.lang.Object
  extended by danbikel.parser.Settings
All Implemented Interfaces:
Serializable

public class Settings
extends Object
implements Serializable

Provides static settings for this package, primarily via an internal Properties object. All recognized properties of this package and the supplied language packages are provided as publicly-accessible constants.

A settings file for a particular language must provide the property headTablePrefix + language.
A settings file for a particular language should normally also provide the property fileEncodingPrefix + language to override the default file encoding as determined by the locale of the Java VM. Settings files for a particular language and/or Treebank may contain any other settings required by a language package.

Variable expansion is performed on property values as in Java security policy files, with the additional provision that properties defined earlier in a settings file can be used as variable names in subsequent lines of the settings file. See Text.expandVars(Properties,StringBuffer) for what variables are allowed in the definitions of property values.
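
As a rough illustration of the expansion rule described above, the following sketch expands ${name} references in definition order, so that a property defined earlier can be used in later values. It is not the code in Text.expandVars. The class ExpansionSketch, its methods, the property my.data.dir, the file name head-rules.lisp, and the choice to leave unresolved names untouched are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExpansionSketch {
  // Matches ${name}, the variable syntax used in Java security policy files.
  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  /** Expands ${name} using earlier definitions first, then system properties. */
  static String expand(String value, Map<String, String> earlier) {
    Matcher m = VAR.matcher(value);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      String name = m.group(1);
      String replacement = earlier.get(name);
      if (replacement == null) {
        replacement = System.getProperty(name);
      }
      if (replacement == null) {
        replacement = m.group(); // assumption: unknown variables are left as-is
      }
      m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  /**
   * Expands name/value pairs in the order given. Ordered pairs are used because
   * java.util.Properties does not preserve the order of lines in a file.
   */
  static Map<String, String> expandAll(String[][] pairs) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String[] pair : pairs) {
      result.put(pair[0], expand(pair[1], result));
    }
    return result;
  }

  public static void main(String[] args) {
    String[][] pairs = {
      {"my.data.dir", "${user.home}/data"},                        // hypothetical property
      {"parser.headtable.english", "${my.data.dir}/head-rules.lisp"} // hypothetical value
    };
    expandAll(pairs).forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```

The second definition resolves only because my.data.dir appears before it; reversing the two lines would leave ${my.data.dir} unexpanded in this sketch.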

Upon initialization, this class attempts to read default parser settings from the file settings inside the default settings directory, $HOME/.db-parser, where $HOME is the user's home directory, as defined by the system property user.home. If either the default settings directory or the default settings file is missing, this class will use fallback default settings from a resource that is bundled with this package.
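
The lookup-then-fallback behavior described above might be pictured as follows. This is a minimal sketch, not the class's actual initialization code: DefaultSettingsSketch and its methods are hypothetical, and an in-memory stream stands in for the bundled resource, whose name is not given here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class DefaultSettingsSketch {
  /** Builds $HOME/.db-parser/settings from the given user.home value. */
  static Path defaultSettingsFile(String userHome) {
    return Paths.get(userHome, ".db-parser", "settings");
  }

  /** Opens the settings file if it exists; otherwise returns the fallback stream. */
  static InputStream open(Path settingsFile, InputStream fallback) {
    // A missing directory and a missing file are both covered by this one check.
    if (!Files.isRegularFile(settingsFile)) {
      return fallback;
    }
    try {
      return Files.newInputStream(settingsFile);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path file = defaultSettingsFile(System.getProperty("user.home"));
    // Placeholder for the bundled defaults resource (an assumption for this sketch).
    InputStream fallback = new ByteArrayInputStream(
        "parser.language=english\n".getBytes(StandardCharsets.UTF_8));
    try (InputStream in = open(file, fallback)) {
      System.out.println(in == fallback ? "using fallback defaults" : "using " + file);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```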

To obtain a default settings file as a template for modification, run the main(java.lang.String[]) method of this class.

See Also:
headTablePrefix, fileEncodingPrefix, Language.encoding(), settingsDirOverride, settingsFileOverride, Serialized Form

Field Summary
static String addGapInfo
          Property to specify whether Training.addGapInformation(Sexp) threads gap information or simply leaves the training trees untouched.
static String baseNPsCannotContainVerbs
          The property to specify whether the containsVerb predicate should have an additional base case where it should simply return false for NPB nodes.
static String chartItemClass
          The property to specify the fully-qualified name of the subclass of Item to be used for chart items.
static String collinsDeficientEstimation
          The property to specify whether to perform deficient estimation of probabilities (as per Mike Collins' bug in his thesis parser).
static String collinsNPPruneHack
          The property to specify whether the chart should add 3 in natural log-space to the beam width for chart items whose root labels are either NP or NP-A, as is done by Collins' parser.
static String collinsRelabelHeadChildrenAsArgs
          The property to specify whether Training.identifyArguments(Sexp) should relabel head children as arguments.
static String collinsRepairBaseNPs
          The property to specify whether Training.repairBaseNPs(Sexp) alters the training tree or leaves it untouched.
static String collinsSkipWSJSentences
          The property to specify whether certain sentences are skipped during training on sections 02 through 21 of the Penn Treebank Wall Street Journal corpus in order to mimic Mike Collins' trainer on this now-standard training corpus.
static String constraintSetFactoryClass
          The property to specify the fully-qualified classname of the ConstraintSetFactory object to be used by the ConstraintSets static class.
static String countThreshold
          The property to specify the threshold below which TrainerEvent objects are discarded by the trainer.
static String decoderCellLimit
          The property to specify the limit on the number of chart items the decoder will have per cell in its chart.
static String decoderClass
          The property to specify the fully-qualified class name of the Decoder instance to be created for use by Parser and EMParser classes (and any other subclass of Parser).
static String decoderLocalCacheSize
          The property to specify the size of the cache used by the CachingDecoderServer instance used by the decoder when the decoderUseLocalProbabilityCache property is true.
static String decoderMaxPruneFactor
          The property to specify the maximum prune factor when performing beam-widening.
static String decoderOutputHeadLexicalizedLabels
          The property to specify whether node labels in trees output by the decoder include their lexical head information, which is normally only used internally by the decoder.
static String decoderPruneFactor
          The property to specify the factor by which the decoder should prune away chart entries.
static String decoderPruneFactorIncrement
          The property to specify the increment used when the decoder does beam-widening.
static String decoderRelaxConstraintsAfterBeamWidening
          The property to specify whether the decoder should relax all hard constraints (except the comma pruning rule, which is controlled by the decoderUseCommaConstraint setting) after performing all beam widening.
static String decoderServerClass
          The property to specify the fully-qualified class name of the DecoderServerRemote instance to be created for use by Parser and EMParser classes (and any other subclass of Parser).
static String decoderSubstituteWordsForClosedClassTags
          The property to specify whether the decoder should substitute a known word when the only tag for an unknown word is closed-class (i.e., the tag was never observed with the unknown word during training).
static String decoderUseCellLimit
          The property to specify whether the decoder should impose a limit on the number of chart items per cell in the chart.
static String decoderUseCommaConstraint
          The property to specify whether the decoder should employ a constraint on the way commas can appear in and around chart items.
static String decoderUseHeadToParentMap
          The property to specify whether the decoder should use the head-to-parent map derived during training.
static String decoderUseLocalProbabilityCache
          The property to specify whether the decoder should wrap its DecoderServerRemote instance with an instance of CachingDecoderServer, which caches probability lookups.
static String decoderUseOnlySuppliedTags
          The property to specify whether the decoder should only use the tags supplied with words in an input file when seeding the chart.
static String decoderUsePruneFactor
          The property to specify whether or not the decoding algorithm should prune away chart entries within a particular factor of the top-ranked chart entry in a given cell.
static String defaultModelClass
          The property to specify the default Model class to be created around ProbabilityStructure objects when their ProbabilityStructure.newModel() method is invoked.
static String derivedCountThreshold
          The property to specify the threshold below which Event objects are discarded by the databases contained within Model objects.
static String dontAddNewParams
          Indicates whether instances of Model, when using smoothing parameters from a previous training run, should not add new parameters when deriving counts.
static String downcaseWords
          The property to specify whether words are downcased during training and decoding.
static String fileEncodingPrefix
          The prefix string used to specify a language's file encoding property.
static String gapModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the gap-generation submodel.
static String gapModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the gap-generation submodel.
static String globalModelStructureNumber
          The property to specify the model structure number to use when creating ProbabilityStructure objects.
static String headFinderClass
          The property to specify the fully-qualified name of the class that extends HeadFinder in a language package.
static String headFinderRandomProb
          The property to specify a probability that the method AbstractHeadFinder.defaultFindHead(Symbol,SexpList) should return a randomly-selected head-child index.
static String headFinderWarnDefaultRule
          The property to specify whether the method AbstractHeadFinder.defaultFindHead(Symbol,SexpList) issues a warning whenever it needs to use the default head-finding rule.
static String headModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the head-generation submodel.
static String headModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the head-generation submodel.
static String headTablePrefix
          The prefix string used to specify a language's head table property.
static String kBest
          The property to specify the maximum number of top-scoring theories to give as a parse.
static String keepAliveInterval
          The property to specify how often clients and servers should ping the "keep-alive" socket connected to the switchboard.
static String keepAliveMaxRetries
          The property to specify at most how many times the switchboard attempts to contact clients and servers before considering them dead (after an initial failure, thus making 0 a legal value for this property).
static String keepAllWords
          The property to specify whether or not the trainer keeps all words.
static String keepLowFreqTags
          The property to specify whether the trainer includes low-frequency words in its part of speech map.
static String language
          The property to specify the language to be parsed.
static String languagePackage
          The property to specify the language package to be used.
static String leftSubcatModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the left subcat-generation submodel.
static String leftSubcatModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the left-subcat-generation submodel.
static String lexPriorModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the lexical prior submodel.
static String lexPriorModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the lexical prior submodel.
static String maxEventChunkSize
          The property to specify how many events the trainer should read from an observations file before deriving counts (for use only when using a trainer output file; see Trainer.main(String[])).
static String maxParseTime
          The property to specify the maximum time, in milliseconds, that the decoder will attempt to deliver a parse on a sentence.
static String maxSentLen
          The property to specify the maximum length a sentence can be; sentences longer than this will not be parsed.
static String modelDoPruning
          The property to specify whether to prune redundant parameters from every Model instance.
static String modelPruningThreshold
          The property to specify the pruning threshold when pruning is performed (ignored if modelDoPruning is false).
static String modelStructurePackage
          The property to specify the package for model structure classes.
static String modNonterminalModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the modifying nonterminal-generation submodel.
static String modNonterminalModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the modifying nonterminal-generation submodel.
static String modWordModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the modifying word-generation submodel.
static String modWordModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the modifying word-generation submodel.
static String nonterminalPriorModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the nonterminal prior submodel.
static String nonterminalPriorModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the nonterminal prior submodel.
static String numPrevMods
          The property to specify how many previous modifiers the trainer outputs for its top-level count files.
static String numPrevWords
          The property to specify how many head words of previous modifiers the trainer outputs for its top-level count files.
static String outputCollins
          The property to specify whether the trainer outputs top-level events in the format output by Michael Collins' trainer to System.out when training on a Treebank input file.
static String outputHeadToParentMap
          The property to specify whether the trainer should output the head-to-parent nonterminal map that it derives from its top-level observations.
static String outputModNonterminalMap
          The property to specify whether the trainer should output the modifying nonterminal map that it derives from its top-level observations.
static String outputSubcatMaps
          The property to specify whether the trainer should output the subcat maps that it derives from its top-level observations.
static String precomputeProbs
          The property to specify whether or not to pre-compute probabilities when training and use those pre-computed probabilities when decoding.
static String prevModMapperClass
          The property to specify the concrete type of the NonterminalMapper instance used by NTMapper to map nonterminals that are previously-generated modifiers of some head child nonterminal.
static String progName
          The official name of this program.
static String restorePrunedWords
          The property to specify that the decoder restores all pruned words.
static String rightSubcatModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the right subcat-generation submodel.
static String rightSubcatModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the right-subcat-generation submodel.
static String saveSmoothingParams
           
static String sbSocketTimeout
          The property to specify how long, in milliseconds, the SO_TIMEOUT value should be for the switchboard's RMI-client (caller) sockets.
static String sbUserSBMaxRetries
          The property to specify at most how many times switchboard users should try to acquire the switchboard from the bootstrap registry before giving up, either when first starting up or in the event of a switchboard crash.
static String sbUserTimeout
          The property to specify how long (in milliseconds) sockets stay alive on the client (switchboard) side for RMI calls to switchboard user objects (subclasses of AbstractSwitchboardUser).
static String serverDeathKillClients
          The property to specify whether the switchboard should kill all of a server's clients when it detects that the server has died.
static String serverFailover
          The property to specify whether parsing clients should have server failover; that is, whether they should request a new server from the switchboard if their current server fails.
static String serverMaxRetries
          The property to specify at most how many times parsing clients should re-try their servers in the event of a method failure before giving up.
static String serverRetrySleep
          The property to specify how many milliseconds to sleep between server re-tries.
static String settingsDirOverride
          The name of the property to override the location of the default settings directory, to be specified at run-time on the command line.
static String settingsFileOverride
          The name of the property to override the name of the default settings file, which is <defaultSettingsDir>/settings, where <defaultSettingsDir> is the default settings directory, as described in the documentation for settingsDirOverride.
static String shifterClass
          The property to specify the fully-qualified classname of the Shift object to be used by the Shifter static class.
static String smoothingParamsDir
          The property to specify the directory from which Model objects are to read smoothing parameters files.
static String subcatFactoryClass
          The property to specify the fully-qualified classname of the SubcatFactory object to be used by the Subcats static factory class.
static String topLexModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the head word-generation submodel for head words of entire sentences.
static String topLexModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the head word-generation submodel for heads of entire sentences.
static String topNonterminalModelStructureClass
          The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the head-generation submodel for heads whose parents are Training.topSym().
static String topNonterminalModelStructureNumber
          The property to specify the model structure number to use when creating the ProbabilityStructure object for the head-generation submodel for heads whose parents are Training.topSym().
static String trainerReportingInterval
          The property to specify the interval (in number of sentences) at which the trainer emits reports to System.err when training.
static String trainerShareCounts
          The property to indicate whether the trainer should share counts among various models' back-off levels.
static String trainingClass
          The property to specify the fully-qualified name of the class that extends Training in a language package.
static String treebankClass
          The property to specify the fully-qualified name of the class that extends Treebank in a language package.
static String unknownWordThreshold
          The property to specify the threshold below which words are considered unknown by the trainer.
static String useLowFreqTags
          The property to specify whether to use tags collected from low-frequency words by the trainer when seeding the chart, if the current word is a low-frequency word observed when training.
static String useSimpleModNonterminalMap
          The property to specify whether the decoder uses ModelCollection.simpleModNonterminalMap.
static String useSmoothingParams
          Indicates whether instances of Model should use smoothing parameters saved to a file from a previous training run, instead of deriving smoothing parameters.
static String version
          The official version of this program.
static String wordFactoryClass
          The property to specify the fully-qualified classname of the WordFactory object to be used by the Words static factory class.
static String wordFeaturesClass
          The property to specify the fully-qualified name of the class that extends WordFeatures in a language package.
static String writeCanonicalEvents
          The property to specify whether or not the ModelCollection class should write out the large hash map containing canonical versions of Event objects when it is serialized (that is, saved to a file).
 
Method Summary
static String get(String name)
          Gets the value of the specified property.
static boolean getBoolean(String setting)
          Returns the boolean value of the specified setting, as determined by Boolean.valueOf(String).
static boolean getBooleanProperty(String property, boolean defaultValue)
          Returns the boolean value of the specified property, or the specified default value if the specified property does not exist.
static InputStream getDefaultsResource()
          Gets the fallback defaults from the bundled resource, throwing an exception if the resource is unavailable (which is a very bad situation).
static double getDouble(String setting)
          Returns the double value of the specified setting, as determined by Double.parseDouble(String).
static InputStream getFileOrResourceAsStream(Class cl, String name)
          Attempts to locate the file or resource with the specified name in one of three places: (1) as a file path relative to the default settings directory; (2) as a file path relative to the current working directory, or relative to nothing if name is an absolute path; (3) as a resource obtained from the class loader of the specified class. The default settings directory is described in the documentation for settingsDirOverride.
static int getInteger(String setting)
          Returns the integer value of the specified setting, as determined by Integer.parseInt(String).
static int getIntProperty(String property, int defaultValue)
          Returns the integer value of the specified property, or the specified default value if the specified property does not exist.
static Properties getSettings()
          Returns a deep copy of the internal Properties object.
static void load(File file)
          Loads the properties from the specified file, using load(InputStream).
static void load(InputStream is)
          Loads the properties from the specified input stream, using Properties.load(InputStream).
static void load(String filename)
          Loads the properties from the file of the specified filename, using load(File).
static void main(String[] args)
          Prints the default settings contained in the resource supplied with this parsing software.
static void set(String name, String value)
          Sets the property name to value, using Properties.setProperty(String,String).
static void setSettings(Properties newSettings)
          Allows any class to set the settings of this class directly using the specified Properties object.
static void store(ObjectOutputStream os)
          Stores the properties of this class to the specified output stream.
static void store(OutputStream os)
          Stores the properties of this class to the specified output stream, using Properties.store(OutputStream,String).
static void store(OutputStream os, String header)
          Stores the properties of this class to the specified output stream, using Properties.store(OutputStream,String).
static void storeSorted(OutputStream os)
          Stores a sorted list of the settings and values of this class to the specified output stream.
static void storeSorted(OutputStream os, String header)
          Stores a sorted list of the property-value pairs contained in this class to the specified output stream using the specified header.
static void storeSorted(Properties props, OutputStream os)
          Stores a sorted list of the specified property-value pairs to the specified output stream.
static void storeSorted(Properties props, OutputStream os, String header)
          Stores a sorted list of the specified container of property-value pairs to the specified output stream using the specified header.
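
The typed getters summarized above (getBooleanProperty and getIntProperty) return a caller-supplied default when a property does not exist. The sketch below shows that pattern over a plain java.util.Properties object. It is an assumption about the general shape, not the source of this class; TypedGetterSketch and the property name example.int.setting are hypothetical.

```java
import java.util.Properties;

final class TypedGetterSketch {
  /** Returns the boolean value of the property, or defaultValue if it is absent. */
  static boolean getBooleanProperty(Properties props, String name, boolean defaultValue) {
    String value = props.getProperty(name);
    // Boolean.valueOf yields true only for "true" (ignoring case).
    return value == null ? defaultValue : Boolean.valueOf(value);
  }

  /** Returns the integer value of the property, or defaultValue if it is absent. */
  static int getIntProperty(Properties props, String name, int defaultValue) {
    String value = props.getProperty(name);
    // Integer.parseInt throws NumberFormatException for a malformed value.
    return value == null ? defaultValue : Integer.parseInt(value);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("parser.model.precomputeProbabilities", "true");
    System.out.println(
        getBooleanProperty(props, "parser.model.precomputeProbabilities", false));
    // "example.int.setting" is a made-up name, so the default is returned.
    System.out.println(getIntProperty(props, "example.int.setting", 7));
  }
}
```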
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

progName

public static final String progName
The official name of this program.

See Also:
Constant Field Values

version

public static final String version
The official version of this program.

See Also:
Constant Field Values

settingsFileOverride

public static final String settingsFileOverride
The name of the property to override the name of the default settings file, which is <defaultSettingsDir>/settings, where <defaultSettingsDir> is the default settings directory, as described in the documentation for settingsDirOverride.
Note that using this setting to change the default settings file does not change the default settings directory, which is used by getFileOrResourceAsStream(Class,String).

See Also:
Constant Field Values

settingsDirOverride

public static final String settingsDirOverride
The name of the property to override the location of the default settings directory, to be specified at run-time on the command line. The default settings directory is $HOME/.db-parser, where $HOME is the user's home directory, as defined by the system property user.home.
The value of this constant is "parser.settingsDir".

Example UNIX usage:

     java -Dparser.settingsDir=/tmp ...
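
One way to picture how such a command-line override could interact with the default location is sketched below. SettingsDirSketch and its method are hypothetical and only illustrate the rule stated above: use parser.settingsDir if it is set, otherwise $HOME/.db-parser.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

final class SettingsDirSketch {
  /** Resolves the settings directory from the given system properties. */
  static Path settingsDir(Properties systemProps) {
    String override = systemProps.getProperty("parser.settingsDir");
    if (override != null) {
      return Paths.get(override); // set with -Dparser.settingsDir=... on the command line
    }
    return Paths.get(systemProps.getProperty("user.home"), ".db-parser");
  }

  public static void main(String[] args) {
    // Prints the override if -Dparser.settingsDir was given, else the default directory.
    System.out.println(settingsDir(System.getProperties()));
  }
}
```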
 

See Also:
Constant Field Values

modelStructurePackage

public static final String modelStructurePackage
The property to specify the package for model structure classes.

See Also:
globalModelStructureNumber, Constant Field Values

language

public static final String language
The property to specify the language to be parsed.

The value of this constant is "parser.language".

See Also:
Language, Constant Field Values

languagePackage

public static final String languagePackage
The property to specify the language package to be used.

The value of this constant is "parser.language.package".

See Also:
Language, Constant Field Values

wordFeaturesClass

public static final String wordFeaturesClass
The property to specify the fully-qualified name of the class that extends WordFeatures in a language package. If this property is set, it will override the default, which is
Settings.get(Settings.languagePackage) + ".WordFeatures"
The value of this constant is "parser.language.wordFeatures".
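
The override rule above (shared by treebankClass, headFinderClass and trainingClass) can be sketched as a small lookup. This is an illustration under assumptions rather than the code used by Language: ClassNameSketch is hypothetical, and com.example.lang is a made-up package name.

```java
import java.util.Properties;

final class ClassNameSketch {
  /**
   * Returns the class name given by overrideKey if that property is set;
   * otherwise the language package followed by "." and simpleName.
   */
  static String resolve(Properties props, String overrideKey, String simpleName) {
    String override = props.getProperty(overrideKey);
    if (override != null) {
      return override;
    }
    return props.getProperty("parser.language.package") + "." + simpleName;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("parser.language.package", "com.example.lang"); // made-up package
    // No override set, so the default name is built from the language package.
    System.out.println(resolve(props, "parser.language.wordFeatures", "WordFeatures"));
  }
}
```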

See Also:
Language, Constant Field Values

treebankClass

public static final String treebankClass
The property to specify the fully-qualified name of the class that extends Treebank in a language package. If this property is set, it will override the default, which is
Settings.get(Settings.languagePackage) + ".Treebank"
The value of this constant is "parser.language.treebank".

See Also:
Language, Constant Field Values

headFinderClass

public static final String headFinderClass
The property to specify the fully-qualified name of the class that extends HeadFinder in a language package. If this property is set, it will override the default, which is
Settings.get(Settings.languagePackage) + ".HeadFinder"
The value of this constant is "parser.language.headFinder".

See Also:
Language, Constant Field Values

trainingClass

public static final String trainingClass
The property to specify the fully-qualified name of the class that extends Training in a language package. If this property is set, it will override the default, which is
Settings.get(Settings.languagePackage) + ".Training"
The value of this constant is "parser.language.training".

See Also:
Language, Constant Field Values

headTablePrefix

public static final String headTablePrefix
The prefix string used to specify a language's head table property. For example, for English, the head table filename is available by calling
   Settings.get(Settings.headTablePrefix + "english");
 
The value of this constant is "parser.headtable.".

See Also:
HeadFinder, Constant Field Values

subcatFactoryClass

public static final String subcatFactoryClass
The property to specify the fully-qualified classname of the SubcatFactory object to be used by the Subcats static factory class.

See Also:
Subcat, Subcats, SubcatFactory, Constant Field Values

wordFactoryClass

public static final String wordFactoryClass
The property to specify the fully-qualified classname of the WordFactory object to be used by the Words static factory class.

See Also:
Word, Words, WordFactory, Constant Field Values

shifterClass

public static final String shifterClass
The property to specify the fully-qualified classname of the Shift object to be used by the Shifter static class.

See Also:
Shift, Shifter, DefaultShifter, Constant Field Values

constraintSetFactoryClass

public static final String constraintSetFactoryClass
The property to specify the fully-qualified classname of the ConstraintSetFactory object to be used by the ConstraintSets static class.

See Also:
ConstraintSets, ConstraintSet, Constraint, Constant Field Values

baseNPsCannotContainVerbs

public static final String baseNPsCannotContainVerbs
The property to specify whether the containsVerb predicate should have an additional base case where it should simply return false for NPB nodes.

See Also:
Constant Field Values

fileEncodingPrefix

public static final String fileEncodingPrefix
The prefix string used to specify a language's file encoding property. For example, for Chinese, the file encoding is available by calling
  Settings.get(Settings.fileEncodingPrefix + "chinese");
 
The value of this constant is "parser.file.encoding.".
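
A per-language encoding lookup with a fallback to the JVM default might look like the sketch below. EncodingSketch is hypothetical and uses a plain Properties object in place of this class; how Language.encoding() actually resolves the value is not shown here.

```java
import java.nio.charset.Charset;
import java.util.Properties;

final class EncodingSketch {
  /** Returns the charset configured for the language, or the JVM default if none is set. */
  static Charset encodingFor(Properties props, String language) {
    String name = props.getProperty("parser.file.encoding." + language);
    // Charset.forName throws an exception for an illegal or unsupported name.
    return name == null ? Charset.defaultCharset() : Charset.forName(name);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("parser.file.encoding.chinese", "UTF-8"); // illustrative value
    System.out.println(encodingFor(props, "chinese"));
    System.out.println(encodingFor(props, "english")); // not set: JVM default charset
  }
}
```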

See Also:
Language.encoding(), Constant Field Values

decoderClass

public static final String decoderClass
The property to specify the fully-qualified class name of the Decoder instance to be created for use by Parser and EMParser classes (and any other subclass of Parser).

The value of this constant is "parser.parser.decoderClass".

See Also:
Parser.getNewDecoder(int,DecoderServerRemote), Constant Field Values

decoderServerClass

public static final String decoderServerClass
The property to specify the fully-qualified class name of the DecoderServerRemote instance to be created for use by Parser and EMParser classes (and any other subclass of Parser). This property is used when the Parser/EMParser class instance is asked to create and/or use its own, internal server.

The value of this constant is "parser.parser.decoderServerClass".

See Also:
Parser.getNewDecoderServer(String), Constant Field Values

defaultModelClass

public static final String defaultModelClass
The property to specify the default Model class to be created around ProbabilityStructure objects when their ProbabilityStructure.newModel() method is invoked.

The value of this constant is "parser.probabilityStructure.defaultModelClass".

See Also:
ProbabilityStructure.newModel(), Constant Field Values

precomputeProbs

public static final String precomputeProbs
The property to specify whether or not to pre-compute probabilities when training and use those pre-computed probabilities when decoding.

The value of this constant is "parser.model.precomputeProbabilities".

See Also:
Constant Field Values

collinsDeficientEstimation

public static final String collinsDeficientEstimation
The property to specify whether to perform deficient estimation of probabilities (as per Mike Collins' bug in his thesis parser). Specifically, if this property is true, then all deleted-interpolation probabilities are estimated using a lambda weight for the final level of back-off, as per the formula
λ_1 e_1 + (1 − λ_1) ⋅ (λ_2 e_2 + (1 − λ_2) ⋅ λ_3 e_3)
where λ_i is the smoothing weight for back-off level i and e_i is an estimate for back-off level i. If this property is false, then the probability is estimated in the correct fashion:
λ_1 e_1 + (1 − λ_1) ⋅ (λ_2 e_2 + (1 − λ_2) ⋅ e_3)

The value of this constant is "parser.model.collinsDeficientEstimation".
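
To make the difference between the two formulas above concrete, the following sketch evaluates both for three back-off levels. InterpolationSketch and the sample weights and estimates are illustrative assumptions; the parser's own Model code is not reproduced here.

```java
final class InterpolationSketch {
  /** Correct form: the final back-off estimate e[2] is not weighted by a lambda. */
  static double correct(double[] lambda, double[] e) {
    return lambda[0] * e[0]
        + (1 - lambda[0]) * (lambda[1] * e[1] + (1 - lambda[1]) * e[2]);
  }

  /** Deficient form: the final back-off estimate is additionally scaled by lambda[2]. */
  static double deficient(double[] lambda, double[] e) {
    return lambda[0] * e[0]
        + (1 - lambda[0]) * (lambda[1] * e[1] + (1 - lambda[1]) * lambda[2] * e[2]);
  }

  public static void main(String[] args) {
    double[] lambda = {0.5, 0.5, 0.5}; // made-up smoothing weights
    double[] e = {0.8, 0.4, 0.2};      // made-up estimates per back-off level
    System.out.println("correct   = " + correct(lambda, e));
    System.out.println("deficient = " + deficient(lambda, e));
  }
}
```

With every estimate set to 1, the correct form returns 1 for any weights, while the deficient form returns less than 1 whenever all three weights are below 1, which is why the estimation is called deficient.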

See Also:
Constant Field Values

saveSmoothingParams

public static final String saveSmoothingParams
See Also:
Constant Field Values

dontAddNewParams

public static final String dontAddNewParams
Indicates whether instances of Model, when using smoothing parameters from a previous training run, should not add new parameters when deriving counts.

See Also:
Constant Field Values

useSmoothingParams

public static final String useSmoothingParams
Indicates whether instances of Model should use smoothing parameters saved to a file from a previous training run, instead of deriving smoothing parameters.

See Also:
Constant Field Values

smoothingParamsDir

public static final String smoothingParamsDir
The property to specify the directory from which Model objects are to read smoothing parameters files.

See Also:
Constant Field Values

modelDoPruning

public static final String modelDoPruning
The property to specify whether to prune redundant parameters from every Model instance.

See Also:
Constant Field Values

modelPruningThreshold

public static final String modelPruningThreshold
The property to specify the pruning threshold when pruning is performed (ignored if modelDoPruning is false). The value of this property should be (the string representation of) a double.

See Also:
modelDoPruning, Constant Field Values

prevModMapperClass

public static final String prevModMapperClass
The property to specify the concrete type of the NonterminalMapper instance used by NTMapper to map nonterminals that are previously-generated modifiers of some head child nonterminal.

See Also:
Constant Field Values

writeCanonicalEvents

public static final String writeCanonicalEvents
The property to specify whether or not the ModelCollection class should write out the large hash map containing canonical versions of Event objects when it is serialized (that is, saved to a file). When decoding using caches instead of precomputed probabilities (see precomputeProbs), the use of the canonical events table saves time by allowing the decoder to put canonical events observed during training into the caches, instead of always having to create a canonical events table anew during decoding. Accordingly, when precomputeProbs is set to false, the value of this property should usually be true, except when debugging. When precomputeProbs is false and the value of this property is also false, then the ModelCollection object used during training will simply write out an empty canonical events table, to be read in when the ModelCollection object is de-serialized just prior to decoding, meaning that as events are cached, they will need to be copied on the fly to the canonical events table. Finally, when precomputeProbs is true, this property is ignored.

The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.modelCollection.writeCanonicalEvents".

See Also:
precomputeProbs, ModelCollection, Constant Field Values
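
The interaction between precomputeProbs and writeCanonicalEvents described above can be summarized in a short sketch. The enum and method names are hypothetical, and treating an unset property as false is an assumption of this sketch (Boolean.valueOf returns false for a null string), not a statement about the parser's actual defaults.

```java
import java.util.Properties;

final class CanonicalEventsSketch {
    /** What the description says happens to the canonical events table at serialization. */
    enum TableOutput { IGNORED, FULL_TABLE, EMPTY_TABLE }

    static TableOutput tableOutput(Properties props) {
        // Boolean.valueOf yields true only for the string "true", ignoring case.
        boolean precompute =
            Boolean.valueOf(props.getProperty("parser.model.precomputeProbabilities"));
        boolean writeCanonical =
            Boolean.valueOf(props.getProperty("parser.modelCollection.writeCanonicalEvents"));
        if (precompute) {
            return TableOutput.IGNORED;      // writeCanonicalEvents has no effect
        }
        return writeCanonical ? TableOutput.FULL_TABLE : TableOutput.EMPTY_TABLE;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("parser.model.precomputeProbabilities", "false");
        props.setProperty("parser.modelCollection.writeCanonicalEvents", "true");
        System.out.println(tableOutput(props));
    }
}
```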

headFinderWarnDefaultRule

public static final String headFinderWarnDefaultRule
The property to specify whether the method AbstractHeadFinder.defaultFindHead(Symbol,SexpList) issues a warning whenever it needs to use the default head-finding rule. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.headfinder.warnDefaultRule".

See Also:
Constant Field Values

headFinderRandomProb

public static final String headFinderRandomProb
The property to specify a probability that the method AbstractHeadFinder.defaultFindHead(Symbol,SexpList) should return a randomly-selected head-child index. The value of this property should be (the string representation of) a double (conversion is performed by the method Double.parseDouble).
N.B.: Head-finding for NPs that are not NPBs is unaffected by this setting, meaning that heuristics are always used to find heads within non-NPB noun phrases. This is so that Training.addBaseNPs(Sexp) will always produce consistent results. The issue is that adding base NPs normally relies on head finding (see AbstractTraining.needToAddNormalNPLevel(Sexp,int,Sexp)).

The value of this constant is "parser.headfinder.randomProb".

See Also:
AbstractTraining.needToAddNormalNPLevel(Sexp,int,Sexp), Constant Field Values
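
A minimal sketch of what selecting a random head child with a given probability might look like follows. It is not the code of AbstractHeadFinder: the method name chooseHead, the zero-based child indexing and the uniform choice among children are all assumptions made for illustration.

```java
import java.util.Random;

final class RandomHeadSketch {
    /**
     * With probability randomProb returns a uniformly chosen child index in
     * [0, numChildren); otherwise returns the index found by the heuristics.
     */
    static int chooseHead(int heuristicIndex, int numChildren, double randomProb, Random rng) {
        if (rng.nextDouble() < randomProb) {
            return rng.nextInt(numChildren);
        }
        return heuristicIndex;
    }

    public static void main(String[] args) {
        // The property value is a string; Double.parseDouble converts it.
        double randomProb = Double.parseDouble("0.25");
        Random rng = new Random(42L);
        System.out.println(chooseHead(2, 5, randomProb, rng));
    }
}
```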

addGapInfo

public static final String addGapInfo
Property to specify whether Training.addGapInformation(Sexp) threads gap information or simply leaves the training trees untouched. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.training.addGapInfo".

See Also:
Training.addGapInformation(danbikel.lisp.Sexp), Constant Field Values

collinsRelabelHeadChildrenAsArgs

public static final String collinsRelabelHeadChildrenAsArgs
The property to specify whether Training.identifyArguments(Sexp) should relabel head children as arguments. Such relabeling is unnecessary, since head children are already inherently distinct from other children; however, it is performed (and is possibly a bug) in Collins' parser, and so is available as a setting here in order to emulate that model.

See Also:
Training.identifyArguments(danbikel.lisp.Sexp), Constant Field Values

collinsRepairBaseNPs

public static final String collinsRepairBaseNPs
The property to specify whether Training.repairBaseNPs(Sexp) alters the training tree or leaves it untouched. If the value of this property is false, then the tree is untouched. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.training.collinsRepairBaseNPs".

See Also:
Training.repairBaseNPs(danbikel.lisp.Sexp), Constant Field Values

trainerShareCounts

public static final String trainerShareCounts
The property to indicate whether the trainer should share counts among various models' back-off levels.

See Also:
Constant Field Values

unknownWordThreshold

public static final String unknownWordThreshold
The property to specify the threshold below which words are considered unknown by the trainer. The value of this property must be (the string representation of) an integer.

The value of this constant is "parser.trainer.unknownWordThreshold".

See Also:
Trainer, Constant Field Values

countThreshold

public static final String countThreshold
The property to specify the threshold below which TrainerEvent objects are discarded by the trainer. The value of this property must be (the string representation of) a floating-point number.

The value of this constant is "parser.trainer.countThreshold".

See Also:
Trainer, Constant Field Values

derivedCountThreshold

public static final String derivedCountThreshold
The property to specify the threshold below which Event objects are discarded by the databases contained within Model objects. The value of this property must be (the string representation of) a floating-point number.

The value of this constant is "parser.trainer.derivedCountThreshold".

See Also:
Trainer, Constant Field Values

trainerReportingInterval

public static final String trainerReportingInterval
The property to specify the interval (in number of sentences) at which the trainer emits reports to System.err when training. The value of this property must be (the string representation of) an integer.

The value of this constant is "parser.trainer.reportingInterval".

See Also:
Trainer, Constant Field Values

keepAllWords

public static final String keepAllWords
The property to specify whether or not the trainer keeps all words. Normally, words falling below a threshold are mapped to the unknown word. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.trainer.keepAllWords".

See Also:
unknownWordThreshold, Trainer, Constant Field Values
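
The combined effect of unknownWordThreshold and keepAllWords can be sketched as below. The unknown-word token "+UNKNOWN+", the method name mapWord and the use of a simple word-count map are assumptions made for this illustration; the trainer's actual data structures are not shown in this documentation.

```java
import java.util.HashMap;
import java.util.Map;

final class UnknownWordSketch {
    /** Hypothetical placeholder for the unknown word. */
    static final String UNKNOWN = "+UNKNOWN+";

    /** Maps a word to the unknown token if its count is below the threshold, unless all words are kept. */
    static String mapWord(String word, Map<String, Integer> counts,
                          int unknownWordThreshold, boolean keepAllWords) {
        int count = counts.getOrDefault(word, 0);
        if (!keepAllWords && count < unknownWordThreshold) {
            return UNKNOWN;
        }
        return word;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 10);
        counts.put("aardvark", 1);
        System.out.println(mapWord("aardvark", counts, 5, false));
        System.out.println(mapWord("aardvark", counts, 5, true));
    }
}
```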

keepLowFreqTags

public static final String keepLowFreqTags
The property to specify whether the trainer includes low-frequency words in its part of speech map. Normally, such low-frequency words get converted to word-feature vectors, and it is only those vector-tag pairs that get added to the part of speech map. If this property is set to be true, however, mappings from the original words to their parts of speech will also be added to the part of speech map.

The value of this constant is "parser.trainer.keepLowFreqTags".

See Also:
useLowFreqTags, Trainer, Constant Field Values

numPrevMods

public static final String numPrevMods
The property to specify how many previous modifiers the trainer outputs for its top-level count files. The value of this property must be (the string representation of) a non-negative integer.

The value of this constant is "parser.trainer.numPrevMods".

See Also:
Trainer, Constant Field Values

numPrevWords

public static final String numPrevWords
The property to specify how many head words of previous modifiers the trainer outputs for its top-level count files. The value of this property must be (the string representation of) a non-negative integer.

The value of this constant is "parser.trainer.numPrevWords".

See Also:
Trainer, Constant Field Values

globalModelStructureNumber

public static final String globalModelStructureNumber
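The property to specify the model structure number to use when creating ProbabilityStructure objects. The model structure number will be appended to the canonical class name prefix of each model structure. For example, the canonical class name prefix for the head generation model structure is HeadModelStructure, so if this property has a value of "1", then the head model structure instantiated by Trainer.deriveCounts() will be the class HeadModelStructure1.

The model structure numbers for specific model structure classes can be overridden by specifying one of the following model structure-specific properties: lexPriorModelStructureNumber, nonterminalPriorModelStructureNumber, topNonterminalModelStructureNumber, topLexModelStructureNumber, headModelStructureNumber, gapModelStructureNumber, leftSubcatModelStructureNumber, rightSubcatModelStructureNumber, modNonterminalModelStructureNumber, modWordModelStructureNumber.

If the user wishes to use model structure classes outside this package, the following properties may be used to specify fully-qualified model structure classnames, which will override all model structure number property settings (the canonical classname prefixes will not be used): lexPriorModelStructureClass, nonterminalPriorModelStructureClass, topNonterminalModelStructureClass, topLexModelStructureClass, headModelStructureClass, gapModelStructureClass, leftSubcatModelStructureClass, rightSubcatModelStructureClass, modNonterminalModelStructureClass, modWordModelStructureClass.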

The value of this constant is "parser.trainer.globalModelStructureNumber".

See Also:
Constant Field Values
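
The override order described for the model structure properties (explicit class name first, then the submodel-specific number, then the global number) can be sketched as follows. The helper resolve and its parameters are hypothetical, and the class name com.example.MyHeadStructure is invented for the example; the property names follow the "parser.trainer." pattern of the constant values documented on this page.

```java
import java.util.Properties;

final class ModelStructureNameSketch {
    static final String PROPERTY_PREFIX = "parser.trainer.";

    /**
     * Resolves the model structure class name for one submodel, e.g.
     * submodel "head" with class name prefix "danbikel.parser.HeadModelStructure".
     */
    static String resolve(Properties props, String submodel, String classNamePrefix) {
        // 1. A fully-qualified class name overrides all number settings.
        String explicitClass = props.getProperty(PROPERTY_PREFIX + submodel + "ModelStructureClass");
        if (explicitClass != null) {
            return explicitClass;
        }
        // 2. A submodel-specific number overrides the global number.
        String number = props.getProperty(PROPERTY_PREFIX + submodel + "ModelStructureNumber");
        if (number == null) {
            number = props.getProperty(PROPERTY_PREFIX + "globalModelStructureNumber");
        }
        if (number == null) {
            throw new IllegalStateException("no model structure number set for " + submodel);
        }
        // 3. The number is appended to the canonical class name prefix.
        return classNamePrefix + number;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("parser.trainer.globalModelStructureNumber", "1");
        System.out.println(resolve(props, "head", "danbikel.parser.HeadModelStructure"));
    }
}
```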

lexPriorModelStructureNumber

public static final String lexPriorModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the lexical prior submodel. This number will be appended to the canonical lexical prior model structure classname prefix, "danbikel.parser.LexPriorModelStructure", to form a classname, such as "danbikel.parser.LexPriorModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.lexPriorModelStructureNumber".

See Also:
globalModelStructureNumber, lexPriorModelStructureClass, Constant Field Values

nonterminalPriorModelStructureNumber

public static final String nonterminalPriorModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the nonterminal prior submodel. This number will be appended to the canonical nonterminal prior model structure classname prefix, "danbikel.parser.NonterminalPriorModelStructure", to form a classname, such as "danbikel.parser.NonterminalPriorModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.nonterminalPriorModelStructureNumber".

See Also:
globalModelStructureNumber, nonterminalPriorModelStructureClass, Constant Field Values

topNonterminalModelStructureNumber

public static final String topNonterminalModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the head-generation submodel for heads whose parents are Training.topSym(). This number will be appended to the canonical top nonterminal model structure classname prefix, "danbikel.parser.TopNonterminalModelStructure", to form a classname, such as "danbikel.parser.TopNonterminalModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.topNonterminalModelStructureNumber".

See Also:
globalModelStructureNumber, topNonterminalModelStructureClass, Constant Field Values

topLexModelStructureNumber

public static final String topLexModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the head word-generation submodel for heads of entire sentences. This number will be appended to the canonical top lexical model structure classname prefix, "danbikel.parser.TopLexModelStructure", to form a classname, such as "danbikel.parser.TopLexModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.topLexModelStructureNumber".

See Also:
globalModelStructureNumber, topLexModelStructureClass, Constant Field Values

headModelStructureNumber

public static final String headModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the head-generation submodel. This number will be appended to the canonical head model structure classname prefix, "danbikel.parser.HeadModelStructure", to form a classname, such as "danbikel.parser.HeadModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.headModelStructureNumber".

See Also:
globalModelStructureNumber, headModelStructureClass, Constant Field Values

gapModelStructureNumber

public static final String gapModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the gap-generation submodel. This number will be appended to the canonical gap model structure classname prefix, "danbikel.parser.GapModelStructure", to form a classname, such as "danbikel.parser.GapModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.gapModelStructureNumber".

See Also:
globalModelStructureNumber, gapModelStructureClass, Constant Field Values

leftSubcatModelStructureNumber

public static final String leftSubcatModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the left-subcat-generation submodel. This number will be appended to the canonical left-subcat model structure classname prefix, "danbikel.parser.LeftSubcatModelStructure", to form a classname, such as "danbikel.parser.LeftSubcatModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.leftSubcatModelStructureNumber".

See Also:
globalModelStructureNumber, leftSubcatModelStructureClass, Constant Field Values

rightSubcatModelStructureNumber

public static final String rightSubcatModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the right-subcat-generation submodel. This number will be appended to the canonical right-subcat model structure classname prefix, "danbikel.parser.RightSubcatModelStructure", to form a classname, such as "danbikel.parser.RightSubcatModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.rightSubcatModelStructureNumber".

See Also:
globalModelStructureNumber, rightSubcatModelStructureClass, Constant Field Values

modNonterminalModelStructureNumber

public static final String modNonterminalModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the modifying nonterminal-generation submodel. This number will be appended to the canonical modifying nonterminal model structure classname prefix, "danbikel.parser.ModNonterminalModelStructure", to form a classname, such as "danbikel.parser.ModNonterminalModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.modNonterminalModelStructureNumber".

See Also:
globalModelStructureNumber, modNonterminalModelStructureClass, Constant Field Values

modWordModelStructureNumber

public static final String modWordModelStructureNumber
The property to specify the model structure number to use when creating the ProbabilityStructure object for the modifying word-generation submodel. This number will be appended to the canonical modifying word model structure classname prefix, "danbikel.parser.ModWordModelStructure", to form a classname, such as "danbikel.parser.ModWordModelStructure1". This constant overrides the setting of the globalModelStructureNumber property.

The value of this constant is "parser.trainer.modWordModelStructureNumber".

See Also:
globalModelStructureNumber, modWordModelStructureClass, Constant Field Values

lexPriorModelStructureClass

public static final String lexPriorModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the lexical prior submodel. Specifying this property overrides the globalModelStructureNumber and lexPriorModelStructureNumber properties.

The value of this constant is "parser.trainer.lexPriorModelStructureClass".

See Also:
Constant Field Values

nonterminalPriorModelStructureClass

public static final String nonterminalPriorModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the nonterminal prior submodel. Specifying this property overrides the globalModelStructureNumber and nonterminalPriorModelStructureNumber properties.

The value of this constant is "parser.trainer.nonterminalPriorModelStructureClass".

See Also:
Constant Field Values

topNonterminalModelStructureClass

public static final String topNonterminalModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the head-generation submodel for heads whose parents are Training.topSym(). Specifying this property overrides the globalModelStructureNumber and topNonterminalModelStructureNumber properties.

The value of this constant is "parser.trainer.topNonterminalModelStructureClass".

See Also:
Constant Field Values

topLexModelStructureClass

public static final String topLexModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the head word-generation submodel for head words of entire sentences. Specifying this property overrides the globalModelStructureNumber and topLexModelStructureNumber properties.

The value of this constant is "parser.trainer.topLexModelStructureClass".

See Also:
Constant Field Values

headModelStructureClass

public static final String headModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the head-generation submodel. Specifying this property overrides the globalModelStructureNumber and headModelStructureNumber properties.

The value of this constant is "parser.trainer.headModelStructureClass".

See Also:
Constant Field Values

gapModelStructureClass

public static final String gapModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the gap-generation submodel. Specifying this property overrides the globalModelStructureNumber and gapModelStructureNumber properties.

The value of this constant is "parser.trainer.gapModelStructureClass".

See Also:
Constant Field Values

leftSubcatModelStructureClass

public static final String leftSubcatModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the left subcat-generation submodel. Specifying this property overrides the globalModelStructureNumber and leftSubcatModelStructureNumber properties.

The value of this constant is "parser.trainer.leftSubcatModelStructureClass".

See Also:
Constant Field Values

rightSubcatModelStructureClass

public static final String rightSubcatModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the right subcat-generation submodel. Specifying this property overrides the globalModelStructureNumber and rightSubcatModelStructureNumber properties.

The value of this constant is "parser.trainer.rightSubcatModelStructureClass".

See Also:
Constant Field Values

modNonterminalModelStructureClass

public static final String modNonterminalModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the modifying nonterminal-generation submodel. Specifying this property overrides the globalModelStructureNumber and modNonterminalModelStructureNumber properties.

The value of this constant is "parser.trainer.modNonterminalModelStructureClass".

See Also:
Constant Field Values

modWordModelStructureClass

public static final String modWordModelStructureClass
The property to specify the fully-qualified name of a class that extends ProbabilityStructure, to be instantiated by Trainer for the modifying word-generation submodel. Specifying this property overrides the globalModelStructureNumber and modWordModelStructureNumber properties.

The value of this constant is "parser.trainer.modWordModelStructureClass".

See Also:
Constant Field Values

collinsSkipWSJSentences

public static final String collinsSkipWSJSentences
The property to specify whether certain sentences are skipped during training on sections 02 through 21 of the Penn Treebank Wall Street Journal corpus in order to mimic Mike Collins' trainer on this now-standard training corpus. This property should not be set to true if training on any corpus other than the Penn Treebank Wall Street Journal corpus.

The value of this constant is "parser.trainer.collinsSkipWSJSentences".

See Also:
Constant Field Values

maxEventChunkSize

public static final String maxEventChunkSize
The property to specify how many events the trainer should read from an observations file before deriving counts (for use only when using a trainer output file; see Trainer.main(String[])). The value of this property should be (the string representation of) an integer.

The value of this constant is "parser.trainer.maxEventChunkSize".

See Also:
Constant Field Values

outputHeadToParentMap

public static final String outputHeadToParentMap
The property to specify whether the trainer should output the head-to-parent nonterminal map that it derives from its top-level observations. This property should be (the string representation of) a boolean.

The value of this constant is "parser.trainer.outputHeadToParentMap".

See Also:
Constant Field Values

outputSubcatMaps

public static final String outputSubcatMaps
The property to specify whether the trainer should output the subcat maps that it derives from its top-level observations. This property should be (the string representation of) a boolean.

The value of this constant is "parser.trainer.outputSubcatMaps".

See Also:
Constant Field Values

outputModNonterminalMap

public static final String outputModNonterminalMap
The property to specify whether the trainer should output the modifying nonterminal map that it derives from its top-level observations. This property should be (the string representation of) a boolean.

The value of this constant is "parser.trainer.outputModNonterminalMap".

See Also:
Constant Field Values

outputCollins

public static final String outputCollins
The property to specify whether the trainer outputs top-level events in the format output by Michael Collins' trainer to System.out when training on a Treebank input file. Note that in order to (near) perfectly emulate Collins' trainer, the property unknownWordThreshold should be set to 1.

The value of this constant is "parser.trainer.outputCollins".

See Also:
Constant Field Values

chartItemClass

public static final String chartItemClass
The property to specify the fully-qualified name of the subclass of Item to be used for chart items.

The value of this constant is "parser.chart.itemClass".

See Also:
Constant Field Values

collinsNPPruneHack

public static final String collinsNPPruneHack
The property to specify whether the chart should add 3 in natural log-space to the beam width for chart items whose root labels are either NP or NP-A, as is done by Collins' parser. That is, if the normal beam width is exp(-B), then this hack makes the beam for NP/NP-A chart items expand to exp(-(B+3)).

The value of this constant is "parser.chart.collinsNPPruneHack".

See Also:
Constant Field Values
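
The arithmetic of this hack can be illustrated with a small sketch in natural log-space. The method names and the way the beam test is phrased (an item survives if its log-probability is no more than the beam width below the top item's) are assumptions for illustration, not the chart's actual implementation.

```java
final class NPPruneHackSketch {
    /** Returns the beam width B, widened by 3 for NP and NP-A items when the hack is on. */
    static double beamWidth(String rootLabel, double b, boolean npPruneHack) {
        boolean isNP = rootLabel.equals("NP") || rootLabel.equals("NP-A");
        return (npPruneHack && isNP) ? b + 3.0 : b;
    }

    /** An item survives if it lies within the beam below the top-ranked item (natural logs). */
    static boolean withinBeam(double itemLogProb, double topLogProb, double width) {
        return itemLogProb >= topLogProb - width;
    }

    public static void main(String[] args) {
        double b = 9.0;
        // An NP item 10 log units below the top item survives only with the hack enabled.
        System.out.println(withinBeam(-20.0, -10.0, beamWidth("NP", b, false)));
        System.out.println(withinBeam(-20.0, -10.0, beamWidth("NP", b, true)));
    }
}
```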

kBest

public static final String kBest
The property to specify the maximum number of top-scoring theories to give as a parse. If this property is equal to 1, then the returned item from the decoder will be a single tree; if it is greater than 1, it will be a list of trees (i.e., a list of lists). This property should be (the string representation of) an integer greater than or equal to 1.

The value of this constant is "parser.decoder.kBest".

See Also:
Decoder.parse(danbikel.lisp.SexpList), Constant Field Values

maxSentLen

public static final String maxSentLen
The property to specify the maximum length of a sentence; sentences longer than this will not be parsed. If the value of this property is less than 1, the decoder will attempt to parse all sentences. This property should be (the string representation of) an integer.

The value of this constant is "parser.decoder.maxSentenceLength".

See Also:
Decoder, Constant Field Values

maxParseTime

public static final String maxParseTime
The property to specify the maximum time, in milliseconds, that the decoder will spend attempting to parse a sentence. If this property is set to a value less than or equal to zero, the maximum parsing time will be infinite (i.e., there will not be a time-out). This property should be (the string representation of) an integer.

The value of this constant is "parser.decoder.maxParseTime".

See Also:
Decoder, Constant Field Values
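
The special values of maxSentLen and maxParseTime can be captured in two small predicates. This is a sketch with hypothetical method names; how the decoder actually checks elapsed time is not described here and is assumed.

```java
final class DecoderLimitsSketch {
    /** A value below 1 for maxSentLen means every sentence is attempted. */
    static boolean shouldParse(int sentenceLength, int maxSentLen) {
        return maxSentLen < 1 || sentenceLength <= maxSentLen;
    }

    /** A value of zero or less for maxParseTime means there is no time-out. */
    static boolean timedOut(long elapsedMillis, int maxParseTime) {
        return maxParseTime > 0 && elapsedMillis > maxParseTime;
    }

    public static void main(String[] args) {
        int maxSentLen = Integer.parseInt("40");
        int maxParseTime = Integer.parseInt("0");
        System.out.println(shouldParse(55, maxSentLen));
        System.out.println(timedOut(3_600_000L, maxParseTime));
    }
}
```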

useLowFreqTags

public static final String useLowFreqTags
The property to specify whether to use tags collected from low-frequency words by the trainer when seeding the chart, if the current word is a low-frequency word observed when training. If true, and if keepLowFreqTags was true during training, causes the decoder to attempt to find tags observed with this word in training, even if it was a low-frequency word. If false, the decoder will simply choose the first-best tag supplied by the input sentence, or, if the input does not contain pre-tagged words, will use all tags observed with the word's feature vector.

The value of this constant is "parser.decoder.useLowFreqTags".

See Also:
keepLowFreqTags, Decoder, Constant Field Values

decoderUsePruneFactor

public static final String decoderUsePruneFactor
The property to specify whether or not the decoding algorithm should prune away chart entries whose scores are not within a particular factor of the top-ranked chart entry in a given cell. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.decoder.usePruneFactor".

See Also:
Decoder, Constant Field Values

decoderPruneFactor

public static final String decoderPruneFactor
The property to specify the factor by which the decoder should prune away chart entries. The value of this property should be a floating point number that is the logarithm (base 10) of the desired factor (i.e., the factor employed will effectively be Math.pow(10.0, value), where value is the value of this property). This form of pruning will only occur if the value of decoderUsePruneFactor is true.

The value of this constant is "parser.decoder.pruneFactor".

See Also:
Decoder, Constant Field Values
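
Because the property value is a base-10 logarithm, the effective factor is 10 raised to that value. The sketch below shows the conversion and one plausible way to apply it in log10 space; the method names and the exact form of the pruning test are assumptions, not the decoder's code.

```java
final class PruneFactorSketch {
    /** Converts the property value (a base-10 logarithm) into the effective factor. */
    static double factor(double log10Factor) {
        return Math.pow(10.0, log10Factor);
    }

    /** Keeps an item whose log10 probability is within log10Factor of the top item's. */
    static boolean keep(double itemLog10Prob, double topLog10Prob, double log10Factor) {
        return itemLog10Prob >= topLog10Prob - log10Factor;
    }

    public static void main(String[] args) {
        double value = Double.parseDouble("4.0");
        System.out.println(factor(value));           // effective factor for a value of 4
        System.out.println(keep(-9.5, -6.0, value)); // 3.5 orders of magnitude below the top item
        System.out.println(keep(-11.0, -6.0, value)); // 5 orders of magnitude below the top item
    }
}
```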

decoderMaxPruneFactor

public static final String decoderMaxPruneFactor
The property to specify the maximum prune factor when performing beam-widening. Beam-widening is when the decoder tries successively wider beams until there is a parse for the current sentence, or until the maximum prune factor is reached. The initial beam tried will be the value of decoderPruneFactor. If this property is not set (the value is the null object), then the maximum prune factor defaults to the value of decoderPruneFactor, and the decoder will not do beam-widening. The value of this property should be a floating point number that is the logarithm (base 10) of the desired factor (i.e., the factor employed will effectively be Math.pow(10.0, value), where value is the value of this property).

See Also:
decoderPruneFactor, decoderPruneFactorIncrement, Constant Field Values

decoderPruneFactorIncrement

public static final String decoderPruneFactorIncrement
The property to specify the increment used when the decoder does beam-widening. The value of this property should be a floating point number that is the logarithm (base 10) of the desired beam increment (i.e., the increment employed will effectively be Math.pow(10.0, value), where value is the value of this property).

See Also:
decoderMaxPruneFactor, Constant Field Values
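
The beam-widening behavior described for decoderPruneFactor, decoderMaxPruneFactor and decoderPruneFactorIncrement can be sketched as a simple loop. Everything here is illustrative: the method name, the use of a predicate to stand in for "the decoder found a parse at this beam width", and the choice to add the increment directly in log10 space are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoublePredicate;

final class BeamWideningSketch {
    /**
     * Returns the beam widths (log10 prune factors) tried, in order. A null
     * maximum stands for an unset decoderMaxPruneFactor, so no widening occurs.
     */
    static List<Double> widthsTried(double pruneFactor, Double maxPruneFactor,
                                    double increment, DoublePredicate parsesAt) {
        double limit = (maxPruneFactor == null) ? pruneFactor : maxPruneFactor;
        List<Double> tried = new ArrayList<>();
        for (double width = pruneFactor; width <= limit; width += increment) {
            tried.add(width);
            if (parsesAt.test(width)) {
                break;               // a parse was found; stop widening
            }
            if (increment <= 0) {
                break;               // guard against a non-advancing loop
            }
        }
        return tried;
    }

    public static void main(String[] args) {
        // Pretend the sentence only parses once the beam reaches 3.5.
        System.out.println(widthsTried(3.0, 4.0, 0.5, w -> w >= 3.5));
    }
}
```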

decoderRelaxConstraintsAfterBeamWidening

public static final String decoderRelaxConstraintsAfterBeamWidening
The property to specify whether the decoder should relax all hard constraints (except the comma pruning rule, which is controlled by the decoderUseCommaConstraint setting) after performing all beam widening. Setting this property to true provides a means of making the parser more robust, as it should allow the decoder to always find at least some parse for any sentence.

For the purposes of this setting, a hard constraint is any implicit or explicit zero probability estimate that would cause the decoder to abandon a hypothesis (derivation). Note that setting this property to true has no effect on the constraint-satisfaction system provided by constraintSetFactoryClass.

See Also:
Constant Field Values

decoderUseCellLimit

public static final String decoderUseCellLimit
The property to specify whether the decoder should impose a limit on the number of chart items per cell in the chart. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.decoder.useCellLimit".

See Also:
Decoder, Constant Field Values

decoderCellLimit

public static final String decoderCellLimit
The property to specify the limit on the number of chart items the decoder will have per cell in its chart. This type of pruning will only occur if the value of decoderUseCellLimit is true. The value of this property should be (the string representation of) an integer.

The value of this constant is "parser.decoder.cellLimit".

See Also:
Decoder, Constant Field Values
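
A cell limit of this kind amounts to keeping only the top-scoring items in each cell. The sketch below represents chart items by their log-probabilities alone, which is a simplification, and the method name applyLimit is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CellLimitSketch {
    /** Returns the items sorted best-first, truncated to cellLimit when the limit is in use. */
    static List<Double> applyLimit(List<Double> itemLogProbs, boolean useCellLimit, int cellLimit) {
        List<Double> sorted = new ArrayList<>(itemLogProbs);
        sorted.sort(Collections.reverseOrder());   // highest log-probability first
        if (!useCellLimit || sorted.size() <= cellLimit) {
            return sorted;
        }
        return new ArrayList<>(sorted.subList(0, Math.max(0, cellLimit)));
    }

    public static void main(String[] args) {
        List<Double> cell = Arrays.asList(-3.0, -1.0, -2.0);
        System.out.println(applyLimit(cell, true, 2));
        System.out.println(applyLimit(cell, false, 2));
    }
}
```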

decoderUseCommaConstraint

public static final String decoderUseCommaConstraint
The property to specify whether the decoder should employ a constraint on the way commas can appear in and around chart items. Specifically, the constraint is that for a chart item that represents the CFG rule
Z --> <... X Y ...>
if two of its children X and Y are separated by a comma, then the last word in Y must be directly followed by a comma or must be the last word in the sentence.

See Also:
Constant Field Values
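
As a rough illustration of the comma constraint, the check below operates on a plain array of sentence tokens. The class name, the method name and the representation of Y by the index of its last word are all assumptions made for this sketch; the decoder works on chart items rather than token arrays.

```java
// Hypothetical illustration; not part of danbikel.parser.
class CommaConstraintSketch {
  /**
   * words is the tokenized sentence and yEnd is the index of the last word
   * of child Y, where Y is assumed to be separated from its sibling X by a
   * comma. The constraint holds if Y ends the sentence or if the word
   * directly after Y is a comma.
   */
  static boolean satisfied(String[] words, int yEnd) {
    boolean lastInSentence = yEnd == words.length - 1;
    return lastInSentence || ",".equals(words[yEnd + 1]);
  }

  public static void main(String[] args) {
    String[] s = {"John", ",", "my", "friend", ",", "left"};
    System.out.println(satisfied(s, 3));  // Y = "my friend": prints true
    System.out.println(satisfied(s, 2));  // Y = "my": prints false
    System.out.println(satisfied(s, 5));  // Y ends the sentence: prints true
  }
}
```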

decoderUseOnlySuppliedTags

public static final String decoderUseOnlySuppliedTags
The property to specify whether the decoder should only use the tags supplied with words in an input file when seeding the chart. Normally, when a list of tags is supplied with every word in every input sentence, the supplied tags are only used with unknown words; for a known word, the possible tags are taken to be those with which the word was observed in training. A run-time error will occur if this setting is true but the input file of sentences does not contain at least one tag per word.

The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.decoder.useOnlySuppliedTags".

See Also:
Constant Field Values

decoderSubstituteWordsForClosedClassTags

public static final String decoderSubstituteWordsForClosedClassTags
The property to specify whether the decoder should substitute a known word when the only tag for an unknown word is closed-class (i.e., the tag was never observed with the unknown word during training).

The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.decoder.substituteWordsForClosedClassTags".

See Also:
Constant Field Values

decoderOutputHeadLexicalizedLabels

public static final String decoderOutputHeadLexicalizedLabels
The property to specify whether node labels in trees output by the decoder include their lexical head information, which is normally only used internally by the decoder. Even though this setting is grouped with the other decoder settings, it technically affects the implementation of CKYItem.toSexp().

The form of a lexicalized label will be

NT[isHead/word/tag]
where the bracket and delimiter characters '[', ']' and '/' are determined by the Treebank.nonTreebankLeftBracket(), Treebank.nonTreebankRightBracket() and Treebank.nonTreebankDelimiter() methods, respectively.

The words that were pruned before parsing and re-inserted after parsing (when the restorePrunedWords setting is true) will not be output with lexical information, since the parser never stochastically assigned head words to these nodes (but one could trivially map these preterminals to their lexicalized versions).

See Also:
CKYItem.toSexp(), Constant Field Values
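
The label form NT[isHead/word/tag] can be sketched as simple string assembly. The class and method names are hypothetical, and rendering isHead as the text true or false is an assumption; the bracket and delimiter characters are passed in because the real values come from the Treebank methods named above.

```java
// Hypothetical illustration; not part of danbikel.parser.
class LexLabelSketch {
  /** Assembles a label of the form NT[isHead/word/tag]. */
  static String label(String nt, boolean isHead, String word, String tag,
                      char left, char right, char delim) {
    // Assumption: isHead is rendered via its default string form.
    return nt + left + isHead + delim + word + delim + tag + right;
  }

  public static void main(String[] args) {
    // prints NP[true/dog/NN]
    System.out.println(label("NP", true, "dog", "NN", '[', ']', '/'));
  }
}
```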

decoderUseLocalProbabilityCache

public static final String decoderUseLocalProbabilityCache
The property to specify whether the decoder should wrap its DecoderServerRemote instance with an instance of CachingDecoderServer, which caches probability lookups. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

See Also:
DecoderServerRemote, CachingDecoderServer, Constant Field Values

decoderLocalCacheSize

public static final String decoderLocalCacheSize
The property to specify the size of the cache used by the CachingDecoderServer instance used by the decoder when the decoderUseLocalProbabilityCache property is true. The value of this property is ignored when decoderUseLocalProbabilityCache is false. The value of this property should be (the string representation of) an integer.

See Also:
Constant Field Values

decoderUseHeadToParentMap

public static final String decoderUseHeadToParentMap
The property to specify whether the decoder should use the head-to-parent map derived during training. Use of this map potentially increases efficiency in the decoding process by causing the decoder to grow theories upward using only parents that occurred in training for a given chart item's root label. However, use of this map also decreases generality, for if the last back-off level of the head-generation model is more general than p(H | P) (if it is, for example, p(H)), then potentially any nonterminal can be the parent of any head nonterminal, meaning that the decoder should pursue all nonterminals as parents, which is its behavior when the value of this property is false.

See Also:
Constant Field Values
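
The trade-off described for decoderUseHeadToParentMap can be sketched as a choice between two candidate sets. All names below are hypothetical, and nonterminals are modeled as plain strings for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of danbikel.parser.
class HeadToParentSketch {
  /**
   * With the map enabled, only parents observed in training for this head
   * are proposed; with it disabled, every nonterminal is a candidate.
   */
  static Set<String> candidateParents(String head, boolean useMap,
                                      Map<String, Set<String>> headToParents,
                                      Set<String> allNonterminals) {
    if (!useMap) {
      return allNonterminals;
    }
    return headToParents.getOrDefault(head, Collections.<String>emptySet());
  }

  public static void main(String[] args) {
    Map<String, Set<String>> seen = new HashMap<>();
    seen.put("VP", new HashSet<>(Arrays.asList("S", "VP")));
    Set<String> all = new HashSet<>(Arrays.asList("S", "VP", "NP", "PP"));
    System.out.println(candidateParents("VP", true, seen, all).size());   // prints 2
    System.out.println(candidateParents("VP", false, seen, all).size());  // prints 4
  }
}
```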

restorePrunedWords

public static final String restorePrunedWords
The property to specify that the decoder restores all pruned words. If this property is true and the decoder produces a parse for a sentence, then it is guaranteed that the number of tokens of the input sentence will be equal to the number of terminals of the parsed output sentence. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

See Also:
Training.prune(danbikel.lisp.Sexp), Constant Field Values

useSimpleModNonterminalMap

public static final String useSimpleModNonterminalMap
The property to specify whether the decoder uses ModelCollection.simpleModNonterminalMap. This is the simpler of two mechanisms by which the decoder determines whether even to attempt to compute the probability of a modifying nonterminal in the context of some parent and head child and other syntactic context (this computation can be expensive, hence the decoder has two less-expensive mechanisms to try to avoid such computation where possible). The simple modifying nonterminal map maps parent/head/side triples to possible modifying nonterminals that occurred in training. For example, if an NP occurred to the left of a VP whose parent was S in training, then the simple modifying nonterminal map would contain the mapping S,VP,left --> NP. Note that before such mappings are created, any argument augmentations on the parent and any gap augmentation on the head are removed. Note also that if the last level of back-off of the modifying-nonterminal generation model structure uses an even more reduced context than such a triple, then the simple modifying nonterminal map should not be used and this setting should be false. (Using the simple modifying nonterminal map is appropriate, however, with ModNonterminalModelStructure2, as well as several other modifying nonterminal model structures in the danbikel.parser.ms package.)

See Also:
Constant Field Values
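
The S,VP,left --> NP example can be sketched with an ordinary map keyed on the parent/head/side triple. The class is hypothetical, the string key is a simplification, and the removal of argument and gap augmentations mentioned above is deliberately omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not ModelCollection.simpleModNonterminalMap itself.
class SimpleModMapSketch {
  private final Map<String, Set<String>> map = new HashMap<>();

  private static String key(String parent, String head, boolean left) {
    return parent + "," + head + "," + (left ? "left" : "right");
  }

  /** Records that modifier was seen on the given side of head under parent. */
  void observe(String parent, String head, boolean left, String modifier) {
    map.computeIfAbsent(key(parent, head, left), k -> new HashSet<>()).add(modifier);
  }

  /** Returns true only if this modifier was observed for the triple. */
  boolean possible(String parent, String head, boolean left, String modifier) {
    Set<String> mods = map.get(key(parent, head, left));
    return mods != null && mods.contains(modifier);
  }

  public static void main(String[] args) {
    SimpleModMapSketch m = new SimpleModMapSketch();
    m.observe("S", "VP", true, "NP");                        // S,VP,left --> NP
    System.out.println(m.possible("S", "VP", true, "NP"));   // prints true
    System.out.println(m.possible("S", "VP", false, "NP"));  // prints false
  }
}
```

When the lookup returns false, the more expensive probability computation can be skipped for that modifier.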

downcaseWords

public static final String downcaseWords
The property to specify whether words are downcased during training and decoding. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

The value of this constant is "parser.downcaseWords".

See Also:
Constant Field Values

sbSocketTimeout

public static final String sbSocketTimeout
The property to specify how long, in milliseconds, the SO_TIMEOUT value should be for the switchboard's RMI-client (caller) sockets.

See Also:
SwitchboardRemote.socketTimeout, Constant Field Values

keepAliveInterval

public static final String keepAliveInterval
The property to specify how often clients and servers should ping the "keep-alive" socket connected to the switchboard. The value of this property should be (the string representation of) an integer, representing milliseconds between pings.

See Also:
SwitchboardRemote.keepAliveInterval, Constant Field Values

keepAliveMaxRetries

public static final String keepAliveMaxRetries
The property to specify at most how many times the switchboard attempts to contact clients and servers before considering them dead (after an initial failure, thus making 0 a legal value for this property).

See Also:
SwitchboardRemote.keepAliveMaxRetries, Constant Field Values
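
The reason 0 is a legal value for keepAliveMaxRetries is that retries are counted after an initial failed attempt. A minimal sketch of that counting follows; the class name and the use of a BooleanSupplier to stand in for a ping are assumptions, not the switchboard's actual mechanism.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration; not part of danbikel.switchboard.
class KeepAliveSketch {
  /**
   * Makes one initial attempt, then up to maxRetries further attempts.
   * With maxRetries equal to 0 the peer is considered dead after a single
   * failed ping.
   */
  static boolean isAlive(BooleanSupplier ping, int maxRetries) {
    if (ping.getAsBoolean()) {
      return true;  // initial attempt succeeded
    }
    for (int i = 0; i < maxRetries; i++) {
      if (ping.getAsBoolean()) {
        return true;  // a retry succeeded
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(isAlive(() -> false, 0));  // prints false after one attempt
    System.out.println(isAlive(() -> true, 0));   // prints true
  }
}
```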

serverDeathKillClients

public static final String serverDeathKillClients
The property to specify whether the switchboard should kill all of a server's clients when it detects that the server has died. This property should have the value "false" when servers are stateless. The value of this property should be (the string representation of) a boolean (conversion is performed by the method Boolean.valueOf).

See Also:
SwitchboardRemote.serverDeathKillClients, Constant Field Values

sbUserTimeout

public static final String sbUserTimeout
The property to specify how long (in milliseconds) sockets stay alive on the client (switchboard) side for RMI calls to switchboard user objects (subclasses of AbstractSwitchboardUser).

The value of this constant is "parser.switchboardUser.timeout".

See Also:
Constant Field Values

sbUserSBMaxRetries

public static final String sbUserSBMaxRetries
The property to specify at most how many times switchboard users should try to acquire the switchboard from the bootstrap registry before giving up, either when first starting up or in the event of a switchboard crash. If the value of this property is identical to the value of AbstractSwitchboardUser.infiniteTries, the parsing clients and decoder servers will indefinitely keep trying to re-acquire the switchboard in the event of its failure.

The value of this constant is "parser.switchboardUser.sbMaxRetries".

See Also:
Constant Field Values

serverFailover

public static final String serverFailover
The property to specify whether parsing clients should have server failover; that is, whether they should request a new server from the switchboard if their current server fails.

The value of this constant is "parser.switchboardUser.client.serverFailover".

See Also:
Failover, Constant Field Values

serverMaxRetries

public static final String serverMaxRetries
The property to specify at most how many times parsing clients should re-try their servers in the event of a method failure before giving up. If the value of this property is identical to the value of Retry.retryIndefinitely, clients will keep re-trying indefinitely.

The value of this constant is "parser.switchboardUser.client.serverMaxRetries".

See Also:
AbstractSwitchboardUser.SBUserRetry, Retry, Constant Field Values

serverRetrySleep

public static final String serverRetrySleep
The property to specify how many milliseconds to sleep between server re-tries.

The value of this constant is "parser.switchboardUser.client.serverRetrySleep".

See Also:
serverMaxRetries, Constant Field Values

Method Detail

getDouble

public static double getDouble(String setting)
Returns the double value of the specified setting, as determined by Double.parseDouble(String).

Parameters:
setting - the setting whose value is to be gotten
Returns:
the double value of the specified setting

getInteger

public static int getInteger(String setting)
Returns the integer value of the specified setting, as determined by Integer.parseInt(String).

Parameters:
setting - the setting whose value is to be gotten
Returns:
the integer value of the specified setting

getBoolean

public static boolean getBoolean(String setting)
Returns the boolean value of the specified setting, as determined by Boolean.valueOf(String).

Parameters:
setting - the setting whose value is to be gotten
Returns:
the boolean value of the specified setting
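
The conversion rules named above are those of the JDK, so their edge cases carry over. The sketch below uses a stand-in Properties object and hypothetical class name rather than the real Settings class; how Settings treats a missing property is not stated here, so the sketch only demonstrates the JDK parsing behavior.

```java
import java.util.Properties;

// Hypothetical stand-in for the typed getters; not danbikel.parser.Settings.
class TypedGetterSketch {
  static final Properties props = new Properties();

  static double getDouble(String setting) {
    return Double.parseDouble(props.getProperty(setting));
  }

  static int getInteger(String setting) {
    // Throws NumberFormatException if the value is not a valid integer.
    return Integer.parseInt(props.getProperty(setting));
  }

  static boolean getBoolean(String setting) {
    // Boolean.valueOf yields true only for "true", ignoring case.
    return Boolean.valueOf(props.getProperty(setting));
  }

  public static void main(String[] args) {
    props.setProperty("parser.decoder.cellLimit", "10");
    props.setProperty("parser.downcaseWords", "TRUE");
    props.setProperty("parser.decoder.useCellLimit", "yes");
    System.out.println(getInteger("parser.decoder.cellLimit"));     // prints 10
    System.out.println(getBoolean("parser.downcaseWords"));         // prints true
    System.out.println(getBoolean("parser.decoder.useCellLimit"));  // prints false
  }
}
```

Note that a value such as "yes" or "1" silently becomes false, which is worth keeping in mind when editing a settings file by hand.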

load

public static void load(String filename)
                 throws IOException
Loads the properties from the file of the specified filename, using load(File).

Parameters:
filename - the name of the file containing properties to load
Throws:
IOException - if load(File) throws an IOException

load

public static void load(File file)
                 throws IOException
Loads the properties from the specified file, using load(InputStream).

Parameters:
file - the file containing properties to load
Throws:
IOException - if creating a FileInputStream throws an IOException or if the call to load(InputStream) throws an IOException

load

public static void load(InputStream is)
                 throws IOException
Loads the properties from the specified input stream, using Properties.load(InputStream).

Parameters:
is - the input stream containing properties to load
Throws:
IOException
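
The three load methods form a delegation chain that ends in Properties.load(InputStream). A minimal sketch of that chain follows, using a stand-in class; closing the file stream with try-with-resources is an assumption, and the variable expansion performed by the real class is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical stand-in; not danbikel.parser.Settings.
class LoadSketch {
  static final Properties props = new Properties();

  static void load(String filename) throws IOException {
    load(new File(filename));
  }

  static void load(File file) throws IOException {
    try (InputStream is = new FileInputStream(file)) {
      load(is);
    }
  }

  static void load(InputStream is) throws IOException {
    // Properties.load(InputStream) reads the stream as ISO 8859-1.
    props.load(is);
  }

  public static void main(String[] args) throws IOException {
    byte[] text = "parser.decoder.cellLimit=10\n".getBytes(StandardCharsets.ISO_8859_1);
    load(new ByteArrayInputStream(text));
    System.out.println(props.getProperty("parser.decoder.cellLimit"));  // prints 10
  }
}
```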

store

public static void store(OutputStream os)
                  throws IOException
Stores the properties of this class to the specified output stream, using Properties.store(OutputStream,String).

Parameters:
os - the output stream to which to write the properties contained in this class
Throws:
IOException

store

public static void store(OutputStream os,
                         String header)
                  throws IOException
Stores the properties of this class to the specified output stream, using Properties.store(OutputStream,String).

Parameters:
os - the output stream to which to write the properties contained in this class
header - the header text to put at the beginning of the properties file
Throws:
IOException - if there is an exception writing to the specified stream

store

public static void store(ObjectOutputStream os)
                  throws IOException
Stores the properties of this class to the specified output stream.

Parameters:
os - the output stream to which to write the properties contained in this class
Throws:
IOException - if there is an exception writing to the specified stream

storeSorted

public static void storeSorted(OutputStream os)
                        throws IOException
Stores a sorted list of the settings and values of this class to the specified output stream.

Parameters:
os - the output stream to which to write a sorted list of the settings and values contained in this class
Throws:
IOException - if there is a problem writing to the specified output stream

storeSorted

public static void storeSorted(Properties props,
                               OutputStream os)
                        throws IOException
Stores a sorted list of the specified property-value pairs to the specified output stream.

Parameters:
props - a container of property-value pairs
os - an output stream to which to write a sorted list of the property-value pairs contained in the specified Properties object
Throws:
IOException - if there is a problem writing to the specified output stream

storeSorted

public static void storeSorted(OutputStream os,
                               String header)
                        throws IOException
Stores a sorted list of the property-value pairs contained in this class to the specified output stream using the specified header.

Parameters:
os - the output stream to which to store a sorted list of the property-value pairs contained in this class
header - the header to write to the specified output stream before writing the sorted list of property-value pairs
Throws:
IOException - if there is a problem writing to the specified output stream

storeSorted

public static void storeSorted(Properties props,
                               OutputStream os,
                               String header)
                        throws IOException
Stores a sorted list of the specified container of property-value pairs to the specified output stream using the specified header.

Parameters:
props - a container of property-value pairs
os - the output stream to which to store a sorted list of the specified property-value pairs
header - the header to write to the specified output stream before writing the sorted list of property-value pairs
Throws:
IOException - if there is a problem writing to the specified output stream
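
One plausible way to produce sorted output is to iterate over the property names in their natural order. The sketch below is an assumption about the general approach, not the actual implementation: the output format, the "# " header prefix and the lack of escaping for special characters are all simplifications.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical illustration; not danbikel.parser.Settings.storeSorted.
class StoreSortedSketch {
  static void storeSorted(Properties props, OutputStream os, String header)
      throws IOException {
    Writer w = new BufferedWriter(new OutputStreamWriter(os, StandardCharsets.ISO_8859_1));
    if (header != null) {
      w.write("# " + header + "\n");
    }
    // A TreeSet iterates over the names in sorted (lexicographic) order.
    for (String name : new TreeSet<>(props.stringPropertyNames())) {
      w.write(name + "=" + props.getProperty(name) + "\n");
    }
    w.flush();  // the caller remains responsible for closing os
  }

  public static void main(String[] args) throws IOException {
    Properties p = new Properties();
    p.setProperty("parser.downcaseWords", "true");
    p.setProperty("parser.decoder.cellLimit", "10");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    storeSorted(p, out, "example settings");
    System.out.print(new String(out.toByteArray(), StandardCharsets.ISO_8859_1));
  }
}
```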

set

public static void set(String name,
                       String value)
Sets the property name to value, using Properties.setProperty(String,String).

Parameters:
name - the name of the property to set
value - the value to which to set the property name

get

public static String get(String name)
Gets the value of the specified property.

Parameters:
name - the name of the property to get
Returns:
the value of the specified property

getSettings

public static Properties getSettings()
Returns a deep copy of the internal Properties object.


setSettings

public static void setSettings(Properties newSettings)
Allows any class to set the settings of this class directly using the specified Properties object.


getFileOrResourceAsStream

public static final InputStream getFileOrResourceAsStream(Class cl,
                                                          String name)
                                                   throws FileNotFoundException
Attempts to locate the file or resource with the specified name in one of three places:
  1. as a file path relative to the default settings directory;
  2. as a file path relative to the current working directory (or as given, if name is an absolute path);
  3. as a resource gotten from the class loader of the specified class.
The default settings directory is described in the documentation for settingsDirOverride.

Parameters:
cl - the class that needs the file or resource
name - the name of the file or resource
Returns:
an InputStream of the specified file or resource, or null if the file or resource could not be found
Throws:
FileNotFoundException
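
The three-place lookup can be sketched with standard java.io calls. The class name and the explicit settingsDir parameter are hypothetical (the real method determines the settings directory itself), and trying the three places in the listed order is an assumption.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;

// Hypothetical illustration; not danbikel.parser.Settings.
class LookupSketch {
  static InputStream find(Class<?> cl, String name, File settingsDir)
      throws FileNotFoundException {
    // 1. A file relative to the settings directory.
    File inSettingsDir = new File(settingsDir, name);
    if (inSettingsDir.isFile()) {
      return new FileInputStream(inSettingsDir);
    }
    // 2. A file relative to the working directory, or an absolute path.
    File plain = new File(name);
    if (plain.isFile()) {
      return new FileInputStream(plain);
    }
    // 3. A resource located via the class; null if it cannot be found.
    return cl.getResourceAsStream(name);
  }

  public static void main(String[] args) throws FileNotFoundException {
    File home = new File(System.getProperty("user.home"), ".db-parser");
    InputStream is = find(LookupSketch.class, "settings", home);
    System.out.println(is == null ? "not found" : "found");
  }
}
```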

getDefaultsResource

public static final InputStream getDefaultsResource()
                                             throws FileNotFoundException
Gets the fallback defaults from the bundled resource, throwing an exception if the resource is unavailable (which is a very bad situation).

Throws:
FileNotFoundException

getIntProperty

public static int getIntProperty(String property,
                                 int defaultValue)
Returns the integer value of the specified property, or the specified default value if the specified property does not exist.

Parameters:
property - the property or setting whose value is to be retrieved
defaultValue - the fallback default value for the specified property
Returns:
the integer value of the specified property, or the specified default value if the specified property does not exist.

getBooleanProperty

public static boolean getBooleanProperty(String property,
                                         boolean defaultValue)
Returns the boolean value of the specified property, or the specified default value if the specified property does not exist.

Parameters:
property - the property or setting whose value is to be retrieved
defaultValue - the fallback default value for the specified property
Returns:
the boolean value of the specified property, or the specified default value if the specified property does not exist.
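
The fallback behavior of getIntProperty and getBooleanProperty can be sketched as a null check before conversion. The class below is a hypothetical stand-in that takes the Properties object as a parameter; how the real methods treat a value that is present but malformed is not stated, so the sketch simply lets the JDK conversion apply.

```java
import java.util.Properties;

// Hypothetical stand-in; not danbikel.parser.Settings.
class DefaultedGetterSketch {
  static int getIntProperty(Properties props, String property, int defaultValue) {
    String value = props.getProperty(property);
    return value == null ? defaultValue : Integer.parseInt(value);
  }

  static boolean getBooleanProperty(Properties props, String property,
                                    boolean defaultValue) {
    String value = props.getProperty(property);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    Properties p = new Properties();
    p.setProperty("parser.decoder.cellLimit", "10");
    System.out.println(getIntProperty(p, "parser.decoder.cellLimit", 5));     // prints 10
    System.out.println(getBooleanProperty(p, "parser.downcaseWords", true));  // prints true
  }
}
```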

main

public static void main(String[] args)
Prints the default settings contained in the resource supplied with this parsing software.

Parameters:
args - ignored

Author: Dan Bikel.