Interface Parser<NT>

A Parser is an immutable object that is able to take a sequence of characters and return a parse tree according to some grammar.

Parsers are constructed by calling compile() with a grammar, which might be stored in a string, in a file, or read from a stream.

Once constructed, a Parser object is used by calling parse() on a string. Its result is a ParseTree showing how that string matches the grammar.

The type parameter NT should be an Enum type with the same (case-insensitive) names as the nonterminals in the grammar. This allows nonterminals to be referred to by your code with static checking and type safety. For example, if your grammar is:

const sumGrammar = "expression ::= constant '+' constant ;  constant ::= [0-9]+ ;"

then you should create a nonterminal enum like this:

enum SumGrammar { Expression, Constant };

and then use:

compile(sumGrammar, SumGrammar, SumGrammar.Expression)

to compile it into a parser.
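The case-insensitive name matching described above can be sketched as follows. This is only an illustration: `enumNames` and `sameNamesIgnoringCase` are hypothetical helpers that are not part of the library, and the list of nonterminal names is written out by hand rather than extracted from the grammar.

```typescript
enum SumGrammar { Expression, Constant }

// Hypothetical helper (not a library function): list the names of an enum.
// A numeric TypeScript enum object also holds reverse mappings ("0", "1", ...),
// so keep only keys that could be nonterminal names (starting with a letter or _).
function enumNames(enumObject: object): string[] {
    return Object.keys(enumObject).filter(key => /^[a-zA-Z_]/.test(key));
}

// Lowercase and sort the names so two lists can be compared regardless of case and order.
function lowerSorted(names: string[]): string[] {
    return names.map(name => name.toLowerCase()).sort();
}

// Hypothetical check: true if the enum has exactly one name per nonterminal,
// comparing names case-insensitively. Assumes the nonterminal names are distinct.
function sameNamesIgnoringCase(enumObject: object, nonterminals: string[]): boolean {
    return lowerSorted(enumNames(enumObject)).join(",") === lowerSorted(nonterminals).join(",");
}

// Nonterminals of sumGrammar, listed by hand for this sketch.
const matches = sameNamesIgnoringCase(SumGrammar, ["expression", "constant"]);
const mismatch = sameNamesIgnoringCase(SumGrammar, ["expression", "constant", "operator"]);
```

Here `matches` is true because `Expression` and `Constant` equal `expression` and `constant` once case is ignored, while `mismatch` is false because the enum has no name for `operator`.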

The syntax of a grammar is itself defined by the following grammar.

  @skip whitespaceAndComments {
    grammar ::= ( production | skipBlock )+
    production ::= nonterminal '::=' union ';'
    skipBlock ::= '@skip' nonterminal '{' production* '}'
    union ::= concatenation ('|' concatenation)*
    concatenation ::= repetition*
    repetition ::= unit repeatOperator?
    unit ::= nonterminal | terminal | '(' union ')'
  }
  nonterminal ::= [a-zA-Z_][a-zA-Z_0-9]*
  terminal ::= quotedString | characterSet | anyChar | characterClass
  quotedString ::= "'" ([^'\r\n\\] | '\\' . )* "'"
                 | '"' ([^"\r\n\\] | '\\' . )* '"'
  characterSet ::= '[' ([^\]\r\n\\] | '\\' . )+ ']'
  anyChar ::= '.'
  repeatOperator ::= [*+?] | '{' ( number | range | upperBound | lowerBound ) '}'
  number ::= [0-9]+
  range ::= number ',' number
  upperBound ::= ',' number
  lowerBound ::= number ','
  characterClass ::= '\\' [dsw]     // e.g. \d, \s, \w
  whitespaceAndComments ::= (whitespace | oneLineComment | blockComment)*
  whitespace ::= [ \t\r\n] 
  oneLineComment ::= '//' [^\r\n]* [\r\n]+ 
  blockComment ::= '/*' [^*]* '*' ([^/]* '*')* '/'
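
As a rough illustration of the `repeatOperator` production above, the sketch below maps each operator form to a minimum and maximum repetition count. The interpretation is an assumption based on the production names (`number`, `range`, `upperBound`, `lowerBound`) and on the usual regular-expression meaning of `*`, `+`, and `?`. `RepeatBounds` and `repeatBounds` are hypothetical names, not part of the library.

```typescript
// Bounds on how many times a unit may repeat; max === Infinity means unbounded.
interface RepeatBounds { min: number; max: number; }

// Hypothetical helper: interpret the text of a repeat operator.
// Whitespace inside braces is not accepted here, since repeatOperator
// is defined outside the @skip block in the grammar above.
function repeatBounds(operator: string): RepeatBounds {
    switch (operator) {
        case "*": return { min: 0, max: Infinity };
        case "+": return { min: 1, max: Infinity };
        case "?": return { min: 0, max: 1 };
    }
    // Matches '{' number '}', '{' range '}', '{' upperBound '}', '{' lowerBound '}'.
    const match = /^\{(\d*)(,?)(\d*)\}$/.exec(operator);
    if (match === null || (match[1] === "" && match[3] === "")) {
        throw new Error("not a repeat operator: " + operator);
    }
    const lower = match[1], comma = match[2], upper = match[3];
    if (comma === "") {
        // number: exactly n repetitions
        const n = parseInt(lower, 10);
        return { min: n, max: n };
    }
    // range, upperBound, or lowerBound: a missing side is left open
    return {
        min: lower === "" ? 0 : parseInt(lower, 10),
        max: upper === "" ? Infinity : parseInt(upper, 10),
    };
}

const exact = repeatBounds("{3}");      // number
const between = repeatBounds("{2,5}");  // range
const atMost = repeatBounds("{,4}");    // upperBound
const atLeast = repeatBounds("{6,}");   // lowerBound
```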

interface Parser<NT> {
    parse(str): ParseTree<NT>;
}

Type Parameters

  • NT

    a TypeScript Enum with one symbol for each nonterminal used in the grammar, matching the nonterminals when compared case-insensitively (so ROOT and Root and root are the same).

Methods

  • parse(str: string): ParseTree<NT>

    Parses a string according to the grammar represented by the parser.

    Parameters

    • str: string

      string to parse

    Returns ParseTree<NT>

    ParseTree representing a successful parse of the string

    Throws

    ParseError if the string cannot be parsed; the error describes approximately where parsing failed
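
The contract of parse() (return a ParseTree on success, throw a ParseError on failure) might be handled by calling code along these lines. Everything here is a self-contained sketch: the `ParseTree`, `Parser`, and `ParseError` declarations are simplified stand-ins for the library's types, and `toySumParser` is a hand-written object that imitates a parser for the sum grammar instead of one produced by compile().

```typescript
enum SumGrammar { Expression, Constant }

// Simplified stand-ins for illustration; the library's own types may differ.
interface ParseTree<NT> { name: NT; text: string; children: ParseTree<NT>[]; }
interface Parser<NT> { parse(str: string): ParseTree<NT>; }
class ParseError {
    constructor(public readonly message: string) {}
}

// Hand-written imitation of a parser for:
//   expression ::= constant '+' constant ;  constant ::= [0-9]+ ;
const toySumParser: Parser<SumGrammar> = {
    parse(str: string): ParseTree<SumGrammar> {
        const match = /^([0-9]+)\+([0-9]+)$/.exec(str);
        if (match === null) {
            throw new ParseError("input does not match expression");
        }
        const constants = [match[1], match[2]].map((text): ParseTree<SumGrammar> =>
            ({ name: SumGrammar.Constant, text: text, children: [] }));
        return { name: SumGrammar.Expression, text: str, children: constants };
    }
};

// Calls parse() and reports either the root of the tree or the parse failure.
function summarize(parser: Parser<SumGrammar>, input: string): string {
    try {
        const tree = parser.parse(input);
        return SumGrammar[tree.name] + " with " + tree.children.length + " children";
    } catch (err) {
        if (err instanceof ParseError) {
            return "ParseError: " + err.message;
        }
        throw err; // anything else is unexpected, so let it propagate
    }
}

const good = summarize(toySumParser, "1+2");
const bad = summarize(toySumParser, "1+");
```

Catching only `ParseError` and rethrowing everything else keeps genuine bugs from being mistaken for malformed input.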