Error Handling and Recovery

All syntactic and semantic errors cause parser exceptions to be thrown. In particular, the methods used to match tokens in the parser base class (match et al.) throw MismatchedTokenException. If the lookahead predicts no alternative of a production in either the parser or lexer, then a NoViableAltException is thrown. The methods in the lexer base class used to match characters (match et al.) throw analogous exceptions.

ANTLR will generate default error-handling code, or you may specify your own exception handlers. Either case results (where supported by the language) in the creation of a try/catch block. Such try{} blocks surround the generated code for the grammar element of interest (rule, alternate, token reference, or rule reference). If no exception handlers (default or otherwise) are specified, then the exception will propagate all the way out of the parser to the calling program.

ANTLR's default exception handling is good to get something working, but you will have more control over error-reporting and resynchronization if you write your own exception handlers.

Note that the '@' exception specification of PCCTS 1.33 does not apply to ANTLR 2.0.

ANTLR Exception Hierarchy

ANTLR-generated parsers throw exceptions to signal recognition errors or other stream problems. All exceptions derive from ANTLRException. The following diagram shows the hierarchy:

[Figure: ANTLR exception hierarchy]

The typical main or parser invoker has a try/catch around the invocation:

    try {
        ...
    }
    catch(TokenStreamException e) {
        System.err.println("problem with stream: "+e);
    }
    catch(RecognitionException re) {
        System.err.println("bad input: "+re);
    }

Lexer rules throw RecognitionException, CharStreamException, and TokenStreamException. Parser rules throw RecognitionException and TokenStreamException.

Modifying Default Error Messages With Paraphrases

The name or definition of a token in your lexer is rarely meaningful to the user of your recognizer or translator. For example, instead of seeing

    Error: line(1), expecting ID, found ';'

you can have the parser generate:

    Error: line(1), expecting an identifier, found ';'

ANTLR provides an easy way to specify a string to use in place of the token name. In the definition for ID, use the paraphrase option:

    ID
    options {
        paraphrase = "an identifier";
    }
        : ('a'..'z'|'A'..'Z'|'_')
          ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
        ;

Note that this paraphrase goes into the token types text file (ANTLR's persistence file). In other words, a grammar that uses this vocabulary will also use the paraphrase.

Parser Exception Handling

ANTLR 2.0 generates recursive-descent recognizers. Since recursive-descent recognizers operate by recursively calling the rule-matching methods, this results in a call stack that is populated by the contexts of the recursive-descent methods. Parser exception handling for grammar rules is a lot like exception handling in a language like C++ or Java. Namely, when an exception is thrown, the normal thread of execution is stopped, and functions on the call stack are exited sequentially until one is encountered that wants to catch the exception. When an exception is caught, execution resumes at that point.

In ANTLR 2.0, parser exceptions are thrown when (a) there is a syntax error, (b) there is a failed validating semantic predicate, or (c) you throw a parser exception from an action.

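The unwinding described above can be illustrated with a small hand-written recursive-descent recognizer. This is only a sketch under stated assumptions: the classes SketchRecognitionException and UnwindSketch, the toy grammar, and all method names are hypothetical. They are not ANTLR-generated code and not part of the ANTLR runtime. The sketch only shows how an exception thrown deep in the rule-method call stack skips the remaining work of the intermediate rule methods until a method with a handler is reached.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a recognition error; NOT the ANTLR runtime class.
class SketchRecognitionException extends Exception {
    SketchRecognitionException(String msg) { super(msg); }
}

// Toy grammar (an assumption made for this sketch):
//   stat : expr ';' ;
//   expr : atom ('+' atom)* ;
//   atom : 'x' ;
class UnwindSketch {
    private final String input;
    private int pos = 0;
    // Records which rule methods were entered and how they were left.
    final List<String> trace = new ArrayList<>();

    UnwindSketch(String input) { this.input = input; }

    // One character of lookahead; '\0' marks end of input.
    private char la() { return pos < input.length() ? input.charAt(pos) : '\0'; }

    // Consume the expected character or signal a mismatch.
    private void match(char c) throws SketchRecognitionException {
        if (la() != c) {
            throw new SketchRecognitionException(
                "expecting '" + c + "', found '" + la() + "'");
        }
        pos++;
    }

    // Only stat() has a handler, so errors from expr() and atom() end up here.
    boolean stat() {
        trace.add("enter stat");
        try {
            expr();
            match(';');
            trace.add("exit stat normally");
            return true;
        } catch (SketchRecognitionException ex) {
            trace.add("caught in stat: " + ex.getMessage());
            return false;
        }
    }

    // No handler: an exception thrown below simply passes through.
    private void expr() throws SketchRecognitionException {
        trace.add("enter expr");
        atom();
        while (la() == '+') {
            match('+');
            atom();
        }
        trace.add("exit expr normally");
    }

    // No handler: the mismatch is thrown from match() inside this method.
    private void atom() throws SketchRecognitionException {
        trace.add("enter atom");
        match('x');
        trace.add("exit atom normally");
    }

    public static void main(String[] args) {
        UnwindSketch p = new UnwindSketch("x+y;");
        p.stat();
        p.trace.forEach(System.out::println);
    }
}
```

For the input "x+y;" the second call to atom() fails, and neither atom() nor expr() records a normal exit for that call; the trace ends with the entry written by the handler in stat(). Execution then resumes in stat(), which is the behavior the text describes for the nearest enclosing handler.
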
Whatever the cause of a parser exception, the recursive-descent functions on the call stack are exited until an exception handler is encountered for that exception type or one of its base classes (in non-object-oriented languages, the hierarchy of exception types is not implemented by a class hierarchy).

Exception handlers arise in one of two ways. First, if you do nothing, ANTLR will generate a default exception handler for every parser rule. The default exception handler will report an error, sync to the follow set of the rule, and return from that rule. Second, you may specify your own exception handlers in a variety of ways, as described later.

If you specify an exception handler for a rule, then the default exception handler is not generated for that rule. In addition, you may control the generation of default exception handlers with a per-grammar or per-rule option.

Specifying Parser Exception-Handlers

You may attach exception handlers to a rule, an alternative, or a labeled element. The general form for specifying an exception handler is:

    exception [label]
    catch [exceptionType exceptionVariable]
      { action }
    catch ...
    catch ...

where the label is used only when attaching exceptions to labeled elements. The exceptionType is the exception (or class of exceptions) to catch, and the exceptionVariable is the variable name of the caught exception, so that the action can process the exception if desired. Here is an example that catches an exception for the rule, for an alternate, and for a labeled element:

    rule:   a:A B C
        |   D E
            exception // for alternate
              catch [RecognitionException ex] {
                reportError(ex.toString());
              }
        ;
        exception // for rule
        catch [RecognitionException ex] {
           reportError(ex.toString());
        }
        exception[a] // for a:A
        catch [RecognitionException ex] {
           reportError(ex.toString());
        }

Note that exceptions attached to alternates and labeled elements do not cause the rule to exit. Matching and control flow continue as if the error had not occurred.
Because of this, you must be careful not to use any variables that would have been set by a successful match when an exception is caught.

Default Exception Handling in the Lexer

Normally you want the lexer to keep trying to get a valid token upon lexical error. That way, the parser doesn't have to deal with lexical errors and ask for another token. Sometimes you want exceptions to pop out of the lexer, usually when you want to abort the entire parsing process upon syntax error. To get ANTLR to generate lexers that pass RecognitionException instances on to the parser as TokenStreamException instances, use the defaultErrorHandler=false grammar option. Note that IO exceptions are passed back as TokenStreamIOException instances regardless of this option.

Here is an example that uses a bogus semantic exception (which is a subclass of RecognitionException) to demonstrate blasting out of the lexer:

    class P extends Parser;
    {
    public static void main(String[] args) {
            L lexer = new L(System.in);
            P parser = new P(lexer);
            try {
                    parser.start();
            }
            catch (Exception e) {
                    System.err.println(e);
            }
    }
    }

    start : "int" ID (COMMA ID)* SEMI ;

    class L extends Lexer;
    options {
            defaultErrorHandler=false;
    }

    {int x=1;}

    ID  : ('a'..'z')+ ;

    SEMI: ';'
          {if ( expr )
           throw new SemanticException("test", getFilename(), getLine());} ;

    COMMA:',' ;

    WS  : (' '|'\n'{newline();})+
          {$setType(Token.SKIP);}
        ;

When you type in, say, "int b;" you get the following as output:

    antlr.TokenStreamRecognitionException: test

Version: $Id: //depot/code/org.antlr/release/antlr-2.7.0/doc/err.html#3 $