VT420 Parser

A complete parser manual is also available.



With the increased popularity of PCs in the MIT community, one of the most needed networking applications is a secure telnet client for PCs. The VT420 parser I have been working on will be used by a new Kerberized telnet client for DOS, Microsoft Windows, Windows NT, and Windows 95. The client is being developed by McGill University in Canada and will be available to MIT once it is released. The source code for the parser, however, is available immediately.

The main job of the parser is, of course, to parse the data stream by identifying the ASCII characters and terminal commands and reporting them to the higher-level code. To identify the terminal commands, the parser has to match the data against the possible syntaxes the commands can have. The five types of command syntax in a VT data stream are:

Both CSI and DCS sequences can have multiple optional parameters and may combine several commands in a single sequence. Finally, there are exceptions that do not fit in these five categories. In total, the parser recognizes nearly 300 commands.

An important issue is integrating the parser into an application in a way that allows parsing to occur while the data is still being received. To satisfy this requirement, the parser has been implemented as a finite state machine that preserves its state between calls. On each call the parser is presented with a fragment of the data stream. If the parser determines that a command sequence has been split across data fragments, it waits until the last fragment of the sequence has been passed in before completing the parse. Because all essential state variables have been moved into a dynamically allocatable structure, this implementation of the parser can parse data streams from several independent processes concurrently.
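
The idea of a state machine that keeps its state between calls can be illustrated with a minimal sketch. It is not the actual parser: the names vt_parser, vt_event, vt_new and vt_parse are hypothetical, only plain characters and simple CSI sequences are handled, and other escape sequences are simply dropped. Each call consumes bytes until one event is complete or the fragment runs out, so a sequence split across two fragments is reported only after its final byte arrives.

```c
#include <stddef.h>
#include <stdlib.h>

#define VT_MAX_PARAMS 16

/* All names below are hypothetical; they are not the real parser's API. */
typedef enum { VT_GROUND, VT_ESC, VT_CSI } vt_state;
typedef enum { VT_EV_NONE, VT_EV_CHAR, VT_EV_CSI } vt_event_type;

typedef struct {
    vt_event_type type;         /* VT_EV_NONE means "need more data" */
    unsigned char ch;           /* the character, or the CSI final byte */
    int params[VT_MAX_PARAMS];
    int nparams;
} vt_event;

/* Everything that must survive between calls lives here, so each
   data stream can own a separately allocated parser. */
typedef struct {
    vt_state state;
    int params[VT_MAX_PARAMS];
    int nparams;
} vt_parser;

vt_parser *vt_new(void)
{
    /* calloc zeroes the structure, so state starts as VT_GROUND (0) */
    return (vt_parser *)calloc(1, sizeof(vt_parser));
}

/* Consume bytes until one event is complete or the fragment ends.
   Returns the number of bytes consumed. */
size_t vt_parse(vt_parser *p, const unsigned char *buf, size_t len,
                vt_event *ev)
{
    size_t i;
    int k;

    ev->type = VT_EV_NONE;
    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];
        switch (p->state) {
        case VT_GROUND:
            if (c == 0x1B) {            /* ESC starts a command */
                p->state = VT_ESC;
                break;
            }
            ev->type = VT_EV_CHAR;      /* everything else is reported as is */
            ev->ch = c;
            return i + 1;
        case VT_ESC:
            if (c == '[') {             /* ESC [ introduces a CSI sequence */
                p->params[0] = 0;
                p->nparams = 1;
                p->state = VT_CSI;
            } else {
                p->state = VT_GROUND;   /* other sequences dropped in this sketch */
            }
            break;
        case VT_CSI:
            if (c >= '0' && c <= '9') {
                int *v = &p->params[p->nparams - 1];
                if (*v < 10000)         /* guard against overflow */
                    *v = *v * 10 + (c - '0');
            } else if (c == ';') {
                if (p->nparams < VT_MAX_PARAMS)
                    p->params[p->nparams++] = 0;
            } else if (c >= 0x40 && c <= 0x7E) {    /* final byte */
                ev->type = VT_EV_CSI;
                ev->ch = c;
                ev->nparams = p->nparams;
                for (k = 0; k < p->nparams; k++)
                    ev->params[k] = p->params[k];
                p->state = VT_GROUND;
                return i + 1;
            }
            break;
        }
    }
    return len;     /* fragment exhausted; state is kept for the next call */
}
```

In this sketch, feeding the bytes ESC [ 1 in one call yields no event, and feeding 2 ; 5 H in the next call yields a single CSI event with final byte H and parameters 12 and 5. Two parsers created with vt_new share nothing, which is what allows independent streams to be parsed side by side.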

Unlike a real VT420 terminal, the parser has enhanced error-recovery and error-diagnostic features. Instead of silently ignoring errors, the parser can optionally report them to the higher-level code for logging or for notifying the user. It can also optionally provide the application with dumps of the unknown or invalid control sequences it receives, which can be very useful when debugging host software. Also unlike a real VT terminal, the parser does not discard an entire compound command because of an error in one part of it; it ignores (or reports, if required) only the offending part. Finally, the parser implements the locator-controlling codes (for devices such as a mouse or trackball) that DEC left undocumented, as well as the non-standard control sequences for file-system interaction that a few terminal emulators implement.
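
The handling of compound commands can be illustrated with a small sketch. This is only an assumption about how such logic might look, not the parser's code: error_log, mode_known and set_modes are hypothetical names, and the set of supported modes is limited to two ANSI modes (4, insert/replace, and 20, line feed/new line) for brevity. Each part of the compound sequence is handled on its own, and a bad part is either dropped silently or recorded, depending on a flag.

```c
#include <stddef.h>

#define MAX_ERRORS 8

/* Hypothetical error log; the real parser's reporting interface may differ. */
typedef struct {
    int report_errors;      /* nonzero: record rejected parts */
    int bad[MAX_ERRORS];    /* rejected parameter values */
    int nbad;
} error_log;

/* Assumption for this sketch: only modes 4 (IRM) and 20 (LNM) are supported. */
static int mode_known(int mode)
{
    return mode == 4 || mode == 20;
}

/* Apply every valid part of a compound "set mode" sequence.
   Invalid parts are skipped (and logged if requested) without
   discarding the rest. Returns the number of modes applied. */
int set_modes(const int *params, int nparams,
              unsigned long *mode_bits, error_log *log)
{
    int i;
    int applied = 0;

    for (i = 0; i < nparams; i++) {
        if (mode_known(params[i])) {
            *mode_bits |= 1UL << params[i];
            applied++;
        } else if (log != NULL && log->report_errors
                   && log->nbad < MAX_ERRORS) {
            log->bad[log->nbad++] = params[i];
        }
    }
    return applied;
}
```

Given the parameters 4, 99 and 20, this sketch sets both valid modes and, if reporting is enabled, records 99 as the single offending part.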

Two of the high points of this implementation are the parser's adaptability and portability. For each of the five syntaxes the parser uses a separate table, which defines the commands and makes it possible to set the defaults, flags, type and number of arguments, and other parameters. Portability is achieved by coding the parser entirely in ANSI C, which is why I have been able to compile and run the parser with a test application on both DOS and UNIX platforms.
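
A table-driven design of this kind might look roughly like the following sketch. The layout of the real tables is not shown here, so the csi_entry structure, its fields, and the csi_lookup and csi_normalize functions are all hypothetical; the four entries use standard VT mnemonics (CUU, CUD, CUP, ED) with their usual defaults. Adding or changing a command then means editing a table row rather than the parsing code.

```c
#include <stddef.h>

/* Hypothetical table entry for one CSI command. */
typedef struct {
    char final;             /* final byte of the sequence */
    const char *name;       /* mnemonic, for diagnostics */
    int max_params;         /* number of arguments the command takes */
    int default_value;      /* used when an argument is omitted or zero */
} csi_entry;

static const csi_entry csi_table[] = {
    { 'A', "CUU", 1, 1 },   /* cursor up */
    { 'B', "CUD", 1, 1 },   /* cursor down */
    { 'H', "CUP", 2, 1 },   /* cursor position */
    { 'J', "ED",  1, 0 }    /* erase in display */
};

/* Find the table entry for a final byte, or NULL if unknown. */
const csi_entry *csi_lookup(char final)
{
    size_t i;

    for (i = 0; i < sizeof csi_table / sizeof csi_table[0]; i++)
        if (csi_table[i].final == final)
            return &csi_table[i];
    return NULL;
}

/* Copy the received arguments into out[], filling in the table's
   default for omitted or zero arguments. Returns the number of
   arguments written, or -1 for an unknown command or too many
   arguments. out must have room for max_params values. */
int csi_normalize(char final, const int *in, int nin, int *out)
{
    const csi_entry *e = csi_lookup(final);
    int i;

    if (e == NULL || nin > e->max_params)
        return -1;
    for (i = 0; i < e->max_params; i++) {
        int v = (i < nin) ? in[i] : 0;
        out[i] = (v == 0) ? e->default_value : v;
    }
    return e->max_params;
}
```

With this sketch, a cursor-position command received with a single zero argument is normalized to row 1, column 1, while an unknown final byte is rejected by the lookup instead of by special-case code.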

With the agreement of McGill University, a formal announcement of this implementation of the VT420 parser was made to the relevant Usenet groups on Wednesday, August 16, 1995.

If you have any comments or suggestions, or have discovered a bug or a misfeature, please email the DOS and Windows development group at dosdev@mit.edu


For information about video terminal products, I would recommend browsing an archive maintained by Richard Shuford.
Igor Lyubashevskiy / igorlord@mit.edu