If you've ever tried to create web pages that contain mathematical notation for expressions and equations, you've learned just how inconvenient HTML (HyperText Markup Language) is for such tasks. If you haven't attempted this yet, then be forewarned that providing mathematical notation on a web page isn't as easy as providing ordinary text. In this "how to" document we will look at some of the options for placing non-interactive mathematical content on the web.
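Here is the crux of the problem. There are (at least) three important requirements for effectively representing and/or presenting mathematical content: access to the special symbols that mathematics uses, a way to capture the structural relationships among the symbols in an expression, and two-dimensional control over how the expression is laid out.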
The idea behind the first requirement is that mathematical expressions often contain some special symbols that are used to represent operations or constants or other concepts within the intellectual domain being discussed. While it may be possible to substitute other symbols for the preferred ones, such as using an "L" in place of a lambda, this is undesirable and creates a burden on the author to explain the unusual choice of symbols.
The second requirement tries to capture the notion that the placement of symbols in expressions is not arbitrary; the symbols are combined in specific ways to signify relationships or to reflect the algebraic or analytic steps that could be performed. For example, since we expect "1+2*3" to evaluate to 7, we wouldn't consider it acceptable for a browser to display "+123*". Even though all the correct symbols are present the placement of the symbols doesn't capture the intended interpretation of the expression. Each piece plays a specific role in the overall expression, and when we read and write expressions it is the relative placement of those pieces that gives us visual cues to their roles.
Finally the third requirement reflects the reality that not all expressions are presented with all the symbols in a linear order; fractions and subscripts and integral limits and suchlike all cause us to want to place symbols in a particular location with respect to other symbols. As the expressions increase in size and complexity, linearization becomes increasingly impractical and illegible. We want two-dimensional control over the rendering of the expression so that we can efficiently create visual cues about the relationships between the parts of the expression.
All this may sound straightforward, but HTML, which derives from SGML (Standard Generalized Markup Language), was originally intended to deal mostly with the second requirement. The idea of a markup language was to insert some special codes --- the markup tags --- in a text file to indicate the structural relationships between the pieces of the document. Markup languages were not expected to replace WYSIWYG (What You See Is What You Get) editors and professional layout tools; the programs used to view markup files (like web browsers) were to have a great deal of latitude in rendering the structured content in order to account for differences in document display across heterogeneous computing environments and to accommodate user-specified preferences. This has sometimes been interpreted as a prohibition on creating markup tags whose only purpose was to specify certain visual renderings, something an author might desire in order to address our third requirement of exerting two-dimensional control over the presentation of mathematical expressions.
Markup languages based on SGML didn't preclude the inclusion of mathematical notation or graphics, but until the explosion of interest in the web most of the R&D effort was directed at providing mechanisms for organizing and indexing plain text. Since (artistic arguments aside) the meaning of text is not dependent upon appearance-oriented controls, tags that support rendering concerns have historically been viewed a bit negatively by the established HTML/SGML community. The preferred way to support complex mathematical content would have been to substantially extend the specification of HTML to capture the structure of expressions; browsers implementing the extensions would be expected to somehow deal with the third requirement in a way that was consistent with the markup structure of the mathematical expressions. A "math mode" extension that was a hybrid of our second and third requirements was considered during the HTML 3.0 draft [1]; unfortunately little headway was made on the proposal and no mention of it remains in the current HTML 4.0 draft [2]. For our first requirement there is some support for special symbols within HTML by using markups called "entities", but the list of officially recognized entities is not sufficient for the variety of mathematical usages and as the list is static (it is part of a standard) there is no way for you to augment it to include the missing symbols that you would want to use.
Periodically an effort is launched to revise HTML to improve its features and to cope with the compatibility pressures of an active commercial browser market. While the math mode of the HTML 3 draft didn't survive --- except for the <sup> and <sub> tags that provide simple superscripts and subscripts --- there does seem to be a resurgence of this effort within the World Wide Web Consortium as something called "MathML", the Mathematical Markup Language [3]. The one aspect of math support that is present in the HTML 4 draft is an increased number of special-character entities. This still has the disadvantage of being a static list, but it is at least a much longer list containing many commonly-used mathematical symbols. Given current progress both in HTML standardization and independent vendor efforts to enable specialized font support in web pages [4] [5], the future looks good for eventually satisfying our first requirement in a portable (i.e., non-browser-specific) way; perhaps someday MathML may result in something that will address all three requirements.
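As a rough illustration of how far the <sup> and <sub> tags and the symbol entities can take you, here is a minimal sketch. The expressions are invented for illustration, and whether each entity actually displays depends on the browser and the fonts installed on the reader's machine.

```html
<!-- Sketch only: simple one-line notation built from sup/sub tags
     and symbol entities; no two-dimensional layout is possible. -->
<p>E = mc<sup>2</sup></p>
<p>x<sub>1</sub> + x<sub>2</sub> &le; &lambda;</p>
<p>&int; f(x) dx &nbsp; &radic;2 &nbsp; &sum; a<sub>i</sub></p>
```

Note that everything stays on a single line: the integral has no limits above and below it, and the radical sign has no bar over its argument. This covers part of the first requirement and very little of the third.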
Getting back to the purpose of this article, there are many ways to try to cope with mathematical content on the web, so let's look at some of them and consider their strengths and weaknesses.
Perhaps the simplest mechanism for dealing with equations is to treat HTML somewhat like a typewriter with memory. You can use the <pre> tags to contain text that you want presented literally the way you typed it, with no attempt by the browser to adjust or reformat it. This works much like the LaTeX verbatim environment and as such is only useful for simple notation and simple layouts. The layout is performed by manually positioning text instead of by interpreting the structure of the expression. The result is crude, doesn't readily scale to large expressions, and modifying or reorganizing the preformatted text is very error-prone and time-consuming.
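A small sketch of the preformatted approach, using the quadratic formula as an arbitrary example, might look like this. The spacing is entirely manual, and "sqrt" and "^" stand in for symbols that plain text cannot supply.

```html
<!-- Sketch only: a fraction laid out by hand inside pre tags.
     The browser preserves the spaces and line breaks exactly. -->
<pre>
         -b + sqrt(b^2 - 4ac)
    x = ----------------------
                 2a
</pre>
```

Changing the numerator means re-centering the denominator and re-measuring the fraction bar by hand, which is where the error-prone editing mentioned above comes from.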
Given the obvious limitations of preformatted text, the next step that web-page authors can take is to present their mathematical content as graphic images on the web. This can be done simply by scanning printed pages, but with the large file sizes, the resulting binary-image printing problems, and the performance bottlenecks of slow internet connections and heavily-loaded web servers, you quickly have to move on to some better solution. There are various conversion utilities---some of them available as freeware---that take a particular document format, extract ordinary text as text with HTML markup, and intersperse the text with small graphic images for those portions too complex for HTML (such as mathematical equations). Probably the best-known utility of this type is latex2html [6]. For LaTeX written specifically for latex2html processing the program is simple enough to use once installed and configured, but you have to be comfortable with the steps required to ftp, uncompress, untar, build, and install software in a Unix environment in order to use it. It only supports a subset of LaTeX, so you will end up substantially editing the source LaTeX, the resulting HTML, or the latex2html configuration information in order to get good results.
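The general shape of text interspersed with equation images can be sketched by hand as follows. This is not actual latex2html output; the file name "img1.gif" and the surrounding sentence are made up for illustration.

```html
<!-- Sketch only: ordinary text stays as HTML, the equation is an image.
     "img1.gif" is a hypothetical file; the alt text gives readers
     without image support a linearized fallback. -->
<p>
The roots of the quadratic are
<img src="img1.gif" align="middle"
     alt="x = (-b +/- sqrt(b^2 - 4ac)) / 2a">
whenever <i>a</i> is nonzero.
</p>
```

Supplying alt text is worth the effort, since it is the only part of the equation that survives when the image fails to load or the page is read in a text-only browser.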
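Many browsers allow you to launch an external program to display particular types of files; these programs are often referred to as "viewers". In outline, the browser downloads the file, recognizes what type of file it is, and hands the file to whichever program has been configured for that type.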
External viewers are often used to make it easier to place files on the web that have some complicated kind of document format. For example, in the Unix environment you might modify your configuration of Netscape or Mosaic to launch "xdvi" whenever you try to download the DVI output of TeX or LaTeX. There are few limitations on external viewers; usually you just need some way to tell your browser how to launch the program and pass along the file that has been downloaded. The trickier part is in getting the MIME information correct, but the details of that are beyond the scope of this article. Refer to the documentation available with your particular browser for more information.
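On the authoring side, the web page itself needs nothing more than an ordinary link to the file. The sketch below assumes a hypothetical file named "notes.dvi" and uses "application/x-dvi", a conventional but not guaranteed MIME type for DVI files. The type attribute is only a hint; what the browser actually acts on is the type reported by the web server.

```html
<!-- Sketch only: a plain link to a DVI file for an external viewer.
     "notes.dvi" is a hypothetical file name, and the type attribute
     is advisory; the server's reported MIME type is what counts. -->
<p>
The full derivation is available as
<a href="notes.dvi" type="application/x-dvi">a DVI file</a>
(requires a DVI viewer such as xdvi).
</p>
```

Telling readers in the link text which viewer they will need saves them from a download that their browser cannot display.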
Note that how good the appearance of the mathematical content is, and how easy it is to work with, now has nothing to do with the browser and is dependent entirely on the software used to produce the document and on the external viewer. While this sounds like a possible win, practically speaking it can be a clunky solution that requires an appropriate viewer program to be available on all the different kinds of computing platforms used by the readers of those web pages, and requires those readers to possess some relatively subtle knowledge about how web browsers are configured. It also has the disadvantage of losing the integration of the math content with other material that is being provided in HTML, comparable to situations where you have to continually bounce between different reference books for pieces of related information.
Plug-ins are a refinement of the notion of an external viewer. A plug-in is a program that knows how to integrate its functionality with the web browser so completely that you might not even notice it is there. Usually no separate window will appear; whatever the plug-in will do for you, it displays within the confines of the browser window. While plug-ins provide a more integrated solution than external viewers, they tend to be available on very few platforms; typically a given plug-in will only work on Mac, or on Windows, but rarely both and almost never in a Unix environment. With this caveat, there are some plug-ins that you really should look at if you want to place mathematical content on the web. None of these is in the public domain, but all can be downloaded for free.
One of the most flexible multi-platform solutions is provided by Adobe's Acrobat Reader [7]. Depending on your choice of computing environment (including Mac, MS Windows, and several versions of Unix), Acrobat Reader is available as either a plug-in or as an external viewer. The advantage of Acrobat is that you can convert any PostScript file into Acrobat's Portable Document Format (PDF) using the "distiller" functionality (only available in the commercial versions of Acrobat). The distiller is easy to use and will let you create small document files that can provide web-like navigation features and good renderings of mathematical notation. Whenever you want to revise the PDF files you just revise the original documents (in Word, FrameMaker, LaTeX, or whatever you prefer), generate fresh PostScript, and distill the PostScript into PDF again.
techexplorer is an impressive plug-in that supports a large subset of LaTeX. Unfortunately it only exists on 32-bit MS Windows platforms [8], but if you only need a single-platform solution and you have a large existing collection of LaTeX documents, then techexplorer is something to really pay attention to. Even without hand-editing my original LaTeX files I found techexplorer did a decent job of displaying them, which I have found is rarely the case with latex2html. I wish it had support for style files as they form the mainstay of most LaTeX-based publications, but otherwise I was quite pleased with this program.
As a Java applet, WebEQ is the most cross-platform solution available for placing mathematical notation on the web [9]. It supports a very small subset of TeX and LaTeX called WebTeX, which is mostly restricted to features that are used for displaying mathematical expressions. WebEQ actually began its existence by trying to implement the now-defunct HTML 3.0 math mode, and in the future the developers hope to add support for MathML. In the meantime, however, you'll need to learn WebTeX and the basics of how to integrate Java applets into web pages in order to use WebEQ. Here is an example HTML page for displaying an equation containing radical signs. It assumes that the directory containing this web page itself has a subdirectory named "classes" containing the WebEQ fonts and compiled Java classes. The "size" parameter is the approximate point size of the font that will be used for display. The "eq" parameter is the equation or expression to be rendered.
    <html>
    <head>
    <title>WebEQ Test</title>
    </head>
    <body>
    <applet codebase="classes" code="geom.webeq.app.mdraw" width=500 height=100>
    <param name=size value=36>
    <param name=color value="#ffffff">
    <param name=eq value="y=\sqrt{x}+\root{3}{2+\mu}">
    </applet>
    </body>
    </html>

which would result in the web page displaying (after a delay to load all the applet materials) in some suitable font:
Note that your browser must be Java-enabled to see the above result.
The Word Viewer plug-in may interest those of you who use Microsoft Word, particularly within either 16-bit or 32-bit MS Windows environments. The Word Viewer will display Word documents over the web, and those documents could contain whatever mathematical notation you've constructed within Word [10]. So far I've found that the viewer plug-in itself works well, but I haven't had much luck with getting the Equation Editor included with Word in Microsoft Office 97 to do anything useful without crashing. Even without the use of this utility or Word itself, the Word Viewer could still be useful for anybody who has a word processor that can save files in Rich Text Format (RTF); the viewer is equally happy reading either RTF or Word documents.
I began this column by saying there are at least three important requirements for effectively representing and/or presenting mathematical content. The reason for the qualification "at least" is because there are, of course, a host of intellectual or pedagogical objectives that have little to do with the mechanics of placing mathematical notation on the web. Ultimately it is these other concerns you want to be able to devote your energy to; if that were not the case we all might as well still be grinding our own ink and cutting quills to use as pens. While web-based technologies still have a little ways to go before authors will have the simplicity of preparation and flexibility of presentation that we all desire, attainment of that goal is not far off in the future. We have some workable tools now and they are only going to get better over time.