Document Style of Technical Paper in HTML

Last update: 14:00, July 27, 1997


Conceptual Image


Hiroshi Sakuta
hsakuta@mit.edu
Room: 1-380, Civil & Environmental Eng. Dept.
MIT, 77 Massachusetts Ave.
Tel: 253-8925

My thanks to Prof. Connor and Carlos.


1. Introduction

2. Technical Documents in Conventional Style

3. New Style for Technical Documents in HTML


1. Introduction

As the Web comes into use for engineering, many technical papers are waiting to be rewritten and released in HTML. Although HTML documents serve many categories of Web site, engineering documents have several distinctive features.

1) Figures


Highly accurate figures are required, because their details may carry critical meaning. A web browser cannot show a picture at a higher resolution than the CRT it is displayed on.
e.g.: a photograph of a material's microstructure, or an X-ray picture of a patient's lung

2) Symbols

Technical documents contain irregular symbols and irregular uses of symbols.
e.g.: subscripts, superscripts, hats, underscores, primes, vector characters, special characters (the Greek character set, Roman numerals), and mathematical symbols and formulas (integral, root, etc.)
Although the SGML ISO document standard has been proposed to solve such problems, the best way to present mathematical formulas is still TeX.
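
As an illustration of the notation listed above, the following TeX fragment combines a hat, a prime, a subscript, superscripts, Greek letters, vector arrows, a root, and an integral. The formula is invented for this example and carries no physical meaning; it only shows how compactly the TeX source expresses these symbols.

```latex
\[
  \hat{x}_i' = \sqrt{\alpha^2 + \beta_i^2}, \qquad
  \vec{F} = \int_0^L \sigma(x)\, \vec{n}\, dx
\]
```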

3) Graphs

Graphs and charts have detailed shapes and slopes, and also a logical structure.
They also include special characters and graphical symbols (rectangles, solid circles, etc.). Although a graph is a logical entity, once it is drawn on paper or on screen it becomes a mere graphical image. Of course, Internet programming offers a good solution for this.

4) Strong indexing

The logical structure of a technical document is usually reflected in its style and structure, so the reader has to trace and follow that structure along many paths. If the document has no index or symbol list, the reader may understand less than he otherwise would.
Fortunately, HTML provides several kinds of indexing through its tag system.
HTML is a markup language, of course, but it has no facility to control the logical structure of a document. This is why an engineer cannot write a technical document in HTML as its primary form. HTML browsers tolerate inconsistent indexes and interpret tags as well as they can, even when the tags are incorrect. In fact, as far as the author knows, there is no automatic indexing tool for HTML.
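
To make the idea of automatic indexing concrete, here is a minimal sketch in Python. It is not a tool described in this paper: the function name `add_index`, the `sec1`, `sec2`, ... anchor names, and the restriction to plain `<h1>` to `<h6>` tags without attributes are all assumptions made for the illustration.

```python
import re

# Matches plain headings such as <h2>Title</h2>; headings that carry
# attributes are deliberately not handled in this sketch.
HEADING = re.compile(r"<h([1-6])>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)


def add_index(html):
    """Give every heading an id and return (new_html, index_html)."""
    entries = []

    def tag_heading(match):
        level = int(match.group(1))
        title = match.group(2).strip()
        anchor = f"sec{len(entries) + 1}"  # hypothetical naming scheme
        entries.append((level, anchor, title))
        return f'<h{level} id="{anchor}">{title}</h{level}>'

    body = HEADING.sub(tag_heading, html)

    # Build a flat link list; the heading level is kept in `entries`
    # in case a nested index is wanted later.
    lines = ["<ul>"]
    for _level, anchor, title in entries:
        lines.append(f'<li><a href="#{anchor}">{title}</a></li>')
    lines.append("</ul>")
    return body, "\n".join(lines)
```

Calling `add_index` on the source of a page returns the page with anchored headings and a separate link list that can be placed in a contents frame. Because the index is regenerated from the headings each time, it cannot drift out of step with the text.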

2. Technical Documents in Conventional Style

2.1 Main text

The main text is the backbone of the document and leads readers to understand the point of the argument. A short document such as a journal paper has an abstract at the top and no chapter contents. In general, the main text means the text itself together with its footnotes and equations.

2.2 Lists

In conventional printed technical documents (abbr. Paper Doc.), four types of lists are provided, the first three usually at the front:
1) Chapter contents:
The numbers of chapters and sections are fixed according to the hierarchy of the document. Their order follows custom (e.g. 1. Introduction, 2. Objects, 3. Experimental Apparatus, etc.).
2) Figure and Table lists:
These exist almost solely for the publisher. Many readers, and even the author, never refer to them. They are a kind of certificate of orderly writing.
3) Notations:
The notations used in the document are listed and explained. In well-edited documents, the notation is unified throughout the document or throughout the chapter(s) the list covers.
4) References:
The cited bibliography is listed. When there are too many entries, each chapter sometimes has its own list at its end. Each entry gives the content of the reference and the notation used to cite it. Half of them are usually the authors' own papers.

2.3 Figures and Tables

Generally, figures and tables relate to a part of the main text and are embedded in it, placed as close to that part as possible so that both can be taken in at one glance. When they present a graphical or logical statement, the related equations or tables can be found nearby. The layout of the main text with its figures and tables shows how well finished and beautifully printed the document is. Layout is regarded as important because every document is meant to be printed and bound as a book.

2.4 After publication

Every document contains errors, just as every program has bugs in it. Conscientious authors and publishers try to revise the book or document. But once the document has spread into the sea of readers, who can reach every reader? The best one can do is to publish a revised second edition or to announce the errors in the next report.

2.5 Documents in TeX

Many engineers and scientists write their documents in TeX or a system derived from it. TeX was proposed and implemented by Professor Knuth of Stanford University as a tool for printing high-quality mathematical equations and papers, and it has been supported by many scientists and engineers. The reason TeX has stayed in use for the past two decades is that it is the only structural documentation system. This means not only that TeX numbers the chapters, but also that it controls the figure and table lists and the cross-references. The author need not worry about the correspondence between the citing words in the main text and the numbering of the figures. It is said, however, that TeX is not friendly to the average computer user. In a sense, TeX is an overly restrictive word processor, and almost a programming language. Average users first have to learn this programming language and then write the program source. After writing, the computer requires them to fix the bugs in their TeX source. After debugging, the user is freed from the detailed troubles of editing, although he has to produce a well-ordered document, which of course does not always mean well-reasoned content ;-).
Hypertext and markup documents originate from this kind of system. They have a source in which control markers are placed where necessary, and they need some processing to become documents that humans can read easily. They can be output as printed material as beautiful as conventional documents. Oh, we cannot free ourselves from the custom of reading books. The software "spectacles" for viewing such documents are nowadays called a "Previewer" or a "Browser". The output is usually used for "camera-ready" manuscripts.

3. New Style for Technical Documents in HTML

Before describing a style for technical documents, I will introduce a concept of how papers and books are read and understood.
Almost all printed matter in the world is not on a technical theme, and through such documents we immerse ourselves in an imaginary world, as when absorbed in a novel. The reader keeps reading calmly and stays still. I would like to call this "Static-Reading". In a technical document, on the contrary, the reader must pursue and trace the logic in it while staying aware of the real world. He has to be skeptical and reread earlier parts of the document. He cannot stay in an imaginary world. His eyes move busily, he turns the pages back and forth to compare figures, and he may keep another journal at his side to check the logic. I want to define this style as "Dynamic-Reading". In dynamic reading, the reader needs effort to keep reading, because in a sense the work is creative: he has to construct a logical structure in his mind. He has to row to move forward, while the reader in static reading sits calmly in a canoe without an oar, carried by the current. The author considers that dynamic reading needs a dynamic document style.
We are not very accustomed to either dynamic reading or writing dynamic documents. Messy documents composed of many windows and uncontrolled link tags will confuse readers, so I rearrange the components of the document and its structure.

3.1 Advantages of HTML

Link system:
To follow the logical flow of the document, the author can freely link the main text to figures or to other parts of the text. An important feature of an HTML document is reverse linkage, which is impossible or very troublesome in a Paper Doc. If the reader is interested in a figure, he can follow a link back to the place in the main text where it is referred to.
Revision control:
Time for publishing:
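
The reverse linkage described under "Link system" can be sketched in Python as follows. This is only an illustration under stated assumptions: figures are assumed to be cited with links of the exact form `<a href="#figN">`, and the names `add_back_links`, `back_link_html`, and the `ref-figN-K` anchor scheme are hypothetical.

```python
import re
from collections import defaultdict

# Assumed citation form in the main text: <a href="#fig1">Fig. 1</a>
REF = re.compile(r'<a href="#(fig\d+)">')


def add_back_links(text):
    """Tag each figure citation with an id; return (text, {figure: [ids]})."""
    cited_from = defaultdict(list)

    def tag_reference(match):
        figure = match.group(1)
        ref_id = f"ref-{figure}-{len(cited_from[figure]) + 1}"
        cited_from[figure].append(ref_id)
        return f'<a id="{ref_id}" href="#{figure}">'

    tagged = REF.sub(tag_reference, text)
    return tagged, dict(cited_from)


def back_link_html(ref_ids):
    """Render the links to show beside a figure, one per citation."""
    links = ", ".join(
        f'<a href="#{ref_id}">{n}</a>' for n, ref_id in enumerate(ref_ids, 1)
    )
    return f"Cited in the text at: {links}"
```

The dictionary returned by `add_back_links` maps each figure to every place that cites it, so the line produced by `back_link_html` can be placed in the figure window to lead the reader back to the main text.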

3.2 Components of Document

The components of a technical document are as follows.

3.3 Lists

In HTML documents, lists are link lists, and they play an important role in dynamic reading.

3.4 Windows required and their link system

a) Main window
The main window is divided into three frames: the main identification frame (such as the main title), the chapter contents list, and the text itself. These can be associated with their counterparts in a Paper Doc.
b) Fig/Tab list window
c) Fig/Tab windows
d) Notations/equations windows
e) Index window
f) Reference window

3.5 Software

Requirements:
  automatic link tag generator
  debugging of structure
Applicable current software:
Developed software:
  1. Corresponder
  2. Graph template
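
The "debugging of structure" requirement can be illustrated with a small Python sketch. It is not the Corresponder listed above, whose design this paper does not describe; the function name `check_links` and the convention that link targets are `id` attributes within a single page are assumptions.

```python
import re


def check_links(html):
    """Return (dangling, unused) for the internal links of one page.

    dangling: targets of href="#..." that have no matching id
    unused:   ids that no internal link points to
    """
    targets = set(re.findall(r'href="#([^"]+)"', html))
    anchors = set(re.findall(r'\bid="([^"]+)"', html))
    return sorted(targets - anchors), sorted(anchors - targets)
```

A browser silently ignores both kinds of mismatch, so a check of this sort has to run before publication, in the same way that a TeX run reports undefined references.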

3.6 Flow of writing or rewriting technical documents in HTML

    Source text:

    The initial work of writing the document may be similar to writing a TeX source.

    Chaptering of text:
    Making graphical image of special symbols and equations:
    Making graphs and figures:
    Making database for corresponder:
    Run Corresponder :

3.7 Future role of journal publishers