Document Style
of Technical Paper in HTML
Last update: 14:00, July 27, 1997
Conceptual Image
Hiroshi
Sakuta
hsakuta@mit.edu
Room :1-380,Civil & Environmental Eng. Dept.
MIT,77 Massachusetts Ave.
Tel :253-8925
I thank Prof. Connor
and Carlos.
1. Introduction
2. Technical Documents in Conventional Style
3. New Style for Technical Documents in HTML
1. Introduction
- As the Web comes into use for engineering, many technical papers are
waiting to be rewritten and released in HTML. Although HTML documents are
used on many categories of Web sites, engineering documents have some
features that set them apart.
1)Figures
Highly accurate figures are required, because their details may carry a
critical meaning. A Web browser cannot show a picture at a higher
resolution than the CRT displays.
e.g.: a photo of a material's microstructure, or an X-ray picture of a
patient's lung
2)Symbols
- Technical documents contain irregular symbols and irregular usage of
symbols.
e.g.: suffix, superfix, hat, underscore, prime, vector characters, special
characters (Greek character set, Roman numerals), and mathematical symbols
and formulas (integral, root, etc.)
Although the SGML ISO document standard has been proposed to solve such
problems, the best way to present mathematical formulas is still "TeX",
even now.
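As a small illustration of the symbol types listed above, the following
TeX fragment writes each of them once. The particular formula is arbitrary
and chosen only to show Greek characters, an integral, and a root together;
it is an assumption for this sketch and not taken from any specific paper.

```latex
% Suffix, superfix, hat, underscore, prime, and vector character
\[ x_i, \quad x^{n}, \quad \hat{x}, \quad \underline{x}, \quad f'(x), \quad \vec{v} \]

% Greek characters, an integral, and a root in one formula
\[ \sigma = \sqrt{\frac{1}{b-a} \int_{a}^{b} \bigl( f(x) - \mu \bigr)^{2} \, dx} \]
```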
3)Graphs
- Graphs and charts have a detailed shape and inclination, and also a
logical structure. They include special characters and graphical symbols
(rectangles, solid circles, etc.), too. A graph is a logical entity, but
once it is drawn on paper or on the screen, it becomes a mere graphical
image. Of course, Internet programming offers a good solution for this.
4)Strong indexing
- The logical structure of a technical document is usually reflected in
its style and structure, so the reader has to trace and follow it along
many paths. If the document has no index or symbol list, the reader may
understand less than usual.
Fortunately, HTML has some kinds of indexing built into its tag system.
- HTML is a mark-up language, of course, but it has no facility to control
the logical structure of a document. This is why an engineer cannot write
a technical document in HTML as the primary source. HTML browsers tolerate
inconsistent indexes and interpret tags as well as they can, even when the
tags are incorrect. In fact, as far as the author knows, there is no
automatic indexing tool for HTML.
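To make the idea of automatic indexing concrete, here is a minimal sketch
in Python of what such a tool might do: it gives every heading a named
anchor and builds a contents list that links to those anchors. The function
name add_anchors_and_index and the anchor names sec1, sec2, ... are
assumptions made for this illustration; it is not an existing tool, and it
only handles headings written without attributes.

```python
import re

# Matches <h1>..</h1> through <h6>..</h6>; the closing level must equal the opening one.
HEADING = re.compile(r"<h([1-6])>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)

def add_anchors_and_index(html):
    """Return (html with named anchors in headings, HTML list linking to them)."""
    entries = []  # (anchor name, heading title) in document order

    def anchor(match):
        level, title = match.group(1), match.group(2).strip()
        name = f"sec{len(entries) + 1}"  # hypothetical naming scheme
        entries.append((name, title))
        return f'<h{level}><a name="{name}">{title}</a></h{level}>'

    body = HEADING.sub(anchor, html)
    items = [f'<li><a href="#{name}">{title}</a></li>' for name, title in entries]
    return body, "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Because the contents list is derived from the headings themselves, the two
cannot drift apart when a chapter is added or renamed; the author only has
to run the tool again.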
2. Technical Documents in Conventional Style
- First, let us review the components of a technical document again, to
recognize the relationships among them.
2.1 Main text
- The main text is the backbone of the document and leads readers to the
point of the argument. A short document, such as a journal paper, has an
abstract at the top and no chapter contents. In general, the main text
means the text itself together with its footnotes and equations.
2.2 Lists
- In conventional printed technical documents (abbreviated below as Paper
Doc.), four types of lists are provided, mostly at the top:
- 1) Chapter contents:
- The numbers of chapters and sections are fixed according to the hierarchy
of the document. There are customs for ordering them (e.g. 1. Introduction,
2. Objects, 3. Experimental Apparatus, etc.)
- 2) Figure and Table lists:
- They exist almost solely for the publisher. Many readers, including the
author, never refer to them. They are a kind of certificate of rational
writing.
- 3) Notations:
- The notations applied in the document are listed and explained. In
well-edited documents, the notations are unified throughout the document
or throughout the chapter(s) that the list covers.
- 4) References:
- The cited bibliography is listed. Sometimes, when there are too many
entries, each chapter has its own list at its tail. The list includes the
bibliographic details and the notation used for citing. Half of the entries
are usually the authors' own papers.
2.3 Figures and Tables
- Generally, figures and tables relate to a part of the main text and are
embedded in it, placed as close as possible so that they can be seen and
recognized at a glance. When they express a graphical or logical statement,
the related equations or tables can be found nearby. The layout of the main
text and the figures/tables shows how complete and how beautifully printed
the document is. Layout is regarded as important because every document is
expected to be printed and bound as a book.
2.4 After publication
- Every document contains errors, just as every program contains bug(s).
Conscientious authors and publishers try to revise the book or document.
But once the document has spread into the sea of readers, who can reach
every reader? The best one can do is to publish a second, revised edition
or to declare the errors in the next report.
2.5 Documents in TeX
- Many engineers and scientists write their documents in TeX or a system
derived from it. TeX was proposed and implemented by Professor Knuth at
Stanford Univ. as a tool for printing high-quality mathematical equations
and papers, and it has been supported by many scientists and engineers.
TeX has been in use for about two decades because it is the only
structural documentation system. This means not only that TeX arranges the
numbering of chapters, but also that it controls the figure/table lists and
the cross-references. The author need not worry about the correspondence
between the citing words in the main text and the numbering of the figures.
It is said, however, that TeX is not friendly to the average computer user.
In a sense, TeX is too restrictive for a word processor and is almost a
programming language. Average users first have to learn this programming
language and then write the program source. After writing, the computer
requires them to fix the bug(s) in their TeX source. After debugging, the
user is freed from trouble over editing details, though he still has to
create a sufficiently rational document, which of course does not always
mean rational contents ;-).
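The automatic numbering and referring described above can be sketched with
a short LaTeX source. The label names and the formula (the tip deflection
of an end-loaded cantilever) are assumptions chosen for this illustration.
The author writes only symbolic labels, and the numbers are filled in when
the source is processed, so they stay consistent when sections or equations
are added or reordered.

```latex
\documentclass{article}
\begin{document}

\section{Analysis}\label{sec:analysis}
% The numbers printed for \ref are resolved by LaTeX, not typed by the author.
Section~\ref{sec:analysis} uses the tip deflection given in
Eq.~(\ref{eq:deflection}).
\begin{equation}\label{eq:deflection}
  \delta = \frac{P L^{3}}{3 E I}
\end{equation}

\end{document}
```

The source usually has to be processed twice before the references show
their final numbers, which is one instance of the "debugging" effort
mentioned above.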
- Hypertext and mark-up documents have their origins in systems of this
kind. They have a source in which control markers are placed where
necessary, and they need some processing to become documents that humans
can read easily. They can be output as printed material that closely
resembles conventional documents. Alas, we cannot free ourselves from the
custom of book reading. The software that serves as eyeglasses for viewing
such documents is nowadays called a "Previewer" or a "Browser". The output
is usually used for "camera-ready" manuscripts.
3. New Style for Technical Documents in HTML
- Before I describe a style for technical documents, I will introduce a
concept of how papers and books are read and understood.
- Almost all printed matter in the world is not on technical themes;
through such a document we immerse ourselves in an imaginary world, as when
absorbed in a novel. The reader keeps reading calmly and sits still. I
would like to call this "Static-Reading". In a technical document, on the
contrary, the reader must pursue and trace the logic in it while staying
aware of the real world. He has to be doubtful and reread earlier parts of
the document. He cannot stay in an imaginary world. His eyes move busily,
he turns back the pages again and again to compare the figures, and he may
put another journal beside him to check the logic. I want to define this
style as "Dynamic-Reading". In dynamic reading, the reader needs effort to
continue because, in a sense, the work is creative. He has to construct a
logical structure in his brain. He has to row to move forward, while the
reader in static reading sits calmly in a canoe without an oar, carried by
the flow. The author considers that dynamic reading needs a dynamic
document style.
- Literature = immersing in an imaginary world : static reading
- Technical document = tracing the real world : dynamic reading
- We are not very accustomed to either dynamic reading or writing dynamic
documents. Messy documents composed of many windows and link tags without
control will confuse readers, so below I rearrange the components of the
document and its structure.
3.1 Advantage in HTML
- Link system:
- To follow the logical flow of the document, the author can link the main
text freely to figures or to other parts of the text. An important feature
of HTML documents is reverse linkage, which is impossible or very
troublesome to implement in a Paper Doc. If the reader is interested in a
figure, he can follow a link back to the place in the main text where it is
referred to.
- Revision control:
- HTML documents can be published on the Internet directly, without
printing or other procedures. This makes the documents easy to revise.
- Time for publishing:
- Paper Docs need many processes: exchanging drafts between the author and
the publisher, occasional corrections, and making the book. A difference in
the number of days these processes take may affect industrial property
rights.
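The reverse linkage described under "Link system" above can be sketched in
Python as follows. The sketch assumes a hypothetical markup convention, not
one prescribed by this document: each paragraph carries an identifier such
as <p id="p1">, and figures are cited with links such as href="#fig1". It
collects, for every figure, the paragraphs that cite it, so that links back
to the text can be placed beside the figure.

```python
import re
from collections import defaultdict

# Hypothetical convention: paragraphs look like <p id="p1">...</p>
# and figure citations look like <a href="#fig1">Fig. 1</a>.
PARAGRAPH = re.compile(r'<p id="([^"]+)">(.*?)</p>', re.DOTALL)
FIG_LINK = re.compile(r'href="#(fig[^"]+)"')

def reverse_links(html):
    """Map each cited figure anchor to the ids of the paragraphs citing it."""
    cited_from = defaultdict(list)
    for para_id, text in PARAGRAPH.findall(html):
        for fig in FIG_LINK.findall(text):
            if para_id not in cited_from[fig]:  # list each paragraph once
                cited_from[fig].append(para_id)
    return dict(cited_from)

def backlink_html(para_ids):
    """Render the links to place beside a figure, pointing back to the text."""
    links = [f'<a href="#{p}">{p}</a>' for p in para_ids]
    return "Referred from: " + ", ".join(links)
```

For example, if paragraphs p1 and p3 both cite fig1, then reverse_links
returns the entry "fig1": ["p1", "p3"], and backlink_html turns that entry
into two links leading from the figure back to the main text.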
3.2 Components of Document
The components of a technical document are as follows.
- 1) Main text and footnote(comment)
- a) Plain character text (Unicode/alphanumeric)
b) Non-ASCII character text (including equations)
- 2) Logical figure
- a) Graphs
b) Diagrams/Flowcharts
c) Table (a logical figure composed of characters and boundary lines)
3.3 Lists
Lists are Link lists in HTML documents, and they play important roles
in dynamic reading.
- a) Chapter(Contents)
b) Fig/Pic/Tab
c) Notation/Symbol/Equation
d) Term/Index
e) Reference
3.4 Windows required and their link system
- a) Main window
- The main window is divided into three frames: the main identification
frame (such as the main title), the chapter contents list, and the text
itself. Together they correspond to the Paper Doc.
- b) Fig/Tab list window
- c) Fig/Tab windows
- d) Notations/equations windows
- e) Index window
- f) Reference window
3.5 Software
- Requirement
- automatic link tag generator
- debugging of structure
- Applicable current software
- Editor
- Browser
- Image retouch software
- Equation handler
- Developed software
- Corresponder
- Graph template
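The "debugging of structure" requirement listed above might look like the
following minimal Python sketch. It is only an illustration under stated
assumptions, not the software named in this section: it assumes that link
targets are written as <a name="..."> and in-page links as
<a href="#...">, and the function name check_structure is hypothetical. It
reports links whose target does not exist and anchors that nothing links
to.

```python
import re

# Assumed markup: targets are <a name="x">, in-page links are <a href="#x">.
ANCHOR = re.compile(r'<a\s+name="([^"]+)"', re.IGNORECASE)
LINK = re.compile(r'<a\s+href="#([^"]+)"', re.IGNORECASE)

def check_structure(html):
    """Return (dangling, unused).

    dangling: link targets that have no matching anchor
    unused:   anchors that no link points to
    """
    anchors = set(ANCHOR.findall(html))
    targets = set(LINK.findall(html))
    return sorted(targets - anchors), sorted(anchors - targets)
```

A browser would display a page with such errors without complaint, so a
check of this kind has to be run by the author before publishing.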
3.6 Flow of writing or rewriting technical documents in HTML
- Source text:
The initial work of writing the document may be similar to writing TeX source.
Chaptering of text:
Making graphical image of special symbols and equations:
Making graphs and figures:
Making database for corresponder:
Run Corresponder :
3.7 Future role of journal publishers