Design

The effective design of the Physical Markup Language must balance a myriad of competing design issues and constraints. Since we have essentially eliminated most of the information and structure on the tagged object, all the complexity of description has moved to the networked database. Issues such as syntax, data types, complexity, extensibility, security, application domains, units of measure and more must be weighed to effectively achieve the objectives set out for the language.

In the following sections, we will consider a variety of design issues, key assumptions and other considerations in the formation of the PML. This is not an exhaustive list, but a starting point in the language design.

We must remember that numerous languages and standards have been developed in the past, yet few have seen widespread adoption. We wish to avoid the pitfalls of the past and develop a standard that is simple, convenient and effective.

Generality

The objective of the Physical Markup Language is to be a universal standard for describing physical objects, processes and environments. Clearly, given the broad scope of this objective, the language cannot be overly detailed or specific. In the classic choice between depth and breadth, the proposed PML will lean toward a more general standard, rather than industry-specific implementations.

There are a number of reasons for this decision. First, a broad language will address the largest number of industries. Second, software developed for the language will have the greatest potential market. The quality and capability of the code will likely be superior to any specific implementation. This is analogous to Web browsers, such as AOL's Netscape™ or Microsoft's Internet Explorer™, both of which are generally superior to similar applications targeted for specific industries. The more generic software also tends to be more robust and less expensive than focused applications.

Third, physical objects and systems do indeed have common, underlying characteristics. Since most physical objects of interest to industry and commerce are those designed and built by humans, they tend to have shared features, such as shape, symmetry, materials and function, as well as business processes, ownership and transaction. Furthermore, many industries, such as healthcare, manufacturing, defense, logistics, transportation, disposal and many others, describe similar characteristics in different ways. By offering a unifying language, these characteristics can be shared and translated across industry groups, multiplying the amount of available information. Automated, industry specific translators may be written allowing the shared information to be presented in familiar ways.

Finally, a broad descriptive language will encourage a greater degree of industry cooperation and facilitate information sharing for mutual benefit. Often data, such as those exchanged between a retailer and a supplier, are not available simply because of a lack of standards.

Simplicity

Many standards are not adopted because of their inherent complexity and the steep learning curves involved in acquisition and implementation. Although the Standard Generalized Markup Language (SGML) has existed for many years, it has not seen widespread adoption, in part because of its size and complexity [ref].

Its derivative, the Hypertext Markup Language (HTML), has, of course, seen phenomenal growth, in part because of its simplicity and because of the tools and viewers available for the standard [ref]. The Extensible Markup Language (XML), also based on the Standard Generalized Markup Language, has seen increasing growth as a tool for tagging data content [ref]. XML is a simple subset of the larger SGML and is readily accessible to the casual programmer.

Thus complex standards and languages - even though powerful and effective - have steeper learning curves and more limited audiences than smaller, simpler languages. Therefore, even though the initial PML may be limited in scope, we propose a relatively simple language that is easily understood and adopted by a larger population.

Adoption pathway

Rather than a monolithic, immutable standard, we will assume the Physical Markup Language will proceed through a number of iterations. Rather than a deficiency, this process can be advantageous. While a simple standard is being learned and adopted, modifications and extensions can be developed. In this way, familiarity with the language can proceed along with its capability. Indeed, this process may be necessary, since a complex language would not be learned and a simple language would not be sufficient.

Although the HyperText Markup Language (HTML) was a simple language and easily understood, it was, in its initial version, quite limited in scope and in power. Multiple versions and extensions followed once the significance and utility of the language were understood. Extensions, such as Cascading Style Sheets (CSS), Dynamic HTML, Flash Media and so on, were added to the basic capability.

In the same way, we intend the initial PML specification to be limited in depth and power. By design, we will incrementally introduce extensions to increase its scope and functionality.

Comprehensive data types

We may consider the Physical Markup Language to have different 'types' of data - static, temporal, dynamic and algorithmic. These types will not be defined explicitly in the specification, but are useful distinctions when discussing the language.

Static data is information that essentially remains constant through the life of the object, such as material composition, geometry and physical properties.

Temporal data is information that changes discretely and intermittently throughout an object's life, such as configuration or location. For example, the location of an object on a shelf, or whether a part is attached to an assembly, is data of this type. These data must be associated with a time and duration to record the temporal configuration of the environment.

Dynamic data is information that varies continuously. The temperature of a shipment of fruit or an EKG from a heart monitor are examples of dynamic data. Unlike the data in most database systems, these data must be cached and transmitted intermittently to limit the network bandwidth and to provide only the most relevant and necessary information.
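
One way to picture the intermittent transmission of dynamic data is a simple change filter. The following Python sketch is illustrative only: the function name `significant_changes`, the threshold rule and the sample readings are assumptions, not part of any PML specification.

```python
def significant_changes(readings, threshold):
    """Keep the first reading, then only readings that differ from the
    last kept reading by more than `threshold`.

    `readings` is a list of (time, value) pairs in time order.
    """
    kept = []
    for time, value in readings:
        if not kept or abs(value - kept[-1][1]) > threshold:
            kept.append((time, value))
    return kept


# Hypothetical temperature readings (hour, degrees Celsius) for a shipment.
readings = [(0, 4.0), (1, 4.1), (2, 4.2), (3, 5.5), (4, 5.6), (5, 3.9)]
to_transmit = significant_changes(readings, threshold=1.0)
```

With a threshold of one degree, only three of the six cached readings would be transmitted. Other policies, such as fixed reporting intervals, would serve the same purpose.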

Finally, algorithmic data includes simulation models, system processes and software associated with a physical object. Not all physical properties can be described by a simple number. For example, the expiration date on a perishable item may be a complex calculation involving temperature history, humidity and ambient light. Cooking instructions could be another example. Heating profiles depend on personal preference, food type and quantity, atmospheric pressure, ambient temperature and oven type.
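
As a rough illustration of algorithmic data, the Python sketch below computes how much shelf life remains from a temperature history. Everything in it is hypothetical: the function name, the default values and the rule that spoilage doubles for every 10 degrees above a reference temperature are invented for the example, and humidity and ambient light are left out for brevity.

```python
def remaining_shelf_life_fraction(history, base_hours=240.0, ref_temp_c=4.0):
    """Return the fraction of shelf life left, between 0.0 and 1.0.

    `history` is a list of (hours, temp_c) intervals.
    `base_hours` is the assumed shelf life at the reference temperature.
    """
    used = 0.0
    for hours, temp_c in history:
        # Hypothetical rule: the spoilage rate doubles per 10 degrees C
        # above the reference temperature (and halves per 10 below it).
        rate = 2.0 ** ((temp_c - ref_temp_c) / 10.0)
        used += hours * rate / base_hours
    return max(0.0, 1.0 - used)


# One day at the reference temperature, then twelve hours at 14 degrees C.
fraction_left = remaining_shelf_life_fraction([(24, 4.0), (12, 14.0)])
```

The point is only that the "value" of such a property is a calculation over other recorded data rather than a stored number.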

These designations - static, temporal, dynamic and algorithmic - are simply different views of the same data. A static description, such as the shape of a glass, would become temporal if the glass hit the floor. The variation of viewpoint just depends on the time scale and complexity of description. Therefore, we will allow time variation on all data descriptions.
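
Allowing time variation on every description can be sketched as a property that keeps a time-ordered history, where a "static" property is simply one with a single entry. The class `TimedProperty` and its methods are hypothetical names chosen for this Python illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class TimedProperty:
    """A property whose value may change over time (hypothetical structure)."""
    name: str
    # Parallel lists kept sorted by time: values[i] took effect at times[i].
    times: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def record(self, time, value):
        """Insert an observation, keeping the history sorted by time."""
        i = bisect_right(self.times, time)
        self.times.insert(i, time)
        self.values.insert(i, value)

    def value_at(self, time):
        """Return the most recent value at or before `time`, or None."""
        i = bisect_right(self.times, time)
        return self.values[i - 1] if i else None


shape = TimedProperty("shape")
shape.record(0, "intact glass")
shape.record(120, "broken glass")  # the glass hits the floor at t = 120
```

Here `shape.value_at(60)` gives the intact description and `shape.value_at(120)` the broken one, so the same structure serves both the static and the temporal view.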

Abstract nomenclature

Clearly if we hope for a broad application of this language, we cannot expect familiar names for all physical properties. For example, "harvest time" for produce or "assembly time" for an automobile, may be replaced by a more generic "configuration" plus "timestamp." Generally, we will use abstract names to describe a wider range of physical systems and processes, rather than industry specific descriptions.

Why use abstract notation? The answer lies in the primary objective of the language: to provide a convenient, high-level description for software and application development. More generic terms allow more powerful, general-purpose software to analyze similar configurations independent of industry-specific nomenclature.
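
The replacement of industry terms by a generic "configuration" plus "timestamp" might look like the following Python sketch. The mapping table, the state labels "harvested" and "assembled" and the function name are assumptions made for illustration.

```python
# Hypothetical table mapping industry terms to a generic (field, state) pair.
GENERIC_TERMS = {
    "harvest time": ("configuration", "harvested"),
    "assembly time": ("configuration", "assembled"),
}


def to_generic(industry_term, timestamp):
    """Express an industry-specific time as a generic state plus timestamp."""
    field_name, state = GENERIC_TERMS[industry_term]
    return {field_name: state, "timestamp": timestamp}


produce = to_generic("harvest time", "2001-06-01T08:00:00Z")
automobile = to_generic("assembly time", "2001-06-02T14:30:00Z")
```

Both records now share the same two fields, so a single general-purpose tool can process them without knowing either industry's vocabulary.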

Robust operation

Unlike most Web pages, PML files will be much more dynamic and have a greater degree of connection to other network files and data streams. Object position, physical state and material descriptions will likely be in multiple data files scattered over the network. General physical properties, such as material and chemical information, will likely be stored in common repositories. Material Safety Data Sheets (MSDS) are good examples of this type of data.

PML, together with associated tools and applications, will have to operate robustly with incomplete and intermittent information. Its operation may be similar to the way streaming image systems operate today.
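
A minimal Python sketch of tolerating incomplete information follows. It assumes, purely for illustration, that each network source returns a dictionary of properties, that an unreachable source yields `None`, and that an unknown value is also marked `None`. The function name `merge_fragments` is hypothetical.

```python
def merge_fragments(fragments):
    """Combine partial descriptions of one object from several sources.

    Unreachable sources (None) and unknown values (None) are skipped.
    When two sources supply the same property, the later one wins.
    """
    merged = {}
    for fragment in fragments:
        if fragment is None:  # source could not be reached
            continue
        for key, value in fragment.items():
            if value is not None:
                merged[key] = value
    return merged


description = merge_fragments([
    {"position": "dock 4", "temperature": None},  # sensor not yet reporting
    None,                                         # repository unavailable
    {"material": "glass"},
])
```

The application still obtains a usable, if partial, description and can fill in the missing properties when the other sources respond.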

Facilitate data archives

Although Web pages change frequently, PML data files will change even more rapidly. History files and efficient archiving will therefore be critically important. The temperature history of a perishable item, the administration of a drug or the stress on a structure must be carefully recorded and maintained.

The PML data format will have to provide simple and convenient methods for associating time with data and for denoting periodic and continuous data.
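
One compact way to denote periodic data is to store a start time, a fixed interval and the list of values, and to expand these into explicit timestamps only when needed. The Python sketch below assumes this representation; the function name and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def expand_periodic(start, interval_seconds, values):
    """Pair each value of a periodic series with its timestamp."""
    step = timedelta(seconds=interval_seconds)
    return [(start + i * step, value) for i, value in enumerate(values)]


# Three temperature samples taken every ten minutes.
series = expand_periodic(datetime(2001, 1, 1, 0, 0), 600, [4.0, 4.2, 4.1])
```

Storing only the start and interval keeps an archived history small, while the expanded form associates a time with every value.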

Standard units of measure

For much of recorded history, physical states of matter have been compared to known references. From cubits to nanometers, from stones to dekagrams, multiples of common standards provide the means of communicating physical properties. A difficulty arises when different countries, groups, organizations and people use different and competing standards.

Our desire for the Physical Markup Language to be a global standard must be weighed against the utility and convenience for the user. In particular we must decide on a method for recording data and units, and converting it from one system to another as necessary.

Fundamental physical properties of matter - length, mass, time, force, velocity, density, magnetic field, luminosity and temperature - must be described precisely to be communicated effectively. Many physical properties are not independent. Speed, for example, is the ratio of length to time. Certain quantities must be selected as fundamental, while others are derived.

Fortunately, these issues have been resolved by standards bodies, such as the International Bureau of Weights and Measures, which maintains the International System of Units (Le Système International d'Unités, abbreviated SI), in conjunction with others such as the National Institute of Standards and Technology in the United States. The seven quantities selected as the basis of the SI are shown in TABLE 1. Furthermore, all other units can be described by multiples or ratios of these units. Pressure, for example, is given by m⁻¹ · kg · s⁻². Finally, names for common combinations, such as the pascal for the pressure unit given above, are provided under the SI system.
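
The derivation of the pressure unit from base units can be checked mechanically by treating each unit as a set of exponents. In the Python sketch below, the dictionary representation and the function name `combine` are assumptions made for illustration.

```python
def combine(a, b, sign=1):
    """Multiply (sign=1) or divide (sign=-1) two units given as
    dictionaries mapping a base unit symbol to its exponent."""
    result = dict(a)
    for unit, power in b.items():
        result[unit] = result.get(unit, 0) + sign * power
    # Drop base units whose exponents cancelled to zero.
    return {unit: power for unit, power in result.items() if power != 0}


force = {"kg": 1, "m": 1, "s": -2}  # mass times acceleration
area = {"m": 2}
pressure = combine(force, area, sign=-1)  # force divided by area
```

The result has exponents -1 for the metre, 1 for the kilogram and -2 for the second, which is the combination the SI names the pascal.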

Although the above discussion is fine for scientific precision of weights and measures, we have the practical problem of describing physical properties in the multiple common systems people use today. Considering the options, we may allow PML to use any standard - International System, British or other. We may also allow any designation of unit, such as "kilograms," "kgs" or "Kg." This makes the creation of PML files easy, since any standard of measure written in any language and with any abbreviation may be used. The software tools that process these data files, however, must be complex, since they must recognize and translate any arbitrary designations.

On the other hand, if we rigidly dictate a particular standard in a single language, we have difficulty in readability and usage. Each PML application must translate units into its common, local standards. On the whole, translating from one known standard to another is easier than converting from an unknown, arbitrary language.

For this reason, it seems likely PML will adopt a single system for weights and measures, with particular designations, and rely on the software tools to provide common translations. Furthermore, common translation software can be accessed and shared from the network. This creates smaller, more easily understood data files, which are precise and accessible. Further, we will rely on the years of effort by the many standards bodies to prescribe these systems.
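
If a PML file always records mass in a single designated unit, a display tool needs only a small table to present it in local units. The Python sketch below assumes the kilogram as the stored unit; the table, the choice of display units and the function name are hypothetical.

```python
# How many of each display unit make up one kilogram (illustrative subset).
PER_KILOGRAM = {
    "kg": 1.0,
    "g": 1000.0,
    "lb": 1.0 / 0.45359237,  # the pound is defined as exactly 0.45359237 kg
}


def from_kilograms(mass_kg, display_unit):
    """Convert a stored mass in kilograms to a local display unit."""
    try:
        return mass_kg * PER_KILOGRAM[display_unit]
    except KeyError:
        raise ValueError(f"unknown display unit: {display_unit!r}") from None


mass_in_pounds = from_kilograms(1.0, "lb")
```

Because the stored unit is known in advance, the tool never has to guess whether "Kg" or "kgs" was meant. It only has to decide how to present the value.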

Fundamental and derived data

Many schemes used to store information include redundant and derived data. As much as possible, PML should not store any data that can be calculated or inferred from other data. Unit conversion, for example, may be computed by a client application, a remote server, or perhaps a dedicated conversion/computation system.

Standard syntax

Rather than invent a new syntax for the Physical Markup Language, we propose to use the Extensible Markup Language (XML). Although different syntactic representations could be used, XML is well defined and in general use as a simple method for embedding metadata in flexible database structures.

Furthermore, extensions such as the XML Query specification provide a uniform and simple method for accessing data through a notation similar to the Structured Query Language (SQL) [ref]. In addition, general utilities, tools and validation software exist to parse, modify and access XML files.

The Physical Markup Language (PML) will therefore be - at least initially - an XML schema, described in any of the common schema languages, such as the Document Type Definition (DTD), Resource Description Framework (RDF) and others [ref].
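
To make the choice of XML concrete, the Python sketch below parses a small fragment with the standard library's `xml.etree.ElementTree` module. The element and attribute names (`pml`, `object`, `property`, `name`, `unit`, `time`) are invented for this illustration, since no PML schema is defined here.

```python
import xml.etree.ElementTree as ET

# A hypothetical PML fragment; the vocabulary is assumed, not specified.
PML_SAMPLE = """
<pml>
  <object id="example-001">
    <property name="mass" unit="kg">0.35</property>
    <property name="temperature" unit="K" time="2001-02-01T10:00:00Z">277.15</property>
  </object>
</pml>
"""


def read_properties(xml_text):
    """Return {name: (value, unit, time)} for every <property> element."""
    root = ET.fromstring(xml_text.strip())
    properties = {}
    for prop in root.iter("property"):
        properties[prop.get("name")] = (
            float(prop.text),
            prop.get("unit"),
            prop.get("time"),  # None when no time is attached
        )
    return properties


properties = read_properties(PML_SAMPLE)
```

Because the fragment is ordinary XML, no PML-specific parser is needed to read it. Only the meaning of the element names would come from the PML schema.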

Global language

As with current trends in standards development and network languages, we will attempt to craft PML as a global standard and avoid national terms and descriptions. We will rely on existing standards bodies, such as the Uniform Code Council (UCC), the European Article Number (EAN) Association, the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO), as well as commercial consortia and industry groups, to aid in the definition of the language.

Facilitate application development

One of the primary purposes of the Physical Markup Language is to facilitate the development of software applications. Therefore, we must design PML with consideration for the needs and requirements of the application programmer.

Almost all the issues discussed so far relate to this objective. Widely adopted, simple languages encourage application development and ease the programming task. Extensions and enhancements to an established language will be paralleled by modifications to existing code. Simple, unambiguous nomenclature reduces the complexity of the PML parser, and uniform units for weights and measures ease the burden on software translators. Finally, common, globally accepted syntax, such as XML, together with software libraries, such as the Java DOM and SAX packages, provide useful tools for the software developer [ref].
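
The event-driven SAX style is available in many languages. As an illustration, the sketch below uses Python's standard `xml.sax` module rather than the Java packages named above. The sample document and the handler class `PropertyNames` are hypothetical, and the element vocabulary is assumed.

```python
import xml.sax


class PropertyNames(xml.sax.ContentHandler):
    """Collect the name attribute of each <property> element as it streams past."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag.
        if name == "property":
            self.names.append(attrs.get("name"))


# A hypothetical PML fragment, supplied as bytes.
SAMPLE = b"""<pml>
  <object id="example-001">
    <property name="mass" unit="kg">0.35</property>
    <property name="temperature" unit="K">277.15</property>
  </object>
</pml>"""

handler = PropertyNames()
xml.sax.parseString(SAMPLE, handler)
```

A streaming parser of this kind never holds the whole document in memory, which may suit large or continuously arriving data, whereas a DOM parser builds a full tree that is easier to navigate.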

The design of the Physical Markup Language will accommodate the application developer and provide the systems and tools to facilitate their efforts. As future versions of the PML become available, we will streamline the semantics to speed software upgrades and new applications.