1994 Working Paper Abstracts

TDQM-94-01: May 1994 An Empirical Investigation of Data Quality Dimensions: A Data Consumer's Perspective
by Richard Wang, Diane Strong and Lisa Guarascio

** This paper is superseded by TDQM-94-10.

TDQM-94-02: May 1994 Modeling Quality Requirements in Conceptual Database Design
by Veda Storey and Richard Wang

Quality requirements have been largely overlooked in previous database research. This paper presents a quality-entity-relationship (QER) approach for incorporating quality requirements (database quality data and product quality data) of an application, into conceptual database design. The underlying premise of this research is that quality requirements should be distinct from application requirements. The QER approach extends traditional conceptual design methodologies and leads to a new direction for further investigation. Practitioners can benefit from this research because they can incorporate the research results directly into current database design practices. Moreover, the QER conceptual design output can be directly translated into a relational schema that can be implemented in existing database management systems.

TDQM-94-03: June 1994 Anchoring Data Quality Dimensions in Ontological Foundations
by Yair Wand and Richard Wang

** This paper is superseded by TDQM-96-07.

TDQM-94-04: July 1994 Using Dominant Quality Curves to Produce Optimal Target Information
by Ward Page and Peter Kaomea

Combat operations during Desert Storm made clear that tactical decisions in strike and amphibious warfare rely upon timely and accurate information on the location and condition of enemy targets. However, the U.S. tactical databases stored order-of-battle information and land-based targeting information that were accurate only to within about one mile and as old as 24 hours. To deliver higher quality information, it would be useful to dynamically reconfigure a system so that its information output can be customized to meet the user's requirements in an optimal manner.

In this paper, we present such a system, called the Quality Tactical Image Exploitation (QTIX) System, which is being developed as part of the U.S. Navy's primary afloat command and control system. We show how target information can be optimally determined given quality constraints such as image currency and resolution, quality tradeoffs such as that between accuracy and timeliness, and other requirements. In particular, we show how the concept of a dominant quality curve can be exploited to allow the user to select, from thousands of images and scores of image recognition algorithms, the image/algorithm pair that will produce the best target information. Although this research is presented in the context of the QTIX system, it can also be applied to many other situations.
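
The abstract does not define the dominant quality curve. One plausible reading is the set of image/algorithm pairs that no other pair beats on every quality dimension. The sketch below illustrates that reading only; it is not the QTIX method. The Candidate type, the two dimensions (accuracy and timeliness, both scored so that higher is better), and the threshold-based selection rule are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """A hypothetical image/algorithm pair with two quality scores (higher is better)."""
    name: str
    accuracy: float
    timeliness: float


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both dimensions and strictly better on one."""
    at_least_as_good = a.accuracy >= b.accuracy and a.timeliness >= b.timeliness
    strictly_better = a.accuracy > b.accuracy or a.timeliness > b.timeliness
    return at_least_as_good and strictly_better


def dominant_curve(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates, sorted by accuracy."""
    frontier = [c for c in candidates if not any(dominates(o, c) for o in candidates)]
    return sorted(frontier, key=lambda c: c.accuracy)


def best_pair(
    candidates: List[Candidate], min_accuracy: float, min_timeliness: float
) -> Optional[Candidate]:
    """Assumed selection rule: among non-dominated candidates that meet both
    thresholds, pick the most accurate one; return None if none qualifies."""
    feasible = [
        c
        for c in dominant_curve(candidates)
        if c.accuracy >= min_accuracy and c.timeliness >= min_timeliness
    ]
    return max(feasible, key=lambda c: c.accuracy, default=None)
```

Only the non-dominated pairs need to be shown to the user, which is what makes selection from thousands of candidates tractable under this reading.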

TDQM-94-05: August 1994 Systems Approaches to Improving Data Quality
by Levent Orman, Veda Storey, and Richard Wang

The use of information systems in organizations is restricted by the quality of the data that appears in these systems. Correctness, completeness, precision, timeliness, and usability are defined as important dimensions in the assessment of the level of data quality that exists. Three approaches to incorporating and improving data quality are identified:

building semantically rich data models
reinforcing databases with a large number of database constraints
restricting the use of data to predefined processes.

Each of these is discussed and illustrated. Building upon this work, future research will investigate and demonstrate how to take advantage of these approaches for improving the quality of organizational databases.
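
As an illustration of the second approach (reinforcing a database with constraints), the following minimal sketch uses Python's standard sqlite3 module. The customer table, its columns, and the specific checks are hypothetical and are not taken from the paper. They only show how declarative constraints can reject incomplete or implausible values at insertion time.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical, constraint-guarded table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE customer (
            id    INTEGER PRIMARY KEY,
            -- completeness: email is required; rough format check only
            email TEXT NOT NULL CHECK (email LIKE '%_@_%'),
            -- correctness: age must fall in a plausible range
            age   INTEGER CHECK (age BETWEEN 0 AND 130)
        )
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, email, age) -> bool:
    """Attempt an insert; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer (email, age) VALUES (?, ?)", (email, age)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Constraints of this kind address correctness and completeness only. Dimensions such as timeliness or usability would need other mechanisms.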

TDQM-94-06: August 1994 Modeling Information Manufacturing Systems to Determine Information Product Quality
by Donald Ballou, Richard Wang, Harold Pazer and Giri Kumar Tayi

Many of the concepts and procedures of product quality control can be applied to the problem of producing better quality information outputs. From this perspective, information outputs can be viewed as information products, and many information systems can be modeled as information manufacturing systems. The use of information products is becoming increasingly prevalent both within and across organizational boundaries.

This paper presents a set of ideas, concepts, models, and procedures appropriate to information manufacturing systems that can be used to determine the quality of information products delivered, or transferred, to information customers. The systems analyzed involve predefined processes applied to a predefined set of data units. These systems produce information products on a regular or as-requested basis. To measure the timeliness, quality, and cost of information products, the model systematically tracks relevant parameters. This is facilitated through an Information Manufacturing Analysis Matrix which relates data units and various system components. These measures can then be used to analyze potential improvements to the information manufacturing system under consideration. A synthetic example is given to illustrate the various features of the information manufacturing system and demonstrate how it can be used to analyze and improve the system. Following that is an actual application, which, although not as involved as the synthetic example, does demonstrate the applicability of the model and its associated concepts and procedures.

TDQM-94-07: September 1994 Beyond Accuracy: How Organizations are Redefining Data Quality
by Diane Strong, Yang Lee, and Richard Wang

High quality data are accurate, accessible, and useful. Conventional approaches to data quality focus only on the first attribute. In a field study of forty-two data quality projects in three organizations, we discovered that data accuracy is a necessary but not sufficient condition for high data quality. Data accessibility and data relevance to the contexts of data consumers' tasks are also necessary attributes of high data quality. Furthermore, conventional approaches (e.g., edit checks and integrity constraints) for increasing data accuracy do not increase these other attributes of data quality. We document how organizations are redefining data quality, and we use this redefinition to substantially revise conventional approaches to data quality.

Published in Communications of the ACM.

** This paper is superseded by TDQM-96-02.

TDQM-94-08: September 1994 Information Systems and Organizational Success: A Quantitative Modeling Approach
by Rajat Chakraborty

In today's world, organizations are growing ever more dependent on their information systems for the data they need to survive. Over the years, many methodologies have been developed to implement technology to support organizational processes. However, these methodologies fail to address the complete picture. In our research, we have developed a modeling approach that allows us to encompass the entire organization in functional layers. They are: Critical Success Factors, Business Processes, Data Flow through the organization, and Information Infrastructure. These layers are then connected such that they are intimately related to each other. In this manner, we are able to simulate the effects of changes in one layer throughout the entire organization. The technique allows us to optimize organizational success through manipulation of the internal and external variables of the individual layer. This unique feature of our approach is a powerful new tool in the analysis and design of highly efficient information systems.

Case Study: The modeling approach was used to simulate the US Navy Over The Horizon Targeting (OTHT) system. The OTHT system is a massive worldwide information system that supports naval warfare around the globe. This system is designed to provide Naval Battle Groups with a reliable tactical picture of areas that extend far beyond the range of their on-board sensors. The information greatly enhances the Group's ability to carry out its mission. For obvious reasons, this organization and its systems were a prime candidate for analysis using our approach.

TDQM-94-09: September 1994 Valuation of Data Quality: A Decision Analysis Approach
by Peter Kaomea

This research develops and demonstrates a model to compute the value of data qualities for decision scenarios. The motivation for developing this valuation model is twofold. By comparing the relative values of various data qualities, an analyst can determine which qualities, if any, should be improved in a given information system. Furthermore, this analysis can be useful in determining the price that should be paid to make such improvements.

The model developed here can be used to compute the value of a change in data quality of a unit of data content for a given context. It is important to consider all these aspects of data in a valuation model. Data of a given content only has meaning against the backdrop of the context in which it is used. The higher the quality of the data, the more valuable it is likely to be. The model is demonstrated by valuing the quality of data in the 1988 shooting down of an Iranian airliner by the USS Vincennes.
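
The abstract does not give the valuation model itself. As a rough illustration of the general decision-analysis idea, the sketch below values a change in one data quality (the accuracy of a binary report) as the difference in expected payoff when the decision maker acts optimally on the report. The two-state, two-action setup, the payoff table, and the function names are assumptions for illustration and not the author's model.

```python
from typing import Sequence


def expected_value(
    prior: float, accuracy: float, payoff: Sequence[Sequence[float]]
) -> float:
    """Expected payoff when acting optimally on a report of given accuracy.

    prior:    P(state == 1)
    accuracy: P(report == state), assumed the same for both states
    payoff:   payoff[action][state] for actions and states in {0, 1}
    """
    total = 0.0
    for report in (0, 1):
        # Joint probability of (report, state) for state 0 and state 1.
        joint = [
            (1 - prior) * (accuracy if report == 0 else 1 - accuracy),
            prior * (accuracy if report == 1 else 1 - accuracy),
        ]
        # For this report, choose the action with the highest expected payoff.
        total += max(
            sum(joint[state] * payoff[action][state] for state in (0, 1))
            for action in (0, 1)
        )
    return total


def value_of_quality_change(
    prior: float, old_accuracy: float, new_accuracy: float,
    payoff: Sequence[Sequence[float]],
) -> float:
    """Value of raising report accuracy from old_accuracy to new_accuracy."""
    return expected_value(prior, new_accuracy, payoff) - expected_value(
        prior, old_accuracy, payoff
    )
```

Under these assumptions, the result is an upper bound on the price worth paying for the improvement in that context. A different payoff table (a different context) yields a different value for the same accuracy gain.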

TDQM-94-10: October 1994 Beyond Accuracy: What Data Quality Means to Data Consumers
by Richard Wang, Diane Strong and Lisa Guarascio

Poor data quality (DQ) can have substantial social and economic impacts. Although firms are improving data quality with practical approaches and tools, their improvement efforts tend to focus narrowly on accuracy. We believe that data consumers have a much broader data quality conceptualization than IS professionals realize. The purpose of this paper is to develop a framework that captures the aspects of data quality that are important to data consumers.

A two-stage survey and a two-phase sorting study were conducted to develop a hierarchical framework for organizing data quality dimensions. This framework captures dimensions of data quality that are important to data consumers. Intrinsic DQ denotes that data have quality in their own right. Contextual DQ highlights the requirement that data quality must be considered within the context of the task at hand. Representation DQ and accessibility DQ emphasize the importance of the role of systems. These findings are consistent with our understanding that high quality data should be intrinsically good, contextually appropriate for the task, clearly represented, and accessible to the data consumer.

Our framework has been used effectively in industry and government. Using this framework, IS managers were able to better understand and meet their data consumers' data quality needs. The salient feature of this research study is that quality attributes of data are collected from data consumers instead of defined theoretically or based on researchers' experience. Although exploratory, this research provides a basis for future confirmatory studies, and for studies that measure data quality along the dimensions of this framework.

Also in Journal of Management Information Systems / Spring 1996, Vol. 12, No. 4, pp. 5-34.

** This paper supersedes TDQM-94-01.

TDQM-94-11: December 1994 An Ontological and Semantical Approach to Source Receiver Interoperability
by Jacob Lee and Michael Siegel

In this paper, we propose an approach to address the issue of semantic interoperability between a data source and a data receiver in the framework of the context interchange architecture. A key component in this architecture is the context mediator, an intelligent agent which performs data conversions between the source and the receiver. In this paper, we are concerned with what a context mediator 'knows' and how it 'behaves' from an external perspective. We introduce the notion of a conversion axiom, which is a formal and declarative means of specifying knowledge required by the mediator to perform desired conversions. We also state, in formal terms, the behavior of the mediator which uses conversion knowledge to translate data from the source context to the receiver context. Formal characterizations of what a context mediator should know and its external behavior provide a specification for the subsequent design of the knowledge representation and reasoning processes internal to the mediator. This approach draws upon ideas and concepts discussed in Mario Bunge's Ontology and Semantics.
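
To make the mediator's external behavior concrete, the sketch below treats a context as a set of representation assumptions and a conversion axiom as a declarative entry stating how to convert a value when source and receiver differ on one assumption. This is a loose illustration, not the paper's formalism: the attribute names, the axiom table, and the mediate function are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# A context records how a value is represented, e.g. {"scale": "thousands"}.
Context = Dict[str, str]
# Key: (attribute, source setting, receiver setting) -> conversion function.
AxiomKey = Tuple[str, str, str]

# Hypothetical conversion axioms, stated declaratively as table entries.
CONVERSION_AXIOMS: Dict[AxiomKey, Callable[[float], float]] = {
    ("scale", "thousands", "units"): lambda v: v * 1000,
    ("scale", "units", "thousands"): lambda v: v / 1000,
    ("length_unit", "miles", "km"): lambda v: v * 1.609344,
    ("length_unit", "km", "miles"): lambda v: v / 1.609344,
}


def mediate(value: float, source: Context, receiver: Context) -> float:
    """Translate value from the source context to the receiver context.

    For every attribute on which the two contexts differ, look up and apply
    the matching conversion axiom; raise LookupError if none is known.
    """
    for attribute, source_setting in source.items():
        receiver_setting = receiver.get(attribute, source_setting)
        if receiver_setting == source_setting:
            continue  # both sides agree, so no conversion is needed
        axiom = CONVERSION_AXIOMS.get((attribute, source_setting, receiver_setting))
        if axiom is None:
            raise LookupError(
                f"no conversion axiom for {attribute}: "
                f"{source_setting} -> {receiver_setting}"
            )
        value = axiom(value)
    return value
```

Keeping the axioms in a table separate from the mediate function mirrors the abstract's distinction between what the mediator knows and how it behaves.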

Accepted for publication in the WITS-'94 Conference Proceedings, Vancouver, Canada, December 1994