MIT Reports to the President 1994-95

Artificial Intelligence Laboratory

The central goal of the Artificial Intelligence Laboratory is to develop a computational theory of intelligence extending from the manifestation of behavior to operations at the neural level. Current work focuses especially on understanding the role of vision, language, and motor computation in general intelligence.

The Laboratory also seeks to develop high-impact practical applications in areas such as information transportation, enhanced reality, the human-computer interface, modeling and simulation, image understanding, decision analysis and impact prediction, model-based engineering design, medicine, and supporting computing technology.

Professor Patrick H. Winston works on the problem of learning from precedents. Professor Robert C. Berwick studies fundamental issues in natural language, including syntactic and semantic acquisition. Professor Lynn Andrea Stein works on integrated architectures for intelligence. Professor W. Eric L. Grimson, Professor Berthold K. P. Horn, and Professor Tomaso A. Poggio do research in computer vision. Professor Rodney A. Brooks, Professor Tomas Lozano-Perez, Professor Gill Pratt, Professor Marc H. Raibert, and Dr. J. Kenneth Salisbury work on various aspects of robotics. Professor Randall Davis and Dr. Howard E. Shrobe work on expert systems that use both functional and physical models. Professor Carl E. Hewitt studies distributed problem-solving and parallel computation. Dr. Thomas Knight and Professor William J. Dally work on new computing architectures. Professor Gerald J. Sussman and Professor Harold Abelson lead work aimed, in part, at creating sophisticated problem-solving partners for scientists and engineers studying complex dynamic systems.

The Laboratory's 178 members include 17 faculty members, 26 academic staff, 29 research and support staff, and 106 graduate students active in research activities funded by the Advanced Research Projects Agency, Air Force Office of Scientific Research, Alphatech, Inc., Apple Computer, Army Research Office, Bear Stearns Company, Digital Equipment Corporation, Fujitsu, Hitachi, Hughes Research Foundation, International Business Machines, Jet Propulsion Laboratory, Korean Atomic Energy Research Institute, Loral Systems Company, M & M Mars, Inc., Matsushita Electric, Mazda Motor Corporation, MCC Corporation, Mitsubishi, NASA, National Science Foundation, Office of Naval Research, Panasonic Technologies, Sandia National Laboratory, Sharp Corporation, Siemens, Sperry Rand, and Sumitomo Metal Industries.

VISION

Enhanced Reality and Object Recognition
Professor Grimson's group has developed a system that registers television images with information produced by Magnetic Resonance Imaging machines. This reality-enhancing system, developed collaboratively with researchers at the Brigham and Women's Hospital and other Boston-area institutions, makes it possible to superimpose previously recorded Magnetic Resonance Imaging information on a current view of the head of a person with a brain tumor.

Using reality-enhancement, surgeons view images in which the skin, skull, and outer brain layers seem to fade away, revealing a tumor's size, location, and proximity to vital brain areas. Surgeons have already used the system in about 20 experimental cases for presurgical planning, determining how best to accomplish their objectives while avoiding collateral damage. Routine daily use is expected soon. Similar methods are under development for other surgical applications, such as sinus surgery, endoscopic surgery, and surgeries performed under surgical microscopes.

Professor Grimson's group also works on methods for recognizing objects in cluttered, noisy, and unstructured environments. Recent efforts have focused on formal methods for evaluating alternative recognition methods, on the role of visual attention in recognition, and on the development of efficient methods for indexing into large libraries of objects. These efforts have been integrated into a system that uses a movable eye-head to find objects hidden in a room by focusing attention on interesting points and then using grouping and recognition methods to identify those objects. New thrusts include applying existing methods to automatic target recognition and visual database search.

Motion Vision and Low-Level Integration
Professor Horn and his students work on problems in motion vision. Because recovery of information about the world from a single cue such as motion parallax, binocular stereo disparity, or shading in images tends to be unreliable, Professor Horn works on the integration of information from multiple cues at a low level, building systems that interlace iterations of different schemes for recovering shape. Preliminary results in integrating motion vision and shape from shading, and in integrating binocular stereo and shape from shading, both show great promise.

On another front, a new special-purpose early-vision analog VLSI chip will be completed soon. This chip determines an image's focus of expansion: the point in the image toward which the camera appears to be moving. It does this without the need to detect image features. The result can be used to compute time to impact and possibly to recover shape information. The chip is expected to operate at 1000 frames per second. Although it has just a 32 x 32 array of sensors and processing elements, it is expected to recover the focus of expansion with sub-pixel resolution. Work is also starting on the development of a chip that can deal with arbitrary combinations of translational and rotational motions, provided that the scene being viewed is approximately planar. This chip will be an order of magnitude more complex than the previous one and will require considerable innovation before the circuitry can be fitted into the available space.

As part of an effort in the "Intelligent Highway'' arena, the focus-of-expansion chip may be adapted to become a component in a time-to-impact warning system. The two chips differ in emphasis: the focus-of-expansion chip factors out distances and velocities to recover the direction of motion, whereas a time-to-impact chip treats the direction of motion as less interesting and the distances and velocities as significant. The time-to-impact chip will not be able to determine distances to objects or velocities in absolute terms, but it will obtain the ratio of these quantities, which yields the time to impact. A warning system based on such a chip would be particularly useful for alerting drivers to rapidly approaching objects in their so-called blind spot.
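To make the ratio concrete, here is a minimal sketch of the quantity involved, stated under the simplifying assumption of an object approached head-on (the symbols Z, V, and T are introduced purely for illustration and are not part of the chip design):

\[ T \;=\; \frac{Z}{V}, \qquad V \;=\; -\frac{dZ}{dt}, \]

where Z is the distance to the object and V the closing velocity. Neither Z nor V is observable from the image alone, but their ratio T, the time to impact, can be recovered from the rate at which the image of the object expands.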

Imaging in Highly Scattering Media
Professor Horn is also exploring diaphanography, the science of recovering the spatial distribution of absorbing material in a highly scattering medium from optical measurements on the surface of a volume. This is a third domain of image understanding, standing alongside the usual optical domain of opaque objects in a transparent medium (where pinhole optics rules) and tomography (where the observed quantities are line integrals of absorbing density).

In highly scattering media, the directionality of injected photon flux is quickly lost, and the situation is best modeled as a diffusional process. The governing equation is Poisson's equation, and convenient physical models include heat flow in a solid and current in a conductive medium that also has leakage paths to ground. Forward analysis, computing how a photon flux will distribute itself given a distribution of absorbing material, is not hard to carry out numerically, although it is computationally expensive. For simple situations, closed-form solutions are also available. The inverse problem, recovering the distribution of absorbing material from a surface image, presents a challenge, however. Under certain combinations of scattering constant, average background absorption, and spatial scale, the problem is not well posed mathematically, and Professor Horn is trying to discover under what circumstances the problem is better posed, and how spatial resolution in the reconstruction is likely to vary with depth.
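A minimal sketch of the kind of model implied by the heat-flow analogy, written here only for illustration (the particular symbols are assumptions, not taken from the report), is a steady-state diffusion equation in which an absorption term plays the role of the leakage paths to ground:

\[ \nabla \cdot \bigl( D(\mathbf{r}) \, \nabla \Phi(\mathbf{r}) \bigr) \;-\; \mu_a(\mathbf{r}) \, \Phi(\mathbf{r}) \;=\; -S(\mathbf{r}), \]

where Phi is the photon flux density, D a diffusion coefficient set by the scattering, mu_a the absorption to be recovered, and S the injected source. The forward problem evaluates Phi given mu_a; the inverse problem described above attempts to recover mu_a from measurements of Phi on the surface.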

Potential medical applications include mammography and the imaging of testicular cancers, thyroid tumors, and brain tumors in infants. Also, in some drug manufacturing and vaccine production, the techniques may enable the determination of whether an egg contains a live embryo. Because imaging using heat-flow analysis is a related problem, other potential applications include flaw detection in aircraft parts and geological mineral exploration.

Vision, Machine Learning, and Networks
Professor Poggio's research effort at the Artificial Intelligence Laboratory, and at the associated Center for Biological and Computational Learning, has many foci, including theory and mathematical issues, algorithms for learning, the use of prior information to augment the example set, the use of virtual examples, applications in various fields, and neuroscience.

During the past year, Dr. Federico Girosi and Mr. Nicholas Chan studied how to use a priori knowledge in the problem of learning from examples. They first considered the case in which it is known that the function to be approximated is invariant with respect to some transformation group, for example the rotation group, and analyzed this problem in the framework of regularization theory. For certain cases, they derived an analytical solution and showed that it is the same as the solution obtained if the a priori knowledge is used to create new, "virtual'' examples, as in the technique of "hints'' proposed by Abu-Mostafa (1993). They also studied numerically the performance of certain approximation techniques and devised a technique to quantify the amount of information embedded in the a priori knowledge.
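The virtual-example construction itself is simple to state. The following sketch, written in Python purely for illustration (the function name and the restriction to two-dimensional rotations are assumptions, not the group's actual code), augments a training set with rotated copies of each input when the target function is known to be rotation invariant:

    import numpy as np

    def virtual_examples(points, values, angles):
        """Augment a training set with rotated copies of each input.

        If the target function is known to be invariant under rotations,
        each example (x, y) generates additional examples (R(theta) x, y).
        """
        augmented_points, augmented_values = [], []
        for x, y in zip(points, values):
            for theta in angles:
                c, s = np.cos(theta), np.sin(theta)
                rotation = np.array([[c, -s], [s, c]])
                augmented_points.append(rotation @ x)
                augmented_values.append(y)   # label unchanged by the invariance
        return np.array(augmented_points), np.array(augmented_values)

    # Example: three 2-D points, each replicated at four rotations.
    X = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
    y = np.array([0.5, 1.3, 0.9])
    X_aug, y_aug = virtual_examples(X, y, np.linspace(0, 2 * np.pi, 4, endpoint=False))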

Dr. Partha Niyogi considered the problem of actively selecting the points at which a function should be sampled in order to obtain the best learning performance. Using arguments from the theory of optimal recovery of Micchelli and Rivlin, he developed an adaptive sampling strategy for arbitrary approximation schemes. He demonstrated the application of his general formulation to two special cases of functions on the real line: monotonically increasing functions and functions with bounded derivatives.
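For the monotone case, the flavor of such an adaptive strategy can be conveyed with a short sketch (an illustrative scheme in the spirit described above, not Dr. Niyogi's actual algorithm): between consecutive samples, monotonicity confines the unknown function to a rectangle, and the next query is placed in the rectangle of largest area.

    import numpy as np

    def next_sample_point(xs, ys):
        """Pick the next query point for a monotonically increasing function.

        Between consecutive samples, monotonicity confines the unknown function
        to a rectangle of width (x[i+1] - x[i]) and height (y[i+1] - y[i]);
        querying the midpoint of the largest such rectangle shrinks the
        worst-case uncertainty fastest.
        """
        widths = np.diff(xs)
        heights = np.diff(ys)
        i = int(np.argmax(widths * heights))     # most uncertain interval
        return 0.5 * (xs[i] + xs[i + 1])

    # Example: querying an unknown monotone function f on [0, 1].
    f = lambda x: x ** 3
    xs, ys = [0.0, 1.0], [f(0.0), f(1.0)]
    for _ in range(5):
        x_new = next_sample_point(np.array(xs), np.array(ys))
        j = int(np.searchsorted(xs, x_new))
        xs.insert(j, x_new)
        ys.insert(j, f(x_new))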

Other work has focused on applications in computer graphics, very low bandwidth video conferencing, video Email, feature detection, target detection, object recognition, visual database search, face detection, time series analysis, and pricing models in finance. For example, Mr. David Beymer and Professor Poggio have developed a technique for generating virtual views from a single view of a new face through transformations learned from face prototypes. The technique has application in face recognition systems and in special-effects image generation.

In the area of object recognition, Mr. Mike Jones, Mr. Beymer, and Professor Poggio have formulated and implemented new techniques, called "vectorizers,'' to find dense, robust correspondence between two images of objects of the same class. The approach is example-based and relies on a new representation of images in terms of a "shape'' vector and a "texture'' vector. In this way, a linear vector-space structure is obtained, which is the necessary, but often forgotten, condition for using techniques such as principal components analysis.
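The following short sketch (illustrative only; the array sizes and the use of random data are assumptions, and this is not the group's vectorizer) shows the sense in which stacking a dense correspondence field and the image values sampled at corresponding points yields vectors to which principal components analysis applies directly:

    import numpy as np

    def vectorize(shape, texture):
        """Stack a dense correspondence field ("shape") and the image values
        sampled at corresponding points ("texture") into one long vector."""
        return np.concatenate([shape.ravel(), texture.ravel()])

    # Example: 50 vectorized images, each with 100 correspondence points.
    rng = np.random.default_rng(0)
    examples = np.stack([
        vectorize(rng.normal(size=(100, 2)), rng.normal(size=100))
        for _ in range(50)
    ])

    # With a linear vector-space representation in hand, principal components
    # analysis applies directly: subtract the mean and take the leading
    # right singular vectors.
    mean = examples.mean(axis=0)
    _, singular_values, components = np.linalg.svd(examples - mean, full_matrices=False)
    leading_modes = components[:10]    # first ten principal components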

In the area of very-low-bandwidth video communication, Mr. Kah-Kay Sung and Professor Poggio are developing an object detection system that can be trained to locate faces in cluttered pictures. Professor Poggio and Messrs. Beymer, Ezzat, and Lines are developing learning networks for the analysis and synthesis of images that could be used for trainable man-machine interfaces and very efficient computer graphics.

On the biological side, physiological experiments (in collaboration with Professor Logothetis at Baylor) with monkeys have provided exciting results that suggest that the mammalian visual system may use view-based recognition strategies similar to those predicted by the model of Poggio and his coworkers.

ROBOTICS

COG
During the past two years, Professor Brooks and Professor Stein have embarked on a new project: to build a humanoid robot with human-like capabilities. The purpose of this project is to learn what constraints on the organization of human intelligence are imposed by having a body. Work over the last year has proceeded on a number of subsystems of Cog, with the first experimental results just now emerging.

Working with Professor Brooks, Mr. Matthew Marjanovic developed a learning system, based on the work of Professor Michael Jordan of the Department of Brain and Cognitive Sciences, which allows Cog to compensate for its head motion in order to keep a stable view of the world as the head moves. Ms. Yoky Matsuoka built a fully self-contained hand for Cog, with three fingers and an opposing thumb attached to a palm, all covered with a touch-sensitive skin. She then developed a novel system of three cooperating neural networks, each with a different architecture, which allows the hand to classify objects it is grasping, adjust its grasp strategies appropriately, and recognize when a grasped object is slipping. Mr. Robert Irie developed a sound processing system which takes inputs from Cog's ears and locates the sources of sound in space. It then correlates this with visual events in the world and learns the mapping between the sound coordinate system and the visual coordinate system. Ms. Cynthia Ferrell completed work on a very high performance eye system for Cog, which lets Cog saccade at human speeds to gather information at high resolution over a wide viewing area.

Working with Professor Stein, Mr. Mike Wessler has designed and built two binocular active vision heads with distinct visual systems. Mr. Wessler's "Reubens'' performs real-time tracking of moving targets, saccades to motion, and learns to calibrate its motor system based on visual feedback. Mr. Brian Scassellati has built "Charlotte,'' which derives a model of image "contours'' from low-level image features such as intensity edges and depth cues, as well as expectation-driven factors such as those studied in gestalt psychology. One goal is to explain phenomena such as illusory contours and the "filling in'' of the blind spot.

Autonomous Robots
Professor Brooks's previous work has concentrated on small mobile robots, and that work has been developed further.

Ms. Anita Flynn completed her Ph.D. work on the design and analysis of small piezoelectric motors. These motors have much better energy densities than same-sized electromagnetic motors. Besides using conventional fabrication techniques to build motors in the few-millimeter size range, Dr. Flynn developed prototype thin-film motors on silicon wafers. These motors hold great promise for micro-surgical applications. Along with UROP students Mr. Dean Franck, Mr. Art Shectman, and Mr. James McLurkin, Dr. Flynn developed prototype small robots for moving about within human intestines and for carrying out simple manipulation tasks. These robots are precursors to work on building actual surgical robot prototypes.

The remainder of the mobile robot work has been done in support of sending small rovers on planetary exploration missions. Mr. Michael Binnard completed work on a pneumatically driven six-legged robot. He demonstrated the energy advantages of such a system even on a planet like Mars, where the atmospheric pressure is only a few millibars. Under the co-supervision of Professor Grimson, Ms. Liana Lorigo has programmed a small tracked vehicle to navigate visually in our small planetary surface enclosure, which is filled with gravel and rocks. Under the co-supervision of Professor Pratt, Mr. Matthew Williamson has developed a novel small manipulator for this tracked vehicle. It relies on series-elastic actuators and is able to do geological sampling tasks such as hammering, prying, and digging.

Manipulation and Locomotion
Professor Pratt's group has been working on robotic systems that are specialized for interaction with natural environments. Unlike traditional industrial robots, these robots are far less concerned with positional accuracy and maximal slew rate. Instead, they must have smooth and stable force-controlled interactions with unknown environments, including tolerance to unexpected shock. Such requirements have led Professor Pratt and his students to develop the concept of series-elastic actuators, which differ from conventional actuators in their purposeful placement of mechanical elasticity between the transmission and the load. This design has been shown theoretically, and demonstrated in prototype hardware, to provide a significant improvement in terminal characteristics. Best of all, this series-elastic design methodology may soon allow an order-of-magnitude reduction in the cost of high-quality force actuators, because inexpensive gear trains may be used. Given the high number of degrees of freedom required for natural robots, this cost reduction is important.
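To illustrate why placing elasticity between the transmission and the load helps (a simplified model written for illustration, not the group's actual controller): measuring the spring's deflection turns force sensing into position sensing, and the delivered force can be servoed by driving the motor side of the spring.

    def series_elastic_force_step(x_motor, x_load, f_desired, k_spring, gain, dt):
        """One step of a simple series-elastic force controller (illustrative).

        A spring of stiffness k_spring sits between the motor-side position
        x_motor and the load-side position x_load.  The delivered force is
        k_spring * (x_motor - x_load); commanding the motor velocity in
        proportion to the force error servoes that force.
        """
        f_actual = k_spring * (x_motor - x_load)   # force read from spring deflection
        motor_velocity = gain * (f_desired - f_actual)
        return x_motor + motor_velocity * dt, f_actual

    # Example: drive the delivered force toward 10 N against a fixed load.
    x_motor, x_load = 0.0, 0.0
    for _ in range(200):
        x_motor, f = series_elastic_force_step(x_motor, x_load,
                                               f_desired=10.0, k_spring=500.0,
                                               gain=0.02, dt=0.005)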

Robots that interact with natural environments also need control systems that differ from those used in industrial robots, and developing such control systems is a second thrust of Professor Pratt's group. Path planning and accuracy are unimportant, as the environment is dynamic and unpredictable. Rather, what is needed is an appropriate behavioral abstraction to reduce control bandwidth and increase smoothness. In much the same way that early vision extracts discrete features from, and lowers the bandwidth of, visual information, this "late motor'' processing converts discrete behaviors into smooth interactions and expands the bandwidth of information used to modulate motor action. Professor Pratt's research in this area is presently focused on Virtual Model Control, in which simulations of physical systems are used as the processing mechanism for control. Recent efforts have shown this method to be superior to conventional inverse kinematics for describing the control of robot posture.
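A sketch of the virtual-model idea in its simplest form (the particular choice of a virtual spring-damper and the numbers below are assumptions made for illustration): a simulated physical element is attached between a point on the robot and a desired location, and the force that element would exert is mapped to joint torques through the manipulator Jacobian.

    import numpy as np

    def virtual_spring_damper_force(x_attach, x_target, v_attach, stiffness, damping):
        """Force that a simulated spring-damper, stretched between an attachment
        point on the robot and a desired target point, would exert (illustrative)."""
        return stiffness * (x_target - x_attach) - damping * v_attach

    def joint_torques(jacobian, force):
        """Map the virtual force at the attachment point to joint torques
        through the manipulator Jacobian (tau = J^T f)."""
        return jacobian.T @ force

    # Example: hold a body point near a desired posture target.
    J = np.array([[0.3, 0.1], [0.0, 0.4]])          # assumed 2x2 Jacobian
    f = virtual_spring_damper_force(np.array([0.1, 0.9]), np.array([0.0, 1.0]),
                                    np.array([0.05, -0.02]),
                                    stiffness=200.0, damping=20.0)
    tau = joint_torques(J, f)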

To support experimental work, Professor Pratt and his students have been constructing a bipedal walking robot and an arm for Cog.

Hands and Haptics
Dr. Salisbury's group has focused on three areas: sensor guided grasping, study of human and robot hands, and the development of haptic interfaces and rendering techniques.

In conjunction with Professor Slotine from the Department of Mechanical Engineering, Dr. Salisbury's group has developed a system that can grasp stationary and freely moving objects.

Dr. Salisbury's study of human and robot hands, in collaboration with Dr. Srinivasan at the Research Laboratory for Electronics, has focused on the development of touch perception algorithms to enable robots to deduce contact conditions from simple force-sensing fingertips. Work currently is in progress on a new multi-finger hand, and on algorithms for continuously reorienting objects held in the fingertips. The group is also developing precision force actuators for dual use in robot palpation experiments and human psychophysical experiments.

Significant progress also has been made in hardware and program development for the PHANToM haptic interface. This device exerts precisely controlled force vectors on a user's fingertips or a stylus so as to present touch (or "haptic") information to the user. Users can thus interact mechanically with virtual objects, perceiving properties including shape, texture, and motion through touch. A larger "tool-handle'' version of the device has been built and is being used to test a variety of haptic rendering algorithms. These devices are used both in our laboratory and in the Research Laboratory of Electronics as components of the Virtual Environment Training Technology program.
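As a minimal illustration of what a haptic rendering algorithm computes (a basic penalty-style scheme chosen here for illustration, not the algorithms actually used with the PHANToM):

    import numpy as np

    def render_virtual_wall(fingertip_position, wall_height, stiffness):
        """Force returned to the device for a flat virtual wall at z = wall_height.

        When the fingertip penetrates below the wall, a restoring force
        proportional to the penetration depth is commanded; elsewhere the force
        is zero.  Varying the force law over the surface conveys shape and texture.
        """
        penetration = wall_height - fingertip_position[2]
        if penetration <= 0.0:
            return np.zeros(3)
        return np.array([0.0, 0.0, stiffness * penetration])

    # Example: fingertip 2 mm below a wall at z = 0, stiffness 800 N/m.
    force = render_virtual_wall(np.array([0.01, 0.02, -0.002]),
                                wall_height=0.0, stiffness=800.0)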

ASSOCIATE SYSTEMS
