Exception Handling in Multi-Agent Systems

Project Overview

This project addresses how to develop multi-agent systems that are more robust and adaptive in complex, dynamic and error-prone environments. The work is led by Chris Dellarocas and Mark Klein, with participation by members of the MIT Artificial Intelligence Laboratory, and is funded by the DARPA ISO Agent-Based Systems Program (POC: Major Douglas Dyer).

The Challenge

A critical challenge in creating agent-based systems is allowing them to operate effectively when, as is typical in many domains, the operating environment is complex, dynamic, and error-prone. In such domains, we can expect to use a highly diverse set of agents, some with fairly sophisticated coordination capabilities, but many simply encapsulating legacy applications. New tasks, agents and other resources can appear and disappear in unpredictable ways. Communication channels can fail or be compromised, agents can "die" or make mistakes, inadequate responses to the appearance of new tasks or resources can lead to missed opportunities or inappropriate resource allocations, unanticipated agent inter-dependencies can lead to systemic problems like deadlocks, and so on. The net result is the potential for problems such as clogged networks, wasted resources, poor performance, system shutdowns, and security vulnerabilities.

Until now, the standard approach to this problem has been to "compile" complicated and carefully coordinated exception handling behaviors into all problem-solving agents. This is fundamentally problematic, however, since the causes, manifestations and resolutions of agent system exceptions are inherently systemic and context-sensitive rather than localizable to any particular agent. Agent developers must thus anticipate all the contexts in which the agent may be used, which is extremely difficult, and no systematic methodology is available to help them identify all relevant exception types and resolution strategies. Making changes in exception handling behavior is difficult because it potentially requires coordinated changes in multiple agents. Agents become much harder to maintain, understand and reuse because the relatively simple normative behavior of an agent becomes obscured by a potentially large body of code devoted to handling exceptional conditions. Finally, it is unrealistic to expect that all agents will have sophisticated exception handling capabilities built in.

Our Vision

We address these problems by creating a generic exception handling service that can be "plugged", with little or no customization, into existing agent systems to add the ability to function in exception-prone environments. This service can be viewed as a kind of "coordination doctor": it knows about the different ways multi-agent systems can get "sick", actively looks system-wide for symptoms of such "illnesses", and prescribes specific interventions, instantiated for the case at hand from a body of general treatment procedures. Agents need only implement their normative behavior plus a minimal set of interfaces, which assume only that each agent can report on its own behavior and modify its own actions to at least some extent. The exception handling service, itself implemented as agents, uses these interfaces plus a highly reusable knowledge base of generic exception handling expertise to detect when things go wrong in the agent ensemble and to take the appropriate corrective actions.
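
The division of labor described above can be illustrated with a minimal sketch. It is not the project's implementation: the names WorkerAgent, Treatment, KNOWLEDGE_BASE, reassign_orphans and exception_handling_pass are hypothetical, the single "agent death" entry is an invented example of a generic treatment procedure, and the service is reduced to one function rather than a set of agents. The sketch only assumes what the text states, namely that each agent can report on its own behavior and modify its own actions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WorkerAgent:
    """Hypothetical agent: normative behavior plus two small hooks."""
    name: str
    alive: bool = True
    pending_tasks: List[str] = field(default_factory=list)

    def report(self) -> Dict[str, object]:
        # Hook 1: the agent can report on its own behavior.
        # Simplification: a real service would infer "death" from a
        # missed report; here the flag stands in for that.
        return {"name": self.name, "alive": self.alive,
                "pending": list(self.pending_tasks)}

    def release_tasks(self) -> List[str]:
        # Hook 2: the agent can modify its own actions on request.
        released, self.pending_tasks = self.pending_tasks, []
        return released

    def accept_tasks(self, tasks: List[str]) -> None:
        self.pending_tasks.extend(tasks)


@dataclass
class Treatment:
    """One knowledge-base entry: a generic symptom and a generic remedy."""
    name: str
    symptom: Callable[[Dict[str, object]], bool]
    remedy: Callable[[WorkerAgent, List[WorkerAgent]], str]


def reassign_orphans(sick: WorkerAgent, ensemble: List[WorkerAgent]) -> str:
    # Generic procedure, instantiated for the agents at hand:
    # move the sick agent's tasks to the least-loaded healthy agent.
    healthy = [a for a in ensemble if a.alive]
    if not healthy:
        return f"no healthy agent available for {sick.name}'s tasks"
    target = min(healthy, key=lambda a: len(a.pending_tasks))
    tasks = sick.release_tasks()
    target.accept_tasks(tasks)
    return f"moved {len(tasks)} task(s) from {sick.name} to {target.name}"


# Assumed, illustrative knowledge base with a single entry.
KNOWLEDGE_BASE = [
    Treatment("agent death",
              symptom=lambda r: not r["alive"] and bool(r["pending"]),
              remedy=reassign_orphans),
]


def exception_handling_pass(ensemble, knowledge_base=KNOWLEDGE_BASE):
    """Look system-wide for symptoms and apply the matching remedies."""
    log = []
    for agent in ensemble:
        report = agent.report()
        for entry in knowledge_base:
            if entry.symptom(report):
                log.append(f"{entry.name}: {entry.remedy(agent, ensemble)}")
    return log


if __name__ == "__main__":
    a = WorkerAgent("a", alive=False, pending_tasks=["t1", "t2"])
    b = WorkerAgent("b", pending_tasks=["t3"])
    for line in exception_handling_pass([a, b]):
        print(line)
```

Note that WorkerAgent contains no exception handling logic of its own. In this sketch, supporting a new kind of exception means adding a Treatment entry to the knowledge base, not changing the agents.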

Papers

The following white paper describes the project in more detail.

 

