Acoustic Source Separation in Human & Machine.

Durlach et al.

The envisioned program addresses the problem of separating out acoustic sources in complex acoustic environments. This program, which involves both engineering and auditory science, includes interrelated efforts to develop: (1) improved designs of source-separation machines, (2) a model of auditory scene analysis in humans, and (3) working prototypes of acoustic source-separation machines. Given the team that intends to participate in this program, which includes members of both the Massachusetts Institute of Technology (MIT) and Boston University (BU), it is anticipated that the results obtained will have substantial impact on all three areas.

Research on source-separation design is aimed at developing a coherent, integrated approach to source separation and focuses on the joint exploitation of spatial separation of the sources (spatial filtering), statistical independence of the sources (independent component analysis), and constraints on the time-frequency structure of the source signals (sparse signal representation). Attempts are made to develop algorithms that simultaneously exploit various types of differences among the sources, operate in real time, and are relatively robust with respect to noise, reverberation, and various types of uncertainties in the available a priori information. Although most existing algorithms for source separation make use of microphone arrays to sample the acoustic field at more than one point in space, research has shown that humans achieve substantial source separation monaurally. Thus, source-separation methods that make use of both multiple sensors and a single sensor (i.e., both the case M ≥ N and the case M < N, where M is the number of sensors and N is the number of sources) are considered.
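As a concrete illustration of the determined case (M ≥ N) under idealized conditions, the following sketch applies independent component analysis to an instantaneous, non-reverberant two-sensor mixture of two synthetic sources, using scikit-learn's FastICA. The signals, mixing matrix, and parameters are hypothetical; the convolutive, reverberant mixtures targeted by the program require considerably more elaborate processing.

    # Sketch: ICA-based separation of an instantaneous two-sensor mixture (M = N = 2).
    # Sources, mixing matrix, and parameters are synthetic and purely illustrative.
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    fs = 8000                          # assumed sample rate (Hz)
    t = np.arange(2 * fs) / fs         # two seconds of signal

    # Two statistically independent sources: a 440 Hz tone and a 3 Hz sawtooth.
    s1 = np.sin(2 * np.pi * 440 * t)
    s2 = 2 * ((3 * t) % 1) - 1
    S = np.c_[s1, s2]                  # shape (n_samples, N)

    # Instantaneous, non-reverberant mixing at M = 2 sensors: X = S A^T.
    A = rng.normal(size=(2, 2))
    X = S @ A.T                        # shape (n_samples, M)

    # Recover the sources (up to permutation and scaling) from the mixtures alone.
    ica = FastICA(n_components=2, random_state=0)
    S_hat = ica.fit_transform(X)       # estimated sources, shape (n_samples, 2)

ICA of this instantaneous form exploits only the statistical independence of the sources; the spatial-filtering and sparse-representation components of the program address delays, reverberation, and the underdetermined (M < N) case.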

Research on human auditory scene analysis (an analysis that is very closely related to source separation) is expected both to benefit from the work on the other components of our program and to contribute to that work. Inasmuch as previous results in this area suggest that substantial source separation can be achieved with the use of only one ear, whereas most past work on machine source separation assumes an array of spatially separated sensors, the study of monaural source separation in humans is likely to have substantial payoff for the design of machine source-separation systems, as well as for basic understanding of human auditory perception. In the same vein, the research on human source separation will include study of how resistant performance (both monaural and binaural) is to the presence of reverberation in the environment, a condition that has generally been found highly problematic for source separation by machines.
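To make the monaural point concrete, the following sketch shows how sparseness of signals in the time-frequency plane can, in principle, permit single-sensor (M = 1) separation: an oracle binary mask, computed here from the known sources purely for illustration, assigns each short-time Fourier transform cell to the dominant source. The signals and parameters are hypothetical, and a practical monaural system would have to estimate such a mask blindly.

    # Sketch: oracle binary time-frequency masking of a single-sensor mixture (M = 1, N = 2).
    # The mask is computed from the true sources for illustration only.
    import numpy as np
    from scipy.signal import chirp, istft, stft

    fs = 8000                                   # assumed sample rate (Hz)
    t = np.arange(2 * fs) / fs
    s1 = np.sin(2 * np.pi * 440 * t)            # steady tone
    s2 = chirp(t, f0=2000, t1=t[-1], f1=3000)   # rising sweep
    x = s1 + s2                                 # single-sensor mixture

    _, _, X  = stft(x,  fs=fs, nperseg=512)
    _, _, S1 = stft(s1, fs=fs, nperseg=512)
    _, _, S2 = stft(s2, fs=fs, nperseg=512)

    # Assign each time-frequency cell to whichever source dominates it.
    mask = np.abs(S1) > np.abs(S2)
    _, s1_hat = istft(X * mask,  fs=fs, nperseg=512)
    _, s2_hat = istft(X * ~mask, fs=fs, nperseg=512)

Because the two sources occupy largely disjoint regions of the time-frequency plane, masking the mixture recovers each of them reasonably well from a single channel; reverberation smears energy across that plane, which is one reason it is so problematic for machine source separation.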

Engineering efforts to implement various source-separation techniques and to develop working prototypes focus on three major application areas: (a) "battlefield systems" to assist military personnel in conducting military actions in battlefield environments, (b) "intelligent-room systems" to facilitate interactions among multiple people and information subsystems within the confines of an enclosed space, and (c) "personal-aid systems" to assist either hearing-impaired individuals in everyday life or normal-hearing individuals who work in noisy environments but still require auditory communication with selected sources in those environments. Two factors that present great challenges to successful source separation in some of these application areas are the extent to which acoustic reverberation is present in the environment and the extent to which the available a priori information about the sensing system is inadequate. Especially challenging is the goal of constructing personal-aid systems (for either military or civilian use) that achieve increased spatial resolution by embedding a microphone array in the user's clothing. Overall, the envisioned program includes projects that range from relatively modest challenge (or risk) to relatively high challenge (or risk).