The impact of masking the formants of a vowel on how it is perceived

In class, we learned that vowels are characterized by their formants, which arise from the resonant frequencies of the vocal tract. In this illusion, we investigated how the perception of a vowel changes when noise tuned to its formant frequencies is added to mask them, and how this effect varies with the context in which the vowel occurs.

To do so, we mixed audio samples of two standard IPA vowels. When the two samples were played together, listeners were more likely to perceive only one of the constituent vowels (the perceptually dominant one), even though both vowels were present in the signal. We then added noise tailored to selectively mask the formants of the dominant vowel. We hypothesized that this manipulation would shift the perception of the fused sound toward the non-dominant vowel.
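The mixing step described above can be sketched as follows. This is a minimal illustration rather than the procedure we actually used: the function name `fuse`, the peak level of 0.9, the 16 kHz sample rate, and the pure tones standing in for the two vowel recordings are all assumptions made for the example.

```python
import numpy as np

def fuse(a, b, peak=0.9):
    """Mix two mono signals sample by sample and rescale to a target peak."""
    n = max(len(a), len(b))
    # Zero-pad the shorter signal so both have the same length.
    a = np.pad(np.asarray(a, dtype=float), (0, n - len(a)))
    b = np.pad(np.asarray(b, dtype=float), (0, n - len(b)))
    mixed = a + b
    max_abs = np.max(np.abs(mixed))
    # Rescale to avoid clipping; leave an all-zero signal untouched.
    return mixed if max_abs == 0 else mixed * (peak / max_abs)

# Hypothetical stand-ins for the two vowel recordings (pure tones, not vowels).
sample_rate = 16000  # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate
vowel_1 = np.sin(2 * np.pi * 700 * t)
vowel_2 = 0.5 * np.sin(2 * np.pi * 300 * t[: sample_rate // 2])

fused = fuse(vowel_1, vowel_2)
```

With real recordings, the two arrays would come from the decoded audio files, and both would need to share the same sample rate before being summed.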

NOTE: Some of these audio files do not play properly in Firefox or Chrome. Please listen to them in Safari, or right-click the audio files to download them and play them in a local media player (e.g. VLC).

Experiment 1: Fusing a pair of vowels

Vowel 1

Audio:

Frequency spectrum:


Vowel 2

Audio:

Frequency spectrum:

Results

Fused sound without noise:
Fused sound with background noise that masks the frequency range between 5 kHz and 7 kHz:
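One way to produce noise confined to a band such as 5 kHz to 7 kHz is to filter white noise in the frequency domain. The sketch below is an assumption about how this could be done, not a record of the tool we used: the function name `band_limited_noise`, the 44.1 kHz sample rate, the one-second duration, and the fixed seed are illustrative choices.

```python
import numpy as np

def band_limited_noise(low_hz, high_hz, n_samples, sample_rate, seed=0):
    """Return white noise restricted to [low_hz, high_hz], peak-normalized."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    # Zero every frequency bin outside the requested band.
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    noise = np.fft.irfft(spectrum, n=n_samples)
    return noise / np.max(np.abs(noise))

sample_rate = 44100  # assumed sample rate in Hz
noise = band_limited_noise(5000, 7000, n_samples=sample_rate, sample_rate=sample_rate)

# Check what fraction of the noise power lies inside the 5-7 kHz band.
power = np.abs(np.fft.rfft(noise)) ** 2
freqs = np.fft.rfftfreq(len(noise), d=1.0 / sample_rate)
in_band_fraction = power[(freqs >= 5000) & (freqs <= 7000)].sum() / power.sum()
```

The resulting array could then be scaled by a chosen gain and added to a fused signal of the same length and sample rate. A hard cutoff in the frequency domain is the simplest option; a smoother band-pass filter would be an equally reasonable choice.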

Experiment 2: Fusing a pair of words


Next, we ran a similar experiment on a minimal pair of words (two words that differ in only one sound at the same position) containing the same two vowels. As before, when the two words were played together, listeners were more likely to perceive one word (the one with the dominant vowel) over the other, even though both words were present in the signal. We then added noise tailored to selectively mask the vowel formants of the word containing the dominant vowel. We hypothesized that this manipulation would similarly shift the perception of the fused sound toward the non-dominant word, and that the effect would be stronger than for the mixture of vowel sounds alone because of the Ganong effect, since the surrounding consonants provide additional context.

Words

In our illustration, we used two words with the same letters except for the vowel: but and bat.

Bat:

But:

Results

Fused sound without noise:

A small group of test subjects judged the audio above to sound more like the word bat than the word but. We therefore added noise tailored to mask the vowel formants present in the word bat.

Fused sound with noise:

With the tailored noise added, the fused sound more closely resembles the word but.

Acknowledgements

