What Roles Do Attention and Fundamental Frequency Range Play with Harmonicity in Speech Segregation?

by Maria De La Torre

Background and Motivation

Alterations to harmonicity can impair the intelligibility of concurrent natural speech (Popham et al., 2018). Their experiments found that replacing harmonic excitation with either inharmonic frequency components or simulated whispering reduces the intelligibility of speech. When asked to attend to a specific speaker, participants would erroneously report words, indicating that these changes in harmonicity make it more difficult to follow a target speaker.

This illusion aims to investigate whether the fundamental frequency ranges of two different speakers affect the difficulty of following a target speaker and identifying speech errors, and whether these effects remain when the harmonic structure is altered.

Illusion Simulation

Since male and female speakers have different fundamental frequency ranges, I have one female and one male speaker each saying one of two sentences, the second of which contains a common speech error (Dell, 1986).

The two sentences are:

  • The sun is in the sky
  • The sky is in the sky

The table below presents sounds in which both speakers say the first sentence, and sounds in which either the female or the male voice says the second sentence instead. To check whether these effects of attention and fundamental frequency range remain when the harmonic structure is altered, the following variations of harmonicity are shown:

  • Original: The initial recording with no alterations
  • Harmonic: A synthetic harmonic reconstruction (intended to be very close to the original)
  • Inharmonic: A synthetic inharmonic reconstruction, where each harmonic is shifted by a random proportion of the fundamental frequency
  • Whisper: A synthetic simulated whisper, with noise replacing the voiced component, and high-pass filtered
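
The difference between the harmonic and inharmonic variations can be illustrated with a toy complex tone. The sketch below is not the speech resynthesis used to produce the sounds in the table; it only shows what shifting each harmonic by a random proportion of the fundamental frequency does to the partial frequencies. The function names, the 220 Hz fundamental, and the 0.3 jitter bound are all assumptions chosen for illustration.

```python
import math
import random


def partial_frequencies(f0, n_partials, jitter=0.0, seed=0):
    """Return the frequencies (Hz) of the partials of a complex tone.

    With jitter=0 the partials are exact harmonics, k * f0.
    With jitter > 0, each partial is shifted by a random amount drawn
    uniformly from [-jitter * f0, +jitter * f0] (an assumed jitter rule).
    """
    rng = random.Random(seed)
    return [k * f0 + rng.uniform(-jitter, jitter) * f0
            for k in range(1, n_partials + 1)]


def synthesize(freqs, n_samples=800, sample_rate=16000):
    """Sum equal-amplitude sinusoids at the given frequencies."""
    return [
        sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs)
        / len(freqs)
        for t in range(n_samples)
    ]


F0 = 220.0  # assumed illustrative fundamental frequency, not measured

harmonic = partial_frequencies(F0, 10)                 # 220, 440, 660, ...
inharmonic = partial_frequencies(F0, 10, jitter=0.3)   # no common fundamental

harmonic_tone = synthesize(harmonic)
inharmonic_tone = synthesize(inharmonic)
```

In the harmonic tone every partial is an integer multiple of one fundamental, so the partials share a common period. After the shift they no longer do, which is the cue removed in the inharmonic column.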
Trials

    For this illusion, it would be interesting to observe the effects of attention and fundamental frequency range on speech intelligibility through three different trials.

    Trial 1: Listen to the sounds in the original column and note whether the change in harmonicity has an effect on distinguishing the two speakers.

    Are the speech errors more intelligible when the female voice is changed or when the male voice is changed?

    Trial 2: Listen to all the sounds but this time try to pay attention to the male speaker and identify which speaker is saying the erroneous sentence.

    Trial 3: Repeat Trial 2, but this time pay attention to the female speaker.

    (*) Are the speech errors easier to recognize in one of the three trials?

    The next trial observes what roles attention and fundamental frequency range play along with harmonicity.

    Trial 4: Repeat Trials 1 to 3, first for the inharmonic reconstruction and then for the whisper.

    Did your answer to question (*) change for the inharmonic or whisper version of the sound?


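    Condition        Original   Harmonic   Inharmonic   Whisper
    Both Same        (audio)    (audio)    (audio)      (audio)
    Female Change    (audio)    (audio)    (audio)      (audio)
    Male Change      (audio)    (audio)    (audio)      (audio)
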

    References

    Popham, S., Boebinger, D., Ellis, D. P., Kawahara, H., & McDermott, J. H. (2018). Inharmonic speech reveals the role of harmonicity in the cocktail party problem. Nature Communications, 9(1), 1-13.

    Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3), 283.

    Comments

    Katarina Bulovic

    a) For the original column, I thought it was a little easier to detect the speech errors in the male-change condition, but neither was too difficult to detect. I do think that directing my attention to the female or male voice helped me detect that voice's speech errors, especially the male voice. For the harmonic column, my results were about the same as for the original column. For the inharmonic and whisper columns, it became harder to determine which was the female and which was the male voice. I especially had trouble distinguishing them in the female-change condition. It did make it slightly easier if I focused on only listening to the male voice.
    b) As a researcher, this does support the hypothesis that the effects of attention and fundamental frequency range persist in all four of these cases, because I got similar results for all four columns (female-change was harder to detect, and focusing on only the male voice made it slightly easier).

    Cesar Duran

    Cool Experiment! I felt like the female voice dominated over the male, and therefore the speech errors were much more prominent for the female change than for the male change. What I found interesting was that during the female error, it was very difficult to convince myself that the male voice wasn't also saying the erroneous sentence, regardless of harmonicity (but most difficult in the inharmonic and whisper columns). Perhaps this is due to the sentences being more synced up for the female change, while there is a slight offset during the male change. When I paid attention specifically to the male voice, it became easier to distinguish the two voices in the female change, but it was still somewhat difficult. I felt like the 'k' sound in sky overpowers the lower 'u' sound in sun, leaving me still not fully convinced that the male voice was saying the correct sentence. During the third trial, paying attention to the female voice resulted in the male voice essentially being drowned out for me. I feel like paying attention to the male voice would result in higher speech error detection, but solely because the male speech errors were more difficult to pick up on. In the inharmonic female change, it was impossible to distinguish the male voice, even when I focused on it.

    This illusion does seem to successfully demonstrate the fundamental frequency's role in speech segregation. It shows that the higher female fundamental frequency results in easier speech identification, therefore answering the question.