Voice Quality Study Group

Given (1) the strong mutual interest at LL and the MGH Center for Laryngeal Surgery and Voice Rehabilitation in developing improved methods to assess voice quality and (2) the fact that SHBT students at both institutions are already actively engaged in voice quality projects, the Voice Quality Study Group (VQSG) was established to bring together the SHBT students and interested staff members at the two institutions.

The main goals of the VQSG are:

  1. to establish a common baseline of knowledge about the current state of the art in voice quality assessment,
  2. to share information about current and planned research projects so that such projects can benefit from the group's comments and discussions,
  3. to share databases, technologies, etc., when appropriate to advance the work of both groups, and
  4. to develop collaborative projects that take advantage of the strengths, knowledge, resources, and interests of both groups, so that the combined efforts can accomplish more than either group could on its own.

Session 1 - April 7, 2005

Gerratt, B. R. and J. Kreiman (2001). "Measuring vocal quality with speech synthesis." Journal of the Acoustical Society of America. 110(5, pt. 1): 2560-2566.

Session 2 - April 21, 2005

Kreiman, J., B. R. Gerratt, et al. (1993). "Perceptual Evaluation of Voice Quality - Review, Tutorial, and a Framework for Future-Research." Journal of Speech and Hearing Research 36(1): 21-40.

Session 3 - May 5, 2005

(1) Braida, L. D., J. S. Lim, et al. (1984). "Intensity perception. XIII. Perceptual anchor model of context-coding." J Acoust Soc Am 76(3): 722-31. (2) Gerratt, B. R., J. Kreiman, et al. (1993). "Comparing Internal and External Standards in Voice Quality Judgments." Journal of Speech and Hearing Research 36(1): 14-20.

Session 4 - May 26, 2005

J. Kreiman and B. R. Gerratt, "Perception of aperiodicity in pathological voice," Journal of the Acoustical Society of America, vol. 117, pp. 2201-2211, 2005.

Session 5 - June 9, 2005

Granqvist, S. and L. Leng (2003). "The Visual Sort and Rate method for perceptual evaluation in listening tests." Paper I from doctoral thesis.

Session 6 - July 7, 2005

J. Kaiser, "Some observations on vocal tract operation from a fluid flow point of view," in Vocal Fold Physiology: Biomechanics, Acoustics, and Phonatory Control, I. R. Titze and R. C. Scherer, Eds. Denver Center for the Performing Arts, CO, 1983, pp. 358-386.

Session 7 - July 21, 2005

C. H. Shadle, A. Barney, and P. O. A. L. Davies, "Fluid flow in a dynamic mechanical model of the vocal folds and tract. II. Implications for speech production studies," Journal of the Acoustical Society of America, vol. 105, pp. 456-466, 1999.

Session 8 - August 2, 2005

Ibid.

Session 9 - August 18, 2005

M. H. Krane, "Aeroacoustic production of low-frequency unvoiced speech sounds," Journal of the Acoustical Society of America, vol. 118, pp. 410-427, 2005.

Session 10 - September 8, 2005

R. S. McGowan and M. S. Howe, "Compact Green's functions extend the acoustic theory of speech production," in submission, June 2005.

Session 11 - September 22, 2005

P. Bangayan, C. Long, A. A. Alwan, J. Kreiman, and B. R. Gerratt, "Analysis by synthesis of pathological voices using the Klatt synthesizer," Speech Communication, vol. 22, pp. 343-368, 1997.

Session 12 - October 6, 2005

General questions and discussion on current topics.

Session 13 - October 20, 2005

R. Shrivastav and C. M. Sapienza, "Objective measures of breathy voice quality obtained using an auditory model," Journal of the Acoustical Society of America, vol. 114, pp. 2217-2224, 2003.

Session 14 - November 3, 2005

Book Chapter: Moore BCJ (1995). Frequency analysis and masking. In BCJ Moore (ed.), Hearing. Academic: London, pp. 161-205.

Session 15 - November 17, 2005

T. Dau, B. Kollmeier, and A. Kohlrausch, "Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers," Journal of the Acoustical Society of America, vol. 102, pp. 2892-2905, 1997.

Session 16 - December 1, 2005

Ibid.

Session 17 - December 15, 2005

H. M. Hanson, K. N. Stevens, H. K. J. Kuo, M. Y. Chen, and J. Slifka, "Towards models of phonation," Journal of Phonetics, vol. 29, pp. 451-480, 2001. All but Section 4.

Session 18 - January 12, 2006

H. M. Hanson, K. N. Stevens, H. K. J. Kuo, M. Y. Chen, and J. Slifka, "Towards models of phonation," Journal of Phonetics, vol. 29, pp. 451-480, 2001. Section 4.

Session 19 - January 26, 2006

I. R. Titze, B. H. Story, G. C. Burnett, J. F. Holzrichter, L. C. Ng, and W. A. Lea, "Comparison between electroglottography and electromagnetic glottography," Journal of the Acoustical Society of America, vol. 107, pp. 581-588, 2000.

Session 20 - February 9, 2006

M. Rothenberg, "New inverse filtering technique for deriving glottal air-flow waveform during voicing," Journal of the Acoustical Society of America, 53 (6), pp. 1632-1645, 1973.

Session 21 - February 23, 2006

D. Y. Wong, J. D. Markel, and A. H. Gray, "Least-Squares Glottal Inverse Filtering from the Acoustic Speech Waveform," IEEE Transactions on Acoustics Speech and Signal Processing, vol. 27, pp. 350-355, 1979.

Session 22 - March 9, 2006

Roian Egnor from Harvard will be talking a bit about amplitude modulations ("stuttering") that she has seen in bird calls.

Session 23 - April 6, 2006

J. Slifka, "Some Physiological Correlates to Regular and Irregular Phonation at the End of an Utterance," Journal of Voice, 2005, In Press.

Session 24 - April 20, 2006

D. A. Berry, "Mechanisms of modal and nonmodal phonation," Journal of Phonetics, vol. 29, pp. 431-450, 2001.

Session 25 - May 4, 2006

S. M. Lulich, A. Bachrach, and N. Malyska, "A role for the second subglottal resonance in lexical access," pre-submission draft, 2006.

Session 26 - May 18, 2006

D. Mehta and T. F. Quatieri, "Pitch-Scale Modification using the Modulated Aspiration Noise Source," in submission to the International Conference on Spoken Language Processing (Interspeech), Pittsburgh, PA, 2006.

Session 27 - June 1, 2006

Discussion of voice files.

Session 28 - June 15, 2006

T. Böhm, "Is utterance-final glottalization a cue for speaker recognition by humans?," poster, 151st ASA, 2006.

Session 29 - June 22, 2006

No paper this week. Further discussions with Tamás Böhm.

Session 30 - June 29, 2006

N. Malyska and T. F. Quatieri, "Analysis of Nonmodal Phonation using Minimum Entropy Deconvolution," accepted to the International Conference on Spoken Language Processing (Interspeech), Pittsburgh, PA, 2006.

Session 31 - July 13, 2006

Lowell, Soren Y. and Brad H. Story. 2006. Simulated effects of cricothyroid and thyroarytenoid muscle activation on adult-male vocal fold vibration. JASA 120(1):386-397.

Session 32 - July 27, 2006

No paper this week. Discussion of MGH Voice Center visit.

Session 33 - August 10, 2006

Lucero, J. C. and L. L. Koenig (2005). "Phonation thresholds as a function of laryngeal size in a two-mass model of the vocal folds." Journal of the Acoustical Society of America 118(5): 2798-2801.

Session 34 - October 12, 2006

A review of in vivo high-speed video imaging of the human vocal folds.

Session 35 - October 26, 2006

A discussion with irregular phonation regions as guest stars.

Session 36 - November 9, 2006

D. Cairns and J. H. L. Hansen, "Nonlinear analysis and classification of speech under stressed conditions," Journal of the Acoustical Society of America, vol. 96, pp. 3392-3400, 1994.

Session 37 - December 7, 2006

Y. D. Heman-Ackah, et al., "Cepstral peak prominence: a more reliable measure of dysphonia," Ann Otol Rhinol Laryngol, vol. 112, pp. 324-333, 2003.

Session 38 - January 18, 2007

Murphy, P. J. (2000). "Spectral characterization of jitter, shimmer, and additive noise in synthetically generated voice signals," Journal of the Acoustical Society of America 107, 978-988.

Session 39 - February 1, 2007

Ibid.

Session 40 - February 15, 2007

Murphy, P. J. (1999). "Perturbation-free measurement of the harmonics-to-noise ratio in voice signals using pitch synchronous harmonic analysis," Journal of the Acoustical Society of America 105, 2866-2881.

Session 41 - March 1, 2007

S. Bielamowicz, J. Kreiman, B. R. Gerratt, M. S. Dauer, and G. S. Berke, "Comparison of voice analysis systems for perturbation measurement," Journal of Speech and Hearing Research, vol. 39, pp. 126-134, 1996.

Session 42 - March 15, 2007

Deliyski, D. (2007). "Clinical implementation of laryngeal high-speed videoendoscopy: Challenges and evolution." Folia Phoniatrica et Logopaedica, in press.

Session 43 - March 29, 2007

Murphy, P. (2006). "On first rahmonic amplitude in the analysis of synthesized aperiodic voice signals." Journal of the Acoustical Society of America 120(5), 2896-2907.

Session 44 - April 12, 2007

Field trip to the MGH Voice Center.

Session 45 - May 10, 2007

Fitch, W. T. and M. D. Hauser (2002). Unpacking "Honesty": Vertebrate Vocal Production and the Evolution of Acoustic Signals. Acoustic Communication. A. Simmons, R. R. Fay and A. N. Popper. New York, Springer. Physical and Anatomical Constraints on Signal Production (Section 2), pp. 5-26.

Session 46 - May 24, 2007

Iseli, M., Y.-L. Shue, et al. (2007). "Age, sex, and vowel dependencies of acoustic measures related to the voice source." Journal of the Acoustical Society of America 121(4): 2283-2295.

Session 47 - June 7, 2007

Fitch, W. T. and M. D. Hauser (2002). Unpacking "Honesty": Vertebrate Vocal Production and the Evolution of Acoustic Signals. Acoustic Communication. A. Simmons, R. R. Fay and A. N. Popper. New York, Springer. Morphological Diversity in the Vertebrate Vocal Production System (Section 2.2), pp. 13-20.

Session 48 - June 21, 2007

Hillman RE, Heaton JT, Masaki A, Zeitels SM, Cheyne HA. (2006). "Ambulatory monitoring of disordered voices." Ann Otol Rhinol Laryngol. Nov;115(11):795-801.

Session 49 - July 5, 2007

Howe, Michael S. and Richard S. McGowan. "Vortex shedding and the voice source." Presented at the Acoustical Society of America, Salt Lake City, June 2007.

Session 50 - August 2, 2007

D. Rudoy, D. N. Spendley and P. J. Wolfe, "Conditionally linear Gaussian models for estimating vocal tract resonances," Proceedings of Interspeech, Antwerp, Belgium, 2007.

Session 51 - August 16, 2007

Encore of...Howe, Michael S. and Richard S. McGowan. "Vortex shedding and the voice source." Presented at the Acoustical Society of America, Salt Lake City, June 2007.

Session 52 - September 13, 2007

Discussion of edge detection on vocal fold kymographic images.

Session 53 - October 4, 2007

Murphy, P. J. and Akande, O.O. (2007). "Noise estimation in voice signals using short-term cepstral analysis," Journal of the Acoustical Society of America, 121(3), 1679-1690.

Session 54 - November 1, 2007

Deshmukh, O., C. Y. Espy-Wilson, et al. (2005). "Use of temporal information: Detection of periodicity, aperiodicity, and pitch in speech." IEEE Transactions on Speech and Audio Processing 13(5): 776-786.

Session 55 - December 6, 2007

Rahman, M. S., and Shimamura, T. (2005). "Formant frequency estimation of high-pitched speech by homomorphic prediction," Acoust. Sci. & Tech. 26, 502-510.

Session 56 - January 24, 2008

Kreiman, J., Gerratt, B. R., and Ito, M. (2007). "When and why listeners disagree in voice quality assessment tasks," Journal of the Acoustical Society of America 122, 2354-2364.

Session 57 - February 21, 2008

(1) Kreiman, J., Gerratt, B. R., and Ito, M. (2007). "When and why listeners disagree in voice quality assessment tasks," Journal of the Acoustical Society of America 122, 2354-2364. (2) R. Shrivastav, C. M. Sapienza, and V. Nandur, "Application of psychometric theory to the measurement of voice quality using rating scales," Journal of Speech, Language, and Hearing Research, 48 (2), pp. 323-335, 2005.

Session 58 - March 7, 2008

R. Shrivastav, C. M. Sapienza, and V. Nandur, "Application of psychometric theory to the measurement of voice quality using rating scales," Journal of Speech, Language, and Hearing Research, 48 (2), pp. 323-335, 2005.

Session 59 - April 10, 2008

Böhm, T. and S. Shattuck-Hufnagel (2008). "Does a speaker's habitual utterance-final voice quality help listeners recognize the voice?" Phonetica, submitted.

Session 60 - May 15, 2008

(1) Dilley, L., Shattuck-Hufnagel, S., and Ostendorf, M. (1996). "Glottalization of word-initial vowels as a function of prosodic structure," Journal of Phonetics, 24, 423-444. (2) Discussion of speaker identification using phrase-final glottalization as a cue, with a new listener training paradigm.

Session 61 - May 29, 2008

(1) Zañartu, M., Mongeau, L., and Wodicka, G. R. (2007). "Influence of acoustic loading on an effective single mass model of the vocal folds," Journal of the Acoustical Society of America, 121(2), 1119-1129. (2) Titze, I. R. (2008). "Nonlinear source-filter coupling in phonation: Theory," Journal of the Acoustical Society of America, 123(5), 2733-2749.

Session 62 - June 12, 2008

Ibid.

Session 63 - June 19, 2008

Ibid.

Session 64 - July 15, 2008

Titze, I., T. Riede, et al. (2008). "Nonlinear source-filter coupling in phonation: Vocal exercises." Journal of the Acoustical Society of America 123(4): 1902-1915.

Session 65 - August 7, 2008

Fulop, S. A. and K. Fitz (2006). "Algorithms for computing the time-corrected instantaneous frequency (reassigned) spectrogram, with applications." Journal of the Acoustical Society of America 119(1): 360-371.

Session 66 - September 4, 2008

Group qualitative analysis of laryngeal high-speed videoendoscopy with synchronous data waveforms.

Session 67 - September 18, 2008

Ibid.

Session 68 - October 2, 2008

Rudoy, D., T. F. Quatieri, and P. J. Wolfe (2009). "Time-varying autoregressive tests for multiscale speech analysis." International Conference on Acoustics, Speech, and Signal Processing. Submitted for review.

Session 69 - October 16, 2008

Lulich, S. M. "Closed phase subglottal coupling and estimation of vocal fold mechanics from the speech signal." Voice Quality Study Group. Manuscript.

Session 70 - October 30, 2008

Quatieri, T. F., K. Brady, et al. (2006). "Exploiting nonacoustic sensors for speech encoding." IEEE Transactions on Audio, Speech, and Language Processing 14(2): 533-544.

Session 71 - November 13, 2008

Švec, J. G., Šram, F., and Schutte, H. K. (2007). "Videokymography in voice disorders: What to look for?" Ann. Otol. Rhinol. Laryngol. 116(3), 172-180.

Session 72 - December 11, 2008

Yanguas, L. R., Quatieri, T. F., and Goodman, F. (1999). "Implications of glottal source for speaker and dialect identification," in IEEE International Conference on Acoustics, Speech, and Signal Processing (Phoenix, AZ), pp. 813-816.

Session 73 - January 8, 2009

Plumpe, M. D., T. F. Quatieri, et al. (1999). "Modeling of the glottal flow derivative waveform with application to speaker identification." IEEE Transactions on Speech and Audio Processing 7(5): 569-586.

Session 74 - January 22, 2009

Walker, J., and Murphy, P. (2005). "Advanced methods for glottal Wave extraction," in Nonlinear Analyses and Algorithms for Speech Processing. International Conference on Non-Linear Speech Processing, NOLISP 2005, Barcelona, Spain, April 19-22, 2005, Revised Selected Papers, edited by M. Faundez-Zanuy (Springer-Verlag Berlin Heidelberg, Barcelona, Spain), pp. 139-149.

Session 75 - February 5, 2009

Fröhlich, M., D. Michaelis, et al. (2001). "SIM - simultaneous inverse filtering and matching of a glottal flow model for acoustic speech signals." Journal of the Acoustical Society of America 110(1): 479-488.

Session 76 - February 20, 2009

(day change to Fridays!)

Jinachitra, P. and J. O. Smith, III (2005). Joint estimation of glottal source and vocal tract for vocal synthesis using Kalman smoothing and EM algorithm. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY.

Session 77 - March 6, 2009

Zañartu, M. (2009). An impedance-based inverse filtering scheme of speech signals applied to a clinical monitoring system. PhD thesis proposal, Purdue University.

Session 78 - March 20, 2009

Ibid.

Session 79 - April 3, 2009

VQSG turns 4!

(1) Pierrehumbert, J. B., and Frisch, S. A. (1997). "Synthesizing allophonic glottalization," in Progress in Speech Synthesis, edited by J. P. H. van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg (Springer-Verlag, New York). (2) Hillenbrand, J. M., and Houde, R. A. (1996). "Role of F0 and amplitude in the perception of intervocalic glottal stops," J. Speech Hear. Res. 39(6), 1182-1190.

Session 80 - May 1, 2009

Titze, IR, Story, BH. Rules for controlling low-dimensional vocal fold models with muscle activation. J Acoust Soc Am 2002;112:1064-76.

Session 81 - May 15, 2009

Ibid.

Session 82 - June 12, 2009

Discussion with Kimberly Dietz on her summer project: Naturalness of nonmodalities during pitch-scale modification.

Session 83 - July 2, 2009

Discussion with Miriam Makhlouf: Acoustic characteristics of speakers with spasmodic dysphonia.

Session 84 - July 31, 2009

Discussion with Tom Baran: Time-scale modification of speech with irregularities.

Session 85 - August 14, 2009

Grillo, E. and Verdolini, K. (2008). "Evidence for distinguishing pressed, normal, resonant, and breathy voice qualities by laryngeal resistance and vocal efficiency in vocally trained subjects," J. Voice 22(5), 546-552.

Session 86 - September 1, 2009

Discussion with Maria Berezina on her summer research: Autoregressive modeling of voiced speech.

Session 87 - October 2, 2009

Discussion with Frank Tompkins from Patrick Wolfe's lab on their latest INTERSPEECH paper: Approximate intrinsic Fourier analysis of speech.

Session 88 - October 15, 2009

Ibid.

Session 89 - October 29, 2009

Lulich, S. M. (2009). "Subglottal resonances and distinctive features," Journal of Phonetics, doi:10.1016/j.wocn.2008.10.006.

Session 90 - November 13, 2009

Ibid.


If there is anything we can do to facilitate our discussions, please let us know. Thank you for your interest in the Voice Quality Study Group.

Daryush, dmehta777@mit.edu
(Please remove the '777' when emailing. This is an anti-spam measure.)