Summary: A new study reveals how our brain differentiates between music and speech using simple acoustic parameters. The researchers found that slower, even sounds were perceived as music, while faster, uneven sounds were perceived as speech.
These insights could optimize therapeutic programs for language disorders such as aphasia. The research provides a deeper understanding of auditory processing.
Key facts:
- Simple parameters: The brain uses basic acoustic parameters to distinguish music from speech.
- Therapeutic potential: The findings could improve therapies for language disorders such as aphasia.
- Details of the study: The study involved over 300 participants listening to synthesized audio clips.
Source: NYU
Music and speech are among the most common types of sounds we hear. But how do we identify what we think are the differences between the two?
An international team of researchers has mapped this process through a series of experiments, yielding insights that offer a potential means of optimizing therapeutic programs that use music to help people with aphasia regain the ability to speak.
This language disorder affects more than 1 in 300 Americans each year, including Wendy Williams and Bruce Willis.
“Although music and speech are different in many ways, ranging from pitch to timbre to sound texture, our results show that the auditory system uses strikingly simple acoustic parameters to distinguish between music and speech,” explains Andrew Chang, a postdoctoral researcher in NYU’s Department of Psychology and lead author of the paper, which appears in the journal PLOS Biology.
“In general, slower and steadier sound clips of mere noise sound more like music, while faster and more irregular clips sound more like speech.”
Scientists measure the rate of such signals in a precise unit: hertz (Hz). A higher Hz value means more repetitions (or cycles) per second. For example, people typically walk at a pace of 1.5 to 2 steps per second, which is 1.5-2 Hz.
Stevie Wonder’s 1972 hit “Superstition” clocks in at approximately 1.6 Hz, while Anna Karina’s 1967 hit “Roller Girl” comes in at 2 Hz. Speech, in contrast, is typically two to three times faster, at 4-5 Hz.
It is well documented that the way a song’s volume, or loudness, changes over time – what is known as “amplitude modulation” – is relatively steady at 1-2 Hz. The amplitude modulation of speech, in contrast, is usually 4-5 Hz, which means its volume changes frequently.
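To make the idea of an amplitude-modulation rate concrete, here is a minimal illustrative sketch, not taken from the study: the function names and the windowed-RMS-plus-DFT approach are assumptions for this example. It synthesizes noise whose loudness rises and falls a given number of times per second, then recovers that rate from the loudness envelope:

```python
import math
import random

def am_noise(mod_hz, seconds=2.0, sr=8000, seed=0):
    """White noise whose loudness rises and falls mod_hz times per second."""
    rng = random.Random(seed)
    return [
        0.5 * (1 + math.sin(2 * math.pi * mod_hz * i / sr))  # loudness envelope
        * rng.uniform(-1, 1)                                 # noise carrier
        for i in range(int(seconds * sr))
    ]

def estimate_am_rate(signal, sr=8000, win=0.05):
    """Estimate the AM rate in Hz: track loudness via windowed RMS,
    then find the dominant frequency of that envelope with a naive DFT."""
    w = int(win * sr)
    env = [
        math.sqrt(sum(x * x for x in signal[i:i + w]) / w)
        for i in range(0, len(signal) - w + 1, w)
    ]
    m = len(env)
    mean = sum(env) / m
    env = [e - mean for e in env]          # drop the constant (DC) component
    best_k, best_power = 1, 0.0
    for k in range(1, m // 2):             # scan candidate frequency bins
        re = sum(env[i] * math.cos(2 * math.pi * k * i / m) for i in range(m))
        im = sum(env[i] * math.sin(2 * math.pi * k * i / m) for i in range(m))
        if re * re + im * im > best_power:
            best_k, best_power = k, re * re + im * im
    return best_k * (sr / w) / m           # bin index -> Hz

print(estimate_am_rate(am_noise(1.5)))     # music-like: near 1.5 Hz
print(estimate_am_rate(am_noise(4.5)))     # speech-like: near 4.5 Hz
```

The same noise carrier can thus sound "slow" or "fast" purely through its envelope, which is the kind of cue the study's stimuli isolate.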
Despite the ubiquity and familiarity of music and speech, scientists previously lacked a clear understanding of how we effortlessly and automatically identify a sound as music or speech.
To better understand this process, in their PLOS Biology study Chang and colleagues conducted four experiments in which more than 300 participants listened to audio segments of synthesized music- and speech-like noise with varying rates and degrees of regularity of amplitude modulation.
Crucially, such noise clips let listeners detect only volume and speed. Participants were asked to judge whether these ambiguous clips, which they were told were noise-masked music or speech, sounded more like music or speech.
Observing how participants sorted hundreds of noise clips into music or speech revealed how strongly each rate and regularity feature influenced their judgments. It’s the auditory version of “seeing faces in a cloud,” the scientists conclude: if a sound wave has a certain characteristic that matches listeners’ idea of what music or speech should be like, even a clip of white noise can sound like music or speech.
The results showed that our auditory system uses surprisingly simple and basic acoustic parameters to distinguish between music and speech: for participants, clips with slower rates (<2 Hz) and more regular amplitude modulation sounded more like music, while clips with higher rates (~4 Hz) and more irregular amplitude modulation sounded more like speech.
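The reported trend can be paraphrased as a toy decision rule. This is an illustrative sketch only: the 2 Hz and 0.9 cutoffs and the regularity measure (one minus the coefficient of variation of the intervals between loudness peaks) are assumptions made for this example, not values from the paper:

```python
import math
import random

def regularity(intervals):
    """1.0 = perfectly even spacing between loudness peaks;
    lower values = more jitter (based on the coefficient of variation)."""
    mean = sum(intervals) / len(intervals)
    var = sum((x - mean) ** 2 for x in intervals) / len(intervals)
    return 1.0 - math.sqrt(var) / mean

def toy_percept(intervals):
    """Toy rule mirroring the reported trend: slow (<2 Hz) and regular
    amplitude modulation reads as music; fast, irregular AM as speech."""
    rate_hz = 1.0 / (sum(intervals) / len(intervals))
    is_slow = rate_hz < 2.0                      # assumed cutoff
    is_regular = regularity(intervals) > 0.9     # assumed cutoff
    return "music" if (is_slow and is_regular) else "speech"

rng = random.Random(0)
music_like = [1 / 1.5] * 50                                     # steady 1.5 Hz
speech_like = [rng.uniform(0.5, 1.5) / 4.5 for _ in range(50)]  # jittered ~4.5 Hz

print(toy_percept(music_like))   # music
print(toy_percept(speech_like))  # speech
```

The real judgments were graded rather than a hard threshold, but the sketch shows how two simple envelope statistics can separate the categories.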
Knowing how the human brain differentiates between music and speech could potentially benefit people with hearing or language disorders such as aphasia, the authors note.
Melodic intonation therapy, for example, is a promising approach for teaching people with aphasia to sing what they want to say, using their intact “musical mechanisms” to bypass impaired speech mechanisms. Therefore, knowing what makes music and speech similar or different in the brain can help design more effective rehabilitation programs.
The other authors of the paper are Xiangbin Teng of the Chinese University of Hong Kong, M. Florencia Assaneo of the National Autonomous University of Mexico (UNAM), and David Poeppel, professor in the Department of Psychology at New York University and managing director of the Ernst Strüngmann Institute for Neuroscience in Frankfurt, Germany.
Funding: The research was supported by a grant from the National Institute on Deafness and Other Communication Disorders, part of the National Institutes of Health (F32DC018205), and by Leon Levy Fellowships in Neuroscience.
About this auditory neuroscience research news
Author: James Devitt
Source: NYU
Contact: James Devitt – New York University
Image: Credited to Neuroscience News
Original research: Free access.
“The human auditory system uses amplitude modulation to distinguish music from speech” by Andrew Chang et al. PLOS Biology
Summary
The human auditory system uses amplitude modulation to distinguish music from speech
Music and speech are complex and diverse auditory signals that are at the core of human experience. The mechanisms underlying each domain have been widely studied.
But what perceptual mechanism transforms sounds into music or speech, and what basic acoustic information is needed to distinguish between them, remain open questions.
Here, we hypothesized that amplitude modulation (AM), an essential temporal acoustic feature driving the auditory system through levels of processing, is critical for discriminating between music and speech.
Specifically, unlike paradigms using naturalistic acoustic signals (which can be challenging to interpret), we used a noise-probing approach to untangle the auditory mechanism: if AM frequency and regularity are critical for the perceptual discrimination of music and speech, the judgment of artificially synthesized ambiguous audio signals should be consistent with their AM parameters.
In 4 experiments (n = 335), signals with a higher peak AM frequency were generally judged as speech, and those with a lower peak AM frequency as music. Interestingly, this principle was used consistently by all listeners for speech judgments, but only by musically sophisticated listeners for music judgments.
In addition, signals with more regular AM were judged to be music rather than speech, and this feature was more critical for music judgments, regardless of musical sophistication.
The data suggest that the auditory system can rely on an acoustic property as low-level and basic as AM to distinguish music from speech, a simple principle that invites both neurophysiological and evolutionary experiments and speculation.