Speech and music represent the most cognitively complex, and arguably uniquely human, use of sound. To what extent do these two domains depend on separable neural mechanisms? What is the basis for such specialization? Several studies have proposed that left hemisphere neural specialization of speech (
1) and right hemisphere specialization of pitch-based aspects of music (
2) emerge from differential analysis of acoustical cues in the left and right auditory cortices (ACs). However, domain-specific accounts suggest that speech and music are processed by dedicated neural networks, the lateralization of which cannot be explained by low-level acoustical cues (
3–
6).
Despite consistent empirical evidence in its favor, the acoustical cue account has been computationally underspecified: Concepts such as spectrotemporal resolution (
7–
9), time integration windows (
10), and oscillations (
11) have all been proposed to explain hemispheric specializations. However, it is difficult to test these concepts directly within a neurally viable framework, especially using naturalistic speech or musical stimuli. The concept of spectrotemporal receptive fields (
12) provides a computationally rigorous and neurophysiologically plausible approach to the neural decomposition of acoustical cues. This model proposes that auditory neurons act as spectrotemporal modulation (STM) rate filters, based on both single-cell recordings in animals (
13,
14) and neuroimaging in humans (
15,
16). STM may provide a mechanistic basis to account for lateralization in AC (
17), but a direct relationship among acoustical STM features, hemispheric asymmetry, and behavioral performance during processing of complex signals such as speech and music has not been investigated.
We created a stimulus set in which 10 original sentences were crossed with 10 original melodies, resulting in 100 naturalistic a cappella songs (
Fig. 1) (stimuli are available at
www.zlab.mcgill.ca/downloads/albouy_20190815/). This orthogonalization of speech and melodic domains across stimuli allows the dissociation of speech-specific (or melody-specific) from nonspecific acoustic features, thereby controlling for any potential acoustic bias (
3). We created two separate stimulus sets, one with French and one with English sentences, to allow for reproducibility and to test generality across languages. We then parametrically degraded each stimulus selectively in either the temporal or spectral dimension using a manipulation that decomposes the acoustical signal using the STM framework (
18).
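In the spirit of that manipulation, degradation of a stimulus selectively along the temporal or spectral modulation dimension can be sketched as low-pass filtering in the modulation domain: take the 2D Fourier transform of the spectrogram, zero out components beyond a cutoff along the temporal-rate or spectral-scale axis, and invert. The function below is an illustrative reconstruction under simplifying assumptions (real-valued spectrogram, rectangular cutoff); its names and parameters are ours, not the published stimulus-generation code.

```python
import numpy as np

def degrade_modulations(spectrogram, dt, df, temporal_cutoff=None, spectral_cutoff=None):
    """Low-pass filter a spectrogram in the modulation domain (illustrative sketch).

    spectrogram      : 2D array, frequency bins x time frames
    dt               : time step between frames (s)
    df               : step between frequency bins (in the spectral-axis units)
    temporal_cutoff  : keep temporal modulation rates |w| <= cutoff (Hz), or None
    spectral_cutoff  : keep spectral modulation scales |s| <= cutoff, or None
    """
    mps = np.fft.fft2(spectrogram)                        # complex modulation spectrum
    scales = np.fft.fftfreq(spectrogram.shape[0], d=df)   # spectral modulation axis
    rates = np.fft.fftfreq(spectrogram.shape[1], d=dt)    # temporal modulation axis
    mask = np.ones_like(mps, dtype=float)
    if temporal_cutoff is not None:
        mask *= (np.abs(rates)[None, :] <= temporal_cutoff)
    if spectral_cutoff is not None:
        mask *= (np.abs(scales)[:, None] <= spectral_cutoff)
    return np.real(np.fft.ifft2(mps * mask))              # degraded spectrogram
```

For example, a spectrogram carrying only an 8-Hz temporal ripple is abolished by a 4-Hz temporal cutoff but passed intact by a 10-Hz cutoff, while spectral cutoffs leave it untouched.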
We first investigated the importance of STM rates on sentence or melody recognition scores in a behavioral experiment (
Fig. 2A). Native French (
n = 27) and English (
n = 22) speakers were presented with pairs of stimuli and asked to discriminate either the speech or the melodic content. Thus, the stimulus set across the two tasks was identical; only the instructions differed. The degradation of information in the temporal dimension impaired sentence recognition (
t(48) = 13.61, p < 0.001, one-sample
t test against zero of the slope of the linear fit relating behavior to the degree of acoustic degradation) but not melody recognition (
t(48) = 0.62,
p = 0.53), whereas degradation of information in the spectral dimension impaired melody recognition (
t(48) = 8.24, p < 0.001) but not sentence recognition (
t(48) = –1.28,
p = 0.20;
Fig. 2, B and C). This double dissociation was confirmed by a domain-by-degradation interaction (2 × 2 repeated-measures ANOVA:
F(1,47) = 207.04,
p < 0.001). Identical results were observed for the two language groups (see fig. S2 and the supplementary results for complementary analyses).
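The behavioral statistic described above — a one-sample t test against zero of the slope of the linear fit relating recognition scores to degree of degradation — can be sketched as follows. This is a generic reconstruction with hypothetical variable names, not the authors' analysis script.

```python
import numpy as np
from scipy import stats

def slope_ttest(scores, levels):
    """One-sample t test of per-participant degradation slopes against zero.

    scores : (n_subjects, n_levels) recognition scores
    levels : (n_levels,) degree of acoustic degradation
    Returns the scipy result with .statistic and .pvalue fields.
    """
    # slope of the linear fit, estimated separately for each participant
    slopes = np.array([np.polyfit(levels, s, 1)[0] for s in scores])
    return stats.ttest_1samp(slopes, 0.0)
```

A significantly negative slope indicates that performance declines as degradation increases, as reported here for sentence recognition under temporal degradation.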
We then investigated the impact of STM rates on the neural responses to speech and melodies using functional magnetic resonance imaging (fMRI). Blood oxygenation level–dependent (BOLD) activity was recorded while 15 French speakers who had participated in the behavioral experiment listened to blocks of five songs degraded either in the temporal or spectral dimension. Participants attended to either the speech or the melodic content (
Fig. 3A). BOLD signal in bilateral ACs scaled with both temporal and spectral degradation cutoffs [i.e., parametric modulation with quantity of temporal or spectral information;
p < 0.05 familywise error (FWE) corrected;
Fig. 3B and table S1]. These regions were located lateral to primary ACs and correspond to the ventral auditory stream of information processing, covering both parabelt areas and the lateral anterior superior temporal gyrus [parabelt and auditory area 4 (A4) regions; see (
19)], but there was no significant difference in the hemispheric response to either dimension (whole-brain two-sample
t tests; all
p > 0.05).
To investigate more fine-grained encoding of speech and melodic contents, we performed a multivariate pattern analysis on the fMRI data. Ten-category classifications (separately for melodies and sentences) using whole-brain searchlight analyses (support vector machine, leave-one-out cross-validation procedure, cluster corrected) revealed that the neural decoding of sentences significantly depends on neural activity patterns in left A4 [TE.3; subregion of AC; see (
19)], whereas the neural decoding of melodies significantly depends on neural activity patterns in right A4 (
p < 0.05 cluster corrected;
Fig. 3, C and D, and table S1; other, subthreshold clusters are reported in fig. S3). To ensure that this effect was generalizable to the population, we performed a complementary information prevalence analysis within temporal lobe masks (see the materials and methods). For the decoding of sentences, a prevalence value of up to 70% was observed in left A4 (
p = 0.02, corrected), whereas a prevalence value of up to 69% was observed for the decoding of melodies in right A4 (
p = 0.03, corrected; see table S1). Finally, we tested whether the classification accuracy was better for sentence or melody in the right or the left hemisphere. We computed a lateralization index on accuracy scores [(R – L)/(R + L)] and observed a significant asymmetry in opposite directions for the two domains in region A4 (
Fig. 3F, table S1, and fig. S4;
p < 0.05, cluster corrected at the whole-brain level).
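The lateralization index on accuracy scores, [(R − L)/(R + L)], can be written as a small helper; the example accuracy values below are illustrative, not the reported ones.

```python
import numpy as np

def lateralization_index(acc_right, acc_left):
    """(R - L) / (R + L) on decoding accuracies: positive values indicate
    rightward asymmetry, negative values leftward asymmetry."""
    r = np.asarray(acc_right, dtype=float)
    l = np.asarray(acc_left, dtype=float)
    return (r - l) / (r + l)
```

For instance, with hypothetical accuracies of 0.30 (right) and 0.45 (left) the index is negative (left lateralization, as observed for sentences in A4), and the sign flips when the hemispheres are reversed (as for melodies).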
We then tested the relationship between neural specialization of left and right hemispheres for speech and melodic contents and behavioral processing of these two domains. We estimated linear and nonlinear statistical dependencies by computing the normalized mutual information [NMI (
20)] between the confusion matrices extracted from classification of neural data (whole brain, for each searchlight) and those from behavioral data recorded offline (for each participant and each domain). To investigate the correspondence between neural and behavioral patterns (pattern of errors) instead of mere accuracy (diagonal), these analyses were done after removing the diagonal information (
Fig. 4A). NMI was significantly higher in left than right A4 for sentences, whereas the reverse pattern was observed for melodies, as measured by the lateralization index (
p < 0.05, cluster corrected; see the materials and methods, table S1, and fig. S5).
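One plausible way to compute a normalized mutual information between a neural and a behavioral confusion matrix after discarding the diagonal is to discretize the paired off-diagonal (error) entries and normalize the mutual information by the geometric mean of the marginal entropies. The binning scheme and normalization below are our illustrative choices, not necessarily the exact recipe used in the study.

```python
import numpy as np

def nmi_offdiag(conf_a, conf_b, n_bins=4):
    """NMI between the off-diagonal error patterns of two confusion matrices.

    Entries are discretized into quantile bins (an illustrative choice);
    NMI = I(A;B) / sqrt(H(A) * H(B)), in nats.
    """
    mask = ~np.eye(conf_a.shape[0], dtype=bool)      # drop the diagonal (accuracy)
    a, b = conf_a[mask], conf_b[mask]
    edges = np.linspace(0, 1, n_bins + 1)[1:-1]      # interior quantile levels
    qa = np.searchsorted(np.quantile(a, edges), a)   # bin labels 0..n_bins-1
    qb = np.searchsorted(np.quantile(b, edges), b)
    joint = np.zeros((n_bins, n_bins))
    for i, j in zip(qa, qb):                         # joint histogram of bin labels
        joint[i, j] += 1
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / (pa[:, None] * pb[None, :])[nz]))
    ha = -np.sum(pa[pa > 0] * np.log(pa[pa > 0]))
    hb = -np.sum(pb[pb > 0] * np.log(pb[pb > 0]))
    return mi / np.sqrt(ha * hb)
```

By construction, identical error patterns yield an NMI of 1, and unrelated patterns yield values near 0.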
We next tested whether the origin of the observed lateralization was related to attentional processes by investigating the decoding accuracy and NMI lateralization index as a function of attention to sentences or melodies. Whole-brain analyses did not reveal any significant cluster, suggesting that the previously observed hemispheric specialization is robust to attention and thus is more likely to be linked to automatic than to top-down processes (see fig. S6 and the supplementary results for details).
Finally, we investigated whether the hemispheric specialization for speech and melodic contents was directly related to a differential acoustic sensitivity of left and right ACs to STMs, as initially hypothesized. We estimated the impact of temporal or spectral degradations on decoding accuracy by computing the accuracy change (with negative indicating accuracy loss and positive indicating accuracy gain) between decoding accuracy computed on all trials (all degradation types) and on a specific degradation type (temporal or spectral). We observed a domain-by-degradation interaction in bilateral ACs (left and right area A4;
p < 0.05, cluster corrected;
Fig. 4C and fig. S7). For sentences, accuracy loss was observed only in the left A4 for temporal as compared with spectral degradations (
p < 0.001, Tukey corrected; all others,
p > 0.16), whereas the reverse pattern was observed for melodies only in right A4 (
p = 0.003, Tukey corrected; all others,
p > 0.29).
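In schematic form, the accuracy-change measure and the domain-by-degradation contrast can be expressed as below; the dictionary keys and example values are hypothetical, for illustration only.

```python
import numpy as np

def accuracy_change(acc_specific, acc_all):
    """Decoding-accuracy change for one degradation type relative to the
    accuracy computed over all trials; negative values indicate a loss."""
    return np.asarray(acc_specific, dtype=float) - np.asarray(acc_all, dtype=float)

def interaction_contrast(delta):
    """delta maps (domain, degradation) -> accuracy change. A negative value
    indicates that temporal degradation costs sentence decoding more than
    spectral degradation, relative to the same difference for melodies."""
    return ((delta[("sentence", "temporal")] - delta[("sentence", "spectral")])
            - (delta[("melody", "temporal")] - delta[("melody", "spectral")]))
```

Under the pattern reported here (sentence decoding hurt mainly by temporal degradation in left A4, melody decoding hurt mainly by spectral degradation in right A4), this contrast is negative in the left hemisphere.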
This differential sensitivity to acoustical cues in left and right ACs was also observed in the brain–behavior relationship. We investigated the effect of degradations on the NMI lateralization index. We first observed a significant domain-by-degradation interaction in area A4 (
p < 0.05, cluster corrected;
Fig. 4D, left; table S1; and fig. S8). The main effect of degradation (temporal > spectral) was then analyzed with two-sample t tests for each domain, revealing that the NMI lateralization index was affected in opposite directions by temporal and spectral degradations (A4 and dorsal anterior superior temporal sulcus regions; see table S1;
p < 0.05, cluster corrected;
Fig. 4D, right, and fig. S9). Post hoc tests (one-sample
t tests) revealed that for sentences, NMI was left lateralized for spectral degradations (
t(14) = –2.32,
p = 0.03), but the lateralization vanished for temporal degradations (
t(14) = 0.44,
p = 0.66). By contrast, for melodies, NMI was right lateralized for temporal degradations (
t(14) = 3.46,
p = 0.004) and the lateralization vanished for spectral degradations (
t(14) = –0.24,
p = 0.80).
Years of debate have centered on the theoretically important question of the representation of speech and music in the brain (
2,
6,
21). Here, we take advantage of the STM framework to demonstrate rigorously that: (i) perception of speech content is most affected by degradation of information in the temporal dimension, whereas perception of melodic content is most affected by degradation in the spectral dimension (
Fig. 2, B and C); (ii) neural decoding of speech and melodic contents primarily depends on neural activity patterns in the left and right AC regions, respectively (
Fig. 3, C to F, and fig. S4); (iii) in turn, this neural specialization for each stimulus domain is dependent on the specific sensitivity to STM rates of each auditory region (
Fig. 4C and fig. S7); and (iv) the perceptual effect of temporal or spectral degradation on speech or melodic content is mirrored specifically within each hemispheric auditory region (as revealed by mutual information), thereby demonstrating the brain–behavior relationship necessary to conclude that STM features are processed differentially for each stimulus domain within each hemisphere (
Fig. 4D and figs. S8 and S9).
These results extend seminal studies on the robustness of speech comprehension to spectral degradation (
17,
22) and are also consistent with observations that the temporal modulation rate of speech samples from many languages is substantially higher than that of music samples across genres (
23). It remains to be seen whether such a result also applies to other languages, such as tone languages, for which spectral information is arguably more important, and to musical pieces with complex rhythmic and harmonic variations or belonging to musical systems different from the Western tonal melodies used here.
The idea that auditory cognition depends on processing of spectrotemporal energy patterns and that these features often trade off against one another is supported by human psychophysics (
17,
18), recordings from cat inferior colliculus (
13), and human neuroimaging (
6,
7,
15–
17). During passive listening to short, isolated stimuli lacking semantic content, preferences for high spectral versus temporal modulation are distributed along an anterior–posterior dimension of the AC, with relatively weaker hemispheric differences (
6,
7,
15,
16). Our results suggest that this purely acoustic lateralization may be enhanced during the iterative analysis of temporally structured natural stimuli (
24) in the most anterior and inferior auditory (A4) patches, which are known to analyze complex acoustic features and their relationships, or sound categories, thus fitting well with their encoding of relevant speech or musical features (
6,
25,
26). We hypothesize that hemispheric lateralization of STM cues scales with the strength of the dynamical interactions between acoustic and higher-level (motor, syntactic, working memory, etc.) processes, which are typically maximized with complex, cognitively engaging stimuli that require decoding of feature relationships to extract meaning (speech or melodic content), as used here.
More generally, studies across numerous species have indicated a match between ethologically relevant stimulus features and the spectrotemporal response functions of their auditory nervous systems, suggesting efficient adaptation to the statistical properties of relevant sounds, especially communicative ones (
27). This is consistent with the theory of efficient neural coding (
28). Our study shows that in addition to speech, this theory can be applied to melodic information, a form-bearing dimension of music. Humans have developed two means of auditory communication: speech and music. Our study suggests that these two domains exploit opposite extremes of the spectrotemporal continuum, with a complementary specialization of two parallel neural systems, one in each hemisphere, that maximizes the efficiency of encoding of their respective acoustical features.
Acknowledgments
We thank S. Norman-Haignere, A.-L. Giraud, and E. Coffey for comments on a previous version of the manuscript; C. Soden for creating the melodies; A.-K. Barbeau for singing the stimuli; and M. Generale and M. de Francisco for expertise with recording.
Funding: This work was supported by a foundation grant from the Canadian Institute for Health Research to R.J.Z. P.A. is funded by a Banting Fellowship. R.J.Z. is a senior fellow of the Canadian Institute for Advanced Research. B.M.’s research is supported by grants ANR-16-CONV-0002 (ILCB) and ANR-11-LABX-0036 (BLRI) and the Excellence Initiative of Aix-Marseille University (A*MIDEX).
Author contributions: Conceptualization: B.M., P.A., R.J.Z.; Methodology: P.A., L.B., B.M., R.J.Z.; Analysis: P.A., L.B.; Investigation: L.B., P.A.; Resources: R.J.Z.; Writing – original draft: P.A., B.M., R.J.Z.; Writing – review & editing: P.A., L.B., B.M., R.J.Z.; Visualization: P.A.; Supervision: B.M., R.J.Z.
Competing interests: The authors declare no competing interests.
Data and materials availability: Sound files can be found at
www.zlab.mcgill.ca/downloads/albouy_20190815/. A demo of the behavioral task can be found at:
https://www.zlab.mcgill.ca/spectro_temporal_modulations/. Data and code used to generate the findings of this study are accessible online (
29).