
    Robert Carlyon

    Under binaural listening conditions, the detection of target signals within background masking noise is substantially improved when the interaural phase of the target differs from that of the masker. Neural correlates of this binaural masking level difference (BMLD) have been observed in the inferior colliculus and temporal cortex, but it is not known whether degeneration of the inferior colliculus would result in a reduction of the BMLD in humans. We used magnetoencephalography to examine the BMLD in 13 healthy adults and 13 patients with progressive supranuclear palsy (PSP). PSP is associated with severe atrophy of the upper brain stem, including the inferior colliculus, confirmed by voxel-based morphometry of structural MRI. Stimuli comprised in-phase sinusoidal tones presented to both ears at three levels (high, medium, and low) masked by in-phase noise, which rendered the low-level tone inaudible. Critically, the BMLD was measured using a low-level tone presented in opposite ph...
    This study evaluated a speech-processing strategy in which the lowest frequency channel is conveyed using an asymmetric pulse shape and "phantom stimulation", where current is injected into one intra-cochlear electrode and where the return current is shared between an intra-cochlear and an extra-cochlear electrode. This strategy is expected to provide more selective excitation of the cochlear apex, compared to a standard strategy where the lowest-frequency channel is conveyed by symmetric pulses in monopolar mode. In both strategies all other channels were conveyed by monopolar stimulation. The design was a within-subjects comparison between the two strategies, comprising four experiments: (1) discrimination between the strategies, controlling for loudness differences, (2) consonant identification, (3) recognition of lowpass-filtered sentences in quiet, (4) sentence recognition in the presence of a competing speaker. The participants were eight users of the Advanced Bionics CII/Hi-Res 90k cochlear implant. Listeners could easily discriminate between the two strategies but no consistent differences in performance were observed. The proposed method does not improve speech perception, at least in the short term.
    We used an adaptation paradigm to investigate whether the frequency following response (FFR) would show evidence for neurons tuned to modulation rate in humans, as has been previously shown in the inferior colliculus of the macaque using fMRI [Baumann et al. Nat. Neurosci. 14, 423-425 (2011)]. The FFR to a 100-ms, 75-dB SPL, target complex tone with an envelope rate of 213 Hz was measured for ten subjects. The target was preceded by a 200-ms, 75-dB SPL, adaptor complex with an envelope rate of 90, 213, or 504 Hz. All complexes contained alternating-phase harmonics from approximately 3.9 to 5.4 kHz. A "vertical" montage (+ Fz, - C7, ground = mid-forehead) was used, for which the FFR is assumed to reflect phase-locked neural activity from generators in the rostral brainstem. The results showed significant adaptation effects in the spectral magnitude of the 213-Hz envelope-related component of the FFR. However, the identical-rate adaptor did not generally produce more adaptation than the different-rate adaptors. Hence, the present results do not provide evidence for neurons tuned to modulation rate in the human brainstem. [Work supported by Wellcome Trust Grant 088263.]
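The "spectral magnitude of the 213-Hz envelope-related component" mentioned above can be illustrated with a minimal sketch. It projects a waveform onto a complex exponential at one frequency, which is one common way to read out a single spectral component. The abstract does not say how the magnitude was actually computed; the sampling rate, the synthetic test signal, and the function name `component_magnitude` are all assumptions made for illustration.

```python
import cmath
import math


def component_magnitude(samples, sample_rate_hz, freq_hz):
    """Estimate the amplitude of the freq_hz component of a real signal.

    The signal is projected onto a complex exponential at freq_hz
    (a single-frequency discrete Fourier transform).
    """
    n = len(samples)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate_hz)
        for i, x in enumerate(samples)
    )
    # Factor 2 because a real sinusoid splits its energy between +f and -f.
    return 2.0 * abs(acc) / n


# Hypothetical stand-in for a recorded response: 100 ms sampled at 10 kHz,
# containing a 213-Hz component of amplitude 0.5 (arbitrary units).
SAMPLE_RATE_HZ = 10_000
N_SAMPLES = 1_000  # 100 ms at 10 kHz
signal = [
    0.5 * math.cos(2 * math.pi * 213.0 * i / SAMPLE_RATE_HZ)
    for i in range(N_SAMPLES)
]

magnitude_213 = component_magnitude(signal, SAMPLE_RATE_HZ, 213.0)
print(f"Estimated 213-Hz magnitude: {magnitude_213:.3f}")
```

Because 100 ms does not hold a whole number of 213-Hz cycles, the estimate is close to, but not exactly, the true amplitude. Real analyses often window or average across trials, which this sketch omits.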
    Fifteen initially inexperienced subjects were trained for 4 weeks (12 2-h sessions) in frequency discrimination with pure tones around 88, 250, or 1605 Hz, or amplitude modulation rate discrimination of noise bands, using modulation rates around 88 or 250 Hz. Before, in the middle of, and after this training period, pure-tone frequency discrimination thresholds (DLFs), harmonic complex tone fundamental frequency
    This study is concerned with a particular form of perceptual auditory learning, namely the fundamental frequency discrimination of harmonic complex tones, and with how learning generalizes across different stimulation conditions with respect to an essential functional property of the peripheral auditory system, namely cochlear frequency resolution.
    Carlyon and Shackleton [J. Acoust. Soc. Am. 95, 3541-3554 (1994)] presented an influential study supporting the existence of two pitch mechanisms, one for complex tones containing resolved and one for complex tones containing only unresolved components. The current experiments provide an alternative explanation for their finding, namely the existence of across-frequency interference in fundamental frequency (F0) discrimination. Sensitivity (d') was measured for F0 discrimination between two sequentially presented 400 ms complex (target) tones containing only unresolved components. In experiment 1, the target was filtered between 1375 and 15,000 Hz, had a nominal F0 of 88 Hz, and was presented either alone or with an additional complex tone ("interferer"). The interferer was filtered between 125-625 Hz, and its F0 varied between 88 and 114.4 Hz across blocks. Sensitivity was significantly reduced in the presence of the interferer, and this effect decreased as its F0 was moved progressively further from that of the target. Experiment 2 showed that increasing the level of a synchronously gated lowpass noise that spectrally overlapped with the interferer reduced this "pitch discrimination interference (PDI)". In experiment 3A, the target was filtered between 3900 and 5400 Hz and had an F0 of either 88 or 250 Hz. It was presented either alone or with an interferer, filtered between 1375 and 1875 Hz with an F0 corresponding to the nominal target F0. PDI was larger in the presence of the resolved (250 Hz F0) than in the presence of the unresolved (88 Hz F0) interferer, presumably because the pitch of the former was more salient than that of the latter. Experiments 4A and 4B showed that PDI was reduced but not eliminated when the interferer was gated on 200 ms before and off 200 ms after the target, and that some PDI was observed with a continuous interferer. The current findings provide an alternative interpretation of a study supposedly providing strong evidence for the existence of two pitch mechanisms.
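The resolved/unresolved distinction in experiment 3A can be made concrete with a little arithmetic on harmonic numbers. The sketch below lists which harmonics of each F0 fall inside the stated filter passbands. It treats the filters as ideal brick walls and ignores filter skirts, which is a simplifying assumption; the helper name `harmonics_in_band` is hypothetical. As a rule of thumb that is not stated in the abstract, low-numbered harmonics tend to be resolved by the auditory periphery and high-numbered ones do not.

```python
import math


def harmonics_in_band(f0_hz, low_hz, high_hz):
    """Return the harmonic numbers of f0_hz lying within [low_hz, high_hz]."""
    first = math.ceil(low_hz / f0_hz)
    last = math.floor(high_hz / f0_hz)
    return list(range(first, last + 1))


# Interferer passband in experiment 3A: 1375-1875 Hz.
interferer_250 = harmonics_in_band(250, 1375, 1875)  # [6, 7]
interferer_88 = harmonics_in_band(88, 1375, 1875)    # [16, 17, 18, 19, 20, 21]

# Target passband in experiment 3A: 3900-5400 Hz.
target_250 = harmonics_in_band(250, 3900, 5400)      # [16, 17, 18, 19, 20, 21]
target_88 = harmonics_in_band(88, 3900, 5400)        # harmonics 45 to 61

print("250-Hz interferer harmonics:", interferer_250)
print("88-Hz interferer harmonics:", interferer_88)
print("250-Hz target harmonics:", target_250)
print("88-Hz target harmonics:", target_88[0], "to", target_88[-1])
```

Under these assumptions the 250-Hz interferer consists of harmonics 6 and 7, whereas the 88-Hz interferer and both targets contain only harmonics numbered 16 or higher. This is consistent with the abstract's description of the former as resolved and the latter as unresolved.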
    Carlyon et al (2001 Journal of Experimental Psychology: Human Perception and Performance 27 115-127) have reported that the buildup of auditory streaming is reduced when attention is diverted to a competing auditory stimulus. Here, we demonstrate that a reduction in streaming can also be obtained by attention to a visual task or by the requirement to count backwards in threes. In all conditions participants heard a 13 s sequence of tones, and, during the first 10 s saw a sequence of visual stimuli containing three, four, or five targets. The tone sequence consisted of twenty repeating triplets in an ABA - ABA ... order, where A and B represent tones of two different frequencies. In each sequence, three, four, or five tones were amplitude modulated. During the first 10 s of the sequence, participants either counted the number of visual targets, counted the number of (modulated) auditory targets, or counted backwards in threes from a specified number. They then made an auditory-streaming judgment about the last 3 s of the tone sequence: whether one or two streams were heard. The results showed more streaming when participants counted the auditory targets (and hence were attending to the tones throughout) than in either the 'visual' or 'counting-backwards' conditions.
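The timing of the tone sequence follows from the figures in the abstract: twenty triplets in 13 s gives 0.65 s per triplet. The sketch below lays out a tone schedule on that basis. The abstract does not give individual tone durations, so dividing each triplet into four equal slots (A, B, A, and a silent gap for the "-" in "ABA -") is an assumption, as are all names in the code.

```python
SEQUENCE_S = 13.0
N_TRIPLETS = 20
SLOTS_PER_TRIPLET = 4  # assumption: A, B, A, then one silent slot

triplet_s = SEQUENCE_S / N_TRIPLETS       # 0.65 s per triplet
slot_s = triplet_s / SLOTS_PER_TRIPLET    # 0.1625 s per slot (assumed)


def tone_schedule():
    """Return (onset_seconds, label) pairs for every tone in the sequence."""
    events = []
    for triplet in range(N_TRIPLETS):
        for slot, label in enumerate("ABA"):
            onset = triplet * triplet_s + slot * slot_s
            events.append((onset, label))
    return events


schedule = tone_schedule()
print(f"{len(schedule)} tones, {triplet_s:.2f} s per triplet")
for onset, label in schedule[:6]:
    print(f"{onset:6.4f} s  {label}")
```

With this layout the sequence contains 60 tones, and the streaming judgment about the last 3 s covers roughly the final four to five triplets.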
    We used functional magnetic resonance imaging (fMRI) to investigate the neural basis of comprehension and perceptual learning of artificially degraded [noise vocoded (NV)] speech. Fifteen participants were scanned while listening to 6-channel vocoded words, which are difficult for naïve listeners to comprehend, but can be readily learned with appropriate feedback presentations. During three test blocks, we compared responses to potentially
    The dominance region (DR) for pitch was determined for 16- and 200-ms complex tones containing the first seven harmonics of a fundamental frequency (F0) of 250 Hz. A tone was presented with one of the harmonics mistuned upwards or downwards by 3%, followed 500 ms later by a perfectly harmonic tone of the same duration. Listeners adjusted the F0 of the harmonic tone so that its pitch matched that of the mistuned complex. In experiment 1, stimuli were presented monaurally. The DR was significantly higher in harmonic number for the short than for the long duration. The overall sum of the pitch shifts produced by all harmonics was significantly larger for the short than for the long duration, presumably due to stronger perceptual fusion for the former. In experiment 2, the mistuned harmonic was presented only contralaterally to the remainder of the complex. A similar shift in the DR with duration was observed, although the pitch shifts were smaller than for monaural presentation. There was no significant effect of duration on the overall pitch shifts. The results are discussed in terms of pattern recognition and autocorrelation models of pitch perception, and a role of attention in pitch matching is suggested.
    Auditory processing of frequency modulation (FM) was explored. In experiment 1, detection of a π-radian modulator phase shift deteriorated as modulation rate increased from 2.5 to 20 Hz, for 1- and 6-kHz carriers. In experiment 2, listeners discriminated between two 1-kHz carriers, where, mid-way through, the 10-Hz frequency modulator had either a phase shift or increased in depth by ΔD% for half a modulator period. Discrimination was poorer for ΔD = 4% than for smaller or larger increases. These results are consistent with instantaneous frequency being smoothed by a time window with a total duration of about 110 ms. In experiment 3, the central 200-ms of a 1-s 1-kHz carrier modulated at 5 Hz was replaced by noise, or by a faster FM applied to a more intense 1-kHz carrier. Listeners heard the 5-Hz FM continue at the same depth throughout the stimulus. Experiments 4 and 5 showed that, after an FM tone had been interrupted by a 200-ms noise, listeners were insensitive to the phase at which the FM resumed. It is argued that the auditory system explicitly encodes the presence, and possibly the rate and depth, of FM in a way that does not preserve information on FM phase.
    Listeners can detect phase differences between the envelopes of sounds occupying remote frequency regions, and between the fine structures of partials that interact within a single auditory filter. They are insensitive to phase differences between partials that differ sufficiently in frequency to preclude within-channel interactions. A new model is proposed that can account for all three of these findings, and which, unlike currently popular approaches, does not discard across-channel timing information. Sensitivity is predicted quantitatively by analyzing the output of a cochlear model using a spectro-temporal decomposition inspired by responses of neurons in the auditory cortex, and by computing a distance metric between the responses to two stimuli to be discriminated. Discriminations successfully modeled include phase differences between pairs of bandpass filtered harmonic complexes, and between pairs of sinusoidally amplitude modulated tones, discrimination between amplitude and frequency modulation, and discrimination of transient signals differing only in their phase spectra ("Huffman sequences").
    Pitch discrimination interference (PDI) is an impairment in fundamental frequency (F0) discrimination between two sequentially presented complex (target) tones produced by another complex tone (the interferer) that is filtered into a remote spectral frequency region. Micheyl and Oxenham [J. Acoust. Soc. Am. 121, 1621-1631 (2007)] reported a modest PDI for target tones and interferers both containing resolved harmonics when the F0 difference between the two target tones (ΔF0) was small. When the interferer was in a lower spectral region than the target, a much larger PDI was observed when ΔF0 was large (14%-20%), and, under these conditions, performance in the presence of an interferer was worse than at smaller ΔF0s. The present study replicated the occurrence of PDI for complex tones containing resolved harmonics for small ΔF0s. In contrast to Micheyl and Oxenham's findings, performance in the presence of an interferer always increased monotonically with increasing ΔF0. However, when the interferer was in a lower spectral region than the target (and not vice versa), some subjects needed verbal instructions or modified stimuli to choose the correct cue, indicating an asymmetry in spontaneous obviousness of the correct listening cue across conditions.
    Ciocca and Darwin [V. Ciocca and C. J. Darwin, J. Acoust. Soc. Am. 105, 2421-2430 (1999)] reported that the shift in residue pitch caused by mistuning a single harmonic (the fourth out of the first 12) was the same when the mistuned harmonic was presented after the remainder of the complex as when it was simultaneous, even though subjects were asked to ignore the pure-tone percept. The present study tried to replicate this result, and investigated the role of the presence of the nominally mistuned harmonic in the matching sound. Subjects adjusted a "matching" sound so that its pitch equaled that of a subsequent 90-ms complex tone (12 harmonics of a 155-Hz F0), whose mistuned (±3%) third harmonic was presented either simultaneously with or after the remaining harmonics. In experiment 1, the matching sound was a harmonic complex whose third harmonic was either present or absent. In experiments 2A and 2B, the target and matching sound had nonoverlapping spectra. Pitch shifts were reduced both when the mistuned component was nonsimultaneous, and when the third harmonic was absent in the matching sound. The results indicate a shorter than originally estimated time window for obligatory integration of nonsimultaneous components into a virtual pitch.
    Three experiments studied the effect of pulse rate on temporal pitch perception by cochlear implant users. Experiment 1 measured rate discrimination for pulse trains presented in bipolar mode to either an apical, middle, or basal electrode and for standard rates of 100 and 200 pps. In each block of trials the signals could have a level of -0.35, 0, or +0.35 dB re the standard, and performance for each signal level was recorded separately. Signal level affected performance for just over half of the combinations of subject, electrode, and standard rate studied. Performance was usually, but not always, better at the higher signal level. Experiment 2 showed that, for a given subject and condition, the direction of the effect was similar in monopolar and bipolar mode. Experiment 3 employed a pitch comparison procedure without feedback, and showed that the signal levels in experiment 1 that produced the best performance for a given subject and condition also led to the signal having a higher pitch. We conclude that small level differences can have a robust and substantial effect on pitch judgments, and argue that these effects are not entirely due to response biases or to co-variation of place-of-excitation with level.
    Three experiments studied discrimination of changes in the rate of electrical pulse trains by cochlear-implant (CI) users and investigated the effect of manipulations that would be expected to substantially affect the pattern of auditory nerve (AN) activity. Experiment 1 used single-electrode stimulation and tested discrimination at baseline rates between 100 and 500 pps. Performance was generally similar for stimulus durations of 200 and 800 ms, and, for the longer duration, for stimuli that were gated on abruptly or with 300-ms ramps. Experiment 2 used a similar procedure and found that no substantial benefit was obtained by the addition of background 5000-pps "conditioning" pulses. Experiment 3 used a pitch-ranking procedure and found that the range of rates over which pitch increased with increasing rate was not greater for multiple-electrode than for single-electrode stimulation. The results indicate that the limitation on pulse-rate discrimination by CI users, at high baseline rates, is not specific to a particular temporal pattern of the AN response.
    Fifteen subjects were trained during twelve 2-h sessions in either frequency discrimination with pure tones, or amplitude-modulation rate discrimination of noise bands. Thresholds for the discrimination of pure-tone frequency, harmonic complex tone fundamental frequency, and amplitude-modulation rate were measured before, during, and after training. Comparison of pre- and post-training thresholds revealed significant improvements in all conditions in both subjects trained
    Periodic sound waves produce periodic patterns of phase-locked activity in the auditory nerve and in nuclei throughout the auditory brainstem. It has been suggested that this temporal code is the basis for our sensation of pitch. However, some stimuli evoke a pitch without monaural pitch information (temporal or otherwise). Huggins pitch (HP) is produced by presenting the same wideband noise to both ears except for a narrow frequency band which is interaurally decorrelated. “Complex” HP (CHP) can be produced by generating HP components at harmonic frequencies. The frequency-following response (FFR) is an electrophysiological measure of phase locking in the upper brainstem. The FFR was measured for a 300-Hz CHP in a 0-2 kHz noise, and a perceptually similar stimulus comprising a series of narrowband noise (NBN) harmonics of a 300-Hz fundamental presented in a 0-2 kHz background noise at different relative levels of NBN and background. The FFR measurements revealed a phase-locked resp...
    The authors show that a narrowband noise (NBN) is perceived as longer when presented immediately after a wideband noise (WBN), compared to when the WBN is absent. This effect depended on the WBN's frequency spectrum overlapping that of the NBN, and it increased as the duration of the WBN increased up to 300 ms. It decreased when a silent gap was introduced between the WBN and NBN, but remained significant for an easily detectable gap of 40 ms. A correlate of the effect was observed in the mismatch negativity (MMN) to a deviant stimulus, consisting of a WBN + NBN, presented in a sequence of more common isolated WBNs. The MMN latency was longer for an on-frequency than for an off-frequency WBN; and, more importantly, the size of this difference correlated across participants with the difference in perceived duration. A rhythm-adjustment experiment showed that the presence of an on-frequency WBN immediately preceding a tone caused that tone to be heard as starting earlier than when the WBN was absent. The results are discussed in relation to the continuity illusion and models of duration encoding.
    Two experiments used simulations of cochlear implant hearing to investigate the use of temporal codes in speech segregation. Sentences were filtered into six bands, and their envelopes used to modulate filtered alternating-phase harmonic complexes with rates of 80 or 140 pps. Experiment 1 showed that identification of single sentences was better for the higher rate. In experiment 2, maskers (time-reversed concatenated sentences) were scaled by -9 dB relative to a target sentence, which was added with an offset of 1.2 s. When the target and masker were each processed on all six channels, and then summed, processing the masker on a different rate to the target improved performance only when the target rate was 140 pps. When the target sentence was processed on the odd-numbered channels and the masker on the even-numbered channels, or vice versa, performance was worse overall, but showed similar effects of pulse rate. The results, combined with recent psychophysical evidence, suggest that differences in pulse rate are unlikely to prove useful for concurrent sound segregation.
