IEEE Trans Biomed Eng. 2017 Aug;64(8):1896-1905.
doi: 10.1109/TBME.2016.2628884. Epub 2016 Nov 15.

Dynamic Estimation of the Auditory Temporal Response Function From MEG in Competing-Speaker Environments


Sahar Akram et al. IEEE Trans Biomed Eng. 2017 Aug.

Abstract

Objective: A central problem in computational neuroscience is to characterize brain function, with statistical confidence, using neural activity recorded from the brain in response to sensory inputs. Most existing estimation techniques, such as those based on reverse correlation, exhibit two main limitations: first, they are unable to produce dynamic estimates of the neural activity at a resolution comparable with that of the recorded data, and second, they often require heavy averaging across time as well as multiple trials in order to construct statistical confidence intervals for a precise interpretation of data. In this paper, we address the above-mentioned issues for estimating the auditory temporal response function (TRF) as a parametric computational model for selective auditory attention in competing-speaker environments.

Methods: The TRF is a sparse kernel which regresses auditory MEG data with respect to the envelopes of the speech streams. We develop an efficient estimation technique by exploiting the sparsity of the TRF and adopting an ℓ1-regularized least squares estimator which is capable of producing dynamic TRF estimates as well as confidence intervals at sampling resolution from single-trial MEG data.
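To make the estimator in the Methods concrete, the following is a minimal sketch of a sparse kernel recovered by ℓ1-regularized least squares. It is a static, batch illustration on simulated data, not the authors' dynamic algorithm, and it does not compute confidence intervals. The solver (iterative soft thresholding), the function names, the lag count, the noise level, and the regularization weight `lam` are all assumptions chosen for illustration.

```python
import numpy as np


def build_lagged_design(envelope, n_lags):
    """Column k holds the envelope delayed by k samples (zero-padded at the start)."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = envelope[: n - k]
    return X


def soft_threshold(v, t):
    """Proximal operator of the l1 norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by iterative soft thresholding."""
    # Step size is the inverse Lipschitz constant of the least-squares gradient.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w


# Simulated stand-ins for a speech envelope and a single-channel neural response.
rng = np.random.default_rng(0)
n_samples, n_lags = 2000, 40
envelope = rng.standard_normal(n_samples)

# Hypothetical sparse kernel with two peaks of opposite sign.
true_trf = np.zeros(n_lags)
true_trf[5] = 1.0
true_trf[12] = -1.5

X = build_lagged_design(envelope, n_lags)
response = X @ true_trf + 0.1 * rng.standard_normal(n_samples)

estimated_trf = lasso_ista(X, response, lam=20.0)
```

In this toy setting the ℓ1 penalty drives the lags that carry no signal to zero or near zero while retaining the two true peaks with a small shrinkage bias. A dynamic variant would re-estimate the kernel as new samples arrive, which this sketch does not attempt.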

Results: We evaluate the performance of our proposed estimator using evoked MEG responses from the human brain in an auditory attention experiment with two competing speakers. The TRFs are estimated dynamically over time using the proposed technique with multisecond resolution, which is a significant improvement over previous results with a temporal resolution of the order of a minute.

Conclusion: Application of our method to MEG data reveals a precise characterization of the modulation of M50 and M100 evoked responses with respect to the attentional state of the subject at multisecond resolution.

Significance: Our proposed estimation technique provides a high-resolution, real-time attention decoding framework for multispeaker environments, with potential application in smart hearing aid technology.


Figures

Fig. 1
Schematic depiction of dynamic TRF estimation using the evoked MEG response and the speech envelopes of the speakers. Here, the auditory scene consists of a mixture of two concurrent speech streams, and the subject is attending to the first speaker. Earlier studies demonstrated that the TRF component corresponding to the M100 response is significantly larger for the attended vs. the unattended speaker.
Fig. 2
(A) MEG magnetic field map for the first DSS component of a sample subject. A stereotypical pattern of neural activity, originating separately in the left and right auditory cortices, is observed. Black arrows schematically represent the locations of the equivalent dipole currents generating the measured magnetic field. The τ̂M50,n (B1) and τ̂M100,n (B2) response amplitude differences for the attended (with superscript att) vs. unattended (with superscript unatt) conditions are computed for each of the two trials from 5 participants. Each box plot indicates the statistics of τ̂M50,n and τ̂M100,n response differences in the constant-attention trials, for all the time points which significantly differ from zero at a confidence level of 95%. On average over the trial duration and at a confidence level of 95%, the τ̂M50,n component of the attended TRF shows a significant change compared to its unattended counterpart in only 4 out of 10 trials (3 decreases and 1 increase), whereas |τ̂M100,n| is significantly increased in 8 out of 10 trials. The upward (resp. downward) arrows indicate the trials for which a significant increase (resp. decrease) in the TRF component differences is observed. Response differences for τ̂M50,n (C1) and τ̂M100,n (C2) are also plotted over time to demonstrate trackability of response differences with high temporal resolution, along with confidence intervals around each estimation point, using the proposed algorithm. Different colors indicate results from different subjects, where each trace is the averaged response difference over all three trials of each attended condition (speaker 1 or 2), in the constant-attention experiment. The τ̂M50,n^att and τ̂M50,n^unatt amplitude differences are not significantly different from zero over 81% of the trials; for the remaining 19% they were split almost equally between |τ̂M50,n^unatt| < |τ̂M50,n^att| (9%) and |τ̂M50,n^unatt| > |τ̂M50,n^att| (10%).
The τ̂M100,n amplitude differences are significantly positive in 76% of the trials. For the remaining 24%, the τ̂M100,n^att and τ̂M100,n^unatt amplitude differences are not significantly different from zero. In summary, the amplitude differences of the τ̂M50 responses for both the attended and unattended TRFs are not significantly different from zero, whereas for the τ̂M100 responses, there are significantly positive amplitude differences between the attended and unattended TRFs. The vertical double-headed arrows indicate the extent of the average amplitude differences between the attended and unattended TRF components at the end of the trial. Error hulls indicate 95% confidence intervals around the estimated parameters.
Fig. 3
Amplitude comparison for (A1) τ̂M50,n and (A2) τ̂M100,n components during attended and unattended conditions. Each circle corresponds to a single trial, and the constant-attention and attention-switch conditions are color coded by red and blue circles, respectively. For each trial, the time fractions in which the amplitude of the auditory component is significantly larger (y-axis) or smaller (x-axis) in attended vs. unattended TRFs are computed in percentage. The dashed line (45° line) corresponds to the condition that the TRF component is not modulated by selective attention. For the τ̂M100,n component, the amplitude differences are above the 45° n modulation in τ̂M100,n amplitudes; however, for the τ̂M50,n response, the results are more symmetrically scattered above (42%) and below (58%) the 45° line, implying no significant selectivity to attention. Scatter plots showing the corresponding results obtained via the cross-correlation method are presented for the τ̂M50,n (B1) and τ̂M100,n (B2) components. Cross-correlation was performed on each trial using a sliding time window of length 5 s with 4.5 s overlap between the successive windows. No attention modulation pattern is detected in either the τ̂M50,n or the τ̂M100,n responses.
Fig. 4
Scatter plot of the correlations between the MEG signal and the model predictions of the attended and unattended speakers using the commonly-used static TRF estimates obtained by the least squares technique. Each circle corresponds to a single trial, and the constant-attention and attention-switch conditions are color coded by red and blue circles, respectively. For each trial, the time fractions in which the correlation values are significantly larger (y-axis) or smaller (x-axis) for the attended vs. unattended speakers are computed in percentage. The dashed line (45° line) corresponds to the condition that the correlations are not modulated by selective attention.
Fig. 5
Tracking the attentional state through the estimated τ̂M100,n amplitudes. Results are shown for a sample subject. Bottom panel: the amplitudes of the τ̂M100,n response for the estimated TRFs from speaker 1 and speaker 2 are plotted as a function of time (5 to 55 s) in red and green, respectively. According to the τ̂M100,n amplitude comparisons, the attention switch occurs at around 15 s after the onset of the trial. Top panel: The TRF estimates for both speakers at times 21 s and 42 s are shown in the insets A and B, respectively. The putative temporal locations of the τ̂M100,n components are indicated by the dashed lines in each subplot. Error hulls indicate 95% confidence intervals around the estimated parameters.
