PLoS Biol. 2018 May 29;16(5):e2003663.
doi: 10.1371/journal.pbio.2003663. eCollection 2018 May.

A novel unsupervised analysis of electrophysiological signals reveals new sleep substages in mice

Vasiliki-Maria Katsageorgiou et al. PLoS Biol.

Abstract

Sleep science is entering a new era, thanks to new data-driven analysis approaches that, combined with mouse gene-editing technologies, show promise in functional genomics and translational research. However, the investigation of sleep is time consuming and not suitable for large-scale phenotypic datasets, mainly because it relies on subjective manual annotation of electrophysiological states. Moreover, the heterogeneous nature of sleep, with all its physiological aspects, is not fully accounted for by the current system of sleep stage classification. In this study, we present a new data-driven analysis approach offering a plethora of novel features for the characterization of sleep. This approach allowed us to identify several substages of sleep that were hidden from standard analysis. For each of these substages, we report an independent set of homeostatic responses following sleep deprivation. Using our new substage classification, we identified novel differences among various genetic backgrounds. Moreover, in a specific experiment with the Zfhx3 mouse line, a recent circadian mutant exhibiting both a shortened circadian period and abnormal sleep architecture, we identified specific sleep states that account for genotypic differences at specific times of the day. These results add a further level of interaction between the circadian clock and sleep homeostasis and indicate that dissecting sleep into multiple states is physiologically relevant and can lead to the discovery of new links between sleep phenotypes and genetic determinants. Therefore, our approach has the potential to significantly enhance the understanding of sleep physiology through the study of single mutations. Moreover, this study paves the way to systematic high-throughput analyses of sleep.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Multisubject data analysis pipeline.
(A) Data collection and preprocessing per subject; FFT was applied to EEG signals to derive the power histogram of 5 bands (Alpha, Beta, Delta, Gamma, and Theta), and the ratios between all possible pairwise combinations of the 5 EEG bands were computed. For the EMG, we used its integral values. Each individual dataset was preprocessed by taking the natural logarithm of the data and zeroing the mean of each feature separately. (B) Scatter plot of the log-(Delta/Theta) and the log-EMG features showing that the samples of one subject form more than 3 clusters. Notice that REM sleep is characterized by at least 2 substages. Data points belong to a single-subject time series. Blue, green, and red samples are associated with wakefulness, NREM sleep, and REM sleep, respectively, as hand-labeled by the expert. (C) Data modeling: time series of EEG/EMG data of all subjects were concatenated into a single dataset and modeled with an mcRBM, resulting in a latent-variable representation of the input data. (D) Evolution of the activity of the mcRBM’s latent variables while processing an input time series in which, for visualization purposes, the sequence of states has been grouped according to the 3 known stages, i.e., NREM (NR), REM (R), and wakefulness (W), as defined by the manual scoring. Time flows from left to right, with each pixel-wide column representing 1 time step (4 s epoch) and each pixel-wide row representing 1 latent variable. Black pixels represent nonactive units, while white pixels represent active ones. The model mostly assigns similar configurations to epochs that belong to the same sleep stage, which is why blocks that belong to the same stage look similar to each other. Notice that units that are active in REM are generally not active in NREM and wakefulness (units in yellow frame). With the mcRBM, we are fitting Gaussian distributions to the data. Hence, each single binary representation is associated with a multivariate normal distribution [8]. EEG, electroencephalography; EMG, electromyography; FFT, fast Fourier transformation; mcRBM, mean-covariance restricted Boltzmann machine; NREM, non-rapid eye movement; REM, rapid eye movement.
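The preprocessing described in panel A can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the 5 band powers and the EMG integral have already been computed for each 4 s epoch, it takes "all pairwise combinations" to mean the 10 unordered band pairs, and the names `epoch_features` and `zero_mean` as well as the sample values are hypothetical.

```python
import math
from itertools import combinations

# The 5 EEG bands named in the caption.
BANDS = ("alpha", "beta", "delta", "gamma", "theta")

def epoch_features(band_power, emg_integral):
    """Natural log of every pairwise band-power ratio, plus log-EMG, for one epoch."""
    feats = {}
    for a, b in combinations(BANDS, 2):  # 10 unordered pairs (assumed reading)
        feats[f"log({a}/{b})"] = math.log(band_power[a] / band_power[b])
    feats["log(emg)"] = math.log(emg_integral)
    return feats

def zero_mean(rows):
    """Subtract each feature's mean, computed over the epochs of one subject."""
    names = list(rows[0])
    means = {n: sum(r[n] for r in rows) / len(rows) for n in names}
    return [{n: r[n] - means[n] for n in names} for r in rows]

# Two made-up epochs for a single subject: (band powers, EMG integral).
epochs = [
    ({"alpha": 2.0, "beta": 1.0, "delta": 8.0, "gamma": 0.5, "theta": 4.0}, 3.0),
    ({"alpha": 1.0, "beta": 1.5, "delta": 2.0, "gamma": 0.8, "theta": 6.0}, 9.0),
]
features = zero_mean([epoch_features(p, e) for p, e in epochs])
```

Each epoch ends up with 11 features (10 log-ratios and log-EMG), and every feature has zero mean within the subject before the subjects are concatenated.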
Fig 2. Evaluation of the observed substages.
(A) Probability matrix of LS being associated with the known sleep stages. Rows refer to LS and columns to wakefulness, NREM, and REM sleep stages. Color shades range from blue, representing low probability, to red, representing high probability, for the corresponding LS to belong to each of the 3 known stages. The substages in the graph were ordered using agglomerative hierarchical clustering. (B) Analysis of the NMI between the observed substages and the 3 stages. Each bar represents the NMI per stage. The rightmost bar corresponds to the ratio obtained by treating the “substages” and the “three stages” as 2 random variables. (C) Examples of distributions of the input features associated with 3 substages corresponding with high probability to the NREM stage. The top row shows the distributions of the input samples in a box plot. The bottom row shows the covariance between the input variables. EMG, electromyography; LS, latent states; MI, mutual information; NMI, normalized mutual information; NREM, non-rapid eye movement; REM, rapid eye movement.
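As a rough illustration of panels A and B, the sketch below builds a row-normalized matrix of P(stage | latent state) from paired labels and computes a normalized mutual information between the two labelings. The caption does not state which normalization was used, so dividing by the geometric mean of the two entropies is an assumption, and the function names and toy labels are hypothetical.

```python
import math
from collections import Counter

STAGES = ("W", "NREM", "REM")

def stage_probabilities(latent, stages):
    """P(stage | latent state), one row per latent state, columns in STAGES order."""
    counts = Counter(zip(latent, stages))
    totals = Counter(latent)
    return {ls: [counts[(ls, s)] / totals[ls] for s in STAGES]
            for ls in sorted(totals)}

def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

def mutual_information(x, y):
    """Mutual information (in nats) between two paired label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def nmi(x, y):
    """MI divided by sqrt(H(x) * H(y)); this normalization is an assumed choice."""
    denom = math.sqrt(entropy(x) * entropy(y))
    return mutual_information(x, y) / denom if denom > 0 else 0.0

# Toy example: 8 epochs with a latent-state id and a manual stage label each.
latent = [0, 0, 1, 1, 2, 2, 2, 3]
manual = ["W", "W", "NREM", "NREM", "REM", "REM", "NREM", "W"]
probs = stage_probabilities(latent, manual)
score = nmi(latent, manual)
```

In this toy data, latent state 2 falls in REM for two of its three epochs, so its row has no entry close to 1, unlike the rows of states 0, 1, and 3.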
Fig 3. Examples of 3 substages that are characteristic of the 3 mouse genetic backgrounds.
Box plots describing, for each strain, the statistics of the number of epochs that fall in the corresponding substage. Arrows below each graph show the p-values for the 2-sample independent t test computed on the distributions shown above. (A) A substage associated with the wakefulness stage for which the C57BL/6J mouse group is significantly different from the other 2 groups (p-values < 0.05). (B) A substage mapping to the NREM stage that is characteristic of the CD1 strain. (C) A substage associated with NREM sleep that is characteristic of the mixed background group. LS, latent state; NREM, non-rapid eye movement; REM, rapid eye movement.
Fig 4. Transitions’ probabilities between substages and daily profiles analysis.
Top: state-transition graph in which nodes have been clustered according to the transition probabilities. Nodes correspond to the substages, and edges identify the transitions between stages. Nodes’ size is related to their in-degree. Blue, green, and red nodes are associated with substages mapping with high probability to wakefulness, NREM sleep, and REM sleep, respectively. Purple nodes correspond to substages that do not have a clear mapping to 1 of the 3 known stages (probability lower than 0.6 for any stage). Edges’ weight is related to the corresponding transition probability, and their color is related to the source node. Graphs were built using the ForceAtlas2 algorithm [15,16] (see also interactive graphs at http://pavis.iit.it/datasets/mouse-sleep-analysis). Bottom: examples of daily profiles associated with NREM sleep, in which the profile of C57BL/6J mouse group is slightly shifted. Histograms describe the distribution of epochs using bins of 1 hour. Polynomial curves are fitted to the histograms to obtain a simplified visual representation of daily oscillations. LS, latent state; NREM, non-rapid eye movement; REM, rapid eye movement; ZT, zeitgeber time.
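The quantities behind the state-transition graph can be sketched as follows: transition probabilities estimated from a sequence of substage labels, the in-degree used for node size, and the node color rule with the 0.6 threshold from the caption. Whether self-transitions were counted is not stated, so excluding them by default is an assumption; the function names and the toy sequence are hypothetical, and the ForceAtlas2 layout itself is not reproduced here.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequence, include_self=False):
    """Estimate P(next substage | current substage) from consecutive epochs."""
    counts = defaultdict(Counter)
    for src, dst in zip(sequence, sequence[1:]):
        if include_self or src != dst:  # skipping self-loops is an assumption
            counts[src][dst] += 1
    return {src: {dst: c / sum(row.values()) for dst, c in row.items()}
            for src, row in counts.items()}

def in_degree(probs):
    """Number of distinct source substages with an edge into each node."""
    degree = Counter()
    for row in probs.values():
        for dst in row:
            degree[dst] += 1
    return degree

def node_color(stage_probs, threshold=0.6):
    """Blue/green/red for W/NREM/REM; purple if no stage reaches the threshold."""
    colors = {"W": "blue", "NREM": "green", "REM": "red"}
    stage, p = max(stage_probs.items(), key=lambda item: item[1])
    return colors[stage] if p >= threshold else "purple"

# Toy sequence of substage labels, one per epoch.
sequence = ["a", "a", "b", "c", "b", "a", "c", "c", "b"]
probs = transition_probabilities(sequence)
degree = in_degree(probs)
```

Here `probs["c"]` is `{"b": 1.0}` because every non-self transition out of `c` goes to `b`, and `b` has in-degree 2 because both `a` and `c` lead into it.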
Fig 5. Temporal profile of REM-like states.
(A) Analysis of the peaks during the light phase (07:00–19:00) identified 2 populations of latent states, presenting peaks either in the first or the second half of the subjective night. Top graphs are associated with the first experiment (data coming from the 3 wild-type genetic backgrounds). Bottom graphs are associated with the second experiment (Zfhx3Sci/+ mutant strain versus Zfhx3+/+ littermate controls). In both experiments, the majority of the substages that peak in the second half of the light phase have a high probability of being classified as REM (and are therefore called REM-like). The substages classified as NREM are equally divided between the first and second half of the light phase. (B) Comparison between manually annotated REM epochs and the REM-like states generated by our approach that peak in the second half of the light phase, in the 2 experiments. Graph shows the percentage of epochs labeled as REM (refer also to S1 Data). NREM, non-rapid eye movement; REM, rapid eye movement; ZT, zeitgeber time.
Fig 6. Daily behavior analysis in the circadian mutants.
(A) Profiles per sleep stage of Zfhx3Sci/+ versus Zfhx3+/+, using the manually scored epochs. (B) Examples of substages mapping with high probability to wakefulness. Left: the Zfhx3Sci/+ mutants show a reduced profile compared to the Zfhx3+/+. Right: the profiles of both groups are flat during the dark phase. As the bottom graphs show, the distributions of the input variables also differ across the 2 substages. (C) Examples of substages mapping with high probability to NREM. Similarly to B, there are substages where the Zfhx3Sci/+ mutants show a reduced profile compared to the Zfhx3+/+ (left), as well as substages where the profiles of both groups are flat during the dark phase (right). In this case too, the distributions of the input variables across the 2 substages are different (bottom graphs). (D) Top left and middle graphs: examples of substages mapping with high probability to REM, in which the profile of Zfhx3Sci/+ is advanced compared to that of Zfhx3+/+. Top right: example of a substage in which no difference is observed across the 2 groups. Bottom: differences in the input data distributions can be observed across the 3 examples. EMG, electromyography; LS, latent state; NREM, non-rapid eye movement; REM, rapid eye movement; ZT, zeitgeber time.
Fig 7. Homeostatic responses of latent states.
(A) Percentage of peak rebounds at ZT 6 (immediate), between ZT 6 and ZT 12 (intermediate), and beyond ZT 12 (late). Within each category, the distribution of wake (blue), NREM (green), and REM (red) stages according to the manual annotation is represented (refer also to S2 Data). (B) Analysis of responses in Zfhx3 mutants versus wild-type mice. Each dot describes the difference between the rebound and the baseline phase of a latent state for the same period of the day. Examples of latent-state occurrences are presented for states presenting no differences between mutants and wild-types (C); for states in which mutants present a reduced response (D); for states in which mutants present a higher response compared to wild-types (E); and for states in which neither group responds and which are therefore not homeostatically regulated (F). Note that the shaded region in the plots C-F corresponds to the period of sleep deprivation. NREM, non-rapid eye movement; REM, rapid eye movement; ZT, zeitgeber time.
Fig 8. Graphical model associated with the RBM.
The visible variables are represented by v (in our case, the ratios between the EEG frequency bands and the EMG), h are the latent variables, and w are the weights on the undirected connections between visible and hidden units. EEG, electroencephalography; EMG, electromyography; RBM, restricted Boltzmann machine.
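To make the roles of v, h, and w concrete, the sketch below computes the probability that each latent unit is active given a visible vector, using the standard RBM conditional p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i w_ij). This form holds for binary visible units and for Gaussian visible units with unit variance. The hidden biases `b_hidden` do not appear in the caption and are an assumption here, as are the function names and numeric values.

```python
import math

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probabilities(v, w, b_hidden):
    """p(h_j = 1 | v) for every hidden unit j, with w[i][j] linking v_i to h_j."""
    return [
        sigmoid(b_j + sum(v_i * w_i[j] for v_i, w_i in zip(v, w)))
        for j, b_j in enumerate(b_hidden)
    ]

# Two visible features and two hidden units (made-up numbers).
v = [1.0, -1.0]
w = [[2.0, 0.0],
     [1.0, 0.0]]
b_hidden = [0.0, 0.0]  # hidden biases: assumed, not shown in the caption
p_h = hidden_probabilities(v, w, b_hidden)
```

The first hidden unit receives a total input of 2.0 - 1.0 = 1.0 and so has activation probability sigmoid(1.0); the second unit has zero weights and bias and sits at 0.5.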
Fig 9. Graphical model associated with the mcRBM.
v represents the visible variables (in our case, the EEG frequency bands and the EMG), hm the latent variables modeling the mean of variables v, and hc those modeling their covariance. F is the number of factors. W is the 2D matrix of the weights on the undirected connections between v and hm variables. C is the weight matrix of the connections from the v variables to the factors (F). P is the weight matrix of the connections from the factors (F) to the hc. EEG, electroencephalography; EMG, electromyography; mcRBM, mean-covariance restricted Boltzmann machine.
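The path from v through the factors to the covariance units can be sketched as follows. It follows one common formulation of the mcRBM, in which each factor squares its filter response to v and the covariance units pool these squared responses through P. The sign convention (entries of P non-positive, so that large squared responses switch a unit off), the factor of 1/2, and the bias vector `b_cov` are assumptions not given in the caption, and the function names and numbers are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def squared_factor_responses(v, C):
    """(sum_i v_i * C[i][f]) ** 2 for each factor f, with C[i][f] linking v_i to f."""
    n_factors = len(C[0])
    return [sum(v[i] * C[i][f] for i in range(len(v))) ** 2
            for f in range(n_factors)]

def covariance_unit_probabilities(v, C, P, b_cov):
    """p(hc_k = 1 | v) = sigmoid(b_k + 0.5 * sum_f P[f][k] * response_f ** 2).

    Assumes P[f][k] <= 0, so a strong factor response lowers the activation.
    """
    sq = squared_factor_responses(v, C)
    return [
        sigmoid(b_cov[k] + 0.5 * sum(P[f][k] * sq[f] for f in range(len(P))))
        for k in range(len(b_cov))
    ]

# Two visible features, F = 2 factors, two covariance units (made-up numbers).
v = [1.0, 2.0]
C = [[1.0, 0.0],
     [0.0, 1.0]]      # each factor simply reads one visible variable
P = [[-1.0, 0.0],
     [0.0, -1.0]]     # non-positive pooling weights (assumed convention)
b_cov = [0.5, 2.0]    # biases: assumed, not shown in the caption
p_hc = covariance_unit_probabilities(v, C, P, b_cov)
p_hc_rest = covariance_unit_probabilities([0.0, 0.0], C, P, b_cov)
```

With these values both covariance units sit at exactly 0.5 for `v`, while for an all-zero input they stay mostly on, since nothing counteracts their positive biases. The mean units hm would follow the ordinary RBM conditional, with W in place of w.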

References

    1. Hobson JA. A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects: Rechtschaffen A. and Kales A. (Editors). (Public Health Service, US Government Printing Office, Washington, DC, 1968, 58 p., $4.00). Electroencephalography and Clinical Neurophysiology. 1969 June 1;26(6):644.
    2. Younes M, Raneri J, Hanly P. Staging sleep in polysomnograms: analysis of inter-scorer variability. Journal of Clinical Sleep Medicine. 2016 June 15;12(6):885.
    3. Rosenberg RS, Van Hout S. The American Academy of Sleep Medicine inter-scorer reliability program: sleep stage scoring. Journal of Clinical Sleep Medicine. 2013 January 15;9(1):81.
    4. Penzel T, Zhang X, Fietze I. Inter-scorer reliability between sleep centers can teach us what to improve in the scoring rules. Journal of Clinical Sleep Medicine. 2013 January 15;9(1):89–91.
    5. Müller B, Gäbelein WD, Schulz H. A taxonomic analysis of sleep stages. Sleep. 2006 July 1;29(7):967–74.
