PLoS One. 2018 Nov 29;13(11):e0207177.
doi: 10.1371/journal.pone.0207177. eCollection 2018.

Data representations and -analyses of binary diary data in pursuit of stratifying children based on common childhood illnesses

Johan de Rooi et al. PLoS One. 2018.

Abstract

In this article we analyse diary reports of childhood symptoms of illness. These data are part of a larger study on childhood asthma that includes other types of measurements. The children are followed for three years, and the parents update the diaries daily. Here we focus on the methodological implications of analysing such data. We investigate two ways of representing the data and explore which tools are applicable to each representation. The first representation relies on proper alignment and point-by-point comparison of the signals. The second takes into account combinations of symptoms on a day-by-day basis and boils down to the analysis of counts. In the present case both methods are applicable. More generally, however, when symptom episodes occur at more random locations in time, a point-by-point comparison becomes less applicable and shape-based approaches will fail to produce satisfactory results. In such cases, pattern-based methods will be of much greater use. The pattern-based representation focuses on recurring patterns and ignores ordering in time. With this representation we stratify the data at the level of years, so that possible yearly differences can still be detected.
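
The pattern-based representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 365-day year length, and the toy diary are assumptions made for illustration, and the three symptom names are taken from Fig 1. Each day is a tuple of binary scores, and the representation is simply a count of how often each daily combination occurs within a year.

```python
from collections import Counter

# Symptom order assumed for each daily tuple (names taken from Fig 1).
SYMPTOMS = ("cold", "cough", "wheeze")


def pattern_counts(diary):
    """Count how often each daily symptom combination occurs.

    diary: list of (cold, cough, wheeze) tuples with 0/1 entries, one per day.
    Returns a Counter keyed by the combination; the order of days is ignored.
    """
    return Counter(tuple(day) for day in diary)


def counts_by_year(diary, days_per_year=365):
    """Split a diary into consecutive years and count patterns within each."""
    return [
        pattern_counts(diary[start:start + days_per_year])
        for start in range(0, len(diary), days_per_year)
    ]


# Hypothetical toy diary, using 4 days per "year" to keep the example short.
toy = [
    (1, 1, 0), (1, 0, 0), (0, 1, 1), (0, 0, 0),  # year 1
    (0, 0, 0), (1, 1, 0), (1, 1, 0), (0, 0, 0),  # year 2
]
yearly = counts_by_year(toy, days_per_year=4)
for year, counts in enumerate(yearly, start=1):
    print(year, dict(counts))
```

Because only counts are kept, reordering the days within a year leaves the representation unchanged, whereas a point-by-point comparison of two diaries would be affected by such a reordering.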

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The observed binary scores for three symptoms measured over the first year of the life of one child.
Three (arbitrary) periods are highlighted by the grey bars: one with both cold and cough symptoms (A), a second with only cold symptoms (B), and a third marking a number of days with cough and wheezing symptoms (C).
Fig 2. Inspecting the missing values in the diaries.
The upper left panel shows that almost 60 percent of the diaries are complete. The missing values are clearly related to the time the child and parents have been enrolled in the study, as can be seen in the upper right panel. The bottom panel shows that there is no relation between symptom burden and drop-out.
Fig 3. A block diagram showing the number of samples included in both representations.
Fig 4. A schematic presentation of the alignment of two diaries.
Unaligned data (A); two diaries centered around January the first, neglecting year of birth (B); and centered and restructured samples, neglecting year of birth and the serial correlation of a single data series (C).
Fig 5. The monthly average scores for all symptoms calculated over all children.
Fig 6. Results of a three component NMF applied to the matrix X̌.
Fig 7. The components, resulting from the INDSCAL model, plotted against each other.
