Comparative Study
J Biol Rhythms. 2019 Aug;34(4):347-363.
doi: 10.1177/0748730419850917. Epub 2019 Jun 12.

Comparing Methods for Measurement Error Detection in Serial 24-h Hormonal Data

Evie van der Spoel et al. J Biol Rhythms. 2019 Aug.

Abstract

Measurement errors commonly occur in 24-h hormonal data and may affect the outcomes of such studies. Measurement errors often appear as outliers in such data sets; however, no well-established method is available for their automatic detection. In this study, we aimed to compare performances of different methods for outlier detection in hormonal serial data. Hormones (glucose, insulin, thyroid-stimulating hormone, cortisol, and growth hormone) were measured in blood sampled every 10 min for 24 h in 38 participants of the Leiden Longevity Study. Four methods for detecting outliers were compared: (1) eyeballing, (2) Tukey's fences, (3) stepwise approach, and (4) the expectation-maximization (EM) algorithm. Eyeballing detects outliers based on experts' knowledge, and the stepwise approach incorporates physiological knowledge with a statistical algorithm. Tukey's fences and the EM algorithm are data-driven methods, using interquartile range and a mathematical algorithm to identify the underlying distribution, respectively. The performance of the methods was evaluated based on the number of outliers detected and the change in statistical outcomes after removing detected outliers. Eyeballing resulted in the lowest number of outliers detected (1.0% of all data points), followed by Tukey's fences (2.3%), the stepwise approach (2.7%), and the EM algorithm (11.0%). In all methods, the mean hormone levels did not change materially after removing outliers. However, their minima were affected by outlier removal. Although removing outliers affected the correlation between glucose and insulin on the individual level, when averaged over all participants, none of the 4 methods influenced the correlation. Based on our results, the EM algorithm is not recommended given the high number of outliers detected, even where data points are physiologically plausible. 
Since Tukey's fences is not suitable for all types of data and eyeballing is time-consuming, we recommend the stepwise approach for outlier detection, which combines physiological knowledge and an automated process.

Keywords: automatic outlier detection; hormones; measurement error; outlier; time series.

Conflict of interest statement

The authors have no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
(a) Eyeballing detects outliers without fitting smooth curves. By visual inspection, individual experts detect outliers, taking into account that some hormones were measured in the same sample. Afterward, a consensus meeting is held, and the experts discuss all data points with conflicting detection results. (b) Tukey’s fences starts with fitting a moving average curve to per-person per-hormone data and taking residuals of all data points. Then the interquartile range (IQR = Q3 − Q1) of the residuals is calculated. The data points lying outside the range between Q1 − 3IQR and Q3 + 3IQR are detected as outliers. (c) The stepwise approach fits the moving average curve to per-person per-hormone data, and standardized residuals of all data points are calculated (step 1). The data points lying outside the range between −3 and 4 standard deviations are detected as outliers (step 2). Then, the residuals of 5 hormones measured at the same time points are summed. When the sum of the residuals is smaller than −8, the data points are detected as outliers (step 3). Afterward, steps 1 and 3 are repeated (step 4). (d) The expectation-maximization (EM) algorithm first fits a smoothing curve to per-person per-hormone data, and the residuals are calculated. Then, all the residuals of a hormone from all 38 participants are put in the EM algorithm. The algorithm then identifies 2 distinguishable distributions and yields the probability of each data point being an outlier.
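
The residual-based rules in panels (b) and (c) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the centered 5-point moving-average window, the edge padding, and the synthetic series are all assumptions made for illustration; the caption specifies only the 3 × IQR fences and the −3 to 4 standard deviation range. Steps 3 and 4 of the stepwise approach (summing residuals across hormones and repeating) are not shown.

```python
import numpy as np


def moving_average(values, window=5):
    """Centered moving average. The window size is an assumption;
    the source does not state which window was used."""
    values = np.asarray(values, dtype=float)
    pad = window // 2
    # Repeat the edge values so the output has the same length as the input.
    padded = np.pad(values, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


def tukey_outliers(values, window=5, k=3.0):
    """Flag points whose residual lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    residuals = values - moving_average(values, window)
    q1, q3 = np.percentile(residuals, [25, 75])
    iqr = q3 - q1
    return (residuals < q1 - k * iqr) | (residuals > q3 + k * iqr)


def stepwise_step2_outliers(values, window=5, lower=-3.0, upper=4.0):
    """Flag points whose standardized residual lies outside [lower, upper]
    (steps 1 and 2 of the stepwise approach only)."""
    values = np.asarray(values, dtype=float)
    residuals = values - moving_average(values, window)
    z = (residuals - residuals.mean()) / residuals.std(ddof=1)
    return (z < lower) | (z > upper)


# Synthetic example: 144 samples (one every 10 min for 24 h) with one spike.
rng = np.random.default_rng(0)
series = 5 + np.sin(np.linspace(0, 2 * np.pi, 144)) + rng.normal(0, 0.05, 144)
series[70] += 10  # simulated measurement error

print("Tukey's fences:", np.flatnonzero(tukey_outliers(series)))
print("Stepwise step 2:", np.flatnonzero(stepwise_step2_outliers(series)))
```

With a plain moving average, a large spike also pulls the fitted curve toward itself at neighboring time points, so points adjacent to a true outlier can receive sizable residuals of the opposite sign and may be flagged as well.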
Figure 2.
Mean number of data points detected per hormone per method across all participants.
Figure 3.
Venn diagrams visualizing the number of measurement errors detected by each method (eyeballing, stepwise approach, and Tukey’s fences) and their overlap counted in total time points (a) and in all data points (b). The overlap with the expectation-maximization algorithm is not presented here for the reasons mentioned in the Results section.
Figure 4.
(a) The results of outlier detection by eyeballing in glucose, insulin, thyroid-stimulating hormone (TSH), cortisol, and growth hormone of participant 19. Hollow data points indicate detected outliers. (b) The results of outlier detection by Tukey’s fences. Hollow data points indicate detected outliers. (c) The results of outlier detection by the stepwise approach. Hollow data points indicate detected outliers. (d) The results of outlier detection by the expectation-maximization algorithm. Hollow data points indicate that the probability of the data point being an outlier is higher than 0.9.
Figure 5.
Change in correlation at lag time 0 (%) after removal of measurement errors detected by the 4 methods: eyeballing, Tukey’s fences, stepwise approach, and the expectation-maximization algorithm. Each bar represents an individual participant.
