Outliers in nutrient intake data for U.S. adults: national health and nutrition examination survey 2017-2018
- PMID: 38013683
- PMCID: PMC10637781
- DOI: 10.1515/em-2023-0018
Abstract
Objectives: Outlier detection and removal is an important step in preparing data for statistical analysis, yet no gold standard exists in the current literature. The objective of this study is to identify the ideal decision test using the National Health and Nutrition Examination Survey (NHANES) 2017-2018 dietary data.
Methods: We conducted a secondary analysis of NHANES 24-h dietary recalls, accounting for the survey's multi-stage cluster design. Six outlier detection and removal strategies were assessed by evaluating each decision test's impact on Pearson's correlation coefficient among macronutrients. Furthermore, we assessed changes in the effect size estimates at pre-defined sample sizes. The data were collected as part of the 2017-2018 24-h dietary recall among adult participants (N=4,893).
Results: Effect estimate changes for macronutrients varied from 6.5% for protein to 39.3% for alcohol across all decision tests. The largest proportion of observations removed as outliers was 4.0%, at the large sample size, under the decision test that flags values more than 2 standard deviations from the mean. The smallest sample size, particularly for alcohol analysis, was most affected by the six decision tests when compared to no decision test.
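To make the evaluation concrete, the following is a minimal sketch of one decision test named above (values more than 2 standard deviations from the mean) and of how its effect on Pearson's correlation coefficient and on the proportion of data removed might be measured. It is illustrative only: the function names and the intake values are hypothetical, the percent-change formula is an assumption about how an effect estimate change could be computed, and the sketch ignores the survey weights and multi-stage cluster design that the study accounted for.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Unweighted Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def within_k_sd(values, k=2.0):
    """True for each value lying within k sample SDs of the mean."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) <= k * s for v in values]


def evaluate_sd_rule(x, y, k=2.0):
    """Drop pairs where either variable is flagged, then compare r before and after."""
    keep = [a and b for a, b in zip(within_k_sd(x, k), within_k_sd(y, k))]
    x_kept = [v for v, ok in zip(x, keep) if ok]
    y_kept = [v for v, ok in zip(y, keep) if ok]
    r_before = pearson_r(x, y)
    r_after = pearson_r(x_kept, y_kept)
    n_removed = len(x) - len(x_kept)
    return {
        "n_removed": n_removed,
        "proportion_removed": n_removed / len(x),
        "r_before": r_before,
        "r_after": r_after,
        # Assumed definition: relative change in r, in percent.
        "pct_change_in_r": 100 * (r_after - r_before) / abs(r_before),
    }


# Hypothetical intakes (g/day) for ten respondents; these are not NHANES values.
protein = [50, 60, 55, 65, 70, 58, 62, 68, 52, 400]
carbohydrate = [210, 250, 230, 260, 300, 240, 255, 280, 220, 270]
result = evaluate_sd_rule(protein, carbohydrate)
print(result)
```

In this toy data a single extreme protein value is flagged, and removing it changes the correlation substantially, which mirrors why small samples can be more sensitive to the choice of decision test.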
Conclusions: This study, the first to use 2017-2018 NHANES dietary data for outlier evaluation, emphasizes the importance of selecting an appropriate decision test considering factors such as statistical power, sample size, normality assumptions, the proportion of data removed, effect estimate changes, and the consistency of estimates across sample sizes. We recommend the use of non-parametric tests for non-normally distributed variables of interest.
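As an illustration of a rule that does not rely on a normality assumption, the sketch below applies Tukey's interquartile-range fences. The abstract does not specify which non-parametric tests were evaluated, so the choice of the IQR rule, the multiplier of 1.5, the quartile method, and the sample values are all assumptions made for this example.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Lower and upper Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Values falling outside the Tukey fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


# Hypothetical right-skewed intake values; these are not NHANES values.
intake = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_fences(intake))
print(iqr_outliers(intake))
```

Because the fences depend on quartiles rather than on the mean and standard deviation, a single extreme value does not widen the acceptance region the way it does under a standard-deviation rule.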
Keywords: CDC; NHANES; dietary intake; macronutrient; outlier.
© 2023 Walter de Gruyter GmbH, Berlin/Boston.
Conflict of interest statement
Competing interests: The authors state no conflict of interest.