Biomark Res. 2017 Mar 9:5:9.
doi: 10.1186/s40364-017-0089-4. eCollection 2017.

Application of penalized linear regression methods to the selection of environmental enteropathy biomarkers

Miao Lu et al. Biomark Res.

Abstract

Background: Environmental Enteropathy (EE) is a subclinical condition caused by constant fecal-oral contamination that results in blunting of intestinal villi and intestinal inflammation. The primary clinical research interest is to evaluate the association between non-invasive EE biomarkers and malnutrition in a cohort of Bangladeshi children. The challenges are that the number of biomarkers/covariates is relatively large and that some of them are highly correlated.

Methods: Many variable selection methods are available in the literature, but which are most appropriate for EE biomarker selection remains unclear. In this study, different variable selection approaches were applied and the performance of these methods was assessed numerically through simulation studies, assuming the correlations among covariates were similar to those in the Bangladesh cohort. The suggested methods from simulations were applied to the Bangladesh cohort to select the most relevant biomarkers for the growth response, and bootstrapping methods were used to evaluate the consistency of selection results.
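The bootstrap consistency check described in the Methods can be illustrated with a minimal sketch. It rests on several assumptions that are not from the study: the data are synthetic, plain LASSO fitted by coordinate descent stands in for the penalized methods the paper evaluates, and the names `lasso_cd`, `selection_frequency`, and the penalty value `lam=0.3` are hypothetical choices made for illustration only.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: the closed-form LASSO update for one coordinate."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=50):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    X and y are centered internally, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]  # residual for the starting point beta = 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column in this sample: leave coefficient at 0
            # Covariance of column j with the partial residual (column j's own fit added back)
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    return beta


def selection_frequency(X, y, lam, n_boot=50, seed=0):
    """Fraction of bootstrap resamples in which each covariate gets a nonzero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Synthetic example (not the Bangladesh data): x0 drives y, x1 is correlated
# with x0, and x2, x3 are pure noise.
data_rng = random.Random(1)
X, y = [], []
for _ in range(80):
    x0 = data_rng.gauss(0, 1)
    x1 = 0.8 * x0 + 0.6 * data_rng.gauss(0, 1)
    X.append([x0, x1, data_rng.gauss(0, 1), data_rng.gauss(0, 1)])
    y.append(2.0 * x0 + data_rng.gauss(0, 0.5))

freqs = selection_frequency(X, y, lam=0.3)
print(freqs)
```

A covariate that is selected in nearly every resample can be regarded as consistently selected, while one that appears only occasionally is more likely an artifact of a particular sample. In practice the penalty level would be tuned, for example by cross-validation, rather than fixed as it is here.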

Results: In the simulation studies, SCAD (Smoothly Clipped Absolute Deviation), Adaptive LASSO (Least Absolute Shrinkage and Selection Operator) and MCP (Minimax Concave Penalty) were the suggested variable selection methods over the traditional stepwise regression method. In the Bangladesh data, predictors such as mother's weight, height-for-age z-score (HAZ) at week 18, and inflammation markers (myeloperoxidase (MPO) at week 12 and soluble CD14 at week 18) are informative biomarkers associated with children's growth.
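For orientation, the three penalties named in the Results are commonly written as below. These are the standard textbook forms, given here as a reminder and not quoted from the paper. The symbols are assumptions of this sketch: $\lambda > 0$ is the tuning parameter, $a > 2$ and $\gamma > 1$ are shape parameters, and $\tilde\beta_j$ is an initial coefficient estimate (for example, from least squares) with exponent $\nu > 0$.

```latex
% Penalized least squares: each method minimizes
%   (1/2n) * ||y - X beta||_2^2 + sum_j p_lambda(|beta_j|)
\begin{align*}
\text{Adaptive LASSO:}\quad
  p_\lambda(|\beta_j|) &= \lambda\, w_j\, |\beta_j|,
  \qquad w_j = \frac{1}{|\tilde\beta_j|^{\nu}} \\[6pt]
\text{SCAD:}\quad
  p_\lambda(|\beta_j|) &=
  \begin{cases}
    \lambda |\beta_j|, & |\beta_j| \le \lambda \\[4pt]
    \dfrac{2a\lambda|\beta_j| - \beta_j^2 - \lambda^2}{2(a-1)}, & \lambda < |\beta_j| \le a\lambda \\[8pt]
    \dfrac{(a+1)\lambda^2}{2}, & |\beta_j| > a\lambda
  \end{cases} \\[6pt]
\text{MCP:}\quad
  p_\lambda(|\beta_j|) &=
  \begin{cases}
    \lambda|\beta_j| - \dfrac{\beta_j^2}{2\gamma}, & |\beta_j| \le \gamma\lambda \\[8pt]
    \dfrac{\gamma\lambda^2}{2}, & |\beta_j| > \gamma\lambda
  \end{cases}
\end{align*}
```

SCAD and MCP both level off to a constant for large coefficients, so strong effects are shrunk less than under the plain LASSO penalty $\lambda|\beta_j|$. Adaptive LASSO works toward the same goal by giving smaller weights to covariates whose initial estimates are large.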

Conclusions: Penalized linear regression methods are plausible alternatives to traditional variable selection methods, and the suggested methods are applicable to other biomedical studies. The selected early-stage biomarkers offer a potential explanation for the burden of malnutrition in low-income countries, allow early identification of infants at risk, and suggest pathways for intervention.

Trial registration: This study was retrospectively registered with ClinicalTrials.gov, number NCT01375647, on June 3, 2011.

Keywords: Biomarker selection; Correlated covariates; Environmental enteropathy; Malnutrition; Penalized linear regression.

Figures

Fig. 1. Heatmap of correlation for all biomarkers
