Multiple linear regression modeling with values below a lower limit of quantification - a statistical method comparison
- PMID: 41547754
- DOI: 10.1186/s12874-026-02770-y
Abstract
Background: Missing values occur in almost all real-world medical data. Sometimes, partial information about the missing values is available because of technical measurement limits. This was the case in a sports medicine data set in which several laboratory measurements fell below a lower limit of quantification (LLOQ) and were to be used in a multiple linear regression model. The problem arises in several disciplines (environmental epidemiology, pharmacokinetic studies, etc.), and different statistical methods have been suggested in the literature. However, only very limited work comparing these methods is available, especially in the multivariable linear regression setting.
Methods: We compare statistical methods for addressing values below an LLOQ in multiple linear regression modeling using a simulation study. We consider both the case in which the variable subject to the LLOQ is one of the independent variables and the case in which it is the dependent variable of the regression model. We also vary the underlying assumptions, such as distributions, sample sizes, proportions of missing values, correlations, and linearity.
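As an illustration of this kind of simulation, the following minimal sketch generates data with a left-censored dependent variable and compares two naive handling strategies against the fit on the uncensored data. It is not the authors' simulation design: the single covariate, the normal errors, the true coefficients (intercept 2, slope 1), the LLOQ set at an empirical quantile, and all function names are assumptions made for this example.

```python
import random


def simulate(n, censor_prop, seed):
    """Draw (x, y) from y = 2 + 1*x + N(0, 1) and flag values below the LLOQ."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [2.0 + 1.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    # Assumption: the LLOQ is placed at an empirical quantile of y, so that
    # roughly censor_prop of the values fall below it.
    lloq = sorted(y)[round(censor_prop * n)]
    below = [yi < lloq for yi in y]
    return x, y, lloq, below


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def compare_naive_methods(n=2000, censor_prop=0.3, seed=1):
    """Slope estimates: uncensored data, LLOQ substitution, complete cases."""
    x, y, lloq, below = simulate(n, censor_prop, seed)
    # Substitution: replace every value below the LLOQ by the LLOQ itself.
    y_sub = [lloq if b else yi for yi, b in zip(y, below)]
    # Complete-case analysis: drop observations below the LLOQ.
    x_cc = [xi for xi, b in zip(x, below) if not b]
    y_cc = [yi for yi, b in zip(y, below) if not b]
    return {
        "censored_fraction": sum(below) / n,
        "full_data": ols_slope(x, y),
        "substitution": ols_slope(x, y_sub),
        "complete_case": ols_slope(x_cc, y_cc),
    }


if __name__ == "__main__":
    for name, value in compare_naive_methods().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, both naive strategies tend to pull the slope estimate toward zero relative to the fit on the uncensored data. Varying `n`, `censor_prop`, and the seed gives a rough feel for how such a comparison can be set up across scenarios.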
Results: Overall, the two-compartment model showed the best performance in terms of bias and coverage when the variable subject to the LLOQ was an independent variable and no substantial collinearity was present. When the variable subject to the LLOQ was the dependent variable, the tobit model showed the lowest bias and highest coverage for censoring proportions up to 0.8.
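To make the tobit approach for a left-censored dependent variable concrete, here is a minimal sketch that fits a censored normal regression with one covariate by an EM algorithm. EM is only one of several ways to maximize the tobit likelihood and is not necessarily what the authors used. The simulated data, the single covariate, and the names `simulate_left_censored`, `ols`, and `tobit_em` are assumptions made for this example.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for pdf and cdf


def simulate_left_censored(n, censor_prop, seed):
    """y = 2 + 1*x + N(0, 1); values below the LLOQ are reported as the LLOQ."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [2.0 + 1.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    lloq = sorted(y)[round(censor_prop * n)]  # assumed: LLOQ at a quantile
    below = [yi < lloq for yi in y]
    y_obs = [lloq if b else yi for yi, b in zip(y, below)]
    return x, y_obs, below, lloq


def ols(x, y):
    """Intercept and slope of a simple least-squares regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def tobit_em(x, y_obs, below, lloq, n_iter=200):
    """EM estimates (intercept, slope, sigma) for left-censored normal y."""
    n = len(x)
    # Starting values: least squares on the LLOQ-substituted data.
    b0, b1 = ols(x, y_obs)
    sigma = (sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y_obs)) / n) ** 0.5
    for _ in range(n_iter):
        # E-step: conditional mean and variance of y given y < LLOQ.
        exp_y, var_y = [], []
        for xi, yi, b in zip(x, y_obs, below):
            if not b:
                exp_y.append(yi)
                var_y.append(0.0)
                continue
            mu = b0 + b1 * xi
            a = (lloq - mu) / sigma
            lam = STD_NORMAL.pdf(a) / max(STD_NORMAL.cdf(a), 1e-300)
            exp_y.append(mu - sigma * lam)
            var_y.append(sigma ** 2 * max(0.0, 1.0 - a * lam - lam ** 2))
        # M-step: least squares on the expected values, then update sigma.
        b0, b1 = ols(x, exp_y)
        rss = sum((e - b0 - b1 * xi) ** 2 + v
                  for xi, e, v in zip(x, exp_y, var_y))
        sigma = (rss / n) ** 0.5
    return b0, b1, sigma


if __name__ == "__main__":
    x, y_obs, below, lloq = simulate_left_censored(1000, 0.3, seed=7)
    print("substitution slope:", round(ols(x, y_obs)[1], 3))
    print("tobit (EM) estimates:", tobit_em(x, y_obs, below, lloq))
```

The E-step uses the mean and variance of a normal distribution truncated from above at the LLOQ, and the M-step reduces to ordinary least squares on the imputed expectations. For applied work, an established implementation of censored regression is preferable to this hand-rolled sketch, which also provides no standard errors.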
Conclusion: When a data set contains values below a lower limit of quantification and a multiple linear regression model is chosen as the analysis model, the method for handling those left-censored data should be chosen deliberately. In this article, we provide guidance on the performance of different established methods.
Keywords: Left censoring; Lower limit of quantification; Regression model; Statistical method comparison.
© 2026. The Author(s).
Conflict of interest statement
Declarations. Ethical approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.
