Editorial
Diagn Progn Res. 2020 Apr 2;4:3.
doi: 10.1186/s41512-020-00074-3. eCollection 2020.

State of the art in selection of variables and functional forms in multivariable analysis-outstanding issues

Willi Sauerbrei et al. Diagn Progn Res.

Abstract

Background: How to select variables and identify functional forms for continuous variables is a key concern when creating a multivariable model. Ad hoc 'traditional' approaches to variable selection have been in use for at least 50 years. Similarly, methods for determining functional forms for continuous variables were first suggested many years ago. More recently, many alternative approaches to address these two challenges have been proposed, but knowledge of their properties and meaningful comparisons between them are scarce. Many outstanding issues in multivariable modelling must be resolved before a state of the art can be defined and evidence-supported guidance can be provided to researchers who have only a basic level of statistical knowledge. Our main aims are to identify and illustrate such gaps in the literature and to present them at a moderate technical level to the wide community of practitioners, researchers and students of statistics.

Methods: We briefly discuss general issues in building descriptive regression models, strategies for variable selection, different ways of choosing functional forms for continuous variables and methods for combining the selection of variables and functions. We discuss two examples, taken from the medical literature, to illustrate problems in the practice of modelling.

Results: Our overview revealed that there is not yet enough evidence on which to base recommendations for the selection of variables and functional forms in multivariable analysis. Such evidence may come from comparisons between alternative methods. In particular, we highlight seven important topics that require further investigation and make suggestions for the direction of further research.

Conclusions: Selection of variables and of functional forms are important topics in multivariable analysis. To define a state of the art and to provide evidence-supported guidance to researchers who have only a basic level of statistical knowledge, further comparative research is required.

Keywords: Bias; Categorisation; Descriptive modelling; Empirical evidence; Fractional polynomials; Methods for variable selection; STRATOS initiative; Shrinkage; Spline procedures.

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.
