Review
Brief Bioinform. 2023 Mar 19;24(2):bbad002.
doi: 10.1093/bib/bbad002.

A review on longitudinal data analysis with random forest

Jianchang Hu et al. Brief Bioinform.

Abstract

In longitudinal studies, variables are measured repeatedly over time, leading to clustered and correlated observations. If the goal of the study is to develop prediction models, machine learning approaches such as the powerful random forest (RF) are often promising alternatives to standard statistical methods, especially in the context of high-dimensional data. In this paper, we review extensions of the standard RF method for longitudinal data analysis. The extensions are categorized according to the data structures for which they are designed. We consider both univariate and multivariate response longitudinal data and further categorize the repeated measurements according to whether the time effect is relevant. Even though most extensions were proposed for low-dimensional data, some can be applied to high-dimensional data. Information on available software implementations of the reviewed extensions is also given. We conclude with a discussion of the limitations of our review and some future research directions.

Keywords: clustered data; longitudinal data; machine learning; multivariate response; repeated measurements.

Figures

Figure 1. Illustration of bootstrap samples used to construct decision trees in standard RF when it is applied to clustered data.
Figure 2. Illustration of subject-level bootstrap samples used to construct decision trees in RF++.
Figure 3. Summary of the concepts and methods reviewed in the paper.
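
The contrast between the two bootstrap schemes named in the captions of Figures 1 and 2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the RF++ implementation: the function names, the toy subject labels and the choice of three measurements per subject are all hypothetical, and only the Python standard library is used. The idea shown is that resampling whole subjects keeps each subject's repeated measurements together, so a subject's rows cannot be split between the in-bag and out-of-bag sets of a tree, which can happen when individual rows are resampled.

```python
import random
from collections import defaultdict


def observation_bootstrap(subject_ids, rng):
    """Row-level bootstrap: draw row indices with replacement, ignoring clusters."""
    n = len(subject_ids)
    return [rng.randrange(n) for _ in range(n)]


def subject_bootstrap(subject_ids, rng):
    """Subject-level bootstrap: draw subjects with replacement and keep
    every row of each drawn subject together."""
    rows_by_subject = defaultdict(list)
    for row, sid in enumerate(subject_ids):
        rows_by_subject[sid].append(row)
    subjects = sorted(rows_by_subject)
    # Draw as many subjects as there are distinct subjects, with replacement.
    drawn = [rng.choice(subjects) for _ in subjects]
    in_bag_rows = [row for sid in drawn for row in rows_by_subject[sid]]
    out_of_bag_subjects = set(subjects) - set(drawn)
    return in_bag_rows, out_of_bag_subjects


# Hypothetical toy data: 4 subjects, 3 repeated measurements each.
subject_ids = [s for s in ("A", "B", "C", "D") for _ in range(3)]
rng = random.Random(0)

in_bag, oob_subjects = subject_bootstrap(subject_ids, rng)
in_bag_subjects = {subject_ids[r] for r in in_bag}
print("in-bag subjects:", sorted(in_bag_subjects))
print("out-of-bag subjects:", sorted(oob_subjects))
```

With the subject-level scheme, the in-bag and out-of-bag subject sets are disjoint by construction, so any out-of-bag error estimate is computed on subjects the tree has not seen.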
