Global Spine J. 2023 Oct;13(8):2409-2421.
doi: 10.1177/21925682221085535. Epub 2022 Apr 3.

Leveraging Artificial Intelligence and Synthetic Data Derivatives for Spine Surgery Research

Jacob K Greenberg et al. Global Spine J. 2023 Oct.

Abstract

Study design: Retrospective cohort study.

Objectives: Leveraging electronic health records (EHRs) for spine surgery research is impeded by concerns regarding patient privacy and data ownership. Synthetic data derivatives may help overcome these limitations. This study's objective was to validate the use of synthetic data for spine surgery research.

Methods: Data came from the EHRs of 15 hospitals. Patients who underwent anterior cervical or posterior lumbar fusion (2010-2020) were included. Real data were obtained from the EHR. Synthetic data were generated to simulate the properties of the real data, without maintaining a one-to-one correspondence with real patients. Within each cohort, the ability to predict 30-day readmissions and 30-day complications was evaluated using logistic regression and extreme gradient boosting machines (XGBoost).

Results: We identified 9,072 real and 9,088 synthetic cervical fusion patients. Descriptive characteristics were nearly identical between the 2 datasets. When predicting readmission, models built using real and synthetic data both had c-statistics of .69-.71 using logistic regression and XGBoost. Among 12,111 real and 12,126 synthetic lumbar fusion patients, descriptive characteristics were nearly the same for most variables. Using logistic regression and XGBoost to predict readmission, discrimination was similar with models built using real and synthetic data (c-statistics .66-.69). When predicting complications, models derived using real and synthetic data showed similar discrimination in both cohorts. Despite some differences, the most influential predictors were similar in the real and synthetic datasets.

Conclusion: Synthetic data replicate most descriptive and predictive properties of real data, and therefore may expand EHR research in spine surgery.

Keywords: artificial intelligence; electronic health records; machine learning; medical informatics; spine surgery; synthetic data derivatives; treatment outcome.
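The c-statistics reported in the Results measure discrimination: the probability that a randomly chosen patient who had the event (for example, a 30-day readmission) was assigned a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that calculation, not the authors' code. The function name `c_statistic` and the toy outcomes and risks are hypothetical and chosen only for illustration; in the study, the predicted risks would come from the logistic regression or XGBoost models.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Concordance (c) statistic for a binary outcome.

    Equals the area under the ROC curve: the fraction of
    (event, non-event) pairs in which the event case received the
    higher predicted risk. Tied scores count as half-concordant.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data (NOT from the study): 1 = readmitted, 0 = not.
outcomes = [0, 0, 1, 0, 1, 1]
risks = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]

print(c_statistic(outcomes, risks))
```

Under this definition, a value of .5 corresponds to chance-level ranking and 1.0 to perfect ranking. Comparing models built on real and synthetic data then amounts to computing this quantity for each and checking that the values are close.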

Conflict of interest statement

No authors report any financial conflicts of interest. Drs. Greenberg and Foraker have delivered one or more webinars on the use of MDClone and received a nominal gift card in appreciation. Dr. Ray received research support from the Defense Advanced Research Projects Agency, Department of Defense, Missouri Spinal Cord Injury Foundation, National Institutes of Health/NINDS, Hope Center, and Johnson & Johnson. Dr. Ray reports: stock/equity in Acera Surgical; consulting support from Depuy/Synthes, Globus, and Nuvasive; royalties from Depuy/Synthes, Nuvasive, and Acera Surgical. Dr. Foraker received no funding specifically related to this study. Dr. Foraker reports research support from the Washington University Institute for Public Health, National Institutes of Health, Global Autoimmune Institute, Agency for Healthcare Research and Quality, Siteman Investment Program, Alzheimer’s Drug Discovery Foundation, and Children’s Discovery Institute. Dr. Kelly reported no funding related to this submission. Dr. Kelly received research support from the Setting Scoliosis Straight Foundation and the International Spine Study Group Foundation. Dr. Kelly received personal fees from The Journal of Bone and Joint Surgery. Dr. Molina reported equity in Augmedics and consulting fees from Depuy/Synthes and Kuros.

Figures

Figure 1.
Comparison of cervical (A) and lumbar (B) cohort characteristics in the real and synthetic datasets. Violin plots show the distribution of the data for select continuous variables.
Figure 2.
Receiver operating characteristic (ROC) curves showing model discrimination predicting readmission for the cervical (A) and lumbar (B) cohorts.
Figure 3.
A comparison of the most influential variables between the models built using real versus synthetic data to predict readmission for the cervical (A) and lumbar (B) cohorts.
