Sci Rep. 2017 Apr 7;7:46226.
doi: 10.1038/srep46226.

Analysis of free text in electronic health records for identification of cancer patient trajectories

Kasper Jensen et al. Sci Rep. 2017.

Abstract

With an aging patient population and increasing complexity in patient disease trajectories, physicians are often met with complex patient histories from which clinical decisions must be made. Due to the increasing rate of adverse events and hospitals facing financial penalties for readmission, there has never been a greater need to enforce evidence-led medical decision-making using available health care data. In the present work, we studied a cohort of 7,741 patients, of whom 4,080 were diagnosed with cancer, surgically treated at a University Hospital in the years 2004-2012. We have developed a methodology that allows disease trajectories of the cancer patients to be estimated from free text in electronic health records (EHRs). By using these disease trajectories, we predict 80% of patient events ahead in time. By controlling for confounders among 8,326 quantified events, we identified 557 events that carry a high subsequent risk (risk > 20%), including six events for cancer and seven events for metastasis. We believe that the presented methodology and findings could be used to improve clinical decision support and personalize trajectories, thereby decreasing adverse events and optimizing cancer treatment.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Information stored in free text of EHRs.
(A) Patient events in terms of symptoms and diseases (blue), drugs and medication (magenta), and surgical procedures (orange) accumulated over time in the EHRs. (B) The most common symptoms and diseases. (C) The most common drugs and medication. (D) The most common surgical procedures.
Figure 2. Construction of trajectories.
(A) Progression of health state to multiple morbidities X and Y. (B) Variations in how patient information is registered yield a distorted information space observed by clinicians. (C) Frequent Item Set (FIS) mining identifies observations that repeatedly appear together. (D) Trajectories are created using order of first appearance.
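
Panels (C) and (D) can be illustrated with a minimal sketch, assuming each patient record is a list of (day, event) pairs. The function names, the brute-force counting, the support threshold, and the toy data below are all hypothetical and are not taken from the paper; they only show the general idea of mining item sets that co-occur across patients and then ordering each set by first appearance.

```python
from collections import Counter
from itertools import combinations


def frequent_item_sets(records, min_support, max_size=3):
    """Return {frozenset(items): patient_count} for sets seen in >= min_support patients."""
    counts = Counter()
    for events in records.values():
        items = sorted({name for _, name in events})
        for size in range(2, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def first_appearance_order(events, item_set):
    """Order the members of item_set by the time each first appears for one patient."""
    first_seen = {}
    for time, name in sorted(events):
        if name in item_set and name not in first_seen:
            first_seen[name] = time
    return tuple(sorted(first_seen, key=first_seen.get))


def trajectories(records, min_support, max_size=3):
    """Count how often each ordering of a frequent item set is observed."""
    result = Counter()
    for item_set in frequent_item_sets(records, min_support, max_size):
        for events in records.values():
            if item_set <= {name for _, name in events}:
                result[first_appearance_order(events, item_set)] += 1
    return result


# Hypothetical toy data: patient id -> list of (day, event) pairs.
records = {
    "p1": [(1, "chest pain"), (5, "neoplasm"), (9, "metastasis")],
    "p2": [(2, "chest pain"), (3, "nausea"), (8, "neoplasm")],
    "p3": [(4, "neoplasm"), (6, "chest pain")],
}
result = trajectories(records, min_support=2, max_size=2)
```

In this toy data only the pair {chest pain, neoplasm} reaches the support threshold, and it is observed in two different orders. Enumerating every combination is exponential in `max_size`, so a real data set would need a dedicated FIS algorithm rather than this brute-force count.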
Figure 3. Reconstruction of patient events ahead in time.
(A) The positive predictive value (PPV) (events explained, %) by trajectories compared with randomized trajectories of the same size. (B) The PPV in terms of the number of events known for a patient. (C) Positive predictions, % in terms of the event sequence number in the trajectories. (D) PPV in terms of event probability.
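
The PPV reported in this figure can be illustrated with a small sketch. It uses the standard definition, true positives divided by all positive predictions, and assumes a simple prediction rule: every event that follows a known event in some trajectory is predicted. That rule, the function names, and the toy data are assumptions made for illustration; the caption does not specify the evaluation procedure used in the paper.

```python
def predict_following_events(trajectories, known_events):
    """Predict every event that follows a known event in any trajectory."""
    known = set(known_events)
    predicted = set()
    for trajectory in trajectories:
        for position, event in enumerate(trajectory):
            if event in known:
                predicted.update(
                    e for e in trajectory[position + 1:] if e not in known
                )
    return predicted


def positive_predictive_value(predicted, observed):
    """PPV = true positives / all positive predictions (0.0 if nothing was predicted)."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    true_positives = len(predicted & set(observed))
    return true_positives / len(predicted)


# Hypothetical trajectories and one patient's known and later-observed events.
example_trajectories = [
    ("chest pain", "neoplasm", "metastasis"),
    ("nausea", "dehydration"),
]
predicted = predict_following_events(example_trajectories, ["chest pain"])
ppv = positive_predictive_value(predicted, {"neoplasm", "anemia"})
```

Here two events are predicted (neoplasm and metastasis) and one of them is later observed, so the PPV is 0.5.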
Figure 4. The most common symptoms and diseases reported prior to cancer diagnoses and from cancer to death.
(A) The path from chest pain to cancer (neoplasms). (B) The paths from chest pain to cancer with intermediate events. (C) The path from cancer to death, and (D) the paths from cancer to death with intermediate events.
Figure 5. Control of confounding factors in event trajectories and disease trajectories.
(A) The adjusted risks from the event trajectories on the x-axis and those of the disease trajectories on the y-axis correlate with the outcomes in the trajectories when controlling confounders. (B) The change in intermediate events and extraneous variables on the x-axis and the change in adjusted risk on the y-axis. The outcomes in the two sets remain the same when the intermediate events and extraneous variables change.
Figure 6. Adjusted risk of complication and readmission events.
Several paths may lead a patient from an event to an outcome; we denote the number of such paths by k. (A) The six events with high risk for downstream complications and readmission with cancer. (B) The seven events with high risk for downstream complications and readmission with metastasis.
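
The number of paths k between an event and an outcome can be sketched as a path count in a directed graph. The example below assumes the trajectories form an acyclic graph of event-to-event edges; the `count_paths` helper and the edge list are hypothetical and are not drawn from the paper's data.

```python
from functools import lru_cache


def count_paths(edges, source, target):
    """Count distinct directed paths from source to target (graph assumed acyclic)."""
    successors = {}
    for start, end in edges:
        successors.setdefault(start, []).append(end)

    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == target:
            return 1
        return sum(paths_from(nxt) for nxt in successors.get(node, ()))

    return paths_from(source)


# Hypothetical event graph: each edge means "first event was followed by second".
edges = [
    ("surgery", "infection"),
    ("surgery", "bleeding"),
    ("infection", "readmission"),
    ("bleeding", "readmission"),
    ("surgery", "readmission"),
]
k = count_paths(edges, "surgery", "readmission")
```

In this toy graph k is 3: one direct path and one each through infection and bleeding. Memoizing the count per node keeps the work linear in the number of edges, but the recursion would not terminate on a graph with cycles.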
