PLOS Digit Health. 2022 Dec 12;1(12):e0000158.
doi: 10.1371/journal.pdig.0000158. eCollection 2022 Dec.

Is artificial intelligence capable of generating hospital discharge summaries from inpatient records?

Kenichiro Ando et al. PLOS Digit Health. 2022.

Abstract

Medical professionals have been burdened by clerical work, and artificial intelligence may efficiently support physicians by generating clinical summaries. However, whether hospital discharge summaries can be generated automatically from inpatient records stored in electronic health records remains unclear. Therefore, this study investigated the sources of information in discharge summaries. First, the discharge summaries were automatically split into fine-grained segments, such as those representing medical expressions, using a machine learning model from a previous study. Second, segments in the discharge summaries that did not originate from inpatient records were filtered out. This was performed by calculating the n-gram overlap between inpatient records and discharge summaries; the final source origin decision was made manually. Finally, to reveal the specific sources (e.g., referral documents, prescriptions, and physician's memory) from which the segments originated, the segments were manually classified in consultation with medical professionals. For deeper analysis, this study designed and annotated clinical role labels that represent the subjectivity of the expressions and built a machine learning model to assign them automatically. The analysis revealed the following: First, 39% of the information in the discharge summary originated from external sources other than inpatient records. Second, patients' past clinical records constituted 43%, and patient referral documents constituted 18%, of the expressions derived from external sources. Third, 11% of the missing information was not derived from any documents and was possibly derived from physicians' memories or reasoning. According to these results, end-to-end summarization using machine learning is considered infeasible. Machine summarization with an assisted post-editing process is the best fit for this problem domain.
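The n-gram overlap step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ngrams` and `origin_rate`, the pre-tokenized input, and the definition of the rate as the fraction of a segment's n-grams that also occur in the inpatient record are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int = 2) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def origin_rate(segment: List[str], record: List[str], n: int = 2) -> float:
    """Fraction of the segment's n-grams that also appear in the record.

    Hypothetical definition: the abstract only states that n-gram overlap
    was computed, not the exact formula.
    """
    segment_grams = ngrams(segment, n)
    if not segment_grams:
        # Segment is shorter than n tokens, so no overlap can be measured.
        return 0.0
    record_grams: Set[Tuple[str, ...]] = set(ngrams(record, n))
    hits = sum(1 for gram in segment_grams if gram in record_grams)
    return hits / len(segment_grams)


if __name__ == "__main__":
    summary_segment = ["chest", "pain", "on", "admission"]
    inpatient_record = ["patient", "reported", "chest", "pain", "on", "day", "one"]
    print(origin_rate(summary_segment, inpatient_record))
```

A segment with a low rate would be a candidate for an external source, with the final decision left to a human annotator as the abstract describes.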

Conflict of interest statement

The authors declare no conflicts of interest associated with this manuscript.

Figures

Fig 1. Proposed framework of our study.
The colored blocks in the dummy record represent the clinical segments developed in a previous study, in which each sentence is split by medical sense [20].
Fig 2. Overview of the classification model for subjectivity, clinical role, and probable label.
Each of the three labels is treated as a separate task. Input segments are fed to UTH-BERT, and its outputs are passed to task-specific layers. Finally, the loss scores of the three tasks are calculated and combined to obtain the overall loss score.
Fig 3. Our annotation flowchart of the source origin.
The source origin is manually determined in two steps using pre-filtering.
Fig 4. Origin rate of segments in discharge summaries against the inpatient records.
Distribution of origin rates computed with bi-grams on the randomly sampled data. Red, blue, and gray dots are sourced, unsourced, and filtered-out segments, respectively. Note that symbols and segments categorized as middle subjectivity are excluded. The y-axis values were randomly generated from a uniform distribution for visibility.
Fig 5. Origin rate of segments in discharge summaries against the inpatient records.
Proportion of unsourced segments appearing in manually annotated data. The y-axis shows the value averaged over steps of 0.1 for segments with origin rates less than 0.5, as shown in Fig 4.
Fig 6. Breakdown of the information source in discharge summaries.
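
The loss combination described in the Fig 2 caption can be sketched as follows. This is only an illustration under stated assumptions: the caption does not say how the three task losses are combined, so the cross-entropy loss, the weighted sum, the equal default weights, and the names `cross_entropy` and `combined_loss` are all hypothetical.

```python
import math
from typing import List, Optional, Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target])


def combined_loss(
    task_probs: List[Sequence[float]],
    task_targets: List[int],
    weights: Optional[List[float]] = None,
) -> float:
    """Weighted sum of per-task losses (assumed combination rule)."""
    if weights is None:
        # Assumption: every task contributes equally.
        weights = [1.0] * len(task_probs)
    return sum(
        w * cross_entropy(p, t)
        for w, p, t in zip(weights, task_probs, task_targets)
    )


if __name__ == "__main__":
    # One predicted distribution per task: subjectivity, clinical role, probable.
    probs = [[0.5, 0.5], [0.25, 0.75], [0.9, 0.1]]
    targets = [0, 1, 0]
    print(combined_loss(probs, targets))
```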

References

    1. Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, et al. A Clinically Applicable Approach to Continuous Prediction of Future Acute Kidney Injury. Nature. 2019;572(7767):116–119. doi: 10.1038/s41586-019-1390-1
    2. Ouyang D, He B, Ghorbani A, Yuan N, Ebinger J, Langlotz CP, et al. Video-based AI for Beat-to-beat Assessment of Cardiac Function. Nature. 2020;580(7802):252–256. doi: 10.1038/s41586-020-2145-8
    3. Lu MY, Chen TY, Williamson DF, Zhao M, Shady M, Lipkova J, et al. AI-based Pathology Predicts Origins for Cancers of Unknown Primary. Nature. 2021;594(7861):106–110. doi: 10.1038/s41586-021-03512-4
    4. Frazer J, Notin P, Dias M, Gomez A, Min JK, Brock K, et al. Disease Variant Prediction with Deep Generative Models of Evolutionary Data. Nature. 2021;599(7883):91–95. doi: 10.1038/s41586-021-04043-8
    5. Bastani H, Drakopoulos K, Gupta V, Vlachogiannis I, Hadjicristodoulou C, Lagiou P, et al. Efficient and Targeted COVID-19 Border Testing via Reinforcement Learning. Nature. 2021;599(7883):108–113. doi: 10.1038/s41586-021-04014-z
