Biomed Inform Insights. 2016 Jun 23;8(Suppl 1):13-22.
doi: 10.4137/BII.S37977. eCollection 2016.

Toward a Learning Health-care System - Knowledge Delivery at the Point of Care Empowered by Big Data and NLP

Vinod C Kaggal et al. Biomed Inform Insights.

Abstract

The concept of optimizing health care by understanding and generating knowledge from previous evidence, i.e., the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis of an LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and in real time for knowledge delivery. Additionally, significant clinical information is embedded in free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also supports real-time NLP. The infrastructure has been utilized for multiple institutional projects, including MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the big data environment with two other environments. The big data infrastructure significantly outperformed the other infrastructures in computing speed, demonstrating its value in making the LHS a possibility in the near future.

Keywords: big data; health-care analytics; learning health-care system; natural language processing.

Figures

Figure 1
Learning cycle in an LHS. Analytics experts enable the cycle. Domain pragmatics provides the contextual information related to the domain, which is needed for discovering knowledge. Users are people who consume the knowledge.
Figure 2
Generic clinical NLP process. Clinical NLP involves processing textual data obtained from clinical notes and voice-dictated text. The process includes both syntactic and semantic processing. While the syntactic components identify the grammatical structure of the text, the semantic components identify clinical concepts and their context, such as experiencer, certainty, and negation.
Figure 3
A high-level architecture of big data-empowered analytics in LHS. The big data architecture at Mayo consists of three layers: (i) a data ingestion layer that reads data from real-time EMR feeds and from archived data, (ii) a big data analytics layer that performs stream processing to analyze the data, and (iii) a data storage and retrieval layer that stores the information and knowledge generated through big data analytics and facilitates retrieval at the appropriate time for clinical use.
Figure 4
Data-empowered NLP architecture. The Apache Storm topology consists of the following components: (i) Spouts: stream data from their respective data sources; (ii) Bolts: processing units, each often dedicated to a single type of process; (iii) HDFS: currently used for archiving; (iv) Elasticsearch: retrieves data from the archive.
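The spout-and-bolt structure described in the Figure 4 caption can be illustrated with a minimal sketch in plain Python. This does not use the Apache Storm API and is not the authors' implementation: the function names, the two-term vocabulary, the in-memory `archive` list (standing in for HDFS), and the `index` dictionary (standing in for Elasticsearch) are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Document = Dict[str, Any]


def note_spout(source: Iterable[Document]) -> Iterator[Document]:
    """Spout (hypothetical): emit documents one at a time from a data source."""
    for doc in source:
        yield doc


def tokenize_bolt(doc: Document) -> Document:
    """Bolt (hypothetical): lowercase, split on whitespace, strip punctuation."""
    tokens = [t.strip(".,;:") for t in doc["text"].lower().split()]
    return {**doc, "tokens": tokens}


def concept_bolt(doc: Document) -> Document:
    """Bolt (hypothetical): flag concepts by lookup in a toy vocabulary."""
    vocabulary = {"diabetes", "hypertension"}  # illustrative only
    return {**doc, "concepts": sorted(set(doc["tokens"]) & vocabulary)}


def run_topology(
    source: Iterable[Document],
    bolts: List[Callable[[Document], Document]],
    archive: List[Document],
    index: Dict[str, List[str]],
) -> None:
    """Pass each emitted document through every bolt, then store and index it."""
    for doc in note_spout(source):
        for bolt in bolts:
            doc = bolt(doc)
        archive.append(doc)  # stands in for the HDFS archive
        for concept in doc["concepts"]:
            # stands in for an Elasticsearch index used for retrieval
            index.setdefault(concept, []).append(doc["id"])
```

Each bolt does one job, so steps can be added, removed, or reordered without touching the others; in a real stream processor the same separation is what allows each step to be parallelized independently.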
Figure 5
MEA workflow architecture. The MEA (MayoExpertAdvisor) workflow consists of three components: (i) MedTagger, a clinical NLP pipeline that reads data from clinical notes, radiology notes, ECG text, and other reports and identifies data elements; (ii) web services that aggregate information from both the NLP pipeline and structured data sources, such as laboratory values and patient-provided information, to synthesize concept assertions at the patient level; and (iii) a decision rule system that takes the synthesized information and generates care recommendations for the clinician at the point of care.
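The three-stage flow in the Figure 5 caption (extracted concepts, patient-level aggregation, decision rule) can be sketched as follows. This is a toy illustration under stated assumptions, not the MEA system: the `PatientRecord` type, the `aggregate` and `recommend` functions, the concept names, and the laboratory threshold are all hypothetical placeholders and carry no clinical meaning.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PatientRecord:
    """Patient-level view combining NLP output and structured data (hypothetical)."""
    patient_id: str
    nlp_concepts: Set[str] = field(default_factory=set)
    labs: Dict[str, float] = field(default_factory=dict)


def aggregate(
    patient_id: str, note_concepts: List[Set[str]], labs: Dict[str, float]
) -> PatientRecord:
    """Merge per-note concept sets and structured lab values into one record."""
    merged: Set[str] = set().union(*note_concepts)
    return PatientRecord(patient_id, merged, dict(labs))


def recommend(record: PatientRecord) -> List[str]:
    """Apply a single made-up rule; the threshold is a placeholder, not guidance."""
    recommendations = []
    if "diabetes" in record.nlp_concepts and record.labs.get("hba1c", 0.0) >= 8.0:
        recommendations.append("review glycemic management")
    return recommendations
```

Keeping aggregation separate from the rules means a rule only ever sees one consolidated record per patient, regardless of how many notes or data sources contributed to it.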
Figure 6
Average processing time in different server environments. Time taken for big data-empowered NLP to process 20,000 documents in three different server environments. On average, (i) the standalone server takes 23.97 minutes to complete the NLP process, (ii) data stage takes the longest at 85.67 minutes, and (iii) big data takes 20.03 minutes for the same task.
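To put the Figure 6 timings on a common scale, the short sketch below computes the speedup of the big data environment over each alternative and the throughput in documents per minute. The three averages and the document count come from the caption; the variable and function names are assumptions.

```python
from typing import Dict

# Average minutes to process 20,000 documents, as reported in the caption.
MINUTES: Dict[str, float] = {
    "standalone server": 23.97,
    "data stage": 85.67,
    "big data": 20.03,
}
DOCUMENTS = 20_000


def speedup_over(baseline: str, times: Dict[str, float]) -> Dict[str, float]:
    """Ratio of each environment's time to the baseline environment's time."""
    return {name: minutes / times[baseline] for name, minutes in times.items()}


def throughput(times: Dict[str, float], documents: int) -> Dict[str, float]:
    """Documents processed per minute in each environment."""
    return {name: documents / minutes for name, minutes in times.items()}


speedup = speedup_over("big data", MINUTES)
docs_per_minute = throughput(MINUTES, DOCUMENTS)

for name in MINUTES:
    print(f"{name}: {speedup[name]:.2f}x big data time, "
          f"{docs_per_minute[name]:.0f} docs/min")
```

On these figures, the big data environment is roughly 4.3 times faster than data stage and about 1.2 times faster than the standalone server.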
Figure 7
Time taken to process 20,000 documents with a varying number of parallel threads in big data. With increasing parallelism, there is a significant drop in the processing time of the NLP-empowered MEA algorithm.
Figure 8
Processing time using 16 threads on a varying number of documents. As the number of documents processed doubles, the processing time increases almost linearly.
