Clin Transl Sci. 2022 Jun;15(6):1500-1506.
doi: 10.1111/cts.13268. Epub 2022 Apr 3.

Machine learning approach to identify adverse events in scientific biomedical literature

Sonja Wewering et al. Clin Transl Sci. 2022 Jun.

Abstract

Monitoring the scientific literature for adverse events is a mandatory process in drug marketing surveillance. It is a time-consuming and complex task, required for regulatory compliance and, most importantly, for patient safety. A machine learning (ML) algorithm was therefore trained to support this manual intellectual review process by automatically classifying literature articles into two types: "relevant" articles, which report any kind of drug safety relevant information, and "not relevant" articles, which do not report an adverse drug reaction. The review process consists of many rules and aspects that need to be taken into consideration, so thousands of documents from previous screenings were used to train the algorithm. After several iterations of adjustment and fine-tuning, the ML approach proved to be a valuable aid for pre-sorting articles into "relevant" and "not relevant" and thereby supporting the intellectual review process.

Conflict of interest statement

Sonja Wewering, Claudia Pietsch, and Anna‐Theresa Lulf‐Averhoff are employees of Bayer AG. All other authors declared no competing interests for this work.

Figures

FIGURE 1
Approach to evaluate the automatic classification against the previous decisions of human reviewers. Randomly remove 10% of the past decisions from the examples and train the machine on the remaining 90% only. Then let the machine categorize the held-out 10% automatically and compare its output with the intellectual decisions for the same articles. ML, machine learning
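
The 90/10 evaluation described in this caption can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `holdout_split` and `agreement`, the fixed random seed, and the toy `past_decisions` data are all assumptions made for the example.

```python
import random


def holdout_split(examples, holdout_fraction=0.1, seed=0):
    """Randomly set aside a fraction of past decisions for evaluation.

    Returns (train, holdout). The seed is fixed only to make the
    sketch reproducible; it is an assumption, not part of the paper.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def agreement(machine_labels, human_labels):
    """Share of held-out articles where machine and reviewer agree."""
    if len(machine_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("no labels to compare")
    matches = sum(m == h for m, h in zip(machine_labels, human_labels))
    return matches / len(human_labels)


# Hypothetical past decisions: (article id, reviewer label).
past_decisions = [
    (f"article-{i}", "relevant" if i % 4 == 0 else "not relevant")
    for i in range(100)
]
train, holdout = holdout_split(past_decisions)
```

A classifier would be trained on `train` only, asked to label the articles in `holdout`, and its labels passed to `agreement` together with the reviewers' labels for the same articles.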
FIGURE 2
Classification of data by SVM. SVM finds a linear separation (hyperplane) that divides the dataset into two classes. SVM, Support Vector Machine
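
To make the idea of a separating hyperplane concrete, the sketch below shows how a point is assigned to one of the two classes once a hyperplane is known. It covers only the decision step, not SVM training, and the weights, bias, and example points are hand-picked assumptions rather than values from the paper.

```python
def decision_value(weights, bias, point):
    """Signed score w . x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(weights, bias, point):
    """Map the side of the hyperplane to one of the two article classes."""
    if decision_value(weights, bias, point) >= 0:
        return "relevant"
    return "not relevant"


# Hypothetical hyperplane in two dimensions: the line x1 = x2.
weights = [1.0, -1.0]
bias = 0.0

label_a = classify(weights, bias, [3.0, 1.0])  # score 2.0, non-negative side
label_b = classify(weights, bias, [1.0, 3.0])  # score -2.0, negative side
```

Training an SVM amounts to choosing `weights` and `bias` from labeled examples so that the two classes are separated with as wide a margin as possible; that step is omitted here.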
FIGURE 3
SVM application to textual data. To apply SVMs to classify textual data, like the titles and abstracts of clinical studies, the texts need to be represented as points in a geometrical space (numerical vectors). SVM, Support Vector Machine
FIGURE 4
Convert text to a geometric point with the bag‐of‐words representation. Tokenization extracts the words, each word is reduced to its stem form, and the counts of the different stems give the coordinates of the point
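
The three steps named in this caption (tokenize, stem, count) can be sketched as below. The suffix-stripping rule in `crude_stem` is a deliberately simple stand-in chosen for illustration; it is an assumption, not the stemmer used in the paper, and the sample sentence is invented.

```python
import re
from collections import Counter

# Hypothetical, very crude suffix list; a real system would use a proper stemmer.
SUFFIXES = ("ing", "ed", "s")


def crude_stem(word):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text):
    """Tokenize into lowercase words, stem them, and count each stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(token) for token in tokens)


def to_vector(counts, vocabulary):
    """Turn stem counts into a numerical vector over a fixed vocabulary order."""
    return [counts[stem] for stem in vocabulary]


text = "Patients reported headaches; one patient reported a headache."
counts = bag_of_words(text)
vocabulary = sorted(counts)  # ['a', 'headache', 'one', 'patient', 'report']
vector = to_vector(counts, vocabulary)  # [1, 2, 1, 2, 2]
```

Each title or abstract then becomes one such vector over a shared vocabulary, which is the form of input a linear classifier such as an SVM expects.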
