J Biomed Semantics. 2022 Oct 27;13(1):26. doi: 10.1186/s13326-022-00280-6.

We are not ready yet: limitations of state-of-the-art disease named entity recognizers

Lisa Kühnel et al. J Biomed Semantics.

Abstract

Background: Biomedical natural language processing has been the subject of intense research. Since the breakthrough of transfer learning-based methods, BERT models have been used in a variety of biomedical and clinical applications. On the available data sets, these models show excellent results, partly exceeding the inter-annotator agreement. However, biomedical named entity recognition applied to COVID-19 preprints shows a performance drop compared to the results on standard test data. This raises the question of how well trained models predict on completely new data, i.e. how well they generalize.

Results: Using disease named entity recognition as an example, we investigate the robustness of different machine learning-based methods, including transfer learning, and show that current state-of-the-art methods work well for a given training set and its corresponding test set but show a significant lack of generalization when applied to new data.

Conclusions: We argue that larger annotated data sets are needed for both training and testing. We therefore foresee the curation of further data sets and the investigation of continual learning processes for machine learning-based models.
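The generalization gap described above is typically measured with strict entity-level precision, recall, and F1, the standard metric for disease NER benchmarks such as NCBI-Disease and BC5CDR: a prediction counts only if its span boundaries and type match a gold annotation exactly. A minimal sketch of this metric follows; all span data below is invented for illustration and is not taken from the paper.

```python
def entity_f1(gold, pred):
    """Strict-match precision, recall, F1 over (start, end, type) spans.

    A predicted span is a true positive only if an identical
    (start, end, type) tuple appears in the gold annotations.
    """
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical gold disease annotations for one document.
gold = [(0, 12, "Disease"), (40, 55, "Disease"),
        (70, 82, "Disease"), (90, 99, "Disease")]

# In-corpus setting: predictions closely match the gold spans.
pred_in = [(0, 12, "Disease"), (40, 55, "Disease"), (70, 82, "Disease")]

# Cross-corpus setting: the same model misses more spans, and one
# boundary is off by a single character (41 vs. 40), so it scores zero.
pred_cross = [(0, 12, "Disease"), (41, 55, "Disease")]

p_in, r_in, f_in = entity_f1(gold, pred_in)
p_x, r_x, f_cross = entity_f1(gold, pred_cross)
print(f"in-corpus F1:    {f_in:.2f}")     # 0.86
print(f"cross-corpus F1: {f_cross:.2f}")  # 0.33
```

The off-by-one boundary error in the cross-corpus predictions shows why strict matching makes the generalization drop so visible: a nearly correct span contributes nothing to the score.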

Keywords: BERT; bioNLP; manual curation; text mining.


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1: Semantic comparison of the NCBI and BC5CDR corpora on the disease mention and concept level. The training sets are compared to their corresponding test sets. Additionally, the two training sets are compared to the test sets of the respective other corpus.
Fig. 2: Comparison of the data sets with scattertext. Each axis shows the frequency of a term in the given documents. In Fig. 2a, the BC5CDR training set is compared to its test set, whereas in Fig. 2b, the BC5CDR training set is compared to the NCBI training set. In Figs. 2c and 2d, the BC5CDR and NCBI training sets are compared against a randomly chosen PubMed corpus of similar size.
Fig. 3: NER results for all tested ML algorithms. The F1-score is shown both for the test set belonging to the training set (corresponding test set) and for the test set of the respective other data set.

