BMC Bioinformatics. 2025 Jan 30;26(1):34.
doi: 10.1186/s12859-024-06008-w.

Biomedical named entity recognition using improved green anaconda-assisted Bi-GRU-based hierarchical ResNet model

Ram Chandra Bhushan et al. BMC Bioinformatics.

Abstract

Background: Biomedical text mining extracts essential information from scientific articles, and named entity recognition (NER) is one of its core techniques. Traditional NER methods rely on dictionaries, rules, or curated corpora, which may not always be accessible. Deep learning (DL) methods have emerged to overcome these limitations. However, DL-based NER methods can struggle to identify long-distance relationships within text, and they require large annotated datasets.

Results: This study proposes a novel model to address these challenges: the Improved Green Anaconda-assisted Bi-GRU-based Hierarchical ResNet BNER model (IGa-BiHR BNERM). The MACCROBAT dataset was obtained from Kaggle and underwent several pre-processing steps (stop-word filtering, WordNet processing, removal of non-alphanumeric characters, stemming, segmentation, and tokenization) to standardize the text and improve its quality. The pre-processed text was fed into a feature extraction model, the Robustly Optimized BERT Whole Word Masking (ROBERT-WWM) model, which provides word embeddings that carry semantic information. The BNER step then used the IGa-BiHR BNERM, which showed promising results in accurately identifying named entities.
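The pre-processing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the stop-word list, the suffix-stripping rules, and the function names (`segment`, `clean`, `stem`, `preprocess`) are all assumptions made for the example, and the WordNet processing step is omitted because the abstract does not say what it involves.

```python
import re

# Tiny illustrative stop-word list (assumption; the paper's list is not given).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "with", "was", "is", "to", "for"}

# Naive suffix rules standing in for a real stemmer (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def segment(text):
    """Split raw text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def clean(sentence):
    """Lowercase the sentence and replace non-alphanumeric characters with spaces."""
    return re.sub(r"[^a-z0-9\s]", " ", sentence.lower())


def stem(token):
    """Strip one common suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Return one list of stemmed, stop-word-filtered tokens per sentence."""
    result = []
    for sentence in segment(text):
        tokens = clean(sentence).split()  # whitespace tokenization
        result.append([stem(t) for t in tokens if t not in STOP_WORDS])
    return result


if __name__ == "__main__":
    print(preprocess("The patient was treated with aspirin. Symptoms resolved!"))
```

In practice a library stemmer and a subword tokenizer matched to the embedding model would likely replace the hand-written `stem` and whitespace split used here.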

Conclusion: To improve the training phase of the IGa-BiHR BNERM, the Improved Green Anaconda Optimization (IGAO) technique was used to select optimal weight coefficients for the model. When tested on the MACCROBAT dataset, the model outperformed previous models with an accuracy of 99.11%. The model effectively and accurately identifies biomedical named entities within text, significantly advancing this field.
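For readers unfamiliar with the reported measures, the sketch below shows how accuracy, and the PPV, FPR and FNR mentioned in the figure captions, are conventionally derived from confusion-matrix counts. It assumes token-level, one-vs-rest scoring for a single label; the abstract does not state how the authors computed their figures, and the tag names and function names here are hypothetical.

```python
def confusion_counts(gold, pred, positive):
    """Count TP, FP, TN, FN for one label, treating all other labels as negative."""
    tp = fp = tn = fn = 0
    for g, p in zip(gold, pred):
        if p == positive and g == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif g == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Derive standard rates from the counts; a zero denominator yields 0.0."""
    def safe(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": safe(tp + tn, tp + fp + tn + fn),
        "ppv": safe(tp, tp + fp),   # positive predictive value (precision)
        "fpr": safe(fp, fp + tn),   # false positive rate
        "fnr": safe(fn, fn + tp),   # false negative rate
    }


if __name__ == "__main__":
    # Hypothetical token-level tags, for illustration only.
    gold = ["B-Drug", "O", "O", "B-Drug", "O"]
    pred = ["B-Drug", "O", "B-Drug", "O", "O"]
    print(rates(*confusion_counts(gold, pred, "B-Drug")))
```

Token-level accuracy on NER data is often dominated by the majority "O" class, which is one reason PPV and the error rates are usually reported alongside it.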

Keywords: Bi-GRU; Biomedical name; Hierarchical ResNet; IGAO; ROBERT-WWM; Recognition; Word embedding.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Workflow of the proposed technique
Fig. 2. Structural diagram of ROBERT-WWM model
Fig. 3. IGa-BiHR BNERM model
Fig. 4. IGa-BiHR BNERM model recognizes the biomedical name entity in the MACCROBAT dataset
Algorithm 1. Pseudocode of the proposed IGAO algorithm
Fig. 5. a, b: Analysis of proposed and existing model performance
Fig. 6. Error analysis of the proposed and existing model
Fig. 7. a, b: Analysis of the PPV and NPR
Fig. 8. a, b: Analysis of the FPR and FNR
Fig. 9. Analysis of execution time
Fig. 10. Analysis of the AUC curve
Fig. 11. a, b: Analysis of training and testing of accuracy and loss curve
Fig. 12. Contribution of individual components to the overall performance
