This is a preprint.
Enhancing Automatic PT Tagging for MEDLINE Citations Using Transformer-Based Models
- PMID: 40735093
- PMCID: PMC12306818
Abstract
We investigated the feasibility of predicting Medical Subject Headings (MeSH) Publication Types (PTs) from MEDLINE citation metadata using two pre-trained Transformer-based models, BERT and DistilBERT. This study addresses limitations in the current automated indexing process, which relies on legacy natural language processing (NLP) algorithms. We evaluated two designs, monolithic multi-label classifiers and ensembles of binary classifiers, with the aim of improving retrieval of biomedical literature. The results indicate that Transformer models have the potential to significantly improve PT tagging accuracy, paving the way for scalable, efficient biomedical indexing.
Keywords: MEDLINE; Machine Learning; MeSH Publication Types; Natural Language Processing; Pre-trained Foundation Models.
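The two classifier designs named in the abstract differ mainly in how per-PT decisions are produced. The following is a minimal sketch of that difference, not the authors' implementation: the label subset, the 0.5 threshold, the function names, and the keyword-based stand-in scorers are all assumptions made for illustration. In practice each scorer would be a fine-tuned BERT or DistilBERT model.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical subset of MeSH Publication Types, chosen only for illustration.
PT_LABELS = ["Review", "Case Reports", "Randomized Controlled Trial"]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Monolithic design: one model emits one logit per PT.

    Each logit is passed through a sigmoid and thresholded independently,
    so a citation can receive zero, one, or several PTs.
    """
    return [label for label, z in zip(PT_LABELS, logits) if sigmoid(z) >= threshold]


def decode_ensemble(
    text: str,
    scorers: Dict[str, Callable[[str], float]],
    threshold: float = 0.5,
) -> List[str]:
    """Ensemble design: one independent binary classifier per PT.

    Each scorer returns the probability that its PT applies to the text.
    """
    return [label for label, score in scorers.items() if score(text) >= threshold]


# Stand-in scorers (assumption): simple keyword rules in place of trained models.
toy_scorers: Dict[str, Callable[[str], float]] = {
    "Review": lambda t: 0.9 if "review" in t.lower() else 0.1,
    "Case Reports": lambda t: 0.9 if "case report" in t.lower() else 0.1,
}

multilabel_result = decode_multilabel([2.0, -1.0, 0.3])
ensemble_result = decode_ensemble("A systematic review of statin therapy", toy_scorers)
```

A monolithic model shares one encoder across all PTs, while an ensemble lets each PT have its own model and threshold at the cost of running several models per citation.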