Machine learning-based donor permission extraction from informed consent documents
- PMID: 38102593
- PMCID: PMC10724888
- DOI: 10.1186/s12859-023-05568-7
Abstract
Background: As more clinical trials offer optional participation in the collection of bio-specimens for biobanking, the requirements of informed consent forms are becoming increasingly complex. The aim of this study is to develop an automatic natural language processing (NLP) tool to annotate informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources and manually annotated them, marking sentences that contain permission information about sharing bio-specimens or donor data, or about conducting genetic research or future research using bio-specimens or donor data.
Results: We evaluated a variety of machine learning algorithms, including random forest (RF) and support vector machine (SVM), for the automatic identification of these sentences. A total of 120 informed consent documents containing 29,204 sentences were annotated, of which 1,250 sentences (4.28%) provide answers to a permission question. An SVM model achieved an F1 score of 0.95 in classifying the sentences when using a gold standard, which is a prefiltered corpus containing all relevant sentences.
Conclusions: This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.
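For readers less familiar with the metrics quoted above, the following minimal Python sketch shows how the proportion of permission sentences and an F1 score can be computed. It is illustrative only: the function name `f1_score` and the toy label lists are assumptions made for this example and are not the study's code or data; only the counts 1,250 and 29,204 come from the abstract.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return F1, the harmonic mean of precision and recall, for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts reported in the abstract.
total_sentences = 29204
permission_sentences = 1250
prevalence = permission_sentences / total_sentences

# Hypothetical toy labels (1 = permission-related sentence, 0 = other).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_f1 = f1_score(y_true, y_pred)

print(f"Permission sentences: {prevalence:.2%}")  # 4.28%
print(f"Toy F1 score: {toy_f1:.2f}")              # 0.75
```

Because only about 4% of sentences are positive, plain accuracy would be dominated by the negative class, which is one reason an F1 score is a natural choice for reporting results on a corpus like this.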
Keywords: Informed consent; Machine learning; Natural language processing; Text classification.
© 2023. The Author(s).
Conflict of interest statement
The authors declare no competing interests.
Similar articles
- Simplifying informed consent for biorepositories: stakeholder perspectives. Genet Med. 2010 Sep;12(9):567-72. doi: 10.1097/GIM.0b013e3181ead64d. Genet Med. 2010. PMID: 20697289 Free PMC article.
- Surgical classification using natural language processing of informed consent forms in spine surgery. Neurosurg Focus. 2023 Jun;54(6):E10. doi: 10.3171/2023.3.FOCUS2371. Neurosurg Focus. 2023. PMID: 37283446
- Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records. Ups J Med Sci. 2020 Nov;125(4):316-324. doi: 10.1080/03009734.2020.1792010. Epub 2020 Jul 22. Ups J Med Sci. 2020. PMID: 32696698 Free PMC article.
- Clinical Text Data in Machine Learning: Systematic Review. JMIR Med Inform. 2020 Mar 31;8(3):e17984. doi: 10.2196/17984. JMIR Med Inform. 2020. PMID: 32229465 Free PMC article. Review.
- Machine learning applications to clinical decision support in neurosurgery: an artificial intelligence augmented systematic review. Neurosurg Rev. 2020 Oct;43(5):1235-1253. doi: 10.1007/s10143-019-01163-8. Epub 2019 Aug 17. Neurosurg Rev. 2020. PMID: 31422572
Cited by
- Advancing genome-based precision medicine: a review on machine learning applications for rare genetic disorders. Brief Bioinform. 2025 Jul 2;26(4):bbaf329. doi: 10.1093/bib/bbaf329. Brief Bioinform. 2025. PMID: 40668553 Free PMC article. Review.
- Perspectives of Artificial Intelligence Use for In-House Ethics Checks of Journal Submissions. J Korean Med Sci. 2025 Jun 2;40(21):e170. doi: 10.3346/jkms.2025.40.e170. J Korean Med Sci. 2025. PMID: 40461144 Free PMC article. Review.