BMC Bioinformatics. 2023 Dec 15;24(Suppl 3):477.
doi: 10.1186/s12859-023-05568-7.

Machine learning-based donor permission extraction from informed consent documents

Meng Zhang et al. BMC Bioinformatics.

Abstract

Background: As more clinical trials offer optional participation in the collection of bio-specimens for biobanking, the requirements of informed consent forms are becoming increasingly complex. The aim of this study is to develop an automatic natural language processing (NLP) tool that annotates informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources and manually annotated them, marking sentences that contain permission information about sharing bio-specimens or donor data, or about conducting genetic research or future research using bio-specimens or donor data.

Results: We evaluated a variety of machine learning algorithms, including random forest (RF) and support vector machine (SVM), for the automatic identification of these sentences. In total, 120 informed consent documents containing 29,204 sentences were annotated, of which 1,250 sentences (4.28%) provide answers to a permission question. The SVM model achieved an F1 score of 0.95 on classifying the sentences when evaluated on a gold standard, that is, a prefiltered corpus containing all relevant sentences.
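As a rough illustration of the figures reported above, the following minimal sketch recomputes the share of permission-related sentences from the stated counts and shows how an F1 score is derived from precision and recall. The function name `f1_score` and the toy label lists are hypothetical and are not taken from the study; the study's actual evaluation code and data are not shown in this abstract.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts stated in the abstract.
TOTAL_SENTENCES = 29204
PERMISSION_SENTENCES = 1250
prevalence = PERMISSION_SENTENCES / TOTAL_SENTENCES

# Hypothetical toy labels (1 = permission-related sentence), for illustration only.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0]
toy_f1 = f1_score(toy_true, toy_pred)

# A classifier that never predicts the positive class gets F1 = 0,
# even though it would be right on most sentences in an imbalanced corpus.
always_negative_f1 = f1_score(toy_true, [0] * len(toy_true))

if __name__ == "__main__":
    print(f"Share of permission sentences: {prevalence:.2%}")
    print(f"Toy F1: {toy_f1:.3f}")
    print(f"Always-negative F1: {always_negative_f1:.3f}")
```

Because only about 4% of sentences are positive, plain accuracy would reward a model that labels everything as irrelevant, which is one reason F1 on the positive class is a common choice for this kind of task.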

Conclusions: This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.

Keywords: Informed consent; Machine learning; Natural language processing; Text classification.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Screenshot of annotation in CLAMP
Fig. 2. Data distribution
Fig. 3. Sentence length distribution
Fig. 4. Number of sentences related to each permission question
Fig. 5. Model accuracies based on permission type
