Cell Genom. 2021 Nov 10;1(2).
doi: 10.1016/j.xgen.2021.100028.

The Data Use Ontology to streamline responsible access to human biomedical datasets

Jonathan Lawson et al. Cell Genom.

Abstract

Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information about individual participants. Thus, care must be taken when they are processed and made available, to comply with ethical and regulatory frameworks and with the data conditions set out in informed consent. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human- and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used to annotate more than 200,000 datasets worldwide. Using DUO in data management and access helps researchers discover and access relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.

Keywords: FAIR; GA4GH; automated data access; consent; controlled access; data access; data restrictions; ontology; secondary data use; standard.

Conflict of interest statement

M.N.C. is an employee of Foundation Medicine and an equity holder of Roche. A.A.P. is a venture partner at GV and an employee of Alphabet Corporation. He has received funding from MSFT, Verily, IBM, Intel, Bayer, and Novartis. The views expressed by L.L.R. are the author’s own and do not necessarily represent those of her organization.

Figures

Figure 1
Data Use Ontology permissions and modifiers. DUO is a hierarchical vocabulary of data use terms most often used to denote secondary usage conditions for controlled access datasets. DUO does not aim to represent all possible data use terms, consent phrases, or complex logical permutations of permissions, limitations, or requirements. As of June 2021, DUO contains 25 terms representing two types of data use terms: permissions and modifiers. Permissions such as General Research Use (GRU), Health or Medical or Biomedical use (HMB), Disease Specific research (DS), and Population Origins and Ancestry research (POA) standardize allowed usage of the datasets. Modifiers are used to further qualify main categories of controlled access.
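The split between permissions and modifiers described in this caption can be pictured as a small data structure. The following is a minimal sketch, not part of the DUO standard or of any GA4GH tooling: the class name `DataUseAnnotation`, its fields, and the `describe` helper are hypothetical, and the modifier code `NPU` in the usage note is an illustrative value that the caption does not name. Only the four permission abbreviations come from the caption.

```python
from dataclasses import dataclass
from typing import Tuple

# Permission abbreviations and labels as named in the Figure 1 caption.
PERMISSIONS = {
    "GRU": "General Research Use",
    "HMB": "Health or Medical or Biomedical use",
    "DS": "Disease Specific research",
    "POA": "Population Origins and Ancestry research",
}


@dataclass(frozen=True)
class DataUseAnnotation:
    """Hypothetical record: one permission plus zero or more modifiers."""

    permission: str
    modifiers: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Reject anything that is not one of the four permissions listed above.
        if self.permission not in PERMISSIONS:
            raise ValueError(f"unknown permission: {self.permission!r}")

    def describe(self) -> str:
        """Return a human-readable label, with modifiers in brackets if any."""
        label = PERMISSIONS[self.permission]
        if not self.modifiers:
            return label
        return f"{label} [{', '.join(self.modifiers)}]"
```

For example, `DataUseAnnotation("HMB", ("NPU",)).describe()` yields a label for an HMB dataset qualified by one modifier, while an unknown permission code raises `ValueError`.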
Figure 2
Browsing the Data Use Ontology. The DUO OWL file has been loaded in human-friendly browsers such as the Ontology Lookup Service (OLS). This enables interactive navigation through the hierarchy and display of additional properties such as definition, comment, or relations to other terms. For example, the entry for the “disease specific research” DUO term, http://purl.obolibrary.org/obo/DUO_0000007, clarifies that the term should be used in conjunction with a term from a disease ontology. The “Preferred root terms” button (middle, active green checkbox) makes the browser show the top-level DUO classes to the user instead of the complex upper-level BFO hierarchy (accessible by selecting “All terms”).
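The term URL in this caption follows the OBO PURL pattern, in which a compact identifier such as `DUO:0000007` maps to `http://purl.obolibrary.org/obo/DUO_0000007`. The sketch below shows that mapping and how a "disease specific research" annotation might carry its accompanying disease term. The function names and the dictionary layout are assumptions made for illustration, not a format defined by DUO, and the disease identifier in the usage note is a placeholder.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"

# Compact identifier for "disease specific research", taken from the URL above.
DISEASE_SPECIFIC = "DUO:0000007"


def curie_to_purl(curie: str) -> str:
    """Turn a compact ID like 'DUO:0000007' into its OBO PURL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"expected PREFIX:ID, got {curie!r}")
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def disease_specific_annotation(disease_curie: str) -> dict:
    """Pair the DS term with a disease ontology term (hypothetical layout)."""
    return {
        "data_use": curie_to_purl(DISEASE_SPECIFIC),
        "disease": curie_to_purl(disease_curie),
    }
```

Calling `disease_specific_annotation("MONDO:0000001")` returns a dictionary holding both PURLs; the MONDO identifier here is only a placeholder for whichever disease ontology term a dataset actually uses.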
Figure 3
Current implementations of the Data Use Ontology. DUO has been implemented to annotate genomics datasets worldwide. As of November 2021, implementers include repositories, databases, and projects in North America, Europe, Africa, Asia, and Australia.
