Review
Hum Genomics. 2018 Apr 10;12(1):19.
doi: 10.1186/s40246-018-0147-5.

The development of large-scale de-identified biomedical databases in the age of genomics-principles and challenges

Fida K Dankar et al. Hum Genomics. 2018.

Abstract

Contemporary biomedical databases include a wide range of information types from various observational and instrumental sources. Among the most important features that unite biomedical databases across the field are a high volume of information and a high potential to cause damage through data corruption, loss of performance, and loss of patient privacy. Thus, data governance and privacy protection are essential to the construction of data depositories for biomedical research and healthcare. In this paper, we discuss the challenges of data governance in the context of population genome projects. These challenges, along with best practices and current research efforts, are discussed across the steps of data collection, storage, sharing, analysis, and knowledge dissemination.

Keywords: Biomedical database; Data governance; Data privacy; Whole genome sequencing.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1. Secure storage strategy for a large-scale population sequencing project. All data is stored in a secure data center with partial mirroring for research on site, partial archival mirroring for backup at geographically distant remote sites within the country, and additional mirror copy for protection against unforeseeable rare catastrophic (aka “Black Swan”) events.
Fig. 2. De-identification of clinical data
Fig. 3. Framework for the secure multiparty computation
