2019 Apr 5:7:58.
doi: 10.3389/fbioe.2019.00058. eCollection 2019.

Defending Our Public Biological Databases as a Global Critical Infrastructure

Jacob Caswell et al. Front Bioeng Biotechnol.

Abstract

Progress in modern biology is being driven, in part, by the large amounts of freely available data in public resources such as the International Nucleotide Sequence Database Collaboration (INSDC), the world's primary database of biological sequence (and related) information. INSDC and similar databases have dramatically increased the pace of fundamental biological discovery and enabled a host of innovative therapeutic, diagnostic, and forensic applications. However, as high-value, openly shared resources with a high degree of assumed trust, these repositories bear compelling similarities to the Internet in its early days. Consequently, as public biological databases continue to increase in size and importance, we expect that they will face the same threats as undefended cyberspace. There is a unique opportunity, before a significant breach and loss of trust occurs, to ensure they evolve with quality and security as a design philosophy rather than relying on costly "retrofitted" mitigations. This Perspective surveys some potential quality assurance and security weaknesses in existing open genomic and proteomic repositories, describes methods to reduce the likelihood of both intentional and unintentional errors, and offers recommendations for risk mitigation based on lessons learned from cybersecurity.

Keywords: bioeconomy; biological databases; biosecurity; cyberbiosecurity; cybersecurity; machine learning.
