Nucleic Acids Res. 2022 Jan 7;50(D1):D980-D987.
doi: 10.1093/nar/gkab1059.

The European Genome-phenome Archive in 2021

Mallory Ann Freeberg et al. Nucleic Acids Res.

Erratum in

Abstract

The European Genome-phenome Archive (EGA - https://ega-archive.org/) is a resource for long-term secure archiving of all types of potentially identifiable genetic, phenotypic, and clinical data resulting from biomedical research projects. Its mission is to foster reuse of the data it hosts, enable reproducibility, and accelerate biomedical and translational research in line with the FAIR principles. Launched in 2008, the EGA has grown quickly and currently archives over 4,500 studies from nearly one thousand institutions. The EGA operates a distributed data access model in which requests are made to the data controller, not to the EGA; the submitter therefore retains control over who has access to the data and under which conditions. Given the size and value of the data it hosts, the EGA is constantly improving its value chain, that is, how the EGA can contribute to enhancing the value of human health data by facilitating its submission, discovery, access, and distribution, as well as by leading the design and implementation of the standards and methods necessary to deliver the value chain. The EGA has become a key GA4GH Driver Project, leading multiple development efforts and implementing new standards and tools, and has been appointed as an ELIXIR Core Data Resource.

Figures

Graphical Abstract
The European Genome-phenome Archive serves human genomics and health communities by offering data submission, validation, discovery, and access services to researchers worldwide. Data reuse value is maximised by implementing community standards and ensuring interoperability of data and tools to accelerate biomedical and translational research.
Figure 1.
Data archived at EGA between 2013 and 2021. Cumulative size of data (A), number of studies and datasets (B), and number of files (C) archived and available for download from EGA per year. (D) Number of institutes per country that have archived data at the EGA.
Figure 2.
EGA facilitates the submission, discovery, access, and distribution of sensitive human data. A researcher submits controlled access human genetic, phenotypic and clinical data to EGA after signing a Data Processing Agreement (1). EGA processes, archives, and releases the dataset to be findable. Another researcher discovers data of interest at the EGA (2). They contact the Data Access Committee for the data of interest and agree to the terms of data reuse by signing a Data Access Agreement (3). The Data Access Committee informs EGA that access is approved (4). The EGA grants access to the requesting researcher (5) who can then download and visualise the data (6). GDPR: General Data Protection Regulation.
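The six numbered steps in this caption can be read as a small access-control workflow. The following is a minimal Python sketch of that reading only, not EGA's actual software or API: the class `Archive`, the function `dac_review`, and all identifiers such as `DATASET-1` and `DAC-A` are hypothetical names introduced for illustration. It shows the point of the distributed model, namely that the archive stores and serves the data while the decision to approve access rests with the Data Access Committee.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Toy stand-in for the archive; every name here is hypothetical."""
    # dataset id -> name of the Data Access Committee (DAC) controlling it
    controllers: dict = field(default_factory=dict)
    # (user, dataset id) pairs for which access has been granted
    grants: set = field(default_factory=set)

    def submit(self, dataset_id, dac_name, dpa_signed):
        # Step 1: submission requires a signed Data Processing Agreement.
        if not dpa_signed:
            raise PermissionError("Data Processing Agreement not signed")
        self.controllers[dataset_id] = dac_name

    def discover(self):
        # Step 2: released datasets are findable by any researcher.
        return sorted(self.controllers)

    def record_approval(self, dac_name, user, dataset_id):
        # Steps 4-5: the archive grants access only when the controlling
        # DAC reports an approval.
        if self.controllers.get(dataset_id) != dac_name:
            raise PermissionError("only the controlling DAC can approve")
        self.grants.add((user, dataset_id))

    def download(self, user, dataset_id):
        # Step 6: downloads are refused without a recorded grant.
        if (user, dataset_id) not in self.grants:
            raise PermissionError("access not granted")
        return f"contents of {dataset_id}"


def dac_review(archive, dac_name, user, dataset_id, daa_signed):
    # Step 3: the DAC, not the archive, decides on the request, and only
    # after the requester has signed the Data Access Agreement.
    if not daa_signed:
        return False
    archive.record_approval(dac_name, user, dataset_id)
    return True


archive = Archive()
archive.submit("DATASET-1", "DAC-A", dpa_signed=True)
print(archive.discover())
print(dac_review(archive, "DAC-A", "alice", "DATASET-1", daa_signed=True))
print(archive.download("alice", "DATASET-1"))
```

In this sketch a download attempted before the DAC has approved the request raises `PermissionError`, which mirrors the ordering of steps (3) to (6) in the figure.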
Figure 3.
EGA data distribution to approved researchers between 2011 and 2021. (A) Number of EGA data requester accounts created over time. (B) Amount of data distributed to approved researchers over time.
Figure 4.
The EGA offers a variety of secure data access and download services to meet user needs, many of which implement GA4GH standards. FUSE: Filesystem in Userspace. AAI: Authentication and Authorization Infrastructure. OpenIDC: OpenID Connect, an open standard and decentralized authentication protocol.
