Review
J Pers Med. 2022 Dec 9;12(12):2040.
doi: 10.3390/jpm12122040.

Biobanking as a Tool for Genomic Research: From Allele Frequencies to Cross-Ancestry Association Studies

Tatyana E Lazareva et al. J Pers Med.

Abstract

In recent years, great advances have been made in the collection, storage, and analysis of biological samples. Large collections of samples, known as biobanks, have been established in many countries. Biobanks typically hold large numbers of biological samples together with associated clinical information; the largest collections include over a million samples. In this review, we summarize the main ways in which biobanks aid medical genetics and genomic research, from providing reference allele frequency information to enabling large-scale cross-ancestry meta-analyses. The largest biobanks vary greatly in collection size and in the amount of available phenotype and genotype data. Nevertheless, all of them are used extensively in genomics, providing a rich resource for genome-wide association analysis, genetic epidemiology, and statistical research into the structure, function, and evolution of the human genome. Recently, multiple research efforts have been based on trans-biobank data integration, which increases sample size and allows robust genetic associations to be identified. We provide prominent examples of such data integration and discuss important caveats that must be taken into account in trans-biobank research.

Keywords: GWAS; allele frequency; biobank; genomics; meta-analysis.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
A diagram showing five major types of biobank applications for genomic research and medical genetics.
Figure 2
Bias in meta-analysis due to differences in data preprocessing. (a) Descriptive statistics of the raw data. Shown are (from left to right) the effect size distribution in FinnGen, the effect size distribution in UKB, and a scatterplot of p-values from UKB and FinnGen. (b) Scatterplots comparing METAL meta-analysis p-values for (from left to right) the inverse variance method using raw data, the sample size-based method, and the inverse variance method using scaled data (scaling was performed by multiplying the UKB effect size by the ratio of the mean effect size in FinnGen to that in UKB). Note the extremely high correlation between the meta-analysis p-values and the UKB p-values when the inverse variance method is applied to the original UKB effects. Phenotypes used for analysis: other (seronegative) rheumatoid arthritis, wide (RHEUMA_OTHER_WIDE) from FinnGen, and other rheumatoid arthritis (M06) from UKB. Data and code used for the analysis presented in this figure are publicly available in the repository at https://github.com/TohaRhymes/meta-analysis-methods-comp, accessed on 27 November 2022.
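
The two weighting schemes compared in Figure 2 can be sketched for a single variant as follows. This is a minimal illustration of the standard fixed-effect formulas, not the authors' code (which is in the repository cited above) and not METAL itself. The function names and the example numbers are hypothetical. In rescale_effects, the use of the mean absolute effect and the matching rescaling of standard errors are assumptions; the caption states only that UKB effect sizes were multiplied by a ratio of mean effect sizes.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse variance meta-analysis for one variant.

    betas and ses hold one effect size and one standard error per study.
    Each study is weighted by 1 / SE^2, so a study reporting much smaller
    standard errors dominates the pooled estimate.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, two_sided_p(beta / se)


def sample_size_meta(betas, ses, ns):
    """Sample size-based meta-analysis for one variant.

    Per-study z-scores (taken here as beta / SE) are combined with
    weights sqrt(N), so the scale of the effect sizes does not matter.
    """
    zs = [b / s for b, s in zip(betas, ses)]
    weights = [math.sqrt(n) for n in ns]
    norm = math.sqrt(sum(w * w for w in weights))
    z = sum(w * zi for w, zi in zip(weights, zs)) / norm
    return z, two_sided_p(z)


def rescale_effects(betas, ses, reference_betas):
    """Bring one study's effects (across variants) onto a reference scale.

    Assumption: the factor is the ratio of mean absolute effect sizes,
    and standard errors are scaled by the same factor so that per-variant
    z-scores are unchanged.
    """
    mean_abs = lambda xs: sum(abs(x) for x in xs) / len(xs)
    factor = mean_abs(reference_betas) / mean_abs(betas)
    return [b * factor for b in betas], [s * factor for s in ses]


# Hypothetical numbers for one variant in two studies (not from the paper).
# The first study reports effects on a scale with far smaller standard errors.
example_betas = [0.50, 0.10]
example_ses = [0.01, 0.10]
example_ns = [5000, 5000]
print(inverse_variance_meta(example_betas, example_ses))
print(sample_size_meta(example_betas, example_ses, example_ns))
```

With inputs like these, the inverse variance estimate sits almost entirely on the first study, which mirrors the near-perfect correlation with UKB p-values noted in the caption. The sample size-based scheme depends only on z-scores and sample sizes, so it is not affected by a difference in effect scale between studies.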
