2022 Nov 29;12(12):1974.
doi: 10.3390/jpm12121974.

The Penn Medicine BioBank: Towards a Genomics-Enabled Learning Healthcare System to Accelerate Precision Medicine in a Diverse Population


Anurag Verma et al. J Pers Med.

Abstract

The Penn Medicine BioBank (PMBB) is an electronic health record (EHR)-linked biobank at the University of Pennsylvania (Penn Medicine). A wide variety of health-related information, ranging from diagnosis codes to laboratory measurements, imaging data, and lifestyle information, is integrated with genomic and biomarker data in the PMBB to facilitate discoveries and translational science. To date, 174,712 participants have been enrolled in the PMBB, approximately 30% of whom are of non-European ancestry, making it one of the most diverse medical biobanks. A median of seven years of longitudinal EHR data is available on participants, who also consent to being recontacted. Herein, we describe the operations and infrastructure of the PMBB, summarize the phenotypic architecture of the enrolled participants, and use body mass index (BMI) as a proof-of-concept quantitative phenotype for phenome-wide (PheWAS), laboratory-wide (LabWAS), and genome-wide (GWAS) association studies. The substantial representation of African-American participants in the PMBB addresses the essential need to expand diversity in genetic and translational research. There is a critical need for a "medical biobank consortium" to facilitate replication, increase power for rare phenotypes and variants, and promote harmonized collaboration to optimize the potential for biological discovery and precision medicine.

Keywords: PMBB; biobank; electronic health records; genomics; learning health system; precision medicine.


Conflict of interest statement

SMD receives research funding from RenalytixAI, in-kind research support from Novo Nordisk, and personal consulting fees from Calico Labs. DJR serves on scientific advisory boards for Alnylam, Novartis, Pfizer, and Verve. Regeneron has generated genomic data in PMBB participants. These entities had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of this manuscript; or in the decision to publish these results.

Figures

Figure 1. Recruitment and Demographics. (A) Distribution of enrollment through paper and electronic consent. (B) Cumulative numbers of participants consented and biospecimen samples collected. (C) Distribution of participants by age group and self-reported sex. (D) Distribution of participants by self-reported race. (E) Density of recruitment around the six clinical sites of UPHS in Pennsylvania and New Jersey: Hospital of the University of Pennsylvania, Penn Presbyterian Medical Center, Pennsylvania Hospital, Chester County Hospital, Lancaster General Health, and Princeton Health.
Figure 2. Prevalence of diagnosis codes among PMBB participants, grouped by broader disease domain.
Figure 3. A phenome-wide association between mean body mass index and 1856 EHR-derived phenotypes.
Figure 4. A laboratory-wide association between mean body mass index and 24 laboratory measurements derived from the electronic health records.
Figure 5. Manhattan plot showing association between common genetic variants (MAF > 1%) and BMI.
