Genetics. 2022 Feb 4;220(2):iyab227.
doi: 10.1093/genetics/iyab227.

A reference-quality, fully annotated genome from a Puerto Rican individual

Aleksey V Zimin et al. Genetics.

Abstract

Until 2019, the human genome was available in only one fully annotated version, GRCh38, which was the result of 18 years of continuous improvement and revision. Despite dramatic improvements in sequencing technology, no other genome was available as an annotated reference until 2019, when the genome of an Ashkenazi individual, Ash1, was released. In this study, we describe the assembly and annotation of a second individual genome, from a Puerto Rican individual whose DNA was collected as part of the Human Pangenome project. The new genome, called PR1, is the first true reference genome created from an individual of African descent. Due to recent improvements in both sequencing and assembly technology, and particularly to the use of the recently completed CHM13 human genome as a guide to assembly, PR1 is more complete and more contiguous than either GRCh38 or Ash1. Annotation revealed 37,755 genes (of which 19,999 are protein coding), including 12 additional gene copies that are present in PR1 and missing from CHM13. Fifty-seven genes have fewer copies in PR1 than in CHM13, 9 map only partially, and 3 genes (all noncoding) from CHM13 are entirely missing from PR1.

Keywords: DNA sequencing; annotation; genome assembly; reference genome; variant calling.

Figures

Figure 1
IBS relationship between PR1 and 2371 individuals of African/African-American (afr), Non-Finnish European (nfe), and East Asian (eas) descent in the 1KG and HGDP callset. Population labels were obtained from gnomAD v3.1 (Karczewski et al. 2020). The x-axis ratio has two components, IBS0 and IBS2*. IBS0 is the aggregate count of sites at which a given individual shares no alleles with PR1 (i.e., PR1 is AA and the target individual is BB). IBS2* is the aggregate count of sites at which both PR1 and the target individual are heterozygous and therefore share two alleles (genotypes AB and AB). The y-axis ratio (IBS1het2/IBS1het1) is a proxy for heterozygosity and the level of genetic variation. IBS1het1 is the aggregate count of sites at which PR1 is heterozygous, the target individual is homozygous, and they share one allele (i.e., AB and AA, respectively). IBS1het2 is the aggregate count of sites at which PR1 is homozygous, the target individual is heterozygous, and they share one allele (i.e., AA and AB, respectively). As expected for parent–child relationships, the IBS2* ratio for PR1’s parents, who appear as two dots on the far right of the plot, is 1.
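
The four counts defined in the caption can be tallied directly from paired genotypes. The following is a minimal sketch, not the authors' pipeline. It assumes biallelic sites coded as alternate-allele counts (0 = AA, 1 = AB, 2 = BB), with `None` marking a missing call. It also assumes the x-axis ratio is IBS2*/(IBS0 + IBS2*), a form the caption does not state explicitly but which is consistent with the value of 1 reported for PR1's parents. The function names `ibs_counts` and `ibs_ratios` are hypothetical.

```python
def ibs_counts(pr1, target):
    """Tally IBS categories over paired genotypes coded as 0 (AA), 1 (AB), 2 (BB)."""
    counts = {"IBS0": 0, "IBS2_star": 0, "IBS1het1": 0, "IBS1het2": 0}
    for a, b in zip(pr1, target):
        if a is None or b is None:
            continue  # skip sites with a missing call in either sample
        if {a, b} == {0, 2}:
            counts["IBS0"] += 1        # opposite homozygotes: no shared allele
        elif a == 1 and b == 1:
            counts["IBS2_star"] += 1   # both heterozygous: two shared alleles
        elif a == 1:
            counts["IBS1het1"] += 1    # PR1 heterozygous, target homozygous
        elif b == 1:
            counts["IBS1het2"] += 1    # PR1 homozygous, target heterozygous
        # identical homozygotes (AA/AA or BB/BB) fall in none of the four categories
    return counts


def ibs_ratios(counts):
    """Return (x, y) plot coordinates; None where a denominator is zero.

    x = IBS2* / (IBS0 + IBS2*)   (assumed form of the x-axis ratio)
    y = IBS1het2 / IBS1het1      (as given in the caption)
    """
    x_den = counts["IBS0"] + counts["IBS2_star"]
    x = counts["IBS2_star"] / x_den if x_den else None
    y_den = counts["IBS1het1"]
    y = counts["IBS1het2"] / y_den if y_den else None
    return x, y
```

Under these assumptions, a parent and child can never be opposite homozygotes at a correctly called site, so IBS0 is 0 and the x-axis ratio is 1 whenever at least one doubly heterozygous site exists.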

References

    1000 Genomes Project Consortium. 2015. A global reference for human genetic variation. Nature. 526:68–74.
    Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, et al. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.
    Busby GB, Brisighelli F, Sanchez-Diz P, Ramos-Luis E, Martinez-Cadenas C, et al. 2012. The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269. Proc Biol Sci. 279:884–892.
    Cerezo M, Achilli A, Olivieri A, Perego UA, Gomez-Carballa A, et al. 2012. Reconstructing ancient mitochondrial DNA links between Africa and Europe. Genome Res. 22:821–826.
    Cheng H, Concepcion GT, Feng X, Zhang H, Li H. 2021. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 18:170–175.