Nucleic Acids Res. 2020 Jan 8;48(D1):D971-D976.
doi: 10.1093/nar/gkz829.

PGG.Han: the Han Chinese genome database and analysis platform

Yang Gao et al. Nucleic Acids Res.

Abstract

Although the Han Chinese are the largest ethnic group in the world, they remain underrepresented in global efforts to catalogue the genomic variability of natural populations. Here, we developed the PGG.Han, a population genome database that serves as the central repository for the genomic data of the Han Chinese Genome Initiative (Phase I). In its current version, the PGG.Han archives whole-genome sequences or high-density genome-wide single-nucleotide variants (SNVs) of 114 783 Han Chinese individuals (a.k.a. the Han100K), representing geographical sub-populations covering 33 of the 34 administrative divisions of China, as well as Singapore. The PGG.Han provides: (i) an interactive interface for visualizing the fine-scale genetic structure of the Han Chinese population; (ii) genome-wide allele frequencies of hierarchical sub-populations; (iii) ancestry inference for individual samples and control of population stratification based on nested panels of ancestry informative markers (AIMs); (iv) population-structure-aware shared control data for genotype-phenotype association studies (e.g. GWASs); and (v) a Han-Chinese-specific reference panel for genotype imputation. Computational tools are implemented in the PGG.Han, and a user-friendly online interface is provided for data analysis and visualization of results. The PGG.Han database is freely accessible via http://www.pgghan.org or https://www.hanchinesegenomes.org.
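
To illustrate what allele frequencies of hierarchical sub-populations can mean in practice, here is a minimal Python sketch. It is not the PGG.Han implementation: the group labels, the `PARENT` mapping, the function name, and the counts are all hypothetical. It assumes each leaf sub-population reports an alternate-allele count and a total allele number for one variant, and it sums those counts up the hierarchy before dividing, so larger groups are weighted by sample size.

```python
from collections import defaultdict

# Hypothetical hierarchy (assumption): child group -> parent group.
# Groups absent from the mapping's keys are treated as the root.
PARENT = {
    "sub_A1": "region_A",
    "sub_A2": "region_A",
    "sub_B1": "region_B",
    "region_A": "all",
    "region_B": "all",
}


def hierarchical_allele_frequencies(leaf_counts, parent):
    """Return the allele frequency of one variant for every group.

    leaf_counts maps a leaf group to (alt_allele_count, total_allele_number).
    Counts are added to the leaf and to each of its ancestors, so a parent's
    frequency is the pooled frequency of all samples beneath it.
    """
    totals = defaultdict(lambda: [0, 0])
    for leaf, (alt_count, allele_number) in leaf_counts.items():
        node = leaf
        while node is not None:
            totals[node][0] += alt_count
            totals[node][1] += allele_number
            node = parent.get(node)  # None once the root is passed
    return {
        name: alt / total
        for name, (alt, total) in totals.items()
        if total > 0
    }


# Made-up counts for a single variant (not real data).
counts = {"sub_A1": (30, 200), "sub_A2": (50, 200), "sub_B1": (20, 400)}
freqs = hierarchical_allele_frequencies(counts, PARENT)
for name in sorted(freqs):
    print(f"{name}: {freqs[name]:.3f}")
```

Pooling counts rather than averaging per-group frequencies keeps the parent-level estimate consistent with what would be obtained by genotyping all of its samples together.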

Figures

Figure 1. Sketch map of data integration in the PGG.Han. Using all the whole-genome sequencing data of the Han Chinese samples as a reference panel, the genotype data of 102 586 samples were carefully imputed. The imputation results and the whole-genome sequencing data were further integrated. Strict quality control was applied throughout the process. WGS, whole-genome sequencing.

Figure 2. Sitemap of the PGG.Han. The database consists of two major functional modules: (i) visualizing the population structure and querying allele frequencies of subgroups (red boxes); (ii) online analysis tools (blue boxes). Each analysis tool can be used independently or combined freely with the others. The red arrow represents the recommended workflow. The imputation step is optional and should be contingent upon the details of the dataset of interest. AF, allele frequency; Ref. allele, reference allele.
