Cell Genom. 2023 Jul 20;3(8):100361.
doi: 10.1016/j.xgen.2023.100361. eCollection 2023 Aug 9.

Genotyping and population characteristics of the China Kadoorie Biobank

Robin G Walters et al. Cell Genom.

Abstract

The China Kadoorie Biobank (CKB) is a population-based prospective cohort of >512,000 adults recruited from 2004 to 2008 from 10 geographically diverse regions across China. Detailed data from questionnaires and physical measurements were collected at baseline, with additional measurements at three resurveys involving ∼5% of surviving participants. Analyses of genome-wide genotyping, for >100,000 participants using custom-designed Axiom arrays, reveal extensive relatedness, recent consanguinity, and signatures reflecting large-scale population movements from recent Chinese history. Systematic genome-wide association studies of incident disease, captured through electronic linkage to death and disease registries and to the national health insurance system, replicate established disease loci and identify 14 novel disease associations. Together with studies of candidate drug targets and disease risk factors and contributions to international genetics consortia, these demonstrate the breadth, depth, and quality of the CKB data. Ongoing high-throughput omics assays of collected biosamples and planned whole-genome sequencing will further enhance the scientific value of this biobank.

Keywords: GWAS; biobank; cardiovascular health; complex disease; genetic association studies; genetic epidemiology; genetics; genotyping; omics; prospective.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
China Kadoorie Biobank (CKB) survey details. Baseline questionnaire content, physical measurements, on-site assays, and biosample collection were repeated at three subsequent resurveys of approximately 5% of surviving participants. Second and third resurveys used updated questionnaires, included additional physical measurements and on-site assays, and collected additional biosamples, as shown. Participants attending more than one resurvey are as indicated in the Venn diagram. See also Table S1.
Figure 2
Design of the CKB Axiom genotyping array. The figure illustrates the different categories of content on the revised CKB array. Numbers indicate the approximate counts of variants in each category. Some variants fall into more than one category. See also Figures S1–S3, Tables S2 and S3, and Data S1 and S2.
Figure 3
Allele frequency and functional annotation of genotyped variants. (A) Allele frequency distribution in unrelated CKB participants of variants passing QC on the two versions of the CKB genotyping array. (B) Comparison of CKB allele frequency of quality-controlled variants on array version 2 with the corresponding allele in the East Asian subset of the 1000 Genomes Phase 3 reference. (C) Frequency and characteristics of different classes of quality-controlled variants on array version 2, according to Combined Annotation Dependent Depletion (CADD version 1.6). (D) Allele frequency distribution in CKB and European populations of quality-controlled variants on array version 2, for variants with different levels of predicted functional impact according to CADD. See also Figure S4.
Figure 4
National and local population structure in the CKB. (A) Map of China and adjacent countries showing the locations of the ten CKB regional centers (RCs). Arrows denote known major population movements in recent history that can account for mismatches in the correlation between PCA and geography. (B) Plot of the two leading principal components from PCA of CKB genotypes, with each participant color-coded according to the RC where they were recruited. (C) Local maps are shown for each recruitment region, showing the geolocation of individual assessment centers color-coded according to latitude and longitude; the size of the symbol is proportional to the number of genotyped individuals from that center. Corresponding PCA plots show the first two principal components from PCA of individuals from that region, color-coded according to their recruitment center. Top 2 rows, urban regions; bottom 2 rows, rural regions. See also Figures S9, S10, S12, and S13.
Figure 5
Genome-wide significant associations from GWASs of ICD-10-coded disease events. (A) Summary of the minor allele frequency and the effect size for the risk allele, for all associations with ICD-10-coded outcomes reaching genome-wide significance (5 × 10⁻⁸). Symbols are colored according to whether the association has previously been reported, and are sized in proportion to the number of cases in the corresponding GWAS. (B–E) Labels denote newly identified associations with (B) H40 (glaucoma), (C) H43 (disorders of vitreous body), (D) K60 (fissure and fistula of anal and rectal regions), and (E) K81 (cholecystitis), illustrated in the corresponding regional association plots, for which there are previously reported associations with related phenotypes or diseases at the same locus. Further plots for these and all other ICD-10 GWASs are available on the CKB PheWeb browser at pheweb.ckbiobank.org. See also Table S10.
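
As a rough illustration of the genome-wide significance cutoff mentioned in the Figure 5 caption, the sketch below filters a list of association records at 5 × 10⁻⁸. It is not the authors' pipeline. The record layout, the variant names, the p-values, and the function name `genome_wide_hits` are all hypothetical, and the strict inequality (p < threshold) is an assumption about how the cutoff is applied.

```python
# Conventional genome-wide significance threshold quoted in the caption.
GENOME_WIDE_P = 5e-8


def genome_wide_hits(associations, threshold=GENOME_WIDE_P):
    """Return records with p-value strictly below the threshold, smallest first.

    Each record is a dict with a "p" key; this layout is an assumption
    made for illustration only.
    """
    hits = [a for a in associations if a["p"] < threshold]
    return sorted(hits, key=lambda a: a["p"])


# Made-up records: variant names and p-values are placeholders, not CKB results.
example = [
    {"variant": "var_A", "icd10": "H40", "p": 3.2e-9},
    {"variant": "var_B", "icd10": "K81", "p": 4.0e-7},
    {"variant": "var_C", "icd10": "K60", "p": 5e-8},
]

for hit in genome_wide_hits(example):
    print(hit["variant"], hit["icd10"], hit["p"])
```

With a strict inequality, a record sitting exactly on the threshold (`var_C` here) is excluded; an analysis that treats the cutoff as inclusive would use `<=` instead.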
