Haplotype-resolved de novo assembly of a Tujia genome suggests the necessity for high-quality population-specific genome references
- PMID: 35180379
- DOI: 10.1016/j.cels.2022.01.006
Abstract
Even though the human reference genome assembly is continually being improved, it remains debatable whether a population-specific reference is necessary for every ethnic group. Here, we de novo assembled an individual genome (TJ1) from the Tujia population, an ethnic minority group most closely related to the Han Chinese. TJ1 provided a high-quality, haplotype-resolved, chromosome-scale assembly with a scaffold N50 size >78 Mb. Compared with GRCh38 and other de novo assemblies, TJ1 improved short-read mapping, enhanced calling precision for structural variants, and detected rare and low-frequency variants. This revealed fine-scale differences between the closely related Han and Tujia populations, such as population-stratified variants of LCT and UBXN8, and improved screening for ancestry informative markers. We demonstrated that TJ1 could reduce false positives in clinical diagnosis and analyzed the PRSS1-PRSS2 locus as a test case. Our results suggest that population-specific assemblies are necessary for genetic and medical analysis, especially when closely related populations are studied. A record of this paper's transparent peer review process is included in the supplemental information.
Keywords: Tujia; de novo assembly; haploid genome; medical practice; population genetics; population-specific reference genome; structural variation; variant calling.
Copyright © 2022 Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of interests: H.Z. and M.S. are employees of Berry Genomics.
Comment in
- Evaluation of Lou et al.: Proposed benefits of a population-specific genome reference. Cell Syst. 2022 Apr 20;13(4):265-267. doi: 10.1016/j.cels.2022.03.005. PMID: 35447075
