A comprehensively molecular haplotype-resolved genome of a European individual
- PMID: 21813624
- PMCID: PMC3202284
- DOI: 10.1101/gr.125047.111
Abstract
Independent determination of both haplotype sequences of an individual genome is essential to relate genetic variation to genome function, phenotype, and disease. To address the importance of phase, we have generated the most complete haplotype-resolved genome to date, "Max Planck One" (MP1), by fosmid pool-based next-generation sequencing. Virtually all SNPs (>99%) and 80,000 indels were phased into haploid sequences of up to 6.3 Mb (N50 ~1 Mb). The completeness of phasing allowed determination of the concrete molecular haplotype pairs for the vast majority of genes (81%) including potential regulatory sequences, of which >90% were found to be constituted by two different molecular forms. A subset of 159 genes with potentially severe mutations in either cis or trans configurations exemplified in particular the role of phase for gene function, disease, and clinical interpretation of personal genomes (e.g., BRCA1). Extended genomic regions harboring manifold combinations of physically and/or functionally related genes and regulatory elements were resolved into their underlying "haploid landscapes," which may define the functional genome. Moreover, the majority of genes and functional sequences were found to contain individual or rare SNPs, which cannot be phased from population data alone, emphasizing the importance of molecular phasing for characterizing a genome in its molecular individuality. Our work provides the foundation to understand that the distinction of molecular haplotypes is essential to resolve the (inherently individual) biology of genes, genomes, and disease, establishing a reference point for "phase-sensitive" personal genomics. MP1's annotated haploid genomes are available as a public resource.
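Two quantities in the abstract lend themselves to a small worked illustration: the N50 of phased block lengths, and the cis versus trans configuration of two heterozygous variants in one gene. The sketch below is not the authors' pipeline. The function names `n50` and `configuration`, the tuple encoding of a phased genotype as (allele on haplotype 1, allele on haplotype 2) with 0 for reference and 1 for alternate, and all example values are assumptions made for illustration. It also assumes both variants lie in the same phased block, since phase is otherwise undefined between them.

```python
from typing import List, Tuple


def n50(block_lengths: List[int]) -> int:
    """Return the length L such that blocks of length >= L
    together cover at least half of the total phased length."""
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def configuration(gt_a: Tuple[int, int], gt_b: Tuple[int, int]) -> str:
    """Classify two phased heterozygous genotypes as 'cis' (both
    alternate alleles on the same haplotype) or 'trans' (on opposite
    haplotypes). Hypothetical encoding: (hap1 allele, hap2 allele)."""
    for gt in (gt_a, gt_b):
        if sorted(gt) != [0, 1]:
            raise ValueError("expected a heterozygous biallelic genotype")
    return "cis" if gt_a == gt_b else "trans"


if __name__ == "__main__":
    # Made-up block lengths in Mb, not data from the paper.
    print(n50([6, 3, 2, 1]))
    # Two damaging variants on the same haplotype leave one intact copy.
    print(configuration((0, 1), (0, 1)))
    # On opposite haplotypes, both copies of the gene are affected.
    print(configuration((1, 0), (0, 1)))
```

The distinction matters for interpretation: with two potentially severe mutations in cis, one copy of the gene may remain functional, whereas in trans both copies carry a mutation, which is why unphased genotypes alone cannot separate the two cases.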