Combination of short-read, long-read, and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications
- PMID: 28360231
- PMCID: PMC5411765
- DOI: 10.1101/gr.215095.116
Abstract
Accurate and contiguous genome assembly is key to a comprehensive understanding of the processes shaping genomic diversity and evolution. Yet, it is frequently constrained by constitutive heterochromatin, usually characterized by highly repetitive DNA. As a key feature of genome architecture associated with centromeric and subtelomeric regions, it locally influences meiotic recombination. In this study, we assess the impact of large tandem repeat arrays on the recombination rate landscape in an avian speciation model, the Eurasian crow. We assembled two high-quality genome references using single-molecule real-time sequencing (long-read assembly [LR]) and single-molecule optical maps (optical map assembly [OM]). A three-way comparison including the published short-read assembly (SR) constructed for the same individual allowed assessing assembly properties and pinpointing misassemblies. By combining information from all three assemblies, we characterized 36 previously unidentified large repetitive regions in the proximity of sequence assembly breakpoints, the majority of which contained complex arrays of a 14-kb satellite repeat or its 1.2-kb subunit. Using whole-genome population resequencing data, we estimated the population-scaled recombination rate (ρ) and found it to be significantly reduced in these regions. These findings are consistent with an effect of low recombination in regions adjacent to centromeric or subtelomeric heterochromatin and add to our understanding of the processes generating widespread heterogeneity in genetic diversity and differentiation along the genome. By combining three different technologies, our results highlight the importance of adding a layer of information on genome structure that is inaccessible to each approach independently.
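The comparison of ρ inside and outside the repetitive regions can be illustrated with a minimal sketch. It is not the authors' pipeline: the window coordinates, the ρ values, the region list, and the function names `overlaps`, `split_rho_by_region`, and `permutation_p_value` are all hypothetical, and a simple one-sided permutation test stands in for whatever statistical test the study actually used.

```python
import random
from statistics import mean


def overlaps(window, regions):
    """Return True if the half-open window (start, end) overlaps any region."""
    w_start, w_end = window
    return any(w_start < r_end and r_start < w_end for r_start, r_end in regions)


def split_rho_by_region(windows, regions):
    """Split per-window rho estimates into (inside, outside) lists."""
    inside, outside = [], []
    for start, end, rho in windows:
        target = inside if overlaps((start, end), regions) else outside
        target.append(rho)
    return inside, outside


def permutation_p_value(inside, outside, n_perm=10000, seed=0):
    """One-sided p-value for mean(inside) < mean(outside) under label shuffling."""
    rng = random.Random(seed)
    observed = mean(inside) - mean(outside)
    pooled = list(inside) + list(outside)
    k = len(inside)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count shuffles at least as extreme (as low) as the observed difference.
        if mean(pooled[:k]) - mean(pooled[k:]) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): (start, end, rho) per window on one scaffold.
toy_windows = [
    (0, 100, 0.10), (100, 200, 0.20), (200, 300, 0.10), (300, 400, 0.15),
    (400, 500, 2.0), (500, 600, 2.2), (600, 700, 1.9), (700, 800, 2.1),
]
toy_regions = [(0, 400)]  # hypothetical flank of a tandem repeat array

rho_in, rho_out = split_rho_by_region(toy_windows, toy_regions)
print(mean(rho_in), mean(rho_out), permutation_p_value(rho_in, rho_out, n_perm=2000))
```

Shuffling window labels ignores the spatial autocorrelation of recombination rates along a chromosome, so a real analysis would likely need a block-aware or otherwise more conservative test.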
© 2017 Weissensteiner et al.; Published by Cold Spring Harbor Laboratory Press.
Similar articles
- Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing. Bioinformatics. 2016 Jul 1;32(13):1921-1924. doi: 10.1093/bioinformatics/btw101. Epub 2016 Feb 24. PMID: 27153570 Free PMC article.
- Assembly of chloroplast genomes with long- and short-read data: a comparison of approaches using Eucalyptus pauciflora as a test case. BMC Genomics. 2018 Dec 29;19(1):977. doi: 10.1186/s12864-018-5348-8. PMID: 30594129 Free PMC article.
- HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies. Genome Res. 2017 May;27(5):793-800. doi: 10.1101/gr.214767.116. Epub 2017 Jan 19. PMID: 28104618 Free PMC article.
- HINGE: long-read assembly achieves optimal repeat resolution. Genome Res. 2017 May;27(5):747-756. doi: 10.1101/gr.216465.116. Epub 2017 Mar 20. PMID: 28320918 Free PMC article.
- Long-read sequencing reveals a 4.4 kb tandem repeat region in the mitogenome of Echinococcus granulosus (sensu stricto) genotype G1. Parasit Vectors. 2019 May 16;12(1):238. doi: 10.1186/s13071-019-3492-x. PMID: 31097022 Free PMC article.
Cited by
- The Complete Plastome Sequences of Eleven Capsicum Genotypes: Insights into DNA Variation and Molecular Evolution. Genes (Basel). 2018 Oct 17;9(10):503. doi: 10.3390/genes9100503. PMID: 30336638 Free PMC article.
- Decoding the Role of Satellite DNA in Genome Architecture and Plasticity-An Evolutionary and Clinical Affair. Genes (Basel). 2020 Jan 9;11(1):72. doi: 10.3390/genes11010072. PMID: 31936645 Free PMC article. Review.
- Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data. Brief Bioinform. 2019 May 21;20(3):866-876. doi: 10.1093/bib/bbx147. PMID: 29112696 Free PMC article.
- Chromosome-level reference genome assembly of the gyrfalcon (Falco rusticolus) and population genomics offer insights into the falcon population in Mongolia. Sci Rep. 2025 Feb 4;15(1):4154. doi: 10.1038/s41598-025-88216-9. PMID: 39900672 Free PMC article.
- A test for meiotic drive in hybrids between Australian and Timor zebra finches. Ecol Evol. 2020 Nov 3;10(23):13464-13475. doi: 10.1002/ece3.6951. eCollection 2020 Dec. PMID: 33304552 Free PMC article.