Single-molecule sequencing resolves the detailed structure of complex satellite DNA loci in Drosophila melanogaster
- PMID: 28373483
- PMCID: PMC5411766
- DOI: 10.1101/gr.213512.116
Single-molecule sequencing resolves the detailed structure of complex satellite DNA loci in Drosophila melanogaster
Abstract
Highly repetitive satellite DNA (satDNA) repeats are found in most eukaryotic genomes. SatDNAs are rapidly evolving and have roles in genome stability and chromosome segregation. Their repetitive nature poses a challenge for genome assembly and makes progress on the detailed study of satDNA structure difficult. Here, we use single-molecule sequencing long reads from Pacific Biosciences (PacBio) to determine the detailed structure of all major autosomal complex satDNA loci in Drosophila melanogaster, with a particular focus on the 260-bp and Responder satellites. We determine the optimal de novo assembly methods and parameter combinations required to produce a high-quality assembly of these previously unassembled satDNA loci and validate this assembly using molecular and computational approaches. We determined that the computationally intensive PBcR-BLASR assembly pipeline yielded better assemblies than the faster and more efficient pipelines based on the MHAP hashing algorithm, and it is essential to validate assemblies of repetitive loci. The assemblies reveal that satDNA repeats are organized into large arrays interrupted by transposable elements. The repeats in the center of the array tend to be homogenized in sequence, suggesting that gene conversion and unequal crossovers lead to repeat homogenization through concerted evolution, although the degree of unequal crossing over may differ among complex satellite loci. We find evidence for higher-order structure within satDNA arrays that suggest recent structural rearrangements. These assemblies provide a platform for the evolutionary and functional genomics of satDNAs in pericentric heterochromatin.
© 2017 Khost et al.; Published by Cold Spring Harbor Laboratory Press.
Figures
References
-
- Abad JP, Agudo M, Molina I, Losada A, Ripoll P, Villasante A. 2000. Pericentromeric regions containing 1.688 satellite DNA sequences show anti-kinetochore antibody staining in prometaphase chromosomes of Drosophila melanogaster. Mol Gen Genet 264: 371–377. - PubMed
-
- Berlin K, Koren S, Chin CS, Drake JP, Landolin JM, Phillippy AM. 2015. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol 33: 623–630. - PubMed
-
- Britten RJ, Kohne DE. 1968. Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms. Science 161: 529–540. - PubMed
-
- Brutlag D, Carlson M, Fry K, Hsieh TS. 1977. DNA-sequence organization in Drosophila heterochromatin. Cold Spring Harb Sym 42: 1137–1146. - PubMed
Publication types
MeSH terms
Substances
Grants and funding
LinkOut - more resources
Full Text Sources
Other Literature Sources
Molecular Biology Databases