An improved protocol for sequencing of repetitive genomic regions and structural variations using mutagenesis and next generation sequencing
- PMID: 22912860
- PMCID: PMC3422288
- DOI: 10.1371/journal.pone.0043359
Abstract
The rise of Next Generation Sequencing (NGS) technologies has transformed de novo genome sequencing into an accessible research tool, but obtaining high-quality eukaryotic genome assemblies remains a challenge, mostly because of the abundance of repetitive elements. Repeats also make it difficult to study nucleotide polymorphism in repetitive regions, including certain types of structural variation. One solution proposed for resolving such regions is Sequence Assembly aided by Mutagenesis (SAM), which relies on the fact that introducing enough random mutations breaks the repetitive structure, making assembly possible. Sequencing many different mutated copies then permits the original sequence of the repetitive region to be inferred by consensus methods. However, SAM relies on molecular cloning to isolate and amplify individual mutant copies, which makes it hard to scale up for use with high-throughput sequencing technologies. To address this problem, we propose NG-SAM, a modified version of the SAM protocol that relies on PCR and dilution steps only, coupled to an NGS workflow. NG-SAM therefore has the potential to be scaled up, e.g. using emerging microfluidics technologies. We built a realistic simulation pipeline to study the feasibility of NG-SAM, and our results suggest that under appropriate experimental conditions the approach might be successfully put into practice. Moreover, our simulations suggest that NG-SAM is capable of robustly reconstructing a wide range of potential target sequences of varying lengths and repetitive structures.
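The two ideas behind SAM described above (random mutations break repeat structure; a consensus over many mutant copies recovers the original) can be illustrated with a toy sketch. This is not the authors' simulation pipeline: it assumes substitution-only mutagenesis, full-length copies that are already aligned to one another, and no sequencing error, and every name and parameter (`mutate`, `consensus`, `distinct_kmers`, the 10% mutation rate, 31 copies) is chosen purely for illustration.

```python
import random
from collections import Counter

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Substitute each base independently with probability `rate` (assumed model)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Pick one of the three other bases uniformly at random.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(copies):
    """Per-position majority vote over equal-length, pre-aligned copies."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))

def distinct_kmers(seq, k):
    """Number of different k-mers in seq; low values indicate repetitiveness."""
    return len({seq[i:i + k] for i in range(len(seq) - k + 1)})

rng = random.Random(0)          # fixed seed so the sketch is reproducible
unit = "ACGTTGCA"
target = unit * 20              # hypothetical 160 bp perfect tandem repeat

# Mutagenesis: each copy carries its own random pattern of substitutions.
copies = [mutate(target, 0.10, rng) for _ in range(31)]

# The mutations make each copy far less repetitive than the target.
print("distinct 8-mers in target:", distinct_kmers(target, 8))
print("distinct 8-mers in one mutant copy:", distinct_kmers(copies[0], 8))

# Consensus across copies cancels the independent mutations.
recovered = consensus(copies)
print("target recovered:", recovered == target)
```

In a real experiment each mutant copy would first have to be assembled from short reads and the assembled copies aligned before any consensus could be called; the sketch skips those steps to keep the principle visible.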
Similar articles
- Preparing a re-sequencing DNA library of 2 cancer candidate genes using the ligation-by-amplification protocol by two PCR reactions. Sci China C Life Sci. 2009 May;52(5):483-91. doi: 10.1007/s11427-009-0066-8. Epub 2009 May 27. PMID: 19471873
- De novo construction of a "Gene-space" for diploid plant genome rich in repetitive sequences by an iterative Process of Extraction and Assembly of NGS reads (iPEA protocol) with limited computing resources. BMC Res Notes. 2016 Feb 11;9:81. doi: 10.1186/s13104-016-1903-z. PMID: 26864345 Free PMC article.
- ReRep: computational detection of repetitive sequences in genome survey sequences (GSS). BMC Bioinformatics. 2008 Sep 9;9:366. doi: 10.1186/1471-2105-9-366. PMID: 18782453 Free PMC article.
- PacBio Sequencing and Its Applications. Genomics Proteomics Bioinformatics. 2015 Oct;13(5):278-89. doi: 10.1016/j.gpb.2015.08.002. Epub 2015 Nov 2. PMID: 26542840 Free PMC article. Review.
- Effects of genome structure variation, homeologous genes and repetitive DNA on polyploid crop research in the age of genomics. Plant Sci. 2016 Jan;242:37-46. doi: 10.1016/j.plantsci.2015.09.017. Epub 2015 Sep 26. PMID: 26566823 Review.
Cited by
- Genetic Diversity and Gene Family Expansions in Members of the Genus Entamoeba. Genome Biol Evol. 2019 Mar 1;11(3):688-705. doi: 10.1093/gbe/evz009. PMID: 30668670 Free PMC article.
- Facilitated sequence counting and assembly by template mutagenesis. Proc Natl Acad Sci U S A. 2014 Oct 28;111(43):E4632-7. doi: 10.1073/pnas.1416204111. Epub 2014 Oct 13. PMID: 25313059 Free PMC article.