An algorithm for the DNA sequence generation from k-tuple word contents of the minimal number of random fragments
- PMID: 1878166
- DOI: 10.1080/07391102.1991.10507867
Abstract
An algorithm is described for generating a long sequence written in a four-letter alphabet from the constituent k-tuple words found in a minimal number of separate, randomly defined fragments of the starting sequence. It is primarily intended for use in the sequencing by hybridization (SBH) process, a potential method for sequencing human genomic DNA (Drmanac et al., Genomics 4, pp. 114-128, 1989). The algorithm is based on previously defined rules and informative entities of the linear sequence. It requires neither knowledge of the number of occurrences of a given k-tuple in the sequence fragments nor information about which k-tuple words lie at the ends of a fragment. It operates on mixed sets of k-tuples of various lengths. The concept of the algorithm allows it to work with k-tuple sets that contain false positive and false negative k-tuples. The false k-tuples primarily affect the completeness of the generated sequence, and affect its correctness only in specific cases. The algorithm can be used to optimize SBH parameters in simulation experiments, as well as to generate sequences in real SBH experiments on genomic DNA.
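The abstract does not spell out the algorithm's rules, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single word length k (at least 2), an error-free or lightly corrupted k-tuple set, and a simple rule that extends a sequence only while exactly one k-tuple overlaps its end by k-1 letters. The function names `ktuple_set` and `extend_unambiguously` are hypothetical. The sketch uses presence of words only (no occurrence counts) and does not need to know which words sit at fragment ends.

```python
def ktuple_set(fragments, k):
    """Collect the k-tuples present in any fragment (presence only, no counts)."""
    words = set()
    for frag in fragments:
        for i in range(len(frag) - k + 1):
            words.add(frag[i:i + k])
    return words


def extend_unambiguously(words, seed):
    """Grow `seed` in both directions while exactly one word overlaps by k-1 letters.

    Illustrative rule only: extension stops at a dead end, at a branch point
    (more than one candidate word), or when a word would be reused.
    """
    k = len(seed)
    seq = seed
    used = {seed}

    # Extend to the right.
    while True:
        suffix = seq[-(k - 1):]
        candidates = [suffix + b for b in "ACGT" if suffix + b in words]
        if len(candidates) != 1 or candidates[0] in used:
            break
        used.add(candidates[0])
        seq += candidates[0][-1]

    # Extend to the left.
    while True:
        prefix = seq[:k - 1]
        candidates = [b + prefix for b in "ACGT" if b + prefix in words]
        if len(candidates) != 1 or candidates[0] in used:
            break
        used.add(candidates[0])
        seq = candidates[0][0] + seq

    return seq


# Two overlapping fragments of the made-up sequence ATGCGTACCTGA.
words = ktuple_set(["ATGCGTAC", "GTACCTGA"], 4)
full = extend_unambiguously(words, "GTAC")

# Simulate one false negative: the word TACC was not detected.
partial = extend_unambiguously(words - {"TACC"}, "ATGC")
```

In this toy case `full` recovers the whole starting sequence, while the missing word in the second call makes the result stop early at `ATGCGTAC`: shorter, but still a correct subsequence. That matches the abstract's remark that false k-tuples mainly reduce completeness rather than correctness.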
Similar articles
- Effect of k-tuple length on sample-comparison with high-throughput sequencing data. Biochem Biophys Res Commun. 2016 Jan 22;469(4):1021-7. doi: 10.1016/j.bbrc.2015.11.094. Epub 2015 Dec 22. PMID: 26721429
- Likelihood DNA sequencing by hybridization. J Biomol Struct Dyn. 1993 Dec;11(3):637-53. doi: 10.1080/07391102.1993.10508020. PMID: 8129876
- Identification of functional elements in unaligned nucleic acid sequences by a novel tuple search algorithm. Comput Appl Biosci. 1996 Feb;12(1):71-80. doi: 10.1093/bioinformatics/12.1.71. PMID: 8670622
- 1-Tuple DNA sequencing: computer analysis. J Biomol Struct Dyn. 1989 Aug;7(1):63-73. doi: 10.1080/07391102.1989.10507752. PMID: 2684223 Review.
- Negative information for building phylogenies. Recent Pat DNA Gene Seq. 2013 Aug;7(2):128-36. doi: 10.2174/1872215611307020007. PMID: 22974263 Review.
Cited by
- DNA sequencing by hybridization: 100 bases read by a non-gel-based method. Proc Natl Acad Sci U S A. 1991 Nov 15;88(22):10089-93. doi: 10.1073/pnas.88.22.10089. PMID: 1946427 Free PMC article.
- Melting studies of dangling-ended DNA hairpins: effects of end length, loop sequence and biotinylation of loop bases. Nucleic Acids Res. 2002 Sep 15;30(18):4088-93. doi: 10.1093/nar/gkf514. PMID: 12235393 Free PMC article.
- k-mer approaches for biodiversity genomics. Genome Res. 2025 Feb 14;35(2):219-230. doi: 10.1101/gr.279452.124. PMID: 39890468 Free PMC article. Review.
- Real-time detection of DNA hybridization and melting on oligonucleotide arrays by using optical wave guides. Proc Natl Acad Sci U S A. 1995 Jul 3;92(14):6379-83. doi: 10.1073/pnas.92.14.6379. PMID: 7603999 Free PMC article.