Review
Brief Bioinform. 2014 May;15(3):343-53.
doi: 10.1093/bib/bbt067. Epub 2013 Sep 23.

New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing

Kai Song et al. Brief Bioinform. 2014 May.

Abstract

With the development of next-generation sequencing (NGS) technologies, a large amount of short-read data has been generated. Assembling these short reads can be challenging for genomes and metagenomes without template sequences, which makes alignment-based genome sequence comparison difficult. In addition, NGS reads can come from different regions of various genomes and may not be alignable. Sequence signature-based methods, which compare genomes and metagenomes by the frequencies of their word patterns, are potentially useful for analyzing short-read data from NGS. Here we review recent developments in alignment-free genome and metagenome comparison based on word-pattern frequencies, with emphasis on the dissimilarity measures between sequences, the statistical power of these measures when two sequences are related, and the applications of these measures to NGS data.

Keywords: Markov model; NGS data; alignment-free; genome comparison; statistical power; word patterns.
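
To make the idea of comparing sequences by word-pattern frequencies concrete, here is a minimal Python sketch. It is an illustration only, not the authors' method: the abstract does not name specific measures, so the two used here are assumptions chosen for simplicity, namely the classic D2 statistic (the inner product of two word-count vectors) and a Euclidean distance between relative word frequencies. The function names, the choice of word length k, and the rule of skipping words that contain non-ACGT characters are all hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping words of length k in one sequence.

    Words containing characters other than A, C, G, T (for example N)
    are skipped; this is a simplifying assumption of the sketch.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def pooled_counts(reads, k):
    """Sum word counts over a collection of unassembled reads.

    Words are counted within each read, so no word spans two reads
    and no assembly or alignment is needed.
    """
    total = Counter()
    for read in reads:
        total.update(kmer_counts(read, k))
    return total


def d2(counts_a, counts_b):
    """D2 statistic: inner product of two word-count vectors.

    Larger values indicate more shared words (a similarity, not a distance).
    """
    return sum(n * counts_b[word] for word, n in counts_a.items())


def frequency_distance(counts_a, counts_b):
    """Euclidean distance between relative word-frequency vectors."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each input needs at least one counted word")
    words = set(counts_a) | set(counts_b)
    return sqrt(sum(
        (counts_a[w] / total_a - counts_b[w] / total_b) ** 2 for w in words
    ))


if __name__ == "__main__":
    # Toy reads standing in for two NGS samples (made-up data).
    sample_a = pooled_counts(["ACGTACGT", "CGTACG"], k=3)
    sample_b = pooled_counts(["TTTTACGT", "GGGTTT"], k=3)
    print("D2:", d2(sample_a, sample_b))
    print("frequency distance:", frequency_distance(sample_a, sample_b))
```

Raw counts such as D2 depend on sequence length and background composition, which is one reason normalized or model-based variants are studied. This sketch also ignores reverse complements and sequencing errors, both of which matter for real NGS data.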
