Funct Integr Genomics. 2019 Mar;19(2):363-371.
doi: 10.1007/s10142-018-0647-3. Epub 2018 Nov 27.

A differential k-mer analysis pipeline for comparing RNA-Seq transcriptome and meta-transcriptome datasets without a reference

Chon-Kit Kenneth Chan et al. Funct Integr Genomics. 2019 Mar.

Abstract

Next-generation DNA sequencing technologies, such as RNA-Seq, currently dominate genome-wide gene expression studies. A standard approach to analyse this data requires mapping sequence reads to a reference and counting the number of reads which map to each gene. However, for many transcriptome studies, a suitable reference genome is unavailable, especially for meta-transcriptome studies which assay gene expression from mixed populations of organisms. Where a reference is unavailable, it is possible to generate a reference by the de novo assembly of the sequence reads. However, the high cost of generating high-coverage data for de novo assembly hinders this approach and more importantly the accurate assembly of such data is challenging, especially for meta-transcriptome data, and resulting assemblies frequently suffer from collapsed regions or chimeric sequences. As an alternative to the standard reference mapping approach, we have developed a k-mer-based analysis pipeline (DiffKAP) to identify differentially expressed reads between RNA-Seq datasets without the requirement for a reference. We compared the DiffKAP approach with the traditional Tophat/Cuffdiff method using RNA-Seq data from soybean, which has a suitable reference genome. We subsequently examined differential gene expression for a coral meta-transcriptome where no reference is available, and validated the results using qRT-PCR. We conclude that DiffKAP is an accurate method to study differential gene expression in complex meta-transcriptomes without the requirement of a reference genome.

Keywords: Coral; Host-microbe symbiosis; K-mer analysis; Meta-transcriptome; RNA-Seq; Soybean.
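
The abstract does not describe DiffKAP's internal steps, so the following is only a generic sketch of reference-free differential k-mer analysis, not the authors' implementation. It counts k-mers in two sets of reads, normalises by the total k-mer count of each set, and reports a log2 fold change for k-mers that pass simple thresholds. The function names, the pseudocount, and the thresholds (`min_count`, `min_abs_log2_fc`) are all assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def count_kmers(reads, k):
    """Count every overlapping k-mer across a collection of read strings."""
    counts = Counter()
    for read in reads:
        # Reads shorter than k contribute nothing (the range is empty).
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def differential_kmers(counts_a, counts_b, min_count=2,
                       min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return {kmer: log2 fold change (A over B)} for k-mers passing the filters.

    Hypothetical thresholds: a k-mer must occur at least `min_count` times in
    one dataset, and its absolute log2 fold change must reach `min_abs_log2_fc`.
    """
    # Normalise by total k-mer count so datasets of different depth compare.
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1
    result = {}
    for kmer in counts_a.keys() | counts_b.keys():
        a = counts_a.get(kmer, 0)
        b = counts_b.get(kmer, 0)
        if max(a, b) < min_count:
            continue  # too rare to compare reliably
        # The pseudocount avoids division by zero for k-mers absent in one set.
        fold = ((a + pseudocount) / total_a) / ((b + pseudocount) / total_b)
        lfc = log2(fold)
        if abs(lfc) >= min_abs_log2_fc:
            result[kmer] = lfc
    return result


# Toy data, invented for illustration only.
reads_a = ["ACGTACGT"] * 4 + ["TTTTGGGG"]
reads_b = ["ACGTACGT"] + ["TTTTGGGG"] * 4
diff = differential_kmers(count_kmers(reads_a, 4), count_kmers(reads_b, 4))
```

In this toy input, k-mers from `ACGTACGT` get a positive log2 fold change (more abundant in dataset A) and k-mers from `TTTTGGGG` a negative one. A full pipeline would still need to map differential k-mers back to reads and annotate them, which this sketch leaves out.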
