2013 May 23;8(5):e64546.
doi: 10.1371/journal.pone.0064546. Print 2013.

IMSA: integrated metagenomic sequence analysis for identification of exogenous reads in a host genomic background

Michelle T Dimon et al. PLoS One.

Abstract

Metagenomics, the study of microbial genomes within diverse environments, is a rapidly developing field. The identification of microbial sequences within a host organism enables the study of human intestinal, respiratory, and skin microbiota, and has allowed the identification of novel viruses in diseases such as Merkel cell carcinoma. There are few publicly available tools for metagenomic high throughput sequence analysis. We present Integrated Metagenomic Sequence Analysis (IMSA), a flexible, fast, and robust computational analysis pipeline that is available for public use. IMSA takes input sequence from high throughput datasets and uses a user-defined host database to filter out host sequence. IMSA then aligns the filtered reads to a user-defined universal database to characterize exogenous reads within the host background. IMSA assigns a score to each node of the taxonomy based on read frequency, and can output this as a taxonomy report suitable for cluster analysis or as a taxonomy map (TaxMap). IMSA also outputs the specific sequence reads assigned to a taxon of interest for downstream analysis. We demonstrate the use of IMSA to detect pathogens and normal flora within sequence data from a primary human cervical cancer carrying HPV16, a primary human cutaneous squamous cell carcinoma carrying HPV16, the CaSki cell line carrying HPV16, and the HeLa cell line carrying HPV18.
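
The scoring step described above can be illustrated with a minimal Python sketch. It is not IMSA's implementation: the abstract states only that each taxonomy node receives a score based on read frequency. The sketch assumes that a read contributes a total weight of 1, split evenly among the taxa of its best hits, and that each share is also credited to every ancestor of the taxon. The toy taxonomy `PARENT`, the read names, and the functions `lineage`, `score_taxonomy`, and `filter_nodes` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "root": None,
    "Viruses": "root",
    "Papillomaviridae": "Viruses",
    "HPV16": "Papillomaviridae",
    "HPV18": "Papillomaviridae",
    "Bacteria": "root",
}


def lineage(taxon, parent):
    """Yield the taxon itself and then each ancestor up to the root."""
    while taxon is not None:
        yield taxon
        taxon = parent[taxon]


def score_taxonomy(read_hits, parent):
    """Score every taxonomy node from per-read best-hit taxa.

    Assumption (not taken from the paper): a read with n distinct
    best-hit taxa gives 1/n to each of them, and each share is also
    added to all ancestors of that taxon.
    """
    scores = defaultdict(float)
    for taxa in read_hits.values():
        if not taxa:
            continue  # read had no hit in the universal database
        weight = 1.0 / len(taxa)
        for taxon in taxa:
            for node in lineage(taxon, parent):
                scores[node] += weight
    return dict(scores)


def filter_nodes(scores, min_score):
    """Keep only nodes whose score is strictly above min_score."""
    return {node: s for node, s in scores.items() if s > min_score}


# Hypothetical reads that survived host filtering, with their best-hit taxa.
read_hits = {
    "read1": ["HPV16"],
    "read2": ["HPV16"],
    "read3": ["HPV16", "HPV18"],  # ambiguous read, split between two taxa
    "read4": ["Bacteria"],
}
scores = score_taxonomy(read_hits, PARENT)
print(filter_nodes(scores, 2.0))
```

With a threshold of 50 in place of 2.0, `filter_nodes` corresponds to the kind of cutoff applied to the TaxMaps in Figures 2 and 3.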

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. IMSA results on CaSki positive control dataset.
A) Bar chart showing the number of reads in the dataset at each step of the IMSA pipeline. B) Breakdown of the division of reads left after host filtering, as determined by BLAST to NCBI’s nt database. C) The number of reads that align within each 100 base pair bin along the HPV16 genome in the unfiltered dataset compared to the IMSA filtered dataset.
Figure 2. TaxMap of bacterial reads in a primary cutaneous SCC.
The TaxMap shows the breakdown of bacterial read scores at the kingdom, family, genus, and species levels. This TaxMap has been filtered to show only nodes with a score above 50.
Figure 3. TaxMap of viral reads in a combined HeLa and CaSki dataset.
IMSA accurately identifies both alphapapillomaviridae species 7 (HPV18) and species 9 (HPV16) in the merged dataset. This TaxMap has been filtered to show only nodes with a score above 50.
Figure 4. Comparison of filtering databases.
NCBI’s RefSeq database includes viral sequence mis-annotated as human; using this as a host filter results in loss of HPV16 reads (black). Filtering against the human genome (hg19) alone allows detection of these reads (gray).
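
The per-bin read counts plotted in Figure 1C (reads per 100 base pair bin along the HPV16 genome, before and after filtering) can be approximated with a short Python sketch. It assumes 0-based alignment start positions and assigns each read to the bin containing its start; how IMSA or the authors actually binned reads is not stated here. The functions `bin_read_starts` and `retained_fraction` and the example positions are hypothetical.

```python
from collections import Counter


def bin_read_starts(starts, bin_size=100):
    """Count reads per bin; bin k covers [k * bin_size, (k + 1) * bin_size)."""
    return Counter(pos // bin_size for pos in starts)


def retained_fraction(unfiltered, filtered):
    """Fraction of reads in each bin that remain after host filtering."""
    return {b: filtered.get(b, 0) / n for b, n in unfiltered.items() if n > 0}


# Hypothetical 0-based alignment start positions on the viral genome.
unfiltered = bin_read_starts([5, 99, 100, 250, 299])
filtered = bin_read_starts([5, 100, 250])
print(retained_fraction(unfiltered, filtered))
```

A bin whose retained fraction falls well below 1 marks a region where the host filter removed viral reads, which is the effect Figure 4 reports for the RefSeq-based filter.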
