Nucleic Acids Res. 2022 Sep 23;50(17):e101.
doi: 10.1093/nar/gkac543.

StrainXpress: strain aware metagenome assembly from short reads

Xiongbin Kang et al. Nucleic Acids Res. 2022.

Abstract

Next-generation sequencing-based metagenomics has made it possible to identify microorganisms in characteristic habitats without the need for lengthy cultivation. Importantly, clinically relevant phenomena such as resistance to medication, virulence or interactions with the environment can vary even within a species. Therefore, a major current challenge is to reconstruct individual genomes from the sequencing reads at the level of strains, and not just at the level of species. However, strains of one species may differ by only a small number of variants, which makes them difficult to distinguish. Despite considerable recent progress, existing approaches remain fragmentary. Here, we present StrainXpress, a comprehensive solution to the problem of strain-aware metagenome assembly from next-generation sequencing reads. In experiments, StrainXpress reconstructs strain-specific genomes from metagenomes that involve more than 1000 strains and handles poorly covered strains successfully. The amount of reconstructed strain-specific sequence exceeds that of the current state-of-the-art approaches by 26.75% on average across all data sets (first quartile: 18.51%, median: 26.60%, third quartile: 35.05%).

Figures

Figure 1.
Workflow of StrainXpress. StrainXpress consists of three stages: ‘Clustering Reads’ (1), ‘Local Assembly’ (2) and ‘Global Assembly’ (3). All stages are based on overlap graphs as underlying data structure. The workflow follows a ‘Divide-And-Conquer’ strategy. While (1) and (2) reflect the ‘Divide’ part, (3) reflects the ‘Conquer’ part.
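The 'Clustering Reads' stage can be pictured with a toy overlap graph. The sketch below is only an illustration of the general idea, not StrainXpress's implementation: it assumes exact suffix-prefix matches between reads (real overlappers tolerate mismatches), uses an all-pairs comparison that would not scale to real data, and treats each connected component of the graph as one cluster. The function names and the `min_len` threshold are hypothetical.

```python
from collections import defaultdict


def suffix_prefix_overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` (assumed >= 1) exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len):
    """Directed overlap graph: edges[i][j] = overlap length from read i to read j."""
    edges = defaultdict(dict)
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = suffix_prefix_overlap(a, b, min_len)
                if k:
                    edges[i][j] = k
    return edges


def cluster_reads(n_reads, edges):
    """Group read indices into connected components (edge direction ignored)."""
    parent = list(range(n_reads))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, neighbours in edges.items():
        for j in neighbours:
            parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(n_reads):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


# Toy input: reads 0-2 chain together, reads 3-4 form a second cluster.
reads = ["ACGTAC", "TACGGA", "GGATTT", "CCCCCC", "CCCAAA"]
graph = build_overlap_graph(reads, min_len=3)
print(cluster_reads(len(reads), graph))
```

Each resulting cluster could then be assembled on its own (the 'Divide' part) before the partial results are merged (the 'Conquer' part), which is the strategy the caption describes.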
Figure 2.
Genome fraction versus the coverage of strains on the high complexity data set (2× 250 bp). The high complexity data set contains 1057 strains from 376 species. The average coverage of the 1057 strains is 10× but varies according to a log-normal distribution. We display the genome fraction of the different strains in the different coverage intervals. Different colors denote different assembly methods.
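To make the coverage model concrete: a log-normal distribution with parameters μ and σ has mean exp(μ + σ²/2), so fixing the mean at 10× leaves σ as a free choice. The sketch below draws per-strain coverages under that model. The caption does not state σ, so the value used here (1.0) is purely an assumption, as is reading 'average coverage' as the arithmetic mean; the function names are hypothetical.

```python
import math
import random


def lognormal_mu_for_mean(mean: float, sigma: float) -> float:
    """Return mu such that a log-normal(mu, sigma) has the given arithmetic mean."""
    # mean = exp(mu + sigma^2 / 2)  =>  mu = ln(mean) - sigma^2 / 2
    return math.log(mean) - sigma ** 2 / 2


def sample_coverages(n_strains: int, mean: float, sigma: float, seed: int = 0):
    """Draw one coverage value per strain from a log-normal distribution."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    mu = lognormal_mu_for_mean(mean, sigma)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_strains)]


# 1057 strains and a 10x mean are from the caption; sigma = 1.0 is assumed.
coverages = sample_coverages(n_strains=1057, mean=10.0, sigma=1.0)
low = sum(c < 5.0 for c in coverages)
print(f"{low} of {len(coverages)} strains fall below 5x coverage")
```

Under such a skewed distribution many strains sit well below the mean, which is why the figure reports genome fraction per coverage interval rather than as a single number.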
Figure 3.
We generated simulated data sets at different coverages, each containing ten Salmonella strains. The synthetic reads were mixed with real gut metagenome sequencing data and then assembled with different approaches. The figure shows how the genome fraction achieved by each assembly method changes as the coverage of the ten Salmonella strains increases.
Figure 4.
Genome fraction (%) on the 'Gut Metagenome' data. In the five real gut metagenome sequencing data sets, StrainEst predicted 11 strains suitable to serve as ground truth. The performance of the different methods is evaluated with MetaQUAST. The first column displays the GenBank accession numbers of the 11 strains. Numbers in the heatmap correspond to genome fraction.
