MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data
- PMID: 37802989
- PMCID: PMC10558524
- DOI: 10.1038/s41467-023-41209-6
Abstract
Metagenomic Hi-C (metaHi-C) can identify contig-to-contig relationships with respect to their proximity within the same physical cell. Shotgun libraries in metaHi-C experiments can be constructed by next-generation sequencing (short-read metaHi-C) or by more recent third-generation sequencing (long-read metaHi-C). However, all existing metaHi-C analysis methods were developed and benchmarked on short-read metaHi-C datasets, and there remains considerable room for improvement in the scalability and stability of analyses, especially for long-read metaHi-C data. Here we report MetaCC, an efficient and integrative framework for analyzing both short-read and long-read metaHi-C datasets. MetaCC outperforms existing methods on normalization and binning. In particular, the MetaCC normalization module, named NormCC, is more than 3000 times faster than the current state-of-the-art method HiCzin on a complex wastewater dataset. When applied to a sheep gut long-read metaHi-C dataset, the MetaCC binning module can retrieve 709 high-quality genomes with the largest species diversity from a single sample, including an expansion of five uncultured members from the order Erysipelotrichales, and is the only binner that can recover the genome of the important species Bacteroides vulgatus. Further plasmid analyses reveal that MetaCC binning is able to capture multi-copy plasmids.
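To make the notion of contig-to-contig relationships concrete, the following minimal Python sketch counts Hi-C read pairs whose two ends map to different contigs. It is an illustration only and is not taken from MetaCC or NormCC: the function name `contig_contacts` and the toy contig names are hypothetical, and the counts it produces are raw, before any normalization of the kind the abstract describes.

```python
from collections import Counter


def contig_contacts(pairs):
    """Count Hi-C read pairs that link two different contigs.

    `pairs` is an iterable of (contig_a, contig_b) tuples, one per read pair,
    giving the contig each end of the pair aligned to (hypothetical input format).
    """
    counts = Counter()
    for a, b in pairs:
        if a == b:
            # Both ends fall on the same contig: no inter-contig signal.
            continue
        # Store each contact under an order-independent key.
        counts[tuple(sorted((a, b)))] += 1
    return counts


# Toy input: four read pairs over three made-up contigs.
pairs = [("c1", "c2"), ("c2", "c1"), ("c1", "c1"), ("c2", "c3")]
contacts = contig_contacts(pairs)
```

Contigs that share many such contacts are candidates for originating from the same cell, which is the signal that normalization and binning then refine.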
© 2023. Springer Nature Limited.
Conflict of interest statement
The authors declare no competing interests.