CD-HIT Suite: a web server for clustering and comparing biological sequences
- PMID: 20053844
- PMCID: PMC2828112
- DOI: 10.1093/bioinformatics/btq003
Abstract
CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. To further assist CD-HIT users, we significantly improved the program with more functions and better accuracy, scalability and flexibility. Most importantly, we developed a new web server, CD-HIT Suite, for clustering a user-uploaded sequence dataset or comparing it to another dataset at different identity levels. Users can now interactively explore the clusters within web browsers. We also provide downloadable clusters for several public databases (NCBI NR, Swissprot and PDB) at different identity levels.
Availability: Free access at http://cd-hit.org
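The abstract does not spell out what "clustering at an identity level" involves, so the following is only a minimal sketch of the general idea, not the CD-HIT implementation. It assumes a greedy incremental scheme (sequences processed longest first, each joining the first representative it matches at or above the threshold) and uses a crude identity measure built on Python's difflib as a stand-in for alignment-based identity. The names identity, greedy_cluster and the example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Crude stand-in for sequence identity (an assumption for illustration).

    Counts characters in difflib's matching blocks and divides by the
    length of the shorter sequence. Real tools derive identity from an
    alignment instead.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(seqs, threshold):
    """Greedy incremental clustering of a {name: sequence} mapping.

    Sequences are visited longest first (ties broken by name). Each one
    joins the first existing representative whose identity to it is at
    least `threshold`; otherwise it starts a new cluster.
    Returns {representative_name: [member_names]}.
    """
    clusters = {}
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical toy sequences: "a" and "b" differ only in the last residue.
example = {
    "a": "MKTAYIAKQRQISFVKSHFSRQ",
    "b": "MKTAYIAKQRQISFVKSHFSRA",
    "c": "GGGGGGGGGGPPPPPPPPPP",
}

clusters_90 = greedy_cluster(example, 0.90)
clusters_99 = greedy_cluster(example, 0.99)
```

Running the same dataset at two thresholds mirrors the "different identity levels" mentioned above: a looser threshold merges near-identical sequences, while a stricter one keeps them apart.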