Comparative Study
Bioinformatics. 2010 Mar 1;26(5):680-2.
doi: 10.1093/bioinformatics/btq003. Epub 2010 Jan 6.

CD-HIT Suite: a web server for clustering and comparing biological sequences

Ying Huang et al. Bioinformatics.

Abstract

CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. To further assist CD-HIT users, we have significantly improved the program, adding functions and improving its accuracy, scalability and flexibility. Most importantly, we developed a new web server, CD-HIT Suite, for clustering a user-uploaded sequence dataset or comparing it to another dataset at different identity levels. Users can now explore the clusters interactively in a web browser. We also provide downloadable clusters for several public databases (NCBI NR, Swiss-Prot and PDB) at different identity levels.

Availability: Free access at http://cd-hit.org

Figures

Fig. 1.
Screenshots of CD-HIT Suite. (a) Cluster Explorer for investigating clusters. (b) A cluster distribution plot to explore the global structure of a whole dataset.
