Development of a novel clustering tool for linear peptide sequences
- PMID: 30014462
- PMCID: PMC6187223
- DOI: 10.1111/imm.12984
Abstract
Epitopes identified in large-scale screens of overlapping peptides often share significant levels of sequence identity, complicating the analysis of epitope-related data. Clustering algorithms are often used to facilitate these analyses, but available methods are generally insufficient in their capacity to define biologically meaningful epitope clusters in the context of the immune response. To fulfil this need we developed an algorithm that generates epitope clusters based on representative or consensus sequences. This tool allows the user to cluster peptide sequences on the basis of a specified level of identity by selecting among three different method options. These include the 'clique method', in which all members of the cluster must share the same minimal level of identity with each other, and the 'connected graph method', in which all members of a cluster must share a defined level of identity with at least one other member of the cluster. In cases where it is not possible to define a clear consensus sequence with the connected graph method, a third option provides a novel 'cluster-breaking algorithm' for consensus sequence driven sub-clustering. Herein we demonstrate the tool's clustering performance and applicability using (i) a selection of dengue virus epitopes for the 'clique method', (ii) sets of allergen-derived peptides from related species for the 'connected graph method' and (iii) large data sets of eluted ligand, major histocompatibility complex binding and T-cell recognition data captured within the Immune Epitope Database (IEDB) with the newly developed 'cluster-breaking algorithm'. This novel clustering tool is accessible at http://tools.iedb.org/cluster2/.
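The difference between the 'clique method' and the 'connected graph method' can be illustrated with a minimal Python sketch. This is not the IEDB tool's implementation: the `identity` function (ungapped, left-aligned comparison normalised by the longer peptide), the greedy clique construction, the function names and the example peptides are all assumptions made for illustration. The 'cluster-breaking algorithm' is not sketched here because the abstract does not describe its steps.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of matching positions between two peptides.

    Assumption: ungapped, left-aligned comparison normalised by the
    longer peptide. The real tool may compute identity differently.
    """
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def connected_graph_clusters(peptides, threshold):
    """Each member shares >= threshold identity with at least one other
    member: the connected components of the identity graph (union-find)."""
    parent = list(range(len(peptides)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(peptides)), 2):
        if identity(peptides[i], peptides[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, pep in enumerate(peptides):
        groups.setdefault(find(i), []).append(pep)
    return sorted(groups.values())


def greedy_clique_clusters(peptides, threshold):
    """Every member shares >= threshold identity with every other member.

    Assumption: a simple greedy pass in input order, not an exhaustive
    maximal-clique search, so the result depends on peptide order.
    """
    clusters = []
    for pep in peptides:
        for cluster in clusters:
            if all(identity(pep, other) >= threshold for other in cluster):
                cluster.append(pep)
                break
        else:
            clusters.append([pep])
    return clusters


# Hypothetical peptides forming a chain: 1~2 and 2~3 at 80%, but 1~3 at 60%.
peptides = ["AAAAAAAAAA", "AAAAAAAABB", "AAAAAABBBB", "CCCCCCCCCC"]
connected = connected_graph_clusters(peptides, 0.8)
cliques = greedy_clique_clusters(peptides, 0.8)
```

With these hypothetical inputs, the connected graph sketch places the three chained peptides in one cluster, whereas the clique sketch separates the third peptide because it shares only 60% identity with the first.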
Keywords: Allergy; Antigens/Peptides/Epitopes; Bioinformatics; MHC/HLA; Viral.
© 2018 The Authors. Immunology Published by John Wiley & Sons Ltd.