Pattern recognition in bioinformatics
- PMID: 23559637
- DOI: 10.1093/bib/bbt020
Abstract
Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained.
Keywords: bioinformatics; classification; clustering; dimensionality reduction; pattern recognition.
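The three problem types named in the abstract can be made concrete with a small sketch. The toy data, the function names and the specific algorithms (nearest-centroid classification, plain k-means, variance-based feature selection) are assumptions chosen for illustration only; the review does not prescribe them, and real analyses would normally rely on an established library and proper validation.

```python
# Toy illustration of the three tasks named in the abstract.
# All data, names and algorithm choices are illustrative assumptions.
from statistics import mean, pvariance

# Six instances with three features each; labels are hypothetical phenotypes.
X = [
    [1.0, 5.0, 0.50],
    [1.2, 5.1, 0.50],
    [0.8, 4.9, 0.51],
    [4.0, 1.0, 0.50],
    [4.2, 1.1, 0.49],
    [3.8, 0.9, 0.50],
]
y = ["A", "A", "A", "B", "B", "B"]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(rows):
    """Feature-wise mean of a non-empty list of instances."""
    return [mean(col) for col in zip(*rows)]


def classify(x, X_train, y_train):
    """Classification: assign the label of the nearest class centroid."""
    cents = {
        lab: centroid([r for r, l in zip(X_train, y_train) if l == lab])
        for lab in set(y_train)
    }
    return min(cents, key=lambda lab: sq_dist(x, cents[lab]))


def cluster(X, k=2, n_iter=10):
    """Clustering: plain k-means with deterministic, evenly spaced seeds."""
    cents = [X[i * (len(X) - 1) // max(k - 1, 1)] for i in range(k)]
    assign = []
    for _ in range(n_iter):
        # Assign each instance to its nearest centroid.
        assign = [min(range(k), key=lambda j: sq_dist(x, cents[j])) for x in X]
        # Recompute each centroid from its members.
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # keep the old centroid if a cluster empties
                cents[j] = centroid(members)
    return assign


def select_features(X, m=2):
    """Dimensionality reduction by selection: keep the m highest-variance features."""
    variances = [pvariance(col) for col in zip(*X)]
    ranked = sorted(range(len(variances)), key=lambda j: -variances[j])
    keep = sorted(ranked[:m])
    return keep, [[row[j] for j in keep] for row in X]


if __name__ == "__main__":
    print(cluster(X))                       # grouping without labels
    print(classify([1.1, 5.0, 0.5], X, y))  # label for a new instance
    print(select_features(X, m=2)[0])       # indices of retained features
```

Note that clustering never sees the labels in `y`, whereas classification depends on them; the variance filter used here is also label-free, which is only one of several ways to select features.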