Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2016 Jun;54(6):869-76.
doi: 10.1007/s11517-015-1382-8. Epub 2015 Sep 4.

A novel sparse coding algorithm for classification of tumors based on gene expression data

Affiliations

A novel sparse coding algorithm for classification of tumors based on gene expression data

Morteza Kolali Khormuji et al. Med Biol Eng Comput. 2016 Jun.

Abstract

High-dimensional genomic and proteomic data play an important role in many applications in medicine such as prognosis of diseases, diagnosis, prevention and molecular biology, to name a few. Classifying such data is a challenging task due to the various issues such as curse of dimensionality, noise and redundancy. Recently, some researchers have used the sparse representation (SR) techniques to analyze high-dimensional biological data in various applications in classification of cancer patients based on gene expression datasets. A common problem with all SR-based biological data classification methods is that they cannot utilize the topological (geometrical) structure of data. More precisely, these methods transfer the data into sparse feature space without preserving the local structure of data points. In this paper, we proposed a novel SR-based cancer classification algorithm based on gene expression data that takes into account the geometrical information of all data. Precisely speaking, we incorporate the local linear embedding algorithm into the sparse coding framework, by which we can preserve the geometrical structure of all data. For performance comparison, we applied our algorithm on six tumor gene expression datasets, by which we demonstrate that the proposed method achieves higher classification accuracy than state-of-the-art SR-based tumor classification algorithms.

Keywords: Gene expression data; Linear classification; Local linear embedding; Logistic loss function; Sparse representation.

PubMed Disclaimer

References

    1. Med Biol Eng Comput. 2007 Aug;45(8):769-80 - PubMed
    1. Cancer Cell. 2002 Mar;1(2):203-9 - PubMed
    1. Nat Med. 2002 Jan;8(1):68-74 - PubMed
    1. Proc Natl Acad Sci U S A. 2001 Nov 20;98(24):13790-5 - PubMed
    1. Conf Proc IEEE Eng Med Biol Soc. 2011;2011:3362-6 - PubMed

LinkOut - more resources