Review
Curr Protoc Bioinformatics. 2008 Jun:Chapter 7:7.12.1-7.12.39.
doi: 10.1002/0471250953.bi0712s22.

Using GenePattern for gene expression analysis

Heidi Kuehn et al. Curr Protoc Bioinformatics. 2008 Jun.

Abstract

The abundance of genomic data now available in biomedical research has stimulated the development of sophisticated statistical methods for interpreting the data, and of special visualization tools for displaying the results in a concise and meaningful manner. However, biologists often find these methods and tools difficult to understand and use correctly. GenePattern is a freely available software package that addresses this issue by providing more than 100 analysis and visualization tools for genomic research in a comprehensive user-friendly environment for users at all levels of computational experience and sophistication. This unit demonstrates how to prepare and analyze microarray data in GenePattern.

Figures

Figure 7.12.1
all_aml_train.gct as it appears in Excel. GenePattern File Formats (http://genepattern.org/tutorial/gp_fileformats.html) fully describes the GCT file format.
Figure 7.12.2
all_aml_train.cls as it appears in Notepad. GenePattern File Formats (http://genepattern.org/tutorial/gp_fileformats.html) fully describes the CLS file format.
Figure 7.12.3
GenePattern Web Client start page. The Modules & Pipelines pane lists all modules installed on the GenePattern server. For illustration purposes, we installed only the modules used in this protocol. Typically, more modules are listed.
Figure 7.12.4
PreprocessDataset parameters. Table 7.12.2 describes the PreprocessDataset parameters.
Figure 7.12.5
ComparativeMarkerSelection parameters. Table 7.12.3 describes the ComparativeMarkerSelection parameters.
Figure 7.12.6
ComparativeMarkerSelection Viewer.
Figure 7.12.7
Heat map for the top 100 differentially expressed genes.
Figure 7.12.8
HierarchicalClustering parameters. Table 7.12.5 describes the HierarchicalClustering parameters.
Figure 7.12.9
HierarchicalClustering Viewer.
Figure 7.12.10
KNNXValidation parameters. Table 7.12.7 describes the parameters for the k-nearest neighbors (KNN) class prediction method.
Figure 7.12.11
PredictionResults Viewer. Each point represents a sample, and its color indicates the predicted class. The absolute confidence value indicates the probability that the sample belongs to the predicted class.
Figure 7.12.12
FeatureSummary Viewer.
Figure 7.12.13
KNN parameters. Table 7.12.7 describes the parameters for the k-nearest neighbors (KNN) class prediction method.
Figure 7.12.14
Create Pipeline for KNN classification analysis. The Pipeline Designer form defines the steps that will replicate the KNN classification analysis. Click the arrow icon next to a step to collapse or expand that step. When the form opens, all steps are expanded. This figure shows the first step collapsed.
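
Figures 7.12.1 and 7.12.2 show the GCT and CLS input files but do not spell out their layout. The sketch below is one way to read both formats in Python. It assumes the commonly documented layout: a GCT file starts with a `#1.2` version line, then a line giving the row and column counts, then a tab-delimited header (`Name`, `Description`, sample names) followed by the data rows; a CLS file has a counts line, a `#`-prefixed line of class names, and a line with one label per sample. The function names `read_gct` and `read_cls` and the toy data are hypothetical, chosen only for illustration; they are not part of GenePattern, and the toy values are not taken from all_aml_train.gct or all_aml_train.cls.

```python
import io


def read_gct(handle):
    """Parse GCT text into (sample_names, {row_name: [values]}).

    Assumes the GCT 1.2 layout described in the lead-in.
    """
    version = handle.readline().strip()
    if version != "#1.2":
        raise ValueError(f"unexpected GCT version line: {version!r}")
    # Second line: number of data rows, then number of sample columns.
    n_rows, n_cols = (int(x) for x in handle.readline().split()[:2])
    header = handle.readline().rstrip("\r\n").split("\t")
    samples = header[2:]  # first two columns are Name and Description
    if len(samples) != n_cols:
        raise ValueError("sample count does not match the dimension line")
    rows = {}
    for _ in range(n_rows):
        fields = handle.readline().rstrip("\r\n").split("\t")
        rows[fields[0]] = [float(v) for v in fields[2:]]
    return samples, rows


def read_cls(handle):
    """Parse CLS text into (class_names, labels), labels kept as strings."""
    # First line: number of samples, number of classes, then a constant.
    n_samples, n_classes = (int(x) for x in handle.readline().split()[:2])
    class_names = handle.readline().lstrip("#").split()
    labels = handle.readline().split()
    if len(class_names) != n_classes or len(labels) != n_samples:
        raise ValueError("CLS counts do not match the header line")
    return class_names, labels


# Made-up toy inputs (not real expression data).
toy_gct = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\ts1\ts2\ts3\n"
    "geneA\tdesc A\t1.0\t2.5\t3.0\n"
    "geneB\tdesc B\t4.0\t5.0\t6.5\n"
)
toy_cls = "3 2 1\n# ALL AML\n0 0 1\n"

samples, rows = read_gct(io.StringIO(toy_gct))
class_names, labels = read_cls(io.StringIO(toy_cls))
```

With real files, the same functions could be called on handles from `open(path)`. Checking that the number of labels equals the number of GCT sample columns before running an analysis is a cheap way to catch mismatched inputs.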
