BMC Bioinformatics. 2009 Oct 15;10 Suppl 12(Suppl 12):S1.

doi: 10.1186/1471-2105-10-S12-S1.

Data recovery and integration from public databases uncovers transformation-specific transcriptional downregulation of cAMP-PKA pathway-encoding genes

Chiara Balestrieri¹, Lilia Alberghina, Marco Vanoni, Ferdinando Chiaradonna

PMID: 19828069
PMCID: PMC2762058
DOI: 10.1186/1471-2105-10-S12-S1

Chiara Balestrieri et al. BMC Bioinformatics. 2009.

Affiliation

¹ Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, Milan, Italy. c.balestrieri@campus.unimib.it

Abstract

Background: The integration of data from multiple genome-wide assays is essential for understanding dynamic spatio-temporal interactions within cells. Such integration, which leads to a more complete view of cellular processes, offers the opportunity to make better sense of the large amount of "omics" data freely available in several public databases. In particular, integration of microarray-derived transcriptome data with other high-throughput analyses (genomic and mutational analysis, promoter analysis) may allow us to unravel transcriptional regulatory networks under a variety of physio-pathological situations, such as the alteration in the cross-talk between signal transduction pathways in transformed cells.

Results: Here we sequentially apply web-based and statistical tools to a case study: the role of oncogenic activation of different signal transduction pathways in the transcriptional regulation of genes encoding proteins involved in the cAMP-PKA pathway. To this end, we first re-analyzed available genome-wide expression data for genes encoding proteins of the downstream branch of the PKA pathway in normal tissues and human tumor cell lines. Then, in order to identify mutation-dependent transcriptional signatures, we classified cancer cells according to their mutational state. The results of this procedure were used as a starting point to analyze the promoter structure of PKA pathway-encoding genes, leading to the identification of specific combinations of transcription factor binding sites. These combinations are consistent with available experimental data and help to clarify the relation between gene expression, transcription factors and oncogenes in our case study.

Conclusions: Genome-wide, large-scale "omics" experimental technologies give different, complementary perspectives on the structure and regulatory properties of complex systems. Even the relatively simple, integrated workflow presented here offers opportunities not only to filter the noise intrinsic to high-throughput data, but also to progressively extract novel information that would otherwise have remained hidden. In fact, we were able to detect a strong transcriptional repression of genes encoding proteins of the cAMP/PKA pathway in cancer cells of different genetic origins. The basic workflow presented here can easily be extended by incorporating other tools and can be applied even by researchers with limited bioinformatics skills.

Figures

**Figure 1**
**Statistical analysis of the expression of the 41 PKA pathway-encoding genes in normal and transformed samples.** 81 transcriptional profiles from normal tissues and from the NCI60 cancer cell line collection were recovered from the GEO database. After normalization (see Methods), the expression values of the 41 PKA pathway-encoding genes were analyzed by ANOVA (p-value 0.0001) to evaluate the statistical significance of the differences between normal and transformed samples. IQR: Interquartile Range. Outliers are also shown.
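The comparison in Figure 1 rests on a one-way ANOVA between groups of samples. The following is a minimal sketch of how the F statistic for such a test can be computed, not the authors' actual procedure: the function name `one_way_anova_f` and the expression values are hypothetical, and turning F into a p-value (which requires the F distribution) is left out.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of values per sample class.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(v for g in groups for v in g)
    group_means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normalized expression values, not data from the paper.
normal = [1.2, 1.1, 1.3, 1.0]
transformed = [0.6, 0.7, 0.5, 0.8]
f_stat, df_b, df_w = one_way_anova_f([normal, transformed])
```

A large F relative to the degrees of freedom indicates that the difference between group means is large compared with the spread inside each group.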

**Figure 2**
**Hierarchical clustering of the 41 PKA pathway-encoding genes analyzed in this paper.** Two-way (gene, column; cell line, row) hierarchical clustering (see Methods) of the same profiles analyzed in Figure 1. Normalized expression is colour-coded from green (poor expression) to red (strong expression). The name of each gene is colour-coded according to the family to which it belongs. The 6 main classes described in the text are shown (red lines at the top of the dendrogram and Roman numerals at the bottom). The distance function is based on Pearson correlation, with complete linkage clustering. Legends for expression, condition, gene family and tissue of origin are shown on the right of the dendrogram.
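The caption names a Pearson-correlation distance combined with complete linkage. As an illustration only, the sketch below clusters three hypothetical profiles along a single axis; the figure itself clusters both genes and cell lines, and the software the authors used is not stated here. The names `pearson_distance` and `complete_linkage` and all values are assumptions.

```python
from math import sqrt


def pearson_distance(x, y):
    """Distance defined as 1 - Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def complete_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is the
                # largest distance between any pair of their members.
                d = max(
                    pearson_distance(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical expression profiles across four samples.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.1, 6.0, 8.2],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}
merges = complete_linkage(profiles)
```

Here `gene_a` and `gene_b` rise together and merge first, while the anti-correlated `gene_c` joins last at a distance close to 2, the maximum this distance can take.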

**Figure 3**
**Identification of differentially regulated genes in normal and transformed samples. (A)** Samples were sorted into five groups according to mutational activation: green, normal; yellow, Ras; red, PI3K; blue, Other Mutation; cyan, Not Tested. Principal Component Analysis (PCA) was performed on the 41 PKA pathway-encoding genes for normal samples and the four classes of mutation-dependent samples. Each sphere represents the comparative averaging of the 41 genes for each pathway identified by mutational analysis. **(B)** For each of the 5 groups described in **(A)**, the 41 PKA-encoding genes were clustered, according to their level of expression, into three subgroups: Strong (>1, red), Average (=1, black) and Low (<1, green). **(C)** Gene list according to expression level and mutational group of the three subgroups previously indicated, divided for each sample. Color-coding is as follows: blue, common between normal and at least one transformed sample; yellow, specific for normal samples; grey, specific for transformed samples. The percentage of regulated genes for each subgroup is shown at the bottom. **(D)** ANOVA to evaluate the statistical significance of the differences between the five classes of samples described in **(A)**. The right inset shows the p-values of the pair-wise comparisons. Statistically significant differences are indicated in red. IQR: Interquartile Range. Outliers are also shown.
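Panel B sorts genes into Strong (>1), Average (=1) and Low (<1) subgroups. A minimal sketch of that three-way split follows. The caption does not say how values near 1 were handled, so the `tolerance` parameter is purely an assumption added for illustration, as are the function names, gene names and values.

```python
def classify_expression(value, tolerance=0.0):
    """Label a normalized expression value relative to the reference level 1.

    `tolerance` is a hypothetical half-width for the Average band. With the
    default of 0.0, only values exactly equal to 1 are labeled Average.
    """
    if value > 1 + tolerance:
        return "Strong"
    if value < 1 - tolerance:
        return "Low"
    return "Average"


def group_by_level(expression, tolerance=0.0):
    """Split a {gene: value} mapping into the three subgroups."""
    groups = {"Strong": [], "Average": [], "Low": []}
    for gene, value in expression.items():
        groups[classify_expression(value, tolerance)].append(gene)
    return groups


# Hypothetical genes and normalized expression values.
expression = {"GENE_A": 1.8, "GENE_B": 1.0, "GENE_C": 0.4, "GENE_D": 1.05}
strict = group_by_level(expression)
banded = group_by_level(expression, tolerance=0.1)
```

With the strict rule `GENE_D` counts as Strong, whereas a band of 0.1 around 1 moves it into Average, which shows how sensitive the subgroup sizes are to this choice.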

**Figure 4**
**Hierarchical clustering of the 41 PKA pathway-encoding genes in transformed samples.** Two-way (gene, column; cell line, row) hierarchical clustering (see Methods) of the profiles from the NCI60 collection only. Normalized expression is colour-coded from green (poor expression) to red (strong expression). The distance function is based on Pearson correlation, with complete linkage clustering. The name of each gene is colour-coded according to the family to which it belongs. Legends for expression, condition, gene family and tissue of origin are shown on the right of the dendrogram. The data have been organized on the basis of the tissue of origin of the cancer (Tissue), the specific oncogenic mutations identified in each cell line (Mutation), the pathway putatively altered by those mutations (Pathway) and the gene family.

**Figure 5**
**TFBS identification using enrichment as the parameter. (A)** For each TFBS recognized as relevant (present in ≥ 70% of the promoters of the 41 PKA pathway-encoding genes), the panel shows the percentage of promoters in our collection that contain the motif, as compared to the Matrix Family Library for vertebrates. This percentage was calculated by dividing the number of promoters containing the motif (S) by the total number of promoters (T). The color-coding scheme is shown on the right of the panel. **(B)** Schematic representation of the TFBSs (color-coded as shown on the right of the panel) identified in the promoters of the 15 subgroups described in the text and in Figure 3. Each cartoon represents the promoter structure resulting from the average of the TFBSs identified in ≥ 70% of the gene promoters for each subgroup. The asterisks at the bottom of the cartoon indicate the over-represented TFBSs, as scored in panel A, for all the 41 PKA pathway-encoding genes.
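The S/T ratio and the 70% cut-off described in panel A can be sketched as follows. This is an illustration under assumptions, not the tool used in the study: the function names, promoter names and motif names are hypothetical, and the comparison against the reference library is not reproduced.

```python
def motif_coverage(promoter_motifs):
    """Map each motif to S/T, the fraction of promoters that contain it.

    `promoter_motifs` maps a promoter name to the list of motifs found in it.
    A motif occurring several times in one promoter is counted once.
    """
    total = len(promoter_motifs)  # T: total number of promoters
    containing = {}               # S per motif: promoters containing the motif
    for motifs in promoter_motifs.values():
        for motif in set(motifs):
            containing[motif] = containing.get(motif, 0) + 1
    return {motif: s / total for motif, s in containing.items()}


def relevant_motifs(promoter_motifs, threshold=0.70):
    """Return the motifs present in at least `threshold` of the promoters."""
    coverage = motif_coverage(promoter_motifs)
    return sorted(m for m, frac in coverage.items() if frac >= threshold)


# Hypothetical promoters and motif hits.
promoters = {
    "promoter_1": ["MOTIF_X", "MOTIF_Y", "MOTIF_X"],
    "promoter_2": ["MOTIF_X"],
    "promoter_3": ["MOTIF_X", "MOTIF_Z"],
    "promoter_4": ["MOTIF_Y", "MOTIF_X"],
}
coverage = motif_coverage(promoters)
relevant = relevant_motifs(promoters)
```

In this toy set only `MOTIF_X` passes the threshold, since it appears in all four promoters while the others appear in half or fewer.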

**Figure 6**
**Hierarchical clustering of TFBSs present in the promoters of the 41 PKA pathway-encoding genes, according to total number and frequency.** Two-way (TFBS, column; expression subgroup, see Figure 3, row) hierarchical clustering of the TFBSs present within the promoters of the 41 PKA pathway-encoding genes. Clustering was run according to the total number of TFBSs present in each group (panel A) or to the frequency, i.e. the total number of a given TFBS divided by the number of promoters (panel B). The color-coding scale is shown at the top of each panel. The distance function is based on Pearson correlation, with complete linkage clustering. The two classes, corresponding to the main arms of the dendrogram, derived from clustering according to "Condition" are shown on the right of each dendrogram.
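The two quantities clustered in this figure differ only by a normalization: panel A uses the total number of occurrences of a TFBS in a group, panel B divides that total by the number of promoters in the group. A small sketch of both, with hypothetical names and data:

```python
from collections import Counter


def tfbs_totals_and_frequencies(promoter_motifs):
    """Return (totals, frequencies) for one expression subgroup.

    totals[m] is the number of occurrences of motif m over all promoters;
    frequencies[m] is that total divided by the number of promoters.
    """
    totals = Counter(m for motifs in promoter_motifs.values() for m in motifs)
    n_promoters = len(promoter_motifs)
    frequencies = {m: count / n_promoters for m, count in totals.items()}
    return dict(totals), frequencies


# Hypothetical subgroup with two promoters.
subgroup = {
    "promoter_1": ["MOTIF_X", "MOTIF_X", "MOTIF_Y"],
    "promoter_2": ["MOTIF_X"],
}
totals, frequencies = tfbs_totals_and_frequencies(subgroup)
```

Unlike a presence/absence percentage, this frequency can exceed 1 when a motif occurs more than once per promoter on average, as `MOTIF_X` does here.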

**Figure 7**
**Flowchart of our web-based and statistical strategy used to elucidate the relation between the transcriptional profiles of PKA-encoding genes and oncogenic mutations. (A)** Flow chart of our web-based and statistical strategy, indicating some of the databases used (Source), the type of data analyzed (Input), the specific program and statistical test used (Tool) and the result obtained (Output). **(B)** Block diagram summarizing functional interconnections within the PKA pathway module, indicating the expression level (geometric mean) of each gene belonging to the network (Strong, red; Average, black; Low, green), as identified by our analysis in both normal (B, left) and transformed samples (B, right). **(C)** Boxplots of the expression of PKA pathway-encoding genes in normal (C, left) and transformed (C, right) samples, grouped by functional class (ADCY: adenylyl cyclase; AKAP: A-kinase anchor protein; PDE: phosphodiesterase; PRKACR: PKA regulatory subunit; PRKAC: PKA catalytic subunit). The represented value is the median. **(D)** Schematic representation of the TFBSs (color-coded) identified in the promoters of PKA pathway-encoding genes of normal and transformed samples. Each cartoon represents the promoter structure resulting from merging the TFBSs identified in ≥ 70% of the gene promoters of all normal samples and transformed samples.
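Panels B and C summarize expression with a geometric mean per gene and a median per functional class. The sketch below shows those two summaries using Python's standard library. The gene names, class label and values are hypothetical, and taking the class median over the per-gene geometric means is an assumption made for illustration rather than the documented procedure.

```python
from statistics import geometric_mean, median

# Hypothetical normalized expression of two genes across three samples.
gene_expression = {
    "GENE_A": [0.5, 2.0, 1.0],
    "GENE_B": [0.25, 0.5, 1.0],
}

# Hypothetical assignment of genes to a functional class.
functional_classes = {"CLASS_1": ["GENE_A", "GENE_B"]}

# Per-gene expression level as the geometric mean over samples.
gene_level = {
    gene: geometric_mean(values) for gene, values in gene_expression.items()
}

# Per-class summary as the median of the per-gene levels (assumption).
class_median = {
    name: median(gene_level[g] for g in genes)
    for name, genes in functional_classes.items()
}
```

The geometric mean suits ratio-like data because a two-fold increase and a two-fold decrease cancel out, which an arithmetic mean would not reflect.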
