PeerJ. 2017 Dec 8;5:e4133.
doi: 10.7717/peerj.4133. eCollection 2017.

A novel approach for human whole transcriptome analysis based on absolute gene expression of microarray data

Shirley Bikel et al. PeerJ.

Abstract

Background: Despite the emergence of RNA sequencing (RNA-seq), microarrays remain in widespread use for gene expression analysis in the clinic. Public repositories hold over 767,000 RNA microarrays from human samples, an invaluable resource for biomedical research and personalized medicine. Absolute gene expression analysis profiles all genes expressed under a specific biological condition without the need for a reference sample. However, background fluorescence makes it difficult to determine absolute gene expression in microarrays. Because the Y chromosome is absent in female subjects, we developed a new approach to absolute gene expression analysis in which the fluorescence of Y-chromosome genes in female subjects serves as the background fluorescence for all probes in the microarray. This fluorescence was used to establish an absolute gene expression threshold that distinguishes expressed from non-expressed genes in microarrays.

Methods: We extracted RNA from leukocyte samples of 16 children (nine males and seven females, aged 6-10 years). Each sample was analyzed on an Affymetrix GeneChip Human Gene 1.0 ST Array, and the fluorescence of 124 Y-chromosome genes was used to calculate the absolute gene expression threshold. Several genes classified as expressed or non-expressed by this threshold were then compared against expression measured by real-time quantitative polymerase chain reaction (RT-qPCR).
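The abstract does not state the formula used to turn female Y-chromosome fluorescence into a threshold. As a minimal sketch, assuming the threshold is taken as the highest fluorescence observed for the selected Y-chromosome probes in female samples, the idea could be illustrated in Python as follows. The function names and all fluorescence values are hypothetical and serve only as an illustration; the summary statistic can be swapped through the `summarize` parameter.

```python
def background_threshold(female_y_fluorescence, summarize=max):
    """Summarize female Y-probe fluorescence into a single background value.

    female_y_fluorescence maps a gene name to the fluorescence values
    measured for that gene in female samples. Using `max` is an assumption
    made for this sketch, not the published formula.
    """
    values = [v for per_gene in female_y_fluorescence.values() for v in per_gene]
    if not values:
        raise ValueError("no female Y-probe fluorescence values supplied")
    return summarize(values)


def call_expressed(fluorescence_by_gene, threshold):
    """Label each gene as expressed (True) when it exceeds the threshold."""
    return {gene: value > threshold for gene, value in fluorescence_by_gene.items()}


# Hypothetical log2 fluorescence values for three female samples.
female_y = {
    "DDX3Y": [3.1, 3.4, 3.0],
    "TXLNG2P": [3.6, 3.3, 3.5],
    "EIF1AY": [2.9, 3.2, 3.1],
}
threshold = background_threshold(female_y)

# Hypothetical fluorescence of three genes in one sample of either sex.
sample = {"GENE_A": 9.8, "GENE_B": 3.2, "GENE_C": 5.1}
calls = call_expressed(sample, threshold)
print(threshold, calls)
```

Because the threshold comes from probes that cannot have a true target in female samples, no separate reference sample is needed to make the expressed or non-expressed call.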

Results: Of the 124 Y-chromosome genes, three (DDX3Y, TXLNG2P and EIF1AY) that displayed significant differences between sexes were used to calculate the absolute gene expression threshold. Using this threshold, we selected 13 expressed and non-expressed genes and confirmed their expression levels by RT-qPCR. We then selected the top 5% most expressed genes and found that several KEGG pathways were significantly enriched. These pathways were related to typical leukocyte functions, such as antigen processing and presentation and natural killer cell mediated cytotoxicity. We also applied the method to previously published microarray data from liver cells, where the top 5% most expressed genes were enriched for KEGG pathways typical of liver cells. Our results suggest that the three selected Y-chromosome genes can be used to calculate an absolute gene expression threshold, allowing transcriptome profiling of microarray data without an additional reference experiment.
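The two selection steps described here can be sketched in Python under stated assumptions. The first function keeps Y-chromosome genes whose male and female fluorescence ranges do not overlap, a simple stand-in for the significance test, which the abstract does not specify. The second takes the top 5% by fluorescence among genes above the threshold; whether the percentage was computed over all genes or only expressed genes is not stated, so that choice is an assumption. All names and values are hypothetical.

```python
import math


def non_overlapping_genes(male, female):
    """Return genes whose lowest male value exceeds the highest female value."""
    return sorted(
        gene for gene in male
        if gene in female and min(male[gene]) > max(female[gene])
    )


def top_fraction(expression, threshold, fraction=0.05):
    """Return the most highly expressed genes among those above the threshold."""
    expressed = {g: v for g, v in expression.items() if v > threshold}
    ranked = sorted(expressed, key=expressed.get, reverse=True)
    keep = math.ceil(len(ranked) * fraction)  # round up so small sets keep one gene
    return ranked[:keep]


# Hypothetical log2 fluorescence values per gene and sex.
male = {"DDX3Y": [8.1, 7.9], "SRY": [3.0, 3.3]}
female = {"DDX3Y": [3.2, 3.4], "SRY": [3.1, 3.2]}
selected = non_overlapping_genes(male, female)

# Forty hypothetical genes with fluorescence 1.0 through 40.0.
expression = {f"GENE_{i}": float(i) for i in range(1, 41)}
top = top_fraction(expression, threshold=4.0)
print(selected, top)
```

In this toy data only DDX3Y separates the sexes, and the top 5% of the 36 genes above the threshold rounds up to two genes. The resulting gene list is what would then be passed to a pathway enrichment tool.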

Discussion: Our approach, based on establishing a threshold for absolute gene expression analysis, offers a new way to analyze thousands of microarrays from public databases. It allows different human diseases to be studied without additional samples for relative expression experiments.

Keywords: Absolute gene expression; Human; Leukocyte; Microarray; Personalized medicine; Transcriptome; Transcriptomics.

Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1. Determination of a threshold to measure absolute gene expression based on the Y-chromosome genes.
(A) Schematic representation of Y-chromosome genes with similar fluorescence intensity in female and male subjects (gene A or C), and genes with sex-dependent fluorescence (gene B). (B) Histogram of fluorescence for the 124 Y-chromosome genes in male (black line) and female subjects (dotted red line). (C) Histogram of fluorescence for the 23 Y-chromosome genes that showed statistically significant sex-dependent fluorescence. The gridded region shows the fluorescence range in which male and female values overlap. The heatmap illustrates the real fluorescence level for these genes in our leukocyte samples. (D) Histogram of the fluorescence distribution of the three genes (four probes) used to calculate the absolute gene expression threshold. Their fluorescence does not overlap between male and female subjects. The heatmap illustrates the real fluorescence level for these four probes in our leukocyte samples. (E) Schematic representation of the three genes (four probes) used to calculate the absolute gene expression threshold for the microarray.
Figure 2. Microarray fluorescence of the housekeeping and SRY genes.
The microarrays from male and female subjects are shown in the blue and pink areas, respectively. The fluorescence of the housekeeping genes (green lines) was above the absolute gene expression threshold (dotted red line), while the fluorescence of the SRY gene (blue line) was under the threshold. Two Y-chromosome genes used to calculate the expression threshold are shown as orange lines. The microarray identifier corresponding to each sample number is shown in Table S8.
Figure 3. Microarray fluorescence of the genes selected for RT-qPCR.
The fluorescence of the 10 genes selected for RT-qPCR analysis is shown for male (blue area) and female (pink area) subjects. Genes with fluorescence above the absolute gene expression threshold are shown as green, orange and red lines, while genes with fluorescence under the threshold are shown as blue lines.
Figure 4. Correlations between the gene expression levels obtained by RT-qPCR (Ct values) and microarray (fluorescence values).
The blue and red symbols represent male and female samples, respectively. The absolute gene expression threshold (Fi = 4) is represented as a blue dotted line. (A) Genes used to calculate the absolute gene expression threshold. (B) Genes with different fluorescence values (low to high). According to the Ct values and fluorescence levels, three clusters are shown: highly expressed genes in group 1, moderately expressed genes in group 2, and lowly expressed genes together with the non-expressed genes in group 3.
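The comparison in Figure 4 relates two measurements that move in opposite directions: a lower Ct value means more transcript, so highly fluorescent genes should have low Ct values. A minimal sketch of such a check, assuming a Pearson correlation (the caption does not name the statistic used) and entirely hypothetical paired values, might look like this:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical paired measurements for five genes in one sample.
ct_values = [19.5, 23.0, 27.2, 31.0, 35.4]      # RT-qPCR cycle thresholds
fluorescence = [10.8, 9.1, 6.9, 5.2, 3.4]       # microarray log2 fluorescence
r = pearson(ct_values, fluorescence)
print(round(r, 3))
```

A strongly negative coefficient is the expected pattern when the two platforms agree, since Ct falls as expression rises.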
