Nucleic Acids Res. 2006 Jul 19;34(12):e87.
doi: 10.1093/nar/gkl485.

An open-access long oligonucleotide microarray resource for analysis of the human and mouse transcriptomes

Kévin Le Brigand et al. Nucleic Acids Res. 2006.

Abstract

Two collections of oligonucleotides have been designed for preparing pangenomic human and mouse microarrays. A total of 148,993 and 121,703 oligonucleotides were designed against human and mouse transcripts, respectively. Quality scores were created in order to select 25,342 human and 24,109 mouse oligonucleotides. They correspond to: (i) a BLAST-specificity score; (ii) the number of expressed sequence tags matching each probe; (iii) the distance to the 3′ end of the target mRNA. Scores were also used to compare in silico the two microarrays with commercial microarrays. The sets described here, called RNG/MRC collections, appear at least as specific and sensitive as those from the commercial platforms. The RNG/MRC collections have now been used by an Anglo-French consortium to distribute more than 3,500 microarrays to the academic community. Ad hoc identification of tissue-specific transcripts and an approximately 80% correlation with hybridizations performed on Affymetrix GeneChip™ suggest that the RNG/MRC microarrays perform well. This work provides a comprehensive open resource for investigators working on human and mouse transcriptomes, as well as a generic method to generate new microarray collections in other organisms. All information related to these probes, as well as additional information about commercial microarrays, has been stored in a freely accessible database called MEDIANTE.

Figures

Figure 1
Definition of the X_HYBRID specificity score. Typical picture of a probe specificity analysis, as available from the MEDIANTE interface. Each column represents the number of BLAST hits in the MEDIANTE database (blue), Ensembl database (green) and RefSeq database (red) for the BLAST expect-value indicated at bottom. Based on expect-values equal to 10, 1, 10⁻¹, 10⁻², 10⁻³, 10⁻⁴, 10⁻¹⁵ and 10⁻²⁰, a ‘rank’ was defined, ranging from 0 for an expect-value equal to 10 to 6 for an expect-value equal to 10⁻¹⁵. For instance, the oligonucleotide depicted in Figure 1 has an extra hit in RefSeq for an expect-value of 10⁻¹, thus defining a rank equal to 2. The number of extra hits between the rank and the lowest expect-value is called delta (Δ). In the example shown in Figure 1, Δ is equal to 1. Δ is always kept in the interval from 1 to 9, meaning that when there are more than 9 extra hits, Δ is set to 9. An x_hybrid score is defined for each BLAST database (i.e. MEDIANTE, RefSeq, Ensembl) as a decimal number, where the integer part corresponds to the rank and the fractional part to Δ (2.1 in the RefSeq example). The final X_HYBRID score for a probe is defined as the maximum of the x_hybrid scores obtained against the 3 BLAST databases.
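The scoring rule in the Figure 1 caption can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' implementation. It assumes that each database supplies cumulative hit counts at the eight expect-value thresholds, that the count at 10⁻²⁰ is the baseline against which "extra" hits are measured, that the rank is taken at the most stringent threshold showing extra hits, and that a probe with no extra hits scores 0.0. The function names and the input layout are hypothetical.

```python
# Expect-value thresholds from the caption; the list index is the rank (0..6).
RANKED_EVALUES = [10, 1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-15]
# The eighth threshold (1e-20) is treated here as the baseline (assumption).


def x_hybrid(counts):
    """Per-database score from 8 hit counts ordered 10, 1, ..., 1e-15, 1e-20.

    Assumption: counts are cumulative (hits at or below each expect-value).
    """
    if len(counts) != len(RANKED_EVALUES) + 1:
        raise ValueError("expected 8 hit counts, one per expect-value")
    baseline = counts[-1]
    # Scan from the most stringent ranked threshold (rank 6) down to rank 0.
    for rank in range(len(RANKED_EVALUES) - 1, -1, -1):
        extra = counts[rank] - baseline
        if extra > 0:
            delta = min(extra, 9)  # the caption caps delta at 9
            return round(rank + delta / 10, 1)
    return 0.0  # assumption: no extra hits at any threshold


def X_HYBRID(counts_by_db):
    """Final probe score: maximum x_hybrid over the BLAST databases."""
    return max(x_hybrid(c) for c in counts_by_db.values())


# Made-up counts that mimic the caption's example: one extra RefSeq hit
# first appearing at an expect-value of 1e-1 (rank 2, delta 1).
example = {
    "MEDIANTE": [2, 1, 1, 1, 1, 1, 1, 1],
    "Ensembl": [1, 1, 1, 1, 1, 1, 1, 1],
    "RefSeq": [3, 2, 2, 1, 1, 1, 1, 1],
}
print(X_HYBRID(example))
```

Under these assumptions a higher score means a less specific probe, which matches the use of "X_HYBRID above 2" as the less BLAST-specific class in Figure 2.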
Figure 2
BLAST specificities of the different probe collections. (A) Average X_HYBRID scores for the different human and mouse collections. (B) Percentage of probes in each set associated with an X_HYBRID score above 2, i.e. less ‘BLAST-specific’. This comparison has been performed on a subset of 16,303 human and 13,073 mouse transcripts, common to all platforms. ALL represents the collection of all probes calculated with OligoArray2.0. RNG/MRC represents the selection of probes used for the fabrication of the microarrays.
Figure 3
Matches with human and mouse EST databases for the different probe collections. (A) Average EST_NUMBER scores for the different human and mouse collections. (B) Percentage of probes matching no ESTs for All MEDIANTE probes, for the RNG/MRC, Agilent, Illumina and Affymetrix probe sets. For human, the comparison was performed on a subset of 7,325 transcripts having ‘BLAST-specific’ probes in all sets, i.e. X_HYBRID lower than 2.0. For mouse, the comparison was performed on a subset of 6,358 such transcripts. A matching EST was defined by a 95% identity between one probe and an EST.
Figure 4
Distribution of the probes according to their DIST_TO_3′ score. (A) human. (B) mouse. More than 90% of probes for the human sets and 98% of probes for the mouse sets are located within 1,500 bases from the 3′-end of target mRNAs. Legend indicates the average DIST_TO_3′ score for each collection. This comparison has been performed on a subset of 16,303 human and 13,073 mouse transcripts, common to all sets.
Figure 5
MEDIANTE screenshot of the summary data for transcript NM_001652. The different exons of each transcript are represented by dark and light blue boxes. The RNG/MRC probes are represented on the first line; the light green box indicates the ‘current optimal probe’. The red box indicates the RNG/MRC probe(s). The blue box indicates a probe selected for a local microarray production. Each set of probes is represented on a distinct line. Affymetrix probe sets are represented by their first and last 25-mer perfect match probes. Additional information about the transcript or probes, such as gene chromosomal location or probe specificity, is provided as clickable links. Subforms provide information about Gene Ontology annotations, bibliographic references or tissue-specificity.
Figure 6
Scatter plot of the ratios measured on Affymetrix GeneChip™ (x-axis) and on RNG/MRC microarrays (y-axis). RNA was derived from either HEK293 cells or a keratinocyte cell line (DK7). A total of 11,053 transcripts had at least one Affymetrix probe set and one RNG/MRC probe. Among them, 7,054 pairs were further analyzed, as their intensity level was larger than the 25th percentile on both platforms. After quantification of the signals on both platforms, the ratio of the expression levels between the two cell lines was established. The coefficient of correlation was equal to 0.81.
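The comparison described in the Figure 6 caption (keep transcripts above the 25th percentile of intensity on both platforms, then correlate the between-cell-line ratios) might be sketched as follows. The caption does not state how the intensity level of a transcript was summarized, whether ratios were log-transformed, or which correlation coefficient was used. This sketch assumes the mean intensity of the two samples, log2 ratios and Pearson correlation. All function names are hypothetical and the input values are made up for illustration.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def cross_platform_correlation(affy, rng):
    """affy, rng: per-transcript (intensity_line_1, intensity_line_2) pairs,
    listed in the same transcript order on both platforms."""
    # Assumption: a transcript's intensity level is the mean of both samples.
    affy_level = [(a + b) / 2 for a, b in affy]
    rng_level = [(a + b) / 2 for a, b in rng]
    affy_cut = percentile(affy_level, 25)
    rng_cut = percentile(rng_level, 25)
    # Keep transcripts above the 25th percentile on BOTH platforms.
    keep = [i for i in range(len(affy))
            if affy_level[i] > affy_cut and rng_level[i] > rng_cut]
    # Assumption: ratios are compared on a log2 scale.
    x = [math.log2(affy[i][0] / affy[i][1]) for i in keep]
    y = [math.log2(rng[i][0] / rng[i][1]) for i in keep]
    return len(keep), pearson(x, y)


# Made-up intensities; the second platform is a scaled copy of the first,
# so the ratios agree perfectly in this toy case.
affy_toy = [(100, 50), (200, 400), (10, 10), (800, 100), (300, 300), (50, 200)]
rng_toy = [(2 * a, 2 * b) for a, b in affy_toy]
n_kept, r = cross_platform_correlation(affy_toy, rng_toy)
print(n_kept, r)
```

Filtering on both platforms before computing ratios keeps low-intensity spots, whose ratios are dominated by noise, from lowering the correlation.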
Figure 7
Analysis of 34 mouse transcripts targeted by 2 distinct RNG/MRC probes. Shown are probes with a variation in intensity greater than 2-fold. Each number corresponds to the number of transcripts for which fluorescence intensity varied along with EST_NUMBER, DIST_TO_3′ and/or Tm.
