Proc Natl Acad Sci U S A. 2009 Sep 29;106(39):16752-7.
doi: 10.1073/pnas.0907939106. Epub 2009 Sep 15.

Bioinformatics construction of the human cell surfaceome

J P C da Cunha et al. Proc Natl Acad Sci U S A. 2009.

Abstract

Cell surface proteins are excellent targets for diagnostic and therapeutic interventions. By using bioinformatics tools, we generated a catalog of 3,702 transmembrane proteins located at the surface of human cells (human cell surfaceome). We explored the genetic diversity of the human cell surfaceome at different levels, including the distribution of polymorphisms, conservation among eukaryotic species, and patterns of gene expression. By integrating expression information from a variety of sources, we were able to identify surfaceome genes with a restricted expression in normal tissues and/or differential expression in tumors, important characteristics for putative tumor targets. A high-throughput and efficient quantitative real-time PCR approach was used to validate 593 surfaceome genes selected on the basis of their expression pattern in normal and tumor samples. A number of candidates were identified as potential diagnostic and therapeutic targets for colorectal tumors and glioblastoma. Several candidate genes were also identified as coding for cell surface cancer/testis antigens. The human cell surfaceome will serve as a reference for further studies aimed at characterizing tumor targets at the surface of human cells.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Overall representation of our bioinformatics strategy to identify human genes coding for cell surface proteins and to select a subset of these genes for experimental validation. A total of 18,320 known human genes were submitted to Pfam and TMHMM. A nonredundant set of 4,843 genes were classified as having a TM domain. After excluding genes coding for proteins secreted or located in other subcellular compartments, we defined 3,702 genes as belonging to the surfaceome set. A subset of 902 genes were selected for experimental validation.
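The filtering steps in this caption can be sketched as a simple set-based pipeline. This is a minimal illustration only, not the authors' implementation: the `GeneAnnotation` record, its field names, and the toy gene names are hypothetical, and the boolean flags stand in for whatever Pfam/TMHMM output and localization annotation the study actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneAnnotation:
    """Hypothetical per-gene record; field names are assumptions."""
    gene: str
    has_tm_domain: bool       # a TM domain was predicted (e.g. via Pfam or TMHMM)
    secreted: bool            # protein annotated as secreted
    other_compartment: bool   # protein annotated to a non-surface compartment


def select_surfaceome(annotations):
    """Keep genes with a TM domain that are neither secreted nor located elsewhere."""
    # Step 1: nonredundant set of genes with a predicted TM domain.
    tm_genes = {a.gene for a in annotations if a.has_tm_domain}
    # Step 2: genes to exclude because of secretion or other localization.
    excluded = {a.gene for a in annotations if a.secreted or a.other_compartment}
    return sorted(tm_genes - excluded)


# Toy input with made-up gene names.
toy_annotations = [
    GeneAnnotation("GENE_A", True, False, False),   # kept
    GeneAnnotation("GENE_B", True, True, False),    # secreted, dropped
    GeneAnnotation("GENE_C", True, False, True),    # other compartment, dropped
    GeneAnnotation("GENE_D", False, False, False),  # no TM domain, dropped
    GeneAnnotation("GENE_A", True, False, False),   # duplicate entry, collapsed
]
surfaceome = select_surfaceome(toy_annotations)
```

Applied to the real data, the two steps would correspond to the reduction from 18,320 genes to 4,843 TM-containing genes and then to the 3,702-gene surfaceome described above.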
Fig. 2.
Overall pattern of conservation in the surfaceomes of 10 species. (A) Numbers at branching nodes represent the number of surfaceome genes conserved between the species that diverged at that node. To be classified as conserved, a gene must be present in the surfaceome of all descendant species. Numbers at the end of the branches indicate the size of the surfaceome set in the respective species. (B) Pairwise comparisons of the surfaceome for all 10 species. The row in blue represents the level of conservation of the human surfaceome in each of the other nine species, regardless of whether the genes are classified as surfaceome genes in the respective species. The lower half in green represents the level of conservation when the surfaceomes of both species are taken into account. For example, there are 2,005 genes conserved and classified as surfaceome in both M. musculus and P. troglodytes.
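The conservation rule in panel A (a gene counts as conserved at a node only if it is in the surfaceome of every descendant species) and the both-species counts in panel B amount to set intersections. The sketch below assumes, hypothetically, that genes have already been mapped to shared ortholog identifiers (`OG1`, `OG2`, ...) so that sets from different species are directly comparable; the caption does not say how orthology was assigned, and the toy sets are invented.

```python
from itertools import combinations


def conserved_at_node(surfaceomes, descendants):
    """Genes present in the surfaceome of every species below a node."""
    sets = [surfaceomes[species] for species in descendants]
    if not sets:
        return set()
    return set.intersection(*sets)


def pairwise_shared_counts(surfaceomes):
    """Number of genes classified as surfaceome in both species of each pair."""
    return {
        (a, b): len(surfaceomes[a] & surfaceomes[b])
        for a, b in combinations(sorted(surfaceomes), 2)
    }


# Toy data: species mapped to sets of hypothetical ortholog identifiers.
toy_surfaceomes = {
    "H. sapiens": {"OG1", "OG2", "OG3"},
    "P. troglodytes": {"OG1", "OG2"},
    "M. musculus": {"OG2", "OG3"},
}
primate_node = conserved_at_node(toy_surfaceomes, ["H. sapiens", "P. troglodytes"])
root_node = conserved_at_node(toy_surfaceomes, list(toy_surfaceomes))
pair_counts = pairwise_shared_counts(toy_surfaceomes)
```

Because every descendant must contain the gene, counts can only stay equal or shrink as one moves from the tips toward the root of the tree.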
Fig. 3.
MPSS expression profile of a subset of surfaceome genes in normal tissues. Surfaceome genes were arbitrarily chosen based on their expression pattern. Genes showing a tissue-biased expression were emphasized, as were genes showing a broad expression pattern (genes classified as “Housekeeping” at the bottom of the heatmap). The heatmap was generated by a log transformation of the normalized frequency of an MPSS tag (tags per million) specific for each gene. Each row represents a single gene, and each column represents a different tissue. Color reflects the expression of a gene in a given tissue, based on the frequency of an MPSS tag specific for that gene.
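The log transformation of tags-per-million values behind such a heatmap might look like the following. The caption does not give the logarithm base or how zero counts were handled, so the base-2 logarithm and the pseudocount of 1 are assumptions made for illustration, as are the function name and toy values.

```python
import math


def log_transform_tpm(tpm_by_gene, pseudocount=1.0):
    """Return log2(TPM + pseudocount) for each gene (row) across tissues (columns).

    Base 2 and pseudocount=1 are illustrative assumptions, not values
    taken from the caption.
    """
    return {
        gene: [math.log2(value + pseudocount) for value in row]
        for gene, row in tpm_by_gene.items()
    }


# Toy matrix: one row per gene, one column per tissue (made-up values).
toy_tpm = {
    "GENE_A": [0.0, 1.0, 3.0],    # tissue-biased
    "GENE_B": [7.0, 7.0, 7.0],    # broadly expressed
}
heatmap_values = log_transform_tpm(toy_tpm)
```

The pseudocount keeps tissues with zero tags at a finite value (0 after transformation) instead of negative infinity.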
Fig. 4.
Expression profile of cell surface-encoding genes differentially expressed in GBMs (A) and colorectal (B) tumors as evaluated by qPCR in 65 RNA samples of normal tissues, GBMs, and colorectal tumors and cell lines derived from these tumor types. Heatmap was generated by averaging three qPCR experiments presented as fold change values. Each row represents a single gene, and each column represents a sample. Noninformative reactions are represented by white spots. Red squares represent genes overexpressed (fold change three times higher than standard deviation) in relation to the reference. Green squares represent genes down-regulated in relation to the reference. Black squares represent genes equally expressed between sample and reference. Differential expression is shown for GBMs (73 genes) and colorectal tumors (26 genes).
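One possible reading of the averaging and overexpression rule in this caption is sketched below. The caption does not say which standard deviation is meant or how down-regulation was called, so this sketch assumes the standard deviation of the three replicate fold changes, additionally requires a mean fold change above 1, and leaves down-regulation out. The function name, the `sd_multiplier` parameter, and the use of `None` for a failed replicate are all hypothetical.

```python
from statistics import mean, stdev


def summarize_reaction(fold_changes, sd_multiplier=3.0):
    """Average replicate fold changes and flag overexpression.

    Assumed rule (one reading of the caption, not confirmed by it):
    overexpressed when the mean fold change exceeds 1 and is more than
    `sd_multiplier` times the replicate standard deviation.
    Returns None when fewer than two replicates are usable
    (a noninformative reaction, drawn as a white spot).
    """
    values = [v for v in fold_changes if v is not None]
    if len(values) < 2:
        return None
    avg = mean(values)
    sd = stdev(values)
    return {
        "mean_fold_change": avg,
        "sd": sd,
        "overexpressed": avg > 1.0 and avg > sd_multiplier * sd,
    }


consistent = summarize_reaction([4.0, 5.0, 6.0])   # tight replicates
noisy = summarize_reaction([1.0, 5.0, 9.0])        # same mean, large spread
failed = summarize_reaction([None, 2.0, None])     # too few usable replicates
```

Under this reading, two genes with the same mean fold change can be classified differently when their replicate spread differs, which is the purpose of tying the threshold to the standard deviation.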

References

    1. Rettig WJ, Old LJ. Immunogenetics of human cell surface differentiation. Annu Rev Immunol. 1989;7:481–511.
    2. Clark HF, et al. The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: A bioinformatics assessment. Genome Res. 2003;13:2265–2270.
    3. Diehn M, Bhattacharya R, Botstein D, Brown PO. Genome-scale identification of membrane-associated human mRNAs. PLoS Genet. 2006;2:e11.
    4. Finn RD, et al. The Pfam protein families database. Nucleic Acids Res. 2008;36:D281–D288.
    5. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J Mol Biol. 2001;305:567–580.
