Review
Curr Opin Struct Biol. 2012 Jun;22(3):326-32.
doi: 10.1016/j.sbi.2012.05.002. Epub 2012 May 21.

Structural genomics plucks high-hanging membrane proteins


Edda Kloppmann et al. Curr Opin Struct Biol. 2012 Jun.

Abstract

Recent years have seen the establishment of structural genomics centers that explicitly target integral membrane proteins. Here, we review the advances in targeting these extremely high-hanging fruits of structural biology in high-throughput mode. We observe that the experimental determination of high-resolution structures of integral membrane proteins is increasingly successful, both in terms of obtaining structures and of covering important protein families, for example, from Pfam. Structural genomics has begun to contribute significantly to this progress. An important component of this contribution is the setup of robotic pipelines that generate a wealth of experimental data for membrane proteins. We argue that prediction methods for the identification of membrane regions and for the comparison of membrane proteins largely suffice to meet the challenges of target selection for structural genomics of membrane proteins. In contrast, we need better methods to prioritize the most promising members in a family of closely related proteins and to annotate protein function from sequence and structure in the absence of homology.

Figures

Figure 1. Novel Pfam families covered by SG and non-SG alpha-IMP structures
We consider all Pfam families that have at least one structural representative in the PDB. Non-structural genomics (non-SG) and structural genomics (SG) contributions are shown in green and blue, respectively. We considered 1,035 PDB IDs for polytopic alpha-IMP structures. These are all IMP structures found in the OPM (Orientations of Proteins in Membranes, [47]) or PDBTM (Protein Data Bank of Transmembrane Proteins, [7]) databases that have at least one representative in UniProt [48] and can be mapped to at least one Pfam family (Release 26.0, [17]). According to our definition, an IMP maps to a Pfam family when at least 12 transmembrane residues (annotation from PDBTM) align to the profile hidden Markov model of the family (using alignment coordinates). We found 107 such Pfam families. From this list, we excluded 6 Pfam families: two families that represent N- or C-terminal soluble extensions of a transmembrane domain, one case of a dubious Pfam match, one case where the classification, but not the annotation, in OPM is wrong, one case where the annotation in PDBTM is wrong, and one case where we considered a bitopic IMP chain of an IMP structure with one polytopic and two bitopic chains. Therefore, here we consider 101 “IMP” Pfam families that align to at least one structure of a polytopic IMP. Note: 159 IMPs of the initial set that could not be mapped to either a UniProt sequence or a Pfam family have been excluded from the analysis. Note also: the Pfam family ‘formate/nitrite transporter’ (PF01226) was covered by one SG and one non-SG structure in December 2009. Each structure was deposited in the PDB before the coordinates of the other were released, i.e. became publicly known. We therefore counted PF01226 as half a family for both SG and non-SG.
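The mapping rule in this caption (at least 12 transmembrane residues inside the aligned region of the family profile) can be illustrated with a small sketch. This is not the authors' code: the function names, the representation of transmembrane segments as 1-based inclusive `(start, end)` pairs, and the example coordinates are all assumptions made for illustration.

```python
# Illustrative sketch only; names and data layout are hypothetical.
MIN_TM_RESIDUES = 12  # threshold stated in the Figure 1 caption


def tm_residues_in_alignment(tm_segments, ali_start, ali_end):
    """Count transmembrane residues that fall inside the aligned region.

    tm_segments: list of (start, end) residue ranges, 1-based and inclusive.
    ali_start, ali_end: alignment coordinates on the same sequence, inclusive.
    """
    count = 0
    for seg_start, seg_end in tm_segments:
        overlap = min(seg_end, ali_end) - max(seg_start, ali_start) + 1
        if overlap > 0:
            count += overlap
    return count


def maps_to_family(tm_segments, ali_start, ali_end, min_tm=MIN_TM_RESIDUES):
    """Apply the caption's rule: enough TM residues must lie in the alignment."""
    return tm_residues_in_alignment(tm_segments, ali_start, ali_end) >= min_tm


# Hypothetical protein with two TM helices and a Pfam hit spanning residues 25-50.
example_tm = [(10, 30), (45, 65)]
print(maps_to_family(example_tm, 25, 50))
```

With real data, the segments would come from the PDBTM annotation and the alignment coordinates from the Pfam match, as the caption describes.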
Figure 2. Taxonomic distribution of IMP structures covering novel Pfam families
We show the number of Pfam families covered by structures from Eukaryotes, Bacteria, Archaea and Viruses (in green, blue, light blue and red, respectively). The data are shown for three different time spans. For each combination of family and kingdom, we consider the release date of the first structure solved for this family. For example, a family with several bacterial protein structures is counted only in the time span during which the first of these structures was solved. On the other hand, Pfam families with protein structures from more than one kingdom are counted once for each kingdom. For example, a Pfam family with a eukaryotic and a bacterial protein structure is counted twice. Of the 101 Pfam families, 17 are counted for two kingdoms and 8 have eukaryotic, bacterial and archaeal protein structures. Mapping from PDB to Pfam as described in Figure 1.
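The counting rule in this caption (one count per family and kingdom, placed in the time span of the earliest release) could be sketched as follows. The family identifiers, dates, and time spans below are hypothetical placeholders; the caption does not state the boundaries of the three time spans, so the ones used here are assumptions.

```python
# Illustrative sketch only; the input records and time spans are hypothetical.
from collections import Counter
from datetime import date


def count_families_by_kingdom_and_span(structures, spans):
    """Count each (family, kingdom) pair once, in the span of its first release.

    structures: iterable of (pfam_id, kingdom, release_date) tuples.
    spans: list of (label, start_date, end_date) tuples, inclusive bounds.
    """
    # Keep only the earliest release date per (family, kingdom) combination.
    first_release = {}
    for pfam_id, kingdom, released in structures:
        key = (pfam_id, kingdom)
        if key not in first_release or released < first_release[key]:
            first_release[key] = released

    counts = Counter()
    for (_pfam_id, kingdom), released in first_release.items():
        for label, start, end in spans:
            if start <= released <= end:
                counts[(label, kingdom)] += 1
                break
    return counts


# Hypothetical spans and records (not taken from the paper).
example_spans = [
    ("span 1", date(1985, 1, 1), date(2004, 12, 31)),
    ("span 2", date(2005, 1, 1), date(2008, 12, 31)),
    ("span 3", date(2009, 1, 1), date(2012, 12, 31)),
]
example_structures = [
    ("PF_A", "Bacteria", date(2004, 5, 1)),
    ("PF_A", "Bacteria", date(2009, 1, 15)),   # later structure, ignored
    ("PF_A", "Eukaryota", date(2010, 3, 1)),   # same family, second kingdom
    ("PF_B", "Archaea", date(2006, 7, 1)),
]
example_counts = count_families_by_kingdom_and_span(example_structures, example_spans)
print(dict(example_counts))
```

In this toy input, the family `PF_A` contributes two counts, one per kingdom, which mirrors the double counting the caption describes.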
Figure 3. Human IMPs: Pfam families and PDB structures
A: Mapping human IMPs to Pfam families. Using PolyPhobius [49], 3,305 polytopic alpha-IMPs were extracted from the 20,247 sequences of the SwissProt Homo sapiens proteome (UniProt release Feb 22, 2012). Proteins were assigned to Pfam families as described in Figure 1, using the transmembrane assignment of PolyPhobius. Of these, 3,063 IMPs can be mapped to a Pfam family (orange); 242 IMPs fall outside of the current Pfam collection of families (red). B: Human IMP Pfam families covered by structure. We show human IMP Pfam families with no structural representative (green) and with at least one structural representative (blue: representative is a human protein; light blue: representative is not a human protein).
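As a quick consistency check on the numbers in panel A, the shares can be recomputed from the counts given in the caption. The variable names are illustrative; only the four counts come from the source.

```python
# Counts taken from the Figure 3 caption; names are illustrative.
PROTEOME_SIZE = 20_247    # SwissProt Homo sapiens sequences
POLYTOPIC_IMPS = 3_305    # predicted polytopic alpha-IMPs
MAPPED_TO_PFAM = 3_063    # IMPs mapped to a Pfam family
UNMAPPED = 242            # IMPs outside the current Pfam collection

# The two categories should partition the set of predicted IMPs.
partition_ok = MAPPED_TO_PFAM + UNMAPPED == POLYTOPIC_IMPS

imp_share_of_proteome = POLYTOPIC_IMPS / PROTEOME_SIZE
mapped_share = MAPPED_TO_PFAM / POLYTOPIC_IMPS
unmapped_share = UNMAPPED / POLYTOPIC_IMPS

print(f"partition consistent: {partition_ok}")
print(f"polytopic IMPs in proteome: {imp_share_of_proteome:.1%}")
print(f"mapped to Pfam: {mapped_share:.1%}, unmapped: {unmapped_share:.1%}")
```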
Figure 4. The TehA protein family
A: Homologous proteins behave differentially. The NYCOMPS SG center experimentally pursued 35 homologous prokaryotic proteins belonging to the TehA family. Of these 35 proteins, 33 could be cloned, 8 and 5 were expressed at small and large scale, respectively, and only one yielded a diffracting crystal and finally a high-resolution structure. Note in particular the dramatic attrition between successfully cloned and successfully expressed proteins. Targets are cloned by ligation-free cloning into a C-terminal fusion expression vector. Expression and purification are assessed by Coomassie Blue stained SDS–PAGE gels, and stability in the DMM detergent is determined by size exclusion chromatography [35]. Structures of the TehA family representative from H. influenzae have been solved and deposited in the PDB [24]. B: Structure of the SLAC1 homolog TehA. The anion channel structure is shown as seen from the periplasm (PDB ID: 3m71, ribbon representation). The highly conserved Phe262 occluding the ion permeation pathway is shown explicitly. The figure was created using Chimera [50].
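The attrition described in panel A can be made concrete by computing the fraction of targets retained between consecutive stages. This sketch uses the counts from the caption; treating small-scale and large-scale expression as two sequential stages is an assumption, and the stage labels are illustrative.

```python
# Counts from the Figure 4 caption; the stage ordering is an assumption.
TEHA_FUNNEL = [
    ("pursued", 35),
    ("cloned", 33),
    ("expressed, small scale", 8),
    ("expressed, large scale", 5),
    ("high-resolution structure", 1),
]


def stage_retention(stages):
    """Return (from_stage, to_stage, fraction_retained) for consecutive stages."""
    return [
        (prev_name, name, count / prev_count)
        for (prev_name, prev_count), (name, count) in zip(stages, stages[1:])
    ]


teha_retention = stage_retention(TEHA_FUNNEL)
for src, dst, fraction in teha_retention:
    print(f"{src} -> {dst}: {fraction:.1%} retained")
```

The step from cloning to small-scale expression retains 8 of 33 targets, which is the steep drop the caption highlights.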
Figure 5. Statistics for IMP structural genomics protein production pipelines
Depicted is the number of IMPs that were successful at different stages in the experimental pipelines. Data were extracted from TargetDB [51] in January 2012 for nine membrane protein structural genomics consortia: CSMP, GPCR network, MPID, MPSbyNMR, MPSCB, NYCOMPS, TEMIPS, TMPC and TransportPDB. For the NMR consortium (MPSbyNMR) we do not report data for the steps following purification.

References

    1. von Heijne G. The membrane protein universe: what’s out there and why bother? J Intern Med. 2007;261:543–557.
    2. von Heijne G. Membrane-protein topology. Nat Rev Mol Cell Biol. 2006;7:909–918.
    3. Liu J, Rost B. Comparing function and structure between entire proteomes. Protein Sci. 2001;10:1970–1979.
    4. Fagerberg L, Jonasson K, von Heijne G, Uhlen M, Berglund L. Prediction of the human membrane proteome. Proteomics. 2010;10:1141–1149.
    5. Overington JP, Al-Lazikani B, Hopkins AL. How many drug targets are there? Nat Rev Drug Discov. 2006;5:993–996.
