J Struct Funct Genomics. 2010 Mar;11(1):51-9.
doi: 10.1007/s10969-010-9086-7. Epub 2010 Apr 11.

High-throughput computational structure-based characterization of protein families: START domains and implications for structural genomics

Hunjoong Lee et al. J Struct Funct Genomics. 2010 Mar.

Abstract

SkyLine, a high-throughput homology modeling pipeline tool, detects and models true sequence homologs to a given protein structure. Structures and models are stored in SkyBase with links to computational function annotation, as calculated by MarkUs. The SkyLine/SkyBase/MarkUs technology represents a novel structure-based approach that is more objective and versatile than other protein classification resources. This structure-centric strategy provides a multi-dimensional organization and coverage of protein space at the levels of family, function, and genome. The concept of "modelability", the ability to model sequences on related structures, provides a reliable criterion for membership in a protein family ("leverage") and underlies the unique success of this approach. The overall procedure is illustrated by its application to START domains, which comprise a Biomedical Theme for the Northeast Structural Genomics Consortium as part of the Protein Structure Initiative. START domains are typically involved in the non-vesicular transport of lipids. While 19 experimentally determined structures are available, the family, whose evolutionary hierarchy is not well determined, is highly sequence diverse, and the ligand-binding potential of many family members is unknown. The SkyLine/SkyBase/MarkUs approach provides significant insights and predicts: (1) many more family members (approximately 4,000) than any other resource; (2) the function for a large number of unannotated proteins; (3) instances of START domains in genomes from which they were thought to be absent; and (4) the existence of two types of novel proteins, those containing dual START domains and those containing N-terminal START domains.

Figures

Figure 1
Flowchart of the SkyLine pipeline. SkyLine [6] starts with a single PDB structure as input. The sequence of the structure is used as a seed in PSI-BLAST profile searches for homologs. Models are built for all non-redundant sequences detected using the input structure as the template and the sequence alignments derived from the PSI-BLAST profiles. Structure evaluation programs discern the reliable models, which comprise family members associated with the input structure. The models and associated information are stored in SkyBase, a web-accessible database.
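The flow described in the Figure 1 caption can be sketched in a few lines of Python. This is an illustrative outline only, under stated assumptions: the function names (`search_homologs`, `remove_redundant`, `build_model`, `evaluate_model`, `run_pipeline`), the `Hit` record, the toy sequences, and both cutoffs are hypothetical placeholders and not SkyLine's actual code or parameters. Simple position matching stands in for the PSI-BLAST profile search and for the structure evaluation programs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A homolog candidate: sequence id, sequence, and alignment to the template."""
    seq_id: str
    sequence: str
    alignment: str


def count_matches(a, b):
    """Number of positions at which two equal-length toy sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def search_homologs(template_sequence, database):
    """Hypothetical stand-in for a PSI-BLAST profile search.

    A database entry is reported as a hit if at least a quarter of its
    positions match the template (a deliberately permissive toy rule).
    """
    hits = []
    for seq_id, seq in database.items():
        if count_matches(template_sequence, seq) * 4 >= len(template_sequence):
            hits.append(Hit(seq_id, seq, alignment=seq))
    return hits


def remove_redundant(hits):
    """Keep one hit per distinct sequence (first occurrence wins)."""
    seen = set()
    unique = []
    for hit in hits:
        if hit.sequence not in seen:
            seen.add(hit.sequence)
            unique.append(hit)
    return unique


def build_model(template_id, hit):
    """Hypothetical stand-in for building a model on the template."""
    return {"template": template_id, "seq_id": hit.seq_id, "alignment": hit.alignment}


def evaluate_model(model, template_sequence):
    """Hypothetical stand-in for a model quality score in [0, 1]."""
    return count_matches(template_sequence, model["alignment"]) / len(template_sequence)


def run_pipeline(template_id, template_sequence, database, cutoff=0.5):
    """Search, de-duplicate, model, evaluate, and keep only reliable models."""
    store = []  # plays the role of the model database
    for hit in remove_redundant(search_homologs(template_sequence, database)):
        model = build_model(template_id, hit)
        model["score"] = evaluate_model(model, template_sequence)
        if model["score"] >= cutoff:  # toy reliability criterion
            store.append(model)
    return store


# Toy data, invented for illustration only
toy_database = {
    "s1": "ACDEFGHIKL",  # identical to the template
    "s2": "ACDEFGHIKL",  # redundant copy of s1
    "s3": "ACDEFAAAAA",  # half of the positions match
    "s4": "ACDAAAAAAA",  # detected by the search, but the model is unreliable
    "s5": "WWWWWWWWWW",  # not detected at all
}
family = run_pipeline("toy_template", "ACDEFGHIKL", toy_database)
print([m["seq_id"] for m in family])
```

In this toy run, a sequence that the search detects but whose model scores below the cutoff is excluded from the family, which mirrors the role the caption assigns to the structure evaluation step.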
Figure 2
Overall scheme of the computational structure-based characterization of protein families. START domain sequences and their models, calculated in the SkyLine analysis of a protein structure (PDB id 1LN1, [27]), are retrieved according to a variety of search parameters, including sequence and modeling features. Each model can be manipulated and visualized in a Jmol window, side-by-side with its Verify3D evaluation profile [37]. Links to MarkUs [13] provide access to previously calculated annotation results and the opportunity for further function analysis.
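Retrieval of stored models by sequence and modeling features, as described in the Figure 2 caption, might look like the following minimal sketch. The `ModelRecord` fields, the `query_models` function, and all records are hypothetical and invented for illustration; they do not reflect the actual SkyBase schema or query interface.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ModelRecord:
    """One stored model; all field names are hypothetical."""
    seq_id: str
    organism: str
    seq_identity: float   # percent identity to the template, 0-100
    quality_score: float  # model evaluation score, higher is better


def query_models(
    records: Iterable[ModelRecord],
    organism: Optional[str] = None,
    min_identity: Optional[float] = None,
    max_identity: Optional[float] = None,
    min_quality: Optional[float] = None,
) -> List[ModelRecord]:
    """Return records that satisfy every criterion that is not None."""
    selected = []
    for rec in records:
        if organism is not None and rec.organism != organism:
            continue
        if min_identity is not None and rec.seq_identity < min_identity:
            continue
        if max_identity is not None and rec.seq_identity > max_identity:
            continue
        if min_quality is not None and rec.quality_score < min_quality:
            continue
        selected.append(rec)
    # Best-scoring models first
    return sorted(selected, key=lambda r: r.quality_score, reverse=True)


# Toy records, invented for illustration only
records = [
    ModelRecord("seqA", "Homo sapiens", 35.0, 0.82),
    ModelRecord("seqB", "Bacillus halodurans", 12.0, 0.41),
    ModelRecord("seqC", "Homo sapiens", 18.0, 0.67),
]

# Remote homologs (low sequence identity) that still yield a reliable model
low_identity_reliable = query_models(records, max_identity=20.0, min_quality=0.5)
print([r.seq_id for r in low_identity_reliable])
```

Treating each criterion as optional lets a single function serve queries on sequence features, modeling features, or both at once.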
Figure 3
Analysis of the lipid-binding pocket of the human phosphatidylcholine transfer protein (PC-TP, PDB id 1LN1, [27]). A. Conservation of residues lining the pocket is calculated and displayed by ConSurf [33]. B. Electrostatic surface potential is calculated and imaged with GRASP [34]. C. Volume plot of the ligand-binding pocket is generated by VOIDOO [35]. D. The PC-TP ligand DLP (1,2-dilinoleoyl-sn-glycero-3-phosphocholine) is shown in molecular representation. Residues lining the ligand-binding pocket and predicted to be functionally important are delineated in C and D.
Figure 4
Analysis of the proposed ligand-binding pocket of the Bacillus halodurans protein BH1534 (PDB id 1XN5, [24]). A. Conservation of residues lining the pocket is calculated and displayed by ConSurf [33]. B. Curvature and conservation of the pocket surface are calculated with ConSurf [33]. C. Electrostatic surface potential is calculated and imaged with GRASP [34]. D and E. Volume plots of the putative ligand-binding pocket are generated by SURFNET [36].

References

    1. Schwede T, Sali A, et al. Outcome of a workshop on applications of protein models in biomedical research. Structure. 2009;17(2):151–9.
    2. Dessailly BH, Nair R, et al. PSI-2: structural genomics to cover protein domain family space. Structure. 2009;17(6):869–81.
    3. Terwilliger TC, Stuart D, Yokoyama S. Lessons from structural genomics. Annu Rev Biophys. 2009;38:371–83.
    4. Arnold K, Kiefer F, et al. The Protein Model Portal. J Struct Funct Genomics. 2009;10(1):1–8.
    5. Berman HM, Westbrook JD, et al. The protein structure initiative structural genomics knowledgebase. Nucleic Acids Res. 2009;37(Database issue):D365–8.