Database (Oxford). 2012 Apr 15:2012:bas019.
doi: 10.1093/database/bas019. Print 2012.

The PRINTS database: a fine-grained protein sequence annotation and analysis resource--its status in 2012

Teresa K Attwood et al. Database (Oxford).

Abstract

The PRINTS database, now in its 21st year, houses a collection of diagnostic protein family 'fingerprints'. Fingerprints are groups of conserved motifs, evident in multiple sequence alignments, whose unique inter-relationships provide distinctive signatures for particular protein families and structural/functional domains. As such, they may be used to assign uncharacterized sequences to known families, and hence to infer tentative functional, structural and/or evolutionary relationships. The February 2012 release (version 42.0) includes 2156 fingerprints, encoding 12 444 individual motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. Here, we report the current status of the database, and introduce a number of recent developments that help both to render a variety of our annotation and analysis tools easier to use and to make them more widely available. Database URL: www.bioinf.manchester.ac.uk/dbbrowser/PRINTS/.
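The idea of a fingerprint as an ordered group of motifs can be illustrated with a minimal sketch. It is not the PRINTS or FingerPRINTScan method, which the abstract does not describe; plain regular expressions are swapped in for whatever motif scoring the real tools use, and the motif patterns, the sequence and the function name `match_fingerprint` are all hypothetical, invented for illustration only.

```python
import re
from typing import List, Optional


def match_fingerprint(sequence: str, motifs: List[str]) -> Optional[List[int]]:
    """Return the start position of each motif if every motif occurs,
    in the given order and without overlap; otherwise return None.

    Assumption: motifs are written as regular expressions. This is a toy
    stand-in, not the scoring scheme used by any real PRINTS tool.
    """
    positions = []
    cursor = 0  # the next motif must start at or after this index
    for pattern in motifs:
        hit = re.compile(pattern).search(sequence, cursor)
        if hit is None:
            return None  # one missing motif means no full fingerprint match
        positions.append(hit.start())
        cursor = hit.end()
    return positions


# Hypothetical three-motif fingerprint and query sequences (not real PRINTS data).
TOY_FINGERPRINT = ["G[AS]HLNPA", "NPAR[ST]F", "GY[AS]VNPAR"]
FULL_MATCH = "MKTGSHLNPAVVLLNPARSFQQGYAVNPARDL"
PARTIAL_MATCH = "MKTGSHLNPAVVLLQQGYAVNPARDL"  # second motif is absent

print(match_fingerprint(FULL_MATCH, TOY_FINGERPRINT))
print(match_fingerprint(PARTIAL_MATCH, TOY_FINGERPRINT))
```

The sketch requires every motif to be present; a scanner that also reports partial matches would need a scoring step, which is left out here.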

Figures

Figure 1.
Illustration of a hierarchical PRINTS diagnosis. The UniProtKB/TrEMBL entry Q9NSV5_HUMAN was annotated as putative uncharacterized protein DKFZp434D2030; the family- and domain-database cross-references suggested membership of the major intrinsic protein (MIP) superfamily, but provided no specific family affiliation. The FingerPRINTScan result (inset) diagnoses the sequence both as a member of the MIP superfamily and as an aquaporin 6 subtype.
Figure 2.
Using the MINOTAUR curator-assistant tool to generate a protein report and extract structure-related sentences from the literature: (a) shows the BLAST-PRECIS input option, with putative G-protein-coupled receptor, Q9C929_ARATH, as the query sequence; (b) shows the returned PRECIS report from the top 7 BLAST hits, which suggests the sequence really belongs to the LanC-like protein family; (c) selection of relevant sentences from the PubMed query results, confirming that the sequence is unlikely to be a GPCR.
Figure 3.
Using MINOTAUR to generate a PRECIS report for query sequence, Q30HW6_9CICH. The report, culled from the top 11 BLAST hits, suggests the sequence is a green-sensitive opsin—accordingly, the annotation extracted from UniProtKB/Swiss-Prot relates to the function of opsins. However, the hierarchical PRINTS diagnosis suggests that the sequence is a rhodopsin. To shed light on the discrepancy, the sequence can be used to generate possible PubMed queries—the inset shows that the recommended query is ‘Green-sensitive AND opsin’, and the SVM-based ‘rank and extract function sentences’ qualifier has been selected to extract sentences from the 63 retrieved abstracts.
Figure 4.
Using MINOTAUR to select function-related sentences relevant to query sequence, Q30HW6_9CICH. The top sentences are shown, following use of the search options illustrated in Figure 3. The parent abstracts for each group of sentences may be quickly viewed by clicking on the appropriate icon (inset). In the examples highlighted, a set of green-sensitive opsins is noted to belong to a distinct ‘rhodopsin-like’ phylogenetic group, being more similar to rhodopsins than they are to other green pigments. This helps to resolve the apparent ambiguity in the PRINTS cross-reference to rhodopsins rather than to green-sensitive opsins: sequences in this group clearly have a rhodopsin-like sequence signature and not a ‘green’ one.
Figure 5.
Illustration of the PRECIS and FingerPRINTScan Web-service plugins integrated within Utopia’s CINEMA alignment editor: (a) shows an alignment of sequence Q9C929_ARATH with LanC-like proteins, with the context-sensitive menu invoking a Web-service plugin; (b) shows the report generated for this group of sequences by the PRECIS plugin; and (c) shows the FingerPRINTScan-plugin result for a PRINTS search with Q9C929_ARATH, which diagnoses the sequence as a eukaryotic LanC-like protein belonging to the LanC-like superfamily.
