Nucleic Acids Res. 2021 Jan 8;49(D1):D344-D354.
doi: 10.1093/nar/gkaa977.

The InterPro protein families and domains database: 20 years on

Matthias Blum et al. Nucleic Acids Res.

Abstract

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.
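As a rough illustration of how the REST API mentioned above might be queried, the following Python sketch builds a request URL for a single entry and fetches it as JSON. The base path `https://www.ebi.ac.uk/interpro/api` and the `/entry/<source database>/<accession>` layout are assumptions modelled on the website URLs, and the helper names `entry_url` and `fetch_entry` are hypothetical; the API documentation should be consulted for the authoritative endpoints and response format.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base path of the InterPro REST API (not stated in the abstract).
API_BASE = "https://www.ebi.ac.uk/interpro/api"


def entry_url(source_db: str, accession: str) -> str:
    """Build the URL for one entry, e.g. a Pfam or InterPro accession.

    Hypothetical helper: the path layout is an assumption that mirrors
    the website URL scheme.
    """
    return f"{API_BASE}/entry/{quote(source_db.lower())}/{quote(accession)}"


def fetch_entry(source_db: str, accession: str, timeout: float = 30.0) -> dict:
    """Fetch one entry and decode the JSON body (requires network access)."""
    request = urllib.request.Request(
        entry_url(source_db, accession),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # PF00870 is the P53 DNA-binding domain referenced in Figure 4.
    print(entry_url("pfam", "PF00870"))
```

Calling `fetch_entry("pfam", "PF00870")` would issue the request. The structure of the returned JSON is not described here, so any field access should be checked against a real response.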

Figures

Figure 1.
InterPro coverage of amino acid residues in UniProtKB. (A) Unique residue coverage of UniProtKB by signatures integrated into InterPro, member database signatures awaiting integration, intrinsically disordered regions, and other regions predicted to be signal peptides, transmembrane domains or coiled-coils. (B) Residue coverage of InterPro's contributing member databases. Residues matched by signatures integrated into InterPro are shown in blue, and residues found only in signatures not yet integrated are shown in orange.
Figure 2.
The InterPro protein viewer for the structure PDB:1CUK chain A of E. coli protein RuvA. Three sections of the image are highlighted, (A) viewer options, (B) secondary structure track and (C) Genome3D annotations.
Figure 3.
The InterPro protein viewer for the isoform P04637-3 of protein P04637.
Figure 4.
The InterPro multiple sequence alignment viewer for the P53 DNA-binding domain (https://www.ebi.ac.uk/interpro/entry/pfam/PF00870/entry_alignments/).
Figure 5.
The InterPro Domain Architecture search interface.
Figure 6.
MeSH keyword network for papers mentioning InterPro. The image was generated with VOSviewer, using its Europe PubMed Central API option to search for papers that mention InterPro in the title or abstract. Of the 426 papers matched, only MeSH keywords mentioned at least four times are included in the network.
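The inclusion threshold described for Figure 6 can be sketched in a few lines of Python. This is not VOSviewer's implementation, only a minimal illustration of the filtering rule; the function name `frequent_keywords`, the input layout (one keyword list per paper) and the choice to count each keyword once per paper are all assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def frequent_keywords(
    papers: Iterable[List[str]], min_count: int = 4
) -> Dict[str, int]:
    """Return keywords that occur in at least `min_count` papers.

    Assumption: a keyword is counted once per paper, even if it is
    listed more than once for that paper.
    """
    counts = Counter(kw for keywords in papers for kw in set(keywords))
    return {kw: n for kw, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    papers = [["Proteins", "Databases"]] * 4 + [["Genomics"]] * 3
    print(frequent_keywords(papers))
```

With the toy data, "Proteins" and "Databases" each appear in four papers and are kept, while "Genomics" appears in only three and is dropped.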
