Database (Oxford). 2020 Nov 20:2020:baaa087.

doi: 10.1093/database/baaa087.

GPCR-PEnDB: a database of protein sequences and derived features to facilitate prediction and classification of G protein-coupled receptors

Khodeza Begum^{1,2}, Jonathon E Mohl^{2,3,4}, Fredrick Ayivor¹, Eder E Perez⁴, Ming-Ying Leung^{1,2,3,4}

Affiliations

¹ Computational Science Program, The University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968, USA.
² Border Biomedical Research Center, The University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968, USA.
³ Bioinformatics Program, The University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968, USA.
⁴ Department of Mathematical Sciences, The University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968, USA.

PMID: 33216895
PMCID: PMC7678784
DOI: 10.1093/database/baaa087

Khodeza Begum et al. Database (Oxford). 2020.

Abstract

G protein-coupled receptors (GPCRs) constitute the largest group of membrane receptor proteins in eukaryotes. Because of their significant roles in physiological processes such as vision, smell and inflammation, GPCRs are the targets of many prescription drugs. However, the functional and sequence diversity of GPCRs has kept their prediction and classification from amino acid sequence data a challenging bioinformatics problem. Existing computational approaches, mainly based on machine learning and statistical methods, predict and classify GPCRs from amino acid sequences and sequence-derived features. In this paper, we describe a searchable MySQL database, named GPCR-PEnDB (GPCR Prediction Ensemble Database), of confirmed GPCRs and non-GPCRs. It was constructed to let users conveniently access useful information about GPCRs in a wide range of organisms and compile reliable training and testing datasets for different combinations of computational tools. The database currently contains 3129 confirmed GPCR and 3575 non-GPCR sequences collected from the UniProtKB/Swiss-Prot protein database, encompassing over 1200 species. The non-GPCR entries include transmembrane proteins, so that prediction programs can be evaluated on their ability to distinguish GPCRs from other transmembrane proteins. Each protein is linked to information about its source organism, classification, sequence lengths and composition, and other derived sequence features. We present examples of using this database, along with its graphical user interface, to query for GPCRs with specific sequence properties and to compare the accuracies of five tools for GPCR prediction. This initial version of GPCR-PEnDB provides a framework for future extensions that add sequence and feature data to facilitate the design and assessment of software tools and experimental studies aimed at understanding the functional roles of GPCRs.

Database URL: gpcr.utep.edu/database.
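To make the idea of "derived sequence features" concrete, the following is a minimal Python sketch of how sequence length and percent amino acid composition might be computed from a raw sequence. It is an illustration only, not the authors' pipeline: the function name `composition_features`, the `pct_` key prefix, and the toy sequence are all assumptions introduced here.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence: str) -> dict:
    """Return the sequence length and the percentage of each standard residue.

    Hypothetical helper for illustration; the feature names are not taken
    from GPCR-PEnDB.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    features = {"length": len(seq)}
    for aa in AMINO_ACIDS:
        features[f"pct_{aa}"] = 100.0 * counts.get(aa, 0) / len(seq)
    return features


# Made-up 10-residue sequence, not a real protein.
toy = "MSSTSLLSAA"
feats = composition_features(toy)
print(feats["length"], feats["pct_S"])
```

Features of this kind can be stored alongside each protein record so that queries can filter on them directly, for example on serine content.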

Figures

**Figure 1.**
Different regions of a typical GPCR molecule. A GPCR consists of a single polypeptide chain folded into seven transmembrane helices (TMH1–7) between an extracellular N-terminus and an intracellular C-terminus. The seven transmembrane helices are connected by three extracellular loops (ECL1–3) and three intracellular loops (ICL1–3).

**Figure 2.**
G protein-coupled receptor Prediction Ensemble Database (GPCR-PEnDB) overview showing the tables in the database, number of sequence entries, available web-server search options, and different types of algorithms for GPCR prediction and classification.

**Figure 3.**
Number of sequences in different groups of organisms in the GPCR datasets. Groups with more than 40 sequences are shown as separate bars. The remaining ones are grouped as “Others”.

**Figure 4.**
MySQL query asking for GPCRs in Class A with more than 10% serine and C-terminal longer than 300 amino acid residues.

**Figure 5.**
Web interface of GPCR-PEnDB, showing both Quick Search (top) and Advanced Search options (bottom).

**Figure 6.**
Results table from the search of GPCR sequences longer than 3000 amino acids using the web server. The table entries can be downloaded in CSV format by clicking on the “Result table” link, and the corresponding protein sequences can be downloaded in FASTA format by clicking on the “FASTA file” link.
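The kind of query described in the Figure 4 caption (Class A GPCRs with more than 10% serine and a C-terminal longer than 300 residues) can be sketched as follows. This is a hedged illustration, not the query shown in the figure: the table name `gpcr`, the column names, and the toy rows are hypothetical, since the actual GPCR-PEnDB schema is not given here. Python's built-in `sqlite3` stands in for MySQL so the example runs without a server.

```python
import sqlite3


def class_a_serine_rich(conn, min_serine_pct=10.0, min_c_term_len=300):
    """Return accessions of Class A entries above both thresholds.

    Table and column names are assumptions made for this sketch.
    """
    sql = """
        SELECT accession
        FROM gpcr
        WHERE gpcr_class = 'A'
          AND pct_serine > ?
          AND c_term_length > ?
        ORDER BY accession
    """
    rows = conn.execute(sql, (min_serine_pct, min_c_term_len)).fetchall()
    return [row[0] for row in rows]


# In-memory database with made-up records (not real GPCR-PEnDB data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpcr ("
    "accession TEXT PRIMARY KEY, gpcr_class TEXT, "
    "pct_serine REAL, c_term_length INTEGER)"
)
conn.executemany(
    "INSERT INTO gpcr VALUES (?, ?, ?, ?)",
    [
        ("TOY001", "A", 12.5, 350),  # passes all three conditions
        ("TOY002", "A", 8.0, 400),   # serine content too low
        ("TOY003", "B", 15.0, 500),  # wrong class
        ("TOY004", "A", 11.0, 120),  # C-terminal too short
    ],
)
hits = class_a_serine_rich(conn)
print(hits)
```

Passing the thresholds as bound parameters rather than formatting them into the SQL string keeps the filter reusable for other cut-offs.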

Cited by

Stringent in-silico identification of putative G-protein-coupled receptors (GPCRs) of the entomopathogenic nematode Heterorhabditis bacteriophora.
Kundu A, Jaiswal N, Rao U, Somvanshi VS. J Nematol. 2023 Nov 23;55(1):20230038. doi: 10.2478/jofnem-2023-0038. eCollection 2023 Feb. PMID: 38026552.
Rapid transcriptomic and physiological changes in the freshwater pennate diatom Mayamaea pseudoterrestris in response to copper exposure.
Suzuki S, Ota S, Yamagishi T, Tuji A, Yamaguchi H, Kawachi M. DNA Res. 2022 Dec 1;29(6):dsac037. doi: 10.1093/dnares/dsac037. PMID: 36197113.
Chromosome-level assembly of the Phytophthora agathidicida genome reveals adaptation in effector gene families.
Cox MP, Guo Y, Winter DJ, Sen D, Cauldron NC, Shiller J, Bradley EL, Ganley AR, Gerth ML, Lacey RF, McDougal RL, Panda P, Williams NM, Grunwald NJ, Mesarich CH, Bradshaw RE. Front Microbiol. 2022 Nov 2;13:1038444. doi: 10.3389/fmicb.2022.1038444. eCollection 2022. PMID: 36406440.
AnnoPRO: a strategy for protein function annotation based on multi-scale protein representation and a hybrid deep learning of dual-path encoding.
Zheng L, Shi S, Lu M, Fang P, Pan Z, Zhang H, Zhou Z, Zhang H, Mou M, Huang S, Tao L, Xia W, Li H, Zeng Z, Zhang S, Chen Y, Li Z, Zhu F. Genome Biol. 2024 Feb 1;25(1):41. doi: 10.1186/s13059-024-03166-1. PMID: 38303023.
What Makes GPCRs from Different Families Bind to the Same Ligand?
Dankwah KO, Mohl JE, Begum K, Leung MY. Biomolecules. 2022 Jun 21;12(7):863. doi: 10.3390/biom12070863. PMID: 35883418.

Grants and funding

G12 MD007592/MD/NIMHD NIH HHS/United States
