PLoS One. 2012;7(2):e30869.
doi: 10.1371/journal.pone.0030869. Epub 2012 Feb 21.

iNR-PhysChem: a sequence-based predictor for identifying nuclear receptors and their subfamilies via physical-chemical property matrix

Xuan Xiao et al. PLoS One. 2012.

Abstract

Nuclear receptors (NRs) form a family of ligand-activated transcription factors that regulate a wide variety of biological processes, such as homeostasis, reproduction, development, and metabolism. The human genome contains 48 genes encoding NRs. These receptors have become one of the most important targets for therapeutic drug development. According to their different action mechanisms or functions, NRs have been classified into seven subfamilies. With the avalanche of protein sequences generated in the postgenomic age, we face the following challenging problems. Given an uncharacterized protein sequence, how can we identify whether it is a nuclear receptor? If it is, which subfamily does it belong to? To address these problems, we developed a predictor called iNR-PhysChem, in which the protein samples are represented by a novel mode of pseudo amino acid composition (PseAAC) whose components are derived from a physical-chemical matrix via a series of auto-covariance and cross-covariance transformations. The overall success rate achieved by iNR-PhysChem was over 98% in distinguishing NRs from non-NRs, and over 92% in classifying NRs among the following seven subfamilies: NR1 (thyroid hormone like), NR2 (HNF4-like), NR3 (estrogen like), NR4 (nerve growth factor IB-like), NR5 (fushi tarazu-F1 like), NR6 (germ cell nuclear factor like), and NR0 (knirps like). These rates were derived by jackknife tests on a stringent benchmark dataset in which none of the included protein sequences has ≥60% pairwise sequence identity to any other in the same subset. iNR-PhysChem is freely accessible to the public as a user-friendly web-server at either http://www.jci-bioinfo.cn/iNR-PhysChem or http://icpr.jci.edu.cn/bioinfo/iNR-PhysChem. A step-by-step guide is also provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics involved in developing the predictor.
It is anticipated that iNR-PhysChem may become a useful high throughput tool for both basic research and drug design.
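
The jackknife test mentioned in the abstract is commonly implemented as leave-one-out validation: each sample is held out in turn, the predictor is trained on the rest, and the held-out sample is then predicted. The sketch below illustrates only that protocol. The 1-nearest-neighbor classifier, the function names, and the toy feature vectors are assumptions made for illustration; they are not the classifier or the data used by iNR-PhysChem.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in classifier (assumption): label of the closest training vector."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((p - q) ** 2 for p, q in zip(train_x[i], query)),
    )
    return train_y[best]


def jackknife_success_rate(features, labels, predict=nearest_neighbor_predict):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical two-dimensional feature vectors, for illustration only.
toy_features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
                [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_labels = ["NR", "NR", "NR", "non-NR", "non-NR", "non-NR"]

rate = jackknife_success_rate(toy_features, toy_labels)
print(f"jackknife success rate: {rate:.2f}")
```

Because every sample is used for testing exactly once and the split involves no randomness, the resulting rate is unique for a given dataset.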

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. An illustration to show two types of covariance.
(a) The auto-covariance refers to the coupling between two subsequences from the same sequence when they are separated by a given number of units. (b) The cross-covariance refers to the coupling between two subsequences from two different sequences, as indicated by two open curly braces.
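
The two couplings in Figure 1 can be sketched numerically. The code below uses the common lagged-covariance form, in which mean-centered values are multiplied at a fixed offset and averaged. The exact formulas of the paper are not reproduced in this excerpt, so the normalization, the function names, and the property values are assumptions made for illustration.

```python
from statistics import fmean


def cross_covariance(a, b, lag):
    """Average coupling between a[i] and b[i + lag], both mean-centered."""
    if len(a) != len(b):
        raise ValueError("both sequences must have the same length")
    n = len(a)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < sequence length")
    mean_a, mean_b = fmean(a), fmean(b)
    total = sum((a[i] - mean_a) * (b[i + lag] - mean_b) for i in range(n - lag))
    return total / (n - lag)


def auto_covariance(values, lag):
    """Auto-covariance is the cross-covariance of a sequence with itself."""
    return cross_covariance(values, values, lag)


# Hypothetical per-residue property values for one short sequence.
property_one = [1.0, 2.0, 3.0, 4.0]
property_two = [2.0, 1.0, 2.0, 1.0]

ac = auto_covariance(property_one, lag=1)
cc = cross_covariance(property_one, property_two, lag=1)
print(ac, cc)
```

Computing these values over a range of lags and over all properties (and property pairs) yields a fixed-length feature vector regardless of sequence length, which is the usual motivation for this kind of transformation.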
Figure 2. A flowchart to show the prediction process of iNR-PhysChem.
T1 represents the benchmark dataset for training the 1st-level prediction; T2 represents the benchmark dataset for training the 2nd-level prediction. See the text for further explanation.
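
The two-level process in Figure 2 can be read as a simple cascade: the 1st level decides NR versus non-NR, and only predicted NRs are passed to the 2nd level for subfamily assignment. The following is a minimal sketch of that control flow. The function names and the toy decision rules are hypothetical placeholders for the trained classifiers, not the logic of the actual web-server.

```python
def predict_two_level(features, is_nuclear_receptor, assign_subfamily):
    """Run the 1st-level check; run the 2nd level only for predicted NRs."""
    if not is_nuclear_receptor(features):
        return {"is_nr": False, "subfamily": None}
    return {"is_nr": True, "subfamily": assign_subfamily(features)}


# Hypothetical stand-ins for classifiers trained on T1 and T2.
def toy_level1(features):
    return sum(features) > 1.0


def toy_level2(features):
    return "NR1" if features[0] >= features[1] else "NR2"


result_nr = predict_two_level([0.9, 0.4], toy_level1, toy_level2)
result_non_nr = predict_two_level([0.1, 0.2], toy_level1, toy_level2)
print(result_nr, result_non_nr)
```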
Figure 3. An illustration to show the predicted results falling into four different quadrants.
(I) TP, the true positive quadrant (green) for correct prediction of positive dataset; (II) FP, the false positive quadrant (red) for incorrect prediction of negative dataset; (III) TN, the true negative quadrant (blue) for correct prediction of negative dataset; and (IV) FN, the false negative quadrant (pink) for incorrect prediction of positive dataset.
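
The four quadrant counts are typically combined into summary metrics. The sketch below computes sensitivity, specificity, overall accuracy, and the Matthews correlation coefficient using their standard definitions. Whether the paper reports exactly these metrics is not stated in this excerpt, and the counts used here are hypothetical.

```python
from math import sqrt


def quadrant_metrics(tp, fp, tn, fn):
    """Standard metrics from the four quadrant counts.

    Assumes at least one positive and one negative sample, so that
    sensitivity and specificity are well defined.
    """
    total = tp + fp + tn + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),   # fraction of positives recovered
        "specificity": tn / (tn + fp),   # fraction of negatives recovered
        "accuracy": (tp + tn) / total,   # overall success rate
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts, for illustration only.
metrics = quadrant_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```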
Figure 4. A semi-screenshot showing the top page of iNR-PhysChem.
The web-server is at either http://www.jci-bioinfo.cn/iNR-PhysChem or http://icpr.jci.edu.cn/bioinfo/iNR-PhysChem.
Figure 5. A 3D graph showing the success rates obtained by 5-fold cross-validation with different values of C and a second parameter in the SVM engine.
(a) The results obtained for the 1st-level prediction. (b) The results obtained for the 2nd-level prediction.
