J Cheminform. 2016 Feb 4;8:7.
doi: 10.1186/s13321-016-0121-y. eCollection 2016.

Selectivity profiling of BCRP versus P-gp inhibition: from automated collection of polypharmacology data to multi-label learning

Floriane Montanari et al. J Cheminform.

Abstract

Background: The human ATP binding cassette transporters Breast Cancer Resistance Protein (BCRP) and Multidrug Resistance Protein 1 (P-gp) are co-expressed in many tissues and barriers, especially at the blood-brain barrier and at the hepatocyte canalicular membrane. Understanding their interplay in affecting the pharmacokinetics of drugs is of prime interest. In silico tools to predict inhibition and substrate profiles towards BCRP and P-gp might serve as early filters in the drug discovery and development process. However, to build such models, pharmacological data must be collected for both targets, which is a tedious task, often involving manual and poorly reproducible steps.

Results: Compounds with inhibitory activity measured against BCRP and/or P-gp were retrieved by combining Open Data and manually curated data from literature using a KNIME workflow. After determination of compound overlap, machine learning approaches were used to establish multi-label classification models for BCRP/P-gp. Different ways of addressing multi-label problems were explored and compared: label-powerset, binary relevance and classifiers chain. Label-powerset revealed important molecular features for selective or polyspecific inhibitory activity. In our dataset, two descriptors alone (the numbers of hydrophobic and aromatic atoms) were sufficient to separate selective BCRP inhibitors from selective P-gp inhibitors. Dual inhibitors also share properties with both groups of selective inhibitors. Binary relevance and classifiers chain made it possible to improve the predictivity of the models.
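
The three multi-label strategies named above differ mainly in how the pair of transporter labels is presented to a learner. The following is a minimal Python sketch of those problem transformations only, not the authors' KNIME workflow or models. The toy labels, the feature values, the class names and the chain order (BCRP first, then P-gp) are all assumptions made for illustration.

```python
# Toy (BCRP, P-gp) inhibition labels: 1 = inhibitor, 0 = non-inhibitor.
# All values below are invented for illustration, not taken from the paper.
labels = [(1, 0), (0, 1), (1, 1), (0, 0), (1, 1)]
features = [[0.2, 3.0], [0.9, 5.0], [0.5, 4.0], [0.1, 1.0], [0.6, 4.5]]


def label_powerset(y):
    """Label-powerset: each label combination becomes one class
    of a single multi-class problem."""
    names = {
        (0, 0): "inactive",
        (1, 0): "BCRP-selective",
        (0, 1): "P-gp-selective",
        (1, 1): "dual",
    }
    return [names[pair] for pair in y]


def binary_relevance(y):
    """Binary relevance: one independent binary target per transporter."""
    return {"BCRP": [b for b, _ in y], "P-gp": [p for _, p in y]}


def classifier_chain_inputs(x, y):
    """Classifiers chain: the second model (P-gp, by assumption) sees the
    original features plus the BCRP label. True labels are used here, as at
    training time; at prediction time the first model's output would be used."""
    return [row + [b] for row, (b, _) in zip(x, y)]


print(label_powerset(labels))
print(binary_relevance(labels))
print(classifier_chain_inputs(features, labels))
```

Any binary or multi-class learner can then be trained on the transformed targets. Label-powerset keeps the selective and dual classes explicit, which is consistent with its use above for identifying features that drive selectivity.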

Conclusions: The KNIME workflow proved a useful tool to merge data from diverse sources. It could be used for building multi-label datasets of any set of pharmacological targets for which there is data available either in the open domain or in-house. By applying various multi-label learning algorithms, important molecular features driving transporter selectivity could be retrieved. Finally, using the dataset with missing annotations, predictive models can be derived in cases where no accurate dense dataset is available (not enough data overlap or no well balanced class distribution).

Keywords: BCRP; Binary relevance; Classifiers chain; KNIME; Multi-label classification; Open Data; Open PHACTS; P-glycoprotein; Polyspecific inhibition; Selective inhibition.

Figures

Graphical abstract
Fig. 1
Depiction of the data collection workflow
Fig. 2
Analysis of the scaffolds present in at least five compounds of the dense dataset. A Top left: distribution of compounds sharing the scaffolds. Bottom: depiction of the six scaffolds (a–f). B Binary heat map representations of inhibitory activities for BCRP and P-gp of the compounds sharing scaffolds a, c and d (left heat map), scaffold e (middle heat map) or f (right heat map): red bars, inhibitors; blue bars, non-inhibitors; abscissae: targets; ordinates: compounds annotated with ChEMBL compound IDs
Fig. 3
Projection of the dense dataset (yellow dots) over the PCA transformations obtained for the sparse dataset (black dots) using MACCS fingerprints
Fig. 4
Distribution of SlogP among the three kinds of inhibitors. Inhibitors of P-gp only: red bars (class 1); inhibitors of BCRP only: green bars (class 2); inhibitors of both P-gp and BCRP: blue bars (class 3). Top panel: bar plot of the counts per binned value of SlogP. Middle panel: proportions of each class in each bin, obtained by scaling each bin count to 100 %. Lower panel: Matthews Correlation Coefficient (MCC) that would be obtained by splitting the data at each SlogP value. MCC values that peak above or below 0 show ideal thresholds to separate the data between classes. The colored dotted lines correspond to the peaks of MCC and the corresponding SlogP values (between 3 and 4) for separating class 1 from 2 (red dotted lines) and class 2 from 3 (green dotted lines)
Fig. 5
View of the 3D embedding proposed by CheS-Mapper using the descriptors SlogP, a_hyd, a_aro, vsa_acc, BalabanJ, a_donacc, weinerPath. Red: P-gp-selective inhibitors (class 1); green: BCRP-selective inhibitors (class 2); blue: dual inhibitors (class 3)
Fig. 6
Tree depiction of the JRip model to separate P-gp-selective inhibitors (red leaf) from BCRP-selective inhibitors (green leaves). a_hyd: number of hydrophobic atoms; a_aro: number of aromatic atoms. The numbers in the leaves correspond to the number of compounds in the training set that ended up in that leaf (left number) and the number of compounds in the training set that were mispredicted in that leaf (right number)
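
The lower panel of Fig. 4 reports the MCC obtained when the data are split at each SlogP value. A minimal Python sketch of such a threshold scan is given below, assuming a simple rule that predicts the positive class when the descriptor value is at or above the threshold. The function names, the descriptor values and the class assignments are hypothetical and chosen only to make the example runnable; they are not data from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; defined as 0 when the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def mcc_scan(values, is_positive):
    """Return (threshold, MCC) for every observed value used as a split point.
    A compound is predicted positive when its value is >= the threshold."""
    results = []
    for t in sorted(set(values)):
        pairs = list(zip(values, is_positive))
        tp = sum(1 for v, p in pairs if v >= t and p)
        fp = sum(1 for v, p in pairs if v >= t and not p)
        fn = sum(1 for v, p in pairs if v < t and p)
        tn = sum(1 for v, p in pairs if v < t and not p)
        results.append((t, mcc(tp, tn, fp, fn)))
    return results


# Invented descriptor values and class membership (hypothetical, for illustration).
slogp = [1.5, 2.0, 2.8, 3.6, 4.1, 4.6]
in_class_a = [False, False, False, True, True, True]

scan = mcc_scan(slogp, in_class_a)
# Peaks above or below 0 both indicate a useful split, so rank by |MCC|.
best = max(scan, key=lambda r: abs(r[1]))
print(best)
```

Ranking by absolute value mirrors the caption's remark that peaks above or below 0 both mark candidate thresholds; the sign only indicates which side of the split the positive class falls on.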
