J Cheminform. 2022 Jul 7;14(1):44.
doi: 10.1186/s13321-022-00619-2.

Blood-brain barrier penetration prediction enhanced by uncertainty estimation

Xiaochu Tong et al. J Cheminform. 2022.

Abstract

The blood-brain barrier is a pivotal factor in central nervous system (CNS) drug development, and rapid in silico assessment of the blood-brain barrier permeability (BBBp) of compounds is of great value in early drug discovery. Here, we focus on whether and how uncertainty estimation methods improve in silico BBBp models. We briefly surveyed the current state of in silico BBBp prediction and of uncertainty estimation methods for deep learning models, and curated an independent dataset to determine the reliability of state-of-the-art algorithms. The results show that, although graph neural network-based deep learning models and conventional physicochemical-based machine learning models perform comparably on BBBp prediction, the GROVER-BBBp model improves greatly when uncertainty estimation is used. In particular, a strategy combining Entropy and MC-dropout can raise the accuracy of distinguishing BBB+ from BBB- to above 99% by extracting only high-confidence predictions (uncertainty score < 0.1). Case studies on preclinical/clinical drugs for Alzheimer's disease and on marketed antitumor drugs, verified against the literature, demonstrate the practical value of the uncertainty estimation-enhanced BBBp prediction model, which may facilitate drug discovery for CNS diseases and metastatic brain tumors.

Keywords: BBBp prediction; Blood–brain barrier penetration; Uncertainty estimation.
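
The confidence filter described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Entropy and MC-dropout scores are defined or combined, so the normalisation, the averaging rule, and all function names below are assumptions made for illustration. Only the 0.1 cut-off comes from the abstract.

```python
import math
from statistics import mean, pstdev


def binary_entropy(p: float) -> float:
    """Shannon entropy (base 2) of a BBB+ probability p; ranges from 0 to 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mc_dropout_uncertainty(samples):
    """Spread of the probabilities from repeated stochastic forward passes.

    Assumption: population standard deviation, doubled so that the score
    lies in [0, 1] (the largest possible std of values in [0, 1] is 0.5).
    """
    return 2.0 * pstdev(samples)


def combined_uncertainty(samples):
    """Hypothetical ensemble score: plain average of the two uncertainties."""
    return 0.5 * (binary_entropy(mean(samples)) + mc_dropout_uncertainty(samples))


def confident_predictions(mc_samples, threshold=0.1):
    """Keep only molecules whose combined uncertainty is below the threshold."""
    kept = {}
    for name, samples in mc_samples.items():
        score = combined_uncertainty(samples)
        if score < threshold:
            label = "BBB+" if mean(samples) >= 0.5 else "BBB-"
            kept[name] = (label, score)
    return kept


if __name__ == "__main__":
    # Made-up probabilities standing in for dropout-enabled forward passes.
    demo = {
        "mol_a": [0.995, 0.996, 0.994, 0.995],
        "mol_b": [0.55, 0.40, 0.70, 0.50],
        "mol_c": [0.004, 0.005, 0.006, 0.005],
    }
    for name, (label, score) in confident_predictions(demo).items():
        print(name, label, round(score, 4))
```

In this toy input, only the two molecules with consistent, near-extreme probabilities pass the filter; the ambiguous one is set aside rather than labelled.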

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Analysis of defective molecules in M-data and of the chemical-space distributions of S-data and M-data. a A list of defective molecules in M-data. b The distribution of maximum similarities within M-data (blue) and of the maximum similarity of each molecule in S-data relative to M-data (red), both based on ECFP4. c t-SNE distribution of M-data and S-data based on ECFP4
Fig. 2
Prediction performance on S-data by BBBp prediction models. Each bar and its error bar indicate the mean and variance, respectively, of 5 runs of the model. Statistical t-tests were applied between the model with the highest metric score and the others, and statistically significant results are marked (*p < 0.05)
Fig. 3
Prediction performance after introducing different uncertainty estimation methods for BBBp prediction models. a The MCC curves for different uncertainty estimation methods in GROVER, namely Entropy, MC-dropout, Multi-initial, FPsDist, LatentDist and the random method. The x-axis is the proportion of compounds remaining in S-data as the compounds with high uncertainty are sequentially discarded, and the y-axis is the corresponding MCC of the BBBp prediction model. The MCC_AUC is shown in parentheses. b The MCC curves for different uncertainty estimation methods in Attentive FP. c The MCC curves for different uncertainty estimation methods in MLP(PCP). d The MCC curves for different uncertainty estimation methods in RF(PCP)
Fig. 4
Prediction results from GROVER-BBBp model on S-data within different uncertainty ranges, and corresponding numbers of molecules. a Entropy method. b MC-dropout method. c Multi-initial method. d FPsDist method. e LatentDist method. f Random method
Fig. 5
Prediction performance after introducing ensemble uncertainty, and t-SNE distributions for molecules in M-data and S-data. a The MCC curves of Entropy, MC-dropout and their ensemble. b Prediction results for molecules in S-data within different ranges of the ensemble uncertainty, and the corresponding numbers of molecules. c t-SNE distribution of M-data based on the latent representation of GROVER. d t-SNE distribution of S-data based on the latent representation of GROVER; the size of each point represents the uncertainty of the prediction. The larger the point, the smaller the uncertainty value
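
The MCC curves in Figs. 3 and 5 plot the Matthews correlation coefficient against the proportion of compounds retained as the most uncertain ones are discarded. A minimal sketch of that construction follows. The function names, the number of steps, and the trapezoidal definition of the area are assumptions made for illustration; the captions do not specify how MCC_AUC is computed.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = BBB+, 0 = BBB-)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when the coefficient is undefined (zero denominator).
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mcc_rejection_curve(y_true, y_pred, uncertainty, steps=10):
    """Return (retained_fraction, MCC) pairs, keeping the most certain compounds first."""
    order = sorted(range(len(y_true)), key=lambda i: uncertainty[i])
    n = len(order)
    curve = []
    for k in range(1, steps + 1):
        keep = order[: max(1, round(n * k / steps))]
        score = mcc([y_true[i] for i in keep], [y_pred[i] for i in keep])
        curve.append((len(keep) / n, score))
    return curve


def curve_auc(curve):
    """Trapezoidal area under the curve (an assumed stand-in for MCC_AUC)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Made-up labels, predictions and uncertainty scores for eight compounds.
    y_true = [1, 0, 1, 0, 1, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    unc = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.9, 0.8]
    curve = mcc_rejection_curve(y_true, y_pred, unc, steps=4)
    print(curve, curve_auc(curve))
```

A useful uncertainty estimate places the misclassified compounds among the most uncertain, so the curve stays high until nearly all compounds are retained; a random ordering would give a roughly flat curve.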
