LGBMDF: A cascade forest framework with LightGBM for predicting drug-target interactions

Yu Peng¹, Shouwei Zhao¹, Zhiliang Zeng¹, Xiang Hu¹, Zhixiang Yin¹

PMID: 36687573
PMCID: PMC9849804
DOI: 10.3389/fmicb.2022.1092467

Yu Peng et al. Front Microbiol. 2023 Jan 5;13:1092467. doi: 10.3389/fmicb.2022.1092467. eCollection 2022.

Affiliation

¹ School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, China.

Abstract

Prediction of drug-target interactions (DTIs) plays an important role in drug development. However, traditional laboratory methods for determining DTIs are time-consuming and expensive. In recent years, many studies have shown that using machine learning methods to predict DTIs can speed up the drug development process and reduce its cost. An excellent DTI prediction method should combine high prediction accuracy with low computational cost. Previous research based on deep forests used XGBoost as the estimator in the cascade. In this study, we replaced XGBoost with LightGBM as the estimator in the cascade forest, and determined experimentally that the estimator group should consist of three LightGBMs and three ExtraTrees. We call this new model LGBMDF. We conducted 5-fold cross-validation on LGBMDF and other state-of-the-art methods using the same dataset, and compared their sensitivity (Sn), specificity (Sp), Matthews correlation coefficient (MCC), AUC and AUPR. We found that our method achieves better performance and faster computation.
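Three of the metrics named in the abstract (Sn, Sp and MCC) can be computed directly from the counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the example counts are illustrative assumptions, not the authors' evaluation code or results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (Sn, Sp, MCC) from binary confusion-matrix counts.

    Standard definitions; a zero denominator yields 0.0 by convention here.
    """
    # Sensitivity: fraction of true interactions that were predicted.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-interactions that were rejected.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, mcc


# Made-up counts, for illustration only.
sn, sp, mcc = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

AUC and AUPR are omitted from this sketch because they depend on ranked prediction scores rather than on a single confusion matrix.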

Keywords: LightGBM; deep forest; drug-target interactions; machine learning; prediction.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
The pipeline of LGBMDF. After obtaining the features of drugs and targets, we process them with a cascade forest in which each level uses 3 LightGBMs and 3 ExtraTrees as estimators. Each estimator outputs a 2-dimensional class vector, and the output class vectors are concatenated with the original feature vector to form the input vector for the next level.
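The concatenation step described in this caption can be sketched as follows. This is a hedged, library-free illustration: the six estimators are replaced by hypothetical stubs (`make_stub`) that return a fixed 2-dimensional class vector, where a real level would hold trained LightGBM and ExtraTrees models. The function name `cascade_level` and the ordering (class vectors first, original features last) are assumptions.

```python
from typing import Callable, List, Sequence

ClassVector = List[float]  # [P(no interaction), P(interaction)]
Estimator = Callable[[Sequence[float]], ClassVector]


def cascade_level(original: Sequence[float],
                  level_input: Sequence[float],
                  estimators: Sequence[Estimator]) -> List[float]:
    """Run one cascade level and build the input vector for the next level."""
    class_vectors: List[float] = []
    for estimator in estimators:
        vec = estimator(level_input)
        if len(vec) != 2:
            raise ValueError("each estimator must output a 2-dimensional class vector")
        class_vectors.extend(vec)
    # Next-level input = all class vectors + the original feature vector.
    return class_vectors + list(original)


def make_stub(p: float) -> Estimator:
    """Hypothetical stand-in for a trained estimator (always predicts p)."""
    return lambda x: [1.0 - p, p]


# Six stubs stand in for 3 LightGBMs and 3 ExtraTrees.
estimators = [make_stub(p) for p in (0.1, 0.2, 0.3, 0.6, 0.7, 0.8)]
features = [0.5, 1.0, 0.0, 2.0]

level1 = cascade_level(features, features, estimators)
level2 = cascade_level(features, level1, estimators)
```

With six estimators, each level appends 12 values to the original features, so the input width stays constant from the second level onward.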

**Figure 2**
The construction of histogram.

**Figure 3**
The histogram of a node is obtained by subtracting the histogram of its sibling node from the histogram of the parent node, so that the speed can be doubled.
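Histogram construction and the subtraction trick can be illustrated with a toy example. This is a minimal sketch, assuming one already-binned feature and one gradient value per row; the names `build_histogram` and `subtract_histogram` and all data values are hypothetical, and LightGBM's real histograms also track other statistics such as counts.

```python
def build_histogram(bin_ids, grads, n_bins):
    """Accumulate gradient sums per feature bin (one pass over the rows)."""
    hist = [0.0] * n_bins
    for b, g in zip(bin_ids, grads):
        hist[b] += g
    return hist


def subtract_histogram(parent, child):
    """Sibling histogram = parent histogram - child histogram, bin by bin."""
    return [p - c for p, c in zip(parent, child)]


# Toy data: the bin index of one feature and the gradient of each row.
bin_ids = [0, 1, 1, 2, 0, 2]
grads = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
left_rows = [0, 1, 3]   # rows routed to the left child
right_rows = [2, 4, 5]  # rows routed to the right child

parent = build_histogram(bin_ids, grads, n_bins=3)
left = build_histogram([bin_ids[i] for i in left_rows],
                       [grads[i] for i in left_rows], n_bins=3)

# The right child needs no pass over its rows, only a subtraction over bins.
right_by_subtraction = subtract_histogram(parent, left)

# Direct construction, shown only to confirm that both routes agree.
right_direct = build_histogram([bin_ids[i] for i in right_rows],
                               [grads[i] for i in right_rows], n_bins=3)
```

The saving comes from scanning only one child's rows; the other child costs a pass over the bins rather than over the data.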

**Figure 4**
Comparison of tree growth patterns between XGBoost and LightGBM. **(A)** XGBoost uses the level-wise growth strategy, which can split the leaves of the same level at the same time by traversing the data once. **(B)** LightGBM uses the leaf-wise growth strategy, which finds the leaf with the largest splitting gain from all the current leaves, and then splits it.
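The difference between the two growth strategies can be shown on a toy tree. The sketch below is an illustration under simplifying assumptions, not either library's implementation: split gains are given in a fixed, hypothetical table (`gains`) instead of being computed from data, and node `i` has children `2*i + 1` and `2*i + 2`.

```python
import heapq


def children(node):
    """Children of a node in a complete binary tree layout."""
    return 2 * node + 1, 2 * node + 2


def grow_level_wise(gains, max_depth):
    """Split every splittable leaf of a level before moving to the next level."""
    order, level = [], [0]
    for _ in range(max_depth):
        next_level = []
        for leaf in level:
            if gains.get(leaf, 0.0) > 0.0:
                order.append(leaf)
                next_level.extend(children(leaf))
        level = next_level
    return order


def grow_leaf_wise(gains, max_splits):
    """Always split the current leaf with the largest gain."""
    order = []
    heap = [(-gains.get(0, 0.0), 0)]  # max-heap via negated gains
    while heap and len(order) < max_splits:
        neg_gain, leaf = heapq.heappop(heap)
        if -neg_gain <= 0.0:
            break  # the best remaining leaf offers no gain
        order.append(leaf)
        for child in children(leaf):
            heapq.heappush(heap, (-gains.get(child, 0.0), child))
    return order


# Hypothetical split gains per node; missing nodes have gain 0.
gains = {0: 5.0, 1: 1.0, 2: 4.0, 3: 0.2, 4: 0.1, 5: 3.0, 6: 0.5}

level_order = grow_level_wise(gains, max_depth=2)
leaf_order = grow_leaf_wise(gains, max_splits=3)
```

With the same budget of three splits, the level-wise order spends a split on the low-gain node 1, while the leaf-wise order goes deeper along the high-gain branch through nodes 2 and 5.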

**Figure 5**
Bind mutually exclusive features into a single feature.
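Bundling can be sketched by giving each feature its own range of bin values inside a single merged column. The following is a simplified illustration that assumes already-binned features in which bin 0 means "zero/default" and that requires strict mutual exclusivity; the function name `bundle_exclusive` and the data are hypothetical, and this is not LightGBM's actual bundling code.

```python
def bundle_exclusive(columns, n_bins):
    """Merge mutually exclusive binned features into one column.

    columns: list of feature columns holding bin indices (0 = zero/default).
    n_bins:  number of bins of each feature, including bin 0.
    Returns the bundled column and the offset added to each feature.
    """
    offsets, total = [], 0
    for nb in n_bins:
        offsets.append(total)
        total += nb - 1  # each feature contributes its non-zero bins

    bundled = [0] * len(columns[0])
    for col, off in zip(columns, offsets):
        for row, value in enumerate(col):
            if value != 0:
                if bundled[row] != 0:
                    raise ValueError("features are not mutually exclusive")
                # Shift the bin so each feature keeps a distinct range.
                bundled[row] = value + off
    return bundled, offsets


# Two toy features that are never non-zero in the same row.
feature_a = [1, 0, 2, 0]  # 3 bins: 0, 1, 2
feature_b = [0, 3, 0, 1]  # 4 bins: 0, 1, 2, 3

bundled, offsets = bundle_exclusive([feature_a, feature_b], n_bins=[3, 4])
```

Because the value ranges do not overlap, a split on the bundled column can still separate the values that came from each original feature.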

**Figure 6**
Model performance comparison under each estimator setting. **(A)** AUC and AUPR for 4 estimator combinations. **(B)** Computational time for 4 estimator combinations.

**Figure 7**
Sn, Sp, MCC, AUC and AUPR of LGBMDF, AOPEDF, NEDTP, RF and SVM.

Cited by

ISLRWR: A network diffusion algorithm for drug-target interactions prediction.
Sun L, Yin Z, Lu L. PLoS One. 2025 Jan 30;20(1):e0302281. doi: 10.1371/journal.pone.0302281. eCollection 2025. PMID: 39883675. Free PMC article.
Drug-target interaction prediction through fine-grained selection and bidirectional random walk methodology.
Wang Y, Yin Z. Sci Rep. 2024 Aug 5;14(1):18104. doi: 10.1038/s41598-024-69186-w. PMID: 39103483. Free PMC article.
Machine learning and radiomics for predicting efficacy of programmed cell death protein 1 inhibitor for small cell lung cancer: A multicenter cohort study.
Li P, Huang L, Han R, Tang M, Fei G, Zeng D, Wang R. Clin Transl Med. 2024 Jun;14(6):e1673. doi: 10.1002/ctm2.1673. PMID: 38840331. Free PMC article. No abstract available.
Machine learning to predict the occurrence of thyroid nodules: towards a quantitative approach for judicious utilization of thyroid ultrasonography.
Liang Q, Qi Z, Li Y. Front Endocrinol (Lausanne). 2024 May 7;15:1385836. doi: 10.3389/fendo.2024.1385836. eCollection 2024. PMID: 38774231. Free PMC article.
Prediction of miRNA-disease association based on multisource inductive matrix completion.
Wang Y, Yin Z. Sci Rep. 2024 Nov 11;14(1):27503. doi: 10.1038/s41598-024-78212-w. PMID: 39528650. Free PMC article.
