PROTAC-PatentDB: A PROTAC Patent Compound Dataset
- PMID: 41261151
- PMCID: PMC12630821
- DOI: 10.1038/s41597-025-06136-9
Abstract
Proteolysis-targeting chimeras (PROTACs) are an emerging and promising class of molecules for targeted protein degradation, with the potential to overcome critical bottlenecks in traditional small-molecule drug development. However, the scarcity of publicly available compound structure data has significantly hindered computational drug discovery and AI-aided drug discovery/design (AIDD) in this field. Patents are an important but underutilized source of novel chemical structures in medicinal chemistry. In this study, we collected PROTAC patents published between 2013 and 2023 together with the chemical structures disclosed therein. Through manual screening and expert curation, we identified 63,136 unique PROTAC compounds across 590 patent families, along with 252 targets. Additionally, we used the ADMETlab 3.0 platform to predict 120 physicochemical properties for all compounds. The dataset is publicly available on the Figshare platform, and an online webserver (http://protacpatentdb.com) has also been established. Given the rapid growth of the PROTAC patent literature, the dataset can be expanded as new patents are published.
© 2025. The Author(s).
Conflict of interest statement
Competing interests: The authors declare no competing interests.
Grants and funding
- Universidade de Macau (University of Macau): grant nos. MYRG-CRG2023-00007-ICMS-IAS and MYRG-GRG2024-00268-ICMS-UMDF
