KRAGEN: a knowledge graph-enhanced RAG framework for biomedical problem solving using large language models
- PMID: 38830083
- PMCID: PMC11164829
- DOI: 10.1093/bioinformatics/btae353
Abstract
Motivation: Using a large language model (LLM) to answer and solve complex problems in a specific domain such as biomedicine is challenging because it requires both factual consistency and sound logic. LLMs often suffer from major limitations, such as hallucinating false or irrelevant information or being influenced by noisy data. These issues can compromise the trustworthiness, accuracy, and compliance of LLM-generated text and insights.
Results: Knowledge Retrieval Augmented Generation ENgine (KRAGEN) is a new tool that combines knowledge graphs, Retrieval Augmented Generation (RAG), and advanced prompting techniques to solve complex problems with natural language. KRAGEN converts knowledge graphs into a vector database and uses RAG to retrieve relevant facts from it. Using an advanced prompting technique, graph-of-thoughts (GoT), KRAGEN dynamically breaks a complex problem into smaller subproblems, solves each subproblem with the relevant knowledge retrieved through the RAG framework, which limits hallucinations, and finally consolidates the subproblem solutions into a final answer. KRAGEN's graph visualization allows the user to interact with and evaluate the quality of the solution's GoT structure and logic.
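The retrieval flow described above can be illustrated with a minimal sketch. It is not KRAGEN's implementation: the triples, the function names, and the bag-of-words "embedding" are all hypothetical stand-ins for a real knowledge graph, embedding model, and vector database. The subproblems are written by hand here, whereas the abstract describes them as produced dynamically through GoT prompting. No LLM is called; the sketch stops at assembling a grounded prompt.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge graph as (subject, predicate, object) triples.
TRIPLES = [
    ("GeneA", "is associated with", "DiseaseX"),
    ("DrugB", "treats", "DiseaseX"),
    ("GeneC", "is associated with", "DiseaseY"),
    ("DrugD", "treats", "DiseaseY"),
]


def triple_to_sentence(triple):
    """Flatten one graph edge into a natural-language fact."""
    subject, predicate, obj = triple
    return f"{subject} {predicate} {obj}"


def embed(text):
    """Toy embedding: lowercase token counts (assumption; a real system
    would use a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(triples):
    """In-memory stand-in for a vector database: (sentence, vector) pairs."""
    sentences = [triple_to_sentence(t) for t in triples]
    return [(s, embed(s)) for s in sentences]


def retrieve(index, query, k=2):
    """Return the k stored facts most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]


def ground_subproblems(index, subproblems, k=1):
    """Attach retrieved facts to each subproblem; a full system would send
    each grounded subproblem to an LLM and then merge the answers."""
    return {sp: retrieve(index, sp, k) for sp in subproblems}


def build_prompt(question, facts):
    """Assemble a prompt that restricts the model to the retrieved facts."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return f"Use only these facts:\n{fact_lines}\nQuestion: {question}"


if __name__ == "__main__":
    index = build_index(TRIPLES)
    # Hand-written decomposition of a larger question (assumption).
    subproblems = [
        "Which drug treats DiseaseX?",
        "Which gene is associated with DiseaseY?",
    ]
    for subproblem, facts in ground_subproblems(index, subproblems).items():
        print(build_prompt(subproblem, facts))
        print()
```

Restricting each subproblem's prompt to retrieved facts is the mechanism the abstract credits with limiting hallucinations; the quality of that grounding depends on the embedding and retrieval steps, which this sketch deliberately simplifies.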
Availability and implementation: KRAGEN is deployed by running its custom Docker containers. KRAGEN is available as open-source software on GitHub at https://github.com/EpistasisLab/KRAGEN.
© The Author(s) 2024. Published by Oxford University Press.
Conflict of interest statement
None declared.