bioRxiv [Preprint]. 2025 Jul 8:2025.07.04.663250.
doi: 10.1101/2025.07.04.663250.

GAME: Genomic API for Model Evaluation

Ishika Luthra et al. bioRxiv.

Abstract

The rapid expansion of genomics datasets and the application of machine learning has produced sequence-to-activity genomics models with ever-expanding capabilities. However, benchmarking these models on practical applications has been challenging because individual projects evaluate their models in ad hoc ways, and there is substantial heterogeneity of both model architectures and benchmarking tasks. To address this challenge, we have created GAME, a system for large-scale, community-led standardized model benchmarking on user-defined evaluation tasks. We borrow concepts from the Application Programming Interface (API) paradigm to allow for seamless communication between pre-trained models and benchmarking tasks, ensuring consistent evaluation protocols. Because all models and benchmarks are inherently compatible in this framework, the continual addition of new models and new benchmarks is easy. We also developed a Matcher module powered by a large language model (LLM) to automate ambiguous task alignment between benchmarks and models. Containerization of these modules enhances reproducibility and facilitates the deployment of models and benchmarks across computing platforms. By focusing on predicting underlying biochemical phenomena (e.g. gene expression, open chromatin, DNA binding), we ensure that tasks remain technology-independent. We provide examples of benchmarks and models implementing this framework, and anticipate that the community will contribute their own, leading to an ever-expanding and evolving set of models and evaluation tasks. This resource will accelerate genomics research by illuminating the best models for a given task, motivating novel functional genomic benchmarks, and providing a more nuanced understanding of model abilities.
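The page does not reproduce the GAME interface specification itself. As a minimal sketch of the decoupling the abstract describes (all class, method, and task names below are hypothetical illustrations, not the published GAME API), the key idea is that every model answers the same task-addressed prediction call, so any Evaluator can score any Predictor without knowing its architecture:

# Hypothetical sketch of the Predictor/Evaluator decoupling described in the
# abstract; names and signatures are illustrative, not the published GAME API.
from abc import ABC, abstractmethod

class Predictor(ABC):
    """Wraps any sequence-to-activity model behind one uniform method."""

    @abstractmethod
    def supported_tasks(self) -> list[str]:
        """Tasks this model can answer, e.g. 'expression' or 'accessibility'."""

    @abstractmethod
    def predict(self, task: str, cell_type: str, sequences: list[str]) -> list[float]:
        """Return one activity value per input DNA sequence for the task."""

class MeanGCPredictor(Predictor):
    """Toy model: 'predicts' activity as GC content, to show the interface."""

    def supported_tasks(self) -> list[str]:
        return ["accessibility"]

    def predict(self, task, cell_type, sequences):
        return [(s.count("G") + s.count("C")) / len(s) for s in sequences]

def evaluate(predictor: Predictor, task: str, cell_type: str,
             sequences: list[str], measured: list[float]) -> float:
    """A minimal Evaluator: compare predictions to measurements (here, MSE)."""
    preds = predictor.predict(task, cell_type, sequences)
    return sum((p - m) ** 2 for p, m in zip(preds, measured)) / len(measured)

print(evaluate(MeanGCPredictor(), "accessibility", "K562",
               ["ACGTGC", "AAATTT"], [0.7, 0.1]))

Because every model answers the same call, adding a new benchmark or a new model requires no pairwise adapter code, which is the compatibility property the abstract emphasizes.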


Conflict of interest statement

Competing Interests: V.A. is an employee of Sanofi. A.L. is an employee of Genentech, Inc. The remaining authors declare no competing interests.

Figures

Figure 1: GAME framework.
GAME includes three modules: the Evaluator, containing a benchmark dataset; the Predictor, encompassing a sequence-to-activity model; and the Matcher, capturing relationships between tasks. All GAME modules are inherently interoperable because they communicate via the GAME API protocol over TCP. For each benchmark, the Evaluator requests a prediction from the Predictor, which consults the Matcher to determine the closest task the Predictor can complete. Once the Matcher returns the best match, the Predictor completes its prediction and returns it to the Evaluator, which evaluates performance. Members of the genomics community will contribute modules to enable continual evaluation of more models across more benchmarks.
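The caption specifies the message flow but not the wire format. Below is a minimal sketch of one Evaluator-to-Predictor round trip, assuming newline-delimited JSON messages over a TCP socket; the field names, the framing, and the dictionary stand-in for the LLM-powered Matcher are all assumptions for illustration, not the published GAME protocol.

# Hypothetical sketch of one Evaluator -> Predictor round trip from Figure 1,
# assuming newline-delimited JSON over TCP; not the published GAME protocol.
import json
import socket
import threading

srv = socket.create_server(("127.0.0.1", 0))  # listen before any client connects
port = srv.getsockname()[1]

def run_predictor():
    """Predictor: answer one request, 'consulting' a Matcher for the task."""
    conn, _ = srv.accept()
    with conn, conn.makefile("rw") as f:
        request = json.loads(f.readline())
        # Stand-in for the Matcher: map the requested task to the closest task
        # this model supports (the real Matcher is LLM-powered).
        matched = {"chromatin_accessibility": "accessibility"}.get(
            request["task"], request["task"])
        preds = [0.5] * len(request["sequences"])  # dummy model output
        f.write(json.dumps({"matched_task": matched,
                            "predictions": preds}) + "\n")
        f.flush()

threading.Thread(target=run_predictor, daemon=True).start()

# Evaluator side: send the benchmark's sequences, receive predictions.
with socket.create_connection(("127.0.0.1", port)) as sock, \
        sock.makefile("rw") as f:
    f.write(json.dumps({"task": "chromatin_accessibility",
                        "cell_type": "K562",
                        "sequences": ["ACGT", "TTGCA"]}) + "\n")
    f.flush()
    reply = json.loads(f.readline())

print(reply)  # the Evaluator would now correlate predictions with measurements

Keeping each module behind a socket is also what makes the containerization described in the abstract natural: Evaluator, Predictor, and Matcher can run in separate containers on different machines as long as they speak the same protocol.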
Figure 2: Sample benchmarking done with GAME.
a, Expression evaluation tasks. Models (x axis) were evaluated for their correlation with measured expression levels (colours) across a variety of tasks (y axis). b, Chromatin conformation tasks. Correlation of Orca predictions with measured chromatin contact frequencies (colours) for two Orca test-set chromosomes and one validation-set chromosome (y axis). c, Consistency evaluation for accessibility. Point- and track-based accessibility consistency Evaluators (y axis) were used to evaluate the correlation between predictions for forward and reverse-complement sequences (colours) in models of DNA accessibility (x axis). *Correlation values from downsampled datasets. Grey cells mark values that were not calculated because the model could not complete the Evaluator's requested task.

