This is a preprint.
The Open Pediatric Cancer Project
- PMID: 39026781
- PMCID: PMC11257555
- DOI: 10.1101/2024.07.09.599086
Update in: The Open Pediatric Cancer Project. Gigascience. 2025 Jan 6;14:giaf093. doi: 10.1093/gigascience/giaf093. PMID: 40891528.
Abstract
Background: In 2019, the Open Pediatric Brain Tumor Atlas (OpenPBTA) was created as a global, collaborative open-science initiative to genomically characterize 1,074 pediatric brain tumors and 22 patient-derived cell lines. Here, we present an extension of OpenPBTA called the Open Pediatric Cancer (OpenPedCan) Project, a harmonized open-source multi-omic dataset from 6,112 pediatric cancer patients with 7,096 tumor events across more than 100 histologies. Combined with RNA-Seq data from the Genotype-Tissue Expression (GTEx) project and The Cancer Genome Atlas (TCGA), OpenPedCan contains nearly 48,000 total biospecimens (24,002 tumor and 23,893 normal specimens).
Findings: We utilized Gabriella Miller Kids First (GMKF) workflows to harmonize WGS, WXS, RNA-Seq, and targeted sequencing datasets to include somatic SNVs, InDels, CNVs, SVs, RNA expression, fusions, and splice variants. We integrated summarized CPTAC whole-cell proteomics and phospho-proteomics data and miRNA-Seq data, and developed a methylation array harmonization workflow to include m-values, beta-values, and copy number calls. OpenPedCan contains reproducible, dockerized workflows in GitHub, CAVATICA, and Amazon Web Services (AWS) to deliver harmonized and processed data from over 60 scalable modules that can be leveraged both locally and on AWS. The processed data are released in a versioned manner, are accessible through CAVATICA or AWS S3 download (from GitHub), and are queryable through PedcBioPortal and the NCI's pediatric Molecular Targets Platform. Notably, we have expanded PBTA molecular subtyping to include methylation information to align with the WHO 2021 Central Nervous System Tumor classifications, allowing us to create research-grade integrated diagnoses for these tumors.
Conclusions: OpenPedCan data and its reproducible analysis module framework are openly available and can be utilized and/or adapted by researchers to accelerate discovery, validation, and clinical translation.
Keywords: Docker; OpenPedCan; Pediatric cancer; multi-omics; open science; reproducibility.
Conflict of interest statement
Declarations of Interest: The authors declare no conflicts.