Gigascience. 2025 Jan 6;14:giaf093.
doi: 10.1093/gigascience/giaf093.

The Open Pediatric Cancer Project

Zhuangzhuang Geng et al. Gigascience.

Abstract

Background: In 2019, the Open Pediatric Brain Tumor Atlas (OpenPBTA) was created as a global, collaborative open-science initiative to genomically characterize 1,074 pediatric brain tumors and 22 patient-derived cell lines. Here, we present an extension of the OpenPBTA called the Open Pediatric Cancer (OpenPedCan) Project, a harmonized open-source multiomic dataset from 6,112 pediatric cancer patients with 7,096 tumor events across more than 100 histologies. Combined with RNA sequencing (RNA-seq) from the Genotype-Tissue Expression and The Cancer Genome Atlas projects, OpenPedCan contains nearly 48,000 total biospecimens (24,002 tumor and 23,893 normal specimens).

Findings: We used Gabriella Miller Kids First workflows to harmonize whole-genome sequencing (WGS), whole-exome sequencing (WXS), RNA-seq, and targeted sequencing datasets to include somatic SNVs, indels, copy number variants, structural variants, RNA expression, fusions, and splice variants. We integrated summarized Clinical Proteomic Tumor Analysis Consortium whole-cell proteomics and phospho-proteomics data and miRNA sequencing data, and developed a methylation array harmonization workflow to include m-values, beta-values, and copy number calls. OpenPedCan contains reproducible, dockerized workflows in GitHub, CAVATICA, and Amazon Web Services (AWS) to deliver harmonized and processed data from over 60 scalable modules, which can be run both locally and on AWS. The processed data are released in a versioned manner, are accessible through CAVATICA or AWS S3 download (from GitHub), and are queryable through PedcBioPortal and the National Cancer Institute's pediatric Molecular Targets Platform. Notably, we have expanded Pediatric Brain Tumor Atlas molecular subtyping to include methylation information to align with the World Health Organization 2021 Central Nervous System Tumor classifications, allowing us to create research-grade integrated diagnoses for these tumors.
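
The m-values and beta-values mentioned above are two scales for the same methylation measurement. The abstract does not say how the OpenPedCan workflow computes them, so the following is only a minimal sketch of the commonly used relationship, M = log2(beta / (1 - beta)). The function names and the clamping constant `eps` are assumptions made for illustration, not part of the project.

```python
import math


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a methylation beta-value (between 0 and 1) to an M-value."""
    # Clamp away from 0 and 1 so the logarithm stays finite.
    # eps is an assumed choice for this sketch, not an OpenPedCan setting.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m: float) -> float:
    """Inverse transform: convert an M-value back to a beta-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# A half-methylated probe sits at the midpoint of the M-value scale.
half_methylated_m = beta_to_m(0.5)
```

Beta-values are bounded and easy to read as a methylated fraction, while M-values are unbounded and are often preferred for statistical testing, which is one reason a release might provide both.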

Conclusions: OpenPedCan data and its reproducible analysis module framework are openly available and can be utilized and/or adapted by researchers to accelerate discovery, validation, and clinical translation.

Keywords: Docker; OpenPedCan; multiomics; open science; pediatric cancer; reproducibility.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1:
OpenPedCan data. OpenPedCan contains multiomic data from 7 cohorts of pediatric tumors (A, B) with counts by tumor event, RNA-seq from adult tumors from The Cancer Genome Atlas (TCGA) Program (C, D), and RNA-seq from normal adult tissues from the Genotype-Tissue Expression (GTEx) project (E) with counts by specimen. Abbreviations: TARGET = Therapeutically Applicable Research to Generate Effective Treatments; PPTC = Pediatric Preclinical Testing Consortium; PBTA = Pediatric Brain Tumor Atlas; Maris = Neuroblastoma cell lines from the Maris Laboratory at CHOP; GMKF = Gabriella Miller Kids First; DGD = Division of Genomic Diagnostics at CHOP; CPTAC = Clinical Proteomic Tumor Analysis Consortium.
Figure 2:
OpenPedCan analysis workflow. Depicted are the datasets (yellow, orange, and gray) contained within OpenPedCan. These datasets are made available in a harmonized manner through primary analysis workflows (blue) for DNA, RNA, and/or proteogenomics data. Files derived from the primary analysis workflows (green) are released within OpenPedCan. Additional analysis modules developed within OpenPedCan (red) also generate results files (green), which are released within OpenPedCan (Figure created in BioRender, https://BioRender.com/05gdk8k).
Figure 3:
Medulloblastoma sample clustering. UMAP projection of (A) 271 MB tumors and (B) 63 SHH-activated MB tumors using methylation beta values of the 20,000 most variable probes from the Infinium MethylationEPIC array. (C) UMAP projection of SHH-activated MB samples indicating copy number status of the known SHH subgroup somatic driver genes CCND2, GLI2, MYCN, and PTEN.
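
The Figure 3 caption describes restricting the methylation matrix to its most variable probes before projection. The caption does not state which variability measure was used, so the sketch below assumes variance across samples. The function name, the probe IDs, and the toy values are hypothetical, and the UMAP step itself is not shown.

```python
from statistics import pvariance


def most_variable_probes(beta_by_probe: dict[str, list[float]], n: int) -> list[str]:
    """Return the n probe IDs whose beta-values vary most across samples."""
    # Variance across samples is an assumed measure of variability.
    variances = {probe: pvariance(values) for probe, values in beta_by_probe.items()}
    # Sort probe IDs from highest to lowest variance and keep the first n.
    return sorted(variances, key=variances.get, reverse=True)[:n]


# Toy matrix: three hypothetical probes measured in three samples.
toy = {
    "probe_a": [0.10, 0.90, 0.50],
    "probe_b": [0.50, 0.50, 0.50],
    "probe_c": [0.40, 0.60, 0.50],
}
top = most_variable_probes(toy, 2)
```

In the setting the caption describes, `n` would be 20,000 and the selected rows would then be passed to a dimensionality-reduction method.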

Update of

  • The Open Pediatric Cancer Project.
    Geng Z, Wafula E, Corbett RJ, Zhang Y, Jin R, Gaonkar KS, Shukla S, Rathi KS, Hill D, Lahiri A, Miller DP, Sickler A, Keith K, Blackden C, Chroni A, Brown MA, Kraya AA, Clark KL, Rood BR, Resnick AC, Van Kuren N, Maris JM, Farrel A, Koptyra MP, Trooskin GR, Coleman N, Zhu Y, Stefankiewicz S, Abdullaev Z, Chinwalla AT, Santi M, Naqvi AS, Mason JL, Koschmann CJ, Huang X, Diskin SJ, Aldape K, Farrow BK, Ma W, Zhang B, Ennis BM, Tasian S, Phul S, Lueder MR, Zhong C, Dybas JM, Wang P, Taylor D, Rokita JL. bioRxiv [Preprint]. 2025 Jun 28:2024.07.09.599086. doi: 10.1101/2024.07.09.599086. Update in: Gigascience. 2025 Jan 6;14:giaf093. doi: 10.1093/gigascience/giaf093. PMID: 39026781.
