Cancer Discov. 2024 Aug 2;14(8):1403-1417.
doi: 10.1158/2159-8290.CD-23-1441.

St. Jude Survivorship Portal: Sharing and Analyzing Large Clinical and Genomic Datasets from Pediatric Cancer Survivors

Gavriel Y Matt et al. Cancer Discov.

Abstract

Childhood cancer survivorship studies generate comprehensive datasets comprising demographic, diagnosis, treatment, outcome, and genomic data from survivors. To broadly share this data, we created the St. Jude Survivorship Portal (https://survivorship.stjude.cloud), the first data portal for sharing, analyzing, and visualizing pediatric cancer survivorship data. More than 1,600 phenotypic variables and 400 million genetic variants from more than 7,700 childhood cancer survivors can be explored on this free, open-access portal. Summary statistics of variables are computed on-the-fly and visualized through interactive and customizable charts. Survivor cohorts can be customized and/or divided into groups for comparative analysis. Users can also seamlessly perform cumulative incidence and regression analyses on the stored survivorship data. Using the portal, we explored the ototoxic effects of platinum-based chemotherapy, uncovered a novel association between mental health, age, and limb amputation, and discovered a novel haplotype in MAGI3 strongly associated with cardiomyopathy specifically in survivors of African ancestry.

Significance: The St. Jude Survivorship Portal is the first data portal designed to share and explore clinical and genetic data from childhood cancer survivors. The portal provides both open- and controlled-access features and will fulfill a wide range of data sharing needs of the survivorship research community and beyond.

See co-corresponding author Xin Zhou discuss this research article, published simultaneously at the AACR Annual Meeting 2024: https://vimeo.com/932617204/7d99fa4958.

Conflict of interest statement

K. Shelton reports grants from NIH and other support from American Lebanese Syrian Associated Charities during the conduct of the study. C. Im reports grants from US National Cancer Institute outside the submitted work. K.K. Ness reports grants from National Institutes of Health during the conduct of the study. G.T. Armstrong reports grants from NIH during the conduct of the study. M.M. Hudson reports grants from National Cancer Institute during the conduct of the study. J. Zhang reports grants from National Cancer Institute during the conduct of the study. No disclosures were reported by the other authors.

Figures

Figure 1.
Overview of the St. Jude Survivorship Portal. A, Survivorship cohorts and data stored on the portal. B, Overview of portal features. Navigation tabs of the portal are shown in the middle. The “COHORT” tab is used to select a cohort. The “FILTER” tab is used to refine the selected cohort by specifying variables. The “GROUPS” tab is used to define custom groups for comparative analyses. The “CHARTS” tab is used to launch features for analyzing, visualizing, and exporting the cohort data. These features include the data dictionary, summary plots, cumulative incidence analysis, regression analysis, genome browser, and data download. All features are open access, except for the data download feature, which is under access control. LD, linkage disequilibrium; PC, principal component.
Figure 2.
Exploration of phenotypic and genetic data on the portal. A, Navigation tabs with the “CHARTS” tab selected. The data dictionary and genome browser buttons are highlighted. B, Hierarchical organization of the data dictionary, which contains branches (+/− gray buttons) and variables (blue buttons). Selection of two variables (“Diagnosis Group” and “Genetically defined race”) creates a stratified bar chart. C, Genome browser with tracks for genetic variants, gene models, and LD values at the ARID5B locus. In the variants track, variants are color-coded according to their LD values relative to the selected variant (red box). Variants are also plotted according to their −log10(P value) computed by the group comparison shown at the top. In this comparison, allele frequencies of variants are compared between group 1 (survivors of European ancestry diagnosed with ALL) and group 2 (gnomAD noncancer cohort) using Fisher’s exact test. Note that group 1 is specified using dictionary variables shown in B. AFR, African ancestry; EAS, Asian ancestry; EUR, European ancestry.
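
The group comparison described in panel C can be illustrated with a minimal sketch: a two-sided Fisher's exact test on a 2×2 table of allele counts, followed by the −log10(P value) transform used for plotting. The function name and the allele counts are hypothetical, and nothing here is taken from the portal's own implementation.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are alternate and reference allele counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x alternate alleles in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(total, 1.0)


# Hypothetical allele counts (alternate, reference); not taken from the portal.
survivors_alt, survivors_ref = 30, 170
reference_alt, reference_ref = 10, 190

p_value = fisher_exact_two_sided(
    survivors_alt, survivors_ref, reference_alt, reference_ref
)
print(f"P = {p_value:.3g}, -log10(P) = {-log10(p_value):.2f}")
```

Exact enumeration is practical only for modest counts; for a reference cohort the size of gnomAD, a library routine would likely be the better choice.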
Figure 3.
Analyzing the ototoxicity of platinum-based chemotherapy. A, “GROUPS” tab containing four custom groups defined by exposure to cisplatin and carboplatin. These groups were used to create a custom variable that can be accessed by other features of the portal to conduct comparative analysis between the groups. B, Bar chart of diagnosis groups overlaid with the custom variable from A. Diagnosis groups with a relatively high percentage (>35%) of survivors exposed to platinum-based chemotherapy are indicated by black dots. C, “FILTER” tab showing that the cohort was filtered for survivors from the seven diagnosis groups indicated in B. D, Bar chart of maximum hearing loss grades in the filtered cohort from C, overlaid with the custom variable from A.
Figure 4.
Analyzing the association between mental health, amputation, and age. A, Violin plot of SF36 mental health summary scores of SJLIFE survivors stratified by their amputation status. Circles indicate individual data points. Red lines indicate median values. P value was computed on the portal using the Wilcoxon rank sum test. B, Violin plot of SF36 scores of survivors with amputation (red box), stratified by their age at cancer diagnosis. C, Logistic regression analysis of mental health, amputation, and age. Outcome and independent variables are indicated. The age and amputation variables were specified to form an interaction term (dashed line). D, Results of the analysis in C. Odds ratios were used to compute the odds of poor mental health in survivors who received an amputation at age 10 or older (blue arrow) and in survivors who received an amputation under the age of 10 (pink arrow) relative to survivors who did not receive an amputation.
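
How odds ratios from a logistic model with an interaction term combine can be sketched as follows. This assumes a particular coding that the caption does not state: amputation is coded 0/1, the age variable is coded 0 for under 10 and 1 for 10 or older, and the interaction is their product. Under that assumption, the odds ratio for amputation versus no amputation is exp(β_amputation) in the younger stratum and exp(β_amputation + β_interaction) in the older stratum. The function name and coefficient values are hypothetical.

```python
from math import exp


def stratum_odds_ratios(beta_amputation, beta_interaction):
    """Odds ratios for amputation vs. no amputation within each age stratum.

    Assumes age < 10 is the reference level of the age variable (an
    assumption; the caption does not specify the coding).
    """
    or_under_10 = exp(beta_amputation)
    or_10_or_older = exp(beta_amputation + beta_interaction)
    return or_under_10, or_10_or_older


# Hypothetical log-odds coefficients, for illustration only.
or_young, or_old = stratum_odds_ratios(beta_amputation=-0.2, beta_interaction=1.1)
print(f"OR (under 10): {or_young:.2f}, OR (10 or older): {or_old:.2f}")
```

Equivalently, the older-stratum odds ratio is the product of the main-effect odds ratio and the interaction odds ratio.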
Figure 5.
Comparative analysis of cardiomyopathy between African and European ancestries. A, Cumulative incidence analysis of cardiomyopathy (grades 3–5) in SJLIFE survivors stratified by their genetically defined ancestry. Survivors of Asian ancestry and Multi-Ancestry-Admixed were excluded due to absence of cardiomyopathy events. P value was computed using Gray’s test. B, Same analysis as in A, except that survivors were divided into four groups defined by two variables: genetically defined ancestry and sex. C, Genetic association analysis of cardiomyopathy. A logistic regression analysis was set up on the portal with the same outcome variable, independent variables, and inclusion criteria that were used by ref. . For the genetic variable (“Variants in a locus”), its genomic region was restricted to that of the rs6689879 variant. D, Genetic association analysis results. Top, results for survivors of African ancestry. Genome browser view of the rs6689879 variant, which was plotted according to its −log10(P value) computed by the regression analysis. Hovering over the variant on the portal displays a panel of regression statistics for the variant. Bottom, results for survivors of European ancestry. E, Zoomed out view of the region from D, showing the entire MAGI3 locus. Top, Zoomed out view for survivors of African ancestry. The same regression analysis was performed separately for each variant within the region. Variants are color-coded according to their LD values relative to the rs6689879 variant (red box). Circle-shaped variants are common variants (effect allele frequency ≥5%) analyzed by standard regression model-fitting. Triangle-shaped variants are rare variants analyzed by Fisher’s exact test. Labeled variants are those that were selected for haplotype analysis. Bottom, Zoomed out view for survivors of European ancestry. Variants are color-coded in the same way as for survivors of African ancestry. AFR, African ancestry; EUR, European ancestry.
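
As a rough illustration of the cumulative incidence analysis in panels A and B, the sketch below computes a standard nonparametric cumulative incidence estimate in the presence of a competing event. The caption does not say which estimator the portal uses or which competing events it considers, so both are assumptions here, as are the function name and the toy data. Gray's test for comparing groups is not implemented.

```python
def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence for one cause with competing risks.

    events: 0 = censored, `cause` = event of interest, any other nonzero
    value = competing event. Returns (time, cumulative incidence) pairs at
    each time where any event occurs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_here = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            e = data[i][1]
            n_here += 1
            if e != 0:
                d_any += 1
                if e == cause:
                    d_cause += 1
            i += 1
        if d_any > 0:
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve


# Toy follow-up data (years, event code); hypothetical, not from SJLIFE.
toy_times = [1, 2, 3, 4]
toy_events = [1, 2, 0, 1]
for t, ci in cumulative_incidence(toy_times, toy_events):
    print(t, round(ci, 3))
```

Treating competing events as censored and using one minus the Kaplan-Meier estimate instead would overstate the incidence, which is why the event-free probability above is reduced by events of any type.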

References

    1. Ehrhardt MJ, Krull KR, Bhakta N, Liu Q, Yasui Y, Robison LL, et al. Improving quality and quantity of life for childhood cancer survivors globally in the twenty-first century. Nat Rev Clin Oncol 2023;20:678–96.
    2. Siegel RL, Miller KD, Fuchs HE, Jemal A. Cancer statistics, 2021. CA Cancer J Clin 2021;71:7–33.
    3. Robison LL, Hudson MM. Survivors of childhood and adolescent cancer: life-long risks and responsibilities. Nat Rev Cancer 2014;14:61–70.
    4. Bhakta N, Liu Q, Ness KK, Baassiri M, Eissa H, Yeo F, et al. The cumulative burden of surviving childhood cancer: an initial report from the St Jude Lifetime Cohort Study (SJLIFE). Lancet 2017;390:2569–82.
    5. Park ER, Kirchhoff AC, Nipp RD, Donelan K, Leisenring WM, Armstrong GT, et al. Assessing health insurance coverage characteristics and impact on health care cost, worry, and access: a report from the childhood cancer survivor study. JAMA Intern Med 2017;177:1855–8.