An Analytic Pipeline to Obtain Reliable Genetic Ancestry Estimates from Tumor-Derived RNA Sequencing Data
- PMID: 40622249
- PMCID: PMC12340684
- DOI: 10.1158/1055-9965.EPI-25-0371
Abstract
Background: Germline genetics may influence tumor molecular characteristics and ultimately cancer survival. Studies of tumor characteristics, including our epithelial ovarian cancer (EOC) studies of Black women in the United States, may have RNA sequencing (RNA-seq) data from archival tumor tissue but lack germline DNA for at least some individuals. Incomplete germline DNA measurements impede analyses of important measures such as global genetic ancestry, often used in downstream analyses, by reducing sample sizes.
Methods: The study population consists of 184 women who participated in two population-based studies of EOC with both germline and formalin-fixed, paraffin-embedded (FFPE) tumor samples and an additional 58 women diagnosed with EOC from the same two studies with only FFPE tumor tissue. We used tumor RNA-seq data to calculate proportions of African, European, and Asian genetic ancestry using a pipeline built on the packages SeqKit, HISAT2, SAMtools, BCFtools, PLINK, and ADMIXTURE. Women from the 1000 Genomes Project were used as the reference populations, and germline genetic ancestry estimates from blood or saliva were used as the baseline comparison. We evaluated multiple quality control strategies to improve genetic ancestry estimation.
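The tool chain named in the Methods can be sketched as an ordered list of shell commands. This is a minimal illustration, not the authors' pipeline. The abstract names the tools but gives no options, so every flag, file name, and the role assigned to SeqKit (read statistics) are assumptions. The sketch also assumes paired-end reads and leaves out the merge with the 1000 Genomes reference panel. The function `build_commands` and its parameters are hypothetical names.

```python
def build_commands(sample, fastq1, fastq2, hisat2_index, ref_fasta, k=3):
    """Return the shell commands for one sample, in order, without running them.

    All options shown are illustrative assumptions; the source abstract
    lists only the tool names (SeqKit, HISAT2, SAMtools, BCFtools, PLINK,
    ADMIXTURE). k=3 mirrors the three ancestry components in the abstract
    (African, European, Asian).
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # Assumed use of SeqKit: summarize the raw reads before alignment.
        f"seqkit stats {fastq1} {fastq2}",
        # Align paired-end reads (assumed) to a prebuilt HISAT2 index.
        f"hisat2 -x {hisat2_index} -1 {fastq1} -2 {fastq2} -S {sam}",
        # Coordinate-sort and index the alignments.
        f"samtools sort -o {bam} {sam}",
        f"samtools index {bam}",
        # Pile up reads against the reference and call variant sites.
        f"bcftools mpileup -f {ref_fasta} {bam} -Ou | bcftools call -mv -Oz -o {vcf}",
        # Convert the calls to PLINK binary format (.bed/.bim/.fam).
        f"plink --vcf {vcf} --make-bed --out {sample}",
        # Estimate ancestry proportions with K components.
        f"admixture {sample}.bed {k}",
    ]


if __name__ == "__main__":
    for cmd in build_commands("sample1", "r1.fq.gz", "r2.fq.gz", "grch38_index", "grch38.fa"):
        print(cmd)
```

In practice the study genotypes would be combined with the reference populations before the ADMIXTURE step, and the quality control strategies mentioned in the Methods would sit between variant calling and ancestry estimation. Neither is specified in the abstract, so both are omitted here.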
Results: Correlations between tumor RNA-seq-derived estimates of genetic ancestry from our pipeline and germline-derived African and European genetic ancestry ranged between 0.76 and 0.94.
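The agreement reported above is a correlation between two per-person ancestry proportions, one derived from tumor RNA-seq and one from germline DNA. The abstract does not say which correlation coefficient was used; the sketch below assumes Pearson's r purely for illustration. The function name `pearson` and the input values in the usage example are hypothetical, not study data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers.

    Assumption: Pearson's r stands in for the unspecified correlation
    measure in the abstract.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical African-ancestry proportions for five people.
    rna_seq = [0.82, 0.75, 0.91, 0.60, 0.88]
    germline = [0.80, 0.78, 0.93, 0.65, 0.85]
    print(f"r = {pearson(rna_seq, germline):.2f}")
```

The same function would be applied separately to each ancestry component, since the abstract reports a range of correlations across African and European ancestry.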
Conclusions: RNA-seq data from archival FFPE tumor tissue can be confidently and efficiently used to approximate global genetic ancestry in an admixed population when germline DNA is unavailable.
Impact: This approach supports analyses of genetic ancestry and cancer when germline samples are not available.
©2025 American Association for Cancer Research.
Conflict of interest statement
The authors declare no potential conflicts of interest.
Grants and funding
- R01 CA188943/CA/NCI NIH HHS/United States
- R01 CA237170/CA/NCI NIH HHS/United States
- R01 CA237318/CA/NCI NIH HHS/United States
- U19 CA148112/CA/NCI NIH HHS/United States
- R01 CA142081/CA/NCI NIH HHS/United States
- R00 CA218681/CA/NCI NIH HHS/United States
- P30 CA042014/CA/NCI NIH HHS/United States
- R01 CA200854/CA/NCI NIH HHS/United States
- R01 CA076016/CA/NCI NIH HHS/United States
- U19-CA148112/National Cancer Institute (NCI)
- K99/R00CA218681/National Cancer Institute (NCI)