Oncotarget. 2022 Nov 17:13:1246-1257.
doi: 10.18632/oncotarget.28308.

Treasures from trash in cancer research

Fabiano Cordeiro Moreira et al. Oncotarget. 2022.

Abstract

Introduction: Cancer research has advanced significantly in recent years, primarily due to next-generation sequencing (NGS) technology. Consequently, an enormous amount of genomic and transcriptomic data has been generated. In most cases, only the data needed for the research goals are used, and unwanted reads are discarded. However, these discarded data may contain relevant information. To test this hypothesis, genomic and transcriptomic data were acquired from public datasets.

Materials and methods: Metagenomic tools were used to explore genomic cancer data; additional annotations were used to explore differentially expressed ncRNAs from miRNA experiments; and variants in tumor-adjacent samples from RNA-seq experiments were also investigated.

Results: In all analyses, new data were obtained: from DNA-seq data, microbiome taxonomies were characterized with performance similar to that of dedicated metagenomic research; from miRNA-seq data, additional differentially expressed sncRNAs were found; and in tumor and tumor-adjacent tissue data, somatic variants were found.

Conclusions: These findings indicate that unexplored data from NGS experiments could help elucidate carcinogenesis and discover putative biomarkers with clinical applications. Further investigations should take this into account in experimental design, creating opportunities to make fuller use of the data and to save time and resources while granting access to multiple genomic perspectives from the same sample and experimental run.

Keywords: RNA-Seq variant calling; cancer metagenomics; cancer sncRNA expression.

Conflict of interest statement

The authors have no conflicts of interest to declare.

Figures

Figure 1. Most abundant bacterial taxa found in bladder analysis.
In (A), relative genus abundance among samples. In (B), presence/absence Venn diagram: bladder cancer tissue metagenomic profile obtained from Whole Genomic Sequencing captured data (WGScd) compared with literature data on the urine bladder cancer metagenomic profile obtained from 16S rRNA amplicon sequencing. Taxon data were converted to genus since not all works present results at species resolution.
Figure 2. Metagenomic relative abundance at the genus rank from genomic sequencing of (A) gastric and (B) prostate cancers (WGScd).
Figure 3. In (A) and (B), alpha diversity box plots: comparison with literature data indicates that metagenomic analyses from WGScd seem to be as capable of representing community diversity as dedicated metagenomic analyses. In (C) and (D), presence/absence Venn diagrams: gastric and prostate cancer microbiome profiles obtained from WGScd compared with dedicated metagenomic data of gastric and prostate cancers. Taxon data were converted to genera since not all works present results at species resolution. *Yow et al. (2017) performed two different analyses: v2/v3 16S rRNA regions and v4 16S rRNA region.
Figure 4. Additional sncRNA differential expression analysis of gastric, prostate, and bladder cancers obtained from miRNA expression analysis data.
In (A), sncRNA relative abundance for each sequencing experiment. In (B), (C) and (D), volcano plots identifying differentially expressed (DE) sncRNAs (adjusted p-value < 0.05; |log2(fold-change)| > 1).
Figure 5. Somatic variants identified in both tumor and adjacent tissue.
In (A), common variants by impact. In (B), potential consequences of high-impact common variants. Variant impact and consequences were predicted by the ENSEMBL VEP (https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html).
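
The differential-expression cutoff quoted in the Figure 4 caption (adjusted p-value < 0.05; |log2(fold-change)| > 1) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record type, function names, and example values below are hypothetical and exist only to show how the two thresholds combine.

```python
from typing import NamedTuple


class DEResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    name: str
    log2_fold_change: float
    adjusted_p_value: float


def is_differentially_expressed(
    result: DEResult,
    padj_cutoff: float = 0.05,  # threshold from the Figure 4 caption
    lfc_cutoff: float = 1.0,    # threshold from the Figure 4 caption
) -> bool:
    """Both conditions must hold, and both inequalities are strict."""
    return (
        result.adjusted_p_value < padj_cutoff
        and abs(result.log2_fold_change) > lfc_cutoff
    )


def classify(result: DEResult) -> str:
    """Label a record the way the regions of a volcano plot are usually read."""
    if not is_differentially_expressed(result):
        return "not significant"
    return "up" if result.log2_fold_change > 0 else "down"


# Made-up values for illustration only; none come from the paper.
results = [
    DEResult("sncRNA_a", 2.3, 0.001),   # large change, significant
    DEResult("sncRNA_b", -1.5, 0.02),   # large negative change, significant
    DEResult("sncRNA_c", 0.4, 0.0001),  # significant but small change
    DEResult("sncRNA_d", 3.0, 0.2),     # large change but not significant
    DEResult("sncRNA_e", 1.0, 0.01),    # exactly at the fold-change cutoff
]

labels = {r.name: classify(r) for r in results}
```

Because both inequalities are strict, a record sitting exactly on a cutoff (such as the hypothetical `sncRNA_e`) is not counted as differentially expressed.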
