Bioinformatics. 2024 Jun 3;40(6):btae343.
doi: 10.1093/bioinformatics/btae343.

HE2Gene: image-to-RNA translation via multi-task learning for spatial transcriptomics data

Xingjian Chen et al. Bioinformatics.

Abstract

Motivation: Tissue context and molecular profiling are commonly used measures in understanding normal development and disease pathology. In recent years, the development of spatial molecular profiling technologies (e.g. spatially resolved transcriptomics) has enabled the exploration of quantitative links between tissue morphology and gene expression. However, these technologies remain expensive and time-consuming, and subsequent analyses require high-throughput pathological annotations. Moreover, existing computational tools are limited to predicting only a few dozen to several hundred genes, and most of these methods are designed for bulk RNA-seq.

Results: In this context, we propose HE2Gene, the first multi-task learning-based method capable of predicting tens of thousands of spot-level gene expressions along with pathological annotations from H&E-stained images. Experimental results demonstrate that HE2Gene is comparable to state-of-the-art methods and generalizes well on an external dataset without the need for re-training. Moreover, HE2Gene preserves the annotated spatial domains and has the potential to identify biomarkers. This capability facilitates cancer diagnosis and broadens its applicability to investigate gene-disease associations.

Availability and implementation: The source code and data information have been deposited at https://github.com/Microbiods/HE2Gene.

Conflict of interest statement

No competing interest is declared.

Figures

Figure 1.
Overview of HE2Gene. (Upper) The whole-slide image is segmented into hundreds of spots using the coordinates obtained from spatial transcriptomics protocols. Spot images are then fed into HE2Gene for gene expression prediction and pathological annotation using multi-task learning. HE2Gene has three prediction tasks, which are to predict the target gene expressions, non-target gene expressions, and pathological annotations, respectively. (Bottom) HE2Gene also includes a spatial-aware loss function to incorporate patch-based spatial dependencies between pathological annotations and tissue morphology. Given a central patch and its neighboring patches, the spatial loss minimizes the discrepancy in the predicted gene expressions between patches with the same pathological annotations.
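The spatial loss described in the Figure 1 caption can be illustrated with a minimal sketch. The caption states only that the loss minimizes the discrepancy in predicted gene expressions between patches with the same pathological annotation; the mean squared difference, the averaging over matching neighbors, and the names `spatial_loss`, `center_pred`, and `neighbor_preds` are assumptions made here for illustration and are not taken from the HE2Gene source code.

```python
from typing import Sequence


def spatial_loss(
    center_pred: Sequence[float],
    center_label: str,
    neighbor_preds: Sequence[Sequence[float]],
    neighbor_labels: Sequence[str],
) -> float:
    """Hypothetical spatial loss: mean squared discrepancy between the
    central patch and neighbors that share its pathological annotation."""
    total = 0.0
    count = 0
    for pred, label in zip(neighbor_preds, neighbor_labels):
        if label != center_label:
            continue  # neighbors with a different annotation do not contribute
        # Mean squared difference over genes (assumed discrepancy measure).
        total += sum((c - p) ** 2 for c, p in zip(center_pred, pred)) / len(center_pred)
        count += 1
    # With no same-label neighbor, the term is defined as zero in this sketch.
    return total / count if count else 0.0


loss = spatial_loss(
    center_pred=[1.0, 2.0],
    center_label="tumor",
    neighbor_preds=[[1.0, 2.0], [3.0, 2.0], [10.0, 10.0]],
    neighbor_labels=["tumor", "tumor", "normal"],
)
```

In this toy call, the third neighbor carries a different annotation and is ignored, so only the two "tumor" neighbors pull the central prediction toward them.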
Figure 2.
Mean AUCs on the tumor detection task. “Random” denotes prediction without training. “Linear Probe” and “LightGBM” denote a linear classifier and a LightGBM model, respectively, each trained on image features extracted from the pre-trained ResNet-50 model. “Single Task” denotes a model trained separately on the pathological annotation task alone.
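For readers unfamiliar with the metric in Figure 2, the following sketch shows one standard way to compute an AUC and average it over several evaluation units. It uses the pairwise (Mann-Whitney) definition of AUC; the choice of averaging unit (here a list of slides or folds) and the names `roc_auc` and `mean_auc` are assumptions, since the caption does not specify how the mean was formed.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (tumor, non-tumor) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative spot")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(per_unit: List[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Average AUC over evaluation units (assumed: one unit per slide or fold)."""
    aucs = [roc_auc(labels, scores) for labels, scores in per_unit]
    return sum(aucs) / len(aucs)


example = mean_auc([
    ([0, 1], [0.2, 0.9]),
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
])
```

The pairwise form is quadratic in the number of spots, which is fine for a small illustration; a rank-based implementation would be preferable for large slides.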
Figure 3.
The visualization of the expression levels (in z-score) of gene H3-3B in a luminal B breast cancer patient.
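The z-score scaling mentioned in the Figure 3 caption can be sketched as follows. This assumes that each gene is standardized across the spots of one slide using the population standard deviation; the caption does not state which normalization the authors applied, and the name `z_scores` is hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def z_scores(values: Sequence[float]) -> List[float]:
    """Standardize one gene's expression across spots to zero mean and
    unit variance (population standard deviation, an assumption here)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A gene with constant expression has no variation to scale.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


scaled = z_scores([1.0, 2.0, 3.0])
```

After scaling, a spot with a positive value expresses the gene above the slide average and a spot with a negative value below it, which is what makes the color scale comparable between measured and predicted maps.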
