Review
Brief Bioinform. 2024 Sep 23;25(6):bbae499.
doi: 10.1093/bib/bbae499.

Bioinformatics approaches for studying molecular sex differences in complex diseases

Rebecca Ting Jiin Loo et al. Brief Bioinform.

Abstract

Many complex diseases exhibit pronounced sex differences that can affect both the initial risk of developing the disease, as well as clinical disease symptoms, molecular manifestations, disease progression, and the risk of developing comorbidities. Despite this, computational studies of molecular data for complex diseases often treat sex as a confounding variable, aiming to filter out sex-specific effects rather than attempting to interpret them. A more systematic, in-depth exploration of sex-specific disease mechanisms could significantly improve our understanding of pathological and protective processes with sex-dependent profiles. This survey discusses dedicated bioinformatics approaches for the study of molecular sex differences in complex diseases. It highlights that, beyond classical statistical methods, approaches are needed that integrate prior knowledge of relevant hormone signaling interactions, gene regulatory networks, and sex linkage of genes to provide a mechanistic interpretation of sex-dependent alterations in disease. The review examines and compares the advantages, pitfalls and limitations of various conventional statistical and systems-level mechanistic analyses for this purpose, including tailored pathway and network analysis techniques. Overall, this survey highlights the potential of specialized bioinformatics techniques to systematically investigate molecular sex differences in complex diseases, to inform biomarker signature modeling, and to guide more personalized treatment approaches.

Keywords: bioinformatics; biomarker signature modeling; complex diseases; molecular sex differences; pathway and network analysis; personalized medicine.

Figures

Figure 1
Roadmap of analysis methods for sex differences in complex diseases. This flowchart outlines a comprehensive approach to analyzing sex differences, progressing from basic data preprocessing to advanced network analysis. Each main category (in colored boxes) represents a distinct class of analytical tools, with sub-categories and specific methods listed below. The hierarchy from top to bottom reflects increasing incorporation of prior biological knowledge and indicates increasing algorithmic complexity. Not every methodology may be required for each application, and the text at the bottom of the boxes indicates when to use each approach. Steps can be iterative and may not always follow a linear progression. While the focus is on interpreting sex differences, these methods are applicable to investigate both sex-dependent and sex-neutral disease-associated changes.
Figure 2
Categorization of disease-associated sex differences according to disease stage. Sex differences can affect an individual’s exposure to risk and protective factors and vulnerability to toxic insults prior to disease onset, as well as the manifestation, response to treatment, and long-term course of a disease.
Figure 3
Representative illustration of different categories of sex dependencies in disease-associated molecular abundance changes (using simulated data). Top left: Sex-specific changes, i.e. changes that are significant only in one sex and do not approach significance in the other sex; top right: Sex-dimorphic changes, i.e. significant changes in both sexes with opposite direction of effect between the sexes; bottom left: Sex-modulated changes, i.e. significant changes in both sexes with the same direction of the effect, but a significant difference in the magnitude of the effect; bottom right: Sex-neutral changes, i.e. significant changes in both sexes with the same direction and similar magnitude of effect.
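
The four categories in this caption can be read as a simple decision rule over per-sex test results. The following is a minimal sketch of that reading, not the authors' implementation: the function name, the argument names, and both thresholds (`alpha` for significance, `near_alpha` for "does not approach significance") are assumptions chosen for illustration.

```python
def classify_sex_dependency(effect_f, p_f, effect_m, p_m, p_interaction,
                            alpha=0.05, near_alpha=0.5):
    """Assign one molecular feature to a category from the caption.

    effect_f, effect_m: disease-vs-control effect (e.g. log fold change)
                        estimated separately in females and in males.
    p_f, p_m:           p-values of those two sex-stratified tests.
    p_interaction:      p-value for a difference in effect between sexes.
    alpha, near_alpha:  hypothetical thresholds, not taken from the source.
    """
    sig_f, sig_m = p_f < alpha, p_m < alpha
    if sig_f and sig_m:
        if effect_f * effect_m < 0:
            return "sex-dimorphic"   # opposite direction of effect
        if p_interaction < alpha:
            return "sex-modulated"   # same direction, different magnitude
        return "sex-neutral"         # same direction, similar magnitude
    # Sex-specific: significant in one sex and far from significance in the other.
    if sig_f and p_m >= near_alpha:
        return "female-specific"
    if sig_m and p_f >= near_alpha:
        return "male-specific"
    return "unclassified"


if __name__ == "__main__":
    print(classify_sex_dependency(1.0, 0.001, -0.9, 0.002, 0.0001))
    print(classify_sex_dependency(2.0, 0.001, 0.8, 0.01, 0.003))
```

Features that are significant in one sex but merely non-significant (rather than clearly null) in the other fall into "unclassified" in this sketch, which mirrors the caption's requirement that the other sex does not approach significance.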
Figure 4
Decision tree for selecting statistical methods to analyze sex-dependent changes in complex diseases. This flowchart guides researchers in choosing between sex-stratified analysis, difference of differences analysis, and interaction term analysis according to the study’s focus and available statistical power. We note that these methods can also be used in combination for a more comprehensive analysis.
Figure 5
Representative principal coordinate analysis (PCoA) plots for different patterns of sex dependencies in disease-associated molecular abundance changes (using simulated data; the patterns are idealized for illustrative purposes, whereas in real data sets mixtures of different patterns as well as stronger influences of noise and biases would be expected). Top left: Sex-specific change (here only male patients); top right: Sex-dimorphic change (i.e. divergent changes between female and male patients); bottom left: Sex-modulated changes (i.e. deviations of the patient data from control data in both sexes, but with significantly stronger deviations in one sex than in the other); bottom right: Sex-neutral changes (i.e. the deviation of patient data from control data is similar for both sexes; see also the corresponding box plot examples in Fig. 3). The intentional simplicity of these PCoA plots emphasizes the conceptual nature of these patterns, facilitating understanding of fundamental principles in sex-based analyses of complex diseases.
Figure 6
Illustration of sex-specific differential gene expression in the ‘Neural Crest Differentiation’ pathway from the WikiPathways database in Alzheimer’s disease prefrontal cortex RNA-seq data (shown is a segment of the adapted pathway diagram enriched in male-specific changes). Each box represents a gene, and the colors reflect the estimated log fold change in patients versus controls for females (left side) and males (right side). Over- and under-expression is indicated by a blue to red color gradient (see legend on the right), lack of significant expression changes is indicated by a white color. The diagram shows a male-specific overrepresentation of underexpressed genes in patients (in female patients either no significant change in expression is observed or an increased expression, as for the gene ASCL1).
Figure 7
Illustration of three types of network representations used in biological network analysis: a) undirected network graph (e.g. used for protein–protein interaction network analysis); b) directed network graph (e.g. used for signaling network analysis of phosphorylation chains); c) directed network graph with activating interactions (e.g. from gene 1 to gene 2) and inhibiting interactions (e.g. from gene 3 and gene 4 to gene 1; used for transcriptional regulatory network analysis of activating and repressing transcription factor (TF)-target relationships). While regulatory and signaling network analyses take into account information about the topology, directionality, and nature of molecular interactions, most conventional pathway analyses do not fully incorporate this information.
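
The three representations differ only in how much information each edge carries, which can be sketched with plain adjacency structures. The example below is an illustrative sketch using only the Python standard library; the helper names and the `+1`/`-1` sign encoding are assumptions, and the signed example reuses the gene 1 to gene 4 relationships named in the caption.

```python
def build_undirected(edges):
    """a) Undirected graph: each edge is stored in both directions."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def build_directed(edges):
    """b) Directed graph: an edge (src, dst) is stored only under src."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())  # keep nodes without outgoing edges
    return adj


def build_signed_directed(edges):
    """c) Signed directed graph: +1 = activation, -1 = inhibition."""
    adj = {}
    for src, dst, sign in edges:
        if sign not in (1, -1):
            raise ValueError("sign must be +1 (activating) or -1 (inhibiting)")
        adj.setdefault(src, {})[dst] = sign
        adj.setdefault(dst, {})
    return adj


def regulators(signed_adj, target):
    """Return {regulator: sign} for every node with an edge into target."""
    return {src: targets[target]
            for src, targets in signed_adj.items() if target in targets}


# Relationships from panel c of the caption.
signed = build_signed_directed([
    ("gene1", "gene2", +1),   # gene 1 activates gene 2
    ("gene3", "gene1", -1),   # gene 3 inhibits gene 1
    ("gene4", "gene1", -1),   # gene 4 inhibits gene 1
])

if __name__ == "__main__":
    print(regulators(signed, "gene1"))
```

Only the signed directed form can answer a question such as "which regulators repress gene 1", which is the kind of information the caption notes is not fully used by most conventional pathway analyses.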
Figure 8
Example of a regulatory network containing genes with sex-modulated expression in Alzheimer’s disease (AD) prefrontal cortex RNA-seq data. Each network node corresponds to a gene, and the colored circles next to the gene represent the log fold changes in a linear model analysis comparing the expression difference between female cases and controls to the difference between male cases and controls. Blue colors represent genes with a significantly lower fold change in AD versus control expression in females than in males, and red colors represent genes with a significantly higher fold change in AD versus control expression in females than in males. The direction of the arrows indicates the regulatory pattern, highlighting that the transcription factor c-Myc controls the expression of the other genes in the graph (green arrows represent activating interactions, gray arrows correspond to interactions with unknown downstream effect; network data were retrieved from human interactions in the mammalian ResNet database using GeneGO MetaCore software, version Q1–2024). Increased expression of c-Myc in neurons in neurodegenerative diseases has been reported to lead to neuronal cell death and the subsequent development of a neurodegenerative phenotype [76, 77].
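
The contrast described in this caption, the case-versus-control difference in females compared with the same difference in males, can be sketched as a difference of differences on log-scale expression values. This is a minimal illustration rather than the analysis used for the figure: the function names and the example values are invented, and only the point estimate is computed. In a linear model with disease, sex, and their interaction as the only terms, this quantity equals the interaction coefficient; judging significance additionally requires a variance estimate, which is omitted here.

```python
from statistics import mean


def log_fold_change(cases, controls):
    """Mean difference between cases and controls on a log scale."""
    return mean(cases) - mean(controls)


def difference_of_differences(f_cases, f_controls, m_cases, m_controls):
    """Female (case - control) effect minus male (case - control) effect.

    Negative: lower fold change in females than in males.
    Positive: higher fold change in females than in males.
    """
    return (log_fold_change(f_cases, f_controls)
            - log_fold_change(m_cases, m_controls))


if __name__ == "__main__":
    # Invented log2 expression values for one gene, for illustration only.
    dod = difference_of_differences(
        f_cases=[5.0, 5.2, 4.8], f_controls=[6.0, 6.1, 5.9],
        m_cases=[6.4, 6.6, 6.5], m_controls=[6.0, 6.2, 6.1],
    )
    print(round(dod, 3))  # about -1.4: decrease in females, increase in males
```

Under the caption's color scheme, a negative value like the one in this example would correspond to a blue node and a positive value to a red node.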

References

    1. Groban L, Lindsey SH, Wang H, et al. Chapter 5 - sex and gender differences in cardiovascular disease. Sex Differences in Physiology 2016;1:61–87. doi:10.1016/B978-0-12-802388-4.00005-7.
    2. Appelman Y, Rijn BB, Ten Haaf ME, et al. Sex differences in cardiovascular risk factors and disease prevention. Atherosclerosis 2015;241:211–8. doi:10.1016/j.atherosclerosis.2015.01.027.
    3. Cook MB, McGlynn KA, Devesa SS, et al. Sex disparities in cancer mortality and survival. Cancer Epidemiol Biomarkers Prev 2011;20:1629–37. doi:10.1158/1055-9965.EPI-11-0246.
    4. Sun T, Plutynski A, Ward S, et al. An integrative view on sex differences in brain tumors. Cell Mol Life Sci 2015;72:3323–42. doi:10.1007/s00018-015-1930-2.
    5. Moores G, Wolff E, Pikula A, et al. Sex differences in neurology: a scoping review protocol. BMJ Open 2022;12:e054513. doi:10.1136/bmjopen-2021-054513.