BMC Med Genomics. 2022 Jul 6;15(1):151.
doi: 10.1186/s12920-022-01298-6.

Identifying intragenic functional modules of genomic variations associated with cancer phenotypes by learning representation of association networks


Minsu Kim et al. BMC Med Genomics.

Abstract

Background: Genome-wide Association Studies (GWAS) aim to uncover the link between genomic variation and phenotype. They have been actively applied in cancer biology to investigate associations between variations and cancer phenotypes, such as susceptibility to certain types of cancer and predisposed responsiveness to specific treatments. Because GWAS primarily focus on associations between individual genomic variations and cancer phenotypes, they are limited in explaining the mechanisms by which more than one genomic variation cooperatively affects a cancer phenotype.

Results: This paper proposes a network representation learning approach to learn associations among genomic variations using a prostate cancer cohort. The learned associations are encoded into representations that can be used to identify functional modules of genomic variations within genes associated with early- and late-onset prostate cancer. The proposed method was applied to a prostate cancer cohort provided by the Veterans Administration's Million Veteran Program to identify candidates for functional modules associated with early-onset prostate cancer. The cohort included 33,159 prostate cancer patients: 3,181 early-onset and 29,978 late-onset. Reproducibility measurements showed that the proposed approach can improve model performance in terms of robustness.

Conclusions: To our knowledge, this is the first attempt to use a network representation learning approach to learn associations among genomic variations within genes. Associations learned in this way can lead to an understanding of the underlying mechanisms by which genomic variations cooperatively affect each cancer phenotype. This method can reveal previously unknown knowledge in the field of cancer biology and can be used to design more advanced cancer-targeted therapies.

Keywords: Genome-wide Association Study; Machine Learning; Network Representation Learning.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Identification of functional module candidates within a gene. First, the proposed method generates an association network by calculating the LOA between each variant within a gene. It then learns the associations from the network to produce a representation that can be used to identify module candidates.
Fig. 2
Description of sub-approaches to learning representations of an association network. The first sub-approach is to use SkipGram; the other is to use global matrix factorization.
Fig. 3
Description of the evaluation scheme. It aims to evaluate the robustness of each sub-approach by measuring reproducibility.
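
The global matrix factorization sub-approach named in Fig. 2 can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the association weights are computed or which factorization is used, so the toy weight matrix, the choice of truncated SVD, and the function name embed_association_network are all assumptions made for illustration.

```python
import numpy as np


def embed_association_network(weights, dim):
    """Embed each node of a weighted network with a truncated SVD.

    Hypothetical helper: `weights` is a square matrix whose entry (i, j)
    holds an assumed association strength between variants i and j.
    Returns an array of shape (n_variants, dim).
    """
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 2 or weights.shape[0] != weights.shape[1]:
        raise ValueError("weights must be a square matrix")
    if not 1 <= dim <= weights.shape[0]:
        raise ValueError("dim must be between 1 and the number of nodes")
    # Singular values come back sorted in descending order, so the first
    # `dim` columns of U capture the strongest global structure.
    u, s, _ = np.linalg.svd(weights)
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy network of five variants (made-up weights): variants 0-2 are
# strongly associated with each other, as are variants 3-4.
toy_weights = np.array([
    [0.0, 0.9, 0.8, 0.0, 0.1],
    [0.9, 0.0, 0.7, 0.1, 0.0],
    [0.8, 0.7, 0.0, 0.0, 0.1],
    [0.0, 0.1, 0.0, 0.0, 0.9],
    [0.1, 0.0, 0.1, 0.9, 0.0],
])

embedding = embed_association_network(toy_weights, dim=2)
print(embedding.shape)
```

Each row of the result is a low-dimensional representation of one variant. Rows could then be clustered to propose module candidates, though the clustering step is likewise an assumption and is not described in the abstract.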
