Sci Rep. 2023 Sep 14;13(1):15230.
doi: 10.1038/s41598-023-41862-3.

Genetic risk assessment based on association and prediction studies

Nicole Cathlene N Astrologo et al. Sci Rep.

Abstract

The genetic basis of phenotypic emergence provides valuable information for assessing individual risk. While association studies have been pivotal in identifying genetic risk factors within a population, complementing them with insights from prediction studies that assess individual-level risk offers a more comprehensive approach to understanding phenotypic expression. In this study, we established personalized risk assessment models using single-nucleotide polymorphism (SNP) data from 200 Korean patients, of whom 100 experienced hepatitis B surface antigen (HBsAg) seroclearance and 100 demonstrated high levels of HBsAg. The risk assessment models determined the predictive power of the following: (1) genome-wide association study (GWAS)-identified candidate biomarkers considered significant in a reference study and (2) machine learning (ML)-identified candidate biomarkers with the highest feature importance scores obtained using random forest (RF). While utilizing all features yielded 64% model accuracy, using relevant biomarkers achieved higher model accuracies: 82% for 52 GWAS-identified candidate biomarkers, 71% for three GWAS-identified biomarkers, and 80% for 150 ML-identified candidate biomarkers. These findings highlight that the joint contributions of relevant biomarkers significantly influence phenotypic emergence. Moreover, adding ML-identified candidate biomarkers to the pool of GWAS-identified candidate biomarkers improved predictive accuracy to 90%, demonstrating the capability of ML as an auxiliary analysis to GWAS. Furthermore, some of the ML-identified candidate biomarkers were found to be linked with hepatocellular carcinoma (HCC), reinforcing previous claims that HCC can still occur despite the absence of HBsAg.
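The selection and pooling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes feature importance scores have already been computed (for example, by a random forest), and every SNP name, score, and helper function below is hypothetical and made up for illustration.

```python
from typing import Dict, List, Sequence


def top_k_features(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance scores."""
    # Sort by descending score; break ties by name so the result is deterministic.
    ranked = sorted(importances.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def pool_biomarkers(gwas_ids: Sequence[str], ml_ids: Sequence[str]) -> List[str]:
    """Combine two candidate lists, keeping order and dropping duplicates."""
    seen = set()
    pooled = []
    for snp in list(gwas_ids) + list(ml_ids):
        if snp not in seen:
            seen.add(snp)
            pooled.append(snp)
    return pooled


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical importance scores (made-up SNP names and values, not study data).
importances = {
    "snp_a": 0.05,
    "snp_b": 0.30,
    "snp_c": 0.10,
    "snp_d": 0.30,
    "snp_e": 0.01,
}
# Hypothetical GWAS-identified candidate list.
gwas_candidates = ["snp_c", "snp_f"]

# ML-identified candidates: the highest-scoring features.
ml_candidates = top_k_features(importances, k=3)
# Combined GWAS + ML feature pool used to train the final model.
pooled = pool_biomarkers(gwas_candidates, ml_candidates)
```

In such a setup, a classifier would be trained once per feature set (all features, GWAS only, ML only, pooled) and the resulting `accuracy` values compared, which mirrors the comparison the abstract reports.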

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. General workflow of the study.
Figure 2. Performance metrics of ML-based biomarkers model with increasing feature set size.
Figure 3. Model performance of GWAS+ML-based biomarkers model with increasing feature set size.
