Front Genet. 2021 Dec 20:12:761141.
doi: 10.3389/fgene.2021.761141. eCollection 2021.

Combining Polygenic Risk Score and Voice Features to Detect Major Depressive Disorders

Yazheng Di et al. Front Genet.

Abstract

Background: The application of polygenic risk scores (PRSs) in major depressive disorder (MDD) detection is constrained by their simplicity and uncertainty. One promising way to extend their usability is fusion with other biomarkers. This study constructed an MDD biomarker by combining the PRS with voice features and evaluated its detection ability on large clinical samples.

Methods: We collected genome-wide sequences and utterances edited from clinical interview speech records from 3,580 women with recurrent MDD and 4,016 healthy people. We then constructed the PRS as a gene biomarker by p-value-based clumping and thresholding and extracted voice features using the i-vector method. Using logistic regression, we compared the MDD detection ability of the gene biomarker alone and the voice biomarker alone with that of the two in combination. We also tested further machine learning models to improve the detection capability.

Results: With a p-value threshold of 0.005, the combined biomarker improved the area under the receiver operating characteristic curve (AUC) by 9.09% compared to that of genes only and by 6.73% compared to that of voice only. A multilayer perceptron further raised the AUC by 3.6% compared to logistic regression, while support vector machines and random forests showed no better performance.

Conclusion: Adding voice biomarkers to genes can effectively improve the ability to detect MDD, and combining PRS and voice biomarkers for MDD detection is feasible. This study provides a foundation for exploring the clinical application of genetic and voice biomarkers in the diagnosis of MDD.
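The thresholding step of the PRS construction described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: clumping is omitted, the function name, the toy data, and the strict `p < threshold` comparison are assumptions made for illustration, and the per-SNP effect sizes are taken as given inputs.

```python
def polygenic_risk_score(dosages, effects, pvalues, threshold=0.005):
    """Sum effect-weighted allele dosages over SNPs whose p-value is below the threshold.

    dosages:  allele counts (0, 1 or 2) for one person, one entry per SNP
    effects:  per-SNP effect sizes (assumed to be estimated elsewhere)
    pvalues:  per-SNP association p-values
    """
    if not (len(dosages) == len(effects) == len(pvalues)):
        raise ValueError("dosages, effects and pvalues must have equal length")
    # Thresholding: only SNPs with p < threshold contribute to the score.
    return sum(
        dosage * effect
        for dosage, effect, p in zip(dosages, effects, pvalues)
        if p < threshold
    )


# Hypothetical toy data: three SNPs for one person.
dosages = [2, 1, 0]
effects = [0.10, -0.05, 0.30]
pvalues = [0.001, 0.004, 0.20]

# The third SNP (p = 0.20) is excluded at the 0.005 threshold.
score = polygenic_risk_score(dosages, effects, pvalues, threshold=0.005)
```

Raising the threshold admits more SNPs into the sum, which is the quantity varied along the p-value threshold axis in the figures.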

Keywords: biomarkers; computer technology; depression; major depressive disorder (MDD); polygenic risk score (PRS); voice biomarkers.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Fivefold cross-validation of voice–gene data. In each fold, the samples were split into a training group and a test group. The voice data and genetic sequence data of the training group were used to train the universal background model (UBM) and the linear mixed model (LMM), respectively. The i-vectors for the training and test groups were then extracted through the UBM, and the polygenic risk score (PRS) was calculated through the LMM. The i-vectors and the PRS were concatenated as input features for a machine learning (ML) model.
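The fold structure and the feature concatenation in this caption can be sketched as follows. This is a hedged illustration only: `kfold_indices` and `combine_features` are hypothetical helper names, the shuffling seed is arbitrary, and the UBM, LMM and ML model are represented only by comments marking where they would be fitted.

```python
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so folds are near-equal in size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def combine_features(ivector, prs):
    """Concatenate an i-vector with a scalar PRS into one feature vector."""
    return list(ivector) + [prs]


for train, test in kfold_indices(n_samples=10, k=5):
    # In each fold, the UBM and LMM would be fitted on `train` only, then used
    # to derive i-vectors and PRS for both `train` and `test` samples.
    pass

# A 400-dimensional i-vector plus one PRS value gives 401 input features.
features = combine_features([0.0] * 400, 0.15)
```

Fitting the UBM and LMM on the training group alone keeps information from the test group out of the features that the ML model is evaluated on.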
FIGURE 2
Process of i-vector extraction. UBM-GMM is a universal background model adapted by a Gaussian mixture model. n = 256 is the number of Gaussian mixture clusters, and d = 400 is the dimension of the i-vectors.
FIGURE 3
Polygenic risk score (PRS) model prediction results with different p-value thresholds (PTs) under different covariate use strategies. no-cov, no covariates were considered during the training and prediction processes; all-cov, all covariates were considered during the training and prediction processes; random-cov, the PRS model was trained with a sample genetic matrix along with covariates, but made predictions on samples whose covariates were replaced with random numbers. AUC, Area under the receiver operating characteristic curve.
FIGURE 4
Prediction results with different p-value thresholds (PTs) using different biomarkers. The x-axis is the p-value threshold (PT) used in the gene model and the combined biomarkers. Voice biomarkers are not related to PT and are indicated by a dashed horizontal line. AUC, Area under the receiver operating characteristic curve.
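For readers unfamiliar with the metric on the y-axis, the AUC can be computed directly from labels and model scores. The sketch below uses the pairwise (Mann-Whitney) formulation; the function name and toy data are illustrative assumptions, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs in which the case scores higher.

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: two controls (0) and two cases (1).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of the 4 case-control pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.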
FIGURE 5
Stratified population accuracy using different biomarkers. The test samples were divided into three groups according to their predicted polygenic risk scores (PRSs). Accuracies were calculated for the three groups separately.
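The stratified evaluation in this caption can be sketched as below. The caption does not state how the three groups were delimited, so equal-sized groups ordered by PRS are an assumption here, as are the function name and the toy data.

```python
def stratified_accuracy(prs, labels, predictions, n_groups=3):
    """Accuracy within each of n_groups equal-sized groups ordered by PRS.

    Returns a list of accuracies, from the lowest-PRS group to the highest.
    """
    n = len(prs)
    if not (n == len(labels) == len(predictions)):
        raise ValueError("prs, labels and predictions must have equal length")
    if n < n_groups:
        raise ValueError("need at least one sample per group")
    # Sample indices sorted from lowest to highest PRS.
    order = sorted(range(n), key=lambda i: prs[i])
    accuracies = []
    for g in range(n_groups):
        group = order[g * n // n_groups:(g + 1) * n // n_groups]
        correct = sum(1 for i in group if predictions[i] == labels[i])
        accuracies.append(correct / len(group))
    return accuracies


# Hypothetical toy data: six test samples.
prs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
labels = [0, 0, 0, 1, 1, 1]
predictions = [0, 1, 0, 1, 1, 0]
group_accuracies = stratified_accuracy(prs, labels, predictions)
```

Comparing these per-group accuracies across biomarkers shows whether a given biomarker helps more in some PRS ranges than in others.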
