Clin Cosmet Investig Dermatol. 2022 Mar 11;15:433-445.
doi: 10.2147/CCID.S339547. eCollection 2022.

A Genome-Wide Association Study and Machine-Learning Algorithm Analysis on the Prediction of Facial Phenotypes by Genotypes in Korean Women

Hye-Young Yoo et al. Clin Cosmet Investig Dermatol.

Abstract

Purpose: Changes in facial appearance are affected by various intrinsic and extrinsic factors, which vary from person to person. Therefore, each person needs to determine their skin condition accurately to care for their skin accordingly. Recently, genetic identification by skin-related phenotypes has become possible using genome-wide association studies (GWAS) and machine-learning algorithms. However, because most GWAS have focused on populations with American or European skin pigmentation, large-scale GWAS are needed for Asian populations. This study aimed to evaluate the correlation of facial phenotypes with candidate single-nucleotide polymorphisms (SNPs) to predict phenotype from genotype using machine learning.

Materials and methods: A total of 749 Korean women aged 30-50 years were enrolled and evaluated for five facial phenotypes (melanin, gloss, hydration, wrinkle, and elasticity). GWAS analysis was used to identify the SNPs most strongly associated with each phenotype. Phenotype prediction was then performed with three machine-learning algorithms (linear regression, ridge regression, and linear support vector regression), each evaluated by five-fold cross-validation.
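
The evaluation scheme described in the methods can be illustrated with a minimal sketch of ridge regression scored by five-fold cross-validation. This is not the authors' pipeline: the genotype matrix, effect sizes, noise level, and the penalty `alpha` below are all hypothetical, only the ridge model of the three is shown, and r² is computed as the coefficient of determination, which may differ from the definition used in the paper.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ridge(X, y, alpha):
    """Fit ridge regression; the intercept is left unpenalized by centering."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + alpha * I) w = Xc^T yc
    gram = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve_linear_system(gram, rhs)
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return intercept, w

def predict(model, X):
    intercept, w = model
    return [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]

def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def cross_validate_ridge(X, y, alpha=1.0, k=5, seed=0):
    """Return one held-out r^2 score per fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_ridge([X[i] for i in train], [y[i] for i in train], alpha)
        pred = predict(model, [X[i] for i in fold])
        scores.append(r_squared([y[i] for i in fold], pred))
    return scores

# Hypothetical data: genotypes coded as minor-allele counts (0, 1, 2).
rng = random.Random(42)
n_samples, n_snps = 200, 6
true_effects = [2.0, -1.5, 1.0, 0.0, 0.5, -0.5]  # assumed, not from the paper
X = [[rng.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n_samples)]
y = [50.0 + sum(b * g for b, g in zip(true_effects, row)) + rng.gauss(0, 1.0)
     for row in X]

scores = cross_validate_ridge(X, y, alpha=1.0, k=5)
mean_r2 = sum(scores) / len(scores)
print("per-fold r2:", [round(s, 3) for s in scores])
print("mean r2:", round(mean_r2, 3))
```

In practice a library such as scikit-learn would replace the hand-written solver; the pure-Python version is used here only to keep the sketch self-contained.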

Results: GWAS analysis identified 46 novel highly associated SNPs (p < 1×10⁻⁵): 3, 20, 12, 6, and 5 SNPs for melanin, gloss, hydration, wrinkle, and elasticity, respectively. When the models were compared for each phenotype using five-fold cross-validation, the ridge regression model showed the highest accuracy (r² = 0.6422-0.7266) for all skin traits. The ridge regression model was therefore the best of the three for GWAS-based personal skin diagnosis.
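
As a rough illustration of how the p-value cutoffs are applied, the sketch below labels SNPs against the suggestive threshold quoted above and the significant threshold given in the Figure 2 caption, and checks that the per-trait counts add up to the reported total. The SNP identifiers and p-values are hypothetical placeholders, not results from the study.

```python
import math

SUGGESTIVE_P = 1e-05   # suggestive threshold quoted in the abstract
SIGNIFICANT_P = 1e-08  # significant threshold quoted in the Figure 2 caption

def neg_log10(p):
    """Value plotted on the vertical axis of a Manhattan plot."""
    return -math.log10(p)

def classify(p):
    """Label a p-value by the strictest threshold it passes (strict '<')."""
    if p < SIGNIFICANT_P:
        return "significant"
    if p < SUGGESTIVE_P:
        return "suggestive"
    return "none"

# Hypothetical SNP identifiers and p-values, for illustration only.
hypothetical_p_values = {
    "snp_a": 3.2e-06,
    "snp_b": 0.04,
    "snp_c": 7.5e-10,
    "snp_d": 1e-05,  # exactly at the cutoff, so it does not pass
}

for snp_id, p in hypothetical_p_values.items():
    print(snp_id, round(neg_log10(p), 2), classify(p))

# Per-trait counts as reported in the abstract.
novel_snp_counts = {
    "melanin": 3, "gloss": 20, "hydration": 12, "wrinkle": 6, "elasticity": 5,
}
total_novel_snps = sum(novel_snp_counts.values())
print("total:", total_novel_snps)
```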

Conclusion: The facial phenotype prediction model proposed in this study offered the best solution for accurately predicting an individual's skin condition by combining genotype information for the target traits with machine-learning methods. This model has potential utility for the development of customized cosmetics.

Keywords: customized cosmetics; genome-wide association study; machine-learning algorithm; microarray; single-nucleotide polymorphism.

Conflict of interest statement

Hye-Young Yoo, Ji-Eun Woo, Sung-Ha Park, and Byoung-Jun Park are employees of Kolmar Korea Co., Ltd. Ki-Chan Lee, Sunghoon Lee, Joungsu Joo, Jin-Sik Bae, and Hyuk-Jung Kwon are employees of Eone Diagnomics Genome Center Co., Ltd. The authors report no other potential conflicts of interest in this work.

Figures

Figure 1
Flowchart for predicting skin phenotype using genome-wide association study (GWAS) analysis and a GWAS-based machine-learning approach.
Figure 2
Manhattan plot of study results before and after imputation. Manhattan plot of −log₁₀(p value) for all tested single-nucleotide polymorphisms against genomic position. Horizontal lines indicate suggestive (blue, p < 1×10⁻⁵) and significant (red, p < 1×10⁻⁸) thresholds. Skin phenotypes: (A and B) melanin, (C and D) gloss, (E and F) hydration, (G and H) wrinkle, and (I and J) elasticity (A, C, E, G, and I are raw data, and B, D, F, H, and J are results after imputation).
Figure 3
Scatter plot of measured values and machine-learning predicted values. Each skin phenotype: (A) Melanin, linear regression showed a best fit of y = 0.51x + 72.95, r² = 0.6422; (B) Gloss, linear regression showed a best fit of y = 0.54x + 2.02, r² = 0.8668; (C) Hydration, linear regression showed a best fit of y = 0.57x + 26.24, r² = 0.8630; (D) Wrinkle, linear regression showed a best fit of y = 0.52x + 8.21, r² = 0.8681; and (E) Elasticity, linear regression showed a best fit of y = 0.56x + 0.29, r² = 0.8309.
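
The best-fit lines and r² values quoted in the Figure 3 caption are the kind of summary obtained by regressing predicted values on measured values. The sketch below shows one common way to compute them, assuming ordinary least squares and r² taken as the squared Pearson correlation; whether the authors used exactly this definition is not stated. The measured and predicted values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, with r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor fit, r^2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2

# Hypothetical measured (x) and model-predicted (y) phenotype values.
measured = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 24.0, 33.0, 35.0, 41.0]

slope, intercept, r2 = fit_line(measured, predicted)
print(f"y = {slope:.2f}x + {intercept:.2f}, r2 = {r2:.4f}")
# prints "y = 0.49x + 16.30, r2 = 0.9604"
```

A slope below 1, as in every panel of the caption, indicates that the predicted values span a narrower range than the measured ones.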
