iScience. 2024 Feb 24;27(3):109325.
doi: 10.1016/j.isci.2024.109325. eCollection 2024 Mar 15.

A comprehensive evaluation of the phenotype-first and data-driven approaches in analyzing facial morphological traits

Hui Qiao et al. iScience.

Abstract

The phenotype-first approach (PFA) and the data-driven approach (DDA) have both greatly facilitated anthropological studies and the mapping of trait-associated genes. However, the pros and cons of the two approaches are poorly understood. Here, we systematically evaluated the two approaches by analyzing 14,838 facial traits in 2,379 Han Chinese individuals. Interestingly, the PFA explained more facial variation than the DDA among the top 100 and top 1,000 phenotypes, but not among the top 10. Accordingly, the ratio of heterogeneous traits extracted with the PFA was much greater, while more homogeneous traits were found with the DDA across sex, age, and BMI groups. Notably, our results demonstrated that sex accounted for 30% of the phenotypic variation in all extracted traits. Furthermore, we linked DDA phenotypes to PFA phenotypes with explicit biological explanations. These findings provide new insights into the analysis of multidimensional phenotypes and expand the understanding of phenotyping approaches.

Keywords: Biological sciences; Computer science; Health sciences.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Association analysis among different phenotypes. The Circos plot illustrates correlations among phenotypes. Each color represents one type of phenotype. The wider the band, the greater the correlation. The phenotype subclasses are abbreviated as follows: point coordinates (Point), proportion indices (Index), curvatures (Curvature), angular measurements (Angle), triangle area measurements (Triangle_area), Euclidean distances (Euclidean), geodesic distances (Geodesic), Manhattan distances (Manhattan), volume measurements (Volume), surface area measurements (Surface_area), principal components of the module (Module_PCs), surface area of the module (Module_surf_area), Moran’s I of the module Z coordinate (Module_mor_z), Moran’s I of the module Gaussian curvature (Module_mor_gau), and Moran’s I of the module mean curvature (Module_mor_mea). See also Figure S1.
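Several of the data-driven subclasses above are Moran's I statistics, a measure of spatial autocorrelation. The following is a minimal sketch of the textbook formula, assuming a hypothetical binary neighbor-weight matrix over a handful of vertices; the weighting scheme the authors actually used for facial modules is not stated in this record, and the function name `morans_i` is illustrative only.

```python
def morans_i(values, weights):
    """Textbook Moran's I for a list of values and a square weight matrix.

    values  : one measurement per vertex (e.g. a Z coordinate or a curvature)
    weights : weights[i][j] > 0 when vertices i and j are treated as neighbors
              (binary adjacency is an assumption made for this sketch)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of all spatial weights.
    w_total = sum(sum(row) for row in weights)

    # Weighted cross-products of deviations between neighboring vertices.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))

    # Total squared deviation (must be non-zero, i.e. values not all equal).
    den = sum(d * d for d in dev)

    return (n / w_total) * num / den


# Four vertices on a chain: 0-1-2-3, each linked to its immediate neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], chain)        # neighbors are similar
alternating = morans_i([1, -1, 1, -1], chain)  # neighbors are opposite
```

Positive values indicate that neighboring vertices carry similar measurements, and negative values indicate that neighbors tend to differ.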
Figure 2
Relationship between phenotypes and basic variables. (A) The relationship between facial phenotypes and basic variables: scatterplots of phenotypes against sex, age, and BMI. (B) Ternary plot showing the distribution of the load weights constrained by sex, age, and BMI across phenotypes. Each dot represents one phenotype, and traits of the same type share a color. The solid red line represents the multiple-testing-corrected significance threshold (p = 3.37 × 10⁻⁶); phenotypes above the red line show significant differences or correlations. See also Tables S1, S2, and S3.
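The correction method behind the threshold in this caption is not named in this record. The value is, however, consistent with dividing a significance level of 0.05 by the 14,838 traits reported in the abstract, which is what a Bonferroni correction would give. The sketch below checks that arithmetic; both the Bonferroni method and the 0.05 level are assumptions, and the function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True when a single p value passes the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed inputs: alpha = 0.05 (not stated in the source) and the
# 14,838 facial traits reported in the abstract.
threshold = bonferroni_threshold(0.05, 14838)
formatted = f"{threshold:.2e}"
```

Under these assumptions, `formatted` rounds to the same two-decimal value as the threshold quoted in the caption.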
Figure 3
Scatterplot of sex, age, and BMI groups and representative facial features. (A) Partial least squares-discriminant analysis (PLS-DA) scatterplot showing separation between male (green) and female (red) clusters based on individual phenotypic data. M, male; F, female. (B) Visualization examples of the variable importance in projection (VIP) phenotypes. Example 1: The fifth principal component of the 11th module. Example 2: The area of the triangle formed by the pronasale, left cheilion, and left superior alar groove. Example 3: The nasal surface area. (C) PLS-DA scatterplot showing separation among age clusters based on individual phenotypic data. Y, young; M, middle-aged; O, old. (D) Visualization examples of the VIP phenotypes. Example 1: Mean curvature Moran index of the 15th module. Example 2: Mean curvature Moran index of the 31st module. Example 3: Mean curvature Moran index of the 62nd module. (E) PLS-DA scatterplot showing separation among BMI clusters based on individual phenotypic data. Uw, underweight; N, normal; Ow, overweight; O, obese. (F) Visualization examples of the VIP phenotypes. Example 1: The seventh principal component of the fourth module. Example 2: The second principal component of the 22nd module. Example 3: The first principal component of the 86th module. Colored circles represent 95% confidence intervals. Colored dots represent individual samples. See also Figures S2 and S3 and Tables S4–S9.
Figure 4
Comprehensive evaluation of the phenotype-first approach (PFA) and the data-driven approach (DDA). (A) Objective weights of the PFA and DDA. The dotted line represents the mean weight. (B) Top 10, 100, and 1,000 objective weights of the PFA and DDA. The dotted line represents the mean weight. (C) Heterogeneous ratios in the PFA and DDA. (D) Homogeneous ratios in the PFA and DDA. "Age all" and "BMI all" denote combined male and female data. "All factors" means that sex, age, and BMI were all considered. (E) Adonis tests of the top 10, 100, and 1,000 smallest p values for sex, age, and BMI differences. Colored dots and bars represent individual phenotypes: phenotype-first (red) and data-driven (green).
Figure 5
Example of mapping relationships among morphological observations, phenotype-first approach (PFA) phenotypes, and data-driven approach (DDA) phenotypes. (A) The biological explanation of Gau_MoranI17. (B) The biological explanation of Module6_pc1. (C) The biological explanation of Module21_pc8. The number on each horizontal red line is the correlation between the two traits. The weight rank orders the traits by the weights obtained using CRITIC: the larger the weight, the smaller the rank number. The traits are abbreviated as follows: the Gaussian curvature Moran index of the 17th module (Gau_MoranI17); the X coordinate direction of the glabella (X1); the Y coordinate direction of the glabella (Y1); the Z coordinate direction of the glabella (Z1); the first principal component of the sixth module (Module6_pc1); the X coordinate direction of the right alare (X14); the angle of the right alare, pronasale, and left superior alar groove (Ang_760a); the angle of the pronasale, right alare, and left alare (Ang_752b); the angle of the right subalare, pronasale, and left subalare (Ang_775a); the eighth principal component of the 21st module (Module21_pc8); the angle of the pronasale, sublabiale, and pogonion (Ang_676b); the angle of the subnasale, stomion, and pogonion (Ang_870b); and the angle of the labrale superius, sublabiale, and pogonion (Ang_1117b).
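The caption above ranks traits by CRITIC weights. As a rough illustration, the sketch below follows the commonly described form of CRITIC: each trait is min-max scaled, its contrast is measured by the standard deviation, its conflict with other traits is measured by one minus the Pearson correlation, and the product is normalized to sum to one. The exact variant and preprocessing used by the authors are not described in this record, so the details here (sample standard deviation, min-max scaling, the function names, and the toy data) are assumptions.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def critic_weights(columns):
    """CRITIC-style objective weights; one inner list per trait."""
    # Min-max scale every trait to [0, 1] (assumes no constant trait).
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        scaled.append([(v - lo) / (hi - lo) for v in col])

    info = []
    for col_j in scaled:
        contrast = stdev(col_j)  # spread of the trait
        conflict = sum(1 - pearson(col_j, col_k) for col_k in scaled)
        info.append(contrast * conflict)

    total = sum(info)
    return [c / total for c in info]


def weight_ranks(weights):
    """Rank 1 goes to the largest weight, matching the caption's convention."""
    order = sorted(range(len(weights)), key=lambda j: -weights[j])
    ranks = [0] * len(weights)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


# Toy data: two redundant traits and one that runs against them.
traits = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [4, 3, 2, 1],
]
weights = critic_weights(traits)
ranks = weight_ranks(weights)
```

In this toy example the third trait receives the largest weight because it is the least redundant with the others, so it takes rank 1.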
