Blood Vessel Thromb Hemost. 2025 Jan 21;2(3):100053.
doi: 10.1016/j.bvth.2025.100053. eCollection 2025 Aug.

Analyzing 6211 unique variants in the upgraded interactive FVIII web database reveals novel insights into hemophilia A

Emily H T Print et al. Blood Vessel Thromb Hemost.

Abstract

Hemophilia A is a rare genetic disease that occurs with mild, moderate, or severe phenotypes and involves dysfunctional or reduced amounts of plasma factor VIII (FVIII). Identifying causal genetic variants in the F8 gene is vital for patient care. Our original interactive MySQL database for FVIII, reported in 2013, presented clinical data on 2014 unique FVIII variants in 5072 patients. Here, we expand our database almost threefold with a new total of 6211 unique FVIII variants in 10 064 patients, spanning 1529 of the 2351 FVIII residues (65%). We have also developed a new full-length FVIII structural model that incorporates both its crystal structure and its disordered B domain, which is not visible in available experimental structures. This enabled the assessment of the effects of these variants on the FVIII structure. Of the 6211 unique F8 variants identified, 730 (12%) were associated with mild phenotypes, 526 (8%) with moderate phenotypes, 2509 (40%) with severe phenotypes, 53 (1%) with multiple severities, and 2393 (39%) with unreported phenotypes. Most variants occurred in the disordered B domain (1281 variants), followed by the A1, A2, and A3 domains (1130, 1071, and 923 variants, respectively) and the C1 and C2 domains (442 and 439 variants, respectively). Inhibitors were associated with 451 variants (7%). Our new structural analyses often revealed changes to the residue solvent surface accessibilities caused by many FVIII variants. The FVIII variant analyses are supported by similar observations in the structurally related FV protein. Our web-accessible FVIII database will enable easier and improved clinical analyses of FVIII genetic variants.
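
As a quick arithmetic check on the phenotype breakdown quoted in the abstract, the following minimal Python sketch recomputes each category's share of the total. The counts are taken from the abstract; the names `severity_counts` and `severity_shares` are illustrative and are not part of the authors' database or code.

```python
# Phenotype counts as quoted in the abstract.
severity_counts = {
    "mild": 730,
    "moderate": 526,
    "severe": 2509,
    "multiple severities": 53,
    "unreported": 2393,
}


def severity_shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total_variants = sum(severity_counts.values())
shares = severity_shares(severity_counts)

# Print one line per category: count and share to one decimal place.
for name, pct in shares.items():
    print(f"{name:>20}: {severity_counts[name]:5d} ({pct:.1f}%)")
print(f"{'total':>20}: {total_variants:5d}")
```

Rounding the printed shares to whole numbers gives figures that can be compared directly with the percentages stated in the text.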

Conflict of interest statement

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Figures

Graphical abstract
Figure 1.
Domain cartoon of the 6211 unique variants of FVIII. The FVIII protein comprises the A1-a1-A2-a2-B-a3-A3-C1-C2 domains and linkers and is shown as dark gray boxes that are not drawn to scale. N and C represent the N terminus and C terminus, respectively. Residue numbering marks the first and last amino acids that frame each domain, reported in HGVS format. Above each protein domain, the number of variant residues is shown in the upper row, and the number of variants in each domain is shown in the lower row. Intronic variants are included in their respective domains according to their sequence numbering. Variants that occur in splice sites (420), multiple domains (202), or are not reported (47) are not shown. Below the protein domains, the gene arrangement of 26 exons is shown as alternating light gray and maroon boxes drawn to scale. The exons coding for each FVIII protein domain are indicated. UTR, untranslated region.
Figure 2.
Distribution of the 6211 unique variants found in the F8 gene. Breakdowns of the 6211 FVIII variants into variant type, effect, and location within the F8 gene sequence are shown. “Not reported” refers to variants found within literature or databases that did not contain any information about the severity of the inherited mutation. (A) The relative frequency of different types of unique variants in the F8 gene. (B) Effects of the 4243 point variants found in the F8 gene sequence. (C) Distribution of the 6211 FVIII variants across the F8 gene and FVIII protein domains.
Figure 3.
Secondary structure and accessibility analysis of variants occurring in the FVIII protein. The FVIII amino acid sequence is shown with secondary structure assignments and solvent accessibilities indicated below each residue. Secondary structures are assigned as H (α-helix), B (β-bridge), E (extended β-strand), G (3₁₀ helix), I (π-helix), T (hydrogen-bonded turn), S (bend), or C (undefined coil region). These were determined from the FVIII crystal structure when the residues were visible and present (PDB ID 6MF2). Sequences within this crystal structure are shown in black, and sequences without a crystal structure are shown in blue. For the latter, secondary structure and solvent accessibility predictions were made based on the modeled FVIII AlphaFold structure. The positions of 4243 point variants that occur in the F8 exons are highlighted in yellow, green, and red. These include point missense, point nonsense, and point silent variants. Yellow denotes point variants that occur in ≤4 patients, green denotes point variants that occur in ≥5 patients, and red denotes point variants that occur in >50 patients. Posttranslational modifications are shown. These include 25 putative N-glycan sites (highlighted in cyan), 8 numbered Cys-Cys disulfide bridges (highlighted in blue), and 6 sulfated numbered Tyr residues (highlighted in gray).
Figure 4.
Substitution grid that summarizes 2863 point missense variants in the F8 gene. The grid illustrates the total number of missense variants that occur for each defined amino acid change. All substitutions result from a single nucleotide change. Any grid substitutions that would require more than a single nucleotide change are shown in dark gray, although none were seen. Pale gray represents silent mutations; pale yellow represents substitutions that occur between 1 and 10 times; orange represents substitutions that occur between 11 and 20 times; and brown represents substitutions that occur ≥20 times. aa, amino acid.
Figure 5.
Structural and schematic views of variants within the FVIII domains. (A) The AlphaFold model for full-length FVIII is shown in ribbon format. The structure is shown in rainbow colors, starting with blue at the N terminus (N) and ending with red at the C terminus (C). The disordered B domain is depicted as a ribbon encompassing the A and C domains, although its predicted conformation is of very low confidence. (B) The FVIII structure from panel A is shown schematically in cartoon form in the same orientation and colors. The globular A1, A2, A3, C1, and C2 domains are denoted by filled circles. The disordered B domain is schematically represented by a green line. (C) The 2863 missense variants are mapped to the ribbon diagram, in which the occurrence of multiple different variants with mild, moderate, and severe phenotypes are shown as colored spheres. The mild, moderate, and severe variants are overlaid onto the ribbon diagram of panel C. (D) The 2863 missense variants are mapped to the ribbon diagram, in which the phenotype classifications of mild, moderate, and severe effects are denoted as the traffic light colors green, yellow, and red, respectively. (E) The 25 most commonly reported synonymous variants are shown as spheres in the ribbon structure of FVIII shown in panel A. Black spheres denote the 5th to 25th most common variants, and the 4 labeled magenta spheres denote the top 4 most common synonymous variants seen in FVIII.
Figure 6.
The 6 individual FVIII domain structures and their 2863 missense variants. The 6 structures were taken from the AlphaFold prediction. All 6 regions are shown as ribbon diagrams in rainbow colors from the N terminus (blue) to the C terminus (red). The structurally similar A1, A2, and A3 domains are shown with their secondary structure ribbons depicted in the same orientations, as are the structurally similar C1 and C2 domains. The black spheres denote the missense mutations in each domain (A1, 629 variants; A2, 621 variants; A3, 537 variants; B, 483 variants; C1, 241 variants; C2, 235 variants). These variants total 2746. Of the remaining 117 variants, 116 occur in the a1 linker (29 variants), the a2 linker (19 variants), the a3 linker (39 variants), and the signal peptide (29 variants), and 1 variant was reported without information on its location. Note that the modeled structure of the B domain was predicted from AlphaFold with very low confidence and should not be overinterpreted to assume it actually surrounds the A and C domains as pictured.
Figure 7.
Analyses of the variants in the 6 FVIII domains. (A) The number of missense variants in each of the 6 FVIII domains is shown above the green bars. If the number of the 2863 missense variants is normalized in proportion to the amino acid residues present in each domain, the outcome is shown as orange bars. The missense variants in the linker regions and the signal peptide are not shown. (B) The 7 distribution types of the 1281 variants in the B domain (pink) are compared against those for all 6211 genetic variants that occur across all the protein domains in FVIII (blue). (C) The PolyPhen-2 substitution analyses predict the damaging effects of all 2863 variants from across the entire protein structure, based on the AlphaFold-predicted FVIII structure. (D) The SIFT substitution analyses predict the damaging effects of 2863 variants from across the entire protein structure based on the AlphaFold-predicted FVIII structure. (E) The FATHMM substitution analyses predict the damaging effects of 2584 missense variants from across the entire protein structure based on the AlphaFold-predicted FVIII structure. This excludes synonymous variants that are not considered by this software. (F) Accessibility analyses of 2863 missense variants in the FVIIIa crystal structure (PDB ID 6MF2). The FVIII variants were grouped by their phenotypic classification (severity) and subdivided according to the residue surface accessibility (ACC) determined using DSSP (see “Methods”). Accessibilities of 0 to 1 indicate full side chain burial; values of 2 to 3 indicate increased side-chain exposure to solvent; and values of ≥4 indicate high solvent exposure. The ACC values for the B domain (Figure 3) all indicated full surface exposures, in keeping with its predicted disordered structure.
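
The accessibility bins described for Figure 7F can be illustrated with a minimal Python sketch. It assumes the integer ACC scale given in the caption (0 to 1 buried, 2 to 3 increased exposure, ≥4 high exposure); how raw DSSP output is converted to that scale is described in the paper's Methods and is not reproduced here. The function names and the example records are hypothetical and are not data from the database.

```python
from collections import Counter


def classify_acc(acc: int) -> str:
    """Map an integer ACC value to the exposure bin named in the caption."""
    if acc < 0:
        raise ValueError("ACC values are non-negative")
    if acc <= 1:
        return "buried"        # 0 to 1: full side-chain burial
    if acc <= 3:
        return "intermediate"  # 2 to 3: increased side-chain exposure
    return "exposed"           # >= 4: high solvent exposure


def tally_by_severity(records):
    """Count (severity, exposure bin) pairs for (severity, acc) records."""
    return Counter((severity, classify_acc(acc)) for severity, acc in records)


# Hypothetical example records: (phenotype severity, ACC value).
example_records = [
    ("severe", 0),
    ("severe", 1),
    ("mild", 3),
    ("moderate", 5),
    ("mild", 9),
]

example_tally = tally_by_severity(example_records)
for (severity, exposure), count in sorted(example_tally.items()):
    print(f"{severity:>8} / {exposure:<12} {count}")
```

Grouping by severity first and exposure bin second mirrors the layout described for panel F, in which variants are grouped by phenotype and then subdivided by accessibility.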
