Nat Struct Mol Biol. 2025 Oct;32(10):2099-2111.
doi: 10.1038/s41594-025-01582-w. Epub 2025 Jun 13.

Multiplex and multimodal mapping of variant effects in secreted proteins via MultiSTEP

Nicholas A Popp et al. Nat Struct Mol Biol. 2025 Oct.

Abstract

Despite widespread advances in DNA sequencing, the functional consequences of most genetic variants remain poorly understood. Multiplexed assays of variant effect can measure the function of variants at scale but cannot readily be applied to the ~10% of human genes encoding secreted proteins. Here we develop a flexible, scalable human cell surface display method, multiplexed surface tethering of extracellular proteins (MultiSTEP), to study the consequences of missense variation in coagulation factor IX (FIX), a serine protease in which genetic variation can cause hemophilia B. We combine MultiSTEP with a panel of antibodies to detect FIX secretion and post-translational modification (PTM), measuring 44,816 variant effects for 436 synonymous variants and 8,528 of the 8,759 possible F9 missense variants. Almost half of missense variants impact secretion, PTM or both. We also identify functional constraints on secretion within the signal peptide and for nearly all gain or loss of cysteine variants. Secretion scores correlate strongly with FIX levels in hemophilia B and reveal that loss-of-secretion variants are more often associated with severe disease. Integration of the secretion and PTM scores enables reclassification of 63.1% of F9 variants of uncertain significance in the My Life, Our Future hemophilia genotyping project. Lastly, we show that MultiSTEP can be applied to other secreted proteins, thus demonstrating that MultiSTEP is a multiplexed, multimodal and generalizable method for systematically assessing variant effects in secreted proteins at scale.

Conflict of interest statement

Competing interests: J.P.S. was an expert witness for Genentech and Paul, Weiss, Rifkind, Wharton and Garrison. The other authors declare no competing interests.

Figures

Extended Data Figure 1: MultiSTEP is based on a flexible genomically integrated approach for expressing secreted protein variants.
a. Cartoon depicting integration of a MultiSTEP plasmid construct into a genomically integrated landing pad cassette. (Top): Lentivirally integrated landing pad cassette expressing mTagBFP2+ (royal blue) from a tetON inducible promoter. mTagBFP2 is fused to an inducible caspase-9 (iCasp9, orange) and a blasticidin resistance gene (dark yellow) with 2A sequences (dark pink), expressing mTagBFP2-2A-iCasp9-2A-BlastR from a tetON inducible promoter with an attP serine recombinase recognition site (black). Downstream is a terminator sequence (Term, brown) and tet repressor (tetR, salmon). Bxb1 serine recombinase, expressed from another plasmid, is shown in grey. (Middle): MultiSTEP plasmid construct. Secreted protein coding sequence (pink) is C-terminally fused to flexible linkers (teal), strep II epitope tag (green), and CD28 transmembrane domain (medium blue). IRES (purple) drives co-transcription of mCherry (red). Upstream is an attB serine recombinase recognition sequence (goldenrod) and a unique 18 nucleotide degenerate barcode (BC, light yellow). (Bottom): Landing pad following plasmid integration. attP and attB sequences have been recombined, forming attL and attR sequences. b. Sequential flow cytometry gating scheme for detecting and isolating landing pad cells with an integrated MultiSTEP construct. Dot pseudocolor indicates density of cells. FSC: Forward scatter; SSC: side scatter. c. Comparison of negative control 293-F cells (top) with 293-F cells incubated with lentivirus encoding the landing pad cassette (bottom, n > 10,000 cells). d. Comparison of unrecombined landing pad cells (top) with cells transfected with a MultiSTEP plasmid encoding WT FIX (bottom, n > 10,000 cells). e. Comparison of cells transfected with a MultiSTEP construct encoding WT FIX treated with doxycycline (top) or doxycycline and 10 nM AP1903 (bottom, n > 10,000 cells). f. Design iterations of MultiSTEP construct plasmid in (a, top). L1-Strep MultiSTEP construct does not contain an L2 linker. Flow cytometry of MultiSTEP constructs using an anti-Strep II tag antibody (n ~30,000 cells).
Extended Data Figure 2: A flexible tag-based approach to assessing variant effects on secretion.
a. Heatmap showing strep tag secretion scores for missense FIX variants. Color indicates MultiSTEP score from 0 (blue, lowest 5% of scores) to white (1, WT) to red (increased). Black dots indicate the WT amino acid. Missing data are gray. b. Density distributions of strep tag secretion scores for FIX missense variants (orange) and synonymous variants (blue). Dashed line denotes the 5th percentile of the synonymous variant distribution. c. Scatter plot comparing MultiSTEP-derived strep tag secretion scores for seven different FIX variants (p.C28Y, p.A37T, p.G58E, p.E67K, p.C134R, p.S220T, and p.H267L), WT, and an unrecombined negative control to the geometric mean of Alexa Fluor-647 fluorescence measured individually using flow cytometry (n = 3 replicates). Error bars show standard error of the mean. d-e. Scatter plots of median MultiSTEP-derived strep tag secretion scores and heavy chain (d) or light chain (e) secretion scores at each position in FIX (n = 3 replicates). Points are colored by chain architecture, using the same color scheme as Fig. 2a. Black dashed line indicates the line of perfect correlation between secretion scores. Gray background indicates <0.3 point deviation from perfect correlation. f. Density plots of MultiSTEP-derived synonymous variant scores generated with the indicated antibody. The dashed vertical line shows WT score.
Extended Data Figure 3: MultiSTEP-derived FIX secretion scores correlate with orthologous measures of FIX secretion.
a. Flow cytometry of p.C28Y and WT controls (n = 10,000 cells) with the FIX library (n = 100,000 cells). b. Comparison of ELISA measurements of eight untethered FIX missense variants (p.C28Y, p.A37T, p.G58E, p.E67K, p.G125V, p.C134R, p.S220T, and p.H267L) expressed from 293-F cells and heavy chain secretion scores (n = 3 replicates). Error bars show the standard error of the mean. Pearson’s correlation coefficient is shown. c. Scatter plot comparing MultiSTEP-derived heavy chain secretion scores for 20 different FIX missense variants, WT, and unrecombined negative control (n = 3 replicates) to the geometric mean of Alexa Fluor-647 fluorescence measured using flow cytometry on cells expressing each variant individually. Error bars show standard error of the mean (n = 10,000 cells). Pearson’s correlation coefficient is shown.
Extended Data Figure 4: Variants near antibody epitopes demonstrate minor effects on secretion scores.
a. Scatter plot of the difference in heavy chain and light chain secretion scores and the distance in angstroms between all α-carbons in the light chain and the nearest light chain epitope α-carbon using the AlphaFold2 model of mature, two-chain FIX. Low-confidence positions with predicted local distance difference test score (pLDDT) of <70 were removed from analysis. Color indicates whether a position was identified in the light chain epitope in Fig. 2h. Horizontal dashed line indicates no difference in secretion scores. Vertical dashed line indicates boundary of likely epitope-adjacent effects on secretion scores by changepoint analysis (9.15 angstroms). b. Scatter plot of the difference in heavy chain and light chain secretion scores and the distance in angstroms between all α-carbons in the heavy chain and the nearest heavy chain epitope α-carbon using the AlphaFold2 model of mature, two-chain FIX. Low-confidence positions with pLDDT of <70 were removed from analysis. Color indicates whether a position was identified in the heavy chain epitope in Fig. 2h. Horizontal dashed line indicates no difference in secretion scores. Vertical dashed line indicates boundary of likely epitope-adjacent effects on secretion scores by changepoint analysis (5.71 angstroms). c. Scatter plot of median MultiSTEP-derived heavy chain and light chain secretion scores at each position in FIX. Points are colored by epitope (Fig. 2h) or epitope-adjacent position as in (a) and (b). Black dashed line indicates the line of perfect correlation between secretion scores. Gray background indicates <0.3 point deviation from perfect correlation.
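The distance measure in panels (a) and (b), from each α-carbon to the nearest epitope α-carbon, can be sketched in a few lines. This is only an illustration: the coordinates and position numbers below are hypothetical toy values, the function and variable names are not from the paper, and treating the changepoint boundary as inclusive is an assumption. Only the 9.15 angstrom cutoff is taken from the legend.

```python
import math

def nearest_epitope_distance(ca_coords, epitope_positions):
    """Return {position: distance to the closest epitope alpha-carbon}.

    ca_coords maps a residue position to its alpha-carbon (x, y, z)
    coordinates in angstroms; epitope_positions lists epitope residues.
    """
    epitope_xyz = [ca_coords[p] for p in epitope_positions]
    return {
        pos: min(math.dist(xyz, e) for e in epitope_xyz)
        for pos, xyz in ca_coords.items()
    }

# Hypothetical toy coordinates, not taken from any FIX structure.
toy_coords = {1: (0.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0),
              3: (10.0, 0.0, 0.0), 4: (0.0, 0.0, 12.0)}
toy_epitope = [1]

distances = nearest_epitope_distance(toy_coords, toy_epitope)

# Light chain boundary from the legend; inclusive comparison is an assumption.
LIGHT_CHAIN_CUTOFF = 9.15
epitope_adjacent = sorted(
    pos for pos, d in distances.items()
    if pos not in toy_epitope and d <= LIGHT_CHAIN_CUTOFF
)
```

In practice the coordinates would come from the structural model, after dropping positions with pLDDT below 70 as the legend describes.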
Extended Data Figure 5: Effect of missense FIX variation on secretion compared to missense variant effects on abundance in cytosolic or transmembrane proteins.
a. Box plots of the 25th, 50th, and 75th percentiles of secretion (FIX, MultiSTEP) or abundance (all others, VAMP-seq) scores for all nonsynonymous variants across all positions with the indicated WT amino acid for six different proteins (n = 29,287 variants). Whiskers span the range of data. b. Box plots of the 25th, 50th, and 75th percentiles of secretion (FIX, MultiSTEP) or abundance (all others, VAMP-seq) for all nonsynonymous variant amino acid substitutions across all positions for six different proteins (n = 29,287 variants).
Extended Data Figure 6: Sequence conservation strongly influences the effect of variation on FIX secretion.
a. Comparison of light chain secretion scores with Consurf conservation grades (1: least conserved, 9: most conserved). Violin plot shows distribution of points (n = 8,528 variants) with an inset box plot representing the 25th, 50th, and 75th percentiles. Whiskers span the range of data. Dashed horizontal line is the 5th percentile of the synonymous secretion score distribution. b. Comparison of median light chain secretion scores (n = 8,528 variants) with Consurf conservation grades. Violin plot shows distribution of points with an inset box plot representing the 25th, 50th, and 75th percentiles. Whiskers span the range of data. Dashed horizontal line is the 5th percentile of the synonymous secretion score distribution.
Extended Data Figure 7: Carboxylation-sensitive antibodies identify functional motifs.
a. Multiple sequence alignment, generated using MUSCLE, of Gla domain-containing proteins (UniProt) that bind the carboxylation-sensitive Gla-motif (ExxxExC) antibody. Antibody epitopes for both the carboxylation-sensitive FIX-specific antibody (ω-loop) and the carboxylation-sensitive Gla-motif antibody are shown. hF9: human coagulation factor IX (P00740); hF2: human prothrombin (coagulation factor II, P00734); hF7: human coagulation factor VII (P08709); hF10: human coagulation factor X (P00742); hPC: human protein C (P04070); hPS: human protein S (P07225); hBGP: human osteocalcin (P02818); bBGP: bovine osteocalcin (P02820); hGAS6: human growth arrest-specific protein 6 (P14393); ppVPA: Pseudechis porphyriacus venom prothrombin activator porpharin-D (P58L93); nsVPA: Notechis scutatus venom prothrombin activator notecarin-D1 (P82807); osVPA: Oxyuranus scutellatus venom prothrombin activator oscutarin-C (P58L96). b-c. Fluorescence of unrecombined negative control and WT FIX-expressing cells with and without warfarin pretreatment generated by staining cells with a carboxylation-sensitive FIX-specific (b) or carboxylation-sensitive Gla-motif antibody (c). d-f. Heatmaps showing carboxylation-sensitive FIX-specific carboxylation scores (d), carboxylation-sensitive Gla-motif carboxylation scores (e), or light chain secretion scores (f) for FIX propeptide variants. Furin cleavage site (Furin CS), ω-loop, ExxxExC motif, and aromatic stack (AS) are annotated above (d). Heatmap color indicates antibody score from 0 (blue, lowest 5% of scores) to white (1, WT) to red (increased). Black dots indicate the WT amino acid. Missing data are gray. g-i. Heatmaps showing carboxylation-sensitive FIX-specific carboxylation scores (g), carboxylation-sensitive Gla-motif carboxylation scores (h), or light chain secretion scores (i) for FIX Gla domain variants. ω-loop, ExxxExC motif, and aromatic stack (AS) are annotated above (g). Heatmap color indicates antibody score from 0 (blue, lowest 5% of scores) to white (1, WT) to red (increased). Black dots indicate the WT amino acid. Missing data are gray.
Extended Data Figure 8: Clinical correlates of secretion and gamma-carboxylation scores map to FIX biochemical features.
a. Scatter plot of the mean and standard error of light chain secretion scores (n = 2 replicates) and FIX plasma antigen from individuals with hemophilia B in the EAHAD database (n = 416 variants). Light chain epitope-adjacent positions identified in Extended Data Fig. 4a are removed (n = 19 variants across 38 individuals). Dashed horizontal line is 40% FIX plasma antigen. Dashed vertical line is the 5th percentile of the synonymous secretion score distribution. b. Comparison of hemophilia B severity from individuals with hemophilia B in the EAHAD database (n = 1,781 variants) with light chain secretion scores. Light chain epitope-adjacent positions identified in Extended Data Fig. 4a are removed (n = 40 variants). Violin plot shows distribution of points with an inset box plot representing the 25th, 50th, and 75th percentiles. Whiskers span the range of data. Dashed horizontal line is the 5th percentile of the synonymous secretion score distribution. p values from a Kruskal–Wallis test adjusted for multiple comparisons by post-hoc Dunn’s test are shown. c. Scatter plot of the mean and standard error of light chain secretion scores (n = 2 replicates) and FIX plasma antigen from individuals harboring gain-of-cysteine variants in the EAHAD database (n = 9 variants across 27 individuals). Dashed horizontal line is 40% FIX plasma antigen. Dashed vertical line is the 5th percentile of the synonymous secretion score distribution. d. Bar plot of hemophilia B disease severity in the EAHAD database for individuals harboring gain-of-cysteine variants. e. Bar plot of the number of FIX variants in the EAHAD database and their classification using the random forest model trained on MultiSTEP functional data, by disease severity. Color indicates model prediction. f. Bar plot of the number of FIX propeptide and Gla domain variants in the EAHAD database and their classification using the random forest model trained on MultiSTEP functional data, by disease severity. 
Color indicates model prediction.
Extended Data Figure 9: Random forest model predictions for FIX variants in the EAHAD FIX Variant Database associated with hemophilia B.
a. Spearman correlation of MultiSTEP functional scores with EVE, AlphaMissense, REVEL, and CADD variant effect predictors. b. Histograms of four variant effect predictor scores for F9 missense variants of known effect curated from ClinVar, gnomAD, and MLOF. Color indicates clinical variant interpretation. Black dashed vertical lines indicate the thresholds for each predictor. For AlphaMissense we used the thresholds recommended in the original publication for 90% precision on existing ClinVar annotated variants (≤0.34: benign, 0.34–0.564: uncertain, ≥0.564: pathogenic). For REVEL, we used the thresholds used in the initial publication to assess REVEL’s precision in ClinVar (<0.5: benign, 0.5: uncertain, >0.5: pathogenic). For EVE, we used the thresholds recommended in the original publication for the 75% most confident classifications (≤0.359: benign, 0.359–0.641: uncertain, ≥0.641: pathogenic). For CADD, we used the same thresholds used in the MLOF clinical laboratory (<10: benign, 10–20: uncertain, >20: pathogenic). Number of variants scored by each predictor is annotated. c. Classification accuracy for F9 missense variants of known effect curated from ClinVar, gnomAD, and MLOF in our test set (benign/likely benign, n = 4 variants; pathogenic/likely pathogenic, n = 34 variants) by MultiSTEP variant function classifier and the four variant effect predictors using thresholds defined in (b). True benign/likely benign and pathogenic/likely pathogenic labels are denoted on the x-axis, and columns are colored relative to the classification for each method. Solid colors indicate correct classification, whereas striped colors indicate incorrect classification. For variant effect predictors, missing variants are colored gray with stripes and uncertain predictions are colored yellow with stripes. PPV: positive predictive value; NPV: negative predictive value; Spec: specificity; Sens: sensitivity.
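As a rough illustration of how the thresholds in (b) and the metrics in (c) fit together, the sketch below encodes the stated cutoffs and computes PPV, NPV, sensitivity and specificity. The cutoffs are those given in the legend; the function names, the toy labels, and the choice to leave uncertain or missing predictions out of the metric counts are assumptions made for this example and may differ from the authors' analysis.

```python
# (benign cutoff, pathogenic cutoff, whether the cutoffs are inclusive),
# transcribed from the legend of panel (b).
THRESHOLDS = {
    "AlphaMissense": (0.34, 0.564, True),   # <=0.34 benign, >=0.564 pathogenic
    "EVE": (0.359, 0.641, True),            # <=0.359 benign, >=0.641 pathogenic
    "REVEL": (0.5, 0.5, False),             # <0.5 benign, >0.5 pathogenic
    "CADD": (10.0, 20.0, False),            # <10 benign, >20 pathogenic
}

def classify(predictor, score):
    """Map a predictor score to 'benign', 'uncertain' or 'pathogenic'."""
    low, high, inclusive = THRESHOLDS[predictor]
    if inclusive:
        if score <= low:
            return "benign"
        if score >= high:
            return "pathogenic"
    else:
        if score < low:
            return "benign"
        if score > high:
            return "pathogenic"
    return "uncertain"

def confusion_metrics(truth, predicted):
    """PPV, NPV, sensitivity and specificity from paired labels.

    Assumption: pairs whose prediction is neither 'benign' nor
    'pathogenic' (for example 'uncertain') are skipped.
    """
    pairs = [(t, p) for t, p in zip(truth, predicted)
             if p in ("benign", "pathogenic")]
    tp = sum(t == "pathogenic" and p == "pathogenic" for t, p in pairs)
    tn = sum(t == "benign" and p == "benign" for t, p in pairs)
    fp = sum(t == "benign" and p == "pathogenic" for t, p in pairs)
    fn = sum(t == "pathogenic" and p == "benign" for t, p in pairs)
    return {"PPV": tp / (tp + fp), "NPV": tn / (tn + fn),
            "Sens": tp / (tp + fn), "Spec": tn / (tn + fp)}

# Hypothetical labels, for illustration only.
toy_truth = ["pathogenic", "pathogenic", "pathogenic", "benign", "benign"]
toy_pred = ["pathogenic", "pathogenic", "benign", "benign", "pathogenic"]
toy_metrics = confusion_metrics(toy_truth, toy_pred)
```

Keeping the cutoffs in one table makes the inclusive versus exclusive boundary for each predictor explicit, which is where these four schemes differ.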
Extended Data Figure 10: Detection of cell-surface displayed FVIII.
Experimental flow cytometry of B-domain deleted coagulation factor VIII (FVIII) in the MultiSTEP backbone (n = ~30,000 cells per variant). Unrecombined cells (NC) do not display FVIII and serve as a negative control. Fluorescent signal was generated by staining cells with anti-FVIII antibodies specific to each of the five FVIII domains in the heavy chain [A1 (a) and A2 (b)] and light chain [A3 (c), C1 (d), and C2 (e)].
Figure 1: MultiSTEP enables at-scale measurement of variant effects in secreted proteins.
a. Secreted proteins (purple) make up approximately 10% of the human proteome. b. Cumulative number of missense variants in secreted proteins deposited in ClinVar from 2016 to 2023, colored by clinical interpretation. c. MultiSTEP retains secreted proteins on the cell surface, establishing a physical link between genotype and phenotype (left panel). Cells expressing a library of variants of the target protein are sorted into bins based upon intensity of fluorescent antibody binding, followed by deep sequencing to derive a functional score for each individual variant (middle panels). The result is a variant effect map (right panel). d. MultiSTEP design. Secreted protein coding sequences (pink) are cloned into an attB-containing landing pad donor plasmid. Secreted proteins are engineered to have C-terminally fused (GGGGS)2 flexible linkers (L1 and L2, teal) attached to a single pass transmembrane domain (TMD, blue). In between the linkers is a strep II epitope tag for surface detection (green). The construct contains an IRES (purple) driving co-transcription of an mCherry fluorophore (red) that serves as a transcriptional control. e-g. Flow cytometry of known well-secreted (p.A37T, p.S220T, WT) and poorly-secreted (p.C28Y) FIX variants displayed using MultiSTEP (n ~30,000 cells per variant). Unrecombined cells do not display FIX and serve as a negative control. Fluorescent signal was generated by staining the library with either an anti-FIX heavy chain antibody (e), an anti-FIX light chain antibody (f), or an anti-strep II tag antibody (g).
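Panel (c) describes sorting cells into fluorescence bins and sequencing each bin to derive a per-variant score. The legend does not give the scoring formula, so the sketch below shows one common sort-seq scheme rather than the authors' method: a weighted average of a variant's frequency across bins, rescaled so that a low reference maps to 0 and WT maps to 1 (matching the 0-to-1 scale used in the heatmap legends). The number of bins, the bin weights, the reference values and all names are assumptions.

```python
def weighted_bin_score(bin_freqs, weights=(0.25, 0.5, 0.75, 1.0)):
    """Weighted average of a variant's frequency across sorted bins.

    bin_freqs: the variant's frequency in each bin, ordered from the
    dimmest to the brightest bin. Four bins and these weights are
    assumptions for illustration.
    """
    total = sum(bin_freqs)
    return sum(w * f for w, f in zip(weights, bin_freqs)) / total

def normalize(raw, low_ref, wt_ref):
    """Rescale a raw score so low_ref maps to 0 and wt_ref maps to 1."""
    return (raw - low_ref) / (wt_ref - low_ref)

# Hypothetical frequencies for three variants.
dim_only = weighted_bin_score([1.0, 0.0, 0.0, 0.0])      # all cells in the dimmest bin
bright_only = weighted_bin_score([0.0, 0.0, 0.0, 1.0])   # all cells in the brightest bin
spread = weighted_bin_score([0.25, 0.25, 0.25, 0.25])    # evenly spread across bins

# Treating the extremes as the low and WT references (an assumption).
spread_normalized = normalize(spread, low_ref=dim_only, wt_ref=bright_only)
```

A variant whose cells sit mostly in dim bins scores near 0 after rescaling, and one distributed like WT scores near 1.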
Figure 2: 17,927 MultiSTEP-derived secretion scores for 8,964 factor IX variants.
a. Factor IX domain and chain architecture. Signal: Signal peptide. Pro: Propeptide. Gla: Gla domain. EGF1: Epidermal growth-like factor 1 domain. EGF2: Epidermal growth-like factor 2 domain. Activation: Activation peptide. Protease: Serine protease domain. b-c. Heatmaps showing FIX heavy chain secretion scores (n = 3 replicates) (b) or FIX light chain secretion scores (n = 2 replicates) (c) for missense FIX variants. Heatmap color indicates secretion score from 0 (blue, lowest 5% of scores) to white (1, WT) to red (increased scores). Black dots indicate the WT amino acid. Missing data are colored gray. d-e. Density distributions of heavy chain (d) or light chain (e) secretion scores for FIX missense (orange) and synonymous (blue) variants. Dashed line denotes the 5th percentile of the synonymous variant distribution. f-g. Scatter plots comparing MultiSTEP-derived heavy chain (f) or light chain (g) secretion scores for seven different FIX variants (p.C28Y, p.A37T, p.G58E, p.E67K, p.C134R, p.S220T, and p.H267L), WT, and an unrecombined negative control to the geometric mean of Alexa Fluor-647 fluorescence measured individually using flow cytometry (n = 3 replicates). Error bars show standard error of the mean. The p.E67K missense variant is not present in (g). h. Scatter plot of median MultiSTEP-derived heavy chain (n = 3 replicates) and light chain (n = 2 replicates) secretion scores at each position in FIX. Points are colored by chain architecture, using the same color scheme as (a). Black dashed line indicates perfect correlation between secretion scores. Pearson’s correlation coefficient is shown. Gray background indicates <0.3 point deviation from perfect correlation. Points with median positional scores outside gray background are labeled with their corresponding FIX position. i. AlphaFold2 model of mature, two-chain FIX (positions 47–191 and 227–461). Putative FIX heavy chain (purple) or light chain (green) epitope positions shown as colored surfaces. j. Magnified view of the putative light chain epitope within the EGF1 domain (orange). Points colored in concordance with (h). k. Magnified view of the putative heavy chain epitope within the FIX serine protease domain (yellow). Points colored in concordance with (h).
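The dashed line in panels (d) and (e) marks the 5th percentile of the synonymous score distribution, which elsewhere in these legends serves as the threshold for abnormal function. A minimal sketch of that thresholding follows. The scores are hypothetical, the names are illustrative, the linear-interpolation percentile convention is an assumption, and so is calling a variant low only when its score is strictly below the cutoff.

```python
def percentile(values, q):
    """Percentile with linear interpolation between sorted values (q in 0..100)."""
    xs = sorted(values)
    rank = (len(xs) - 1) * q / 100
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)

def flag_low_scores(missense_scores, synonymous_scores, q=5):
    """Return the cutoff and a {variant: is_below_cutoff} mapping."""
    cutoff = percentile(synonymous_scores, q)
    return cutoff, {v: s < cutoff for v, s in missense_scores.items()}

# Hypothetical scores: 21 synonymous variants spread around WT (1.0).
toy_synonymous = [0.80 + 0.02 * i for i in range(21)]
toy_missense = {"var_low": 0.05, "var_wt_like": 1.02}

cutoff, flags = flag_low_scores(toy_missense, toy_synonymous)
```

Because synonymous variants are expected to behave like WT, the lower tail of their distribution gives an empirical estimate of assay noise, and a missense score below it is unlikely to reflect WT-like secretion.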
Figure 3: MultiSTEP reveals biochemical constraints on secretion.
a. Predicted signal peptide regions for WT FIX from SignalP 6.0 (top). Heatmap shows FIX heavy chain secretion scores for signal peptide variants (bottom, n = 3 replicates). Heatmap color indicates secretion score. Black dots indicate the WT amino acid. Missing scores are gray. N: N-region; H: H-region; C: C-region. b. Comparison of MultiSTEP secretion scores with SignalP 6.0 (SP6) functional classification, grouped by signal peptide region. Violin plot shows distribution of points with an inset box plot representing the 25th, 50th, and 75th percentiles. Whiskers span the range of data. Dashed horizontal line is the 5th percentile of the synonymous secretion score distribution. Number of variants in each class is labeled above the violin plot. c. FIX cysteine positions colored by domain architecture (top). Sig: Signal peptide. Gla: Gla domain. EGF1: Epidermal growth-like factor 1 domain. EGF2: Epidermal growth-like factor 2 domain. Protease: Serine protease domain. Disulfide bridges in WT FIX are denoted by black connecting lines. Heatmap of FIX heavy chain secretion scores for loss-of-cysteine substitutions, colored as in (b) (bottom, n = 3 replicates). d. Mean (point) and standard error (error bars) of variant effect scores (FIX: MultiSTEP, all others: VAMP-seq) for all loss-of-cysteine substitutions for different proteins (n = 1,031 variants). Bonferroni-corrected pairwise two-sided t-test p values are shown. e. Box plots representing the 25th, 50th, and 75th percentiles of secretion scores for all missense variants across all positions with the indicated WT amino acid (n = 8,528 variants). Whiskers span the range of data. f. Mean (point) and standard error (error bars) of variant effect scores (FIX: MultiSTEP, all others: VAMP-seq) for all gain-of-cysteine substitutions for different proteins (n = 1,404 variants). Bonferroni-corrected pairwise two-sided t-test p values are shown. g. Box plots representing the 5th, 25th, 50th, 75th, and 95th percentiles of secretion scores for all missense substitutions of the indicated variant amino acid across all positions (n = 8,528 variants). Whiskers span the range of data.
Figure 4: MultiSTEP enables measurement of variant effects on FIX post-translational modification.
a. Factor IX domain and chain architecture. Signal: Signal peptide. Pro: Propeptide. Gla: Gla domain. EGF1: Epidermal growth-like factor 1 domain. EGF2: Epidermal growth-like factor 2 domain. Activation: Activation peptide. Protease: Serine protease domain. b-c. Heatmaps showing carboxylation-sensitive FIX-specific carboxylation scores (b) or carboxylation-sensitive Gla-motif carboxylation scores (c) for nearly all missense FIX variants (n = 2 replicates). Heatmap color indicates antibody score from 0 (blue, lowest 5% of scores) to white (1, WT) to red (increased antibody scores). Black dots indicate the WT amino acid. Missing scores are colored gray. Furin cleavage site (F), ω-loop (ω), ExxxExC motif (E), and aromatic stack (AS) are annotated above (b) and (c). For higher resolution heatmaps on the propeptide and Gla domains of FIX, please refer to Extended Data Fig. 7d–i. d-e. Density distributions of carboxylation-sensitive FIX-specific (d) or carboxylation-sensitive Gla-motif (e) carboxylation scores for FIX missense variants (orange) and synonymous variants (blue). Dashed line denotes the 5th percentile of the synonymous variant distribution. f. Scatter plot of median MultiSTEP-derived carboxylation-sensitive FIX-specific carboxylation scores and light chain secretion scores at each position in FIX. Points are colored by domain architecture, using the same color scheme as (a). Black dashed line indicates >0.2 point deviation threshold from perfect correlation between carboxylation and secretion scores. Points with deviation greater than this threshold are labeled with their corresponding FIX position. Pearson’s correlation coefficient is shown. g. Crystal structure of FIX Gla domain (positions 47–92). Disulfide bridges (yellow) and γ-carboxylated glutamates are shown as sticks. Calcium ions are shown as teal spheres. Residues are colored as the ratio of the median carboxylation-sensitive FIX-specific carboxylation score to median FIX light chain secretion score. Missing positions are colored gray.
Figure 5: Secretion and gamma-carboxylation scores reveal clinical features of hemophilia B and enable variant reinterpretation.
a. Scatter plot of the mean and standard error of light chain secretion scores (n = 2 replicates) and FIX plasma antigen from 457 individuals with hemophilia B in the EAHAD database. Dashed horizontal line is 40% FIX plasma antigen. Dashed vertical line is the 5th percentile of the synonymous secretion score distribution. b. Comparison of hemophilia B disease severity in the EAHAD database with light chain secretion scores (n = 490 variants). Violin plot shows distribution of points with an inset box plot representing the 25th, 50th, and 75th percentiles. Whiskers span the range of data. Dashed horizontal line is the 5th percentile of the synonymous secretion score distribution. p values from a Kruskal–Wallis test adjusted for multiple comparisons by post-hoc Dunn’s test are shown. c. Severe hemophilia B disease-associated variants with WT-like light chain secretion scores or FIX-specific γ-carboxylation antibody scores are shown. Bars are colored by domain. d. Comparison of hemophilia B disease severity in the EAHAD database with light chain secretion scores (n = 229 variants). Violin plot shows distribution of points with an inset box plot representing the 25th, 50th, and 75th percentiles. Whiskers span the range of data. Dashed horizontal line is 40% FIX plasma antigen. p values from a Kruskal–Wallis test followed by post-hoc Bonferroni-corrected Dunn’s test are shown. e. Histograms of multiplexed functional scores for F9 missense variants of known effect curated from ClinVar, gnomAD, and MLOF. Color indicates clinical variant interpretation. Data from four antibodies are shown. Dashed vertical line indicates the 5th percentile of synonymous variants used as a threshold for abnormal function. f. Receiver operating characteristic curve for variant function classifier. Dot indicates final classifier performance. g. Histogram depicting F9 missense variant minor allele frequencies (MAF) in hemizygotes in gnomAD 4.1. Color indicates model-based functional classification using MultiSTEP scores. Vertical dashed line indicates estimated prevalence of hemophilia B in hemizygous individuals. h. Sankey diagram of F9 variant reinterpretation using functional data as moderate or strong evidence. Labeled nodes represent the number of variants of each class.
Figure 6: MultiSTEP can be applied to diverse secreted proteins.
a. Flow cytometry of cells expressing protein and control constructs in the MultiSTEP backbone following staining with an anti-strep II tag antibody (n ~30,000 cells each). Unrecombined cells do not display FIX. All other constructs contain the MultiSTEP flexible linker, strep II tag, and transmembrane domain. Δstart is a FIX cDNA that lacks a start codon. TM only lacks a secreted protein of interest. FIX Δsignal peptide expresses a FIX molecule without its secretion-targeting signal peptide. b-c. Flow cytometry of B-domain deleted coagulation factor VIII (FVIII) in the MultiSTEP backbone or unrecombined negative control cells (NC) (n ~30,000 cells each) stained with an anti-FVIII A1-A3 antibody, which targets the discontinuous epitope at the interface of the A1 and A3 domains (b) or an anti-FVIII A2 antibody, which targets a discontinuous epitope (positions 497–510 and 584–593) within the A2 domain (c). d-h. Flow cytometry of B-domain deleted coagulation factor VIII (FVIII) and 5 FVIII variants in the MultiSTEP backbone along with unrecombined negative control cells (NC) (n = ~10,000 cells each). Cells were stained with anti-FVIII antibodies specific to the A1 (d), A2 (e), light chain (f), C1 (g), or C2 (h) domains. i-m. Flow cytometry of cells expressing coagulation factor VII (i), coagulation factor X (j), proinsulin (k), plasma protease C1 inhibitor (l), and alpha-1 antitrypsin (m) constructs in the MultiSTEP backbone along with unrecombined negative control (NC) (n ~10,000 cells each) stained with an anti-strep II tag antibody.
