[Preprint]. 2024 Mar 27:2024.03.21.585809.
doi: 10.1101/2024.03.21.585809.

Mutational scanning of CRX classifies clinical variants and reveals biochemical properties of the transcriptional effector domain

James L Shepherdson et al. bioRxiv.

Abstract

Cone-Rod Homeobox, encoded by CRX, is a transcription factor (TF) essential for the terminal differentiation and maintenance of mammalian photoreceptors. Structurally, CRX comprises an ordered DNA-binding homeodomain and an intrinsically disordered transcriptional effector domain. Although a handful of human variants in CRX have been shown to cause several different degenerative retinopathies with varying cone and rod predominance, as with most human disease genes, the vast majority of observed CRX genetic variants are uncharacterized variants of uncertain significance (VUS). We performed a deep mutational scan (DMS) of nearly all possible single amino acid substitution variants in CRX, using an engineered cell-based transcriptional reporter assay. We measured the ability of each CRX missense variant to transactivate a synthetic fluorescent reporter construct in a pooled fluorescence-activated cell sorting assay and compared the activation strength of each variant to that of wild-type CRX to compute an activity score, identifying thousands of variants with altered transcriptional activity. We calculated a statistical confidence for each activity score derived from multiple independent measurements of each variant marked by unique sequence barcodes, curating a high-confidence list of nearly 2,000 variants with significantly altered transcriptional activity compared to wild-type CRX. We evaluated the performance of the DMS assay as a clinical variant classification tool using gold-standard classified human variants from ClinVar, and determined that activity scores could be used to identify pathogenic variants with high specificity. That this performance could be achieved using a synthetic reporter assay in a foreign cell type, even for a highly cell type-specific TF like CRX, suggests that this approach shows promise for DMS of other TFs that function in cell types that are not easily accessible.
Per-position average activity scores closely aligned to a predicted structure of the ordered homeodomain and demonstrated position-specific residue requirements. The intrinsically disordered transcriptional effector domain, by contrast, displayed a qualitatively different pattern of substitution effects, following compositional constraints without specific residue position requirements in the peptide chain. The observed compositional constraints of the effector domain were consistent with the acidic exposure model of transcriptional activation. Together, the results of the CRX DMS identify molecular features of the CRX effector domain and demonstrate clinical utility for variant classification.
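
As an illustration of how a sort-seq activity score of this kind can be computed, the following is a minimal sketch and not the authors' pipeline. It assumes (none of this is stated in the abstract) that each barcode's reads are converted to within-bin frequencies, that each fluorescence bin is assigned a representative value, and that a variant's score is the mean of its barcodes' weighted-average bin values divided by the same quantity for wild-type CRX. The function names, bin values, and read counts are hypothetical.

```python
from statistics import mean


def bin_weighted_score(barcode_counts, bin_totals, bin_values):
    """Weighted-average bin value for a single barcode (assumed scoring scheme)."""
    # Convert raw read counts to within-bin frequencies so that bins
    # sequenced to different depths are comparable.
    freqs = [count / total for count, total in zip(barcode_counts, bin_totals)]
    total_freq = sum(freqs)
    if total_freq == 0:
        raise ValueError("barcode has no reads in any bin")
    return sum(f * v for f, v in zip(freqs, bin_values)) / total_freq


def normalized_activity(variant_barcodes, wt_barcodes, bin_totals, bin_values):
    """Mean barcode score of a variant divided by the wild-type mean."""
    variant = mean(bin_weighted_score(c, bin_totals, bin_values) for c in variant_barcodes)
    wild_type = mean(bin_weighted_score(c, bin_totals, bin_values) for c in wt_barcodes)
    return variant / wild_type


# Hypothetical example: four sort bins, two barcodes per construct.
bin_totals = [1000, 1000, 1000, 1000]   # total reads sequenced per bin
bin_values = [1, 2, 3, 4]               # assumed representative value per bin
wt = [[0, 50, 50, 0], [0, 40, 40, 0]]
low = [[50, 50, 0, 0], [30, 30, 0, 0]]
score = normalized_activity(low, wt, bin_totals, bin_values)
```

Under this scheme a score of 1 corresponds to wild-type activity, which matches the normalization described in the abstract; the choice of bin values and the averaging across barcodes are the assumptions to revisit against the full methods.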

Keywords: CRX; acidic exposure model; deep mutational scan; intrinsically disordered domain; retina; retinopathy; transcription factor.

Conflict of interest statement

Declaration of Interests: B.A.C. is on the scientific advisory board of Patch Biosciences. The other authors declare no competing interests.

Figures

Figure 1. Experimental overview of the CRX deep mutational scan.
(A) Known CRX domains, sequence conservation, predicted disorder, and reported ClinVar missense variants. Per-residue conservation computed using sequences from the UniProt UniRef50 cluster derived from human CRX. Disorder predicted with Metapredict (Emenecker et al. 2021). Missense pathogenic variants (“Pathogenic” and/or “Likely pathogenic”) and benign variants (“Benign” and/or “Likely Benign”) from ClinVar (accessed 2023–12); height of bar proportional to number of variants at each position (max=3). (B) Schematic of the CRX deep mutational scan. A clonal cell line carrying a CRX-responsive fluorescent reporter and genomic landing pad (LP) was generated, and a library of CRX variants was integrated into the LP so that each cell expresses a single variant. Following fluorescence-activated cell sorting (FACS), sequencing was used to determine relative variant barcode abundances in each fluorescence bin, allowing for the calculation of a reporter activity score. (C) LP cells integrated with a 1:1 ratio of plasmids carrying mEmerald (green; arbitrary units) or mCherry2 (red; arbitrary units), with or without a plasmid expressing Cre recombinase (60,000 cells plotted per condition, points shaded by density). (D) Reporter activation (green fluorescence; arbitrary units) measured in Reporter + LP cells with the indicated CRX variants integrated. Two independent biological replicate experiments per sample (distributions plotted from 40,000 cells).
Figure 2. DMS activity scores for all measured single amino acid CRX substitutions.
Activity scores were normalized to wild-type CRX; the wild-type amino acid at each position is indicated by the gray circle in each column. The average row shows the mean activity score for all substitutions at each position. Disorder, domains, and conservation are shown as in Figure 1. Empty boxes indicate variants not measured in the DMS assay, due to drop-out during the library cloning or variant measurement steps.
Figure 3. Classification of CRX variants.
(A) DMS activity scores for variants reported to ClinVar in each of the indicated classes (Pathogenic includes “Pathogenic” and/or “Likely pathogenic”; Benign includes “Benign” and/or “Likely Benign”). For variants in the Conflicting Interpretations class, reanalysis of the boxed variants supports re-classification into the groups indicated by box color. (B) Barcode-level activity measurements for wild-type CRX and the indicated representative wild-type-like (p.G137T), low-activity (p.D265I), and high-activity (p.F296P) variants. Each black dot represents a unique barcoded construct; dots are not shown for wild-type CRX because it was barcoded thousands of times. (C) Volcano plots showing classifications for the indicated ClinVar variants (left and middle panels) or all other variants not yet reported in ClinVar (right panel). FDR-corrected p-values were computed from a two-sample Kolmogorov–Smirnov test comparing each variant’s barcode-level measurements to those of wild-type CRX, as visualized in (B). The horizontal line corresponds to an FDR-corrected p-value of 0.05; the vertical line corresponds to a normalized activity score of 1 (wild-type). For visualization purposes, the y-axis is clipped to 10⁻¹⁵; 13 variants with activity scores greater than wild-type and p-values as low as 10⁻²⁶ are hidden. (D) Quantized DMS activity scores for each variant, coloring low- and high-activity variants using the significance cutoffs shown as dotted lines in (C). Disorder, domains, and conservation are shown as in Figure 1.
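
The classification logic described in panels (C) and (D) can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' code: the caption does not name the FDR procedure, so Benjamini-Hochberg is assumed here, and the function names are hypothetical. Only the KS statistic is computed; in practice the p-value would come from a statistics library (for example `scipy.stats.ks_2samp`).

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        # Empirical CDFs are step functions, so the maximum gap
        # is attained at one of the observed data points.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(activity_score, adjusted_p, alpha=0.05):
    """Apply the two cutoffs from the caption: FDR 0.05 and wild-type score 1."""
    if adjusted_p >= alpha:
        return "wild-type-like"
    return "low-activity" if activity_score < 1 else "high-activity"
```

For example, `classify(0.4, 0.001)` returns `"low-activity"`, while a variant with the same score but an adjusted p-value of 0.2 is left as `"wild-type-like"`.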
Figure 4. Average DMS activity scores superimposed on a predicted structure of the CRX homeodomain in complex with DNA.
(A) Various views of residues 38–104 of an AlphaFold predicted structure of CRX aligned to a crystal structure of Drosophila paired in complex with DNA (PDB 1FJL) (Wilson et al. 1995). For each view, a cartoon ribbon model is shown on the left and a space-filling atomic model is on the right. (B) Close-up of residues in the major groove with K88 highlighted (side chain shown). (C) Close-up of minor groove-contacting residues with arginine residues highlighted (side chains shown). In all panels, residues are colored by average DMS activity score, as shown in the “Average” track in Figure 2.
Figure 5. Residue class preferences in the CRX transcriptional effector domain.
(A) Unsupervised hierarchical clustering (UPGMA method) of per-position activity scores for residues in the disordered region of the transcriptional effector domain (residues 2–38 and 153–264). Residues colored by class. Substitution activity scores colored as in Figure 2. (B) Abundance of aspartic acid (D) and glutamic acid (E) residues in disordered regions of all human TFs and CRX orthologs. (C) Abundance of phenylalanine (F), tryptophan (W), tyrosine (Y) and leucine (L) residues in disordered regions of all human TFs and CRX orthologs. For (B) and (C), significance tested by two-sided Mann-Whitney U-test; p < 1e-39. (D) Comparison of the effects of substituting non-hydrophobic positions in wild-type CRX with hydrophobic (F, W, Y, M, I, L, or V) or non-hydrophobic amino acids, separated by the number of neighboring hydrophobic residues. Analysis limited to positions in the disordered transcriptional effector domain; the number of positions in each neighbor group is shown in parentheses. (E) Reporter activation (green fluorescence; arbitrary units) measured in Reporter + LP cells with the indicated CRX variants integrated. Two independent biological replicate experiments per sample (distributions plotted from 50,000 cells).
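
The grouping used in panel (D), where non-hydrophobic positions are separated by their number of neighboring hydrophobic residues, might be sketched as below. The hydrophobic residue set is taken from the caption; the neighborhood size is not given there, so the `window` parameter (residues on each side) is an assumption, as are the function names and the toy sequence.

```python
# Hydrophobic residues as listed in the caption: F, W, Y, M, I, L, V.
HYDROPHOBIC = frozenset("FWYMILV")


def count_hydrophobic_neighbors(sequence, index, window=2):
    """Count hydrophobic residues within `window` positions of `index`.

    The window size is an illustrative assumption, not a value from the source.
    """
    start = max(0, index - window)
    end = min(len(sequence), index + window + 1)
    return sum(
        1 for j in range(start, end)
        if j != index and sequence[j] in HYDROPHOBIC
    )


def group_positions_by_neighbors(sequence, window=2):
    """Map neighbor count -> list of 0-based non-hydrophobic positions."""
    groups = {}
    for i, residue in enumerate(sequence):
        if residue in HYDROPHOBIC:
            continue  # only non-hydrophobic wild-type positions are compared
        n = count_hydrophobic_neighbors(sequence, i, window)
        groups.setdefault(n, []).append(i)
    return groups


# Toy peptide, not a CRX sequence.
groups = group_positions_by_neighbors("AFLAG", window=2)
```

Activity scores for hydrophobic versus non-hydrophobic substitutions could then be compared within each neighbor group, which is the comparison the panel describes.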
