Proc Natl Acad Sci U S A. 2025 Aug 19;122(33):e2419744122.
doi: 10.1073/pnas.2419744122. Epub 2025 Aug 11.

Targeted deletions of large syntenic regions in Arabidopsis thaliana

Ashot Papikian et al. Proc Natl Acad Sci U S A.

Abstract

Plant genomes have undergone multiple rounds of whole-genome duplication (WGD) throughout their evolutionary history. As a result, many species, including Arabidopsis thaliana, retain duplicated genomic segments, or syntenic regions, which harbor large numbers of paralogous genes preserved from these ancient WGD events. We deleted four large, duplicated blocks, ranging from ~115 kb to ~684 kb using Staphylococcus aureus Cas9 to explore the effects of knocking out these blocks in Arabidopsis. Large deletions like these remain rare, especially in small and gene-dense plant genomes. Deletions were subsequently verified using whole-genome sequencing, which revealed minimal off-target effects. The number of deleted genes ranged from 16 to 60, and transposable elements ranged from 4 to 112 among the four deleted blocks. Two deletion lines showed distinct phenotypes resulting from the loss of many genes, while two others displayed no obvious defects, including for flowering time or hypocotyl elongation. Moreover, RNA-sequencing revealed that expression compensation, where deletions of paralogous genes lead to the upregulation of intact paralogues, was not a general response to the deleted regions under the conditions tested. Thus, it is possible to obtain viable plants when deleting large fragments that may be redundant or that contain nonessential genes. These results demonstrate that large chromosomal deletions can be used as a tool for genome engineering approaches, such as genome minimization in plants and allele replacement using homology-directed repair and other precision editing methods. Targeted deletions of large chromosome fragments will be a valuable tool for research and biotechnology applications.

Keywords: CRISPR–Cas9; deletions; dosage compensation; synteny; whole-genome duplication (WGD).

Conflict of interest statement

Competing interests statement: T.P.M. is a founder of the carbon sequestration company Cquesta.

Figures

Fig. 1.
Arabidopsis syntenic blocks. (A) Scatterplot comparing syntenic block pairs across the Arabidopsis genome, which were defined as having at least five syntenic genes per 20 gene windows. (B) Synonymous substitution (Ks) rate was calculated for syntenic gene pairs and plotted as a function of block one length, which shows that blocks from the γ WGT are generally shorter since they are older than the α/β WGDs. The Ks mode was used to calculate duplication age. The age was then used to infer the polyploidy event from which the syntenic block arose. (C) Comparison of the syntenic block length between the γ WGT and α/β WGDs. (D) The four syntenic blocks across the Arabidopsis genome. Block 324 (blue ribbon), block 268 (cyan ribbon), block 271 (green ribbon), and block 438 (red ribbon); all syntenic blocks (gray ribbons). The constructs that were created to target each block are indicated in parentheses. (E) Comparison of block 438 with its pair (215) as well as one syntenic region in canola and camelina. (F) Heatmap representing the log-transformed (log2(count + 1)) gene counts across orthogroups (rows) and species (columns) for block 438. The color gradient indicates gene counts, where darker colors represent higher gene counts and lighter colors represent lower gene counts. Only orthogroups specific to the block were included.
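The density criterion in panel A (at least five syntenic genes per 20-gene window) can be illustrated with a sliding-window count. This is only a sketch of one possible reading of that criterion, not the authors' pipeline: the function name `dense_windows`, the boolean input encoding, and the choice to slide the window one gene at a time are all assumptions made for illustration.

```python
from typing import List


def dense_windows(is_syntenic: List[bool], window: int = 20, min_hits: int = 5) -> List[int]:
    """Return start indices of gene windows holding at least `min_hits` syntenic genes.

    `is_syntenic[i]` is True when gene i (in chromosomal order) has a syntenic
    partner. The encoding and the one-gene step are assumptions for this sketch.
    """
    starts: List[int] = []
    if len(is_syntenic) < window:
        return starts

    # Count syntenic genes in the first window, then update incrementally.
    hits = sum(is_syntenic[:window])
    if hits >= min_hits:
        starts.append(0)

    for start in range(1, len(is_syntenic) - window + 1):
        # Add the gene entering the window, subtract the gene leaving it.
        hits += is_syntenic[start + window - 1] - is_syntenic[start - 1]
        if hits >= min_hits:
            starts.append(start)
    return starts


if __name__ == "__main__":
    # Hypothetical chromosome: five syntenic genes followed by 20 non-syntenic ones.
    flags = [True] * 5 + [False] * 20
    print(dense_windows(flags))
```

Adjacent qualifying windows would still need to be merged into blocks; that step is omitted here because the caption does not describe it.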
Fig. 2.
Deletion screening and characterization of deletion lines. (A) Schematic depicting the screening process to identify large deletions. Primers flanking gRNA sites were designed to detect whether a small PCR product can be obtained, signifying the presence of a deletion (deletion PCR, primers 1 + 2). Primers were also designed to amplify left and right junctions (primers 1 + 3 and 4 + 2, respectively), as well as to detect whether a wt allele is present in a sample (primers 5 + 6). (B) Gel displaying results from deletion PCR screening of 43 T1 pAP141 (construct targeting syntenic block 271) samples. C = Col-0, L = ladder. (C) Screenshot from the Integrative Genomics Viewer (IGV) showing whole-genome sequencing read alignments within syntenic block 271. Tracks from T2 and T3 samples are displayed. Red arrows represent designed gRNA spacers. (D) Two representative plants from Col-0 and a syntenic block 271 deletion line. (E) One representative Col-0 plant and plants from two independent syntenic block 438 deletion lines are displayed. (F) Top panel displays pooled screening of 12 pools of T2 pAP146 (construct targeting syntenic block 268) samples. The Bottom panel shows deletion PCRs of 36 individual plants from the positive pool. C = Col-0, L = ladder. (G) The first panel shows a comparison of Col-0 plants and a T3 homozygous line. The second panel shows a close-up view of a block 268 deletion line. “+” indicates samples containing the T-DNA, and “−” indicates null segregants.
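The screening logic in panel A can be summarized as a small decision table over two PCR outcomes: the deletion PCR (primers 1 + 2) and the wt-allele PCR (primers 5 + 6). The sketch below is an illustrative interpretation only. The function name `call_genotype` and the output labels are hypothetical, and a sample showing both products could be heterozygous or chimeric, which PCR alone does not distinguish.

```python
def call_genotype(deletion_band: bool, wt_band: bool) -> str:
    """Interpret two PCR outcomes for one sample (labels are illustrative).

    deletion_band: a small product was obtained with primers 1 + 2.
    wt_band:       a product was obtained with primers 5 + 6.
    """
    if deletion_band and wt_band:
        return "deletion and wt alleles present"
    if deletion_band:
        return "deletion only"
    if wt_band:
        return "wt only"
    # Neither reaction gave a product, so the sample cannot be scored.
    return "no product (inconclusive)"


if __name__ == "__main__":
    # Hypothetical gel readings for three samples.
    for sample, bands in {"plant_1": (True, True),
                          "plant_2": (True, False),
                          "Col-0": (False, True)}.items():
        print(sample, "->", call_genotype(*bands))
```

Junction PCRs (primers 1 + 3 and 4 + 2) are left out of this sketch; they would serve as an additional check on the wt configuration at each gRNA site.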
Fig. 3.
Investigation of compensation for deleted genes at the transcriptional level. (A) In Col-0 and the pAP141 and pAP142 deletion lines, total expression of the retained gene members of orthogroups affected by deletions shows that paralogues of deleted genes are generally not significantly affected. t1 = time point 1, t2 = time point 2. non-TOD refers to the separate set of pAP142 samples used for RNA-seq that were not part of the TOD experiment. (B) A network diagram representing a subset of module 8 from a large Arabidopsis coexpression analysis. Small network edges were filtered to a total network density of approximately 2.15 edges per gene. Any gene in the module that was left with no edges after this filtering was dropped from the visualization. This module contains a deleted ribosomal gene, AT5G28060, and is also strongly associated with our list of upregulated genes in that deletion line. (C, Top) A volcano plot showing significance of differential expression vs. change in expression for module 8 genes when comparing pAP142 deletion lines to Col-0. Module 8 genes are much more likely to be upregulated than we expect by chance. (Bottom) The same data as the Top panel, but with the deleted ribosomal gene, AT5G28060, removed to better visualize the other genes. (D) A heatmap representing associations between our gene lists of interest (deleted or DEGs) and coexpression network modules. The color reflects the log fold change between the observed intersection size of the module and gene list and the expected size assuming the two lists are independent.
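The association score in panel D compares the observed overlap between a module and a gene list with the overlap expected if the two were independent. A minimal sketch of that calculation follows. The base-2 logarithm, the handling of empty overlaps, the function name `log2_enrichment`, and the gene identifiers in the usage example are assumptions, since the caption does not specify them.

```python
import math
from typing import Set


def log2_enrichment(module: Set[str], gene_list: Set[str], universe_size: int) -> float:
    """Log2 ratio of observed to expected overlap under independence.

    expected = |module| * |gene_list| / universe_size
    Base 2 and the -inf result for an empty overlap are assumptions.
    """
    observed = len(module & gene_list)
    expected = len(module) * len(gene_list) / universe_size
    if expected == 0:
        raise ValueError("module and gene list must both be non-empty")
    if observed == 0:
        return float("-inf")
    return math.log2(observed / expected)


if __name__ == "__main__":
    # Hypothetical identifiers: 4-gene module, 4-gene list, 16 genes in total.
    module = {"g1", "g2", "g3", "g4"}
    degs = {"g1", "g2", "g9", "g10"}
    print(log2_enrichment(module, degs, universe_size=16))  # observed 2, expected 1
```

A positive value indicates more overlap than expected by chance and a negative value indicates less. Assessing significance would need a separate test, for example a hypergeometric test, which is not shown here.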
Fig. 4.
Generation of deletion lines (deletions of retained duplicate regions or nonlethal genomic segments) and their applications. Libraries of large deletion lines in plants (analogous to a T-DNA population) can be generated, such as in A. thaliana or crop species, and screened for traits of interest, subsequently leading to the characterization of genes with unknown functions. Furthermore, novel pathways and networks can be discovered through our approach and further elucidate the significance of duplicated regions that have been retained. Deleted genomic segments within various lines can also serve as landing pads for precision engineering methods (e.g., Cas9-mediated HDR or other precision methods). Deletion of syntenic regions and/or nonessential segments of genomes provides a path toward eukaryotic genome minimization. By integrating genomics data (such as single-nuclei RNA-seq) and phenotypic information from a library of deletion lines, AI/machine learning pipelines can be utilized to predict outcomes resulting from new deletions and to predict functions of unknown genes and uncharacterized noncoding genomic regions. Figure created with BioRender.com.

