Gene Ther. 2025 Oct;32(5):475-486.
doi: 10.1038/s41434-025-00548-3. Epub 2025 Jul 8.

hafoe: an interactive tool for the analysis of chimeric AAV libraries after random mutagenesis

Tatevik Jalatyan et al. Gene Ther. 2025 Oct.

Abstract

Naturally occurring adeno-associated viruses (AAVs) are an integral part of gene therapy, yet engineering novel AAV variants is necessary to expand targetable tissues and treatable diseases. Directed evolution, particularly through DNA shuffling of the capsid genes of wild-type AAV serotypes, is a widely employed strategy to generate novel chimeric variants with desired properties. Yet, the computational analysis of such chimeric sequences presents challenges. We introduce hafoe, a novel computational tool designed for the exploratory analysis of chimeric AAV libraries, which does not require extensive bioinformatics expertise. hafoe accurately deciphers the serotype composition and enrichment patterns of chimeric AAV variants across different tissues. Validation against synthetic datasets demonstrates that hafoe identifies parental serotype compositions with an accuracy of 96.3% to 97.5%. Additionally, we engineered chimeric AAV capsid libraries and screened novel AAV variants for tropism to human dermal fibroblasts and dendritic cells, as well as canine muscle and liver tissues. Using hafoe, we identified and characterized enriched AAV variants in these tissues for potential use in gene therapy and vaccine development. Overall, hafoe can provide valuable insights that may further support the rational design of AAV vectors based on parental serotype and sequence preferences of the capsid genes in target tissues.

Conflict of interest statement

Competing interests: Harvard University has filed for patent protection on DC and HDF-enriched AAV variants described herein, and EA, DM, and GMC are named as co-inventors on the patent. At the time of study, EA performed consulting services, and RH and ND were employees and held equity in Rejuvenate Bio. DM and DBT are full-time employees and hold equity in Medici Therapeutics, Inc. For a complete list of GMC's financial interests, see http://arep.med.harvard.edu/gmc/tech.html. The remaining authors declare no competing interests.

Ethical approval: All procedures involving Beagle dogs in this study were conducted in accordance with the USDA Animal Welfare Act (9 CFR, Parts 1, 2, and 3) and the Guide for Care and Use of Laboratory Animals (ILAR publication, 2011). The protocols were approved by the institutional ethical review board at Absorption Systems California, LLC (ASC), and conducted under the authority of the Project License issued by ASC’s compliance office. The animals were housed in compliance with ethical standards, provided with a certified laboratory diet (5007 laboratory canine diet from LabDiet), and had access to water ad libitum. The study followed the Standard Operating Procedures established at ASC, ensuring a high standard of care and scientific integrity throughout the experiment.

Figures

Fig. 1. Schematic overview of experimental design and workflow of hafoe.
A Experimental design for generating chimeric AAV capsid library from parental AAV serotypes, followed by enrichment in DCs and HDFs, total RNA extraction, and PacBio sequencing. B hafoe takes as input parental AAV serotypes and libraries of chimeric variants before and/or after enrichment in target tissues. hafoe produces interactive graphics representing the identified clusters of similar chimeric vectors, showing the prevalence of each wild-type vector and serotype composition of chimeric representatives. It also shows the enrichment of representative sequences in target tissues and the tissue-specificity of each vector.
Fig. 2. hafoe’s neighbor-aware serotype identification algorithm.
A Chimeric library cluster size distribution. hafoe clusters the chimeric library sequences based on sequence identity and uses the representative variants of the clusters in subsequent analysis. B To identify the variant composition in terms of the AAV serotypes, hafoe first performs variant decomposition: chopping the variant into overlapping fragments of fixed size, aligning the fragments against AAV serotypes, and storing the alignment results in a list. C hafoe then performs neighbor-aware serotype identification. For each position covered by fragments with multiple assignments: Step 1) perform quality filtering (see Methods); Step 2) identify the set of serotypes shared with its left and right neighbors, if those exist (left_intersect and right_intersect), and update the position accordingly; Step 3) annotate the unresolved positions as multimappers (e.g. with 17).
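The decomposition and neighbor-aware steps in this caption can be illustrated with a minimal Python sketch. It is not hafoe's implementation: the function names, the `step` parameter, the "multimapper" and "unassigned" labels, and in particular the update rule (prefer serotypes shared with both neighbors, then the left neighbor, then the right) are assumptions made for illustration. Fragment alignment and the quality filtering of Step 1 are left out.

```python
from typing import List, Set, Tuple


def decompose(variant: str, frag_len: int, step: int) -> List[Tuple[int, str]]:
    """Chop a variant into fixed-size fragments, returned as (start, fragment).

    Fragments overlap whenever step < frag_len (assumed parameterisation).
    """
    if frag_len <= 0 or step <= 0:
        raise ValueError("frag_len and step must be positive")
    last_start = max(len(variant) - frag_len, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # extra fragment so the 3' end is covered
    return [(s, variant[s:s + frag_len]) for s in starts]


def resolve_with_neighbors(assignments: List[Set[str]]) -> List[Set[str]]:
    """Narrow ambiguous positions using serotypes shared with their neighbors.

    Assumed rule: one left-to-right sweep; keep the serotypes shared with both
    neighbors if any, otherwise with the left one, otherwise with the right one.
    """
    resolved = [set(a) for a in assignments]
    for i, cands in enumerate(resolved):
        if len(cands) <= 1:
            continue  # nothing to disambiguate
        left_intersect = cands & resolved[i - 1] if i > 0 else set()
        right_intersect = cands & resolved[i + 1] if i + 1 < len(resolved) else set()
        narrowed = (left_intersect & right_intersect) or left_intersect or right_intersect
        if narrowed:
            resolved[i] = narrowed
    return resolved


def label_positions(assignments: List[Set[str]]) -> List[str]:
    """Turn candidate sets into one label per position."""
    labels = []
    for cands in resolve_with_neighbors(assignments):
        if len(cands) == 1:
            labels.append(next(iter(cands)))
        elif not cands:
            labels.append("unassigned")   # hypothetical label
        else:
            labels.append("multimapper")  # still ambiguous after the sweep
    return labels


example = [{"AAV2"}, {"AAV2", "AAV3"}, {"AAV2"}, {"AAV5", "AAV6"}, {"AAV9"}]
print(label_positions(example))
```

In the example, the second position is ambiguous between AAV2 and AAV3 but both neighbors carry AAV2, so it is resolved; the fourth position shares nothing with either neighbor and stays a multimapper.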
Fig. 3. Comparison of true and predicted compositions of chimeric library representative variants in synthetic data.
Results for one of the five synthetic datasets are presented here (data for the rest not shown). Position resolved abundance of parental AAV serotypes in the representative variants based on true composition labels stored while generating the data (A) and based on the serotype assessment labels obtained by the neighbor-aware serotype identification method of hafoe (B). The abundances were averaged over 100 nt windows. True (C) and predicted (D) parental AAV serotype compositions of the representative variants. Multiple sequence alignment (MSA) of the representatives was performed to align the homology regions of the representatives. Gaps in MSA are colored white, unresolved positions are colored black. Conservation levels of the representative variants are displayed in the lower bar of the heatmap, with positions from higher to lower conservation scores represented on a light-to-dark scale.
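One plausible way to score agreement between true and predicted compositions is the fraction of positions whose predicted serotype label matches the true label. The sketch below assumes that per-position definition, and the function name and the "gap" label are hypothetical; the accuracy metric actually used in the paper may be defined differently.

```python
from typing import Iterable, Sequence


def composition_accuracy(
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
    ignore: Iterable[str] = ("gap",),
) -> float:
    """Fraction of scored positions where the predicted label equals the true one.

    Positions whose true label is listed in `ignore` are skipped (assumption).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    skip = set(ignore)
    pairs = [(t, p) for t, p in zip(true_labels, predicted_labels) if t not in skip]
    if not pairs:
        raise ValueError("no positions left to score")
    return sum(t == p for t, p in pairs) / len(pairs)


true = ["AAV2", "AAV2", "AAV9", "AAV9"]
pred = ["AAV2", "AAV2", "multimapper", "AAV9"]
print(composition_accuracy(true, pred))
```

Under this definition an unresolved (multimapper) position counts as a miss, which is a conservative choice.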
Fig. 4. Comprehensive analysis of AAV chimeric library by hafoe.
A Chimeric library cluster size distribution for the top 20 clusters. B Parental AAV serotype abundance in the chimeric library. C Compositions of the top 20 cluster representatives in terms of parental AAV serotypes. MSA of the representatives was performed to align the homology regions of the representatives. Gaps in MSA are colored white, unresolved positions are colored black, and the positions with no identified serotypes are colored gray. Conservation levels of the representative variants are displayed in the lower bar of the heatmap, with positions from higher to lower conservation scores represented on a light-to-dark scale. The variable regions (VR) I to IX of AAV2 [12] and the remaining conservative regions (CR) are indicated below the heatmap. D Position resolved abundance of parental AAV serotypes in the top 20 cluster representatives. The abundances were averaged over 100 nt windows.
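The window averaging mentioned for the position-resolved abundance panels can be sketched as follows. This is an illustration under stated assumptions, not hafoe's code: the representatives are taken to be aligned label lists of equal length, windows are assumed to be non-overlapping, and the function name is hypothetical.

```python
from typing import List, Sequence


def windowed_abundance(
    labels_per_variant: Sequence[Sequence[str]],
    serotype: str,
    window: int = 100,
) -> List[float]:
    """Average the per-position fraction of variants carrying `serotype`
    over consecutive non-overlapping windows (the last one may be shorter)."""
    if not labels_per_variant:
        raise ValueError("need at least one variant")
    if window <= 0:
        raise ValueError("window must be positive")
    n_pos = len(labels_per_variant[0])
    if any(len(v) != n_pos for v in labels_per_variant):
        raise ValueError("all variants must have the same aligned length")
    n_var = len(labels_per_variant)
    # Fraction of variants labelled with the serotype at each position.
    per_pos = [
        sum(v[p] == serotype for v in labels_per_variant) / n_var
        for p in range(n_pos)
    ]
    averaged = []
    for start in range(0, n_pos, window):
        chunk = per_pos[start:start + window]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


labels = [
    ["AAV2", "AAV2", "AAV9", "AAV9"],
    ["AAV2", "AAV9", "AAV9", "AAV9"],
]
print(windowed_abundance(labels, "AAV2", window=2))
```

A window of 2 is used in the example only to keep the input small; the captions state 100 nt.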
Fig. 5. Enrichment profiles of representative variants in HDFs and DCs with a log2 fold change > 1 over the chimeric library in either cell type.
A Log2 normalized counts of representative variants in chimeric and enriched libraries. B Log2 fold change of the representative variants in enriched libraries over the chimeric library. The tissue specificity of the variants is color-coded by underlines beneath their names. C-E Compositions of the DC- and HDF-specific (C), HDF-specific (D), and DC-specific (E) representative variants in terms of parental AAV serotypes. The variable regions (VR) I to IX of AAV2 [12] and the remaining conservative regions (CR) are indicated below the heatmap.
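The log2 fold change of a variant in an enriched library over the chimeric library can be sketched as below. The normalization is an assumption: counts are scaled to counts per million of the library total and a pseudocount is added to avoid taking the log of zero. The function names, the pseudocount, and the scale factor are hypothetical, and hafoe's own normalization may differ; only the > 1 threshold comes from the caption.

```python
import math


def log2_fold_change(
    enriched_count: int,
    enriched_total: int,
    library_count: int,
    library_total: int,
    pseudocount: float = 1.0,
    scale: float = 1e6,
) -> float:
    """log2 of the normalized enriched count over the normalized library count."""
    if enriched_total <= 0 or library_total <= 0:
        raise ValueError("library totals must be positive")
    enriched_norm = enriched_count * scale / enriched_total
    library_norm = library_count * scale / library_total
    return math.log2((enriched_norm + pseudocount) / (library_norm + pseudocount))


def is_enriched(lfc: float, threshold: float = 1.0) -> bool:
    """Caption criterion: log2 fold change greater than 1."""
    return lfc > threshold


# A variant seen 40 times per 1000 reads after enrichment and 10 times per
# 1000 reads in the chimeric library is 4-fold enriched.
lfc = log2_fold_change(40, 1000, 10, 1000, pseudocount=0.0)
print(lfc, is_enriched(lfc))
```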
Fig. 6. Position resolved relative abundance or probability of parental AAV serotypes in identified tissue-specific variants and previously reported AAV vectors.
DC- and/or HDF-specific variants (A) and canine muscle- and/or liver-specific variants (B) identified by hafoe. Human liver- and islet-tropic variants from the literature (C). The abundances were averaged over 100 nt windows.
