mBio. 2023 Feb 28;14(1):e0000923.
doi: 10.1128/mbio.00009-23. Epub 2023 Feb 6.

A Virus-Packageable CRISPR System Identifies Host Dependency Factors Co-Opted by Multiple HIV-1 Strains

Vanessa R Montoya et al. mBio.

Abstract

At each stage of the HIV life cycle, host cellular proteins are hijacked by the virus to establish and enhance infection. We adapted the virus-packageable HIV-CRISPR screening technology at a genome-wide scale to comprehensively identify host factors that affect HIV replication in a human T cell line. Using a smaller, targeted HIV Dependency Factor (HIVDEP) sublibrary, we then performed screens across HIV strains representing different clades and with different biological properties to define which T cell host factors are important across multiple HIV strains. Nearly 90% of the genes selected across various host pathways validated in subsequent assays as bona fide host dependency factors, including numerous proteins not previously reported to play roles in HIV biology, such as UBE2M, MBNL1, FBXW7, PELP1, SLC39A7, and others. Our ranked list of screen hits across diverse HIV-1 strains forms a resource of HIV dependency factors for future investigation of host proteins involved in HIV biology.

IMPORTANCE With a small genome of ~9.2 kb that encodes 14 major proteins, HIV must hijack host cellular machinery to successfully establish infection. The host proteins necessary for HIV replication are called "dependency factors." Whole-genome and then targeted screens were performed to identify, as comprehensively as possible, the dependency factors acting throughout the HIV replication cycle. Many host processes were identified and validated as critical for HIV replication across multiple HIV strains.

Keywords: CRISPR screen; T cells; dependency factor; human immunodeficiency virus; transcription factors; virus replication.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIG 1
Genome-wide HIV-CRISPR screening to identify dependency factor candidates for the HIV Dependency Factor guide library. (A) HIV-CRISPR screening process. Jurkat-CCR5 cells containing a CRISPR knockout library are infected with HIV-1 in duplicate infections. Strains used in this study are listed above the arrow. Viral RNA and genomic DNA were collected 3 days postinfection, and sequences corresponding to sgRNAs present in virions (vRNA) and genomic DNA (gDNA) were quantified through deep sequencing. (B) MAGeCK gene analysis of the genome-wide screen showing the most depleted (dependency) genes in Jurkat T cells infected with HIV-1LAI in duplicate infections. The x axis shows randomly arrayed target genes. The y axis shows the -log10 MAGeCK score for each gene. Host factors previously reported as dependency factors are shown in pink. Other genes in the library are shown in purple. “Synthetic” nontargeting control (NTC) genes, generated in silico by iterative random binning of the 142 NTC sgRNA sequences to produce a negative-control set, are shown in white. Gene names are shown for all hits with a -log10 MAGeCK score greater than 6, and the entire list is in Table S1. The top 368 most depleted candidate genes (excluding the synthetic NTC genes) are indicated by the dashed black line. (C) Gene Set Enrichment Analysis (GSEA) of the genome-wide screen. The top 20 most enriched Negative Gene Ontologies (most enriched pathways of the depleted/dependency factor candidate genes) are shown here in ranked order by inverse Normalized Enrichment Score (NES). Adjusted P values are displayed next to each gene ontology.
FIG 2
Iterative screening with the HIVDEP sublibrary enriches previously reported and candidate dependency factors. (A) MAGeCK score comparison of each CRISPR screen. Nontargeting control (NTC) sgRNA scores were randomly binned (four NTC guides per gene for TKOv3 or eight NTC guides per gene for HIVDEP) to recapitulate the same number of genes in the respective libraries. The y axis shows -log10 MAGeCK scores. The mean MAGeCK scores for the synthetic NTC versus the genes are shown above each graph and are represented by the line within each box. The top 95th percentile of NTC/genes is represented as the top horizontal line. For statistical analyses, the MAGeCK scores of the synthetic nontargeting controls (shown here as NTCs) were compared to the MAGeCK scores of the genes in each screen. The gene MAGeCK scores for each screen were also compared across screens. Each comparison resulted in significant P values (****, P < 0.0001; Welch’s t test). Waterfall plots of the top 20 genes or all genes in each HIVDEP screen, in descending order, are shown for the following HIV-1 strains: (B) LAI, (C) LAIredo, (D) LAI-VSV-G, (E) Q23BG505, and (F) CH470TF.
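The random binning of NTC guides into synthetic control "genes" described in the captions of Fig. 1 and 2 can be sketched as follows. This is a minimal illustration only: the guide identifiers, the function name, and the choice to reuse guides across synthetic genes (while sampling without replacement within each gene) are assumptions, not details stated in the source.

```python
import random


def bin_ntc_guides(guide_ids, guides_per_gene, n_genes, seed=0):
    """Build synthetic NTC 'genes' by random binning of nontargeting guides.

    Assumption: guides are unique within a synthetic gene but may be reused
    across synthetic genes, so that a small NTC pool can yield as many
    control genes as the library has real genes.
    """
    rng = random.Random(seed)  # fixed seed so the binning is reproducible
    return {
        f"NTC_synth_{i + 1}": rng.sample(guide_ids, guides_per_gene)
        for i in range(n_genes)
    }


# 142 NTC sgRNAs, as in the Fig. 1 caption; the IDs themselves are hypothetical.
ntc_guides = [f"NTC_sg{i:03d}" for i in range(1, 143)]

# Eight guides per synthetic gene, as stated for HIVDEP; five genes for brevity.
synthetic = bin_ntc_guides(ntc_guides, guides_per_gene=8, n_genes=5)
```

Each synthetic gene can then be scored like a real gene, giving a null distribution against which the scores of targeted genes are compared.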
FIG 3
Common and differential use of host cellular pathways by HIV-1 strains. Comparative pathway-focused heatmaps showing enriched or depleted sgRNAs across each HIVDEP screen. The pathways shown are derived from the top 20 most enriched Negative Gene Ontologies of the genome-wide screen (Fig. 1C). Any genes not included in the HIVDEP library were excluded. The z scores were calculated as previously described (21). Z scores on each heatmap are colored from red (lowest/most depleted genes, i.e., dependency factors) to blue (highest/most enriched genes, i.e., negative or restriction factors). The median NTC z score was 0.5 and marks the inflection in the color scale. (A to G) Each biological replicate is represented as a separate column showing the mean scores across wild-type (non-VSV-G pseudotyped) strains. The “Transcription” and “Other” heatmaps were truncated to the top 40 hits in each heatmap. Gene names in bold with an asterisk were chosen for validation studies in Fig. 4 and 5. “Other” shows genes that were not assigned to any of the top Gene Ontology categories from Fig. 1C.
FIG 4
Validation of curated top hits list. (A) Heatmap of the candidate genes used for validation studies, ordered by the mean z score of WT strains. (B) Pooled knockout Jurkat-CCR5 cells were generated by transducing with lentiviral vectors encoding sgRNAs, including positive-control gene CD4, negative controls CD19 or AAVS1, or candidate dependency factor genes. Two knockout lines per gene were generated using the highest scoring sgRNAs selected from across each HIVDEP screen. EIF1 KO cells were not used in the infection assays because of poor viability. Viral supernatants were collected at days 0, 3, 5, and 7 to assess the overall effect on replication kinetics via reverse transcriptase activity at each time point. The spreading infections were performed over two batches. The y axis shows reverse transcriptase milliUnits/mL. Batch 1 is shown in panel B and batch 2 is shown in Fig. S2. (C and D) Area under the curve (AUC) was calculated for each cell line after 7 days of infection in batch 1 (C) and batch 2 (D) with either LAI or Q23BG505. Infection data for each guide are shown separately. For statistical analysis, all conditions are compared to the mean of the control cell lines (CD19 and AAVS1). One-way ANOVA; Tukey’s multiple-comparisons test: ns, P > 0.05; *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001. For each knockout line, Synthego ICE analysis was performed and knockout scores are displayed as pie charts in line with the corresponding gene target. ND, could not be determined.
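The AUC summary in panels C and D reduces a replication time course to a single number per cell line. The caption does not say how the area was computed. The sketch below assumes the trapezoidal rule, a common choice, applied to the sampling days given in the caption. The reverse transcriptase values are made up for illustration and are not data from the study.

```python
def trapezoid_auc(times, values):
    """Area under a time course by the trapezoidal rule (assumed method)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two or more paired time points and values")
    points = list(zip(times, values))
    # Sum the area of the trapezoid between each pair of adjacent time points.
    return sum(
        (t1 - t0) * (v0 + v1) / 2
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )


days = [0, 3, 5, 7]                # sampling days from the caption
rt_mu_per_ml = [0.0, 10.0, 40.0, 60.0]  # hypothetical RT activity (mU/mL)

auc = trapezoid_auc(days, rt_mu_per_ml)
print(auc)  # 165.0
```

A knockout that slows spreading infection lowers the curve at later time points and therefore gives a smaller AUC than the control lines.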
FIG 5
Correlation of gene z scores with area under the curve (AUC). (A) AUC was compared to the inverse z score of either LAI (left) or Q23BG505 (right). For statistical analysis, the mean inverse z score across biological replicates for each gene, for either LAI or Q23BG505, was compared to the mean AUC for infection of both guide knockouts per gene from Fig. 4. Simple linear regression: LAI z score versus AUC (R² = 0.39, P = 0.001); Q23BG505 z score versus AUC (R² = 0.41, P = 0.001). (B) Same as panel A, but the mean z score for all wild-type strains (Table S3) was used rather than the strain-specific z score. Mean WT z score versus LAI AUC (R² = 0.30, P = 0.007); mean WT inverse z score versus Q23BG505 AUC (R² = 0.46, P = 0.0004). Squares and top line represent LAI; circles and lower line represent Q23BG505.
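The simple linear regression reported in this caption relates one screen score per gene to one AUC per gene. As a rough sketch of how the fitted line and R² are obtained by ordinary least squares, consider the following. The function name and the input values are illustrative assumptions, and the P value calculation is omitted.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical per-gene values (not from the study): score vs. AUC.
scores = [1.0, 2.0, 3.0, 4.0]
aucs = [2.0, 4.0, 6.0, 8.0]

slope, intercept, r_squared = linear_fit(scores, aucs)
```

An R² of 0.39 to 0.46, as reported in the caption, means that the screen score accounts for somewhat less than half of the variance in AUC across the validated genes.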
FIG 6
Entry-specific host factors. (A) Heatmap of the top 10 most depleted sgRNAs based on the mean of the wild-type strains, but not for VSV-G-pseudotyped HIV-1. (B) Heatmap of the top 10 most depleted sgRNAs for VSV-G-pseudotyped HIV-1, but not for HIV-1 with WT HIV. The matrix for determination of panels A and B is in Fig. S3, with the heatmap z score values in Table S3. (C) Jurkat-CCR5 cell pools edited for gene targets of interest were created by transducing wild-type Jurkat-CCR5 cells with lentiCRISPRv2 sgRNA constructs using sgRNAs for CD19 (B-cell marker used as a negative control), PSIP1 encoding p75/LEDGF (positive control), and genes of interest (UBE2M, ZBTB7A, KMT2D, SBDS, and OTUD5), and selected with puromycin for at least 10 days. Knockout pools were infected with HIV-1LAI or VSV-G-pseudotyped HIV-1, both of which encode luciferase in place of the nef gene. Luciferase expression in infected WT or knockout cells was quantified 2 days postinfection. All infections were done in triplicate using two pools of knockouts with one or two different sgRNA guides per gene. Data for the two different guides are shown as black circles (sg1) and open squares (sg2) for each pooled knockout cell line. The mean percentage of luciferase activity of all replicates relative to the control cells is displayed on each bar. (D) Surface CD4 expression of WT or KO Jurkat-ZsGreen/CCR5 cells was quantified using flow cytometry for CD4-APC. Mean fluorescence intensity (MFI) for each cell line is shown in each respective row. Two different knockout pools corresponding to two different guides per gene (except for CD19) are shown.

References

    1. Konig R, Zhou Y, Elleder D, Diamond TL, Bonamy GM, Irelan JT, Chiang CY, Tu BP, De Jesus PD, Lilley CE, Seidel S, Opaluch AM, Caldwell JS, Weitzman MD, Kuhen KL, Bandyopadhyay S, Ideker T, Orth AP, Miraglia LJ, Bushman FD, Young JA, Chanda SK. 2008. Global analysis of host-pathogen interactions that regulate early-stage HIV-1 replication. Cell 135:49–60. doi:10.1016/j.cell.2008.07.032.
    2. Zhou H, Xu M, Huang Q, Gates AT, Zhang XD, Castle JC, Stec E, Ferrer M, Strulovici B, Hazuda DJ, Espeseth AS. 2008. Genome-scale RNAi screen for host factors required for HIV replication. Cell Host Microbe 4:495–504. doi:10.1016/j.chom.2008.10.004.
    3. Brass AL, Dykxhoorn DM, Benita Y, Yan N, Engelman A, Xavier RJ, Lieberman J, Elledge SJ. 2008. Identification of host proteins required for HIV infection through a functional genomic screen. Science 319:921–926. doi:10.1126/science.1152725.
    4. Yeung ML, Houzet L, Yedavalli VS, Jeang KT. 2009. A genome-wide short hairpin RNA screening of jurkat T-cells for human proteins contributing to productive HIV-1 replication. J Biol Chem 284:19463–19473. doi:10.1074/jbc.M109.010033.
    5. Park RJ, Wang T, Koundakjian D, Hultquist JF, Lamothe-Molina P, Monel B, Schumann K, Yu H, Krupzcak KM, Garcia-Beltran W, Piechocka-Trocha A, Krogan NJ, Marson A, Sabatini DM, Lander ES, Hacohen N, Walker BD. 2017. A genome-wide CRISPR screen identifies a restricted set of HIV host dependency factors. Nat Genet 49:193–203. doi:10.1038/ng.3741.
