Gigascience. 2024 Jan 2;13:giae109.
doi: 10.1093/gigascience/giae109.

Diaci v3.0: chromosome-level assembly, de novo transcriptome, and manual annotation of Diaphorina citri, insect vector of Huanglongbing

Teresa D Shippy et al. Gigascience.

Abstract

Background: Diaphorina citri is an insect vector of "Candidatus Liberibacter asiaticus" (CLas), the gram-negative bacterial pathogen associated with citrus greening disease. Control measures rely on pesticides with negative impacts on the environment, natural ecosystems, and human and animal health. In contrast, gene-targeting methods have the potential to specifically target the vector species and/or reduce pathogen transmission.

Results: To improve the genomic resources needed for targeted pest control, we assembled a D. citri genome based on PacBio long reads followed by proximity ligation-based scaffolding. The 474-Mb genome has 13 chromosomal-length scaffolds. In total, 1,036 genes were manually curated as part of a community annotation project, composed primarily of undergraduate students. We also computationally identified a total of 1,015 putative transcription factors (TFs) and were able to infer motifs for 337 TFs (33%). In addition, we produced a genome-independent transcriptome and genomes for D. citri endosymbionts.

Conclusions: Manual annotation provided more accurate gene models for use by researchers and provided an excellent training opportunity for students from multiple institutions. All resources are available on CitrusGreening.org and NCBI. The chromosomal-length D. citri genome assembly serves as a blueprint for the development of collaborative genomics projects for other medically and agriculturally significant insect vectors.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1:
Graphical abstract of the major findings from a multiyear, multi-institutional Diaphorina citri genome assembly project. The most recent version (3.0) of the D. citri genome assembly (Diaci v3.0) is available on CitrusGreening.org and NCBI, along with its Official Gene Set v3, de novo transcriptome, and extensive manual annotations covering major pathways and hundreds of genes. We have also predicted transcription factors and protein-coding genes and compared Diaci v3.0 to the Pachypsylla venusta psyllid genome, as well as other D. citri genome assemblies recently published. Lastly, during our genome assembly, we also created draft genomes for multiple D. citri endosymbionts. Our work took place over 8 years, with participation from multiple institutions and dozens of students, graduate students, and faculty.
Figure 2:
Parallel plot showing synteny between Diaphorina citri Diaci v3.0 and Pachypsylla venusta genome assemblies. D. citri chromosomal-length scaffolds from Diaci v3.0 are numbered in order of size (dc1–dc13) and make up the top row in the parallel plot. P. venusta (the hackberry petiole gall psyllid) chromosomal-length scaffolds are numbered as previously described (pv1–pv11) [35]. The P. venusta scaffolds marked with arrows (pv2, pv3, pv4, pv6, and pv9) have been reversed to optimize synteny to Diaci v3.0. We see that dc8 is likely the D. citri X chromosome, while pv3 appears to be a fusion of sequences homologous to dc12 and dc13.
Figure 3:
Parallel plot of 2 chromosomal-length Diaphorina citri genomes: Diaci v3.0 (dc) and CRF-California (dcca). The 474-Mb D. citri Diaci v3.0 Florida genome is about 200 Mb larger than the CRF-California (282.67 Mb) genome from Carlson et al. [22]. All Diaci v3.0 chromosomal-length scaffolds (dc) are larger than the corresponding scaffolds of the CRF-California genome (dcca), suggesting that the additional sequence in the Diaci v3.0 assembly is distributed throughout the genome. Lines connecting scaffolds indicate syntenic blocks of genes. Scaffold sizes are shown in megabases (Mb). Gray ovals with arrows indicate CRF-California scaffolds that have been reversed to match the orientation of the corresponding Diaci v3.0 scaffold.
Figure 4:
Comparison of transcription factors (TFs) across insects and response to infection in Diaphorina citri. (A) The top 50 families of TFs (1 shown per row) identified in D. citri (column 1) and in other insects (columns 2–9), where red is higher numbers of inferred TF motifs, and blue is no data. (B) Weighted gene coexpression network analysis (based on FPKM values) and TF binding enrichment analysis identified specific modules of gene targets associated with “Candidatus Liberibacter asiaticus” (CLas) infection. Associations between modules are shown as colors on the y-axis of coexpressed genes. CLas_pos are all CLas-positive (infected) and CLas_neg are all CLas-negative (uninfected) samples. Samples denoted as C_medica_CLas_pos are CLas-infected D. citri raised on Citrus medica plants, and those denoted as C_medica_CLas_neg are healthy, noninfected, and raised on healthy Citrus medica plants. Lastly, C_medica_CLas_pos_Gut and C_medica_CLas_neg_Gut are the same as above, but data come only from guts of D. citri in both treatments. Red indicates higher correlation between module and CLas treatment, with a black line indicating significant correlation at P < 0.05. (C) Expression patterns of 4 TFs of interest show increased enrichment in binding sites for genes with differential expression during CLas infection. Increased expression is shown as lighter yellow and decreased expression as dark blue. Specific genes listed include Dcitr02g06840.1.1 (Mothers against dpp, Mad), Dcitr08g10720.1.1 (ortholog of ventral nervous system defective, vnd), Dcitr13g01090.1.1 (transcription factor Myb), and Dcitr04g11590.1.2 (Suppressor of Hairless, Su(H)). Aver_pos and Aver_neg are averages of all 4 CLas-positive gut samples and all 4 CLas-negative gut samples. The averages show altered TF binding prediction and differential expression within the gut. 
Transcript levels in (A), (B), and (C) are reported from the Psyllid Expression Network [1] using Diaci v3.0 and the de novo transcriptome.
Figure 5:
A visual representation of methods and pipelines for assembly, annotation, orthology, synteny, prediction, and transcriptomics. (A) The Diaphorina citri Diaci v3.0 genome assembly began with input of long-read DNA sequencing data, followed by multiple rounds of assembly and scaffolding, duplication reduction, error correction, and exclusion of nontarget organism reads (see C). (B) Iso-sequencing used long-read RNA sequencing data and an established computational pipeline (SMRTLink) to assemble gene isoforms independent of any genome. Additionally, a de novo D. citri transcriptome was generated using public short-read RNA sequencing data collected from online repositories combined with the high-quality Iso-seq isoforms. The transcriptome was assembled using Trinity, then cleaned and error corrected, subjected to BLAST to remove contaminants, and clustered by locus to remove redundant isoforms. (C) Genome assemblies of the D. citri endosymbionts, including “Candidatus Profftella armatura,” “Candidatus Carsonella ruddii,” and Wolbachia, were generated using reads excluded during the Diaci v3.0 assembly process, then cleaned and verified using Orthofinder and BLAST. (D) We performed synteny analysis between the Diaci v3.0 genome and other full-length psyllid genome assemblies. (E) Manual curation was a major part of this genome project and involved teams of student annotators, graduate students, and faculty across multiple institutions. We have previously published the annotation pipeline used [23]. (F) We compared the predicted proteins of 12 different insect species to generate orthogroups and assign GO terms to D. citri genes. (G) Transcription factor (TF) prediction was performed following previously published pipelines, and weighted gene coexpression network analysis (WGCNA) and HOMER were used to explore the effect of CLas infection on TFs and their target genes. 
(H) Protein-coding genes were predicted and annotated following the MAKER annotation pipeline, and results were informed using Mikado and RNA-seq datasets.
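
As a quick sanity check on the summary numbers quoted in the abstract and the Figure 3 caption, the short Python sketch below recomputes the share of predicted TFs with an inferred motif and the size gap between the two assemblies. Only the four input values come from the text; the variable and function names are illustrative assumptions and are not part of the published analysis.

```python
# Values quoted in the abstract and the Figure 3 caption.
TOTAL_TFS = 1015             # putative transcription factors identified
TFS_WITH_MOTIFS = 337        # TFs for which a motif could be inferred
DIACI_V3_MB = 474.0          # Diaci v3.0 assembly size, in megabases
CRF_CALIFORNIA_MB = 282.67   # CRF-California assembly size, in megabases


def percent(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


motif_pct = percent(TFS_WITH_MOTIFS, TOTAL_TFS)
size_diff_mb = DIACI_V3_MB - CRF_CALIFORNIA_MB
size_ratio = DIACI_V3_MB / CRF_CALIFORNIA_MB

print(f"TFs with inferred motifs: {motif_pct:.1f}%")
print(f"Assembly size difference: {size_diff_mb:.2f} Mb")
print(f"Diaci v3.0 / CRF-California size ratio: {size_ratio:.2f}")
```

Under these inputs the motif share rounds to the reported 33%, and the size difference is roughly 191 Mb, which is consistent with the caption's "about 200 Mb".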

References

    1. Flores-Gonzalez M, Hosmani PS, Fernandez-Pozo N, et al. Citrusgreening.Org: an open access and integrated systems biology portal for the Huanglongbing (HLB) disease complex. bioRxiv. 2019. doi: 10.1101/868364.
    2. Singerman A, Useche P. Impact of citrus greening on citrus operations in Florida. https://edis.ifas.ufl.edu/publication/FE983. Accessed 28 November 2022.
    3. Lee JA, Halbert SE, Dawson WO, et al. Asymptomatic spread of Huanglongbing and implications for disease control. Proc Natl Acad Sci USA. 2015;112:7605–10. doi: 10.1073/pnas.1508253112.
    4. Ammar E-D, Hall DG, Hosseinzadeh S, et al. The quest for a non-vector psyllid: natural variation in acquisition and transmission of the huanglongbing pathogen “Candidatus Liberibacter asiaticus” by Asian citrus psyllid isofemale lines. PLoS One. 2018;13:e0195804. doi: 10.1371/journal.pone.0195804.
    5. El-Desouky A, Shatters RG Jr, Heck M. Huanglongbing pathogens: acquisition, transmission and vector interactions. In: Qureshi JA, Stansly PA, eds. Asian Citrus Psyllid: Biology, Ecology and Management of the Huanglongbing Vector. 2020:113–39. doi: 10.1079/9781786394088.0113.
