2024 Jun 12;3(4):e211.
doi: 10.1002/imt2.211. eCollection 2024 Aug.

JCVI: A versatile toolkit for comparative genomics analysis

Haibao Tang et al. Imeta.

Abstract

The life cycle of genome builds spans interlocking pillars of assembly, annotation, and comparative genomics to drive biological insights. While tools exist to address each pillar separately, there is a growing need for tools that integrate the different pillars of a genome project holistically. For example, comparative approaches can provide quality control of assembly or annotation; genome assembly, in turn, can help to identify artifacts that may complicate the interpretation of genome comparisons. The JCVI library is a versatile Python-based library that offers a suite of tools spanning these pillars. Featuring a modular design, the JCVI library provides high-level utilities for tasks such as format parsing, graphics generation, and manipulation of genome assemblies and annotations. The genomics algorithms it supports, such as MCscan and ALLMAPS, are widely employed in building genome releases and produce publication-ready figures for quality assessment and evolutionary inference. Developed and maintained collaboratively, the JCVI library emphasizes quality and reusability.

Keywords: comparative genomics; genome annotation; genome assembly; genomic data; visualization.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Genome dotplot, macro‐synteny, and micro‐synteny analyses between peach and grape genomes. (A) Dot plot of 14,279 gene pairs within synteny blocks resulting from the alignment of coding sequences between the peach and grape genomes. Counting the number of synteny blocks horizontally and vertically reveals a 3:3 pattern, suggesting that grapes and peaches share a genome triplication event. (B) Another illustration of synteny regions is the karyotype figure, which displays links between the chromosomes to show the gene pairs within synteny blocks. Users have the option to rotate the chromosomes for customized layout or highlight the genomic regions of interest that contain specific gene pairs (yellow). (C) Micro‐synteny analysis between a grape region and matching region in peach. The connecting ribbons show matching genes, with blue boxes showing genes on the positive strand, and green boxes showing genes on the negative strand.
Figure 2
Pseudochromosome 1 of the Carica papaya genome, reconstructed from two input maps—an SSR‐based genetic map and a synteny map coordinated with a previously published papaya genome. The left panel presents connecting lines as matching markers. The right panel shows two sets of scatter plots with each dot representing the physical position on the chromosome (x‐axis) versus the map location (y‐axis). Spearman's ρ is displayed to indicate the concordance between the marker order on the chromosome with their map locations (between −1 and 1, with 1 being completely concordant).
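The concordance statistic in this caption can be illustrated with a small, self-contained sketch. It computes Spearman's ρ as the Pearson correlation of ranks, with ties given their average rank. The function names and the toy marker positions are assumptions made for illustration; this is not the JCVI or ALLMAPS implementation.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(physical_positions, map_positions):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(physical_positions),
                   average_ranks(map_positions))


# Hypothetical markers: physical position (bp) versus map location (cM).
physical = [1_000, 52_000, 130_000, 410_000]
genetic = [0.0, 1.2, 3.5, 9.8]
rho = spearman_rho(physical, genetic)
```

Markers that appear in the same order on the chromosome and on the map give ρ = 1, while a fully inverted order gives ρ = −1.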
Figure 3
Examples of miscellaneous features visualized along the genome. (A) Chromosome painting with “graphics.chromosome” that shows the locations of various elements in the portrait mode. The chromosome sizes (scale on the right side of the chromosomes), location of the centromeres (filled ellipses in black), and genomic features (filled rectangles in various colors) can be specified as inputs to the script. (B) Genomic landscape plotted with “graphics.landscape.” The chromosome sizes, density of the gene, and repeat elements can be specified as inputs to the script and shown as stack plots. (C) Genomic landscape along a single chromosome, with alternative heatmap representations correlated with the feature density. Toy data are used for illustration purposes only. LTR, long‐terminal repeat.
Figure 4
Examples of genome‐build quality control. (A) K‐mer spectrum analysis to estimate genome parameters based on raw sequencing data. (B) Concordance between genome assembly and the linkage map. (C) Concordance between genome assembly and the Hi‐C contact map. The corresponding modules (“assembly.kmer,” “assembly.geneticmap,” and “assembly.hic”) can be used to evaluate the quality of a genome build.
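As a rough illustration of what a k‐mer spectrum is, the sketch below counts k‐mers in a set of reads and tabulates how many distinct k‐mers occur at each multiplicity. The genome size estimate uses the common rule of thumb of dividing the total k‐mer count by the peak depth. All function names are hypothetical, reverse complements are not merged, and this is a simplification rather than the method used by “assembly.kmer.”

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    kmer_counts = Counter()
    for read in reads:
        # Reads shorter than k contribute no k-mers.
        for i in range(len(read) - k + 1):
            kmer_counts[read[i:i + k]] += 1
    return Counter(kmer_counts.values())


def estimate_genome_size(spectrum, min_depth=2):
    """Total k-mers divided by peak depth, ignoring low-depth k-mers.

    K-mers below min_depth are treated as likely sequencing errors
    (an assumption of this sketch).
    """
    solid = {depth: n for depth, n in spectrum.items() if depth >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(solid, key=solid.get)
    total_kmers = sum(depth * n for depth, n in solid.items())
    return total_kmers / peak_depth


# Toy read: ACGT occurs twice; CGTA, GTAC, and TACG occur once each.
spectrum = kmer_spectrum(["ACGTACGT"], k=4)
```

With real sequencing data the spectrum typically shows a low-depth error tail followed by a main peak near the sequencing depth, and that peak is what the estimate relies on.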
Figure 5
Examples of pedigree and genome diversity analysis. (A) Pedigree of a particular variety (Variety 1; at the bottom), as encoded in a PED file, can be visualized with the “compara.pedigree” module. The root nodes (nodes with no parent information at the top) are assumed to be outcrossing for simplicity. The parentage in each variety is then estimated and visualized as pie charts. F: inbreeding coefficients estimated based on the pedigree. (B) Copy number variation between different varieties as visualized with “graphics.landscape” module.
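The inbreeding coefficient F mentioned in panel (A) can be sketched with the classical recursive kinship method: F of an individual equals the kinship coefficient of its two parents. The sketch assumes, as the caption does, that root nodes are outcrossing (F = 0) and unrelated to one another. The dictionary layout and function names are hypothetical, PED file parsing is not shown, and this is not presented as the “compara.pedigree” implementation.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """pedigree maps each non-founder to a (father, mother) tuple.

    Individuals absent from the mapping are treated as unrelated,
    outcrossing founders. Returns {individual: F}.
    """
    def parents(x):
        return pedigree.get(x, (None, None))

    @lru_cache(maxsize=None)
    def depth(x):
        # Generations separating x from the founders.
        known = [p for p in parents(x) if p is not None]
        return 0 if not known else 1 + max(depth(p) for p in known)

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0
        if a == b:
            return 0.5 * (1.0 + inbreeding(a))
        # Recurse through the deeper individual, which cannot be an
        # ancestor of the other one.
        if depth(a) < depth(b):
            a, b = b, a
        father, mother = parents(a)
        return 0.5 * (kinship(father, b) + kinship(mother, b))

    def inbreeding(x):
        father, mother = parents(x)
        return kinship(father, mother)

    return {x: inbreeding(x) for x in pedigree}


# Hypothetical pedigree: C and D are full sibs from founders A and B;
# E is their offspring, and S is a selfed offspring of A.
toy = {"C": ("A", "B"), "D": ("A", "B"), "E": ("C", "D"), "S": ("A", "A")}
F = inbreeding_coefficients(toy)
```

In this toy pedigree, the offspring of a full-sib mating has F = 0.25 and the selfed offspring of a founder has F = 0.5, which matches the textbook values.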
