J Hered. 2023 Aug 23;114(5):513-520.
doi: 10.1093/jhered/esad016.

The revised reference genome of the leopard gecko (Eublepharis macularius) provides insight into the considerations of genome phasing and assembly

Brendan J Pinto et al. J Hered. 2023.

Abstract

Genomic resources across squamate reptiles (lizards and snakes) have lagged behind other vertebrate systems, and high-quality reference genomes remain scarce. The 23 chromosome-scale reference genomes available across the order represent only 12 of the ~60 squamate families. Within geckos (infraorder Gekkota), a species-rich clade of lizards, chromosome-level genomes are exceptionally sparse, representing only two of the seven extant families. Using the latest advances in genome sequencing and assembly methods, we generated one of the highest-quality squamate genomes to date for the leopard gecko, Eublepharis macularius (Eublepharidae). We compared this assembly to the previous, short-read-only E. macularius reference genome published in 2016 and examined potential factors within the assembly influencing the contiguity of genome assemblies using PacBio HiFi data. Briefly, the read N50 of the PacBio HiFi reads generated for this study was equal to the contig N50 of the previous E. macularius reference genome, at 20.4 kilobases. The HiFi reads were assembled into a total of 132 contigs, which were further scaffolded using HiC data into 75 total sequences representing all 19 chromosomes. We found that 9 of the 19 chromosomal scaffolds were each assembled as a near-single contig, whereas the other 10 chromosomes were each scaffolded together from multiple contigs. We qualitatively identified that the percent repeat content within a chromosome broadly affects its assembly contiguity prior to scaffolding. This genome assembly signifies a new age for squamate genomics, in which high-quality reference genomes rivaling some of the best vertebrate genome assemblies can be generated for a fraction of previous cost estimates. This new E. macularius reference assembly is available on NCBI at JAOPLA010000000.

Keywords: emerging model system; evolution; gekkota; genomics; phasing.
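
The abstract compares a read N50 with a contig N50. As a reminder of what that statistic measures, here is a minimal sketch in Python, assuming the standard definition (the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the summed length). The function name `n50` and the example lengths are hypothetical and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least
    half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("n50() needs at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (in base pairs), for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

The same function applies to read lengths or contig lengths, which is why a read N50 and a contig N50 can be compared directly as the abstract does.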

Figures

Fig. 1.
HiC contact map for the MPM_Emac_v1.0 assembly. Each external blue segment (below) indicates the delimitation of a chromosome-length scaffold, whereas the internal green squares indicate contigs. Approximately half of the assembled chromosomes are represented by a near-single contig, indicating the extreme contiguity of the primary assembly pre-scaffolding.
Fig. 2.
Comparison between chromosomes assembled as a single contig (“Contigs,” dark gray) and those composed of multiple contigs (“Scaffolds,” light gray), displayed using vioplot (Adler et al. 2022). The violins represent the distribution of the underlying data points, whereas the internal bars provide a traditional bar graph representation. The mid-lines represent the median of the data. Our a priori hypothesis was that chromosomes assembled as a single contig would possess lower overall GC content and/or repeat content. However, neither GC content nor repetitive element content was significantly different using Mann–Whitney–Wilcoxon tests. Qualitatively, there appears to be a difference in median repeat content between the two groups. It is possible that our ability to detect a true difference using frequentist methods is limited by the low sample size (N = 19).
Fig. 3.
Comparative QC results from assembly phasing generated by merqury (Rhie et al. 2020) between A) Trios phasing and B) HiC phasing methods. HiC equaled or outperformed Trios phasing in all measured categories. Notably, Trios phasing appears to have suffered from high switch-error rates, which resulted in short phase blocks relative to contig size. HiC phasing performed extremely well; however, by definition, HiC phasing was unable to assign multiple phased contigs to their parent of origin.
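
The Fig. 2 caption reports Mann–Whitney–Wilcoxon tests on two small groups of chromosomes. The sketch below illustrates the idea behind such a test in plain Python: it computes the U statistic by pairwise comparison and an exact two-sided p-value by enumerating every way of splitting the pooled values into groups of the same sizes. This is not the authors' code; the function names and the example values are hypothetical, and no data from the paper are used.

```python
from itertools import combinations


def u_statistic(a, b):
    """Count pairs (x in a, y in b) with x > y; ties count as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)


def exact_mwu_pvalue(a, b):
    """Exact two-sided p-value for the Mann-Whitney U test.

    Every relabeling of the pooled values into groups of size len(a)
    and len(b) is enumerated; the p-value is the fraction of relabelings
    whose U lies at least as far from its null centre as the observed U.
    """
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    centre = n_a * n_b / 2
    observed = abs(u_statistic(a, b) - centre)
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point noise.
        if abs(u_statistic(group_a, group_b) - centre) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical values for two tiny groups, for illustration only.
group_one = [1, 2, 3]
group_two = [4, 5, 6]
print(u_statistic(group_one, group_two), exact_mwu_pvalue(group_one, group_two))
```

With groups of 9 and 10 chromosomes there are 92,378 possible relabelings, so exhaustive enumeration remains feasible at that size. With samples this small only large shifts between the groups can reach conventional significance thresholds, which is consistent with the caution expressed in the caption.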

References

    1. Adler D, Kelly ST, Elliott T, Adamson J. vioplot: violin plot. R package version 0.4.0. 2022. https://github.com/TomKellyGenetics/vioplot.
    2. Agarwal I, Bauer AM, Gamble T, Giri VB, Jablonski D, Khandekar A, Mohapatra PP, Masroor R, Mishra A, Ramakrishnan U. The evolutionary history of an accidental model organism, the leopard gecko Eublepharis macularius (Squamata: Eublepharidae). Mol Phylogenet Evol. 2022:168:107414.
    3. Amores A, Catchen J, Nanda I, Warren W, Walter R, Schartl M, Postlethwait JH. A RAD-tag genetic map for the platyfish (Xiphophorus maculatus) reveals mechanisms of karyotype evolution among teleost fish. Genetics. 2014:197(2):625–641.
    4. Bauer AM. Geckos: the animal answer guide. Baltimore, MD: Johns Hopkins University Press; 2013.
    5. Benjamini Y, Speed TP. Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res. 2012:40(10):e72.