G3 (Bethesda). 2023 Mar 9;13(3):jkad001.
doi: 10.1093/g3journal/jkad001.

The gyrfalcon (Falco rusticolus) genome


Andrea Zuccolo et al. G3 (Bethesda). 2023.

Abstract

High-quality genome assemblies are characterized by high sequence contiguity, completeness, and a low error rate, thus providing the basis for a wide array of studies focusing on natural species ecology, conservation, evolution, and population genomics. To provide this valuable resource for conservation projects and comparative genomics studies on gyrfalcon (Falco rusticolus), we sequenced and assembled the genome of this species using third-generation sequencing strategies and optical maps. Here, we describe a highly contiguous and complete genome assembly comprising 20 scaffolds and 13 contigs with a total size of 1.193 Gbp, including 8,064 complete Benchmarking Universal Single-Copy Orthologs (BUSCOs) of the total 8,338 BUSCO groups present in the library aves_odb10. Of these BUSCO genes, 96.7% were complete, 96.1% were present as a single copy, and 0.6% were duplicated. Furthermore, 0.8% of BUSCO genes were fragmented and 2.5% (210) were missing. A de novo search for transposable elements (TEs) identified 5,716 TEs that masked 7.61% of the F. rusticolus genome assembly when combined with publicly available TE collections. Long interspersed nuclear elements, in particular the element Chicken-repeat 1 (CR1), were the most abundant TEs in the F. rusticolus genome. A de novo first-pass gene annotation was performed using 293,349 PacBio Iso-Seq transcripts and 496,195 transcripts derived from the assembly of 42,429,525 Illumina PE RNA-seq reads. In all, 19,602 putative genes were annotated, of which 59.31% were functionally characterized and associated with Gene Ontology terms. A comparison of the gyrfalcon genome assembly with the publicly available assemblies of the domestic chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and hummingbird (Calypte anna) revealed several genome rearrangements. In particular, nine putative chromosome fusions were identified in the gyrfalcon genome assembly relative to the G. gallus genome assembly.
This genome assembly, its annotation for TEs and genes, and the comparative analyses presented complement and strengthen the base of high-quality genome assemblies and associated resources available for comparative studies focusing on the evolution, ecology, and conservation of Aves.
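
The BUSCO percentages quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the total, complete, and missing counts are taken from the abstract, while the fragmented count is not stated there and is derived by subtraction, which is an assumption. The function name `busco_summary` is hypothetical.

```python
# Counts quoted in the abstract (library aves_odb10).
TOTAL_BUSCOS = 8338
COMPLETE = 8064
MISSING = 210


def busco_summary(total: int, complete: int, missing: int) -> dict:
    """Return BUSCO category percentages, rounded to one decimal place.

    Assumption: every BUSCO group is exactly one of complete, fragmented,
    or missing, so the fragmented count is whatever remains.
    """
    fragmented = total - complete - missing

    def pct(count: int) -> float:
        return round(100 * count / total, 1)

    return {
        "fragmented_count": fragmented,
        "complete_pct": pct(complete),
        "fragmented_pct": pct(fragmented),
        "missing_pct": pct(missing),
    }


summary = busco_summary(TOTAL_BUSCOS, COMPLETE, MISSING)
print(summary)
```

Under that assumption the derived values agree with the reported 96.7% complete, 0.8% fragmented, and 2.5% missing.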

Keywords: Falco rusticolus; CR1; chromosome fusion; conservation genomics; gyrfalcon; long reads; transposable elements.


Conflict of interest statement

Conflicts of interest: None declared.

Figures

Fig. 1.
a) Picture of a gyrfalcon. b) Placement of falcons in the avian tree of life (modified and simplified from Prum et al., 2015 and Wink, 2018). c) Phylogenetic analysis of falcons, modified and simplified from Wink (2018). For species that have been sequenced (indicated by a double helix), the genome assembly accession number is given along with assembly details in parentheses: PA, number of primary assemblies; S, number of assemblies at the scaffold level; c, number of assemblies at the chromosome level; i, Illumina technology; p, PacBio technology; h, Hi-C chromatin interaction data; b, Bionano optical maps. For F. pelegrinoides, which was sequenced but is not included in the phylogenetic tree, the genome assembly information is PA (1), S (1), (i).
Fig. 2.
Details of predicted chromosomal rearrangements in the gyrfalcon super-scaffold 2_sc. a) Circa plot comparing 2_sc with the entire set of domestic chicken chromosomes (specified as “gg_chromosome number”). Regions showing significant similarity are connected by violet lines. b) Bionano optical map validation for 2_sc. NGS, assembled sequence; BNG, Bionano map. c) Dot plot of 2_sc vs the four chicken chromosomes showing homology. 2_sc is on the x-axis, and the chicken chromosomes are on the y-axis. The chicken chromosomes are coded using the colors assigned to them in panel a.
Fig. 3.
Circa plot of the Falco rusticolus genome assembly showing (outer circle inward) GC content distribution, gene density, and TE content.

