PeerJ. 2020 Apr 6:8:e8914.
doi: 10.7717/peerj.8914. eCollection 2020.

First de novo whole genome sequencing and assembly of the bar-headed goose

Wen Wang et al. PeerJ.

Abstract

Background: The bar-headed goose (Anser indicus) mainly inhabits the plateau wetlands of Asia. As a specialized high-altitude species, it migrates between South and Central Asia and crosses the Himalayan mountains twice a year along the Central Asian Flyway. The physiological, biochemical and behavioral adaptations of bar-headed geese to high-altitude living and flying have attracted considerable interest. However, to date, no genome assembly of the bar-headed goose has been publicly available.

Methods: In this study, we present the first de novo whole genome sequencing and assembly of the bar-headed goose, along with gene prediction and annotation.

Results: 10X Genomics sequencing produced a total of 124 Gb of sequencing data, covering the estimated genome size of the bar-headed goose 103 times (average coverage). The genome assembly comprised 10,528 scaffolds, with a total length of 1.143 Gb and a scaffold N50 of 10.09 Mb. Annotation of the assembly identified a total of 102 Mb (8.9%) of repetitive sequences, 16,428 protein-coding genes, and 282 tRNAs. In total, we identified 63 expanded and 20 contracted gene families in the bar-headed goose compared with the other 15 vertebrates. We also performed a positive selection analysis between the bar-headed goose and the swan goose (Anser cygnoides), a closely related low-altitude goose, to uncover the bar-headed goose's genetic adaptations to the Qinghai-Tibetan Plateau.
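The assembly statistics quoted in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline: the function name `n50` and the toy scaffold lengths are hypothetical, and the implied genome size is derived here from the reported data volume and coverage rather than stated in the abstract.

```python
def n50(lengths):
    """Return the scaffold N50: the length L such that scaffolds of
    length >= L together contain at least half of the assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Figures reported in the abstract.
sequenced_gb = 124.0      # total 10X Genomics sequencing data (Gb)
coverage = 103.0          # average coverage of the estimated genome size
assembly_gb = 1.143       # total assembly length (Gb)
repeats_gb = 0.102        # repetitive sequence (102 Mb, expressed in Gb)

# Derived values (assumption: coverage = sequenced bases / genome size).
implied_genome_gb = sequenced_gb / coverage
repeat_percent = 100.0 * repeats_gb / assembly_gb

# Toy scaffold lengths, made up purely to show how N50 is computed.
toy_scaffolds = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_scaffolds)

print(f"implied genome size: {implied_genome_gb:.2f} Gb")
print(f"repeat content: {repeat_percent:.1f}%")
print(f"toy N50: {toy_n50}")
```

Under these assumptions the implied genome size is about 1.2 Gb, slightly larger than the 1.143 Gb assembly, and the repeat fraction matches the reported 8.9%.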

Conclusion: We report the most complete genome sequence of the bar-headed goose to date. Our assembly provides a valuable resource for further studies of gene function in the bar-headed goose. The data will also facilitate studies of the evolution, population genetics and high-altitude adaptations of bar-headed geese at the genomic level.

Keywords: 10X Genomics Chromium; Anser indicus; Avian genomes; Bar-headed goose; Comparative genomics; Conservation genomics; High-altitude adaptation; Hypoxia; Positive selection; Qinghai-Tibetan Plateau.


Conflict of interest statement

The authors declare there are no competing interests. Rongkai Hao is employed by Novogene Bioinformatics Institute.

Figures

Figure 1. Orthologous genes in bar-headed goose and other birds.
The number of unique or shared orthologous genes is listed in each diagram component. Ain, bar-headed goose; Acy, swan goose; Apl, mallard; Gga, red junglefowl.
Figure 2. Gene family expansion and contraction in the bar-headed goose genome.
The numbers of expanded (blue) and contracted (red) gene families are shown along branches and nodes. MRCA, most recent common ancestor; Ain, bar-headed goose; Acy, swan goose; Nni, crested ibis; Apl, mallard; Gga, red junglefowl; Mga, turkey; Cca, common cuckoo; Cli, rock pigeon; Phu, ground tit; Cfl, bananaquit; Pma, great tit; Tgu, zebra finch; Ppu, ruff; Fpe, peregrine falcon; Bmu, yak; Pho, Tibetan antelope.
Figure 3. Functional distribution of positively selected genes (PSGs) according to the Gene Ontology (GO) database.
The y axis shows the GO functional categories, grouped as (A) biological process, (B) cellular component, and (C) molecular function; the x axis shows the number of genes in each category.
