PLoS One. 2019 Dec 12;14(12):e0226365.
doi: 10.1371/journal.pone.0226365. eCollection 2019.

Application of DArT seq derived SNP tags for comparative genome analysis in fishes; An alternative pipeline using sequence data from a non-traditional model species, Macquaria ambigua

Foyez Shams et al. PLoS One. 2019.

Abstract

Bi-allelic single nucleotide polymorphism (SNP) markers are widely used in population genetic studies. In most studies, the sequences on either side of the SNPs remain unused, although these sequences contain information beyond that used in population genetic studies. In this study, we show how the sequence tags on either side of a SNP can be used for comparative genome analysis. We used DArTseq (Diversity Array Technology) derived SNP data for a non-model Australian native freshwater fish, Macquaria ambigua, to identify genes linked to SNP-associated sequence tags, and to discover homologies with evolutionarily conserved genes and genomic regions. We concatenated 6,776 SNP sequence tags to create a hypothetical genome (representing 0.1–0.3% of the actual genome), which we used to find sequence homologies with 12 model fish species using the Ensembl genome browser with stringent filtering parameters. We identified sequence homologies for 17 evolutionarily conserved genes (cd9b, plk2b, rhot1b, sh3pxd2aa, si:ch211-148f13.1, si:dkey-166d12.2, zgc:66447, atp8a2, clvs2, lyst, mkln1, mnd1, piga, pik3ca, plagl2, rnf6, sec63) along with an ancestral evolutionarily conserved syntenic block (euteleostomi Block_210). Our analysis also revealed repetitive sequences covering approximately 12% of the hypothetical genome, among which DNA transposons, LTR and non-LTR retrotransposons were most abundant. A hierarchical pattern in the number of sequence homologies with phylogenetically close species validated the approach for repeatability. This new approach of using SNP-associated sequence tags for comparative genome analysis may provide insight into the genome evolution of non-model species where whole genome sequences are unavailable.
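
The concatenation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `concatenate_tags` and `tag_at`, the toy tag identifiers, and the choice to join tags without a spacer are all assumptions made for illustration. The sketch joins the tags into one sequence and keeps an offset index, so that a hit at a given coordinate of the concatenated sequence can be traced back to the tag it came from.

```python
def concatenate_tags(tags):
    """Join (tag_id, sequence) pairs into one sequence.

    Returns the concatenated string and an index of
    (tag_id, start, end) with 0-based, end-exclusive coordinates.
    Joining without a spacer is an assumption of this sketch.
    """
    pieces = []
    index = []
    offset = 0
    for tag_id, seq in tags:
        pieces.append(seq)
        index.append((tag_id, offset, offset + len(seq)))
        offset += len(seq)
    return "".join(pieces), index


def tag_at(index, position):
    """Return the id of the tag covering a 0-based position, or None."""
    for tag_id, start, end in index:
        if start <= position < end:
            return tag_id
    return None


# Toy input; the identifiers and sequences are hypothetical.
toy_tags = [("tag1", "ACGT"), ("tag2", "GGRCA"), ("tag3", "TTY")]
genome, index = concatenate_tags(toy_tags)
```

With the toy input, `genome` is 12 characters long and `tag_at(index, 5)` returns `"tag2"`. If every tag were the full 69 bp mentioned in the Fig 2 caption, 6,776 tags would concatenate to at most 6,776 × 69 = 467,544 bp.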

Conflict of interest statement

AK is employed by and receives salary from Diversity Arrays Technology. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. A schematic diagram of selecting evolutionarily conserved genes.
Numbers against each species represent the predicted genes resulting from homologies with non-repetitive sequence tags (STAGs).
Fig 2. Ensembl Nucleotide BLAST/BLAT hits.
Phylogeny (not to scale), adapted from Betancur-R et al. (2017) [19], showing the Macquaria ambigua (golden perch) lineage with all reference fish used for the analysis. Length (in kilobase pairs) represents the portion of the golden perch hypothetical genome (GP-H-Genome) homologous with each of the 12 fish species (the M. ambigua bar represents the total length of the GP-H-Genome). STAG stands for a single trimmed sequence (the remaining product of a single 69 bp DArT marker after removing the restriction site-associated adapter), usually 69 bp or less in length, in which the polymorphic nucleotide is replaced with a standard ambiguity code.
Fig 3. Evolutionarily conserved genes.
(a) Number of genes with homologies to the non-repetitive part of the GP-H-Genome. The colour code indicates the species against which the BLAST hits were obtained. 17 genes are common to all three species. (b) Gene block arrangement. Comparison of the conserved gene block arrangement for the region encompassing the atp8a2 and rnf6 genes on Gasterosteus aculeatus Chromosome XXI and Oryzias latipes Chromosome 20 with the ancestral Euteleostomi blocks predicted in the Genomicus database. (c) The atp8a2 and rnf6 genes are conserved compared to other vertebrate species. Arrows indicate the direction of transcription. Gene order and orientation are unknown for M. ambigua.
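
The Fig 2 caption defines a STAG as a trimmed tag in which the polymorphic nucleotide is replaced with a standard ambiguity code. The following Python sketch shows one way that replacement could be done for a bi-allelic SNP using the IUPAC two-base codes. The function name `mask_snp` and its parameters (a 0-based position and the alternative allele) are hypothetical, and adapter trimming is not covered here because the adapter sequence is not given in the text.

```python
# IUPAC ambiguity codes for the six possible bi-allelic SNPs.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def mask_snp(tag, pos, alt):
    """Replace the base at 0-based `pos` with the IUPAC code that
    covers both the base present in `tag` and the alternative allele.

    `pos` and `alt` are inputs assumed for this sketch; how they are
    obtained from a genotyping report is not specified here.
    """
    tag = tag.upper()
    alt = alt.upper()
    ref = tag[pos]
    if ref == alt:
        raise ValueError("reference and alternative alleles are identical")
    code = IUPAC_TWO_BASE[frozenset((ref, alt))]
    return tag[:pos] + code + tag[pos + 1:]


# Toy example: a G/A polymorphism at position 2 becomes R.
masked = mask_snp("ACGTA", 2, "A")
```

Here `masked` is `"ACRTA"`. Keying the lookup on an unordered pair means the result does not depend on which allele is treated as the reference.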

References

    1. Ezaz T, Azad B, O’Meally D, Young MJ, Matsubara K, Edwards MJ, et al. Sequence and gene content of a large fragment of a lizard sex chromosome and evaluation of candidate sex differentiating gene R-spondin 1. BMC Genomics. 2013;14(1):899.
    2. Shetty S, Griffin DK, Graves JAM. Comparative Painting Reveals Strong Chromosome Homology Over 80 Million Years of Bird Evolution. Chromosome Research. 1999;7(4):289–95. doi: 10.1023/a:1009278914829
    3. Taylor JS, Van de Peer Y, Braasch I, Meyer A. Comparative genomics provides evidence for an ancient genome duplication event in fish. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences. 2001;356(1414):1661–79. doi: 10.1098/rstb.2001.0975
    4. Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics. 2011;12(7):499–510. doi: 10.1038/nrg3012
    5. Liu S, Zhou Z, Lu J, Sun F, Wang S, Liu H, et al. Generation of genome-scale gene-associated SNPs in catfish for the construction of a high-density SNP array. BMC Genomics. 2011;12:53. doi: 10.1186/1471-2164-12-53