BMC Genomics. 2010 Apr 30;11:279.
doi: 10.1186/1471-2164-11-279.

Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome

Jong S Leong et al. BMC Genomics. 2010.

Abstract

Background: Salmonids are among the most intensively studied fishes, in part because of their economic and environmental importance, and in part because of a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly affects species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but distinguishing allelic from duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution.

Results: From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference-quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes, using northern pike as a diploid out-group, show asymmetric relaxation of selection on salmon duplicates.

Conclusions: 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate.
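The conclusions rest on reading the dN/dS ratio (ω) in the conventional way: values well below 1 indicate purifying selection, values near 1 indicate neutral evolution, and values above 1 suggest positive selection. A minimal sketch of that reading follows. The function names and the tolerance `tol` are hypothetical choices for illustration, not values taken from the paper.

```python
from typing import Optional


def omega(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is not positive (ratio undefined)."""
    if ds <= 0:
        return None
    return dn / ds


def selection_regime(w: float, tol: float = 0.1) -> str:
    """Classify omega; `tol` is an illustrative band around 1, not from the paper."""
    if w < 1 - tol:
        return "purifying"
    if w > 1 + tol:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    w = omega(dn=0.02, ds=0.2)
    if w is not None:
        print(selection_regime(w))
```

Under this reading, a relaxation of selective pressure on a duplicate would appear as an ω that is still below 1 but higher than that of its paralog.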


Figures

Figure 1
Schematic of S. salar FLcDNA contig identification and reference FLcDNA identification. Two-stage assembly of 434,384 high-quality 5'- and 3'-end ESTs identified 81,398 contigs (1-2) for FL contig identification. A BLASTX was carried out resulting in 34,451 well-annotated contigs (3), which were further reduced to 14,021 FL annotations by increasing the stringency of the local alignment length (4). In-frame annotation-flanking start and stop codons were found from the reduced set, resulting in a set of 10,026 FL contigs (5). The FL contigs represent the complete set of FL unique putative transcripts. A set of all reads and subsequently sequenced library rgf reads was mapped to the FL contigs (6). Those clones whose 5'- and 3'-end reads map to the same contig were analyzed to determine sequence overlap (complete) or non-overlap (incomplete) (7). Only complete clones are considered, and a single representative of a clone is taken for each transcript resulting in 5,953 complete reference FLcDNAs (8).
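Step 7 of this schematic, classifying a clone as complete or incomplete, can be sketched as an interval-overlap check. This is a minimal illustration assuming each end read is summarized by the contig it maps to and a half-open coordinate interval on that contig. The `ReadHit` type and `classify_clone` function are hypothetical names, and the paper's actual mapping criteria are not given in the caption.

```python
from typing import NamedTuple, Optional


class ReadHit(NamedTuple):
    """Hypothetical summary of where one end read maps."""
    contig: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def classify_clone(five_prime: ReadHit, three_prime: ReadHit) -> Optional[str]:
    """Return 'complete' if the two end reads overlap on the same contig,
    'incomplete' if they map to the same contig without overlapping,
    and None if they map to different contigs (clone not considered)."""
    if five_prime.contig != three_prime.contig:
        return None
    overlap = min(five_prime.end, three_prime.end) - max(five_prime.start, three_prime.start)
    return "complete" if overlap > 0 else "incomplete"


if __name__ == "__main__":
    print(classify_clone(ReadHit("contig1", 0, 700), ReadHit("contig1", 600, 1300)))
```

Only clones classified as complete would then be eligible to serve as the single representative for a transcript.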
Figure 2
Schematic of S. salar reference FLcDNA identification through individual clone assemblies. Three full-length 5'-CAP enriched libraries were created. A 4,380 clone subset of library rgf was resequenced to completion. Libraries rgg and rgh were bi-directionally sequenced and individually assembled using PHRAP (1). A BLASTX was carried out resulting in a total of 14,384 well-annotated cDNAs (2), which were further reduced to 8,469 FL annotations by increasing the stringency of the local alignment length (3). In-frame annotation-flanking start and stop codons were found from the reduced set, resulting in a set of 7,255 reference FLcDNA candidates (4). Intra-library sequence redundancy was minimized using an all versus all pairwise BLASTN comparison (5), resulting in a total set of 3,204 non-redundant reference FLcDNAs.
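The redundancy-reduction step (5) can be illustrated with a greedy selection over the results of an all-versus-all comparison. The sketch below assumes the pairwise BLASTN hits have already been reduced to a set of clone pairs judged redundant. The caption does not state the identity or coverage thresholds, nor how a representative is chosen, so keeping the longest clone first is an assumption made here for illustration, and `select_non_redundant` is a hypothetical name.

```python
from typing import Dict, FrozenSet, List, Set


def select_non_redundant(
    lengths: Dict[str, int],
    redundant_pairs: Set[FrozenSet[str]],
) -> List[str]:
    """Greedily keep clones, longest first, skipping any clone that is
    redundant with one already kept. Ties are broken by clone id."""
    kept: List[str] = []
    for clone_id in sorted(lengths, key=lambda c: (-lengths[c], c)):
        if all(frozenset((clone_id, k)) not in redundant_pairs for k in kept):
            kept.append(clone_id)
    return kept


if __name__ == "__main__":
    lengths = {"a": 1000, "b": 900, "c": 800}
    pairs = {frozenset(("a", "b"))}
    print(select_non_redundant(lengths, pairs))
```

A greedy pass is order-dependent, so a different ranking (for example by annotation quality) could yield a different non-redundant set.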
Figure 3
Distributions and means of ORF, 5' and 3' UTR sizes in reference FLcDNAs for (A) S. salar (B) E. lucius. Each reference FLcDNA, determined by in-house annotation methods, was examined for an ORF, 5' UTR, and 3' UTR. Means for each region were calculated (+/- standard deviation). An ORF is characterized by a start (ATG) and an in-frame stop codon (TGA, TAG, TAA). The 5' UTR is calculated as the entire area upstream of the start codon, while the 3' UTR is considered the entire area downstream of the stop codon. Any 3' polyA tails were masked and were not included in UTR length calculations.
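The region definitions in this caption can be sketched in a few lines. The example below makes several labeled assumptions: it takes the first ATG that has an in-frame stop codon (the paper locates the ORF with annotation-based methods instead), it counts the stop codon as part of the ORF length, and it treats a trailing run of at least `min_polya` adenines as the polyA tail. The function names and the default of 10 are hypothetical.

```python
import re
from typing import Dict, Optional

STOP_CODONS = {"TGA", "TAG", "TAA"}


def mask_polya(seq: str, min_run: int = 10) -> str:
    """Drop a trailing run of at least `min_run` A's (assumed polyA tail)."""
    match = re.search(r"A{%d,}$" % min_run, seq)
    return seq[:match.start()] if match else seq


def region_lengths(cdna: str, min_polya: int = 10) -> Optional[Dict[str, int]]:
    """Return lengths of the 5' UTR, ORF (stop codon included), and 3' UTR,
    or None if no ATG with an in-frame stop codon is found."""
    seq = mask_polya(cdna.upper(), min_polya)
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after the start.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                stop_end = i + 3
                return {
                    "utr5": start,                 # everything upstream of ATG
                    "orf": stop_end - start,       # ATG through stop codon
                    "utr3": len(seq) - stop_end,   # everything downstream of stop
                }
        start = seq.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    example = "GGCC" + "ATG" + "AAACCC" + "TAA" + "GGTT" + "A" * 12
    print(region_lengths(example))
```

For the toy sequence above, the masked length is 20 nt, split into a 4 nt 5' UTR, a 12 nt ORF, and a 4 nt 3' UTR.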
Figure 4
Frequencies of dS and ω values for comparisons within S. salar and E. lucius gene trios. (A) Distributions of dS values from pairwise comparisons within gene trios: between S. salar paralogs (green) and between each of the two S. salar paralogs and its corresponding E. lucius ortholog (gray and black). (B) Distributions of dN/dS ratios (ω) from pairwise comparisons within gene trios: between S. salar paralogs (green) and between each of the two S. salar paralogs and its corresponding E. lucius ortholog (gray and black). (C) Distributions of dS values separated into individual tree branches based on gene trios. Values from pairwise comparisons were used to calculate silent substitution rates for periods before and after the salmonid tetraploidization event. The light blue curve represents frequencies of dS values from the duplication event to one S. salar paralog, the red curve from the duplication event to the other paralog, and the black curve prior to the genome duplication to the E. lucius ortholog. (D) Distributions of dN/dS ratios separated into branches where one S. salar paralog, that which has the lower ω value, is considered to be a slow branch (light blue curve) and the other paralog (red curve) is considered to be more quickly diverging (fast branch for the purposes of labelling). The black curve displays frequencies of ω values between the E. lucius ortholog and the genome duplication.
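Panels C and D depend on turning three pairwise values per gene trio into three branch values. One standard way to do this, assuming distances are additive along an unrooted three-taxon tree, is the three-point formula sketched below. The caption does not state the exact calculation used, so this is an illustration of the idea and not the authors' procedure; the function names are hypothetical.

```python
from typing import Tuple


def trio_branch_lengths(
    d_ab: float, d_a_out: float, d_b_out: float
) -> Tuple[float, float, float]:
    """Split pairwise distances for paralogs A and B and an out-group
    ortholog into three branches meeting at the duplication node.

    Returns (branch_a, branch_b, branch_out), where branch_out spans from
    the duplication node to the out-group sequence."""
    branch_a = (d_ab + d_a_out - d_b_out) / 2
    branch_b = (d_ab + d_b_out - d_a_out) / 2
    branch_out = (d_a_out + d_b_out - d_ab) / 2
    return branch_a, branch_b, branch_out


def label_slow_fast(omega_a: float, omega_b: float) -> Tuple[str, str]:
    """Return (slow, fast) labels; the paralog with the lower omega is 'slow'."""
    return ("a", "b") if omega_a <= omega_b else ("b", "a")


if __name__ == "__main__":
    print(trio_branch_lengths(d_ab=0.2, d_a_out=0.5, d_b_out=0.6))
    print(label_slow_fast(0.3, 0.1))
```

Because the slow and fast labels are assigned after comparing the two paralogs, some separation between the two curves in panel D is expected by construction; the size of the separation is what carries information about asymmetry.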

