Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2022 Dec;6(12):1940-1951.
doi: 10.1038/s41559-022-01894-w. Epub 2022 Oct 20.

Epigenetic divergence during early stages of speciation in an African crater lake cichlid fish

Affiliations

Epigenetic divergence during early stages of speciation in an African crater lake cichlid fish

Grégoire Vernaz et al. Nat Ecol Evol. 2022 Dec.

Abstract

Epigenetic variation can alter transcription and promote phenotypic divergence between populations facing different environmental challenges. Here, we assess the epigenetic basis of diversification during the early stages of speciation. Specifically, we focus on the extent and functional relevance of DNA methylome divergence in the very young radiation of Astatotilapia calliptera in crater Lake Masoko, southern Tanzania. Our study focuses on two lake ecomorphs that diverged approximately 1,000 years ago and a population in the nearby river from which they separated approximately 10,000 years ago. The two lake ecomorphs show no fixed genetic differentiation, yet are characterized by different morphologies, depth preferences and diets. We report extensive genome-wide methylome divergence between the two lake ecomorphs, and between the lake and river populations, linked to key biological processes and associated with altered transcriptional activity of ecologically relevant genes. Such genes differing between lake ecomorphs include those involved in steroid metabolism, hemoglobin composition and erythropoiesis, consistent with their divergent habitat occupancy. Using a common-garden experiment, we found that global methylation profiles are often rapidly remodeled across generations but ecomorph-specific differences can be inherited. Collectively, our study suggests an epigenetic contribution to the early stages of vertebrate speciation.

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. Whole-genome DNA methylation landscape of the Astatotilapia cichlid radiation in Lake Masoko.
a, Map of Lake Masoko/Kisiba, Tanzania (modified from www.d-maps.com). b,c, Dissolved oxygen (O2) concentration (%) and water temperature (°C) by depth (metres, m) in Lake Masoko. The oxy- (b) and thermo-clines (c) separate the habitats of the two Lake Masoko A. calliptera ecomorphs: the littoral (yellow) population thrives in shallow (≤5 m), well-oxygenated waters, while the benthic (blue) population is found in deeper, colder and less oxygenated habitats of the lake. Data from Delalande (locally estimated scatterplot smoothed curves). d, Violin plots of stable isotope ratios (δ13CV-PDB, ‰) by population indicate a significantly more offshore zooplankton-based diet for the benthic fish. Two-sided adjusted P values for Games–Howell multiple comparison tests using the Tukey’s method are shown together with mean differences and 95% CI mean differences (5,000 bootstrap resamples; Supplementary Notes). n, number of fish per population. e,f, PCA (PC1 and PC2) of liver methylome (mCG) variation using both the RRBS (e) and WGBS (f) datasets. The principal component scores in (e) significantly segregate the three populations apart (two-sided P value for MANOVA tests; Extended Data Fig. 2c). The percentage of total variance is given in parentheses. The asterisks in (e) indicate samples used for WGBS. n, number of biological replicates per population. g, Unbiased hierarchical clustering and heatmap of the average DNA methylation levels at all significant DMRs found between the three pairwise comparisons (numbered 1–3) reveal population-specific methylome patterns. mCG/CG levels (%) averaged by population over each DMR (Methods). The total number of DMRs for each comparison is shown on the left-hand side of each heatmap. h, Histogram of the closest distances in bp (log scale; median, dotted line) between DMRs and HDRs, when on the same chromosome. i, Enrichment plots (observed/expected ratio; expected values calculated from 500 random resampling data) for methylome variation (DMRs) in different genomic features for each pairwise comparison (the categories ‘promoter’, ‘gene body’ and ‘intergenic’ are mutually exclusive). Chi-squared tests and one-sided P values are shown above the graphs. Repeat, transposon-only repeats. j, GO enrichment analysis for the genes associated with methylome divergence among populations (either in promoter, gene body or intergenic regions).
Fig. 2
Fig. 2. Methylome variation is associated with altered transcriptional activity of genes related to hematopoiesis, erythropoiesis and fatty acid metabolism.
a, Unbiased hierarchical dendrogram based on whole-transcriptome variation (Euclidean distances), highlighting population-specific transcription patterns. b, Unbiased hierarchical clustering and heatmap of transcriptional activity (z-score, row-scaled) for all significant DEGs among the three wild populations, showing three different clusters of transcriptional activity (Wald test FDR-adjusted two-sided P value using Benjamini–Hochberg <0.05, fold change ≥2 and high gene expression [top 90th percentile] in any one sample). Examples of DEGs are shown on the right-hand side of the graph. Right: GO enrichment for the DEGs from each of the three transcriptional activity clusters (only FDR < 0.01 is shown). KEGG, Kyoto Encyclopedia of Genes and Genomes. c, Pie chart representing the genomic localization of DMRs associated with DEGs. Significant overlap between DMRs at promoter and transcriptional changes (overrepresentation factor = 5.1; exact hypergeometric test, two-sided P < 3.2 × 1025). DEGs can be associated with multiple DMRs in different locations (promoter, intergenic and genic DMRs). d,e, The promoters of the Eklf (d) and EpoR (e) genes, both involved in erythropoiesis and red blood cells differentiation, show hypomethylation levels in the livers of benthic fish compared to the littoral populations. The genome browser view of the methylome profiles for each ecomorph is shown. Each bar represents the average mCG/CG levels in 50 bp-long non-overlapping windows for each ecomorph population. DMRs are highlighted in red and their lengths are indicated in red. Right-hand side of (d) and (e): box plots of gene expression in liver of benthic, littoral and river fish for Eklf (d) and EpoR (e) are shown (q values: Wald test FDR-adjusted two-sided P values using Benjamini–Hochberg <0.05). All box plots indicate the median (middle line), 25th and 75th percentiles (box) and 5th and 95th percentiles (whiskers), as well as outliers (single points).
Fig. 3
Fig. 3. Common-garden experiment results in both global resetting of methylome profiles in wild benthic and littoral fish to resemble riverine methylome profiles and inheritance of fixed methylome differences.
a, Heatmap of the average DNA methylation levels (mCG/CG, %) at all DMRs found in wild populations in Fig. 1g for all wild and common-garden fish. Methylome profiles revealed global epigenetic resetting in wild benthic and littoral fish to resemble neighbouring river fish methylome profiles, which were mostly unaffected by environmental perturbation. b, The proportion of reset (on common-garden experiment and within one generation) and population-fixed DMRs between littoral and benthic fish. See Extended Data Fig. 9a for the other pairwise comparisons. c, GO analysis showing significant enrichment for fixed and reset DMRs in genes involved primarily in developmental and metabolic processes, respectively. d,e, Examples of DMRs fixed between populations (d) in wild and common-garden fish, with some fixed DMRs also associated with altered transcriptional activity in the liver (liver DEG; using Wald test FDR-adjusted two-sided P value using Benjamini–Hochberg <0.05; see Extended Data Fig. 9f–h for P values associated with each DEG) and of DMRs reset on the common-garden experiment, all associated with population-specific transcriptional differences (e). Each bar represents the average mCG/CG levels in 50 bp-long non-overlapping windows for each fish population (n ≥ 2 biological replicates). DMRs are highlighted in red and the length (bp) of each DMR is indicated in black.
Fig. 4
Fig. 4. Hypothesized three stages of epigenetically associated Astatotilapia ecomorph speciation in Lake Masoko.
(1) Colonization of the shallow habitats of Lake Masoko by the generalist riverine Astatotilapia population approximately 10,000 years ago (Malinsky et al.). (2) Occupancy of shallow, reedy and highly oxygenated habitats by fish with a high level of depth philopatry. Phenotypic plasticity, partially linked to global methylome changes, enables utilization of littoral macroinvertebrate prey. (3) Colonization of deep, zooplankton-rich and lowly oxygenated habitats by the shallow population approximately 1,000 years ago (Malinsky et al.). Extreme methylome changes in the benthic population associated with diet (for example, fatty acid metabolism) and environment (for example, hemoglobin composition) are shown. Epialleles are reciprocally fixed in the two populations, plausibly leading migrants, and those of intermediate epi-genotypes, to suffer a fitness disadvantage. Eventually, selection leads to differential fixation of genomic variation.
Extended Data Fig. 1
Extended Data Fig. 1. Map of Lake Masoko, Tanzania, and photographs of Astatotilapia cichlid ecomorphs of Lake Masoko.
a. Lower panel: Lake Masoko/Kisiba is part of the Lake Malawi catchment geographical area (location of Lake Masoko is indicated by the rectangle). Upper panel: Geographical map inset (rectangle) of Lake Masoko area. Credits: Imagery ©2022 CNES / Airbus, Maxar Technologies, Map data ©2022 Google (upper panel) and d-maps.com (lower panel). b. Panoramic photograph of Lake Masoko (11 April 2018). c. Photographs of male A. calliptera specimens of Lake Masoko in breeding colors: littoral ecomorph (left), benthic ecomorph (right). Photo credits: GV.
Extended Data Fig. 2
Extended Data Fig. 2. Mapping of sequencing reads and genome-wide methylome levels.
a. Mapping rates (percentage total reads) for RRBS and WGBS sequencing reads aligned to the SNP-corrected Maylandia zebra reference assembly (GCF_000238955.4 M_zebra_UMD2a). Black midlines and whiskers represent median values with 95% CI of mapping rates for each population. Overall mapping rates are shown at the bottom of each graph (mean ± sd). b. Upper panel: Genome-wide average liver methylation levels for each sample of each population (B, benthic, L, littoral and R, river; w, wild-caught). Average mCG/CG levels in non-overlapping 1kbp-long windows. Median and mean values are indicated with black midlines and white dots, respectively. Lower panel: CG coverage (count of unique mapped reads) of all WGBS samples. c. Boxplots showing PC scores for PC1 and PC2 associated with the PC analysis (Fig.1a) of RRBS-related genome-wide CG methylation variation among all populations. Statistical report for MANOVA test is shown at the bottom of the graph and post-hoc Games-Howell multiple comparison two-sided P-values adjusted with the Tukey’s method are shown for each pairwise comparison above each boxplot. n ≥ 2 and n ≥ 11 biological replicates for WGBS and RRBS datasets, respectively, for all graphs. All box plots indicate median (middle line), 25th, 75th percentile (box), and 5th and 95th percentile (whiskers) as well as outliers (single points).
Extended Data Fig. 3
Extended Data Fig. 3. Properties of the WGBS-DMRs found between wild populations of Lake Masoko and riverine A. calliptera ecomorph populations.
a. Total count of differentially methylated regions (DMR) found between each pairwise comparison using the WGBS dataset (B, benthic, L, littoral and R, river; v., versus; see Fig. 1g) and total number of unique DMRs (found in ≥1 pairwise comparison; in grey). b. Histograms of methylation difference (mCG/CG ∆) at DMRs found for each pairwise comparison. For each pairwise comparison, DMRs are split between Gain-/Loss-DMRs for gain/loss of methylation. Gain-DMRs in benthic, littoral or river fish, respectively, are indicated with histograms of different colors (blue, yellow, or red, respectively). Average DMR methylation difference (mean ± sd) is shown above/below each graph. c. Violin plots showing average DMR mCG/CG levels for each DMR group found in (c). Values on the left of each graph represent mean ± sd for mCG/CG levels. d. Left: Violin and box plots of the length (bp, log10 y-axis) of all unique DMRs found in ≥1 comparison. Right: violin plots showing the number of CG sites within all precited DMR sequences found between each pairwise comparison (log10 y-axis). Red dots and black horizontal bars represent mean and median values, respectively. Box plots indicate median (middle line), 25th, 75th percentile (box), and 5th and 95th percentile (whiskers) as well as outliers (single points). e. Circos plot showing DMR density found between each pairwise comparison (from Fig. 1g) across all chromosomes (data shown only for linkage groups LG [putative chromosomes] 1-23 in GCF_000238955.4 M_zebra_UMD2a). f. Example of methylome landscape (mean mCG/CG over 50 bp-long windows) at four highly diverged regions (HDRs between littoral and benthic populations, in green; from Malinsky et al., 2015) for wild-caught (w) fish from river (R), littoral (L) and benthic populations (B). n, number of biological replicates. DMR, differentially methylated regions in the three pairwise comparisons (v, versus; in purple). HDR annotations refer to Ref. .
Extended Data Fig. 4
Extended Data Fig. 4. WGBS and RRBS datasets cross-validation.
a. Heatmap showing Spearman’s correlation scores (Euclidean distances and complete-linkage clustering) of methylome variation at WGBS-DMRs covered by all RRBS samples (DMR count, n = 413) and showing strong population-specific methylome patterns (see Methods and Supplementary Notes). b. Heatmap showing average methylation levels averaged across all RRBS and WGBS samples (average mCG/CG per group) at WGBS-DMRs for each of the three WGBS-DMRs pairwise comparison (Fig.1g). Number of DMRs shown next to each heatmap. c. Total number of DMRs identified using the RRBS samples. The proportion of RRBS-DMRs overlapping with WGBS-DMRs is shown in pink. d. Table summarising the total number of DMRs found using both RRBS and WGBS datasets. Biological replicates: WGBS: Benthic (B), n = 3; Littoral (L) and Riverine (R), n = 2; RRBS, Benthic and Littoral, n = 12; River, n = 11.
Extended Data Fig. 5
Extended Data Fig. 5. Examples of WGBS DMRs and enrichment analysis for TF sequence binding motifs within DMRs.
a. Methylome landscape (liver) for the three wild-caught (w) populations at six DMRs. Average mCG/CG [0-1] in 50 bp long windows (n indicates the number of biological replicates). The length (bp) of each DMR is indicated in red below each example. Loss-/Gain-DMRs indicate DMRs showing significant decrease/increase in mCG/CG levels (at least 25% mCG/CG difference, P < 0.05). TNFRSF member 5, tumor necrosis factor receptor superfamily member 5 (immune response, apoptosis); H-2 class II histocompatibility antigen, A-U alpha chain (adaptive immune response); IL18RAP, Interleukin-18 receptor accessory protein (immune response); protein wnt-8a and sost, sclerostin, both part of the wnt signalling pathway (early embryo embryogenesis). The homeobox hoxa genes cluster. b. Motif enrichment analysis for transcription factor (TF) binding sites in the sequence of DMRs located in promoter/intergenic regions (upper graphs) and within gene bodies (lower graphs) for the three pairwise comparisons. Enrichment over background using HOMER (scrambled DMR sequences, 50,000 iterations; see Methods). Two-sided P-values for motif enrichment based on cumulative hypergeometric distributions (tests) are shown.
Extended Data Fig. 6
Extended Data Fig. 6. Transcriptional divergence among populations and negative correlation between promoter methylation and transcriptional activity.
a. Unbiased hierarchical clustering and heatmap of transcriptional variation among all RNAseq samples (based on Spearman’s correlation scores). Gene expression patterns segregate the populations apart, independently of sequencing read lengths. All annotated genes were used (tpm). Of note, paired-end 150bp-long reads were trimmed in silico down to paired-end 100 bp to account for different read lengths (see Methods). b. Methylation profiles (averaged mCG/CG levels for n = 3 WGBS benthic samples) along gene bodies and at promoters according to gene expression levels. Genes were split into five categories according to their gene expression activity (averaged tpm per gene for n = 5 benthic fish RNAseq samples). Spearman’s correlation tests are shown in plot and show significant negative correlation between methylation at promoters and transcriptional activity (rho = -0.33, two-sided P < 2.2E-16).
Extended Data Fig. 7
Extended Data Fig. 7. Genes with functions related to haematopoiesis, hemoglobin and lipid metabolism show higher transcriptional activity in Benthic fish specifically.
a. Boxplot of liver gene expression (log10(tpm + 1)) for the three hemoglobin (Hb) subunit genes and for SCL/Tal1 gene, all significantly upregulated in benthic fish. False discovery rate adjusted P-values (q) using the Benjamini-Hochberg method (sleuth; Wald test; see Methods) shown for Benthic-Littoral comparison only. b. The 2kbp-long DMR is located 7kbp away from the ‘MN’ globin cluster containing the three differentially expressed hemoglobin genes (see a.). Gain-DMR in the wild Benthic fish. Littoral and river fish show intermediate and low methylation levels respectively. w, wild-caught; R, L, B, river, littoral and benthic fish, respectively. Each bar plot represents average mCG/CG levels for each population in 50bp-long windows. c. Methylation landscape over the genes fads2 (gene body DMR) and miox (promoter DMR). Each bar represents the average liver mCG/CG levels in 50bp-long windows for each population. d. Boxplot of gene expression values (log10(tpm + 1) for fads2 and miox in Benthic (n = 5), Littoral (n = 4) and River (n = 4) liver tissues. q, false discovery rate adjusted two-sided P-value, using the Benjamini-Hochberg methods (sleuth; Wald test; see Methods) are shown for all comparisons in graphs. All box plots indicate median (middle line), 25th, 75th percentile (box), and 5th and 95th percentile (whiskers) as well as outliers (single points).
Extended Data Fig. 8
Extended Data Fig. 8. Mapping results, methylome variation and PC analysis for Common-Garden/tank-reared WGBS fish samples.
a. Mapping rates for Common Garden fish WGBS reads aligned to M. zebra_UMD2a reference genome (see Methods). Black midlines and whiskers represent mean ± sd of mapping rates within each population. Average mapping rates shown at the bottom of each graph (mean ± sd). b. Violin plots of average mCG/CG levels in 1kbp-long non-overlapping windows genome-wide for common garden (cg) fish of each population. Black circles and lines indicate mean and median values, respectively. c. PCA of liver methylome variation in wild (shades of purple) and common garden (shades of green) A. calliptera populations reveals global methylation remodeling in wild benthic and littoral fish (Wild Masoko) to resemble river fish methylome profiles. Upper- and lower-case letters distinguish wild from common garden fish for each population (river R/r, benthic B/b, littoral L/l). Biological replicates: wild fish: Benthic (B), n = 3; Littoral (L) and Riverine (R), n = 2; common garden fish: benthic (b), littoral (l) and riverine (r), n = 2.
Extended Data Fig. 9
Extended Data Fig. 9. Fixed and reset methylome variation among wild populations of Lake Masoko A. calliptera cichlids.
a. Plot showing the proportion of fixed versus reset DMRs between each pairwise comparison upon common garden experiment. b. Violin plots showing length in base pair (bp) of fixed and reset DMRs found for each pairwise comparison. y-axis, logarithmic scale. Median lengths for all fixed and reset DMRs are indicated above the graphs. Two-way ANOVA, P < 4.6e-05. c. GO analysis for reset DMRs associated with differentially expressed genes (DEG). d. Methylome profiles of the genes eklf and epoR for wild and common garden (‘Tank’) populations showing reset methylome patterns in benthic fish upon environmental perturbation. e. Example of fixed benthic-specific hypomethylation in an uncharacterised non-coding RNA gene associated with its downregulation in benthic fish only. f-g. Box plot showing gene expression profiles for the genes dpp4 and ces2, associated with fixed benthic-specific methylome patterns. B, benthic; L, littoral; R, river populations, respectively. Box plots indicate median (middle line), 25th, 75th percentile (box), and 5th and 95th percentile (whiskers) as well as outliers (single points).

References

    1. Brawand D, et al. The genomic substrate for adaptive radiation in African cichlid fish. Nature. 2014;513:375–381. doi: 10.1038/nature13726. - DOI - PMC - PubMed
    1. Svardal H, Salzburger W, Malinsky M. Genetic variation and hybridization in evolutionary radiations of cichlid fishes. Annu. Rev. Anim. Biosci. 2021;9:55–79. doi: 10.1146/annurev-animal-061220-023129. - DOI - PubMed
    1. Salzburger W. Understanding explosive diversification through cichlid fish genomics. Nat. Rev. Genet. 2018;19:705–717. doi: 10.1038/s41576-018-0043-9. - DOI - PubMed
    1. Gore AV, et al. An epigenetic mechanism for cavefish eye degeneration. Nat. Ecol. Evol. 2018;2:1155–1160. doi: 10.1038/s41559-018-0569-4. - DOI - PMC - PubMed
    1. Kawakatsu T, et al. Epigenomic diversity in a global collection of Arabidopsis thaliana accessions. Cell. 2016;166:492–505. doi: 10.1016/j.cell.2016.06.044. - DOI - PMC - PubMed

Publication types