Genome Biol. 2022 Jan 10;23(1):14.
doi: 10.1186/s13059-021-02599-2.

The genome of oil-Camellia and population genomics analysis provide insights into seed oil domestication

Ping Lin et al. Genome Biol.

Abstract

Background: As a perennial crop, oil-Camellia has a long domestication history and produces high-quality seed oil that is beneficial to human health. Camellia oleifera Abel., a sister species of the tea plant, is extensively cultivated for edible oil production. However, understanding of the molecular mechanism of oil-Camellia domestication is still limited due to the lack of sufficient genomic information.

Results: To elucidate the genetic and genomic basis of evolution and domestication, here we report a chromosome-scale reference genome of wild oil-Camellia (2.95 Gb), together with transcriptome sequencing data of 221 cultivars. The oil-Camellia genome, assembled by an integrative approach of multiple sequencing technologies, contains a large proportion of repetitive elements (76.1%) and shows high heterozygosity (2.52%). We construct a genetic map of high-density corrected markers by sequencing the controlled-pollination hybrids. Genome-wide association studies reveal a subset of artificially selected genes that are involved in the oil biosynthesis and phytohormone pathways. In particular, we identify the elite alleles of genes encoding sugar-dependent triacylglycerol lipase 1, β-ketoacyl-acyl carrier protein synthase III, and stearoyl-acyl carrier protein desaturases; these alleles play important roles in enhancing the yield and quality of seed oil during oil-Camellia domestication.

Conclusions: We generate a chromosome-scale reference genome for oil-Camellia plants and demonstrate that the artificial selection of elite alleles of genes involved in oil biosynthesis contributes to oil-Camellia domestication.

Keywords: Domestication; Genome; Genome-wide association analysis; Oil biosynthesis; Oil-Camellia; Population genomics.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
The landscape of genome structure and evolutionary analyses of the CON genome. A A circular representation of the characteristics of the assembled CON genome. The layers of circles are, in order: (a) pseudo-molecules of oil-Camellia chromosomes; (b–e) the distribution of GC density, repetitive element density, gene models, and non-coding RNA genes; (f) the syntenic blocks of the CON genome. B Distribution of silent divergence rates (Ks) between gene pairs within the Diospyros (KsD), Actinidia (KsA), and CON genomes (Camellia, KsY). Camellia shows two peaks, indicated by red arrows, which correspond to the Ad-β and γ duplications. The Ad-α and Dd-α duplications found in Actinidia and Diospyros are indicated by blue and green arrows. C The Ks distribution of orthologous genes between Camellia-Actinidia (KsYA), Camellia-Diospyros (KsYD), and Actinidia-Diospyros (KsAD); the peaks (indicated by the black arrows) indicate the speciation events. D A time-calibrated phylogenetic tree of related plant species. The whole-genome duplication events are indicated by the colored circles. The divergence time of each branch is given by the number on that branch. Ci.sinensis, Citrus sinensis; A.thaliana, Arabidopsis thaliana; P.trichocarpa, Populus trichocarpa; V.vinifera, Vitis vinifera; C.oleifera, Camellia oleifera; Ca.sinensis, Camellia sinensis; A.chinensis, Actinidia chinensis; D.kaki, Diospyros kaki; A.trichopoda, Amborella trichopoda. Grey circle, the γ triplication event; Orange circle, the Dd-α duplication; Green circle, the Ad-α duplication; Magenta circle, the Ad-β duplication.
Fig. 2
Comparative analyses of LTR elements. A The age distribution (LTR insertion time) of LTR elements in the CON genome and other higher plants. The dashed line indicates the timepoint of 4 million years. Aqc, Aquilegia coerulea; Art, Arabidopsis thaliana; Cap, Carica papaya; Cis, Citrus sinensis; Mig, Mimulus guttatus; Nen, Nelumbo nucifera; Pot, Populus trichocarpa; Sol, Solanum lycopersicum; CON, Camellia oleifera. The genome sequences of other plant species are derived from Phytozome v12.1.6. B, C The phylogenetic analysis of the Copia and Gypsy LTR elements in the CON genome. Different color shades indicate the expansion of the LTR subfamilies.
Fig. 3
Genetic architecture and analysis of the population of oil-Camellia cultivars. A The PCA analysis of C. oleifera accessions. Point colors indicate the origins of the cultivars, which were collected from different areas of China. Source data are provided as Table S20. SEF, South East of Fujian; SWG, South West of Guangxi; MJX, Middle of Jiangxi; MZJHN, Middle of Zhejiang and Hunan. B The population structure of the 221 accessions in the association population, inferred by admixture analysis. Numbers on the y-axis indicate the membership coefficient. Each bar on the x-axis represents an accession; colored segments within one bar reflect the proportional contributions of each subpopulation to this individual. The subpopulations are separated by dashed lines. The black bracket indicates the wild accessions. C The phylogenetic analysis and phenotypic evaluation of 221 C. oleifera accessions. (a) The seven red frames are the seven subpopulations defined based on the phylogenetic tree. (b–h) are the distributions of average fruit yield per tree (kg) for the seven subpopulations, respectively. (i–o) are the distributions of average weight per fruit (g) for the subpopulations, respectively. (p–v) are the distributions of seed number per fruit for the subpopulations, respectively.
Fig. 4
Genome-wide identification and annotations of candidate genes involved in the seed oil domestication. A The strategy of association analyses in this study. Three categories of association analyses are performed: GWAS, qGWAS, and eQTL. B, G, H are the Manhattan plots displaying the GWAS results for OC, palmitic acid content and stearic acid content, respectively. The y-axis is the –log10(P values). Each point represents a molecular marker. Horizontal dashed lines indicate the significance level of P value = 10E-05. The labeled SNP loci mark the candidate genes involved in oil biosynthesis. C, D, E, F correspond to the four candidate genes (SDP1, IAA26, FabD, and Oleosin3, respectively) that are involved in the OC selection. I, J are two candidate genes (SAC8 and KASIII) involved in the palmitic acid content selection. K, L, M are three genes (GDL57, GLPK, and SAD1) involved in the stearic acid content selection. In each panel (including C-F and I-M), the distributions of the corresponding traits and gene expression are shown on the diagonal. The bivariate scatter plots with best-fit lines are shown below the diagonal, to the bottom left. Correlation coefficients are shown above the diagonal. “***,” “**,” and “*” denote significance at P values of 0.001, 0.01, and 0.05, respectively. Red and blue denote the positive and negative correlations, respectively.
Fig. 5
A proposed molecular diagram of the metabolic pathway comprising the genes associated with ORTs in C. oleifera. The biochemical processes and enzyme functions are based on previous reports [23, 24]. A The de novo biosynthesis process of free fatty acids in plastid; B fatty acid modification, acyl editing and TAG assembly in endoplasmic reticulum; C and D oil storage, transportation and breakdown in cytosol and plasma membrane. The key enzymes of the oil metabolism pathway are indicated in the light red boxes. The squares and hexagons indicate genes identified in GWAS or qGWAS, respectively. Colored squares and circles represent the significant associations with different oil traits. The yellow circle indicates the oil body. C16:0, palmitic acid; C16:1, palmitoleic acid; C18:0, stearic acid; C18:1, oleic acid; C18:2, linoleic acid; C18:3, linolenic acid; C20:1, cis-11-eicosenoic acid; Acyl-CoA, acyl-coenzyme A; ACC, acetyl-CoA carboxylase; FabD, malonyl CoA:ACP transacylase; KAS I, II, III, β-ketoacyl-[acyl carrier protein] synthase I, II, III; SAD, stearoyl ACP desaturase; FatA and FatB, fatty acyl-ACP thioesterase A, B; FFA, free fatty acid; LACS, long-chain acyl-CoA synthase; LPCAT, acyl-CoA: lysophosphatidylcholine acyltransferase; PC, phosphatidylcholine; FAD2 and FAD6, fatty acid Delta-12 desaturase; FAD3 and FAD7, fatty acid Delta-15 desaturase; PUFA, polyunsaturated fatty acid; G3P, glycerol-3 phosphate; GPAT, G3P acyltransferase; LPA, lyso-phosphatidic acid; LPAAT, LPA acyltransferase; PA, phosphatidic acid; PAP, phosphatidic acid phosphatase; DAG, diacylglycerol; DGAT, diacylglycerol acyltransferase; PLC and PLD, phospholipase C and D; DGK, diacylglycerol kinase; SDP1, sugar-dependent triacylglycerol lipase; DGL, diacylglycerol lipase; lip1, monoacylglycerol lipase; GLPK, Glycerol kinase; LTPG1, non-specific lipid transfer protein GPI-anchored 1; GDSL and GDL57, GDSL esterase/lipase.
Fig. 6
The evaluation of genotype groups of candidate genes involved in the domestication of oil-Camellia. A SDP1, B IAA26, C FabD, D Oleosin3, E SAC8, F KASIII, G SAD1, H SAD6. The upper panel shows the distribution of the expression level of the candidate gene, and the lower panel shows the distribution of the corresponding trait in different genotypes. In each boxplot, the middle line of the box represents the median, the first (25%) and third (75%) quartiles are indicated by the lower and upper boundaries, and the whiskers show the minimum and maximum values, excluding outliers. The black points outside the whiskers indicate the outliers. The x-axis indicates the genotypes and the number of individuals of each genotype.
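
The boxplot convention described for Fig. 6 (median, first and third quartiles, whiskers at the minimum and maximum after excluding outliers) can be illustrated with a minimal Python sketch. The caption does not state how outliers are defined, so the sketch assumes the common Tukey rule (points beyond 1.5 times the interquartile range from the box); the function name `box_summary`, the parameter `k`, and the example values are hypothetical and are not data from the paper.

```python
import statistics


def box_summary(values, k=1.5):
    """Summarize one genotype group the way a boxplot draws it.

    Assumption: outliers are points beyond k * IQR from the quartiles
    (Tukey's rule); the figure caption does not specify the rule used.
    Needs at least two values.
    """
    data = sorted(values)
    # Quartiles by linear interpolation over the observed range.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        # Whiskers end at the most extreme values that are not outliers.
        "whisker_low": inliers[0],
        "whisker_high": inliers[-1],
        "outliers": outliers,
    }


# Hypothetical trait values for one genotype group (illustration only).
example = [30.1, 31.4, 32.0, 32.8, 33.5, 34.2, 35.0, 47.9]
summary = box_summary(example)
```

Under this assumed rule, the largest example value falls outside the upper fence and would be drawn as a separate black point, while the upper whisker stops at the largest remaining value.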

References

    1. Doebley JF, Gaut BS, Smith BD. The molecular genetics of crop domestication. Cell. 2006;127(7):1309–1321. doi: 10.1016/j.cell.2006.12.006.
    2. Huang X, Han B. Natural variations and genome-wide association studies in crop plants. Annu Rev Plant Biol. 2014;65(1):531–551. doi: 10.1146/annurev-arplant-050213-035715.
    3. Tang H, Sezen U, Paterson AH. Domestication and plant genomes. Curr Opin Plant Biol. 2010;13(2):160–166. doi: 10.1016/j.pbi.2009.10.008.
    4. Zhang H, Ren S. Theaceae. Beijing: Science Press; 1998.
    5. Gao DF, Xu M, Zhao P, Zhang XY, Wang YF, Yang CR, Zhang YJ. Kaempferol acetylated glycosides from the seed cake of Camellia oleifera. Food Chem. 2011;124(2):432–436. doi: 10.1016/j.foodchem.2010.06.048.
