Hortic Res. 2019 Apr 6;6:44.
doi: 10.1038/s41438-019-0125-7. eCollection 2019.

Low-cost assembly of a cacao crop genome is able to resolve complex heterozygous bubbles

Joe Morrissey et al. Hortic Res. 2019.

Abstract

Cacao (Theobroma cacao) is a tropical tree that produces the essential raw material for chocolate. Because yields have been stagnant, land use has expanded to meet increasing chocolate demand. Assembled genomes of key parents could modernize breeding programs in the remote and under-resourced locations where cacao is grown. The MinION, a long-read sequencer that runs off a laptop computer, has the potential to facilitate the assembly of the complex genomes of high-yielding F1 hybrids. Here, we validate the MinION's application to heterozygous crops by creating a de novo genome assembly of a key parent in breeding programs, the clone Pound 7. Our MinION-only assembly was 20% larger than the latest released cacao genome, had 10-fold greater contiguity, and resolved complex heterozygosity and repetitive elements. Polishing with Illumina short reads brought the predicted completeness of our assembly to a level similar to that of the previously released cacao genome assemblies. In contrast to previous cacao genome projects, our assembly required only a small scientific team and limited reagents. Our sequencing and assembly methods could easily be adopted by under-resourced breeding programs, speeding crop improvement in the developing world.

Conflict of interest statement

Compliance with ethical standards: The authors declare that they have no conflict of interest.

Figures

Fig. 1. Pound 7 is a better representative of both wild cacao and elite F1 hybrids than the previous cacao reference genomes.
a Relative change in worldwide cacao yields, compared with the change in total production and the amount of land used to produce the cacao, normalized to the 1961 data points. Data from FAO.org. b Relative worldwide yield gains for cacao compared with the two other crops. Data from FAO.org. Values are relative, normalized to the 1961 data point. c The site of collection of Pound 7 is represented by the red square. The center of cacao's diversity is the blue circle. The previous cacao reference genomes are Matina 1–6 (associated with Costa Rica) and B97–61/B2 (associated with Belize). The satellite image (adapted from Google Maps) shows the coordinates of where Pound 7 was collected, 73.10 W 3.45 S (Pound, 1943; Turnbull, C.J. and Hadley, International Cocoa Germplasm Database). d Pound 7 is a heterozygous wild hybrid of several ancestral groups (Motamayor et al.), in contrast to the two domesticated cacao accessions that were used for the previous reference genomes. e The number of heterozygous SNP calls out of 135,696 SNPs from resequenced cacao genomes shows the range of heterozygosity in cacao accessions. The current reference genomes are the "highly homozygous" B97–61/B2, and Matina 1–6. Both are relatively homozygous compared with Pound 7 and the widely cultivated F1 hybrid CCN 51.
Fig. 2. Our assembly resolved a highly heterozygous locus.
The top panel is the alignment of BAC sequences for heterozygous haplotypes on Chromosome 4 of Pound 7. When aligned to the Matina 1–6 genome, Haplotype A maps to Scaffold 4: 142672 to 324489, and Haplotype B maps to Scaffold 4: 181600 to 392073. The lower two panels show the two haplotypes resolved in the MinION assembly, aligned to the BAC sequences. A summary of the annotations is in Supplementary Table 10. Haplotype A is deposited on NCBI as P7SI_AltHap_V3 and Haplotype B is P7SI_MatHap_V3.

References

    1. Michael TP, Jackson S. The first 50 plant genomes. Plant Genome. 2013;6:1–7.
    2. Ong-Abdullah M, et al. Loss of Karma transposon methylation underlies the mantled somaclonal variant of oil palm. Nature. 2015;525:533–537. doi: 10.1038/nature15365.
    3. Argout X, et al. The genome of Theobroma cacao. Nat. Genet. 2011;43:101–108. doi: 10.1038/ng.736.
    4. Argout X, et al. The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies. BMC Genom. 2017;18:730. doi: 10.1186/s12864-017-4120-9.
    5. Motamayor JC, et al. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color. Genome Biol. 2013;14:r53. doi: 10.1186/gb-2013-14-6-r53.