Comment
GigaByte. 2021 Jun 10;2021:gigabyte24.
doi: 10.46471/gigabyte.24. eCollection 2021.

Improvements in the sequencing and assembly of plant genomes

Priyanka Sharma et al. GigaByte.

Abstract

Advances in DNA sequencing have made it easier to sequence and assemble plant genomes. Here, we extend an earlier study, and compare recent methods for long read sequencing and assembly. Updated Oxford Nanopore Technology software improved assemblies. Using more accurate sequences produced by repeated sequencing of the same molecule (Pacific Biosciences HiFi) resulted in less fragmented assembly of sequencing reads. Using data for increased genome coverage resulted in longer contigs, but reduced total assembly length and improved genome completeness. The original model species, Macadamia jansenii, was also compared with three other Macadamia species, as well as avocado (Persea americana) and jojoba (Simmondsia chinensis). In these angiosperms, increasing sequence data volumes caused a linear increase in contig size, decreased assembly length and further improved already high completeness. Differences in genome size and sequence complexity influenced the success of assembly. Advances in long read sequencing technology continue to improve plant genome sequencing and assembly. However, results were improved by greater genome coverage, with the amount needed to achieve a particular level of assembly being species dependent.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1.
Protocol for IPA assembly for PacBio HiFi reads [9]. https://www.protocols.io/widgets/doi?uri=dx.doi.org/10.17504/protocols.io.buxvnxn6
Figure 2.
Influence of data volume on assembly for Macadamia species. N50 of contigs is plotted against the genome coverage. Genome sizes used to calculate coverage were: M. integrifolia, 895 Mb [20]; M. jansenii, 780 Mb [1]; M. tetraphylla, 758 Mb [27]; and M. ternifolia, 758 Mb (not known, but assumed to be the same as M. tetraphylla owing to similar assembly size).
Figure 3.
Influence of data volume on assembly for diverse species. N50 of contigs is plotted against the genome coverage. Genome sizes used to calculate coverage were: jojoba, 1003 Mb [22]; avocado, 920 Mb [28]; and as in Figure 2 for Macadamia species.
Figure 4.
Decrease in length of total assembly as more genome coverage is used in the assembly.
Figure 5.
Improvement in genome completeness (BUSCO%) with genome coverage.
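
The two quantities plotted in Figures 2 and 3 can be illustrated with a minimal Python sketch. It assumes the conventional definitions: coverage is total sequenced bases divided by genome size, and contig N50 is the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. The genome sizes are those given in the captions above. The function names and the example read volume and contig lengths are hypothetical and are not data from the study.

```python
# Genome sizes (Mb) as stated in the captions of Figures 2 and 3.
GENOME_SIZE_MB = {
    "M. integrifolia": 895,
    "M. jansenii": 780,
    "M. tetraphylla": 758,
    "M. ternifolia": 758,  # assumed equal to M. tetraphylla in the source
    "jojoba": 1003,
    "avocado": 920,
}


def coverage(total_read_bases, species):
    """Fold coverage: total sequenced bases divided by genome size in bases."""
    return total_read_bases / (GENOME_SIZE_MB[species] * 1_000_000)


def n50(contig_lengths):
    """Contig N50: length of the contig at which the running total,
    taken from longest to shortest, first reaches half the assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(coverage(15_600_000_000, "M. jansenii"))
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because coverage depends on the assumed genome size, the same data volume gives a different coverage for each species, which is why the captions list the sizes used.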

Comment on

  • Comparison of long-read methods for sequencing and assembly of a plant genome.
    Murigneux V, Rai SK, Furtado A, Bruxner TJC, Tian W, Harliwong I, Wei H, Yang B, Ye Q, Anderson E, Mao Q, Drmanac R, Wang O, Peters BA, Xu M, Wu P, Topp B, Coin LJM, Henry RJ. Gigascience. 2020 Dec 21;9(12):giaa146. doi: 10.1093/gigascience/giaa146. PMID: 33347571. Free PMC article.

References

    1. Murigneux V, et al. Comparison of long-read methods for sequencing and assembly of a plant genome. Gigascience, 2020; 9(12): giaa146. doi:10.1093/gigascience/giaa146.
    2. Michael TP, Van Buren R. Building near-complete plant genomes. Curr. Opin. Plant Biol., 2020; 54: 26–33. doi:10.1016/j.pbi.2019.12.009.
    3. Hon T, et al. Highly accurate long-read HiFi sequencing data for five complex genomes. Sci. Data, 2020; 7: 399. doi:10.1038/s41597-020-00743-4.
    4. Lang D, et al. Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore. Gigascience, 2020; 9(12): giaa123. doi:10.1093/gigascience/giaa123.
    5. Cheng B, Furtado A, Henry RJ. Long-read sequencing of the coffee bean transcriptome reveals the diversity of full-length transcripts. Gigascience, 2017; 6(11): gix086. doi:10.1093/gigascience/gix086.
