PLoS One. 2021 Sep 30;16(9):e0253102.
doi: 10.1371/journal.pone.0253102. eCollection 2021.

Signal-based optical map alignment

Mehmet Akdel et al. PLoS One. 2021.

Abstract

In genomics, optical mapping technology provides long-range contiguity information to improve genome sequence assemblies and detect structural variation. Originally a laborious manual process, Bionano Genomics platforms now offer high-throughput, automated optical mapping based on chips packed with nanochannels through which unwound DNA is guided and the fluorescent DNA backbone and specific restriction sites are recorded. Although the raw image data obtained is of high quality, the processing and assembly software accompanying the platforms is closed source and does not seem to make full use of data, labeling approximately half of the measured signals as unusable. Here we introduce two new software tools, independent of Bionano Genomics software, to extract and process molecules from raw images (OptiScan) and to perform molecule-to-molecule and molecule-to-reference alignments using a novel signal-based approach (OptiMap). We demonstrate that the molecules detected by OptiScan can yield better assemblies, and that the approach taken by OptiMap results in higher use of molecules from the raw data. These tools lay the foundation for a suite of open-source methods to process and analyze high-throughput optical mapping data. The Python implementations of the OptiTools are publicly available through http://www.bif.wur.nl/.

Conflict of interest statement

No authors have competing interests.

Figures

Fig 1. The principle of BNG optical mapping.
Long DNA molecules (A) are fluorescently labelled at specific sites (B). Signals are then captured (C) in which peaks correspond to these sites.
Fig 2. OptiTools workflow.
OptiScan detects molecules and stores these in a database, which can then be used by OptiMap for molecule-to-molecule or molecule-to-reference alignment, or exported for use with BNG methods in a BNX molecule file.
Fig 3. Frame pairs as produced by the BNG platform.
Each scan produces two frames (images), with labels (top) and backbones (bottom). The top molecule’s label and backbone intensities are illustrated as 1D signals above the frames. In the label frames, peaks correspond to label centers (A), with peak heights indicating label intensity (B). In the backbone frames, stretches of equal intensity delineate the molecules (C) with occasional higher intensity stretches indicating possible DNA entanglement (D). Note that the frames shown here are in a horizontal orientation for illustration purposes.
Fig 4. Frame stitching.
A simplified illustration of rotation (A), horizontal (B) and vertical (C) translation towards the stitched frames (D).
Fig 5. The OptiMap alignment procedure.
A. Allowing for molecule stretching. B. Log-transformed signal correlation scores help confirm alignments based on raw signals. C. Requiring a minimum number of overlapping labels.
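The three ingredients named in this caption can be illustrated with a small sketch. It is not the OptiMap implementation: the function names, the `log1p` transform, the candidate stretch factors, the peak threshold and the brute-force offset scan are all assumptions chosen for illustration, and the toy signals are synthetic.

```python
import numpy as np


def synthetic_signal(length, label_positions, sigma=1.5):
    """Build a toy label signal: one Gaussian peak per label position."""
    x = np.arange(length, dtype=float)
    signal = np.zeros(length)
    for pos in label_positions:
        signal += np.exp(-((x - pos) ** 2) / (2.0 * sigma ** 2))
    return signal


def stretch(signal, factor):
    """Resample a 1D signal to mimic global stretching by `factor` (assumed model)."""
    n = max(2, int(round(len(signal) * factor)))
    old_x = np.linspace(0.0, 1.0, len(signal))
    new_x = np.linspace(0.0, 1.0, n)
    return np.interp(new_x, old_x, signal)


def log_correlation(a, b):
    """Pearson correlation of two equal-length signals after a log1p transform."""
    la, lb = np.log1p(a), np.log1p(b)
    la = la - la.mean()
    lb = lb - lb.mean()
    denom = np.sqrt((la ** 2).sum() * (lb ** 2).sum())
    return float((la * lb).sum() / denom) if denom > 0 else 0.0


def count_peaks(signal, threshold):
    """Count interior local maxima that rise above `threshold`."""
    s = np.asarray(signal, dtype=float)
    mid = s[1:-1]
    return int(np.sum((mid > s[:-2]) & (mid >= s[2:]) & (mid > threshold)))


def best_alignment(query, target, factors=(0.95, 1.0, 1.05),
                   min_labels=3, threshold=0.5):
    """Scan stretch factors and offsets; return (score, offset, factor) or None."""
    best = None
    for factor in factors:
        q = stretch(query, factor)
        if len(q) > len(target):
            continue
        q_peaks = count_peaks(q, threshold)
        for offset in range(len(target) - len(q) + 1):
            window = target[offset:offset + len(q)]
            # Require a minimum number of labels on both sides of the overlap.
            if min(q_peaks, count_peaks(window, threshold)) < min_labels:
                continue
            score = log_correlation(q, window)
            if best is None or score > best[0]:
                best = (score, offset, factor)
    return best
```

For example, taking `query` as a slice of a `synthetic_signal` target and calling `best_alignment(query, target)` should recover the slice's start position with a stretch factor of 1.0. A real aligner would need something faster than this exhaustive scan.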
Fig 6. Molecule detection (yeast) by OptiScan and BNG.
Two examples of cases where OptiScan errs on the side of caution in molecule detection. In both, the BNG molecule detection routine seems to generate erroneous molecules.
Fig 7. Comparison of 3 different contigs assembled by the BNG assembler using molecules extracted by OptiScan and by BNG software.
A. OptiScan molecules can have higher resolution, which often more accurately matches the reference genome (i.e. the in silico generated optical map). B. In some cases, OptiScan-based assembly results in better contiguity. C. A long repeat on chromosome 12, not present in the reference genome, was assembled differently based on the two molecule sets.
Fig 8. Alignment performance for yeast (A) and eggplant data (B,C).
Precision, recall, unique molecules and F1 (harmonic mean) of alignments found for different molecule sets. A. Taken from all yeast contigs (avg. overlap rate 113x). B. Taken from the longest eggplant contigs (> 5 Mb, avg. overlap rate 17x). C. Eggplant molecules with a single overlap.
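As a reminder of how the metrics in this caption relate to one another, the sketch below computes them from two sets of molecule pairs. The pair representation, the function name and the definition of "unique molecules" (distinct molecules appearing in any reported pair) are assumptions made for illustration; the paper may define these differently.

```python
def alignment_metrics(predicted_pairs, true_pairs):
    """Compute precision, recall, F1 and unique-molecule count for pair sets.

    Pairs are treated as unordered, so (1, 2) and (2, 1) are the same overlap.
    """
    predicted = {frozenset(pair) for pair in predicted_pairs}
    truth = {frozenset(pair) for pair in true_pairs}
    true_positives = len(predicted & truth)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    # Assumed definition: distinct molecules seen in any predicted pair.
    unique_molecules = len({mol for pair in predicted for mol in pair})
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "unique_molecules": unique_molecules,
    }
```

With three predicted pairs of which two are among four true overlaps, precision is 2/3, recall is 1/2 and F1 is 4/7.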
Fig 9. Distribution of alignment overlap lengths for pairs aligned only by either of the tools.
Fig 10. Visualization of overlapping parts of molecule pairs aligned by A. OptiMap and B. RefAligner.
All overlaps are shown from OptiMap’s perspective, where only global stretching is applied. Overlaps in OptiMap alignments (A) tend to be shorter, whereas overlaps in RefAligner alignments (B) indicate loss of phase due to local stretching after 200 bp for the top and 150 bp for the bottom alignment.
