Comput Struct Biotechnol J. 2023 Nov 7:21:5738-5750.
doi: 10.1016/j.csbj.2023.11.003. eCollection 2023.

JLOH: Inferring loss of heterozygosity blocks from sequencing data

Matteo Schiavinato et al. Comput Struct Biotechnol J.

Abstract

Heterozygosity is a genetic condition in which two or more alleles are found at a genomic locus. Individuals that are the offspring of genetically divergent yet still interfertile parents (e.g. hybrids) are highly heterozygous. One of the most studied aspects of the genomes of these individuals is loss of heterozygosity (LOH), which occurs when heterozygous sites lose one of their two alleles, either by converting it to the other or by remaining hemizygous at that site. The region undergoing LOH may involve a single nucleotide polymorphism (SNP) or a longer stretch of DNA. LOH is deeply interconnected with adaptation, but the in silico techniques used to infer evolutionarily relevant LOH blocks are poorly standardised, and a general tool to infer and analyse them across genomic contexts and species is missing. Here, we present JLOH, a computational toolkit for the inference and exploration of LOH blocks in genomes with at least 1% heterozygosity. JLOH requires only commonly available genomic sequencing data as input. Starting from mapped reads, called variants and a reference genome sequence, JLOH infers candidate LOH blocks based on SNP density (SNPs/kbp) and read coverage per position. Considering that most organisms that undergo extensive LOH are hybrids, JLOH has been designed to capture any subgenomic LOH pattern, assigning each LOH block to its subgenome of origin.

Keywords: Heterozygosity; Hybrids; LOH; Pipeline; Sequencing; Variation.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Schematic representing the rationale of “jloh extract”. A) SNPs are parsed by the program from the provided VCF file(s), subdividing them into heterozygous (gray) and homozygous (cyan). B) Regions rich in heterozygous SNPs are identified (light gray) based on SNP density criteria. Dotted red lines represent intervals containing heterozygous SNPs that do not fit the SNP density thresholds either due to the number of SNPs (c) or to the distance between them (w). C) Candidate LOH blocks, i.e. regions depleted in heterozygous SNPs are extracted (cyan), which are complementary to the identified heterozygous regions. D) Coverage per position is assessed for each candidate LOH block (cyan) and zygosity is inferred from it (horizontal light gray bars). E) When using data from hybrids, homozygous SNPs are used to subdivide candidate LOH blocks into reference (REF, orange) and alternative (ALT, cyan) allele based on the same mechanism explained in (B) for heterozygous SNPs. F) If specified by the user, candidate LOH blocks overlapping known LOH blocks are filtered out, leaving only the newly found ones.
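Steps (B) and (C) of the schematic can be illustrated with a minimal sketch. This is not JLOH's implementation: it assumes that `min_snps` plays the role of the SNP-count threshold (c) and `max_gap` the role of the distance threshold (w), that coordinates are 1-based and inclusive, and that a heterozygous region spans from its first to its last SNP. The function names and the example positions are hypothetical.

```python
def het_regions(positions, max_gap, min_snps):
    """Cluster heterozygous SNP positions into dense regions.

    Consecutive SNPs stay in the same cluster while they are at most
    `max_gap` bp apart; clusters with fewer than `min_snps` SNPs are dropped.
    (Both thresholds are illustrative assumptions.)
    """
    regions = []
    cluster = []
    for pos in sorted(positions):
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= min_snps:
                regions.append((cluster[0], cluster[-1]))
            cluster = []
        cluster.append(pos)
    if len(cluster) >= min_snps:
        regions.append((cluster[0], cluster[-1]))
    return regions


def complement(regions, chrom_length):
    """Return the 1-based inclusive intervals not covered by `regions`."""
    blocks = []
    start = 1
    for r_start, r_end in regions:
        if r_start > start:
            blocks.append((start, r_start - 1))
        start = r_end + 1
    if start <= chrom_length:
        blocks.append((start, chrom_length))
    return blocks


# Hypothetical heterozygous SNP positions on a 10 kbp sequence
het_snps = [100, 150, 220, 5000, 9000, 9050, 9120, 9180]
regions = het_regions(het_snps, max_gap=100, min_snps=3)
candidates = complement(regions, chrom_length=10000)
print(regions)     # heterozygous regions
print(candidates)  # candidate LOH blocks
```

In this toy input, the isolated SNP at position 5000 does not meet the density criteria, so it ends up inside a candidate LOH block rather than splitting it.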
Fig. 2
Schematic displaying the different modules available within the JLOH toolkit, with brief explanation.
Fig. 3
Sensitivity (TP) vs specificity (TN) rates in all the runs performed with simulated data, subdivided in facets corresponding to the simulated sequence divergence between subgenomes of the simulated hybrid data. Divergence rates are indicated from 0.01 (1%) to 0.20 (20%). The orange box in each facet represents “good” runs (TP and TN ≥ 0.75) while the red box represents “excellent” runs (TP and TN ≥ 0.9).
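The thresholds in this caption can be made concrete with a short sketch. It uses the standard definitions of sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP); how the counts are obtained (for example per position or per block) is not stated here, so the inputs below are hypothetical. Because every "excellent" run also satisfies the "good" criterion, the function reports only the highest tier reached.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def classify_run(sensitivity, specificity):
    """Label a run using the caption's thresholds (0.75 and 0.9)."""
    if sensitivity >= 0.9 and specificity >= 0.9:
        return "excellent"
    if sensitivity >= 0.75 and specificity >= 0.75:
        return "good"
    return "other"


# Hypothetical counts for three simulated runs
print(classify_run(*rates(tp=90, fn=10, tn=95, fp=5)))
print(classify_run(*rates(tp=80, fn=20, tn=78, fp=22)))
print(classify_run(*rates(tp=80, fn=20, tn=70, fp=30)))
```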
Fig. 4
(A) Tree based on variants depicting phylogenetic relationships between C. orthopsilosis hybrid strains. The four known clades are marked. (B) Average of homozygous and heterozygous SNP densities of C. orthopsilosis strains belonging to clades 1–4.
Fig. 5
The number and size of inferred LOH blocks depend on the quantile values used. (A) Number and size of LOH blocks, and total bp of LOH, in all C. orthopsilosis hybrid strains using different quantile values. (B) Number of shared LOH blocks between four representative strains of each clade.
Fig. 6
Density of LOH signal along chromosome 12 of S. cerevisiae (A) and of S. paradoxus (B) computed in windows of 10 kb in five S. cerevisiae x S. paradoxus hybrids. The figure was generated with “jloh plot”. Colors represent allele assignment (REF or ALT) while color intensity represents the fraction of positions in each window that were in a predicted LOH block. Note that in (A) “blue” corresponds to S. paradoxus alleles while in (B) it corresponds to S. cerevisiae alleles, as it generally represents the alternative allele.
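The per-window quantity described in this caption, the fraction of positions in each 10 kb window that fall inside a predicted LOH block, can be sketched as follows. This is an illustration only and not the code behind "jloh plot": it assumes non-overlapping blocks given as 1-based inclusive intervals, and the function name and example coordinates are hypothetical.

```python
def loh_fraction_per_window(blocks, chrom_length, window=10_000):
    """Fraction of positions covered by LOH blocks in each window.

    `blocks` is a list of non-overlapping (start, end) intervals,
    1-based and inclusive. The last window may be shorter than `window`.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    fractions = []
    for i in range(n_windows):
        w_start = i * window + 1
        w_end = min((i + 1) * window, chrom_length)
        covered = 0
        for b_start, b_end in blocks:
            overlap = min(b_end, w_end) - max(b_start, w_start) + 1
            if overlap > 0:
                covered += overlap
        fractions.append(covered / (w_end - w_start + 1))
    return fractions


# Hypothetical LOH blocks on a 25 kbp sequence
blocks = [(1, 5000), (15001, 25000)]
fractions = loh_fraction_per_window(blocks, chrom_length=25000)
print(fractions)
```

Running this once per allele assignment (REF or ALT) would give one density track per colour, with the fraction mapped to colour intensity.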
