Genome Biol Evol. 2021 Aug 3;13(8):evab121.
doi: 10.1093/gbe/evab121.

Characterization of a Sex-Determining Region and Its Genomic Context via Statistical Estimates of Haplotype Frequencies in Daughters and Sons Sequenced in Pools

Richard Cordaux et al. Genome Biol Evol. 2021.

Abstract

Sex chromosomes are generally derived from a pair of autosomes that have acquired a locus controlling sex. Sex chromosomes may evolve reduced recombination around this locus and undergo a long process of molecular divergence. At that point, the original loci controlling sex may be difficult to pinpoint. This difficulty has affected many model species from mammals to birds to flies, which present highly diverged sex chromosomes. Identifying sex-controlling loci is easier in species with molecularly similar sex chromosomes. Here we aimed at pinpointing the sex-determining region (SDR) of Armadillidium vulgare, a terrestrial isopod with female heterogamety (ZW females and ZZ males) and whose sex chromosomes appear to show low genetic divergence. To locate the SDR, we assessed single-nucleotide polymorphism (SNP) allele frequencies in F1 daughters and sons sequenced in pools (pool-seq) in several families. We developed a Bayesian method that uses the SNP genotypes of individually sequenced parents and pool-seq data from F1 siblings to estimate the genetic distance between a given genomic region (contig) and the SDR. This allowed us to assign more than 43 Mb of contigs to sex chromosomes, and to demonstrate extensive recombination and very low divergence between these chromosomes. By taking advantage of multiple F1 families, we delineated a very short genomic region (∼65 kb) that presented no evidence of recombination with the SDR. In this short genomic region, the comparison of sequencing depths between sexes highlighted female-specific genes that have undergone recent duplication, and which may be involved in sex determination in A. vulgare.

Keywords: SNP; gene duplication; pool-seq; recombination; sex chromosomes; terrestrial isopods.

Figures

Fig. 1
Locating the ZW SDR via a cross where the genomes of several F1 siblings per sex are sequenced in pools and parental genomes sequenced individually. Red/blue rods represent chromosomes carrying the W/Z alleles. They possess informative SNPs (heterozygous for mothers and homozygous for fathers) shown as horizontal segments. For each SNP, the maternal allele (as defined in the “Materials and Methods” section) appears in green. In this example, no crossing over occurred between the SNPs and the SDR. The mapping of reads obtained from the pools shows sequences (reads) aligned on the reference genome (bottom sequence), with the three informative SNPs outlined. A sequencing error is shown in red. Tables at the bottom show the values of the variables used to estimate the frequency of the W-linked haplotype (C-A-C) in each pool, based on the mapped reads. See “Statistical Estimation of Haplotype Frequencies” section for the definition of these variables.
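
As a rough illustration of how read counts at informative SNPs constrain the frequency of the maternal haplotype in a pool, the sketch below computes a grid posterior over the number of pooled siblings that carry it. It is not the authors' estimator, whose variables are defined in the "Statistical Estimation of Haplotype Frequencies" section. The function name, the uniform prior, the symmetric sequencing error rate, and the assumption that every sibling contributes equally to the pool are all hypothetical simplifications.

```python
from math import comb

def haplotype_count_posterior(maternal_reads, total_reads, n_individuals,
                              error_rate=0.01):
    """Posterior over k, the number of pooled siblings carrying the haplotype.

    maternal_reads[i] / total_reads[i] are read counts at informative SNP i.
    Assumes a uniform prior on k and independent reads (illustrative only).
    """
    weights = []
    for k in range(n_individuals + 1):
        # Each carrier holds the haplotype on 1 of its 2 chromosomes.
        freq = k / (2 * n_individuals)
        # Probability that a read shows the maternal allele, allowing for
        # a symmetric sequencing error.
        p = freq * (1 - error_rate) + (1 - freq) * error_rate
        likelihood = 1.0
        for m, t in zip(maternal_reads, total_reads):
            likelihood *= comb(t, m) * p**m * (1 - p) ** (t - m)
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical counts at three SNPs for pools of five siblings.
daughters = haplotype_count_posterior([10, 12, 9], [20, 22, 20], 5)
sons = haplotype_count_posterior([0, 0, 1], [20, 20, 20], 5)
print(max(range(6), key=lambda k: daughters[k]))  # most probable k in daughters
print(max(range(6), key=lambda k: sons[k]))       # most probable k in sons
```

With a W-linked haplotype and no crossing over, as in the figure, all daughters and no sons are expected to carry it, so the haplotype frequency would be close to 0.5 in the daughter pool and close to 0 in the son pool.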
Fig. 2
Ten hypothetical SNPs at a locus that has not recombined with the SDR during crosses involving two families. Letters A/T indicate DNA bases (alleles) at the SNPs. (A) Parental genotypes and data from F1 pools of five siblings. Numbers (0, 5, 10) in the F1 tables indicate the number of chromosomes that carry each allele and are only recorded for informative SNPs (otherwise, “NA” is noted). The SDR allele that is linked to the maternal allele (rightmost column of each table) is inferred from allele frequencies in the F1 (see “Estimation of Recombination with the Sex-Determining Locus” section). This inference permits the phasing of parental haplotypes, shown as vertical rods in panel (B). Recombination must have occurred between certain SNPs and the SDR during the divergence of families, barring homoplasy in the SNPs. SNP #4 may not have recombined with the SDR, but it is flanked by two SNPs that must have. Because there is no evidence of recombination between SNP #4 and these two others (these three SNPs constitute three different haplotypes, not four), they constitute a single genomic block delineated by the dotted lines.
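
The "three different haplotypes, not four" argument resembles the classical four-gamete test: two biallelic sites that show all four allele combinations must have recombined (or undergone homoplasy). The sketch below applies that test to phased haplotypes. It is only an illustration of the reasoning, not the authors' block-delineation procedure, and the function name and the toy haplotypes are hypothetical.

```python
def four_gametes(haplotypes, i, j):
    """Return True if sites i and j show all four allele combinations."""
    return len({(h[i], h[j]) for h in haplotypes}) == 4

# Toy phased maternal haplotypes from two families.
# Position 0 is the SDR allele (W or Z); positions 1 and 2 are SNPs (A/T).
haplotypes = [
    "WAT",  # family 1, W chromosome
    "ZTA",  # family 1, Z chromosome
    "WAA",  # family 2, W chromosome
    "ZTT",  # family 2, Z chromosome
]

print(four_gametes(haplotypes, 0, 1))  # SNP 1 vs SDR: only W-A and Z-T occur
print(four_gametes(haplotypes, 0, 2))  # SNP 2 vs SDR: all four combinations occur
```

Sites for which no pair passes the test can be grouped into one block, which is the sense in which SNP #4 and its two flanking SNPs are treated as a single genomic block.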
Fig. 3
Distributions of the inferred genetic distance of A. vulgare contigs to the SDR for real data and for simulated data assuming that all contigs are located on autosomes. Genetic distances are inferred from simulated or observed genetic data from 40 F1 siblings belonging to two families. Vertical dotted lines represent genetic distances corresponding to integer numbers of recombination events during the crosses. Distributions only consider contigs for which both families present informative SNPs (33,875 contigs). Contigs whose inferred distance to the SDR was significantly lower than the distance yielded by simulations were assigned to sex chromosomes and constitute the blue area (see text). The modes of the distributions are lower than 50 cM, despite this value being the expected value for autosomal contigs, because the phasing of maternal haplotypes is less reliable for contigs that are distant from the SDR (supplementary text, fig. S1, Supplementary Material online).
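
The positions of the vertical dotted lines can be sketched under one assumption: that the genetic distance is expressed simply as the percentage of recombinant offspring, with no mapping function. This reading is consistent with the 50 cM expected for autosomal contigs, but it is an inference from the caption, and the function name below is hypothetical.

```python
def distance_cm(n_recombinants, n_offspring):
    """Genetic distance in cM, taken here as the percentage of recombinants."""
    return 100.0 * n_recombinants / n_offspring

# Distances for 0 to 3 recombination events among the 40 F1 siblings.
grid = [distance_cm(r, 40) for r in range(4)]
print(grid)  # [0.0, 2.5, 5.0, 7.5]
```

Under this assumption, each additional recombination event among the 40 F1 siblings shifts the inferred distance by 2.5 cM.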
Fig. 4
Cumulated length of 1,004 A. vulgare contigs that locate below or at a given genetic distance from the SDR. Genetic distances are inferred from genetic data from 40 F1 siblings belonging to two families. The black curve represents observed data, and the colored area is the envelope constructed from the 0.005 and 0.995 quantiles of the cumulated length of contigs simulated under the assumption of uniform crossing over rates along sex chromosomes.
Fig. 5
The 112 A. vulgare contigs that are inferred not to have recombined with the SDR in 40 F1s from two crosses. Each sectored vertical bar of the larger plot represents a contig. Contigs are ranked according to their length. Sectors within bars represent the genomic blocks constituting contigs (see “Localization of Genomic Regions That May Contain the SDR” section). Bar colors represent the SNPs that genomic blocks carry and use the same color codes as in figure 2 and in the inset. The inset shows the total lengths of different categories of genomic blocks according to the SNPs they carry. Blocks belonging to the first three categories (from the top) contain no more than one recombinant SNP and fewer than 50% recombinant SNPs.
Fig. 6
Density of heterozygous SNPs in females as a function of the inferred genetic distance to the SDR for contigs assigned to sex chromosomes (left-hand plot) and its distribution for other contigs (right-hand plot). The diamonds on the left-hand plot represent medians computed for ten classes of genetic distance. Classes are delimited by the deciles and therefore comprise ∼70 contigs each. For the density computation (right-hand plot), contigs were assigned weights equal to the lengths of regions with sufficient sequencing depth to measure heterozygosity (see “Materials and Methods” section).
Fig. 7
Normalized female (red curves) and male (blue curves) sequencing depths on contig 20397, presenting low chromosome quotient (CQ), based on sequenced DNA from the progeny of three A. vulgare families, constituting six pools. Regions of CQ < 0.3 and of female sequencing depth ≥ 5 in all families are represented as light pink areas. Vertical lines represent informative SNPs, including (gray) SNPs that are variable in a single family and (green) SNPs that are compatible with the control of sex (“potentially causal SNPs,” see text). Annotated genes are represented by horizontal dark gray lines under gene identifiers. Exons are shown as thick red bars.
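
The light pink areas combine two thresholds, CQ < 0.3 and female depth ≥ 5, that must hold in every family. The sketch below flags windows meeting both, assuming that CQ is the ratio of male to female normalized depth (so that female-specific sequence has a CQ near 0). That definition is an assumption, since the caption does not spell it out, and the function name, the windowed layout, and the depth values are hypothetical.

```python
def female_specific_mask(female_depths, male_depths,
                         cq_max=0.3, min_female_depth=5):
    """Flag windows with low CQ and sufficient female depth in all families.

    female_depths and male_depths hold one list of per-window normalized
    depths per family. CQ is assumed here to be male depth / female depth.
    """
    n_windows = len(female_depths[0])
    mask = []
    for w in range(n_windows):
        keep = True
        for fam_female, fam_male in zip(female_depths, male_depths):
            f, m = fam_female[w], fam_male[w]
            # Reject the window if female depth is too low (or zero) or
            # if the male/female ratio reaches the CQ threshold.
            if f < min_female_depth or f == 0 or m / f >= cq_max:
                keep = False
                break
        mask.append(keep)
    return mask

# Three families, three windows each (made-up depths).
female = [[20, 18, 3], [22, 19, 25], [21, 20, 24]]
male = [[2, 19, 0], [1, 20, 3], [3, 18, 12]]
print(female_specific_mask(female, male))  # [True, False, False]
```

Requiring the condition in all families is stricter than pooling the data, which limits the regions flagged because of low coverage in a single pool.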
