BMC Bioinformatics. 2006 Apr 27;7:227.
doi: 10.1186/1471-2105-7-227.

MIDAS: software for analysis and visualisation of interallelic disequilibrium between multiallelic markers

Tom R Gaunt et al. BMC Bioinformatics.

Abstract

Background: Various software tools are available for the display of pairwise linkage disequilibrium across multiple single nucleotide polymorphisms. The HapMap project also presents these graphics on its website. However, these approaches are limited in their use of data from multiallelic markers and provide limited information in graphical form.

Results: We have developed a software package (MIDAS - Multiallelic Interallelic Disequilibrium Analysis Software) for the estimation and graphical display of interallelic linkage disequilibrium. Linkage disequilibrium is analysed for each allelic combination (of one allele from each of two loci), between all pairwise combinations of any type of multiallelic loci in a contig (or any set) of many loci (including single nucleotide polymorphisms, microsatellites, minisatellites and haplotypes). Data are presented graphically in a novel and informative way, and can also be exported in tabular form for other analyses. This approach facilitates visualisation of patterns of linkage disequilibrium across genomic regions, analysis of the relationships between different alleles of multiallelic markers and inferences about patterns of evolution and selection.

Conclusion: MIDAS is a linkage disequilibrium analysis program with a comprehensive graphical user interface providing novel views of patterns of linkage disequilibrium between all types of multiallelic and biallelic markers.

Figures

Figure 1
Flow-chart of MIDAS from the user's perspective. Rectangles indicate user inputs, ovals indicate program functions.
Figure 2
Screenshots of MIDAS. (a) A region of chromosome 11 showing 30 markers. Green lines indicate relative position of markers. Yellow intensity indicates distance between pairwise markers. Placing the mouse over a feature provides details. (b) A pairwise plot for two microsatellites (zoomed in). Significant results are boxed in red (D' ≥ 0) or blue (D'<0). Placing the mouse cursor over an allele pair provides details and statistics and also plots that pair at the bottom right of the screen. Magnitude of |D'| is also plotted (middle right). (c) A SNP/microsatellite pair (zoomed in). This is identical to the microsatellite/microsatellite plot, but with only two alleles in one dimension. (d) A SNP/SNP pair (zoomed in). The plot is oriented to place the most frequent alleles for both SNPs in the top left. Statistics can be observed by placing the mouse over an allele pair. For SNPs a magnified plot is not shown, but the |D'| graph is still used (middle right).
Figure 3
Representation of haplotype frequencies and LD in MIDAS. For SNPs the expected (under no LD) haplotype frequencies for each allele combination are plotted with an unfilled, black rectangle divided into four quadrants by two lines. The estimated haplotype frequencies are represented by solid red or blue rectangles. Where a coloured rectangle exceeds the size of the black rectangle it is coloured red, indicating an excess of that haplotype (D' ≥ 0). The opposite situation is indicated by a blue rectangle (D'<0). For multi-allelic markers the principle is the same, but a separate plot is shown for each combination of alleles, i.e. locus 1 allele i/allele not-i, locus 2 allele j/allele not-j.
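The comparison drawn in each plot can be sketched numerically. The following is a minimal illustration, not MIDAS code: it assumes the haplotype frequency for allele i at locus 1 with allele j at locus 2 has already been estimated, and it uses the standard Lewontin normalisation for D'. The function name `allele_pair_ld` and its parameters are hypothetical.

```python
def allele_pair_ld(h_ij, p_i, q_j):
    """Compare an estimated haplotype frequency with its no-LD expectation.

    h_ij: estimated frequency of the haplotype carrying allele i and allele j
    p_i:  frequency of allele i at locus 1 (all other alleles count as not-i)
    q_j:  frequency of allele j at locus 2 (all other alleles count as not-j)
    """
    expected = p_i * q_j              # black rectangle: frequency under no LD
    d = h_ij - expected               # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_i * (1 - q_j), (1 - p_i) * q_j)
    else:
        d_max = min(p_i * q_j, (1 - p_i) * (1 - q_j))
    d_prime = d / d_max if d_max > 0 else 0.0
    # Colour convention from the caption: red for D' >= 0, blue for D' < 0.
    colour = "red" if d_prime >= 0 else "blue"
    return expected, d, d_prime, colour


if __name__ == "__main__":
    # Excess of the haplotype relative to expectation.
    print(allele_pair_ld(0.40, 0.5, 0.5))
    # Deficit of the haplotype relative to expectation.
    print(allele_pair_ld(0.10, 0.5, 0.5))
```

Collapsing each multiallelic locus to allele i versus not-i, as the caption describes, is what allows the same two-by-two calculation to be reused for every allele pair.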
Figure 4
The format of a MIDAS input file. Data are raw genotypes in a tab-delimited text file. Row 1 contains marker names, row 2 contains positions. Markers should be sorted in position order for clarity. Alleles should be delimited by an underscore ("_"), and can be any valid letter or number. Where numbers are used, ensure that the same number of digits is used for all alleles (e.g. 094, 098, 102) to preserve size order in the alphanumeric sort. There must be no more than one blank line at the end of the data and all null values must be coded as "?_?".
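A reader for this layout might look like the sketch below. It is an illustration under stated assumptions rather than the MIDAS parser: it assumes every column is a marker (no sample identifier column), that positions are numeric, and that each genotype cell holds exactly two alleles. The function name `parse_midas_text` is hypothetical.

```python
import csv
import io


def parse_midas_text(text):
    """Parse tab-delimited genotype text laid out as the caption describes.

    Returns (marker_names, positions, genotypes), where genotypes maps each
    marker name to a list of (allele1, allele2) tuples, or None for "?_?".
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # Drop empty rows, such as a single trailing blank line.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if len(rows) < 2:
        raise ValueError("need a marker-name row and a position row")

    names = [cell.strip() for cell in rows[0]]
    positions = [float(cell) for cell in rows[1]]  # assumption: numeric positions
    if len(positions) != len(names):
        raise ValueError("position row does not match marker-name row")

    genotypes = {name: [] for name in names}
    for row in rows[2:]:
        if len(row) != len(names):
            raise ValueError("genotype row has the wrong number of columns")
        for name, cell in zip(names, row):
            first, second = cell.strip().split("_")  # alleles joined by "_"
            missing = "?" in (first, second)
            genotypes[name].append(None if missing else (first, second))
    return names, positions, genotypes


if __name__ == "__main__":
    sample = "M1\tM2\n100\t250\n094_098\tA_G\n?_?\tA_A\n"
    print(parse_midas_text(sample))
```

Alleles are kept as strings so that zero-padded sizes such as 094 keep their alphanumeric sort order, which is the reason the caption asks for a fixed number of digits.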
Figure 5
Use of MIDAS SNP/SNP plots to infer evolutionary history. The haplotype on which a SNP first arose is indicated by the estimated frequency of the haplotype carrying the most frequent alleles at both loci. (i) If this is less than expected, it implies that SNP 2 arose on the haplotype carrying the common allele at SNP 1 (i.e. D' < 0). (ii) If it is more common than expected, then SNP 2 arose on the haplotype carrying the rare allele at SNP 1 (i.e. D' ≥ 0). (iii) If only two haplotypes are observed then perfect LD exists (r² = 1). This may arise through bottlenecks, selection or simultaneous occurrence.
Figure 6
LD between a complex microsatellite and SNPs. (a) Previous work [22] indicated SNP alleles in LD with two size ranges of the CSH1.01 microsatellite. The lower size range has dinucleotide spacing, the upper has tetranucleotide spacing. This suggested two major lineages. (b) Plotting interallelic LD between a SNP (GH1V004) and the CSH1.01 microsatellite demonstrates clear LD with the two lineages. The common SNP alleles associate with the lower size range and the rare SNP alleles associate with the upper size range. Results are boxed in red where the haplotype frequency is significantly higher than expected (D' ≥ 0) and blue where it is significantly lower (D'<0). (c) SNP haplotypes (four SNPs, including GH1V004) confirm these findings and demonstrate the ability of MIDAS to handle haplotype data as a multi-allelic marker.
Figure 7
LD between the INS VNTR and the TH01 microsatellite. Each TH01 allele associates with a size range of VNTR alleles (256 and 263 associate with the class III alleles). This implies a greater rate of mutation in the VNTR, because a wider range of allele sizes in the VNTR dimension is significantly associated with TH01 alleles than vice versa. Close-ups of individual allele plots are shown to indicate the magnitude of effect: the black rectangle indicates the expected haplotype frequency under no LD, and the coloured rectangle indicates the estimated haplotype frequency.
Figure 8
Visualisation of regions of perfect LD. Marker pairs can have either two or three haplotypes present when D' = 1. Most programs do not distinguish between these graphically, despite the potential biological importance. (a) The BRCA1 region on chromosome 17. Using HapMap data [29,30], MIDAS shows only two haplotypes for many SNPs (perfect LD, r² = 1). (b) Allele frequencies from HapMap data [29,30] show that many SNPs in regions with only two haplotypes share the same minor allele frequency (MAF) (e.g. the BRCA1 region on chromosome 17), whereas (c) nearby regions have a mixture of MAFs. Viewing MAF may therefore be a quick way to find regions of perfect LD, which can then be checked with MIDAS.
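The distinction between two and three haplotypes at D' = 1 can be checked with a short calculation. This is a minimal sketch using the standard definitions of D' and r² for two biallelic SNPs, not the MIDAS implementation. The function name `snp_pair_summary` and the haplotype labels (AB, Ab, aB, ab) are hypothetical, and both SNPs are assumed to be polymorphic.

```python
def snp_pair_summary(f_AB, f_Ab, f_aB, f_ab, tol=1e-12):
    """Summarise LD for two biallelic SNPs from four haplotype frequencies.

    Assumes both SNPs are polymorphic (allele frequencies strictly
    between 0 and 1). Returns (haplotypes_present, d_prime, r_squared).
    """
    freqs = [f_AB, f_Ab, f_aB, f_ab]
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = f_AB + f_Ab                 # frequency of allele A at SNP 1
    p_B = f_AB + f_aB                 # frequency of allele B at SNP 2
    d = f_AB - p_A * p_B              # raw disequilibrium coefficient D

    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    present = sum(1 for f in freqs if f > tol)
    return present, d_prime, r_squared


if __name__ == "__main__":
    # Two haplotypes: both SNPs have the same allele frequencies.
    print(snp_pair_summary(0.7, 0.0, 0.0, 0.3))
    # Three haplotypes: D' is still 1, but r squared is below 1.
    print(snp_pair_summary(0.5, 0.2, 0.0, 0.3))
```

With only two haplotypes the two SNPs necessarily have equal allele frequencies, which is consistent with the caption's suggestion that runs of identical MAF can flag candidate regions of perfect LD.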

References

    1. Ardlie KG, Kruglyak L, Seielstad M. Patterns of linkage disequilibrium in the human genome. Nat Rev Genet. 2002;3:299–309. doi: 10.1038/nrg777.
    2. Zapata C, Rodríguez S, Visedo G, Sacristán F. Spectrum of nonrandom associations between microsatellite loci on human chromosome 11p15. Genetics. 2001;158:1235–1251.
    3. Jorde LB. Linkage disequilibrium and the search for complex disease genes. Genome Res. 2000;10:1435–1444. doi: 10.1101/gr.144500.
    4. Mueller JC. Linkage disequilibrium for different scales and applications. Brief Bioinform. 2004;5:355–364. doi: 10.1093/bib/5.4.355.
    5. Abecasis GR, Cookson WO. GOLD – graphical overview of linkage disequilibrium. Bioinformatics. 2000;16:182–183. doi: 10.1093/bioinformatics/16.2.182.