PLoS One. 2008;3(12):e3874.
doi: 10.1371/journal.pone.0003874. Epub 2008 Dec 5.

GeneDistiller--distilling candidate genes from linkage intervals

Dominik Seelow et al. PLoS One. 2008.

Abstract

Background: Linkage studies often yield intervals containing several hundred positional candidate genes. Various manual and automatic approaches exist for determining the gene most likely to cause the disease. While a manual search is very flexible and takes advantage of the researchers' background knowledge and intuition, collecting and studying the relevant data can be very cumbersome. Automatic solutions, on the other hand, usually focus on certain models, remain "black boxes", and do not offer the same degree of flexibility.

Methodology: We have developed a web-based application that combines the advantages of both approaches. Information from various data sources, such as gene-phenotype associations, gene expression patterns and protein-protein interactions, was integrated into a central database. Researchers can select which information is displayed for the genes within a candidate interval or for single genes. Genes can also be filtered, sorted and prioritised interactively according to criteria derived from the researchers' background knowledge and preconceptions of the disease under scrutiny.
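
The interactive filtering described in the methodology can be illustrated with a minimal sketch. It is not GeneDistiller's implementation: the `Gene` record, the `distill` function, and all gene entries below are hypothetical and serve only to show how an interval restriction and user-chosen criteria might be combined.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class Gene:
    """Hypothetical gene record; field names are assumptions for illustration."""
    symbol: str
    chrom: str
    start: int  # position in base pairs
    end: int
    tissues: Set[str] = field(default_factory=set)
    phenotypes: Set[str] = field(default_factory=set)


def in_interval(gene: Gene, chrom: str, start: int, end: int) -> bool:
    """True if the gene overlaps the linkage interval on the given chromosome."""
    return gene.chrom == chrom and gene.end >= start and gene.start <= end


def distill(genes: Iterable[Gene], chrom: str, start: int, end: int,
            filters: Iterable[Callable[[Gene], bool]] = ()) -> List[Gene]:
    """Restrict genes to the interval, then apply each filter in turn."""
    candidates = [g for g in genes if in_interval(g, chrom, start, end)]
    for keep in filters:
        candidates = [g for g in candidates if keep(g)]
    return candidates


# Made-up example data, not taken from any real database.
GENES = [
    Gene("GENE_A", "2", 10_000_000, 10_050_000, {"brain"}, {"epilepsy"}),
    Gene("GENE_B", "2", 20_000_000, 20_030_000, {"liver"}),
    Gene("GENE_C", "2", 90_000_000, 90_020_000, {"brain"}),
    Gene("GENE_D", "3", 10_000_000, 10_010_000, {"brain"}, {"epilepsy"}),
]

# Keep only genes in the first 60 Mbp of chromosome 2 that are expressed in brain.
result = distill(GENES, "2", 1, 60_000_000,
                 filters=[lambda g: "brain" in g.tissues])
symbols = [g.symbol for g in result]
print(symbols)
```

Passing the filters as plain functions mirrors the idea that the researcher, not the tool, decides which criteria matter; further criteria can be appended to the list without changing `distill`.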

Conclusions: GeneDistiller provides knowledge-driven, fully interactive and intuitive access to multiple data sources. It displays as much relevant information as possible while saving the user from drowning in the flood of data. A typical query takes less than two seconds, thus allowing an interactive and explorative approach to the hunt for the candidate gene.

Access: GeneDistiller can be freely accessed at http://www.genedistiller.org.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Strategies / Possibilities.
This scheme illustrates different approaches to choosing reasonable candidate genes from a linkage interval. The researcher can either follow a hypothesis-driven approach based on a functional model or simply choose genes based on single properties reflecting the likelihood of being disease-causing, e.g. co-expression with other disease genes that cause similar phenotypes. The general concepts are depicted as pink boxes, gene properties that can be queried by GeneDistiller as yellow boxes, and properties or models GeneDistiller presently does not offer as blue boxes. With GeneDistiller, the user is free to combine gene properties according to her or his own hypotheses.
Figure 2. The GeneDistillery.
The user-friendly interface allows the researchers to incorporate their background knowledge about diseases and genes into the interactive “gene distilling” process. They can extract all the information relevant to their specific question at our one-stop shop. This saves them from drowning in the flood of data available on the WWW and helps them to determine the most promising candidates.
Figure 3. Filtering.
This figure shows how filters can be applied in GeneDistiller to reduce the number of genes to be studied. After defining the linkage interval, more and more selection criteria can be added by the researcher, narrowing down the genes to ever more likely candidates. The example depicts the hunt for candidate genes for epilepsy in a 60 Mbp region on chromosome 2. The size of a rectangle is proportional to the number of genes and the grey shades reflect the “distillation” process in which the best candidates are enriched.
Figure 4. Prioritisation / query interface (screenshot).
This figure shows the query interface of GeneDistiller for the prioritisation example for epilepsy described in the text. The interface is divided into different sections in which the parameters describing a similar aspect of the gene-specific data are listed. Sections not used can be closed (e.g. “prioritisation settings”). Please note that most of the available tissues in the expression section are omitted to improve readability.
Figure 5. Results page (screenshot).
GeneDistiller prints all results on a single HTML page. The genes are listed in the selected order and, when a prioritisation strategy is used, with their overall scores and sub-scores for the different parameters. The gene-specific data is presented with hyperlinks to the original data sources. Keywords or values that were used for filtering or highlighting are printed in bold letters. The same applies to values that are also present in other genes known to be related to the selected disease (epilepsy, in this case). Please note that many NCBI GeneRIFs and OMIM reports for SCN1A were omitted in this figure to improve readability.
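
The overall scores and sub-scores mentioned in the Figure 5 caption can be pictured with a small sketch. The abstract does not specify how GeneDistiller computes its scores, so the weighted sum below is only an assumption, and the `prioritise` function, the criteria names, the weights, and the gene entries are all hypothetical.

```python
def prioritise(genes, criteria):
    """Rank genes by a weighted sum of user-defined criteria.

    criteria maps a criterion name to a (weight, predicate) pair.
    Returns (symbol, overall_score, sub_scores) tuples, best first.
    """
    ranked = []
    for gene in genes:
        # A criterion contributes its full weight if it matches, otherwise 0.
        sub_scores = {
            name: (weight if matches(gene) else 0.0)
            for name, (weight, matches) in criteria.items()
        }
        ranked.append((gene["symbol"], sum(sub_scores.values()), sub_scores))
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked


# Made-up example data, not taken from any real database.
genes = [
    {"symbol": "GENE_B", "tissues": {"brain"}, "phenotypes": set(), "interactors": set()},
    {"symbol": "GENE_C", "tissues": {"liver"}, "phenotypes": set(), "interactors": set()},
    {"symbol": "GENE_A", "tissues": {"brain"}, "phenotypes": {"epilepsy"}, "interactors": {"GENE_X"}},
]

known_disease_genes = {"GENE_X"}  # hypothetical set of known disease genes

# Weights express how much the researcher trusts each line of evidence.
criteria = {
    "expressed_in_brain": (1.0, lambda g: "brain" in g["tissues"]),
    "phenotype_epilepsy": (2.0, lambda g: "epilepsy" in g["phenotypes"]),
    "interacts_with_known": (1.5, lambda g: bool(g["interactors"] & known_disease_genes)),
}

ranking = prioritise(genes, criteria)
for symbol, score, subs in ranking:
    print(symbol, score, subs)
```

Keeping the sub-scores alongside the total lets a results page show why a gene ranks where it does, which is the kind of transparency the caption describes.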
