Comparative Study
Genome Res. 2003 Apr;13(4):732-41.
doi: 10.1101/gr.603103.

GALA, a database for genomic sequence alignments and annotations

Belinda Giardine et al. Genome Res. 2003 Apr.

Abstract

We have developed a relational database to contain whole genome sequence alignments between human and mouse with extensive annotations of the human sequence. Complex queries are supported on recorded features, both directly and on proximity among them. Searches can reveal a wide variety of relationships, such as finding all genes expressed in a designated tissue that have a highly conserved noncoding sequence 5' to the start site. Other examples are finding single nucleotide polymorphisms that occur in conserved noncoding regions upstream of genes and identifying CpG islands that overlap the 5' ends of divergently transcribed genes. The database is available online at http://globin.cse.psu.edu/ and http://bio.cse.psu.edu/.
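A proximity search of the kind described above can be pictured as a join between feature tables on genomic coordinates. The following is a minimal sketch only, using an in-memory SQLite database: the table names (`gene`, `expression`, `conserved_noncoding`), their columns, the demo rows, and the 5,000 bp default window are all assumptions made for illustration and are not taken from GALA's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (
            id INTEGER PRIMARY KEY, name TEXT, chrom TEXT,
            strand TEXT, tx_start INTEGER, tx_end INTEGER
        );
        CREATE TABLE expression (gene_id INTEGER, tissue TEXT);
        CREATE TABLE conserved_noncoding (
            chrom TEXT, chrom_start INTEGER, chrom_end INTEGER
        );
        INSERT INTO gene VALUES
            (1, 'geneA', 'chr1', '+', 10000, 20000),
            (2, 'geneB', 'chr1', '-', 50000, 60000),
            (3, 'geneC', 'chr2', '+', 10000, 20000),
            (4, 'geneD', 'chr1', '+', 100000, 110000);
        INSERT INTO expression VALUES
            (1, 'lung'), (2, 'lung'), (3, 'lung'), (4, 'liver');
        INSERT INTO conserved_noncoding VALUES
            ('chr1', 9000, 9200),    -- 5' of geneA (+ strand)
            ('chr1', 61000, 61300),  -- 5' of geneB (- strand)
            ('chr1', 99000, 99500),  -- 5' of geneD (+ strand)
            ('chr2', 30000, 30200);  -- 3' of geneC, so not a match
        """
    )
    return conn


def genes_with_upstream_conservation(conn, tissue, max_dist=5000):
    """Names of genes expressed in `tissue` that have a conserved noncoding
    region within `max_dist` bp 5' of the transcription start site."""
    sql = """
        SELECT DISTINCT g.name
        FROM gene AS g
        JOIN expression AS e ON e.gene_id = g.id
        JOIN conserved_noncoding AS c ON c.chrom = g.chrom
        WHERE e.tissue = :tissue
          AND (
                (g.strand = '+' AND c.chrom_end <= g.tx_start
                                AND c.chrom_end >= g.tx_start - :max_dist)
             OR (g.strand = '-' AND c.chrom_start >= g.tx_end
                                AND c.chrom_start <= g.tx_end + :max_dist)
          )
        ORDER BY g.name
    """
    params = {"tissue": tissue, "max_dist": max_dist}
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(genes_with_upstream_conservation(conn, "lung"))
```

The strand test matters because "5' to the start site" means lower coordinates for a gene on the plus strand and higher coordinates for a gene on the minus strand.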


Figures

Figure 1.
Examples from the GALA query and output pages. (A) List boxes, check boxes, and fill-in text boxes are provided for selection of field values on the query page. This panel illustrates the use of list boxes to query for a gene encoding a protein with a zinc finger domain and that is expressed in lung cancers or tumors. (B) The GALA “region summary” output screen prompts a user to select the features they would like to see when the corresponding “view” buttons are pressed. (C) Selected sections of fields from the data returned for chr 1 152,802,808 to 152,805,338 are shown, including information about the gene and its function, range of tissues in which it is expressed (partial view), and conserved domains in the product.
Figure 2.
Sample output from GALA visualized in the Human Genome Browser. A clickable button on the output page creates a custom track for the browser that opens in a separate window. (A) The region displayed is the 5' end of the SMARCB1 gene, which is one of the regions returned by a query for single nucleotide polymorphisms in conserved noncoding regions near the 5' ends of genes. (B) The region displayed includes two of the CpG islands located between divergently transcribed genes, close to positions 38,480,000 and 38,630,000.
Figure 3.
Sample output from GALA visualized in the Laj viewer, a Java applet for viewing alignment results. The figure shows a display window with panels for (A) the position of the mouse pointer and identification of any objects at that location, (B) the position of the moveable circle, (C) Human Genome Browser (HGB) coordinates for the region, and (D) hyperlinks to alignment information (gray) and data for genes (black). (E) Icons for genomic features including locations of coding exons (dark-filled taller box), untranslated regions (UTRs) (gray-filled taller box), CpG islands (long, low open box), and simple repeats (short, low open boxes), as well as interspersed repeats when present. (F) The percent identity plot of the alignments in the query results showing the positions of the aligning segments in human on the horizontal axis and percent identity of each gap-free segment on the vertical axis. Important features are highlighted with underlays, including coding exons, UTRs, introns, highly conserved noncoding regions (at least 100 bp gap-free and at least 70% identity), and single nucleotide polymorphisms (SNPs). (G) The nucleotide-level alignment for the local alignment marked by the circle in F. The polymorphic nucleotide is a T (dark gray) in the reference human sequence. Boxes for matches to transcription-factor binding sites in the vicinity of the SNP are drawn below the Laj (local alignments with Java) screen shot. For the CCAAT box (binding site for NF-Y), the sequence of the reverse complement of the human sequence is given, as well as the sequence of the consensus binding site. Note that the SNP (in boldface) is part of the consensus binding site. The alignments shown in the percent identity plot (pip) are only those selected in the query. Thus, if a size threshold was applied, only alignments meeting it appear in the pip. Gray horizontal bars in D show the positions of all aligning segments. The nucleotide-level view is obtained by clicking on any alignment in the pip. Names of genes and information about repeats appear in the text box at the top of the page (A).
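The thresholds quoted in the caption (at least 100 bp gap-free and at least 70% identity) can be illustrated with a short sketch. This is not the authors' code: the function names and the representation of an alignment as two equal-length strings with `-` for gaps are assumptions made here, and the sketch reads the caption as applying both thresholds to each gap-free segment. It does not check whether a segment is noncoding.

```python
def gap_free_segments(seq_a, seq_b):
    """Split a pairwise alignment into gap-free segments.

    Both inputs are aligned strings of equal length with '-' marking gaps.
    Returns a list of (length, percent_identity) tuples, one per segment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    segments = []
    length = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            # A gap in either sequence closes the current segment.
            if length:
                segments.append((length, 100.0 * matches / length))
            length = matches = 0
        else:
            length += 1
            if a.upper() == b.upper():
                matches += 1
    if length:
        segments.append((length, 100.0 * matches / length))
    return segments


def is_highly_conserved(length, percent_identity,
                        min_length=100, min_identity=70.0):
    """Apply the caption's thresholds to a single gap-free segment."""
    return length >= min_length and percent_identity >= min_identity


if __name__ == "__main__":
    # Toy alignment: a 120-column segment, one gap, then a 4-column segment.
    human = "A" * 120 + "-" + "ACGT"
    mouse = "A" * 90 + "C" * 30 + "A" + "ACGA"
    for length, pct in gap_free_segments(human, mouse):
        print(length, pct, is_highly_conserved(length, pct))
```

In the toy alignment both segments are 75% identical, but only the 120-column segment also meets the length threshold.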
