Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W262-6.
doi: 10.1093/nar/gki368.

PatMatch: a program for finding patterns in peptide and nucleotide sequences

Thomas Yan et al. Nucleic Acids Res. 2005.

Abstract

Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible pattern syntax based on regular expressions. A recent upgrade has improved performance and now supports both mismatches and wildcards in a single pattern. This enhancement has been achieved by replacing the previous searching algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497-498], with nondeterministic-reverse grep (NR-grep), a general pattern matching tool that allows for approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265-1312]. We have tailored NR-grep to be used for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download at The Arabidopsis Information Resource (TAIR) at ftp://ftp.arabidopsis.org/home/tair/Software/Patmatch/. The PatMatch server is available on the web at http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl for searching Arabidopsis thaliana sequences.
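To make the idea of matching a pattern with ambiguity codes under a mismatch budget concrete, here is a minimal sketch in Python. It is not PatMatch or NR-grep code: it is a naive sliding-window scan that counts substitutions only and does not handle insertions, deletions, or the regular-expression syntax. The function name `find_with_mismatches` and its parameters are hypothetical, chosen for illustration; the ambiguity table follows the standard IUPAC nucleotide codes.

```python
# Illustrative sketch only; the names here are hypothetical and this is
# not the algorithm used by PatMatch or NR-grep.

# Standard IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_with_mismatches(pattern, sequence, max_mismatches=0):
    """Return (start, matched_text, mismatches) for every window of the
    sequence that matches the pattern with at most max_mismatches
    substitutions. Start positions are 0-based."""
    pattern = pattern.upper()
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(pattern) + 1):
        window = sequence[start:start + len(pattern)]
        mismatches = 0
        for code, base in zip(pattern, window):
            if base not in IUPAC[code]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # budget exceeded, abandon this window
        else:
            # Loop finished without exceeding the budget.
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    # Exact search, then the same pattern with one substitution allowed.
    print(find_with_mismatches("RCCGAC", "TTGCCGACAA"))
    print(find_with_mismatches("RCCGAC", "TTGCCGTCAA", max_mismatches=1))
```

The scan costs time proportional to pattern length times sequence length, which is acceptable for short examples but is exactly the kind of cost that a dedicated approximate-matching tool is designed to avoid.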

Figures

Figure 1
(A) The PatMatch input web interface. This screen capture shows how PatMatch is used to find the DREB binding site (12), RCCGAC, where R stands for any purine base. One of the locus upstream sequence datasets is used to find sequences containing this cis-element. (B) The PatMatch results page. This screen capture shows the output of the query of the pattern, RCCGAC, after searching the 1000 bp locus upstream dataset on both strands. (C) A page showing a single match (highlighted in red) of the query in a sequence. The pattern, mismatch options of the search and information about the sequence from its FASTA header are shown.
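The query in this figure can be sketched in a few lines of Python: expand each ambiguity code into a regular-expression character class, then scan the forward strand for both the pattern and its reverse complement, which is equivalent to searching both strands. This is an assumption-laden illustration, not the PatMatch implementation; the helper names (`pattern_to_regex`, `reverse_complement`, `search_both_strands`) are hypothetical and the example sequence is made up.

```python
import re

# Standard IUPAC nucleotide codes as regex character classes.
IUPAC_CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement of every code, including the ambiguous ones (R <-> Y, K <-> M, ...).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence or pattern written in IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pattern_to_regex(pattern):
    """Translate an IUPAC pattern into a regular-expression string."""
    return "".join(IUPAC_CLASSES[code] for code in pattern.upper())


def search_both_strands(pattern, sequence):
    """Return (strand, start, matched_text) tuples in forward-strand,
    0-based coordinates. A '-' hit is where the reverse complement of
    the pattern occurs on the forward strand."""
    sequence = sequence.upper()
    hits = []
    for strand, pat in (("+", pattern), ("-", reverse_complement(pattern))):
        # A lookahead with a capture group also reports overlapping matches.
        regex = re.compile("(?=(" + pattern_to_regex(pat) + "))")
        for match in regex.finditer(sequence):
            hits.append((strand, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing one site on each strand.
    for hit in search_both_strands("RCCGAC", "AAGCCGACTTGTCGGTAA"):
        print(hit)
```

Searching with the reverse-complemented pattern avoids building a second copy of the sequence, and it keeps all reported coordinates on the forward strand.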

References

    1. Cherry J.M., Ball C., Weng S., Juvik G., Schmidt R., Adler C., Dunn B., Dwight S., Riles L., Mortimer R.K., Botstein D. Genetic and physical maps of Saccharomyces cerevisiae. Nature. 1997;387:67–73.
    2. Huala E., Dickerman A., Garcia-Hernandez M., Weems D., Reiser L., LaFond F., Hanley D., Kiphart D., Zhuang J., Huang W., et al. The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res. 2001;29:102–105.
    3. Rhee S.Y., Beavis W., Berardini T.Z., Chen G., Dixon D., Doyle A., Garcia-Hernandez M., Huala E., Lander G., Montoya M., et al. The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res. 2003;31:224–228.
    4. Navarro G. NR-grep: a fast and flexible pattern matching tool. Software Practice and Experience. 2001;31:1265–1312.
    5. D'Souza M., Larsen N., Overbeek R. Searching for patterns in genomic data. Trends Genet. 1997;13:497–498.