Nucleic Acids Res. 2023 Jan 6;51(D1):D647-D653.
doi: 10.1093/nar/gkac977.

SulfAtlas, the sulfatase database: state of the art and new developments

Mark Stam et al. Nucleic Acids Res.

Abstract

SulfAtlas (https://sulfatlas.sb-roscoff.fr/) is a knowledge-based resource dedicated to a sequence-based classification of sulfatases. Four sulfatase families currently exist (S1-S4), and the largest family (S1, formylglycine-dependent sulfatases) is divided into subfamilies by a phylogenetic approach, each subfamily corresponding either to a single characterized specificity (or a few specificities in some cases) or to unknown substrates. Sequences are linked to their biochemical and structural information through expert scrutiny of the available literature. Database browsing was initially made possible through both a keyword search engine and a dedicated sequence similarity (BLAST) server. In this article, we briefly summarize the experimental progress in the sulfatase field over the last 6 years. To improve and speed up the (sub)family assignment of sulfatases in (meta)genomic data, we have developed a new, freely accessible search engine using a hidden Markov model (HMM) for each (sub)family. This new tool (SulfAtlas HMM) is also a key part of the internal pipeline used to regularly update the database. The SulfAtlas resource has grown significantly since its creation in 2016, from 4550 sequences to 162 430 sequences in August 2022.

Figures

Figure 1.
Example of a result page of the SulfAtlas HMM server. Input data are protein sequences in FASTA format. They can be pasted in or uploaded as a file (with a size limit of 50 MB). The results can be viewed directly online or sent by e-mail. As an example, results are shown here for the complete proteome of Zobellia galactanivorans DsijT (4582 proteins). 72 hits were found (71 sulfatases and 1 pseudo-gene), which is consistent with previous genomic analyses (46). Data processing took 2 minutes and 45 seconds.
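Before submitting a proteome such as the one described in this caption, it can help to check the FASTA input locally. The following is a minimal sketch and not part of SulfAtlas: the function name `summarize_fasta` and the constant `MAX_UPLOAD_BYTES` are hypothetical, and reading "50 MB" as 50 × 10^6 bytes is an assumption, since the caption does not say how the server measures the limit.

```python
# Hypothetical local pre-check; not an official SulfAtlas client.
# Assumption: the 50 MB upload limit is interpreted as 50 * 10**6 bytes.
MAX_UPLOAD_BYTES = 50 * 10**6


def summarize_fasta(text):
    """Count FASTA records and report whether the input fits the size limit."""
    # Each record in FASTA format starts with a header line beginning with '>'.
    n_records = sum(1 for line in text.splitlines() if line.startswith(">"))
    size = len(text.encode("utf-8"))
    return {
        "records": n_records,
        "bytes": size,
        "within_limit": size <= MAX_UPLOAD_BYTES,
    }


# Toy input with two made-up sequences (placeholders, not real proteins).
example = ">seq1\nMKTLLA\n>seq2\nMSTNPK\n"
print(summarize_fasta(example))
```

For a complete proteome, the `records` count would be expected to match the number of proteins submitted (4582 in the example above).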
Figure 2.
Home page of the internal web-based curation tool. The updated pipeline automatically feeds the curation tool with candidate sulfatase sequences of uncertain status. Details of the HMM and BLAST results are provided for each candidate sequence. The figure shows the results for the S1_7 subfamily. Three sequences are highlighted as example cases: (i) green box (UniProt: A0A086XT05_9RHOB): the HMM and BLAST analyses agree on an S1_7 assignment, but with an HMM score below 300 (248.5). Manual verification confirmed that this sequence contains a complete S1_7 sulfatase module but also hemolysin-type calcium-binding regions, which explain its larger size (775 residues). The S1_7 subfamily was thus confirmed in the ‘Select Family’ menu and saved with the ‘Action’ button. (ii) red box (A0A090XCV4_IXORI): the query is a very short sequence (93 residues); it is therefore a sulfatase fragment (pseudo-gene or incorrectly predicted ORF) and is definitively rejected. (iii) blue box (A0A0B5GH05_9EURY): this sequence has the correct size to be a functional S1 sulfatase (472 residues), but the HMM and BLAST analyses do not agree on the subfamily assignment (S1_7 and S1_64). This sulfatase is therefore currently an orphan and may be a seed for a future new subfamily. For now, it is assigned to the S1_NC subfamily.
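The three example cases can be read as a small set of triage rules. The sketch below illustrates that reading only. It is not the SulfAtlas pipeline, in which the final decision is made by a curator. Using 300 as a hard cutoff is an assumption based on the score mentioned in the caption, `MIN_FULL_LENGTH` is a hypothetical value (the caption gives no length threshold), and the names `Candidate` and `suggest_status` are invented for this example.

```python
from dataclasses import dataclass

# Assumption: 300 is the HMM score referred to in the caption; treating it
# as a hard cutoff for automatic acceptance is our simplification.
HMM_SCORE_REFERENCE = 300.0
# Hypothetical minimum length for a complete sulfatase; not from the source.
MIN_FULL_LENGTH = 200


@dataclass
class Candidate:
    accession: str
    length: int          # number of residues
    hmm_family: str      # best (sub)family according to the HMM search
    hmm_score: float
    blast_family: str    # best (sub)family according to BLAST


def suggest_status(c: Candidate) -> str:
    """Suggest a curation status; a human curator still decides."""
    if c.length < MIN_FULL_LENGTH:
        return "reject_fragment"          # case (ii): too short
    if c.hmm_family != c.blast_family:
        return "S1_NC"                    # case (iii): methods disagree
    if c.hmm_score < HMM_SCORE_REFERENCE:
        return f"review:{c.hmm_family}"   # case (i): agree, low score
    return f"accept:{c.hmm_family}"


# Lengths, accessions and the 248.5 score come from the caption. The other
# scores and the families for the fragment are placeholders, and which
# method gave S1_7 versus S1_64 in case (iii) is assumed.
green = Candidate("A0A086XT05_9RHOB", 775, "S1_7", 248.5, "S1_7")
red = Candidate("A0A090XCV4_IXORI", 93, "S1_7", 0.0, "S1_7")
blue = Candidate("A0A0B5GH05_9EURY", 472, "S1_7", 0.0, "S1_64")

for cand in (green, red, blue):
    print(cand.accession, suggest_status(cand))
```

The length check comes first so that fragments are rejected regardless of what the two search methods report, which matches the order of reasoning in case (ii).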

