Protein Sci. 2023 Jan;32(1):e4524.
doi: 10.1002/pro.4524.

LambdaPP: Fast and accessible protein-specific phenotype predictions

Tobias Olenyi et al. Protein Sci. 2023 Jan.

Abstract

The availability of accurate and fast artificial intelligence (AI) solutions predicting aspects of proteins is revolutionizing experimental and computational molecular biology. The webserver LambdaPP aspires to supersede PredictProtein, the first internet server making AI protein predictions available in 1992. Given a protein sequence as input, LambdaPP provides easily accessible visualizations of protein 3D structure, along with predictions at the protein level (GeneOntology, subcellular location) and the residue level (binding to metal ions, small molecules, and nucleotides; conservation; intrinsic disorder; secondary structure; alpha-helical and beta-barrel transmembrane segments; signal-peptides; variant effect) in seconds. The structure prediction provided by LambdaPP, which leverages ColabFold and is computed in minutes, is based on MMseqs2 multiple sequence alignments. All other feature prediction methods are based on the protein language model (pLM) ProtT5. Queried by a protein sequence, LambdaPP computes protein and residue predictions almost instantly for various phenotypes, including 3D structure and aspects of protein function. LambdaPP is freely available for everyone to use at embed.predictprotein.org; the interactive results for the case study can be found at https://embed.predictprotein.org/o/Q9NZC2. The frontend of LambdaPP can be found on GitHub (github.com/sacdallago/embed.predictprotein.org) and can be freely used and distributed under the academic free use license (AFL-2). For high-throughput applications, all methods can be executed locally via the bio-embeddings (bioembeddings.com) Python package or the Docker image at ghcr.io/bioembeddings/bio_embeddings, which also includes the backend of LambdaPP.
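
The case-study link in the abstract suggests that results pages are addressed as `https://embed.predictprotein.org/o/<UniProt accession>`. The following is a minimal Python sketch under that assumption; only the Q9NZC2 link is confirmed by the abstract, and the helper name `lambdapp_result_url` is hypothetical rather than part of LambdaPP or bio-embeddings.

```python
from urllib.parse import quote

# Base address taken from the case-study link in the abstract.
LAMBDAPP_RESULTS_BASE = "https://embed.predictprotein.org/o/"


def lambdapp_result_url(accession: str) -> str:
    """Build a results URL for a UniProt accession (hypothetical helper).

    Assumption: other accessions follow the same /o/<accession> pattern
    as the Q9NZC2 case study; the source only confirms that one link.
    """
    cleaned = accession.strip().upper()
    if not cleaned.isalnum():
        raise ValueError(f"unexpected characters in accession: {accession!r}")
    return LAMBDAPP_RESULTS_BASE + quote(cleaned)


print(lambdapp_result_url("q9nzc2"))
```

Normalizing case and rejecting non-alphanumeric input keeps the sketch from producing malformed links; whether the server accepts lowercase accessions is not stated in the source.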

Keywords: artificial intelligence; protein annotation; protein function prediction; protein language models; protein structure prediction; web server.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1
LambdaPP pipeline. Starting with an amino acid sequence, LambdaPP orchestrates the prediction of (1) protein structure using ColabFold (Mirdita et al., 2022), (2) per‐protein features: gene ontology (GO) annotations using goPredSim (Littmann, Heinzinger, Dallago, Olenyi, et al., 2021), subcellular location using LA (Stärk et al., 2021); (3) per‐residue features: binding residues using bindEmbed21DL (Littmann, Heinzinger, Dallago, Weissenow, et al., 2021), conservation using ProtT5cons (Marquet et al., 2021), disorder using SETH (Ilzhoefer et al., 2022), secondary structure using ProtT5‐sec (Elnaggar et al., 2021), helical and barrel transmembrane (TM) regions using TMbed (Bernhofer & Rost, 2022); and (4) variant effect scores using VESPAl (Marquet et al., 2021).
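
The grouping in the Figure 1 caption can be written down as a small lookup table. This Python sketch only transcribes the feature and method names from the caption; the dictionary layout and the function name `methods_for` are illustrative assumptions, not LambdaPP's internal data model.

```python
# Feature -> method mapping transcribed from the Figure 1 caption.
# The nesting by prediction level is an illustrative choice (assumption).
PIPELINE = {
    "structure": {"3D structure": "ColabFold"},
    "per-protein": {
        "gene ontology (GO) annotations": "goPredSim",
        "subcellular location": "LA",
    },
    "per-residue": {
        "binding residues": "bindEmbed21DL",
        "conservation": "ProtT5cons",
        "disorder": "SETH",
        "secondary structure": "ProtT5-sec",
        "transmembrane regions": "TMbed",
    },
    "variant effect": {"variant effect scores": "VESPAl"},
}


def methods_for(level):
    """Return the method names listed for one prediction level, sorted."""
    return sorted(PIPELINE[level].values())


for level, features in PIPELINE.items():
    for feature, method in features.items():
        print(f"{level:15s} {feature:32s} {method}")
```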
FIGURE 2
LambdaPP output for TREM2_HUMAN. Panel a: residue‐level features: secondary structure, transmembrane topology, disordered residues, small molecule, nucleic or metal binding residues, residue conservation and average variation (Bernhofer & Rost, 2022; Elnaggar et al., 2021; Ilzhoefer et al., 2022; Littmann, Heinzinger, Dallago, Weissenow, et al., 2021; Marquet et al., 2021); panel b: sequence‐level features: predicted subcellular localization (Stärk et al., 2021), and an excerpt of predicted GO‐annotations (Littmann, Heinzinger, Dallago, Olenyi, et al., 2021); panel c: effect of single amino acid variants (SAVs; wildtype sequence on x‐axis, mutations on y‐axis; darker color = higher effect) (Marquet et al., 2021); and panel d: predicted 3D structure (Mirdita et al., 2022). Interactive version at https://embed.predictprotein.org/o/Q9NZC2.
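
The layout of panel c can be pictured as a grid with one column per wildtype position (x‐axis) and one row per substituted amino acid (y‐axis). The Python sketch below enumerates the variants for a toy sequence and builds an empty grid of that shape. It assumes the 20 standard amino acids, uses a made-up four-residue sequence rather than TREM2, and leaves every cell as `None` because real scores would come from VESPAl; the function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def enumerate_savs(sequence):
    """List all single amino acid variants as strings such as 'M1A'."""
    savs = []
    for position, wildtype in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wildtype:  # skip the wildtype residue itself
                savs.append(f"{wildtype}{position}{mutant}")
    return savs


def empty_sav_grid(sequence):
    """Rows are mutant amino acids (y-axis); columns are positions (x-axis).

    Cells are None placeholders; no effect scores are computed here.
    """
    return {mutant: [None] * len(sequence) for mutant in AMINO_ACIDS}


toy_sequence = "MEPL"  # illustrative toy input, not a real protein
savs = enumerate_savs(toy_sequence)
grid = empty_sav_grid(toy_sequence)
print(len(savs))  # 19 substitutions per position -> 76
print(savs[:3])   # ['M1A', 'M1C', 'M1D']
```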
FIGURE 3
Remarkable AlphaFold2 predictions. Panel a (lower left triangle) displays the 3D structure predicted by AlphaFold2 for the ice nucleation protein ICEV_PSESX. The protein contains 1165 residues and is available through LambdaPP as part of AFDB. Panel b (upper right triangle) showcases the AlphaFold2 prediction of what might constitute a novel superfamily for the plant protein with the UniProt identifier Q9S828_ARATH.
