PERISCOPE-Opt: Machine learning-based prediction of optimal fermentation conditions and yields of recombinant periplasmic protein expressed in Escherichia coli
- PMID: 35765650
- PMCID: PMC9201004
- DOI: 10.1016/j.csbj.2022.06.006
Abstract
Optimization of the fermentation process for recombinant protein production (RPP) is often resource-intensive. Machine learning (ML) approaches help to minimize experimentation and are widely applied in RPP. However, these ML-based tools primarily focus on features derived from the amino acid sequence, overlooking the influence of fermentation process conditions. The present study combines features derived from fermentation process conditions with those derived from the amino acid sequence to construct an ML-based model that predicts the maximal protein yield and the corresponding fermentation conditions for expression of a target recombinant protein in the Escherichia coli periplasm. In the first stage, two sets of XGBoost classifiers were employed to classify the expression level of the target protein as high (>50 mg/L), medium (between 0.5 and 50 mg/L), or low (<0.5 mg/L). The second stage consisted of three regression models, involving support vector machines and random forest, that predict the expression yield for each expression-level class. Independent tests showed that the predictor achieved an overall average accuracy of 75% and a Pearson correlation coefficient of 0.91 for the correctly classified instances. Our model therefore offers a reliable substitute for numerous trial-and-error experiments in identifying the optimal fermentation conditions and yield for RPP. It is also implemented as an open-access webserver, PERISCOPE-Opt (http://periscope-opt.erc.monash.edu).
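The two-stage structure described above (classify the expression level, then apply the regressor for that class) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the stand-in `classify` and `regressors` callables, and the treatment of the exact boundary values 0.5 and 50 mg/L as "medium" are all assumptions made for the example. Only the class thresholds come from the abstract.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# Class thresholds in mg/L, as stated in the abstract.
HIGH_THRESHOLD = 50.0
LOW_THRESHOLD = 0.5


def yield_class(mg_per_l: float) -> str:
    """Map a yield to one of the three expression-level classes.

    Assumption: values exactly at 0.5 or 50 mg/L are treated as "medium".
    """
    if mg_per_l > HIGH_THRESHOLD:
        return "high"
    if mg_per_l < LOW_THRESHOLD:
        return "low"
    return "medium"


def two_stage_predict(
    features: dict,
    classify: Callable[[dict], str],
    regressors: Dict[str, Callable[[dict], float]],
) -> Tuple[str, float]:
    """Stage 1 picks a class; stage 2 applies that class's regressor.

    `classify` and `regressors` are hypothetical stand-ins for trained models.
    """
    label = classify(features)
    return label, regressors[label](features)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

In practice the stand-in callables would be replaced by trained classifier and regressor objects, and `pearson` would be evaluated only on instances whose class was predicted correctly, mirroring how the abstract reports the correlation.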
Keywords: Escherichia coli; Machine learning; Optimization; Periplasmic expression; Prediction model; Recombinant protein production.
Abbreviations: AUC, area under the curve; CV, cross-validation; CfsSubsetEval, Correlation-based Forward Selection Subset Evaluator; ClassifierSubsetEval, Classifier Subset Evaluator; E. coli, Escherichia coli; FC1, Feature Category 1; FC2, Feature Category 2; FC3, Feature Category 3; FC4, Feature Category 4; IPTG, isopropyl β-D-1-thiogalactopyranoside; LOOCV, leave-one-out cross-validation; MAE, mean absolute error; MCC, Matthews correlation coefficient; ML, machine learning; MLR, machine learning in R; OD, optical density at 600 nm; PCC, Pearson correlation coefficient; RF, random forest; RFR, RF regression; RFR-High, RFR for high; RFR-Medium, RFR for medium; RMSE, root mean squared error; RPP, recombinant protein production; RSM, response surface methodology; SMOTE, Synthetic Minority Over-sampling Technique; SP, signal peptides; SVM, support vector machines; SVR, SVM regression; SVR-Low, SVR for class: "low"; XGB, XGBoost; pI, isoelectric point.
© 2022 The Authors.
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.