Comput Struct Biotechnol J. 2022 Jun 3;20:2909-2920.

doi: 10.1016/j.csbj.2022.06.006. eCollection 2022.

PERISCOPE-Opt: Machine learning-based prediction of optimal fermentation conditions and yields of recombinant periplasmic protein expressed in Escherichia coli

Kulandai Arockia Rajesh Packiam¹, Chien Wei Ooi¹,², Fuyi Li³, Shutao Mei⁴, Beng Ti Tey¹,², Huey Fang Ong⁵, Jiangning Song⁴,⁶, Ramakrishnan Nagasundara Ramanan¹

Affiliations

¹ Chemical Engineering Discipline, School of Engineering, Monash University Malaysia, Jalan Lagoon Selatan, 47500 Bandar Sunway, Malaysia.
² Advanced Engineering Platform, Monash University Malaysia, Jalan Lagoon Selatan, 47500 Bandar Sunway, Selangor, Malaysia.
³ Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Victoria 3010, Australia.
⁴ Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Victoria 3800, Australia.
⁵ School of Information Technology, Monash University Malaysia, Jalan Lagoon Selatan, 47500 Bandar Sunway, Malaysia.
⁶ Monash Centre for Data Science, Faculty of Information Technology, Monash University, Victoria 3800, Australia.

PMID: 35765650
PMCID: PMC9201004
DOI: 10.1016/j.csbj.2022.06.006

Kulandai Arockia Rajesh Packiam et al. Comput Struct Biotechnol J. 2022.

Abstract

Optimization of the fermentation process for recombinant protein production (RPP) is often resource-intensive. Machine learning (ML) approaches help minimize experimentation and are widely applied in RPP. However, existing ML-based tools focus primarily on features derived from the amino acid sequence and leave out the influence of fermentation process conditions. The present study combines features derived from fermentation process conditions with those derived from the amino acid sequence to construct an ML-based model that predicts the maximal protein yield and the corresponding fermentation conditions for expression of a target recombinant protein in the Escherichia coli periplasm. In the first stage, two sets of XGBoost classifiers classify the expression level of the target protein as high (>50 mg/L), medium (between 0.5 and 50 mg/L), or low (<0.5 mg/L). The second stage consists of three regression models, based on support vector machines and random forest, that predict the expression yield for each expression-level class. Independent tests showed that the predictor achieved an overall average accuracy of 75% and a Pearson correlation coefficient of 0.91 for the correctly classified instances. The model therefore offers a reliable substitute for numerous trial-and-error experiments in identifying the optimal fermentation conditions and yield for RPP. It is also implemented as an open-access webserver, PERISCOPE-Opt (http://periscope-opt.erc.monash.edu).
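The two-stage design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the split into a "medium vs. non-medium" classifier followed by a "high vs. low" classifier is an assumption suggested by the Fig. 1 caption, the treatment of the boundary values 0.5 and 50 mg/L as medium is an assumption, and every function name, toy model, and data point below is hypothetical. The classifiers and regressors are passed in as plain callables, so trained XGBoost, SVR, or random forest models could be substituted.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]


def yield_class(yield_mg_per_l: float) -> str:
    """Label a yield with the thresholds stated in the abstract."""
    if yield_mg_per_l > 50:
        return "high"
    if yield_mg_per_l < 0.5:
        return "low"
    return "medium"  # assumption: exactly 0.5 or 50 mg/L counts as medium


def predict_two_stage(
    x: Features,
    is_medium: Callable[[Features], bool],
    is_high: Callable[[Features], bool],
    regressors: Dict[str, Callable[[Features], float]],
) -> Tuple[str, float]:
    """Stage 1 picks a class; stage 2 applies that class's regressor."""
    if is_medium(x):          # hypothetical classifier 1: medium vs. non-medium
        label = "medium"
    elif is_high(x):          # hypothetical classifier 2: high vs. low
        label = "high"
    else:
        label = "low"
    return label, regressors[label](x)


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) < 2 or len(xs) != len(ys):
        raise ValueError("need two equal-length lists with at least 2 values")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sqrt(sum((a - mx) ** 2 for a in xs))
    sy = sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)  # raises ZeroDivisionError if either list is constant


def evaluate(
    samples: List[Tuple[Features, float]],
    predict: Callable[[Features], Tuple[str, float]],
) -> Tuple[float, float]:
    """Return class accuracy and the PCC over correctly classified samples."""
    true_hits: List[float] = []
    pred_hits: List[float] = []
    for x, true_yield in samples:
        label, predicted_yield = predict(x)
        if label == yield_class(true_yield):
            true_hits.append(true_yield)
            pred_hits.append(predicted_yield)
    return len(true_hits) / len(samples), pearson(true_hits, pred_hits)


# Toy stand-ins for trained models (hypothetical, single made-up feature).
toy_regressors = {
    "low": lambda x: 0.4 * x[0],
    "medium": lambda x: 4.0 * x[0],
    "high": lambda x: 6.0 * x[0],
}


def toy_predict(x: Features) -> Tuple[str, float]:
    return predict_two_stage(
        x,
        is_medium=lambda f: 1.0 <= f[0] < 10.0,
        is_high=lambda f: f[0] >= 10.0,
        regressors=toy_regressors,
    )


# Made-up (features, measured yield in mg/L) pairs, for illustration only.
samples = [([0.5], 0.2), ([2.0], 5.0), ([5.0], 20.0), ([12.0], 80.0), ([20.0], 0.3)]

if __name__ == "__main__":
    accuracy, pcc = evaluate(samples, toy_predict)
    print(f"accuracy={accuracy:.2f}, PCC on correctly classified={pcc:.3f}")
```

Computing the correlation only over correctly classified samples mirrors how the abstract reports its Pearson coefficient; a misclassified sample is routed to the wrong regressor, so its predicted yield is excluded from that figure and counted only against accuracy.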

Keywords: Escherichia coli; Machine learning; Optimization; Periplasmic expression; Prediction model; Recombinant protein production.

Abbreviations: AUC, area under the curve; CV, cross-validation; CfsSubsetEval, Correlation-based Forward Selection Subset Evaluator; ClassifierSubsetEval, Classifier Subset Evaluator; E. coli, Escherichia coli; FC1, Feature Category 1; FC2, Feature Category 2; FC3, Feature Category 3; FC4, Feature Category 4; IPTG, isopropyl β-D-1-thiogalactopyranoside; LOOCV, leave-one-out cross-validation; MAE, mean absolute error; MCC, Matthews correlation coefficient; ML, machine learning; MLR, machine learning in R; OD, optical density at 600 nm; PCC, Pearson correlation coefficient; RF, random forest; RFR, RF regression; RFR-High, RFR for the high class; RFR-Medium, RFR for the medium class; RMSE, root mean squared error; RPP, recombinant protein production; RSM, response surface methodology; SMOTE, Synthetic Minority Over-sampling Technique; SP, signal peptides; SVM, support vector machines; SVR, SVM regression; SVR-Low, SVR for the low class; XGB, XGBoost; pI, isoelectric point.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

**Fig. 1**
**Framework of the proposed prediction model.** Low: yield is below 0.5 mg/L; Medium: yield is between 0.5 and 50 mg/L; High: yield is above 50 mg/L. Non-medium refers to High and Low together.

**Fig. 2**
**Feature importance for a) XGB Classifier 1 b) XGB Classifier 2.** Performance of the model was evaluated using ten repetitions of 10-fold cross-validation (100 experiments).

**Fig. 3**
**Feature importance for a) SVR-Low b) RFR-Medium c) RFR-High.** Performance of the model was evaluated using ten repetitions of 10-fold cross-validation (100 experiments).

**Fig. 4**
**Benchmarking of the performance of different algorithms.** a) Classification tasks for both training and testing datasets b) Regression tasks for both training and testing datasets.

Cited by

Immunogenic potential and neutralizing ability of a heterologous version of the most abundant three-finger toxin from the coral snake Micrurus mipartitus.
Giraldo LER, Pulido S, Berrío MA, Flórez MF, Rey-Suárez P, Núñez-Rangel V, Córdoba MS, Pereañez JA. J Venom Anim Toxins Incl Trop Dis. 2024 Nov 25;30:e20230074. doi: 10.1590/1678-9199-JVATITD-2023-0074. eCollection 2024. PMID: 39628669. Free PMC article.
Heterologous Expression and Immunogenic Potential of the Most Abundant Phospholipase A₂ from Coral Snake Micrurus dumerilii to Develop Antivenoms.
Romero-Giraldo LE, Pulido S, Berrío MA, Flórez MF, Rey-Suárez P, Nuñez V, Pereañez JA. Toxins (Basel). 2022 Nov 24;14(12):825. doi: 10.3390/toxins14120825. PMID: 36548722. Free PMC article.
Machine learning-assisted medium optimization revealed the discriminated strategies for improved production of the foreign and native metabolites.
Aida H, Uchida K, Nagai M, Hashizume T, Masuo S, Takaya N, Ying BW. Comput Struct Biotechnol J. 2023 Apr 20;21:2654-2663. doi: 10.1016/j.csbj.2023.04.020. eCollection 2023. PMID: 37138901. Free PMC article.
Artificial intelligence-driven systems engineering for next-generation plant-derived biopharmaceuticals.
Parthiban S, Vijeesh T, Gayathri T, Shanmugaraj B, Sharma A, Sathishkumar R. Front Plant Sci. 2023 Nov 15;14:1252166. doi: 10.3389/fpls.2023.1252166. eCollection 2023. PMID: 38034587. Free PMC article. Review.
Towards AI-designed genomes using a variational autoencoder.
Dudek NK, Precup D. Proc Biol Sci. 2024 Dec;291(2036):20241457. doi: 10.1098/rspb.2024.1457. Epub 2024 Dec 11. PMID: 39657811. Free PMC article.
