Front Microbiol. 2022 Nov 4;13:1061122.
doi: 10.3389/fmicb.2022.1061122. eCollection 2022.

iProm-phage: A two-layer model to identify phage promoters and their types using a convolutional neural network


Muhammad Shujaat et al. Front Microbiol. 2022.

Abstract

The increased interest in phages as antibacterial agents has led to a rise in the number of sequenced phage genomes, necessitating user-friendly bioinformatics tools for genome annotation. Promoters are DNA sequences whose identification is part of annotating phage genomes. In this study, we propose a two-layer model called "iProm-phage" for the prediction and classification of phage promoters. The first layer identifies a query sequence as a promoter or non-promoter; if the sequence is predicted to be a promoter, the second layer classifies it as a phage or host promoter. Furthermore, rather than using non-coding regions of the genome as a negative set, we created a more challenging negative dataset using promoter sequences. This approach improves discrimination while decreasing the frequency of false-positive predictions. For feature selection, we investigated 10 distinct feature-encoding approaches and used them with several machine-learning algorithms and a 1-D convolutional neural network (CNN) model. We found that one-hot encoding combined with the CNN model performed best on the performance metrics. Based on the results of 5-fold cross-validation, the proposed predictor shows high potential. To make it easier for experimental scientists to obtain the results they require, we set up a freely accessible and user-friendly web server at http://nsclbio.jbnu.ac.kr/tools/iProm-phage/.
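
The one-hot encoding and the two-layer decision flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the all-zero encoding for ambiguous bases, and the toy scoring functions that stand in for the trained CNN layers are all assumptions made for illustration.

```python
from typing import Callable, List

NUCLEOTIDES = "ACGT"  # column order of the encoding (an assumption)

Matrix = List[List[int]]
Scorer = Callable[[Matrix], float]  # stand-in for a trained model's probability output


def one_hot_encode(seq: str) -> Matrix:
    """Encode a DNA sequence as a len(seq) x 4 matrix with columns A, C, G, T.

    Symbols outside ACGT (e.g. N) become all-zero rows; this handling is an
    assumption, not something stated in the abstract.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    matrix = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        matrix.append(row)
    return matrix


def two_layer_predict(seq: str, layer1: Scorer, layer2: Scorer,
                      threshold: float = 0.5) -> str:
    """Cascade: layer 1 decides promoter vs. non-promoter; only sequences
    predicted as promoters reach layer 2, which decides phage vs. host."""
    x = one_hot_encode(seq)
    if layer1(x) < threshold:
        return "non-promoter"
    return "phage promoter" if layer2(x) >= threshold else "host promoter"


# Hypothetical toy scorers used only to exercise the cascade; a real system
# would call trained classifiers here.
def toy_layer1(x: Matrix) -> float:
    return sum(row[0] + row[3] for row in x) / max(len(x), 1)  # A/T fraction


def toy_layer2(x: Matrix) -> float:
    return sum(row[1] + row[2] for row in x) / max(len(x), 1)  # G/C fraction


if __name__ == "__main__":
    for query in ("TATAAT", "GCGCGC"):
        print(query, "->", two_layer_predict(query, toy_layer1, toy_layer2))
```

Keeping the two scorers as separate callables mirrors the two-layer design: the second classifier is only consulted for sequences that the first one accepts.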

Keywords: DNA promoters; bioinformatics; computational biology; convolutional neural networks; phages.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Flow diagram of iProm-phage.
Figure 2. Flow diagram of the two-layer model.
Figure 3. iProm-phage CNN architecture.
Figure 4. Accuracy of first-layer baseline models.
Figure 5. Accuracy of second-layer baseline models.
Figure 6. First-layer ROC curve.
Figure 7. Second-layer ROC curve.
Figure 8. Web server: adding a query sequence.
Figure 9. Predictor output.

