Theory Biosci. 2023 Nov;142(4):359-370.
doi: 10.1007/s12064-023-00402-3. Epub 2023 Aug 30.

MLACNN: an attention mechanism-based CNN architecture for predicting genome-wide DNA methylation

JianGuo Bai et al. Theory Biosci. 2023 Nov.

Abstract

DNA methylation is an important epigenetic modification that plays a crucial role in regulating biological processes. While traditional experimental methods for detecting methylation are constantly improving, advances in artificial intelligence have made deep learning and machine learning methods a new trend. However, traditional machine learning-based methods rely heavily on manual feature extraction, and most deep learning methods for studying methylation extract few features because of their simple network structures. To address this, we propose a bottleneck network based on an attention mechanism and use new methods to ensure that the deep network can learn more effective features while minimizing overfitting. This approach enables the model to learn more features from nucleotide sequences and make better predictions of methylation. The model encodes the original DNA sequence with three coding methods and then applies attention-based feature fusion to find the best way of combining the resulting features. Our results demonstrate that MLACNN outperforms previous methods and achieves more satisfactory performance.

Keywords: Attention CNN; Genome wide methylation detection; Hybrid neural network.
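
As an illustration of the "three coding methods" step, the sketch below encodes a DNA sequence with one-hot, NCP (nucleotide chemical property), and EIIP (electron-ion interaction pseudopotential) schemes, which are the encodings named in the Fig. 3 caption. It is a minimal sketch, not the authors' implementation: the lookup tables use the values commonly reported for these schemes, while the function names and the all-zero handling of unknown bases such as N are assumptions made here, and the paper's "EIIP-vector" processing may differ.

```python
# Per-base lookup tables (commonly used values; assumed, not taken from the paper).
ONE_HOT = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0], "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
# NCP: (ring structure, functional group, hydrogen bonding).
NCP = {"A": [1, 1, 1], "C": [0, 1, 0], "G": [1, 0, 0], "T": [0, 0, 1]}
# EIIP: one scalar per nucleotide.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def encode_one_hot(seq):
    """Return a len(seq) x 4 matrix; unknown bases map to all zeros (assumption)."""
    return [list(ONE_HOT.get(base, [0, 0, 0, 0])) for base in seq.upper()]


def encode_ncp(seq):
    """Return a len(seq) x 3 matrix; unknown bases map to all zeros (assumption)."""
    return [list(NCP.get(base, [0, 0, 0])) for base in seq.upper()]


def encode_eiip(seq):
    """Return a len(seq) x 1 matrix; unknown bases map to 0.0 (assumption)."""
    return [[EIIP.get(base, 0.0)] for base in seq.upper()]


if __name__ == "__main__":
    window = "ACGTN"  # hypothetical sequence window around a CpG site
    for name, encoder in [("one-hot", encode_one_hot),
                          ("NCP", encode_ncp),
                          ("EIIP", encode_eiip)]:
        print(name, encoder(window))
```

Each encoder yields one matrix per sequence window, so the three outputs could feed three parallel feature extraction branches of the kind the abstract describes.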

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Abstract modular structure of MLACNN: the original data enter three similar feature extraction modules through three coding methods to obtain abstract features; the features are then fused by an attention mechanism in the feature fusion module to obtain the prediction results
Fig. 2
a Specific architecture of the CNN attention block of the MLACNN feature extraction module; b specific architecture of the feature fusion module
Fig. 3
Confusion matrices of the proposed model. a–c Confusion matrices on the test set when the data are processed with one-hot encoding, NCP encoding, and EIIP-vector encoding, respectively, and predicted with our MLA-BCS module. d Fusion: the features extracted in a–c are fused, the model is retrained, and the confusion matrix on the test set is shown
Fig. 4
Boxplot of six indicators on MLACNN, MRCNN, DeepCpG and DeepCpG CNN: a Sn, b Sp, c precision, d ACC, e MCC, f AUC
Fig. 5
a–c Visualization results of t-SNE clustering for the raw data, the data processed by the first layer of MLACNN, and the data processed by the last fully connected layer, respectively. d–f Visualization results for the same data processed by PCA
Fig. 6
Six indicators of liver, skin, pancreatic, and lung cells on MLACNN: a AUC, b ACC, c MCC, d Se, e Sn, and f precision
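
Most of the indicators named in the figure captions (Sn, Sp, precision, ACC, MCC) can be derived from a binary confusion matrix of the kind shown in Fig. 3. The sketch below uses the standard textbook definitions; the function name, the example counts, and the convention of returning 0.0 when a denominator is zero are assumptions made here, and none of the numbers come from the paper. AUC is omitted because it requires prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute Sn, Sp, precision, ACC and MCC from confusion-matrix counts."""

    def safe_div(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sn": safe_div(tp, tp + fn),         # sensitivity (recall)
        "Sp": safe_div(tn, tn + fp),         # specificity
        "precision": safe_div(tp, tp + fp),
        "ACC": safe_div(tp + tn, tp + fn + fp + tn),
        "MCC": safe_div(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=40, fn=10, fp=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```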
