BMC Bioinformatics. 2022 Dec 1;23(1):517.

doi: 10.1186/s12859-022-05070-6.

ENTAIL: yEt aNoTher amyloid fIbrils cLassifier

Alessia Auriemma Citarella¹, Luigi Di Biasi², Fabiola De Marco², Genoveffa Tortora²

Affiliations

¹ Department of Computer Science, University of Salerno, Fisciano, Italy. aauriemmacitarella@unisa.it.
² Department of Computer Science, University of Salerno, Fisciano, Italy.

PMID: 36456900
PMCID: PMC9714056
DOI: 10.1186/s12859-022-05070-6

Alessia Auriemma Citarella et al. BMC Bioinformatics. 2022.

Abstract

Background: This research aims to increase our knowledge of amyloidoses. These disorders involve incorrect protein folding, which affects protein structure and functionality. Fibrillar deposits are the basis of some well-known diseases, such as Alzheimer's disease, Creutzfeldt-Jakob disease and type II diabetes. For many of these amyloid proteins, the relative precursors are known. Discovering new protein precursors involved in forming amyloid fibril deposits would improve understanding of the pathological processes of amyloidoses.

Results: A new classifier, called ENTAIL, was developed using more than 4000 molecular descriptors. ENTAIL is based on a Naive Bayes classifier with unbounded support and a Gaussian kernel. On a balanced dataset, it reached a test-set accuracy of 81.80%, a sensitivity (SN) of 100%, a specificity (SP) of 63.63% and a Matthews correlation coefficient (MCC) of 0.683.
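To make the classifier family named in the Results concrete, the following is a minimal sketch of a naive Bayes classifier whose per-feature class-conditional densities are Gaussian kernel density estimates defined over the whole real line (one common reading of "unbounded support"). It is not the authors' implementation: the class name `KernelNaiveBayes`, the fixed bandwidth, the two-descriptor toy data and the labels are all assumptions made for illustration, and the abstract does not state which toolkit or bandwidth rule was used.

```python
import math
from collections import defaultdict


def gaussian_kde_logpdf(x, samples, bandwidth):
    """Log of a Gaussian kernel density estimate at x (defined on the whole real line)."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return math.log(max(total, 1e-300)) - math.log(norm)  # floor avoids log(0)


class KernelNaiveBayes:
    """Naive Bayes with one Gaussian KDE per (class, feature) pair. Illustrative only."""

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth  # assumed fixed value, not taken from the paper

    def fit(self, X, y):
        rows_by_class = defaultdict(list)
        for row, label in zip(X, y):
            rows_by_class[label].append(row)
        n = len(y)
        # Class priors estimated from class frequencies in the training data.
        self.log_prior = {c: math.log(len(r) / n) for c, r in rows_by_class.items()}
        # For each class, transpose rows so that each entry holds one feature's values.
        self.columns = {c: list(zip(*r)) for c, r in rows_by_class.items()}
        return self

    def joint_log_score(self, row):
        """Unnormalised log posterior: log prior plus summed per-feature log densities."""
        return {
            c: self.log_prior[c]
            + sum(gaussian_kde_logpdf(v, col, self.bandwidth) for v, col in zip(row, cols))
            for c, cols in self.columns.items()
        }

    def predict(self, row):
        scores = self.joint_log_score(row)
        return max(scores, key=scores.get)


# Toy data: two made-up descriptors per sequence, three sequences per class.
X = [[0.1, 1.0], [0.3, 1.2], [0.2, 0.9], [2.0, 3.1], [2.2, 2.9], [1.9, 3.0]]
y = ["non-amyloid"] * 3 + ["amyloid"] * 3

model = KernelNaiveBayes(bandwidth=0.5).fit(X, y)
print(model.predict([0.2, 1.1]))
print(model.predict([2.1, 3.0]))
```

The "naive" part is the sum over features in `joint_log_score`, which treats descriptors as conditionally independent given the class. With thousands of descriptors, working in log space as above avoids numerical underflow.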

Conclusions: The analysis showed that, across the various test configurations, performance was best on a balanced dataset.
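The four figures reported in the Results (accuracy, sensitivity, specificity and Matthews correlation coefficient) can all be derived from a single binary confusion matrix. The sketch below shows the standard formulas. The counts passed in at the end are hypothetical: the abstract does not give the test-set size, and these values were chosen only because they describe a balanced set whose ratios land close to the reported numbers.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fn + tn + fp),
        "sensitivity": _ratio(tp, tp + fn),  # SN: true positive rate
        "specificity": _ratio(tn, tn + fp),  # SP: true negative rate
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical balanced test set: 11 positives and 11 negatives (an assumption).
metrics = binary_metrics(tp=11, fn=0, tn=7, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.818
# sensitivity: 1.000
# specificity: 0.636
# mcc: 0.683
```

With perfect sensitivity but four false positives out of eleven negatives, accuracy alone would overstate performance. The MCC uses all four cells of the matrix, which is why it is often reported alongside accuracy.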

Keywords: Amyloidoses; Fibrils machine learning; Protein classification.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

**Fig. 1**
The composition of the aggregated dataset based on the length of the sequences

**Fig. 2**
Comparison of the experiments and their configuration in the test phase

**Fig. 3**
Confusion matrices for the best experiments. From left to right: experiment 2, experiment 5, and experiment 8

**Fig. 4**
ROC curves for the best experiments. From left to right: experiment 2, experiment 5, and experiment 8
