Sensors (Basel). 2025 Aug 6;25(15):4841.
doi: 10.3390/s25154841.

(H-DIR)2: A Scalable Entropy-Based Framework for Anomaly Detection and Cybersecurity in Cloud IoT Data Centers

Davide Tosi et al. Sensors (Basel).

Abstract

Modern cloud-based Internet of Things (IoT) infrastructures face increasingly sophisticated and diverse cyber threats that challenge traditional detection systems in terms of scalability, adaptability, and explainability. In this paper, we present (H-DIR)2, a hybrid entropy-based framework designed to detect and mitigate anomalies in large-scale heterogeneous networks. The framework combines Shannon entropy analysis with Associated Random Neural Networks (ARNNs) and integrates semantic reasoning through RDF/SPARQL, all embedded within a distributed Apache Spark 3.5.0 pipeline. We validate (H-DIR)2 across three critical attack scenarios, namely SYN flood (TCP), DAO-DIO (RPL), and NTP amplification (UDP), using real-world datasets. The system achieves a mean detection latency of 247 ms and an AUC of 0.978 for SYN floods. For DAO-DIO manipulations, it increases the packet delivery ratio from 81.2% to 96.4% (p < 0.01), and for NTP amplification, it reduces the peak load by 88%. The framework achieves vertical scalability across millions of endpoints and horizontal scalability on datasets exceeding 10 TB. All code, datasets, and Docker images are provided to ensure full reproducibility. By coupling adaptive neural inference with semantic explainability, (H-DIR)2 offers a transparent and scalable solution for cloud-IoT cybersecurity, establishing a robust baseline for future developments in edge-aware and zero-day threat detection.
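
The abstract names Shannon entropy analysis as the first stage of the framework but does not spell out the computation. The following is a minimal sketch of that general idea only, not the authors' implementation: it computes the entropy of one traffic feature per time window and flags windows where the entropy shifts sharply. The function names, the choice of feature (destination address of SYN packets), the sample addresses, and the threshold of 1.0 bit are all assumptions made for illustration; the ARNN, RDF/SPARQL, and Spark stages are not represented.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_deltas(windows):
    """Entropy change between each window and the one before it."""
    entropies = [shannon_entropy(w) for w in windows]
    return [later - earlier for earlier, later in zip(entropies, entropies[1:])]


def flag_anomalies(windows, threshold=1.0):
    """Indices of windows whose entropy differs from the previous window
    by more than `threshold` bits (hypothetical threshold, untuned)."""
    deltas = entropy_deltas(windows)
    return [i + 1 for i, d in enumerate(deltas) if abs(d) > threshold]


# Hypothetical data: destination addresses seen in SYN packets per window.
# Normal traffic is spread over four hosts; the flood targets a single host.
baseline = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 4
flood = ["10.0.0.9"] * 15 + ["10.0.0.1"]
windows = [baseline, baseline, flood]

if __name__ == "__main__":
    print([round(shannon_entropy(w), 3) for w in windows])
    print(flag_anomalies(windows))
```

In this toy input the entropy of the destination distribution collapses when traffic concentrates on one victim, so only the third window is flagged. A fixed threshold is the simplest possible decision rule; the paper's abstract indicates that the actual system feeds entropy signals into a neural model instead.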

Keywords: RDF/SPARQL explainability; associated random neural network (ARNN); cloud–IoT security; entropy-based anomaly detection; hybrid distributed information retrieval; semantic adaptive cyber defense; sub-second detection latency.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
(Workflow): simulation pipeline of the (H-DIR)2 framework.
Figure 2
(Dual-scalability): dual-scalability of the H-DIR architecture.
Figure 3
(Dual-level cycle): bidirectional semantic–neural coupling and its dynamic update cycle.
Figure 4
(ETL-bridge): the diagram shows how RDF triples and SPARQL rules are vectorized and injected into the ARNN core, producing neural activations and a dynamic attack graph.
Figure 5
(Graph matrix): (a) spatial distribution of the entropy variation ΔH in the RPL DAO-DIO attack (red = higher disorder). (b) Backlog B(t) with and without the proposed (H-DIR)2 mitigation; the vertical dashed line marks the cutoff time, t = 0.43 s.
Figure 6
(dao-rpl): DAO-DIO attack. Comparison before and after mitigation.
Figure 7
(dao-routing): NTP amplification. Dynamically reconfigured routing.
Figure 8
(NTP amp): traffic overload observed during a spoofed NTP amplification attack (amplification ×500).
Figure 9
(NTP amp): (a) peak load reduction achieved by four mitigation stacks as the number of edge nodes increases. (b) Learning dynamics of the ARNN early-stage predictor over 20 training epochs.
Figure 10
(Scalability of stress): throughput and latency vs. device count. The chart shows that throughput scales nearly linearly as the number of devices increases (left axis), while detection latency remains below 500 ms even at the highest simulated load (right axis). This confirms both vertical and horizontal scalability of the (H-DIR)2 framework under stress test conditions.
Figure 11
(Semantic graph): view of the semantic–adaptive integration loop.

