PLoS One. 2025 Dec 31;20(12):e0339675.
doi: 10.1371/journal.pone.0339675. eCollection 2025.

Meta-path guided policy distillation for resilient coordination in autonomous unmanned swarm

Xingye Han et al. PLoS One.

Abstract

Enhancing the resilience of Autonomous Unmanned Swarms (AUS) requires policies that remain effective under severe, structured disruptions while respecting the heterogeneous semantics of inter-subsystem interactions. Existing reinforcement learning (RL) approaches typically aggregate first-order neighborhoods in a path-agnostic manner, thereby blurring typed, ordered, and directed multi-hop dependencies encoded by domain meta-paths. We propose MPGPD-RC, a Meta-Path Guided Policy Distillation framework for Resilient Coordination that couples: (i) meta-path-guided embeddings learned by path-specific graph attention with contrastive reconstruction and attention fusion, and (ii) a teacher-student scheme in which a PPO teacher trained with a relaxed meta-path mask provides trajectories, and a student aligns both action distributions (KL) and trajectory-level structural codes via path-aware contrastive learning. Empirical evaluations validate that MPGPD-RC consistently surpasses state-of-the-art baselines across diverse perturbation scenarios by modeling complex, high-order dependencies that underpin resilient coordination.
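
The student objective described in the abstract pairs a KL term on action distributions with a contrastive term on trajectory-level structural codes. A minimal sketch of how such a combined loss could be computed is given below in plain Python. The KL direction (teacher to student), the InfoNCE form of the contrastive term, cosine similarity, the temperature, and the weights `kl_weight` and `contrastive_weight` are all assumptions made for illustration. The abstract does not specify them, and every function name here is hypothetical.

```python
import math


def softmax(logits):
    """Convert raw action logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cosine(u, v):
    """Cosine similarity between two code vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: pull the anchor toward the positive code and away
    from the negative codes (assumed form of the contrastive term)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def distillation_loss(teacher_logits, student_logits,
                      student_code, teacher_code, negative_codes,
                      kl_weight=1.0, contrastive_weight=0.5):
    """Weighted sum of the action-level KL term and the code-level
    contrastive term. Both weights are illustrative, not from the paper."""
    kl = kl_divergence(softmax(teacher_logits), softmax(student_logits))
    nce = info_nce(student_code, teacher_code, negative_codes)
    return kl_weight * kl + contrastive_weight * nce
```

In this sketch the loss is near zero when the student reproduces the teacher's action distribution and its structural code matches the teacher's code for the same trajectory, and it grows when either the actions or the codes diverge.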

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Depiction of two real-time recovery strategies within a four-class AUS configuration.
(a) Replacement action: a compromised Decision node (marked with a red cross) is substituted with a standby node of equivalent classification. The original Sensor→Decision→Influencer→Target meta-path is reconstituted via reactivation of corresponding links (indicated by dashed blue and green arrows). (b) Cooperation action: in the absence of available redundant assets, operational continuity is achieved through dynamic rerouting across an alternative assemblage of functional nodes.
Fig 2. The proposed Meta-Path Guided Policy Distillation for Resilient Coordination (MPGPD-RC) framework comprises two interdependent modules:
(a) Meta-Path Guided Embedding, responsible for encoding heterogeneous inter-node relationships into semantically rich and structurally aware representations, and (b) Meta-Path-Aware Policy Distillation, which facilitates the transfer of expert-level decision policies from a teacher model to a computationally efficient student model while maintaining the structural semantics intrinsic to the AUS topology.
Fig 3. Illustration of the AUS network topology for demo1 and demo2.
The system includes four types of entities: Sensor, Decision, Communication, and Actuator.
Fig 4. Robustness to random initial conditions.
For each damage ratio (40%–60%) we re-initialise the AUS topology fifty times and report the mean resilience (bars) with one-standard-deviation whiskers across the four attack modes.
Fig 5. Impact of hyperparameters α and β on resilience under four attack scenarios: (a) Sensor Node Attacks (SA), (b) Decision Node Attacks (DA), (c) Influencer Node Attacks (IA), and (d) Mixed Node Attacks (MA).
Fig 6. Comparison of resilience value during training for MPGPD-RC (red) and its variant without MPA (blue).
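
The Fig 4 caption describes a repeated-initialisation protocol: for each damage ratio, the topology is re-initialised fifty times and the mean resilience is reported with a one-standard-deviation spread. A minimal sketch of that aggregation step is given below. The grid of damage ratios (0.4, 0.5, 0.6), the use of the sample standard deviation, the seed handling, and the `toy_evaluate` stand-in are assumptions for illustration only; the actual resilience metric and simulator are not described in this excerpt, and the split across attack modes is omitted.

```python
import random
import statistics


def summarize_resilience(evaluate, damage_ratios=(0.4, 0.5, 0.6), runs=50, seed=0):
    """Return {damage_ratio: (mean, sample_std)} over repeated random runs.

    `evaluate(damage_ratio, rng)` is a hypothetical callable that builds a
    fresh random topology and returns one resilience score.
    """
    rng = random.Random(seed)
    summary = {}
    for ratio in damage_ratios:
        scores = [evaluate(ratio, rng) for _ in range(runs)]
        summary[ratio] = (statistics.mean(scores), statistics.stdev(scores))
    return summary


def toy_evaluate(damage_ratio, rng):
    """Placeholder metric, NOT the paper's: resilience falls with damage,
    plus small noise standing in for random initial conditions."""
    return max(0.0, 1.0 - damage_ratio + rng.uniform(-0.05, 0.05))


if __name__ == "__main__":
    for ratio, (mean, std) in summarize_resilience(toy_evaluate).items():
        print(f"damage={ratio:.0%}  mean={mean:.3f}  std={std:.3f}")
```

Passing a seeded `random.Random` instance into the evaluator keeps the fifty runs reproducible while still varying the initial conditions from run to run.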
