Nat Commun. 2025 Aug 19;16(1):7705.

doi: 10.1038/s41467-025-62597-x.

Knowledge-guided self-learning control strategy for mixed vehicle platoons with delays

Jingyao Wang¹, Huinian Wang², Jian Song¹, Xingyu Chen¹, Jinghua Guo³, Keqiang Li⁴, Xunrui Li¹, Bowen Huang⁵

Affiliations

¹ School of Aerospace Engineering, Xiamen University, Xiamen, P. R. China.
² Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen, P. R. China.
³ Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen, P. R. China. guojh@xmu.edu.cn.
⁴ School of Vehicle and Mobility, Tsinghua University, Beijing, P. R. China. likq@tsinghua.edu.cn.
⁵ School of Mechanical and Vehicle Engineering, Chongqing University, Chongqing, P. R. China.

PMID: 40830216
PMCID: PMC12365019
DOI: 10.1038/s41467-025-62597-x

Jingyao Wang et al. Nat Commun. 2025.

Abstract

Autonomous and traditional vehicles will coexist for several decades, so managing mixed traffic efficiently while improving road throughput, fuel economy and traffic stability is a challenge. The difficulty stems from the randomness and heterogeneity of the traditional vehicles interspersed among autonomous vehicles. Moreover, communication delays arising from the shared wireless network substantially degrade the platooning control performance of connected autonomous vehicles. To address these problems, this paper proposes a knowledge-guided self-learning mixed platoon control strategy. First, the strategy extracts key features of the continuous and aggregated behavior of traditional vehicles, such as the desired time-varying time gap and standstill spacing, by integrating knowledge from the kinematic wave model and Newell's car-following model. This helps autonomous vehicles predict traditional vehicles' trajectories. Second, to cope with delayed current-state information, the strategy incorporates previous control instructions into the state representation of the soft actor-critic algorithm. Simulations show that the proposed strategy outperforms existing methods in traffic stability, passenger comfort, energy consumption cost and traffic oscillation dampening, with a zero collision rate in vehicle merging and diverging scenarios. The framework provides a generalizable and scalable solution for the development and adoption of connected autonomous vehicle systems.
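The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no equations, so the first function uses the textbook discrete form of Newell's simplified car-following rule with a constant time gap `tau` and standstill spacing `d` (the paper describes a time-varying time gap), and the class shows one common way to append recent control commands to a delayed observation before passing it to a learning agent. The names `newell_step`, `DelayAugmentedState`, `horizon`, and all numeric values are hypothetical.

```python
from collections import deque


def newell_step(x_follower, x_leader, tau, d, v_free):
    """Follower position one time gap `tau` later under Newell's simplified rule.

    The follower either travels at free-flow speed or reproduces the leader's
    position shifted back by the standstill spacing `d`, whichever is more
    restrictive. Constant `tau` and `d` are an assumption of this sketch.
    """
    free_flow = x_follower + v_free * tau   # unconstrained travel
    congested = x_leader - d                # constrained by the leader
    return min(free_flow, congested)


class DelayAugmentedState:
    """Appends the last `horizon` control commands to a delayed observation.

    This mirrors the general idea of including previous control instructions
    in the agent's state; the exact state layout used in the paper is not
    given in the abstract.
    """

    def __init__(self, horizon):
        # Start with zero commands so the state has a fixed length from step 0.
        self.actions = deque([0.0] * horizon, maxlen=horizon)

    def build(self, delayed_obs):
        # Delayed observation first, then the command history, oldest to newest.
        return list(delayed_obs) + list(self.actions)

    def record(self, action):
        # Appending to a full deque drops the oldest command automatically.
        self.actions.append(action)


# Hypothetical usage: positions in metres, speed in m/s, time gap in seconds.
next_pos = newell_step(x_follower=0.0, x_leader=30.0, tau=1.0, d=7.0, v_free=30.0)

state = DelayAugmentedState(horizon=2)
state.record(0.5)                  # acceleration command sent at the previous step
augmented = state.build([1.0, 2.0])  # e.g. delayed spacing error and speed error
```

The fixed-length command history keeps the input dimension constant, which is what a standard soft actor-critic network expects; choosing `horizon` to cover the largest expected delay is a design assumption here, not something the abstract specifies.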

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

**Fig. 1. An overview of the mixed platoon composed of autonomous vehicles and human-driven vehicles.**
a Comparisons between autonomous vehicles and human-driven vehicles (HV). b The NGSIM data show that the sudden braking and acceleration behaviors of an HV lead to traffic oscillation among the following HVs. c The general mixed platoon (I) and the “CV-HVs-CV” sub-platoon (II). The general mixed platoon (I) can be decomposed into sub-platoons (II), which are shown in the dash-dotted blocks.

**Fig. 2. The proposed knowledge-guided self-learning mixed platoon control framework.**
CV = connected vehicle. HV = human-driven vehicle. V2X = vehicle-to-everything. Here, CV 0 is designated as the leading vehicle.

**Fig. 3. The velocity, acceleration and position curves of a mixed platoon under different control strategies.**
CV = connected vehicle. HV = human-driven vehicle. a A schematic diagram showing the mixed platoon composed of seven vehicles. b Simulated results using CVDS-IDM strategy. c Simulated results using DDPG. d Simulated results using AC. e Simulated results using PPO. f Simulated results using the proposed SAC strategy. Source data are provided as a Source Data file.

**Fig. 4. Performance comparison among AC, PPO, CVDS-IDM, DDPG and the proposed knowledge-guided self-learning SAC strategies.**
PPO = proximal policy optimization. DDPG = deep deterministic policy gradient. AC = actor-critic. CVDS-IDM = connected vehicle driving strategy integrated with an intelligent driver model. SAC = soft actor-critic. Source data are provided as a Source Data file.

**Fig. 5. The superiority percentage of the proposed strategy compared with existing methods for CVs in a mixed platoon.**
a Following CV 2. b Following CV 6. PPO = proximal policy optimization. DDPG = deep deterministic policy gradient. AC = actor-critic. CVDS-IDM = connected vehicle driving strategy integrated with an intelligent driver model. SAC = soft actor-critic. Source data are provided as a Source Data file.

**Fig. 6. The velocity, acceleration and spacing curves of CVs in a mixed platoon under different communication delays.**
a–c Following CV 2. d–f Following CV 6. Source data are provided as a Source Data file.

**Fig. 7. The velocity and acceleration heat map based on the position trajectories for the mixed platoon with different penetration rates.**
Source data are provided as a Source Data file.

**Fig. 8. Average dampening ratio, driver comfort cost, energy consumption cost and flow stability of six vehicles for random delays.**
Center blue lines of the boxes represent the median. Vehicle 1 is guided by the NGSIM data, while vehicles 2 and 6 are connected autonomous vehicles controlled by the proposed strategy. The remaining vehicles are traditional vehicles. Source data are provided as a Source Data file.

**Fig. 9. Averaged performance indicators for platoons with different numbers of vehicles.**
The oscillation dampening, driving comfort cost, energy consumption and flow stability of the mixed platoon under the proposed knowledge-guided self-learning SAC algorithm are shown by the orange curves, while those of the traditional platoon are shown by the blue curves. HV = human-driven vehicle. Source data are provided as a Source Data file.

**Fig. 10. Position profiles of a mixed platoon in the presence of lane changing behavior of an HV (in red) under CVDS-IDM, DDPG, PPO and the proposed SAC strategy.**
CV = connected vehicle. HV = human-driven vehicle. a–c CVDS-IDM. d–f DDPG. g–i PPO. j–l The proposed knowledge-guided self-learning SAC strategy. Source data are provided as a Source Data file.

**Fig. 11. The algorithm procedure of knowledge-guided self-learning mixed platoon control framework.**
CV = connected vehicle. HV = human-driven vehicle. SAC = soft actor-critic.
