Nat Commun. 2025 Aug 19;16(1):7705.
doi: 10.1038/s41467-025-62597-x.

Knowledge-guided self-learning control strategy for mixed vehicle platoons with delays

Jingyao Wang et al. Nat Commun. 2025.

Abstract

Autonomous vehicles and traditional vehicles will coexist for several decades, so efficiently managing mixed traffic while improving road throughput, fuel efficiency and traffic stability is a challenge. The difficulty stems from the randomness and heterogeneity of traditional vehicles interspersed among autonomous vehicles. Moreover, communication delays arising from the shared wireless communication network substantially degrade the performance of platooning control for connected autonomous vehicles. To address these problems, this paper proposes a knowledge-guided self-learning mixed platoon control strategy. First, the strategy extracts key features of the continuous and aggregated behavior of traditional vehicles, such as the desired time-varying time gap and standstill spacing, by integrating knowledge from the kinematic wave model and Newell's car-following model. This helps autonomous vehicles predict traditional vehicles' trajectories. Second, to cope with delayed state information, the study incorporates previous control instructions into the state representation of the soft actor-critic algorithm. Simulations show that the proposed strategy outperforms existing methods in traffic stability, passenger comfort, energy consumption cost and traffic oscillation dampening, with a zero collision rate in vehicle merging and diverging scenarios. The framework provides a generalizable and scalable solution for the development and adoption of connected autonomous vehicle systems.
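The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no equations, so the first function uses the textbook form of Newell's simplified car-following model with a constant time gap (the paper describes a time-varying one), and the class shows one plausible way to append previous control instructions to a delayed observation before passing it to a soft actor-critic policy. All names (`newell_next_position`, `DelayAugmentedState`, `history_len`) are hypothetical.

```python
from collections import deque
from typing import Deque, List, Sequence


def newell_next_position(x_follower: float, x_leader: float,
                         free_speed: float, time_gap: float,
                         standstill_spacing: float) -> float:
    """Follower position one time gap later (Newell's simplified model).

    The follower either travels at free speed or is held back to the
    leader's current position minus the standstill spacing.
    Assumption: time_gap is constant here for simplicity.
    """
    free_flow = x_follower + free_speed * time_gap
    congested = x_leader - standstill_spacing
    return min(free_flow, congested)


class DelayAugmentedState:
    """Builds a policy input from a delayed observation plus recent actions.

    Assumption: the last `history_len` control instructions are simply
    concatenated after the observation, oldest first.
    """

    def __init__(self, action_dim: int, history_len: int) -> None:
        # Start with zero actions so the state size is fixed from step one.
        self._history: Deque[List[float]] = deque(
            [[0.0] * action_dim for _ in range(history_len)],
            maxlen=history_len,
        )

    def record_action(self, action: Sequence[float]) -> None:
        # The deque drops the oldest action automatically once full.
        self._history.append(list(action))

    def build(self, delayed_observation: Sequence[float]) -> List[float]:
        state = list(delayed_observation)
        for past_action in self._history:
            state.extend(past_action)
        return state
```

With a history of two one-dimensional actions, a delayed observation of `[gap, speed]` becomes a four-element state, which gives the learner the commands issued while the observation was in transit.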

Conflict of interest statement

Competing interests: Authors declare no competing interests.

Figures

Fig. 1. An overview of the mixed platoon composed of autonomous vehicles and human-driven vehicles.
a Comparisons between autonomous vehicles and human-driven vehicles (HV). b The NGSIM data show that the sudden braking and acceleration behaviors of an HV lead to traffic oscillation among the following HVs. c The general mixed platoon (I) and the “CV-HVs-CV” sub-platoon (II). The general mixed platoon (I) can be decomposed into sub-platoons (II), which are shown in the dash-dotted blocks.
Fig. 2. The proposed knowledge-guided self-learning mixed platoon control framework.
CV = connected vehicle. HV = human-driven vehicle. V2X = vehicle-to-everything. Here, CV 0 is designated as the leading vehicle.
Fig. 3. The velocity, acceleration and position curves of a mixed platoon under different control strategies.
CV = connected vehicle. HV = human-driven vehicle. a A schematic diagram showing the mixed platoon composed of seven vehicles. b Simulated results using CVDS-IDM strategy. c Simulated results using DDPG. d Simulated results using AC. e Simulated results using PPO. f Simulated results using the proposed SAC strategy. Source data are provided as a Source Data file.
Fig. 4. Performance comparison among AC, PPO, CVDS-IDM, DDPG and the proposed knowledge-guided self-learning SAC strategies.
PPO = proximal policy optimization. DDPG = deep deterministic policy gradient. AC = actor-critic. CVDS-IDM = connected vehicle driving strategy integrated with an intelligent driver model. SAC = soft actor-critic. Source data are provided as a Source Data file.
Fig. 5. The superiority percentage of the proposed strategy compared with existing methods for CVs in a mixed platoon.
a Following CV 2. b Following CV 6. PPO = proximal policy optimization. DDPG = deep deterministic policy gradient. AC = actor-critic. CVDS-IDM = connected vehicle driving strategy integrated with an intelligent driver model. SAC = soft actor-critic. Source data are provided as a Source Data file.
Fig. 6. The velocity, acceleration and spacing curves of CVs in a mixed platoon under different communication delays.
a–c Following CV 2. d–f Following CV 6. Source data are provided as a Source Data file.
Fig. 7. The velocity and acceleration heat map based on the position trajectories for the mixed platoon with different penetration rates.
Source data are provided as a Source Data file.
Fig. 8. Average dampening ratio, driver comfort cost, energy consumption cost and flow stability of six vehicles for random delays.
Center blue lines of the boxes represent the median. Vehicle 1 is guided by the NGSIM data, while vehicles 2 and 6 are connected autonomous vehicles controlled by the proposed strategy. The remaining vehicles are traditional vehicles. Source data are provided as a Source Data file.
Fig. 9. Averaged performance indicators for platoons with different numbers of vehicles.
The oscillation dampening, driving comfort cost, energy consumption and flow stability of the mixed platoon under the proposed knowledge-guided self-learning SAC algorithm are shown as orange curves, while those of the traditional platoon are shown as blue curves. HV = human-driven vehicle. Source data are provided as a Source Data file.
Fig. 10. Position profiles of a mixed platoon in the presence of lane changing behavior of an HV (in red) under CVDS-IDM, DDPG, PPO and the proposed SAC strategy.
CV = connected vehicle. HV = human-driven vehicle. a–c CVDS-IDM. d–f DDPG. g–i PPO. j–l The proposed knowledge-guided self-learning SAC strategy. Source data are provided as a Source Data file.
Fig. 11. The algorithm procedure of the knowledge-guided self-learning mixed platoon control framework.
CV = connected vehicle. HV = human-driven vehicle. SAC = soft actor-critic.
