Heliyon. 2024 Apr 29;10(9):e30117.
doi: 10.1016/j.heliyon.2024.e30117. eCollection 2024 May 15.

Crash severity analysis: A data-enhanced double layer stacking model using semantic understanding

Di Yang et al. Heliyon.

Abstract

Crash severity analysis is important for traffic crash prevention and emergency resource allocation. Many crash severity prediction models have been proposed to improve road safety. However, the semantic information inherent in traffic crash data, which is crucial for a deeper understanding of the underlying factors and impacts of crashes, has yet to be fully utilized. Moreover, traffic crash data are commonly characterized by a small sample size, which leads to a sample imbalance problem and, in turn, a decline in prediction performance. To tackle these problems, we propose a semantic understanding-based, data-enhanced double-layer stacking model, named EnLKtreeGBDT, for crash severity prediction. Specifically, to fully leverage the semantic information in traffic crash data and analyze the factors influencing crashes, we design a semantic enhancement module for multi-dimensional feature extraction. This module aims to deepen the understanding of crash semantics and improve prediction accuracy. We then introduce a data enhancement module that uses data denoising and migration techniques to address data imbalance, reducing the prediction model's dependence on large-sample crash data. Furthermore, we construct a two-layer stacking model that combines multiple linear and nonlinear classifiers. This model is designed to strengthen the learning of mixed linear and nonlinear relationships, thereby improving the accuracy of crash severity prediction on complex urban roads. Experiments on historical UK road safety crash datasets validate the effectiveness of the proposed model, which achieves higher prediction precision than state-of-the-art methods. Ablation experiments on the semantic and data enhancement modules further confirm that each module is indispensable to the proposed model.

Keywords: Crash severity analysis; Data enhancement; Semantic understanding; Stacking model; Urban traffic crashes.
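
The two-layer stacking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' EnLKtreeGBDT implementation: the base learners (a logistic model as the linear learner and k-nearest neighbours as the nonlinear one), the logistic meta-learner, the synthetic two-class data, and all names such as `fit_stacking` and `make_toy_data` are assumptions chosen for illustration. The first layer produces out-of-fold probabilities, and the second layer learns how to combine them.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LogisticModel:
    """Linear learner trained by batch gradient descent (also used as meta-learner)."""

    def __init__(self, lr=0.5, epochs=300):
        self.lr, self.epochs = lr, epochs

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.w, self.b = [0.0] * d, 0.0
        for _ in range(self.epochs):
            errs = [p - t for p, t in zip(self.predict_proba(X), y)]
            self.b -= self.lr * sum(errs) / n
            self.w = [w - self.lr * sum(e * x[j] for e, x in zip(errs, X)) / n
                      for j, w in enumerate(self.w)]
        return self

    def predict_proba(self, X):
        return [sigmoid(self.b + sum(w * v for w, v in zip(self.w, x))) for x in X]


class KNN:
    """Nonlinear learner: share of positive labels among the k nearest neighbours."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict_proba(self, X):
        out = []
        for x in X:
            order = sorted(range(len(self.X)), key=lambda i: math.dist(x, self.X[i]))
            nearest = order[:self.k]
            out.append(sum(self.y[i] for i in nearest) / len(nearest))
        return out


def make_base_learners():
    # Hypothetical first layer: one linear and one nonlinear classifier.
    return [LogisticModel(), KNN(k=5)]


def fit_stacking(X, y, n_folds=5):
    n = len(X)
    meta_X = [[0.0] * len(make_base_learners()) for _ in range(n)]
    # Layer 1: out-of-fold predictions, so the meta-learner never sees
    # a probability produced by a model trained on that same sample.
    for f in range(n_folds):
        held_out = list(range(f, n, n_folds))
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, model in enumerate(make_base_learners()):
            probs = model.fit(X_tr, y_tr).predict_proba([X[i] for i in held_out])
            for i, p in zip(held_out, probs):
                meta_X[i][j] = p
    # Refit the base learners on all data; layer 2 combines their outputs.
    base = [m.fit(X, y) for m in make_base_learners()]
    meta = LogisticModel().fit(meta_X, y)
    return base, meta


def predict_stacking(base, meta, X):
    meta_X = list(zip(*(m.predict_proba(X) for m in base)))
    return [1 if p >= 0.5 else 0 for p in meta.predict_proba(meta_X)]


def make_toy_data(n, seed):
    # Synthetic stand-in for crash records: two noisy clusters, labels 0 and 1.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 1.0 if label else -1.0
        X.append([rng.gauss(centre, 0.7), rng.gauss(centre, 0.7)])
        y.append(label)
    return X, y


X_train, y_train = make_toy_data(200, seed=0)
X_test, y_test = make_toy_data(100, seed=1)
base, meta = fit_stacking(X_train, y_train)
pred = predict_stacking(base, meta, X_test)
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The toy data are balanced and binary, unlike the imbalanced crash data the abstract describes, so the sketch shows only the stacking mechanics and not the semantic or data enhancement modules.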

Conflict of interest statement

The authors declare no potential conflicts of interest with respect to the research.

Figures

Figure 1. EnLKtreeGBDT for traffic crash severity prediction.
Figure 2. Three types of traffic crash features for feature derivation.
Figure 3. Temporal features derivation.
Figure 4. Feature correlation analysis.
Figure 5. Feature contribution diagram.
Figure 6. Data processing based on OSS.
Figure 7. OSS processing results.
Figure 8. LKtreeGBDT.
Algorithm 1. Stacking algorithm.
Figure 9. Comparison of model results.
Figure 10. Contributions of each factor.
Figure 11. Comparisons of ablation experimental results where DE is the data enhancement module and SE represents the semantic enhancement module.
