R.Y. Qian, X. Wang
Traffic safety has become a major concern in recent years, and predicting the severity of traffic accidents is an important part of it. Because traffic accidents are influenced by random factors, their occurrence is uncertain and non-linear. However, most existing models are single machine learning (ML) models, which are limited in accuracy and generalization. This study proposes a traffic accident severity prediction model that combines XGBoost (eXtreme Gradient Boosting) with a Backpropagation Neural Network (BPNN). First, feature selection is performed using the XGBoost model. Second, the selected features are used as the inputs to the BPNN. In addition, because traffic accident data are class-imbalanced, a cost-sensitive algorithm is used to minimize the total cost. Finally, precision, recall and area under the curve (AUC) are used to evaluate the prediction results of the model. The model is applied to the 2005-2014 UK traffic accident dataset and compared with other machine learning models. Experiments show that (1) the XGBoost-BPNN model outperforms the single XGBoost, logistic regression (LR), and support vector machine (SVM) models in terms of AUC, recall, and precision; and (2) the number of neurons, the number of hidden layers and the learning rate of the neural network have a large impact on prediction accuracy. Increasing the number of neurons appropriately can improve both the convergence speed and the prediction performance of the model. This study can provide a reference for traffic accident prevention and early warning.
Keywords: neural network; traffic accident risk prediction; imbalanced dataset
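
The evaluation criteria named in the abstract (precision, recall, AUC) and the idea of minimizing a total cost under class imbalance can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The function names, the binary labels (1 = severe, 0 = not severe) and the cost values are assumptions made for illustration. Choosing a decision threshold that minimizes total cost is substituted here as one simple form of cost-sensitivity; the study may instead apply costs during training.

```python
from typing import List, Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a sample positive when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               cost_fn: float, cost_fp: float) -> float:
    """Total misclassification cost; cost_fn and cost_fp are assumed values."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return cost_fn * fn + cost_fp * fp


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   cost_fn: float, cost_fp: float) -> float:
    """Pick the observed score that minimizes total cost when used as threshold."""
    candidates = sorted(set(scores))
    return min(candidates,
               key=lambda th: total_cost(y_true, predict(scores, th), cost_fn, cost_fp))
```

Setting cost_fn larger than cost_fp penalizes a missed severe accident more than a false alarm, which pushes the chosen threshold down and favors recall on the minority class.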