A. Nickkar, A. Yazdizadeh, Y.-J. Lee
The cost of crashes in the U.S. approaches $1 trillion a year. High-resolution Highway Safety Information System (HSIS) data allow researchers to analyze in depth the factors that contribute to crashes and to design appropriate interventions. This study has two main goals: first, to identify possible relationships between contributing factors and crash severity on urban expressways and freeways; and second, to improve prediction accuracy by using a machine learning approach to classify crash severity and to evaluate the performance of the classifier. We used HSIS crash data for urban expressways and freeways in the state of Washington from 2005 to 2015. A random forest model was used to predict crash severity from the crash attributes. The model predicted crash severity with 88.6% accuracy, with precision of 89.9%, 62.1%, and 40.1% for severity classes 1-3, respectively. Crash type, functional class of road, annual average daily traffic (AADT), and location type were more important than the other variables in predicting crash severity, whereas lighting conditions, weather, year of the crash, and road characteristics had little effect on crash severity.
Keywords: data mining; machine learning; crash severity; crash analysis
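The abstract reports both an overall accuracy and a separate precision for each severity class. As a minimal sketch of how these two kinds of metric differ, the following Python example computes them for a small set of labels. The labels are invented purely for illustration and are not HSIS data; the helper names `accuracy` and `per_class_precision` are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_precision(y_true, y_pred):
    """For each predicted class c: correct predictions of c / all predictions of c."""
    predicted = Counter(y_pred)
    true_positive = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {c: true_positive[c] / predicted[c] for c in sorted(predicted)}


# Toy labels (assumption: made up for this sketch, not the study's data).
# Class 1 is the majority class; classes 2 and 3 are rarer.
y_true = [1] * 14 + [2] * 4 + [3] * 2
y_pred = [1] * 13 + [2] + [2, 2, 1, 3] + [3, 1]

acc = accuracy(y_true, y_pred)
prec = per_class_precision(y_true, y_pred)

print(f"accuracy: {acc:.3f}")
for severity_class, value in prec.items():
    print(f"precision for class {severity_class}: {value:.3f}")
```

In this toy case, overall accuracy is dominated by the majority class, while precision for the rarer classes is lower. An imbalanced class distribution is one way such a gap can arise, which is why per-class figures are worth reading alongside a single accuracy number.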