G. Singh, S.N. Sachdeva, M. Pal
The present study models the injury severity of accidents on non-urban sections of highways in India using a multinomial logit model and two non-parametric models, namely a decision tree and a random forest. The dataset consists of 2664 accidents on seven highways in the state of Haryana, India, with four severity levels and twenty-five explanatory variables. Because the crashes in the available dataset were imbalanced across injury levels, two class-balancing approaches were used: synthetic minority oversampling (SMOTE) and randomised class balancing. The results suggest a significant improvement in predicting injury severity after class balancing. A comparison of the approaches used suggests that the random forest classifier performs best at classifying accident severity levels. A further advantage of the random forest classifier is that it provides a ranked list of all input variables, which can be used to identify the variables most important for classifying accident data into the different severity levels. The results indicate that, in order to reduce fatal and severe accidents, more attention needs to be given to local traffic and vulnerable road users entering and crossing the highways in hazardous situations. The contribution of side-swipe crashes and crashes with parked vehicles to severe outcomes was found to be another area of concern, requiring enforcement of lane discipline on highways and construction of lay-by bays as traffic requires.
Keywords: random forest; decision tree; multinomial logit model; accident severity
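
As an illustration of the oversampling idea behind SMOTE mentioned in the abstract, the following is a minimal sketch in plain Python, not the implementation used in the study. It assumes purely numeric features and Euclidean distance; the function name `smote`, the parameters `n_new`, `k` and `seed`, and the toy `fatal` and `minor` samples are all hypothetical. Each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric tuples (assumed numeric features).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: a rare class (4 records) and a common one (20 records).
fatal = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
minor = [(float(i), float(i % 7)) for i in range(20)]

# Oversample the rare class until both classes have the same size.
new_points = smote(fatal, n_new=len(minor) - len(fatal), k=3)
balanced_fatal = fatal + new_points
```

Because every synthetic point is a convex combination of two existing minority samples, the new records stay inside the region already occupied by the rare class. Accident data with categorical explanatory variables would need a variant that handles such variables, which this sketch does not attempt.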