W. Zhang, L. Xiao, Y. Wang, K.B. Kelarestaghi
Traditional crash prediction models use roadway geometric features, traffic control types, and annual average daily traffic volumes as inputs to predict the annual crashes at a roadway site. Developing such models requires careful sampling of crash sites from different locations and advanced statistical techniques; using them requires knowledge of the site and local calibration. This paper introduces a big data approach, based on predictive analytics, to predicting the annual crashes of a roadway facility. Predictive analytics forecasts future events by analysing rich historical data and detecting the underlying patterns and trends. A tool was developed that uses a comprehensive multi-year statewide crash history dataset as its backend data source. The tool can predict the annual crashes around a user-selected location anywhere in the state. It was developed on the rationale that a multi-year statewide crash dataset covers all the locations where any type of traffic crash could happen on a regular basis, and that the causal factors of all documented crashes, together with the other information encapsulated in such a large dataset, should be sufficient to predict annual crashes anywhere on the state’s roadway network. The method requires ready access to the statewide crash dataset but no prior knowledge of the roadway facility of interest. An auto-searching algorithm was developed to perform dynamic sampling around the user-selected location, followed by data processing to detect patterns and trends. The method is area-based; however, by properly adjusting the search criteria, the result can converge to an intersection or a roadway segment. It inherently considers the influence of nearby facilities.
Keywords: big data; crash prediction; predictive analytics
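The area-based search described in the abstract can be illustrated with a minimal sketch. It is not the tool's actual algorithm: the expanding-radius rule, the multi-year mean used as the prediction, the (latitude, longitude, year) record layout, and all function and parameter names (`predict_annual_crashes`, `start_radius_m`, `max_radius_m`, `min_samples`) are assumptions made for illustration only.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def crashes_per_year(crashes, lat, lon, radius_m):
    """Count crashes per year within radius_m of the selected location.

    `crashes` is an iterable of (lat, lon, year) tuples (assumed layout).
    """
    counts = Counter()
    for c_lat, c_lon, year in crashes:
        if haversine_m(lat, lon, c_lat, c_lon) <= radius_m:
            counts[year] += 1
    return counts


def predict_annual_crashes(crashes, lat, lon, years,
                           start_radius_m=50.0, max_radius_m=800.0, min_samples=5):
    """Hypothetical dynamic sampling: double the search radius until enough
    crashes are found or the cap is reached, then return the mean crashes
    per year together with the radius that was used."""
    radius = start_radius_m
    while True:
        counts = crashes_per_year(crashes, lat, lon, radius)
        total = sum(counts[y] for y in years)  # Counter yields 0 for missing years
        if total >= min_samples or radius >= max_radius_m:
            break
        radius = min(radius * 2, max_radius_m)
    return total / len(years), radius
```

In this sketch, a small starting radius keeps the sample close to a single intersection or segment, while the automatic expansion draws in crashes from nearby facilities when local data are sparse.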