Prediction of Fatal and Major Injury of Drivers, Cyclists, and Pedestrians in Collisions

  • Dalia Shanshal, Data Science Laboratory, Department of Mechanical & Industrial Engineering, Ryerson University
  • Ceni Babaoglu, Data Science Laboratory, Department of Mechanical & Industrial Engineering, Ryerson University
  • Ayşe Başar, Data Science Laboratory, Department of Mechanical & Industrial Engineering, Ryerson University
Keywords: collision, injury severity, prediction, classification, behavioural patterns


Traffic-related deaths and severe injuries can affect anyone on the road, whether driving, cycling or walking. Toronto, the largest city in Canada and the fourth largest in North America, aims to eliminate traffic-related fatalities and serious injuries on city streets. The aim of this study is to build a prediction model, using data analytics and machine learning techniques that learn from past patterns, to provide additional data-driven decision support for strategic planning. A detailed exploratory analysis is presented, investigating the relationships between the variables and the factors affecting collisions in Toronto. A learning-based model is proposed to predict fatalities and severe injuries in traffic collisions through a comparison of two predictive models: Lasso Regression and Random Forest. The exploratory data analysis reveals both spatio-temporal and behavioural patterns, such as the prevalence of collisions at intersections and in the spring and summer, and aggressive driving and inattentive behaviour among drivers. The prediction results show that Random Forest is the better predictor of injury severity for drivers, cyclists and pedestrians, with accuracies of 0.80, 0.89, and 0.80, respectively. The proposed methods demonstrate the effectiveness of applying machine learning to traffic and collision data, for both exploratory and predictive analytics.
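
The kind of two-model comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn is available, uses synthetic data in place of the Toronto collision records, and stands in an L1-penalised logistic regression for "Lasso Regression" as a classifier. The function name `compare_models`, the class balance, and all hyperparameters are hypothetical choices made only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def compare_models(seed: int = 0) -> dict:
    """Fit both classifiers on one train/test split and return test accuracy."""
    # Synthetic stand-in data (assumption): NOT the collision records from the study.
    # The 80/20 class weights are an arbitrary illustration of an imbalanced target.
    X, y = make_classification(
        n_samples=1000,
        n_features=20,
        n_informative=5,
        weights=[0.8, 0.2],
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    models = {
        # L1 (lasso-style) penalty on a logistic model; an assumed reading of
        # "Lasso Regression" for a binary severity label.
        "lasso_logistic": LogisticRegression(
            penalty="l1", solver="liblinear", C=1.0, random_state=seed
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=seed
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in compare_models().items():
        print(f"{name}: accuracy = {acc:.2f}")
```

In a setting like the one the abstract describes, the same comparison would presumably be run separately for each road-user group (drivers, cyclists, pedestrians); the accuracies printed here come from synthetic data and say nothing about the reported results.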



How to Cite
Shanshal D, Babaoglu C, Başar A. Prediction of Fatal and Major Injury of Drivers, Cyclists, and Pedestrians in Collisions. PROMET [Internet]. 2020 Feb 6 [cited 2020 Feb 21];32(1):39-3. Available from: