Repositório Digital de Publicações Científicas: Comparison of Statistical and Machine Learning Models on Road Traffic Accident Severity Classification



Repositório Digital de Publicações Científicas da Universidade de Évora


Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/33513

Title:	Comparison of Statistical and Machine Learning Models on Road Traffic Accident Severity Classification
Authors:	Infante, Paulo; Jacinto, Gonçalo; Afonso, Anabela; Rego, Leonor; Nogueira, Vitor; Quaresma, Paulo; Saias, José; Santos, Daniel; Nogueira, Pedro; Silva, Marcelo; Costa, Rosalina; Góis, Patrícia; Manuel, Paulo Rebelo
Keywords:	injury; logistic regression; machine learning; road traffic accidents; severity of victims
Issue Date:	16-May-2022
Citation:	Infante, P., Jacinto, G., Afonso, A., Rego, L., Nogueira, V., Quaresma, P., Saias, J., Santos, D., Nogueira, P., Silva, M., Costa, R. P., Góis, P., Manuel, P. R. (2022). Comparison of Statistical and Machine Learning Models on Road Traffic Accident Severity Classification. Computers, 11, 80. https://doi.org/10.3390/computers11050080
Abstract:	Portugal has the sixth highest road fatality rate among European Union members. This is a problem of different dimensions with serious consequences in people’s lives. This study analyses daily data from police and government authorities on road traffic accidents that occurred between 2016 and 2019 in a district of Portugal. This paper looks for the determinants that contribute to the existence of victims in road traffic accidents, as well as the determinants for fatalities and/or serious injuries in accidents with victims. We use logistic regression models, and the results are compared to the machine-learning model results. For the severity model, where the response variable indicates whether only property damage or casualties resulted in the traffic accident, we used a large sample with a small imbalance. For the serious injuries model, where the response variable indicates whether or not there were victims with serious injuries and/or fatalities in the traffic accident with victims, we used a small sample with very imbalanced data. Empirical analysis supports the conclusion that, with a small sample of imbalanced data, machine-learning models generally do not perform better than statistical models; however, they perform similarly when the sample is large and has a small imbalance.
URI:	http://hdl.handle.net/10174/33513
Type:	article
Appears in Collections:	CIMA - Publicações - Artigos em Revistas Internacionais Com Arbitragem Científica

Files in This Item:

File	Description	Size	Format
computers_11-00080.pdf		247.99 kB	Adobe PDF

Serviços de Ciência e Cooperação - Universidade de Évora