
Exploring and Modelling Team Performances of the Kaggle European Soccer Database

Academic Article
Publication Date:
2019
Abstract:
This study explores a large, open database of soccer leagues in 10 European countries. Data on players, teams and matches covering 7 seasons (from 2009/2010 to 2015/2016) were retrieved from Kaggle, an online platform that hosts big data for predictive modelling and analytics competitions among data scientists. Based on preliminary data analysis, experts' evaluation and players' positions on the football pitch, role-based indicators of team performance were built and used to estimate the win probability of the home team with the Binomial Logistic Regression (BLR) Model, which was extended with the ELO rating predictor and with two random effects to account for the hierarchical structure of the dataset. The predictive power of the BLR Model and its extensions was compared with that of other statistical modelling approaches (Random Forest, Neural Network, k-NN, Naive Bayes). Results showed that role-based indicators substantially improved the performance of all the models used both in this work and in previous works available on Kaggle. The base BLR Model increased prediction accuracy by 10 percentage points and showed the importance of defensive performance, especially in the later seasons. Including the ELO rating predictor and the random effects did not substantially improve prediction, as the simpler BLR Model performed equally well. Among the other models, only Naive Bayes showed more balanced results in predicting both win and no-win of the home team.
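As a rough illustration of the two building blocks named in the abstract, the sketch below computes a home-win probability from a binomial logistic model and the standard Elo expected score. It is not the authors' implementation: the function names, the feature values, the coefficients and the intercept are all hypothetical, and the abstract does not state how the ELO rating was computed or entered into the model.

```python
import math


def elo_expected_home(elo_home: float, elo_away: float) -> float:
    """Standard Elo expected score for the home team (400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((elo_away - elo_home) / 400.0))


def blr_home_win_probability(features, coefficients, intercept):
    """Binomial logistic regression: P(home win) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical role-based indicators (home minus away), e.g. defence, midfield, attack.
# These numbers are invented for illustration and do not come from the paper.
example_features = [0.4, -0.1, 0.2]
example_coefficients = [0.8, 0.5, 0.6]
example_intercept = -0.2

p_home_win = blr_home_win_probability(
    example_features, example_coefficients, example_intercept
)
p_elo = elo_expected_home(1600.0, 1500.0)

print(f"BLR home-win probability (hypothetical inputs): {p_home_win:.3f}")
print(f"Elo expected score for the home team: {p_elo:.3f}")
```

In a fitted model the coefficients would be estimated from match data rather than set by hand, and the random effects mentioned in the abstract would need a mixed-effects extension, which this sketch leaves out.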
CRIS type:
1.1 Journal article
Keywords:
Kaggle European Soccer Database, Binomial Logistic Regression Model, player performance indicators, prediction of match results.
List of contributors:
Carpita, Maurizio; Ciavolino, Enrico; Pasca, Paola
Authors of the University:
CARPITA Maurizio
Handle:
https://iris.unibs.it/handle/11379/510484
Published in:
STATISTICAL MODELLING
Journal