Following last week’s submission deadline for the second stage of the 2024 Opta Forum research competition, four projects have been nominated for showcasing on stage, with a further five recommended for poster presentations.
Each proposal submitted was judged based on four key criteria: innovation, methodology, relevance and application. Now the nine strongest projects will be showcased to an invited audience of industry delegates in London.
New for the 2024 Opta Forum is an additional stage, where each of the five poster exhibitors will have the opportunity to deliver a 10-minute spotlight talk. These talks will take place during scheduled networking breaks, enabling delegates to find out more about the projects and their key findings.
The Opta Forum, which is being staged for the eleventh consecutive year, remains a key date in the football analytics calendar. The event has seen a number of previous presenters go on to work in various roles across the professional football industry.
In addition to these research presentations, the 2024 Opta Forum will feature a packed agenda of guest talks and panel discussions, which will focus on key themes around the role of data in enhancing performance and recruitment in a team setting, as well as audience engagement from a fan-facing perspective.
The full line-up of research presentations and posters for the 2024 Opta Forum, listed in no specific order, is as follows:
Stage Presentations
Harsh Mishra – A Causal Analysis of Corner Kicks
Chosen in the open submission category, this project builds on Laurie Shaw’s previous research on establishing a playbook for corner kicks, viewed through the lens of a causal framework.
Using a combination of event and tracking data, Harsh’s project breaks down over 600 different corner kicks, attempted during the 2021-22 Ligue 1 season, based on the delivery type, defensive setup (man marking or zonal) and the overall outcome. Recurring attacking setups at corners are also clustered, taking into account player start and end locations, dynamic movement and opposition marking.
Harsh’s causal framework is then applied to establish the probability of a shot occurring for each different attacking setup, both when playing against specific defensive setups and when targeting specific types of corner delivery.
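As a rough illustration of the kind of table such an analysis starts from, the sketch below computes the raw shot rate for each pairing of attacking and defensive setup. The field names (`attack`, `defence`, `shot`) and the records are invented for illustration, and these are plain conditional frequencies: a causal framework would go further and adjust for confounding factors, which this sketch does not attempt.

```python
from collections import defaultdict


def shot_rates(corners):
    """Return {(attacking_setup, defensive_setup): (shot_rate, n_corners)}."""
    counts = defaultdict(lambda: [0, 0])  # [shots, corners] per pairing
    for corner in corners:
        key = (corner["attack"], corner["defence"])
        counts[key][0] += int(corner["shot"])
        counts[key][1] += 1
    return {key: (shots / n, n) for key, (shots, n) in counts.items()}


# Invented toy records, not data from the project.
corners = [
    {"attack": "near-post overload", "defence": "zonal", "shot": True},
    {"attack": "near-post overload", "defence": "zonal", "shot": False},
    {"attack": "near-post overload", "defence": "man", "shot": False},
    {"attack": "far-post stack", "defence": "man", "shot": True},
]

rates = shot_rates(corners)
for (attack, defence), (rate, n) in sorted(rates.items()):
    print(f"{attack} vs {defence}: {rate:.2f} shot rate over {n} corner(s)")
```

Keeping the sample size alongside each rate matters in practice, since roughly 600 corners split across many setup pairings will leave some cells with very few observations.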
The aim of the project is to help set piece analysts quickly identify specific types of corner, making their workflows more efficient and maximising the time available for analysing plays at set pieces in both an attacking and defending context.
Originally from India and now based in the UK, Harsh possesses an MSc in Computer Science and is currently working as a Machine Learning Engineer at Rothamsted Research. He is an active member of the global football analytics community, where he maintains a blog and can be reached on LinkedIn.
Alex Sattari – Identifying Player Styles through the Analysis of Passes, Receptions, and High Intensity Runs for Optimised Recruitment Profiling
Chosen in the Opta Forum’s practitioner-led submission category, set by Mathieu Lacome of Parma, this project applies Opta Vision data to create new interpretable characterisations of individual player styles.
After establishing dedicated passing, receiving and high intensity run networks, covering every player in a specific competition, Alex proposes applying Non-negative Matrix Factorization (NMF), with the aim of uncovering common passing, receiving, and high intensity running patterns amongst all players. The NMF analysis will also be used to identify the unique combination of these common patterns for each player, leading to a new, interpretable characterisation of each player’s style.
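For readers unfamiliar with NMF, the sketch below shows the basic idea on a toy scale. A non-negative matrix `V` (assumed here to hold one row per player and one column per pattern feature, such as pass counts into a pitch zone) is approximated as the product of `W` (how much each player uses each common pattern) and `H` (what each common pattern looks like). It uses the standard multiplicative-update rule for squared error; the data, the function names and the choice of two patterns are all assumptions for illustration, not details of Alex’s implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between v and the product w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factorise v (n x m) into non-negative w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Invented counts: four players (rows) by four pitch zones (columns).
V = [
    [8, 7, 0, 1],
    [9, 6, 1, 0],
    [0, 1, 7, 9],
    [1, 0, 8, 8],
]
W, H = nmf(V, rank=2)
print("reconstruction error:", round(frobenius_error(V, W, H), 3))
```

Because every entry of `W` and `H` stays non-negative, each player is described as an additive mix of patterns, which is what makes the resulting profiles comparatively easy to interpret.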
The aim of this research is to enhance a recruitment department’s existing profiling pipeline, streamlining the process for analysts to pinpoint players with stylistic similarities and on-field characteristics sought after for player profiles within a club’s game model.
Alex possesses a PhD in Geophysics from the University of Calgary in Canada and more recently has worked as a Postdoctoral Research Associate in Sports Analytics at Jönköping University in Sweden. He is currently based in Calgary.
Niamh Graham, Yasmin Hengster and Maia Trower – Net Gains: Analysing the Impact of Diversity on Performance in Women’s Football
Using core player biographical data and Opta event data from the FA Women’s Super League, Niamh, Yasmin and Maia’s project addresses the question of whether there is a link between diversity and cohesion across WSL teams and individual player performance.
They will analyse the relationship between how diverse a squad is, in terms of player age, nationality and club background, and how cohesive a team is, as measured by the time its players have previously spent playing together.
Then, by building passing networks for each chain of possession in a WSL match, they will look to assign a diversity score and a cohesion score to each possession. They will then use an xG value for every possession chain that ends in a shot to evaluate the impact of diversity and cohesion on both team and individual performance.
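The article does not spell out how the two scores are defined, so the sketch below is only one plausible reading. It assumes cohesion is the average number of minutes each pair of players involved in a possession has previously shared on the pitch, and diversity is the share of those pairs with different nationalities (age and club background are left out for brevity). All names and figures are hypothetical.

```python
from itertools import combinations


def cohesion_score(players, shared_minutes):
    """Mean minutes previously played together, averaged over all player pairs."""
    pairs = list(combinations(sorted(players), 2))
    if not pairs:
        return 0.0
    return sum(shared_minutes.get(pair, 0) for pair in pairs) / len(pairs)


def diversity_score(players, nationality):
    """Share of player pairs in the possession with different nationalities."""
    pairs = list(combinations(sorted(players), 2))
    if not pairs:
        return 0.0
    return sum(nationality[a] != nationality[b] for a, b in pairs) / len(pairs)


# Invented example: three players involved in one possession chain.
possession = ["A", "B", "C"]
shared_minutes = {("A", "B"): 900, ("A", "C"): 300, ("B", "C"): 0}
nationality = {"A": "ENG", "B": "ENG", "C": "NED"}

cohesion = cohesion_score(possession, shared_minutes)
diversity = diversity_score(possession, nationality)
print(cohesion, round(diversity, 3))
```

Scores like these could then be set against the xG of possessions that end in a shot, which is the comparison the project describes.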
Niamh, Yasmin and Maia are all based at the University of Edinburgh, where they are each currently completing PhDs in mathematics.
Matthieu le Gall – Analysing the Impact of Throw-Ins in Modern Football: Possession, Patterns, and Game Context
Matthieu’s presentation applies Opta event data to identify recurring patterns of play following a throw-in, highlighting the teams that are most effective at creating goalscoring opportunities from sequences starting with a throw, across different game scenarios.
Matthieu’s project is broken down into two parts. In the first part, he will break down throw-ins into different categories, from both an attacking and defending perspective, taking into account key factors including possession time, passes made during a sequence and territory gained. This information will be used to create an index of throw-ins for teams across the league, highlighting how teams use the ball from a throw, such as quick restarts, ball retention or, in a defensive context, high pressure to force a turnover.
In the second part, he will analyse passing patterns from throw-ins to gain insight into how teams strategically capitalise on throw-in opportunities to generate goalscoring threat. He will also present an analysis of the probabilities of pass-end location coordinates for pass events following a throw-in.
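A simple way to picture pass-end location probabilities is to divide the pitch into a grid and count the share of passes ending in each cell. The sketch below does exactly that, assuming pitch coordinates scaled from 0 to 100 on both axes; the grid size, function name and sample points are made up, and the project itself may well use a finer or smoother estimate.

```python
def end_location_probabilities(end_points, nx=3, ny=2, size=100.0):
    """Return an ny x nx grid of the share of passes ending in each cell."""
    grid = [[0] * nx for _ in range(ny)]
    for x, y in end_points:
        # Clamp so that a coordinate exactly on the far edge falls in the last cell.
        col = min(int(x / size * nx), nx - 1)
        row = min(int(y / size * ny), ny - 1)
        grid[row][col] += 1
    total = len(end_points)
    if total == 0:
        return grid
    return [[count / total for count in row] for row in grid]


# Invented end locations for passes that followed a throw-in.
end_points = [(10, 10), (20, 80), (50, 40), (90, 90)]
probs = end_location_probabilities(end_points)
for row in probs:
    print(row)
```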
Based in Rennes, France, Matthieu works as a Data Scientist for the French Ministry of Armed Forces. He combines his work with studying for a diploma in Big Data and AI at the Conservatoire National des Arts et Métiers. He is due to complete his diploma later this year and then wishes to join a club or company working in sport as a data scientist.
Poster Exhibitors
Hoyoung Choi – A Graph Neural Network Approach to Evaluate Defensive Formations
This poster will introduce an approach for measuring the performance of various defensive formations adopted by teams across a season.
Through developing and applying a Graph Neural Network (GNN) model, with a specific emphasis on possession retention, Choi will share insights designed to help enhance the understanding of a team’s defensive strategies and the collective influence of all players on the pitch in relation to their effectiveness at keeping the ball.
This GNN approach is designed to offer leading teams a practical and applicable system for real-world analysis of defensive structures.
Choi earned a bachelor’s degree earlier this year and is now pursuing a Master’s at the KAIST Department of Industrial & Systems Engineering in Daejeon, Republic of Korea.
Leo Martins de Sá Freire – Introducing “Threat-to-Goal” – Measuring Collective Efficiency in Converting Offensive Volume into Goal-Scoring Opportunities
In recent years, the application of event data has focused significantly on action valuation metrics such as Possession Value (PV) and Expected Threat (xT). These metrics allow data analysts to measure the quality of each on-ball action during a match. However, the relationship between the values generated by a team through these actions and the danger posed to an opponent through goalscoring chances remains underexplored.
Leo’s poster aims to fill this gap by presenting the findings of an in-depth investigation into the relationship between two advanced metrics, Expected Goals (xG) and Expected Threat (xT), and using the results to showcase a new metric, Threat to Goal, which aims to provide insight into how effective a team is at utilising their on-ball threat to generate high quality shot attempts.
Based in Belo Horizonte, Brazil, Leo combines his studies at Universidade Federal de Minas Gerais, where he is completing an MSc in Computer Science, with a role as a data scientist at Serie A side Atletico Mineiro.
Atom Scott and Taiga Someya – FootballGPT: Counterfactual Evaluation With a Foundation Model for Football
Counterfactual evaluation allows data scientists to explore hypothetical scenarios and understand the potential impact of different tactical decisions in football. This typically involves training a generative model to predict player movements and then comparing actual and simulated movements in specific scenarios.
Atom and Taiga’s poster will present the findings of building FootballGPT, a foundation model that focuses on collective movement in football. They will aim to answer questions such as, “How does a higher press by the defensive line affect vulnerability to long balls?” or “Would the probability of scoring increase if an additional player joined a counterattack?” Furthermore, they will analyse FootballGPT’s capability to generalise to downstream tasks.
By utilising their FootballGPT model as a simulator for selected scenarios, Atom and Taiga aim to empower teams to make strategic evaluations and decisions based on quantitative data. The simulator would enable teams to simulate the effects of different tactical changes in real match scenarios, such as alterations in the defensive line or the number of players involved in counterattacks, as well as making performance analysis processes more efficient through identifying and analysing similar possession sequences across multiple games.
Atom is currently working towards a PhD in Artificial Intelligence at Nagoya University in Japan and is the founder of an emerging start-up focused on sports analysis. Taiga is studying for an MSc in Computational Linguistics at the University of Tokyo and, using his background as a former U15/16 Japan national team prospect, is also exploring research in sports analytics, especially in football.
Andrew Kang, Travis Curson and Scott Powers – Not all features are created equal: Objective-specific Clustering and Cluster Evaluation Motivation
Player role clustering is a well-known problem in sports analytics, often hindered by a common flaw in many clustering methodologies: the equal weighting of all features, regardless of their relevance to the specific analysis objective.
To identify players from different teams who are stylistically similar on the field to enhance recruitment workflows, Andrew, Travis and Scott’s poster will apply Opta Vision data to present a two-step recursive K-means clustering algorithm. This algorithm first clusters role styles based on a player’s on- and off-ball tendencies and then creates sub-clusters based on the specific requirements of a recruitment analyst, such as applying passing-related features when looking to identify playmakers.
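The two-step idea can be sketched in a few lines: cluster every player on general role features first, then re-cluster the members of each role using only the features that matter for the recruitment question at hand. The version below is a minimal illustration with a hand-written K-means, deterministic starting centres and invented feature names (`on_ball`, `off_ball`, `prog_passes`); it is not the authors’ algorithm, and real features would normally be standardised before clustering.

```python
def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic farthest-point start, so the sketch is reproducible.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist2(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centres[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_step_clusters(players, role_features, objective_features, k_roles, k_sub):
    """Step 1: cluster on role features. Step 2: sub-cluster each role."""
    role_labels = kmeans([[p[f] for f in role_features] for p in players], k_roles)
    result = {}
    for role in range(k_roles):
        group = [p for p, lab in zip(players, role_labels) if lab == role]
        if len(group) < k_sub:
            sub_labels = [0] * len(group)  # too few players to split further
        else:
            sub_labels = kmeans([[p[f] for f in objective_features] for p in group], k_sub)
        for p, sub in zip(group, sub_labels):
            result[p["name"]] = (role, sub)
    return result


# Invented players: on- and off-ball tendencies plus one passing feature.
players = [
    {"name": "P1", "on_ball": 0.90, "off_ball": 0.10, "prog_passes": 9},
    {"name": "P2", "on_ball": 0.80, "off_ball": 0.20, "prog_passes": 8},
    {"name": "P3", "on_ball": 0.85, "off_ball": 0.15, "prog_passes": 2},
    {"name": "P4", "on_ball": 0.90, "off_ball": 0.20, "prog_passes": 1},
    {"name": "P5", "on_ball": 0.10, "off_ball": 0.90, "prog_passes": 3},
    {"name": "P6", "on_ball": 0.20, "off_ball": 0.80, "prog_passes": 7},
]
clusters = two_step_clusters(players, ["on_ball", "off_ball"], ["prog_passes"], 2, 2)
print(clusters)
```

In this toy example P1 to P4 share a role but split into two sub-clusters on passing, which is the kind of distinction an analyst looking for playmakers would care about.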
To compare the skill of players in each cluster and assess their contribution to wins, Andrew, Travis and Scott will adapt the Box Plus-Minus (BPM) method used in basketball analytics. This involves using the Regularized Adjusted Plus-Minus (RAPM) to estimate each player’s contribution to their team’s expected goal (xG) differential. By doing so, the analysis will highlight the numeric importance of each skill that different player clusters must excel in, for their team’s success.
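At its core, RAPM is a ridge (regularised least squares) regression: each row records which players were on the pitch during a stint, the target is the xG differential for that stint, and the penalty shrinks ratings towards zero so that players with few minutes are not over-rated. The sketch below covers only that regression step, with invented stints and a simplified 0/1 on-pitch indicator for a single team; the later step of relating ratings to player skills, as Box Plus-Minus does, is not shown.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rapm(stints, xg_diff, n_players, lam=1.0):
    """Ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    gram = [[sum(s[i] * s[j] for s in stints) + (lam if i == j else 0.0)
             for j in range(n_players)] for i in range(n_players)]
    rhs = [sum(s[i] * y for s, y in zip(stints, xg_diff)) for i in range(n_players)]
    return solve(gram, rhs)


# Invented stints: 1 means the player was on the pitch during that stint.
stints = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
]
xg_diff = [0.6, 0.2, -0.1, 0.5]  # team xG minus opponent xG per stint

ratings = rapm(stints, xg_diff, n_players=3)
print([round(r, 4) for r in ratings])
```

The penalty `lam` is a tuning choice; a larger value pulls all ratings closer to zero.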
Andrew is an undergraduate studying Computer Science at Rice University in Houston, Texas, where he also acts as a data analyst for Rice Athletics’ women’s soccer team, whose backroom team includes Travis as an Assistant Coach. They compete in NCAA Division I. Scott joined Rice University’s Department of Sport Management as an Assistant Professor in Sport Analytics in 2023.
Matthew Hilton – Beyond the Scoreboard: A Bayesian Approach to Valuing Football Players’ Actions
To analyse the performance of individual players, we must scrutinise each player’s contribution to scoring and conceding goals. However, as there are many different actions a player can take within a game, which actions are most valuable in that they are more likely to lead to scoring a goal and less likely to lead to conceding a goal?
Matt’s poster introduces a novel Bayesian framework for the evaluation of football players’ actions. The framework is built upon a Bayesian grounded xG model which explicitly quantifies the uncertainty associated with each goal-scoring prediction instead of relying on point estimates. This framework can be extended to analyse all actions performed by players and how they contribute to the probability of scoring and conceding. Crucially, data analysts can assess the variance in players’ actions and their execution.
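To show what quantifying uncertainty rather than relying on a point estimate can look like, the sketch below uses the simplest Bayesian model available: a Beta prior on the scoring probability for one type of shot, updated with observed goals and shots. This is a stand-in chosen for brevity, not Matt’s model, and the prior, function name and counts are assumptions. Two shot types with the same raw conversion rate end up with very different credible intervals because one has ten times more data.

```python
import random


def posterior_xg(goals, shots, prior_a=1.0, prior_b=9.0, draws=20000, seed=0):
    """Posterior mean and 95% credible interval for a scoring probability.

    Uses a Beta(prior_a, prior_b) prior with a binomial likelihood, so the
    posterior is Beta(prior_a + goals, prior_b + misses).
    """
    a = prior_a + goals
    b = prior_b + (shots - goals)
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return a / (a + b), (lower, upper)


# Invented counts: same 15% raw conversion rate, very different sample sizes.
mean_small, interval_small = posterior_xg(goals=3, shots=20)
mean_large, interval_large = posterior_xg(goals=30, shots=200)

for label, mean, (lo, hi) in [("20 shots", mean_small, interval_small),
                              ("200 shots", mean_large, interval_large)]:
    print(f"{label}: mean {mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

The interval is estimated by sampling from the posterior, so its endpoints are approximate, while the posterior mean is exact.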
Possessing an MSc in Data Science, Matt currently works for Barclays in the UK. As part of the bank’s data science team, he works on the financial modelling of its mortgage portfolio.
Stats Perform would like to thank everyone who submitted a proposal and congratulate the nine groups who will be presenting or exhibiting at the 2024 Opta Forum.