10 - Ansu Fati

KingLeo10

Senior Member
Cup games are affected by unique factors like pressure and different tactical setups, but the main one is that the quality of players is very different from the usual.

If xG is built on average scoring rates, it stands to reason that much better players are going to overperform it very regularly. This becomes even more skewed when it's world class vs world class.
xG numbers weren't modelled off such high-level games, they were modelled off average scoring rates, so it will be harder to apply them at a much higher level than the one they were built on.

They are a poor model of high leverage games because they are a shit predictor of which team (even among teams that both have high quality players) will wilt under pressure, which players (among those with high quality) will rise to the occasion and have a moment of genius, which manager will hold their bottle and not lose the plot etc. etc.

It's not that they weren't trained on those high leverage games...it's that even if they were trained on them, they would have a predictive accuracy that's not substantially better than a coin toss. And before we go the sample size route, yes, it's lower than the plethora of low leverage games, but there's enough data across say the last few CL editions, last couple of Euro/WC editions to build a predictive model...if only the right parameters could be calibrated (which, as of now, they can't).

I'm not really interested in what xG has to say about Barca v bottom 7-20 of LL. I already know over the course of the season we'll beat the shit out of those teams. There'll be an aberration here or there, but across 28 games for those 14 teams...my common sense can predict about as well as xG.

Now, when it starts predicting El Clasicos or CL/WC/Euro KOs with accuracy, you have my interest.
 

ajnotkeith

Senior Member
Well I would guess in a general sense it is still accurate. The winning team in a final will usually have the most xG.
 

KingLeo10

Senior Member
Do we really need to revisit some of RM's CL runs? :lol:
Or us v Pool/Roma etc. etc.?
Or a handful of Clasicos?
Or City's meltdowns in CL for 5-6 years before they finally began winning?
Or PSG on the yearly?
Or Bayern in many years?
 

ajnotkeith

Senior Member
Something that can be done is to relativise the figures team by team instead of just looking at the raw number.

Different teams have different xG sweet spots. The numbers mean different things for different teams. For example, Madrid usually prefer games where both sides rack up high xG, because they usually have incredible attackers who will win out in those games.

I think it's useful, it just needs context like every other stat. And it probably is mostly accurate in terms of winner and loser in knockouts (except some black magic wins of Madrid).
The important thing to remember though, is that it's not necessarily a predictor of winner or loser, and you don't have to use it like that. The only thing it objectively states is what the quality of the chances created were. Some people like it as a predictive model of results but it isn't absolutely that in every case.
 

KingLeo10

Senior Member
2021 CL final: xG Manchester City 0.57-1.25 Chelsea (actual score MCI 0-1 CHE)

2022 CL final: xG Liverpool 1.98-0.85 Real Madrid (actual score LIV 0-1 RMA)

2023 CL final: xG Manchester City 0.66-1.19 Inter Milan (actual score MCI 1-0 INT)

So, it called 1 out of 3 (33% accuracy; yes, it's a small sample, just an illustrative point using recent data), and when it was wrong, it wasn't just slightly wrong: in one instance it was off by more than a goal. Lol.

@ajnotkeith It's a poor metric (for now) for games which matter
 

ajnotkeith

Senior Member
Like I said, the only thing it objectively tells you is the quality of the chances created. After that you analyse as you want. It doesn't have to be a predictive model.
The best predictor is the strength of the squad.

However, it is probably more right than wrong across all the CL knockouts and finals and stuff. I will do a post chronicling it later if I get some time.
 

KingLeo10

Senior Member
We agree 100% on this, but I've seen some very funny assessments of managerial quality, team quality, fairness of trophies won, fairness of results, the actual results etc. etc. extrapolated solely from this metric. That's when it started being used as a predictor in discussions.

And yes, I suspect you're right that it does better than 50% accuracy across all KO rounds. But I would, and you probably do too, expect a big difference in accuracy for KOs compared to low leverage games. I guess this delta is where I'm interested in this metric improving.

It's just really hard to train models that have any sort of good predictive interpretation for high leverage scenarios...for example if you take a bunch of CL KO games where Messi and CR7 were dominant, or when Barca, RM, Bayern etc. had super teams, the model will be overfitted and specific to just that type of data. Training on this set and using it to predict a City v Inter final or a Pool v Spurs final won't do well.
 

ajnotkeith

Senior Member
League games will be much more accurate because they're closer to the situations it was modelled on. Most top league games are played with average or above-average players, without the very inconsistent confounding factors that cup games have.

xG as a predictive model for finals is not that useful. They're decided by so much more than that.
 

JohnN

Senior Member
you know when a scientific tool is considered obsolete? when it is wrong once. when it can't make predictions.
statistics is not a science.
xG is a model trying to depict something real. and it fails. miserably.
just use at your own risk to make any assumptions.
 

Birdy

Senior Member
xG was not devised to make predictions. You start from the wrong premise
xG was devised to evaluate the value of chances created by a team, as Ajno explained.
It's primarily an evaluative tool used internally by a coach and his staff in order to measure how well they attack/defend and find ways to improve that.
So, when a game is done you look at the xG scoreline in order to evaluate chance creation, not in order to see if the values assigned came out correctly. That is a totally dumb thing to do.

Now, there are sites like FiveThirtyEight that use xG data (but NOT only that; there is a bunch of sophisticated stats like non-shot xG, xT, and others) to measure the relative strength of a team, and then build predictive models on that, primarily for leagues and not cups.
 

KingLeo10

Senior Member
Then maybe you should take your own advice and stop crying about unfair wins based on xG :lol:
 

Birdy

Senior Member
But it's fair to say that it is unfair when there is a big discrepancy between xG and G scorelines...
We knew that football is an unfair sport before xG became a thing.
The difference is that now we can point to something objective in order to back such claim
 

fergus90

Senior Member
Brighton were rank. Had a 10 minute spell after half time when they looked decent but apart from that, they deserved a thrashing.
 

lucas24

Member
But it's fair to say that it is unfair when there is a big discrepancy between xG and G scorelines...
We knew that football is an unfair sport before xG became a thing.
The difference is that now we can point to something objective in order to back such claim
Exactly. Because so few points (goals) are scored, football is the most random team sport. In a single match, luck plays a big role. Everyone probably remembers Argentina vs Saudi Arabia at the World Cup: the result was 1-2, but the xG was 2.26 vs 0.15. Good statistics only translate into good results over several games. In volleyball or basketball the better team on the court almost always wins, because a lot of points are scored.
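
To put a rough number on that, here is a minimal sketch. It assumes each side's goals follow an independent Poisson distribution with the match xG as its mean. That is a simplification for illustration, not how xG itself is computed, and `poisson_pmf` and `outcome_probs` are just hypothetical helper names. The 1.5 vs 1.0 and 15 vs 10 rates are made up to show the low-scoring vs high-scoring contrast.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(lam_a: float, lam_b: float, max_goals: int = 60):
    """Return (P(A wins), P(draw), P(B wins)) under the independent-Poisson assumption.

    Scorelines above max_goals per side are ignored; for the rates used
    here the probability mass left out is negligible.
    """
    pa = [poisson_pmf(k, lam_a) for k in range(max_goals + 1)]
    pb = [poisson_pmf(k, lam_b) for k in range(max_goals + 1)]
    a_win = draw = b_win = 0.0
    for i, p_i in enumerate(pa):
        for j, p_j in enumerate(pb):
            p = p_i * p_j
            if i > j:
                a_win += p
            elif i == j:
                draw += p
            else:
                b_win += p
    return a_win, draw, b_win


# Argentina vs Saudi Arabia, using the xG figures quoted above (2.26 vs 0.15).
arg, draw, ksa = outcome_probs(2.26, 0.15)
print(f"Argentina win {arg:.3f}, draw {draw:.3f}, Saudi Arabia win {ksa:.3f}")

# Same 3:2 strength ratio, but ten times as much scoring (hypothetical rates).
_, _, upset_low = outcome_probs(1.5, 1.0)
_, _, upset_high = outcome_probs(15.0, 10.0)
print(f"Underdog wins: low-scoring {upset_low:.3f}, high-scoring {upset_high:.3f}")
```

Under those assumptions the Saudi win comes out as rare but not impossible, and the underdog's chances shrink as the amount of scoring goes up, which is the basketball/volleyball point.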

Last CL season, xG matched the result in 61.6% of matches (77 out of 125). I counted xG as correct when a team won with an xG advantage of 0.51 or more, or when the match was drawn and the xG difference was 0.50 or less. Rounding xG to an integer to create a virtual score (to compare with the real one) would be unfair (xG 0.51 vs 1.49 would give 1-1, and 0.51 vs 0.49 would give 1-0). I think this is a good result, considering it's a three-way outcome. No other statistic matches the result better. In general, xG is only counted if the action ends with a shot. It can only be underestimated, never overestimated.
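
For anyone who wants to rerun that check, the rule can be sketched roughly like this. The `Match` tuple and the function names are hypothetical, and the only data included are the three CL finals quoted earlier in the thread; the full 125-match dataset is not reproduced here.

```python
from typing import NamedTuple


class Match(NamedTuple):
    label: str
    team_a: str
    team_b: str
    xg_a: float
    xg_b: float
    goals_a: int
    goals_b: int


def xg_call(m: Match, margin: float = 0.50) -> str:
    """Three-way call from xG: 'a', 'b', or 'draw' if the gap is within the margin."""
    diff = round(m.xg_a - m.xg_b, 2)  # rounding avoids float noise right at the margin
    if diff > margin:
        return "a"
    if diff < -margin:
        return "b"
    return "draw"


def actual_result(m: Match) -> str:
    """Three-way result from the real scoreline."""
    if m.goals_a > m.goals_b:
        return "a"
    if m.goals_a < m.goals_b:
        return "b"
    return "draw"


def hit_rate(matches: list[Match]) -> float:
    """Share of matches where the xG call agrees with the real result."""
    hits = sum(xg_call(m) == actual_result(m) for m in matches)
    return hits / len(matches)


# The three finals quoted earlier in the thread (xG and scores as posted there).
FINALS = [
    Match("2021 final", "Man City", "Chelsea", 0.57, 1.25, 0, 1),
    Match("2022 final", "Liverpool", "Real Madrid", 1.98, 0.85, 0, 1),
    Match("2023 final", "Man City", "Inter", 0.66, 1.19, 1, 0),
]

for m in FINALS:
    print(m.label, "xG call:", xg_call(m), "actual:", actual_result(m))
print(f"hit rate: {hit_rate(FINALS):.3f}")
```

On those three finals the rule gives the same 1 out of 3 that was posted earlier; the 77 out of 125 figure would need the full season's match data.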

The scoreline shapes the game. In 7 out of 10 games the team that scores first wins, 2 out of 10 end in a draw, and only 1 in 10 is won by the team that conceded first. Often the team in the lead hands the initiative to the opponent, which skews the match stats. Finals have their own rules. 3 matches is not a sample. Data is the most important thing in statistics: the more of it, the better. In the last CL final, before the goal was scored, Inter's xG was lower than Haaland's.
 
