Wonderstar Analytics
Player Attributes: Caveats & Details (24/03/2014 08:46)

"Tackling? I don't know what to believe any more. The numbers drive me to madness." - Belizio

This isn't a full blog post, but rather a place to store a bunch of caveats, musings and technical details related to this already-way-too-long manuscript about player attributes. If you haven't read it yet, go there first! Fair warning though - I end up claiming that Shooting is only the fourth most important attribute for a striker. And I do so with a straight face.
Caveats and Details

I've mentioned some caveats as I've gone along, but I think I'm going to keep a list of them down here, along with some more technical details and asides that I think are important for fully understanding the patterns we've seen above. I'm happy to update this too as people suggest things!

First though, a caveat to the caveats: I've considered and factored in all of the issues below, and I still think the analysis is useful (and gives us a lot of unique, objective data that complements the intuition and rumour that we currently rely on). Put it this way - if I thought any of the caveats made the analysis worthless, I wouldn't have posted it at all! Having said that, as always everybody is free to judge the evidence for themselves and believe however much they like - after all, the more accurate you are, the greater your advantage as a manager.
:-)

1. The crowdsourced and regression formulae here are trying to predict one thing, and one thing only - match performance. In general, the performance rating is a pretty good measure of how useful a player was during a match, but it won't be perfect. Any improvements you can make to how performance is measured will change the formulae a bit.

UPDATE: It's become apparent from reading the forums that quite a few people see performance as not just imperfect but virtually useless, with little effect on match outcomes. I have my own reservations about performance, but it so happens that I have a dataset of 456 matches with both the performance ratings and scores for each team. The difference in performance correlates exceptionally strongly with the difference in goals scored (R=0.88). Here it is visually:

Figure 1: Difference in goals scored versus difference in match performance for 456 games

So although performance doesn't perfectly predict match outcome - luck plays a role - it's pretty damn close. The better-performing team in ManagerLeague wins its game around 90% of the time, which is way more predictable than real life. It's also worth noting that performance is more closely tied to goals scored than it is to goals conceded. That's probably why performance is more predictable for attackers and midfielders than it is for defenders (see Figure 2 further down) - defender performance is more dependent on how well their opponents played.

2. Value is not determined by a formula, it is determined by the market. So if a majority of managers judge players using their Q value, rather than these more sophisticated formulae, then your striker with excellent heading might not fetch as high a price as you'd hoped. On the other hand, you should be able to find value by buying players who are likely to outperform their Q.

3. The attributes used in each position will probably change with tactics, formations and position. For example, if you shoot from distance your midfielder shooting might become more important, and conversely your attacker heading might be required more, since you should earn more corners after saves. Alternatively, if you're shooting only when safe, then shooting and perception should be more critical for your strikers to avoid the offside trap and finish, while midfielders might end up using passing more. And of course a winger could use a different balance of attributes than a defensive midfielder.

4. Attributes are not the only factor to explain performance! Amongst other things, fitness, tactics, teamstats, opposition quality, lineup age, hidden attributes and many more factors are involved. Most of these should average out in the regression analysis, giving us pretty stable attribute estimates, but that does not mean they aren't going to have an effect in your matches.

5. Having said that, player attributes are very important for performance. Here's a way to think about it. If you take two random players in a given position, the one with the higher ability (according to the regression formula) has around a 70% chance of performing better over the course of a season.

6. The regression formulae help you predict performance more accurately than the standard Q formulae. This graph shows how much of a player's performance is predicted solely by his attributes (known as R-squared):

Figure 2: Amount of performance explained by standard Q formulae (solid bars) and regression-derived formulae (lighter bars).

Using the regression formulae gives a significantly better prediction of performance than Q alone. Certain positions are less predictable than others, for example defender performance is probably more heavily determined by opponent skill than attacker performance is. Even the simple Q formulae tell you a lot about performance - with these sample sizes, an R-squared of even 1% would be sufficient to demonstrate a real relationship; here R-squared reaches as high as 37%. The rest of the variance is split between everything else - formation, lineup age, minutes played, opposition strength, pressure, fitness, hidden attributes, positioning, playstyle, and - biggest of all - chance. With the exception of chance, it's highly unlikely that any of those factors on its own gets very close to explaining as much of performance as attributes do. Attributes aren't the whole story, but they're almost certainly the biggest part of it.

7. The regression analysis might be affected if an attribute is correlated with something else that affects performance. For example, see the entries above on Goalkeepers and Midfielders for how Age, Passing, Perception, Heading and other stats could be influenced by positioning, roles at set pieces, or gaining rates throughout the season.

8. Some stats have a benefit or cost that isn't easily captured by performance. For example, Stamina helps replenish fitness between games, while Age affects the value of the player, as well as the performance of the players around him.

9. Regression estimates will change with additional data; they shouldn't be considered perfectly accurate. All stats estimated at >4% (i.e. every positive value except stamina for midfielders or defenders) are considered to be reliably greater than zero by the regression. The average error of an attribute's importance is about +/- 4%. Likewise, that means even if an attribute is rated at 0% in the regression, it's still possible that it's actually 2% or something, because of that margin of error.

10. The "Tackling = Ball Control" theory is not perfect, and it could be totally wrong (though how you explain the regression then, I have no idea). For example, we might expect tackling to matter more for midfielders than it does. At some point I will run a more detailed analysis comparing tackling attributes to actual ball control outcomes in matches. That should give a more decisive answer. Much like Friedrich Nietzsche, it also raises the troubling question of whether a cruel and indifferent Spinner might have allowed us to be misled until now.
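For anyone who wants to run this kind of check on their own data, here is a minimal Python sketch of two of the numbers quoted in the caveats above: the correlation R (and R-squared, which for a single predictor is simply R squared), and the chance that the higher-ability player in a random pair performs better. Everything in it is an assumption for illustration: the function names `pearson_r` and `concordance` are made up, and the data is randomly generated rather than taken from ManagerLeague, so the printed values say nothing about the real game.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def concordance(ability, performance):
    """Share of player pairs where the higher-ability player also performed better.

    Pairs that are tied on either measure are skipped.
    """
    wins = 0
    total = 0
    n = len(ability)
    for i in range(n):
        for j in range(i + 1, n):
            d_ability = ability[i] - ability[j]
            d_perf = performance[i] - performance[j]
            if d_ability == 0 or d_perf == 0:
                continue
            total += 1
            if (d_ability > 0) == (d_perf > 0):
                wins += 1
    return wins / total


# Synthetic data only: hypothetical players, not real ManagerLeague numbers.
rng = random.Random(0)
ability = [rng.gauss(0, 1) for _ in range(300)]
# Performance = ability plus a larger dose of "chance" (assumed noise level).
performance = [a + rng.gauss(0, 1.3) for a in ability]

r = pearson_r(ability, performance)
pair_rate = concordance(ability, performance)
print(f"R = {r:.2f}, R-squared = {r * r:.2f}")
print(f"higher-ability player performs better in {pair_rate:.0%} of pairs")
```

To try it on real numbers, replace the two synthetic lists with one ability score and one average performance rating per player. With several attributes in a regression, R-squared would instead be the squared correlation between predicted and actual performance.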
- Belizio
This blogger owns the team The Wonderstars. (TEAM:154471)