Joshua Kimmich's goal against Borussia Dortmund had just a 6% goal probability. AWS reveals how the statistic was produced and what features it wants to add in the future. (Photo by Federico Gambarini/Pool via Getty Images)
Challenged in a tight space, Joshua Kimmich needed just one glance to see that Borussia Dortmund's goalkeeper Roman Bürki was just a step too far out of his goal. The German midfielder then beat the Dortmund keeper with a lob from 17 meters out. It was a fantastic goal that, according to Amazon Web Services (AWS), the official technology provider of the Bundesliga, had just a 6% probability of going in. It was the goal that decided not just the Klassiker but probably also the 2019/20 Bundesliga title.
AWS's senior sports technology specialist Luuk Figdor laughs when he is confronted with that number. "It shows the skills of the player," Figdor said. "Expected goals is not a model that is specific to a player. It shows how difficult for any football player it would be to score from that position."
Expected goals, or xG, is one of the most used statistics in the football world. It is loved by fans, often hotly debated by journalists, and carefully used by those working in the industry, yet stats companies have struggled to develop a generic model to measure it. The saying is that if you look at three different stats providers, you will get three different xG models.
Once again, Figdor smiles when confronted with that statement. "Yeah, it is a great point," Figdor said when asked about the accuracy of DFL’s xG model. Figdor explains that it is all about the amount of data you feed into your model. In DFL’s case, the company used AWS to feed the model 40,000 goals dating back to 2007. "It is a lot of shots, a lot of data that went into the creation of this algorithm." The shots are, however, just a small part of the overall data that went into creating AWS's xG model. "We have at hand also the positional data, the positions of all players every 40 milliseconds and the ball. We then use this data, working with football experts to work on a number of features that they think impact the chance of the ball going in."
That second aspect is essential to understanding DFL’s xG model. As Figdor explains, you can be two meters away from the goal, but scoring would be complicated if there are 11 players between you and the goal line. "That being said, it is still a probability; a 6% chance of scoring a goal can still only result into two outcomes, a goal or not a goal," Figdor said.
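To make the idea of positional features concrete, here is a minimal sketch of the kind of inputs Figdor describes: how far the shooter is from goal, how much of the goal he can see, and how many players stand between him and the goal line. It is not the DFL or AWS model. The coordinate system, the function names, and the player positions are all assumptions made for illustration; only the 17 meters and the 6% figure come from the article.

```python
import math

GOAL_WIDTH_M = 7.32  # standard goal width in meters

# Assumed coordinates: goal center at (0, 0), goal line along x = 0,
# x is the distance from the goal line, y is the lateral offset.
LEFT_POST = (0.0, GOAL_WIDTH_M / 2)
RIGHT_POST = (0.0, -GOAL_WIDTH_M / 2)


def distance_to_goal(shooter):
    """Distance in meters from the shooter to the center of the goal."""
    return math.hypot(shooter[0], shooter[1])


def goal_angle_deg(shooter):
    """Angle in degrees between the two posts as seen by the shooter."""
    x, y = shooter
    to_left = math.atan2(LEFT_POST[1] - y, x)
    to_right = math.atan2(RIGHT_POST[1] - y, x)
    return math.degrees(to_left - to_right)


def _cross(o, a, b):
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def players_in_shot_triangle(shooter, others):
    """Count players inside the triangle formed by the shooter and both posts."""
    count = 0
    for p in others:
        d1 = _cross(shooter, LEFT_POST, p)
        d2 = _cross(LEFT_POST, RIGHT_POST, p)
        d3 = _cross(RIGHT_POST, shooter, p)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        if not (has_neg and has_pos):  # all signs agree: point is inside
            count += 1
    return count


# Hypothetical scene: a shot from 17 meters, straight in front of goal.
shooter = (17.0, 0.0)
others = [(8.0, 0.5), (10.0, 6.0), (1.0, 0.0)]  # made-up positions

distance = distance_to_goal(shooter)
angle = goal_angle_deg(shooter)
blockers = players_in_shot_triangle(shooter, others)

# A 6% chance means roughly 6 goals from 100 comparable shots,
# even though each single shot is either a goal or not.
expected_goals_from_100 = 100 * 0.06

print(distance, round(angle, 1), blockers, expected_goals_from_100)
```

A real model would learn how such features combine from thousands of historical shots; the sketch only shows how they could be derived from player and ball positions.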
Speed! Alphonso Davies is regularly one of the fastest players in the Bundesliga. AWS uses statistics like speed to help bring context to a game and provide storylines for journalists. (Photo by Alexander Hassenstein/Getty Images)
For Figdor and AWS, xG or any other statistic provided during a Bundesliga matchday is supposed to underline a player's skill. But AWS would like to take that statistic even further. The next step would be to create xG for specific players. "In one of our new statistics, you will see something called shot efficiency," Figdor said. "It will highlight that the best strikers in the league are often better in converting chances into goals than the average striker. We are hoping to bring that insight to our fans or the fans of the league."
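One plausible way to read a statistic like shot efficiency is to compare the goals a player actually scored with the sum of the xG values of his shots. The sketch below assumes that definition, which the article does not confirm, and uses invented shot values rather than real Bundesliga data.

```python
def shot_efficiency(goals, shot_xg):
    """Goals scored minus total expected goals (assumed definition)."""
    return goals - sum(shot_xg)


# Hypothetical player: four shots, three of them scored.
example_xg = [0.06, 0.50, 0.30, 0.14]
example_efficiency = shot_efficiency(3, example_xg)

# A positive value means the player scored more than an average
# player would be expected to from the same chances.
print(round(example_efficiency, 2))
```

Under this reading, a striker who keeps scoring low-probability chances would show a consistently positive value over a season.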
The strategic partnership between AWS and the Bundesliga, after all, serves many different purposes. One is to provide clubs with essential data, but another is to enhance the viewing experience and turn the game into a story for millions of fans worldwide. That story can be an improbable goal deciding the German championship like Kimmich's goal against Borussia Dortmund. But the statistics go way beyond xG. They also provide the viewers with speed data, possession data, and much more. Viewers in North America, for example, will be increasingly familiar with the fastest runner statistic as Alphonso Davies is often the fastest player on the pitch. The fastest runner stat is one of the many data points provided by AWS.
In order to provide that data to the fans, journalists, and clubs, AWS uses data from 16 to 20 cameras a game. "They capture the positions of players, the ball 25 times a second," Figdor said. "So if you do some quick math, that is 3.6 million positional data points per match that you have to evaluate to power those statistics." AWS then aggregates that data within 500 milliseconds, calculates the statistics, and sends them to countries around the globe, allowing viewers to see that data in an instant on their television screen.
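The quick math Figdor mentions can be retraced roughly as follows. This is a back-of-the-envelope sketch, not AWS's own calculation: it assumes 90 minutes of play with no stoppage time and, for comparison, 22 players plus the ball. The article does not say exactly which objects are counted.

```python
FRAMES_PER_SECOND = 25      # from the article: 25 captures per second
MATCH_SECONDS = 90 * 60     # assumption: regulation time only
QUOTED_POINTS = 3_600_000   # from the article

frames_per_match = FRAMES_PER_SECOND * MATCH_SECONDS
frame_interval_ms = 1000 / FRAMES_PER_SECOND

# How many tracked positions per frame the quoted total implies.
implied_objects_per_frame = QUOTED_POINTS / frames_per_match

# Assumption for comparison: 22 players plus the ball.
players_and_ball = 23
points_players_and_ball = frames_per_match * players_and_ball

print(frames_per_match)                     # 135000
print(frame_interval_ms)                    # 40.0
print(round(implied_objects_per_frame, 1))  # 26.7
print(points_players_and_ball)              # 3105000
```

The 40-millisecond interval matches the figure Figdor gives for the positional data. Players and ball alone come to about 3.1 million points, so the quoted 3.6 million implies roughly 27 tracked positions per frame, which could reflect stoppage time, match officials, or other tracked objects.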
The amount of data captured also opens the door to a near-endless number of statistics and, in turn, countless possibilities to tell the story of a game. "I think moving forward, we are going to see more and more commentators and also broadcasters using these insights to explain to viewers the story of this game and which players make the difference to make those little moments even more memorable," Figdor said.
Manuel Veth is the editor-in-chief of the Futbolgrad Network and the Area Manager USA at Transfermarkt. He has also been published in the Guardian, Newsweek, Howler, Pro Soccer USA, and several other outlets. Follow him on Twitter: @ManuelVeth