Explore Defensive Duel Value

Background

Valuing individual actions in soccer is one of the hardest and most important problems in both match analysis and recruitment. Frameworks like xG, xT, and VAEP model offensive contributions well but struggle with defensive actions, whose value lies in preventing events rather than causing them. This work takes a counterfactual approach: it measures how a defender's action shifts danger relative to what was expected given the duel's context.

Concretely, for an observed outcome \(c^\star\) at location \((x, y)\) with speed-of-play context \(\text{SoP}\), the duel's value is \[ \mathrm{DDV} \;=\; \underbrace{\sum_{c}\; P(c\mid x,y,\text{SoP})\,P(\text{goal}\mid c,x,y,\text{SoP})}_{\text{expected danger}} \;-\; \underbrace{P(\text{goal}\mid c^\star,x,y,\text{SoP})}_{\text{realized danger}}, \] where the sum runs over all possible duel outcomes \(c\). Positive DDV means the realized danger was lower than the expected danger: the defender averted more danger than the average duel in that situation would have.
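
As a minimal sketch of the formula above, the function below computes DDV for a single duel from the two sets of probabilities. The function name `duel_value` and all numbers are invented for illustration; in practice the probabilities would come from the fitted models rather than from hand-written dictionaries.

```python
from typing import Mapping


def duel_value(
    p_outcome: Mapping[str, float],
    p_goal_given_outcome: Mapping[str, float],
    observed: str,
) -> float:
    """DDV = expected danger - realized danger, all at one (x, y, SoP)."""
    if abs(sum(p_outcome.values()) - 1.0) > 1e-6:
        raise ValueError("outcome probabilities must sum to 1")
    # Expected danger: goal risk averaged over the outcome distribution.
    expected = sum(p_outcome[c] * p_goal_given_outcome[c] for c in p_outcome)
    # Realized danger: goal risk given the outcome that actually happened.
    realized = p_goal_given_outcome[observed]
    return expected - realized


# Made-up probabilities for one location and speed-of-play context.
p_outcome = {"beat": 0.30, "stopped": 0.45, "recovered": 0.25}
p_goal = {"beat": 0.040, "stopped": 0.010, "recovered": 0.004}

for outcome in p_outcome:
    print(outcome, round(duel_value(p_outcome, p_goal, outcome), 4))
```

With these illustrative inputs the expected danger is 0.0175, so an outcome whose conditional goal risk is below that figure receives a positive DDV and one above it receives a negative DDV.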

Where defensive duels happen

Smoothed marginal density of defensive-duel start locations across the dataset. Bright bands show where contests cluster: the wide channels in the defending team's own half and along the defensive third. This motivates why DDV is most informative where duels actually occur.

Methods

Data from 800k+ defensive duels across 7k+ Wyscout matches were analyzed. For each duel, outcome and spatial features were extracted, and a binary danger label was assigned based on whether a goal was conceded within 20 seconds of the duel. Three speed-of-play features captured pre-duel ball tempo:

net x-displacement of the ball in the prior 30 s — \(\Delta x_{30}\)
total ball distance in the prior 30 s — \(d_{30}\)
ratio of ball distance over 30 vs. 60 s — \(d_{30}/d_{60}\)
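
The danger label and the three features above can be sketched for a single duel as follows. This is an illustration under stated assumptions, not the pipeline used in this work: the ball track is a hypothetical time-sorted list of `(t, x, y)` samples in seconds and pitch units, the function names are invented, and the handling of window boundaries and of a zero 60-second distance is an arbitrary choice.

```python
from math import hypot

Track = list[tuple[float, float, float]]  # (time_s, x, y), sorted by time


def danger_label(t_duel: float, conceded_goal_times: list[float],
                 horizon: float = 20.0) -> int:
    """1 if a goal is conceded within `horizon` seconds after the duel."""
    return int(any(0 < t - t_duel <= horizon for t in conceded_goal_times))


def window(track: Track, t_start: float, t_end: float) -> list[tuple[float, float]]:
    """Ball positions sampled inside [t_start, t_end]."""
    return [(x, y) for t, x, y in track if t_start <= t <= t_end]


def path_length(points: list[tuple[float, float]]) -> float:
    """Total distance travelled along consecutive positions."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def speed_of_play(track: Track, t_duel: float) -> dict[str, float]:
    last30 = window(track, t_duel - 30, t_duel)
    last60 = window(track, t_duel - 60, t_duel)
    d30, d60 = path_length(last30), path_length(last60)
    return {
        # Net x-displacement: last x minus first x in the 30 s window.
        "dx30": last30[-1][0] - last30[0][0] if len(last30) >= 2 else 0.0,
        "d30": d30,
        # Assumption: fall back to 0.0 when the ball did not move in 60 s.
        "ratio": d30 / d60 if d60 > 0 else 0.0,
    }


# Invented ball track ending in a duel at t = 60 s.
track = [(0, 20, 50), (15, 40, 50), (35, 40, 20), (50, 70, 20), (60, 60, 20)]
features = speed_of_play(track, t_duel=60)
label = danger_label(60, conceded_goal_times=[75.0])
print(features, label)
```

A ratio near 1 means almost all ball movement in the last minute happened in the final 30 seconds, while a ratio near 0.5 indicates a steady tempo across the whole minute.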

Two LightGBM models were trained on the spatial and speed-of-play features:

\[ \text{outcome model:}\quad P(c\mid x,y,\text{SoP}),\;\; c\in\{\text{beat, stopped, recovered}\} \] \[ \text{danger model:}\quad P(\text{goal in next 20s}\mid c, x, y,\text{SoP}) \]

Combining the two yields the per-duel value via the formula above. The interactive panels below let you query both surfaces at any \((x, y, \text{SoP})\) the models support.

Interactive duel breakdown

Click anywhere on the defensive half to place a duel, pick the outcome, and adjust the three speed-of-play sliders. The panel shows how the outcome prior \(P(\text{outcome}\mid x, y, \text{SoP})\) and the conditional goal risk \(P(\text{goal}\mid x, y, \text{outcome}, \text{SoP})\) respond. The Duel Value is the gap between expected and realized danger, both evaluated at the same location and speed-of-play context.

Step 1 · click to place the duel

Step 2 · pick the observed outcome

Step 3 · adjust the speed-of-play context


Where does this duel rank?

Histogram of duel values across all defensive duels in the dataset. The red line marks this duel's value; percentile is the share of duels with a smaller value.
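
The percentile defined above (the share of duels with a strictly smaller value) can be computed with a sorted list and a binary search. The sketch below uses a tiny invented population; the function name `percentile_rank` is hypothetical, and using the population standard deviation for the spread is an assumption.

```python
from bisect import bisect_left
from statistics import mean, pstdev


def percentile_rank(sorted_values: list[float], value: float) -> float:
    """Percentage of entries strictly smaller than `value`."""
    # bisect_left returns the index of the first entry >= value,
    # which equals the count of entries strictly below it.
    return 100.0 * bisect_left(sorted_values, value) / len(sorted_values)


# Invented duel values standing in for the full dataset.
population = sorted([-0.02, -0.01, 0.0, 0.0, 0.005, 0.01, 0.0135, 0.03])
this_duel = 0.0135

rank = percentile_rank(population, this_duel)
print(f"percentile: {rank:.1f}")
print(f"population mean: {mean(population):.4f}")
print(f"population std: {pstdev(population):.4f}")
```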


How the probabilities vary across the pitch

The speed-of-play sliders below control the context used by all four surface tabs. Each tab shows one panel per outcome (beat / recovered / stopped). Toggle tabs to see where the models place each quantity across the defensive half.

Speed of play context for surfaces

\(P(\text{outcome}\mid x, y, \text{SoP})\)

\(P(\text{goal}\mid x, y, \text{outcome}, \text{SoP})\)

\(\mathrm{DDV}\mid x, y, \text{outcome}, \text{SoP}\)

\(\mathrm{DDV}\,\cdot\,P(\text{duel}\mid x, y)\)

How speed of play impacts the probabilities

How much does shifting a single speed-of-play feature change the model's output at each location? Each panel is a difference heatmap: the probability evaluated with the SoP feature pinned at its 90th percentile minus the probability evaluated with that feature pinned at its 10th percentile, with the other two SoP features held at their population medians. Hot cells mark locations that are markedly more likely to produce that outcome (or concede a goal) when the feature is high; cold cells mark locations where a high value suppresses it. Panels are arranged as SoP feature (rows) × outcome class (columns); use the tabs to switch between the duel outcome and conditional goal models.
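
The high-minus-low construction described above can be sketched as follows. Everything here is illustrative: `toy_predict` is an invented stand-in for a trained outcome model, the feature samples are synthetic, and the names `sop_effect`, `dx30`, `d30`, and `ratio` are hypothetical labels for the three speed-of-play features.

```python
from statistics import median, quantiles


def sop_effect(predict, sop_samples, feature, grid, outcome):
    """Per-cell difference in P(outcome): feature at p90 minus feature at p10."""
    # Hold every SoP feature at its population median...
    base = {name: median(values) for name, values in sop_samples.items()}
    # ...then pin the chosen feature at its 10th and 90th percentiles.
    deciles = quantiles(sop_samples[feature], n=10)  # 9 cut points
    low = {**base, feature: deciles[0]}
    high = {**base, feature: deciles[-1]}
    return {
        (x, y): predict(x, y, high)[outcome] - predict(x, y, low)[outcome]
        for x, y in grid
    }


def toy_predict(x, y, sop):
    """Invented model: P(beat) rises with ball distance in the last 30 s."""
    p_beat = min(0.9, 0.2 + 0.001 * sop["d30"])
    rest = 1.0 - p_beat
    return {"beat": p_beat, "stopped": 0.6 * rest, "recovered": 0.4 * rest}


# Synthetic feature samples standing in for the dataset's distributions.
sop_samples = {
    "dx30": [float(v) for v in range(-50, 51, 5)],
    "d30": [float(v) for v in range(0, 201, 10)],
    "ratio": [v / 20 for v in range(1, 21)],
}
grid = [(x, y) for x in (10, 30, 50) for y in (20, 50, 80)]

delta_d30 = sop_effect(toy_predict, sop_samples, "d30", grid, "beat")
delta_dx30 = sop_effect(toy_predict, sop_samples, "dx30", grid, "beat")
print(delta_d30[(10, 20)], delta_dx30[(10, 20)])
```

Because the toy model ignores `dx30`, its difference map is zero everywhere, while the `d30` map is positive: this is the kind of contrast the panels are meant to make visible for the real models.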

\(\Delta P(\text{outcome}\mid x, y, \text{SoP})\)

\(\Delta P(\text{goal}\mid x, y, \text{outcome}, \text{SoP})\)