Run Boy Run (An Analysis of Runs in Behind in Soccer)

Abstract

Off-the-ball runs in behind are a fundamental attacking mechanism in soccer, influencing both the creation of goal-scoring opportunities and the disruption of defensive structures. This study analyzes the characteristics of such runs and their relationship to shot creation using a large-scale spatiotemporal tracking dataset provided by SkillCorner. By integrating player-tracking data with event-level annotations of off-ball runs, we reconstructed possessions, identified whether each possession led to a shot, and extracted a comprehensive set of run-level features, including spatial endpoints, run angle, speed, defensive-line interactions, and positional relationships to the ball.

We first established that possessions containing at least one run in behind were 14% more likely to result in a shot, confirming their attacking value. To identify which run characteristics predict shot outcomes, we fitted a series of one-variable logistic regression models, supported by distributional comparisons and a correlation analysis to assess feature relevance and redundancy. Several features emerged as significant predictors of shot-producing possessions: end location of the run, distance covered, width change during the run, diagonal run angle, change in defensive-line height, and proximity of the runner to the ball carrier.

Analysis of shot-leading versus non-shot-leading runs revealed consistent patterns. Runs that travel longer distances, move centrally, end near the top of the penalty box, follow diagonal trajectories, create measurable defensive-line displacement, and occur closer to the player in possession all show increased likelihood of producing a shot. These findings align with tactical intuition around counterattacks, striker movement, and defensive manipulation.

While the linear models used cannot fully capture the complex interdependence between features, this work provides evidence-based insights into what constitutes an effective run in behind. The results offer a foundation for future modeling using more expressive nonlinear methods and provide practical guidance for coaching, player development, and data-driven tactical analysis.

Discussion and Conclusion

Off-the-ball runs in behind play a critical role in soccer, serving both as a direct attacking threat and as a mechanism for disrupting defensive structures. Our first conclusion is that possessions containing at least one run in behind are significantly (14%) more likely to lead to a shot. This analysis was made possible by the data collection efforts of SkillCorner, which provided the large-scale spatiotemporal tracking dataset used in this study.

SkillCorner applied proprietary algorithms to detect and categorize off-the-ball runs. Using the SkillCorner datasets, we combined player- and ball-tracking data with an event-level dataset of off-the-ball runs to compute run-level features essential to our analysis. These features included spatial endpoints (x, y), run angle, speed, curvature, defensive-line interactions, and the number of opponents passed. We then reconstructed possessions across matches, identified whether each possession resulted in a shot, and created a possession index linking runs to possessions. This process yielded the final dataset used for analysis.
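
The linking step can be illustrated with a minimal sketch. It assumes each possession is a time window and each run is assigned to the possession whose window contains the run's start time; the record layouts, the function names `link_runs_to_possessions` and `shot_rates`, and all values below are hypothetical and are not taken from the SkillCorner data or from our pipeline.

```python
from bisect import bisect_right

# Hypothetical possession records: (possession_id, start_s, end_s, led_to_shot).
possessions = [
    (0, 0.0, 12.0, False),
    (1, 15.0, 31.0, True),
    (2, 40.0, 52.0, False),
    (3, 60.0, 75.0, False),
]
# Hypothetical run-in-behind events: (run_id, start_s).
runs = [(100, 18.5), (101, 24.0), (102, 45.0), (103, 55.0)]


def link_runs_to_possessions(runs, possessions):
    """Map each run to the possession whose time window contains its start."""
    ordered = sorted(possessions, key=lambda p: p[1])
    starts = [p[1] for p in ordered]
    index = {}
    for run_id, t in runs:
        i = bisect_right(starts, t) - 1  # last possession starting at or before t
        if i >= 0 and t <= ordered[i][2]:
            index[run_id] = ordered[i][0]
    return index  # runs outside every window stay unlinked


def shot_rates(possessions, index):
    """Shot rate for possessions with and without at least one linked run."""
    with_run = set(index.values())
    counts = {True: [0, 0], False: [0, 0]}  # has_run -> [shots, possessions]
    for pid, _, _, shot in possessions:
        bucket = counts[pid in with_run]
        bucket[0] += int(shot)
        bucket[1] += 1
    return {k: (s / n if n else float("nan")) for k, (s, n) in counts.items()}


possession_index = link_runs_to_possessions(runs, possessions)
rates = shot_rates(possessions, possession_index)
print(possession_index, rates)
```

Comparing `rates[True]` with `rates[False]` gives the kind of with-run versus without-run contrast reported above; whether that contrast is expressed as a ratio or as a difference in percentage points is a reporting choice that the sketch leaves open.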

The second major component of our analysis focused on identifying which run characteristics significantly predict whether a possession leads to a shot. Using one-variable logistic regression models, we reduced the full feature set to a smaller subset of significant predictors. These included end location of the run along the x-axis, distance covered, width at the end of the run and change in width during the run, run angle, defensive-line height and the runner’s relationship to that line, and distance between the runner and the player in possession.

It is important to note that soccer actions and off-the-ball runs cannot be fully characterized using linear relationships alone. Many features interact and exhibit strong correlations, which limits the ability of simple models to capture their combined effects. Our correlation analysis revealed several highly correlated features, such as distance covered and change in defensive-line height, which is expected given how attacking plays unfold. Nevertheless, these linear models still provide meaningful insight into how individual run characteristics relate to shot creation.
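
A redundancy check of this kind can be sketched as a pairwise Pearson correlation scan. The 0.8 threshold, the feature names used as dictionary keys, and the numbers below are illustrative assumptions, not values from the study.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def redundant_pairs(features, threshold=0.8):
    """List feature pairs whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(features.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up values for three run-level features across five runs.
features = {
    "distance_covered": [10, 15, 20, 25, 30],
    "line_height_change": [1.0, 1.4, 2.1, 2.4, 3.1],
    "run_angle": [30, 10, 45, 20, 35],
}

flagged = redundant_pairs(features)
print(flagged)
```

Pairs flagged this way carry overlapping information, so their separate one-variable effects should not be read as independent contributions.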

Beyond identifying significant predictors, we also examined how these features influence the likelihood of a run contributing to a shot. By separating runs into those that led to shots and those that did not, we compared feature distributions to uncover consistent patterns. From this analysis, we identified six key characteristics of effective runs in behind:

  1. Runs that start higher up the field and end near the top of the penalty box are more likely to lead to shots. This aligns with the objective of advancing the ball into dangerous central areas closer to goal.
  2. Runs that cover longer distances tend to produce more shots, consistent with the high threat posed by counterattacks in which players exploit large spaces at speed.
  3. Runs that significantly reduce width by moving centrally and that finish in central areas have a higher likelihood of leading to shots than runs that remain wide.
  4. Diagonal runs toward the center of the field are more effective than purely vertical runs, supporting tactical principles commonly associated with striker movement.
  5. Runs that create substantial disruption to the defensive line, relative to its initial height, are more likely to result in shots. Shot probability rises both when a low defensive line is forced to drop further and when a high line is compelled to retreat substantially.
  6. Runs made closer to the player in possession are significantly more likely to lead to shots, emphasizing the importance of providing an accessible and immediate passing option.
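
The comparison behind several of these characteristics can be sketched as follows. The sketch assumes a coordinate system with x increasing toward the attacked goal and y measured from the central axis of the pitch, so that |y| is the run's width; that convention, the function names, and the four toy runs are assumptions made for illustration only.

```python
import math
from statistics import median


def run_features(x0, y0, x1, y1):
    """Geometry of a run from its start (x0, y0) and end (x1, y1) points."""
    dx, dy = x1 - x0, y1 - y0
    return {
        "distance": math.hypot(dx, dy),
        "width_change": abs(y1) - abs(y0),  # negative means moving centrally
        # 0 degrees is a purely vertical (goal-ward) run under this convention.
        "angle_deg": math.degrees(math.atan2(abs(dy), dx)),
    }


def median_by_outcome(runs):
    """Median of each feature, split by whether the run led to a shot."""
    groups = {True: [], False: []}
    for x0, y0, x1, y1, shot in runs:
        groups[shot].append(run_features(x0, y0, x1, y1))
    return {
        shot: {name: median(f[name] for f in feats) for name in feats[0]}
        for shot, feats in groups.items()
        if feats
    }


# Made-up runs: (start_x, start_y, end_x, end_y, led_to_shot), in metres.
toy_runs = [
    (20, 20, 40, 5, True),
    (10, -18, 34, 0, True),
    (25, 15, 35, 15, False),
    (30, 10, 36, 18, False),
]

summary = median_by_outcome(toy_runs)
for shot, stats in summary.items():
    print(shot, stats)
```

Medians are used here only as a compact summary; comparing full distributions, for example with histograms or a two-sample test, would be closer in spirit to the analysis described above.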

A key limitation of this work is that many of the analyzed variables interact in nonlinear ways that simple logistic regression models cannot fully capture. However, the findings offer clear, evidence-based insights into what constitutes an effective run in behind. These results help reinforce tactical principles for striker and winger movement and provide a foundation for player development, coaching, and future modeling using more expressive nonlinear approaches.