The Q: NFL CoD, Gaussian version

Quang Nguyen

So, I’ve been working a lot with angular/circular/directional data for my PhD dissertation, which focuses on modeling and evaluating player movement in sports with tracking data. Last summer, I wrote a paper on evaluating change of direction (CoD) of NFL ball carriers. I also gave a talk on this same topic at Harvard last fall (and even presented about it during a science & comedy show when I visited Chicago last summer.)

At the heart of my paper is the following model: \begin{aligned} \varphi_{ijt} &\sim \textsf{von Mises}(\mu_{ijt}, \kappa_{ijt}),\\ \tan \frac{\mu_{ijt}}{2} &= \alpha_0 + \mathbf{x}_{ijt}^\top \boldsymbol{\beta},\\ \log \kappa_{ijt} &= \gamma_0 + \mathbf{z}_{ijt}^\top \boldsymbol{\psi} + u_j,\\ u_j &\sim \mathcal N\left(0, \tau^2_{p[j]}\right).\\ \end{aligned} To quickly summarize (taken shamelessly straight from my paper):

I use a von Mises distribution to model the turn angle \varphi_{ijt} for ball carrier j at frame t of play i.
I explicitly model both the mean \mu and concentration \kappa parameters with various covariates (see my paper for full details).
The model includes a random intercept u_j with variance \tau^2_{p[j]} depending on player position, where p[j] denotes position group of ball carrier j, so each position (RB, WR, TE) has its own variance.
I can then identify the shiftiest ball carriers with high variability (low concentration) in CoD.
The model is implemented in a Bayesian framework, and the full code is available on GitHub.

I personally think this is a super neat model and a cool application of directional statistics in sports. The paper has also been well-received by the football analytics community (see, e.g., here).

The only con of this model, though, is that it took forever to fit. Thus, I’ve been looking for ways to speed up the computation. This would be especially helpful because, in another project, I’m using this model as a piece to simulate player movement.

Here is a nice way to achieve this. It is known that as \kappa \rightarrow \infty, a \textsf{von Mises}(\mu, \kappa) distribution converges to a \mathcal N(\mu, \sigma^2 = 1/\kappa) distribution. (Side note: as \kappa \rightarrow 0, the von Mises distribution converges to a uniform.) This can be easily be found on the von Mises distribution’s Wikipedia. The Stan documentation even recommends using a Gaussian in place of von Mises when \kappa is large for the sake of numerical stability.

Anyway, since the observed turn angle distribution in my data is highly concentrated (around 0), I can rely on this fact and fit the following Gaussian model: \begin{aligned} \varphi_{ijt} &\sim \mathcal N(\mu_{ijt}, \sigma_{ijt}),\\ \mu_{ijt} &= \alpha_0 + \mathbf{x}_{ijt}^\top \boldsymbol{\beta},\\ \log \sigma_{ijt} &= \gamma_0 + \mathbf{z}_{ijt}^\top \boldsymbol{\psi} + u_j,\\ u_j &\sim \mathcal N\left(0, \tau^2_{p[j]}\right).\\ \end{aligned} After fitting, I get basically the same results as what I presented in my paper using the von Mises model. What’s nice about this is that the Gaussian model fit about three times faster. (Well, it still took a couple of hours, but the point is that it’s way faster.)

Below, I provide the full code for model fitting and some analyses. Again, these should match the results shown in my paper.

Model fitting with brms

Show code

library(tidyverse)
theme_set(theme_light())

library(brms)
angle_gaussian <- brm(
  bf(
    turn_angle ~ prev_angle + adj_bc_x + adj_bc_y + adj_bc_x_from_first_down +
      n_left_bc_defense + n_front_bc_defense + n_left_bc_offense + 
      n_front_bc_offense + adj_x + adj_y + adj_x_change + adj_y_change_abs + 
      dist_to_bc + def_s + angle_with_bc,
    sigma ~ bc_s + bc_a + bc_cum_dis + bc_type + bc_position + 
      (1 | gr(bc_id, by = bc_position)), 
    decomp = "QR"
  ),
  chains = 4,
  iter = 4000,
  warmup = 2000,
  seed = 60660,
  cores = 4,
  init = "0",
  control = list(adapt_delta = 0.9),
  backend = "cmdstanr",
  data = tracking_angle
)

Fixed effect coefficients when modeling the Gaussian standard deviation parameter

Higher variability in CoD is associated with lower speed, higher acceleration, and higher cumulative distance covered.
Players on run plays display less variability in CoD (relative to pass plays).
WRs display more variability in CoD (relative to RBs).

Show code

angle_gaussian |> 
  summary() |> 
  pluck("fixed") |> 
  rownames_to_column(var = "term") |> 
  filter(str_detect(term , "sigma_")) |> 
  mutate(across(where(is.numeric), ~ round(.x, 3)))

                 term Estimate Est.Error l-95% CI u-95% CI  Rhat
1     sigma_Intercept   -0.540     0.014   -0.567   -0.512 1.002
2          sigma_bc_s   -0.381     0.001   -0.383   -0.379 1.000
3          sigma_bc_a    0.050     0.002    0.046    0.053 1.001
4    sigma_bc_cum_dis    0.009     0.000    0.009    0.010 1.000
5 sigma_bc_typerusher   -0.003     0.009   -0.020    0.014 1.000
6 sigma_bc_positionTE   -0.020     0.026   -0.069    0.031 1.003
7 sigma_bc_positionWR    0.090     0.020    0.050    0.129 1.004
   Bulk_ESS Tail_ESS
1  2031.881 3717.247
2 12021.661 6458.493
3 17089.828 6047.897
4  8708.711 6816.165
5  8529.049 5738.125
6  1316.987 2467.838
7  1073.820 1832.806

Positional differences in random effect variance

There’s more within-position variation among TEs and WRs, compared to RBs.
This makes total sense given the types of movement of ball carriers in these position groups. (RB movements are typically more structured. TE and WR movements are often in the open-field.)

Show code

angle_gaussian |> 
  as_tibble() |> 
  select(contains("sd_")) |> 
  pivot_longer(everything(), names_to = "position") |> 
  mutate(position = str_remove(position, "sd_bc_id__sigma_Intercept:bc_position")) |> 
  ggplot(aes(value)) +
  geom_histogram(alpha = 0.7, bins = 60, show.legend = FALSE) +
  facet_wrap(~ position, nrow = 3) +
  labs(x = "Within-position standard deviation",
       y = "Frequency")

Ball carrier ratings: wide receivers (top/bottom 10)

Higher estimates now correspond directly to higher variance in CoD.
George Pickens is the shiftiest. DK Metcalf is the least shifty. (Curious, Steelers…)

Show code

library(tidybayes)
players <- read_csv("players.csv")

# wr_filtered <- tracking_angle |> 
#   distinct(gameId, playId, bc_id, bc_position, bc_type) |> 
#   filter(bc_position == "WR") |> 
#   count(bc_id) |> 
#   filter(n >= 15) |> 
#   pull(bc_id)

wr_posterior <- angle_gaussian |> 
  spread_draws(r_bc_id__sigma[nflId,term]) |>
  left_join(players) |> 
  filter(nflId %in% wr_filtered) |>
  group_by(nflId, displayName, position) |> 
  summarize(posterior_mean = mean(r_bc_id__sigma)) |> 
  ungroup() |> 
  arrange(posterior_mean)

wr_top_bottom <- wr_posterior |> 
  slice(c(1:10, (n()-9):n())) |> 
  pull(nflId)

angle_gaussian |> 
  spread_draws(r_bc_id__sigma[nflId,term]) |> 
  ungroup() |> 
  left_join(players) |> 
  filter(nflId %in% wr_top_bottom) |> 
  mutate(displayName = factor(displayName, 
                              levels = wr_posterior$displayName)) |> 
  ggplot(aes(x = r_bc_id__sigma, y = displayName)) +
  ggridges::geom_density_ridges(rel_min_height = 0.01, alpha = 0.7) +
  labs(x = "Ball carrier standard deviation random effect",
       y = NULL)

Lastly, a quick note on prediction.

The Gaussian distribution is defined over the entire real line. When making angular predictions with the Gaussian model, there will be values that leak past the circular boundaries [-\pi, \pi]. (The proportion of these values should be super small if the data are highly concentrated.) If this happens, just simply wrap those values back into the circle. (Use atan2(sin(x), cos(x)) in R.) Think of this as follows: In a linear world, the number line goes on forever in both directions. In a circular world, that line is “glued” to the surface of a circle.

Enjoy this clip from Uncontrolled Variables—the science & comedy show I mentioned at the start of this post.

NFL CoD, Gaussian version

Citation