EPL Goal Scoring Time Data Shiny App

shiny football goals

My first Shiny app.

Quang Nguyen https://github.com/qntkhvn
August 24, 2021

One of the items on my to-do list for this summer was to learn Shiny and build an app. I finally did it! Check out the app here.

Description

This is a very simple Shiny app that provides data on every goal scoring event in the English Premier League, from its first season (1992-93) through the most recently completed season at the time of writing (2020-21).
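For readers who have not used Shiny before, here is a minimal sketch of what an app of this kind could look like. It is not the source code of the actual app: the toy data frame `goals_demo` (a few rows copied from the tables later in this post), the helper `filter_season()`, and the input and output IDs are all assumptions made for illustration.

```r
library(shiny)

# Toy rows copied from the tables later in this post.
# The real app works with the full dataset; this is only a stand-in.
goals_demo <- data.frame(
  season      = c("1996-1997", "2010-2011", "2011-2012"),
  matchweek   = c(1, 27, 38),
  home_club   = c("Wimbledon FC", "Man Utd", "Man City"),
  away_club   = c("Man Utd", "Man City", "QPR"),
  final_score = c("0:3", "2:1", "3:2"),
  goal_club   = c("Man Utd", "Man Utd", "Man City"),
  goal_scorer = c("David Beckham", "Wayne Rooney", "Sergio Aguero"),
  minute      = c("90", "78", "90+4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the rows for one season.
filter_season <- function(data, chosen_season) {
  data[data$season == chosen_season, , drop = FALSE]
}

# UI: a season picker and a table of the matching goals.
ui <- fluidPage(
  titlePanel("EPL goal scoring time (sketch)"),
  selectInput("season", "Season", choices = unique(goals_demo$season)),
  tableOutput("goal_table")
)

# Server: re-render the table whenever the selected season changes.
server <- function(input, output, session) {
  output$goal_table <- renderTable(filter_season(goals_demo, input$season))
}

app <- shinyApp(ui, server)
# runApp(app)  # uncomment to launch the app locally
```

The app object is stored in `app` rather than launched right away, so the script can be sourced without blocking the R session.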

The data were obtained from Transfermarkt. The most important feature in this dataset is the scoring time (in minutes) of each goal. Improbable as it may seem, it is nearly impossible (at least for me) to find clean, publicly available data on this specific piece of information. Back in 2019-2020, when I was working on my undergraduate honors thesis on modeling EPL goal scoring, I had limited time and, more importantly, limited data science skills, so I could neither find a dataset on scoring time nor scrape the extremely messy data from the web. I ended up collecting the data by hand. That experience is my motivation for developing this app.

I would certainly like to add more information to this data, such as the dates and kick-off times of the matches. There are also several issues with this app that need to be addressed in the future. In particular, the data do not specify whether a goal was an own goal. As a result, the variable goal_scorer includes players who scored for their own team as well as players who scored against it. Another problem I noticed is that, for scoring events in the early seasons (1990s), most goals scored during injury time of either half are recorded with 45 (first half) or 90 (second half) as the scoring minute, which does not show how deep into stoppage time the goals occurred. There are other data sources that I have not yet explored, which will probably help me fix this issue.
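Because stoppage-time goals are stored as text such as "90+4" (see the Aguero goal below), the minute column cannot be converted to a number directly. The sketch below shows one possible way to split it into a base minute and a stoppage-time component. The toy values in `minutes_demo` and the new column names are assumptions for illustration, and "45+2" is an assumed first-half format by analogy with "90+4".

```r
library(dplyr)
library(stringr)

# Toy values; "45+2" is an assumed format by analogy with "90+4".
minutes_demo <- tibble(minute = c("78", "90", "45+2", "90+4"))

minutes_parsed <- minutes_demo %>%
  mutate(
    # digits before any "+" sign: the regular-time minute
    base_minute = as.integer(str_extract(minute, "^\\d+")),
    # digits after the "+" sign; NA when there is no stoppage-time part
    stoppage = as.integer(str_extract(minute, "(?<=\\+)\\d+")),
    # treat a missing stoppage-time part as zero added minutes
    stoppage = coalesce(stoppage, 0L)
  )

minutes_parsed
```

Keeping `base_minute` and `stoppage` in separate columns avoids mixing up a first-half stoppage-time goal with an early second-half goal. Note that this does nothing for the early seasons, where the extra minutes were never recorded in the first place.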

As usual, the source code for this project can be found on GitHub.

Exploring

Let’s quickly explore the EPL goal scoring data provided by my new Shiny app.

library(tidyverse)
library(kableExtra)
theme_set(theme_bw())
goals <- read_csv("goals.csv")

Famous Goals

If you are a longtime follower of the EPL, these famous goals should be familiar to you.

Beckham from the halfway line

goals %>% 
  filter(goal_scorer == "David Beckham" & season == "1996-1997" & matchweek == 1) %>% 
  kable()
| season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
|:---|---:|:---|:---|:---|:---|:---|:---|
| 1996-1997 | 1 | Wimbledon FC | Man Utd | 0:3 | Man Utd | David Beckham | 90 |

Rooney’s bicycle-kick

goals %>% 
  filter(goal_scorer == "Wayne Rooney" & season == "2010-2011" & matchweek == 27) %>% 
  kable()
| season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
|:---|---:|:---|:---|:---|:---|:---|:---|
| 2010-2011 | 27 | Man Utd | Man City | 2:1 | Man Utd | Wayne Rooney | 78 |

The Hat-trick by number 20 to clinch title number 20 for Man United

goals %>% 
  filter(goal_scorer == "Robin Van Persie" & season == "2012-2013" & matchweek == 34) %>% 
  kable()
| season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
|:---|---:|:---|:---|:---|:---|:---|:---|
| 2012-2013 | 34 | Man Utd | Aston Villa | 3:0 | Man Utd | Robin Van Persie | 2 |
| 2012-2013 | 34 | Man Utd | Aston Villa | 3:0 | Man Utd | Robin Van Persie | 13 |
| 2012-2013 | 34 | Man Utd | Aston Villa | 3:0 | Man Utd | Robin Van Persie | 33 |

And yes…93:20

goals %>% 
  filter(goal_scorer == "Sergio Aguero" & minute == "90+4") %>% 
  kable()
| season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
|:---|---:|:---|:---|:---|:---|:---|:---|
| 2011-2012 | 38 | Man City | QPR | 3:2 | Man City | Sergio Aguero | 90+4 |

EDA

Who were the Top 5 All-Time EPL Goal Scorers?

goals %>% 
  count(goal_scorer, sort = TRUE) %>% 
  slice_head(n = 5) %>% 
  kable()
| goal_scorer | n |
|:---|---:|
| Alan Shearer | 260 |
| Wayne Rooney | 209 |
| Andy Cole | 187 |
| Sergio Aguero | 184 |
| Frank Lampard | 178 |

Which players have scored 30 or more goals in a season?

goals %>% 
  count(goal_scorer, season, sort = TRUE) %>% 
  filter(n >= 30) %>% 
  kable()
| goal_scorer | season | n |
|:---|:---|---:|
| Alan Shearer | 1994-1995 | 34 |
| Andy Cole | 1993-1994 | 34 |
| Mohamed Salah | 2017-2018 | 32 |
| Alan Shearer | 1993-1994 | 31 |
| Alan Shearer | 1995-1996 | 31 |
| Cristiano Ronaldo | 2007-2008 | 31 |
| Luis Suarez | 2013-2014 | 31 |
| Harry Kane | 2017-2018 | 30 |
| Kevin Phillips | 1999-2000 | 30 |
| Robin Van Persie | 2011-2012 | 30 |
| Thierry Henry | 2003-2004 | 30 |

(Note: There were 22 clubs and 42 matchweeks in the first 3 EPL seasons (92-93, 93-94, 94-95), before the number of clubs was reduced to 20 (hence 38 matchweeks) at the start of 95-96.)
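Since the early seasons had 42 matchweeks instead of 38, raw season totals are not directly comparable. One rough way to adjust is to divide each total by the number of matchweeks in that season. The sketch below does this for a few rows taken from the table above. The names `season_totals`, `long_seasons`, and `goals_per_matchweek` are made up for illustration, and the rate is per club matchweek, not per appearance, because the data do not record appearances.

```r
library(dplyr)

# A few season totals copied from the table above.
season_totals <- tibble(
  goal_scorer = c("Alan Shearer", "Andy Cole", "Mohamed Salah", "Luis Suarez"),
  season      = c("1994-1995", "1993-1994", "2017-2018", "2013-2014"),
  n           = c(34, 34, 32, 31)
)

# The three 22-club seasons, each with 42 matchweeks.
long_seasons <- c("1992-1993", "1993-1994", "1994-1995")

season_rates <- season_totals %>%
  mutate(
    # 42 matchweeks in the 22-club era, 38 afterwards
    matchweeks = if_else(season %in% long_seasons, 42, 38),
    goals_per_matchweek = n / matchweeks
  ) %>%
  arrange(desc(goals_per_matchweek))

season_rates
```

On this scale, the 32 goals in a 38-matchweek season come out ahead of the 34 goals in a 42-matchweek season (32/38 versus 34/42).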

Which are the highest-scoring teams in EPL history?

goals %>% 
  count(goal_club, season, sort = TRUE) %>% 
  filter(n >= 90) %>% 
  kable()
| goal_club | season | n |
|:---|:---|---:|
| Man City | 2017-2018 | 106 |
| Chelsea | 2009-2010 | 103 |
| Man City | 2013-2014 | 102 |
| Man City | 2019-2020 | 102 |
| Liverpool | 2013-2014 | 101 |
| Man Utd | 1999-2000 | 97 |
| Man City | 2018-2019 | 95 |
| Man City | 2011-2012 | 93 |

What is the goal scoring trend since 2010?

goals %>% 
  # keep seasons starting in 2010 or later (first 4 characters = start year)
  filter(as.numeric(str_sub(season, 1, 4)) > 2009) %>% 
  mutate(season = str_replace(season, "-", "-\n")) %>% # break season label over two lines
  count(season) %>% # total goals per season
  ggplot(aes(x = season, y = n, group = 1)) +
  geom_point(aes(size = n), show.legend = FALSE) +
  geom_line() +
  labs(
    title = "2014-2015 was a low-scoring season",
    x = "Season",
    y = "Total goals"
  )

Citation

For attribution, please cite this work as

Nguyen (2021, Aug. 24). The Q: EPL Goal Scoring Time Data Shiny App. Retrieved from https://qntkhvn.netlify.app/posts/2021-08-17-epl-scoring-time-shiny/

BibTeX citation

@misc{nguyen2021epl,
  author = {Nguyen, Quang},
  title = {The Q: EPL Goal Scoring Time Data Shiny App},
  url = {https://qntkhvn.netlify.app/posts/2021-08-17-epl-scoring-time-shiny/},
  year = {2021}
}