My first Shiny app.
One of the items on my to-do list for this summer was to learn Shiny and build an app. I finally did it! Check out the app here.
This is a very simple Shiny app which provides data on all goal scoring events in the English Premier League, from its first season, 1992-93, to the last completed season, 2020-21.
The data were obtained from transfermarkt. The most important feature in this dataset is the scoring time (in minutes) of each goal. As improbable as it may seem, it is impossible (at least for me) to find clean, publicly available data on this specific piece of information. Back in 2019-2020, when I was working on my undergraduate honors thesis on modeling EPL goal scoring, I could neither find a dataset on scoring times nor scrape the extremely messy data from the web, given the limited time and, more importantly, the data science skills I had back then, so I ended up collecting the data by hand. That experience is my motivation for developing this app.
I would certainly like to add more information to this data, such as the dates and kick-off times of the matches. There are also several issues with this app that need to be addressed in the future. In particular, the data do not specify whether a goal was an own goal. As a result, the variable goal_scorer includes players who scored for their own team as well as players who scored against it. Another problem I noticed is that in the early seasons (1990s), most goals scored during injury time of either half have 45 (first half) or 90 (second half) as the scoring minute, which does not show how deep into stoppage time the goals occurred. There are other data sources that I have not yet explored, which will probably help me fix this issue.
As usual, the source code for this project can be found on GitHub.
Let’s quickly explore the EPL goal scoring data provided by my new Shiny app.
library(tidyverse)
library(kableExtra)
theme_set(theme_bw())
goals <- read_csv("goals.csv")
If you are a longtime follower of the EPL, these famous goals should be familiar to you.
Beckham from the halfway line
goals %>%
  filter(goal_scorer == "David Beckham" & season == "1996-1997" & matchweek == 1) %>%
  kable()
season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
---|---|---|---|---|---|---|---|
1996-1997 | 1 | Wimbledon FC | Man Utd | 0:3 | Man Utd | David Beckham | 90 |
Rooney’s bicycle-kick
goals %>%
  filter(goal_scorer == "Wayne Rooney" & season == "2010-2011" & matchweek == 27) %>%
  kable()
season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
---|---|---|---|---|---|---|---|
2010-2011 | 27 | Man Utd | Man City | 2:1 | Man Utd | Wayne Rooney | 78 |
The hat-trick by number 20 to clinch title number 20 for Man United
goals %>%
  filter(goal_scorer == "Robin Van Persie" & season == "2012-2013" & matchweek == 34) %>%
  kable()
season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
---|---|---|---|---|---|---|---|
2012-2013 | 34 | Man Utd | Aston Villa | 3:0 | Man Utd | Robin Van Persie | 2 |
2012-2013 | 34 | Man Utd | Aston Villa | 3:0 | Man Utd | Robin Van Persie | 13 |
2012-2013 | 34 | Man Utd | Aston Villa | 3:0 | Man Utd | Robin Van Persie | 33 |
And yes…93:20
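The query behind the table below is not shown in the post. Following the pattern of the earlier examples, it would presumably look like this sketch. To keep the snippet runnable on its own, it filters a two-row stand-in (`goals_demo`, a name introduced here) built from rows printed in this post, rather than the full `goals` data.

```r
library(dplyr)

# Stand-in for the full `goals` data: two rows copied from tables in this post.
# `minute` is stored as text because values such as "90+4" are not plain numbers.
goals_demo <- tibble(
  season      = c("2010-2011", "2011-2012"),
  matchweek   = c(27, 38),
  home_club   = c("Man Utd", "Man City"),
  away_club   = c("Man City", "QPR"),
  final_score = c("2:1", "3:2"),
  goal_club   = c("Man Utd", "Man City"),
  goal_scorer = c("Wayne Rooney", "Sergio Aguero"),
  minute      = c("78", "90+4")
)

# Same filter pattern as the earlier examples
aguero <- goals_demo %>%
  filter(goal_scorer == "Sergio Aguero" & season == "2011-2012" & matchweek == 38)

aguero
```

With the real data, replacing `goals_demo` with `goals` and piping the result into `kable()` should give the table.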
season | matchweek | home_club | away_club | final_score | goal_club | goal_scorer | minute |
---|---|---|---|---|---|---|---|
2011-2012 | 38 | Man City | QPR | 3:2 | Man City | Sergio Aguero | 90+4 |
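Since stoppage-time goals are recorded as text like 90+4, the minute column cannot be treated as a number directly. One possible way to handle this, sketched here with a hypothetical helper `split_minute()` (not part of the app or its source code), is to split each value into the regular minute and the added minutes. Keep in mind that for the early seasons a bare 45 or 90 may still hide a stoppage-time goal, as noted earlier.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: split minute strings such as "90+4" into the regular
# minute (90) and the added stoppage-time minutes (4, or 0 when there is no "+").
split_minute <- function(minute) {
  parts <- str_split_fixed(minute, fixed("+"), n = 2)
  tibble(
    minute       = minute,
    base_minute  = as.integer(parts[, 1]),
    added_minute = if_else(parts[, 2] == "", 0L, as.integer(parts[, 2]))
  )
}

# Minute values taken from the tables in this post
minutes_split <- split_minute(c("2", "78", "90", "90+4"))
minutes_split
```

A split like this would make it easy to count, per season, how many goals carry an explicit stoppage-time component.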
Who were the Top 5 All-Time EPL Goal Scorers?
goals %>%
  count(goal_scorer, sort = TRUE) %>%
  slice_head(n = 5) %>%
  kable()
goal_scorer | n |
---|---|
Alan Shearer | 260 |
Wayne Rooney | 209 |
Andy Cole | 187 |
Sergio Aguero | 184 |
Frank Lampard | 178 |
Which players have scored 30 or more goals in a season?
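The code for the table below is not shown. A query along the following lines should reproduce it, assuming one row per goal and a cutoff of at least 30 goals (the table includes 30-goal seasons). The helper name `season_scorers()` and the toy data are made up for illustration.

```r
library(dplyr)

# Hypothetical helper: scorers with at least `min_goals` goals in one season.
# Assumes one row per goal, as in the `goals` data used throughout this post.
season_scorers <- function(goals, min_goals = 30) {
  goals %>%
    count(goal_scorer, season, sort = TRUE) %>%
    filter(n >= min_goals)
}

# Toy data (invented for illustration, not taken from the real dataset)
toy_goals <- tibble(
  goal_scorer = c("Player A", "Player A", "Player A",
                  "Player B", "Player B", "Player A"),
  season      = c("2000-2001", "2000-2001", "2000-2001",
                  "2000-2001", "2000-2001", "2001-2002")
)

# Use a low threshold so the toy data returns something
season_scorers(toy_goals, min_goals = 2)
```

On the real data the call would be `season_scorers(goals) %>% kable()`. Because own goals are not flagged in the data, the counts may include them.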
goal_scorer | season | n |
---|---|---|
Alan Shearer | 1994-1995 | 34 |
Andy Cole | 1993-1994 | 34 |
Mohamed Salah | 2017-2018 | 32 |
Alan Shearer | 1993-1994 | 31 |
Alan Shearer | 1995-1996 | 31 |
Cristiano Ronaldo | 2007-2008 | 31 |
Luis Suarez | 2013-2014 | 31 |
Harry Kane | 2017-2018 | 30 |
Kevin Phillips | 1999-2000 | 30 |
Robin Van Persie | 2011-2012 | 30 |
Thierry Henry | 2003-2004 | 30 |
(Note: There were 22 clubs and 42 matchweeks in the first 3 EPL seasons (92-93, 93-94, 94-95), before the number of clubs was reduced to 20 (hence 38 matchweeks) at the start of 95-96.)
What are the highest-scoring teams in EPL history?
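The code for this table is also not shown, and the cutoff used is not stated. One way to get a ranking of this kind is sketched below: count goals per club and season, then keep the top rows. The helper name `top_team_seasons()`, the default of 8 rows (chosen only to match the length of the table), and the toy data are assumptions.

```r
library(dplyr)

# Hypothetical helper: the `n_top` highest single-season goal totals by club.
# Assumes one row per goal, with `goal_club` being the club credited with the goal.
top_team_seasons <- function(goals, n_top = 8) {
  goals %>%
    count(goal_club, season, sort = TRUE) %>%
    slice_head(n = n_top)
}

# Toy data (invented for illustration, not taken from the real dataset)
toy_goals <- tibble(
  goal_club = c("Club A", "Club A", "Club B", "Club A", "Club C"),
  season    = c("2000-2001", "2000-2001", "2000-2001", "2001-2002", "2000-2001")
)

top_team_seasons(toy_goals, n_top = 1)
```

Note that `slice_head()` cuts ties at the boundary arbitrarily. `slice_max(n, n = n_top)` is an alternative that keeps them.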
goal_club | season | n |
---|---|---|
Man City | 2017-2018 | 106 |
Chelsea | 2009-2010 | 103 |
Man City | 2013-2014 | 102 |
Man City | 2019-2020 | 102 |
Liverpool | 2013-2014 | 101 |
Man Utd | 1999-2000 | 97 |
Man City | 2018-2019 | 95 |
Man City | 2011-2012 | 93 |
What is the goal scoring trend since 2010?
goals %>%
  # keep seasons whose starting year is 2010 or later
  filter(as.numeric(str_sub(season, end = -6)) > 2009) %>%
  # break the label after the hyphen so the axis text fits
  mutate(season = str_replace(season, "-", "-\n")) %>%
  count(season) %>%
  ggplot(aes(x = season, y = n, group = 1)) +
  geom_point(aes(size = n), show.legend = FALSE) +
  geom_line() +
  labs(title = "2014-2015 was a low-scoring season")
For attribution, please cite this work as
Nguyen (2021, Aug. 24). The Q: EPL Goal Scoring Time Data Shiny App. Retrieved from https://qntkhvn.netlify.app/posts/2021-08-17-epl-scoring-time-shiny/
BibTeX citation
@misc{nguyen2021epl,
  author = {Nguyen, Quang},
  title = {The Q: EPL Goal Scoring Time Data Shiny App},
  url = {https://qntkhvn.netlify.app/posts/2021-08-17-epl-scoring-time-shiny/},
  year = {2021}
}