5 Part II - Goals

5.1 Introduction

5.1.1 Topic

Cristiano Ronaldo

Cristiano Ronaldo is arguably one of the top football (soccer) players of all time. He is famous for his speed, his dribbling skills and, most importantly, his goal-scoring ability. Having followed soccer for more than 15 years, I have watched Ronaldo score many goals throughout his career, from stunning direct free kicks to crucial game-winning shots. He used to be my favorite player because he spent 6 seasons, from 2003 to 2009, with my favorite team, Manchester United. He then left MU for Real Madrid, where he elevated his game to a whole new level and established himself as one of the top footballers in the world. For these reasons, I chose a dataset of Ronaldo’s goals and analyzed it.

5.1.2 Data

The data was collected from Transfermarkt.com. Note that every goal in this dataset was scored at club level (i.e. none of them are international goals for his home nation, Portugal). The dataset does not come with variable names, so I assigned them myself. Each of Ronaldo’s goals is described by the following variables:

  • Season: A total of 16 seasons from 2002-03 to 2017-18

  • Competition: 11 different leagues and cups that Ronaldo has scored in

  • Competition_Type: Whether a competition is a Domestic League, Domestic Cup, European Cup or International Cup

  • Club: The 3 clubs that Ronaldo has played for

  • Opponent: Clubs that Ronaldo has scored against

  • Opponents_Country: The opposing squad’s “nationality”

  • Treated_as: Whether Ronaldo’s team was treated as the Home or the Away side. For the most part, this simply reflects where the match was played. In some cases, for example a UEFA Champions League final played at a neutral venue, one team is still designated the “Home” team and gets to wear its home uniform.

  • Final_Score: The result of the matches

  • Minute: The minute of the match in which the goal was scored

  • Score_at_this_point: The game score after a goal

  • Type_of_Goal: Whether a goal is a header, left/right-footed shot, tap-in, penalty or direct free kick

5.2 Analysis

I started off by loading the packages that I’ll need to analyze the dataset.

library(tidyverse)
library(mosaic)
library(readxl)
library(maps)
library(knitr)

Next, the dataset was loaded from my personal drive.

Goals <- read_excel("~/Data229/Project/Goals/CR7 Goals.xlsx", col_names = FALSE)

As mentioned above, the following names were assigned to the variables of this dataset:

colnames(Goals) <- c("Season","Competition","Competition_Type","Club","Opponent","Opponents_Country","Treated_as","Final_Score","Minute","Score_at_this_point","Type_of_Goal")

There were also a number of empty cells in the initial dataset. When Ronaldo scored multiple goals in one match, the match-level details were recorded only on the first goal’s row; the later rows contain just the minute, the score after the goal and the type of goal. The fill() function was therefore used to complete the dataset.

Goals <- Goals %>% 
  fill(Season, Competition, Competition_Type, Club, Opponent, Opponents_Country, Treated_as, Final_Score, Minute, Score_at_this_point, Type_of_Goal)
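
To see what fill() does here, the following is a minimal sketch on made-up data (the toy table, its team names and its minutes are all hypothetical, not taken from the real dataset). It assumes the empty cells were read in as NA and relies on fill()’s default behaviour of carrying the last non-missing value downward.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: the match-level cell (Opponent) is recorded
# only on the first goal of each match; later goals have NA there.
toy <- tribble(
  ~Opponent, ~Minute,
  "Team A",  12,
  NA,        47,
  NA,        88,
  "Team B",  30
)

# fill() copies the last non-missing value downward (its default direction)
toy_filled <- fill(toy, Opponent)
toy_filled$Opponent
```

After filling, every goal row carries the opponent of the match it belongs to, which is what lets the later group_by() calls count goals per club, competition or match.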

5.2.1 Club Goals

The first thing I analyzed was Ronaldo’s goals by club. Below are the numerical and visual summaries of his goals for his 3 teams:

Goals %>% 
  group_by(Club) %>% 
  summarise(Club_Goals = n()) %>% 
  kable()

Club                Club_Goals
Manchester United   118
Real Madrid         449
Sporting CP         5

Goals %>% 
  group_by(Club) %>% 
  summarise(Club_Goals = n()) %>% 
  ggplot(mapping = aes(x = "", y = Club_Goals, fill = Club)) +
  geom_bar(stat = "identity") +
  coord_polar("y", start = 0) + scale_fill_brewer(palette ="Blues") +  theme_minimal() +
  ggtitle("Ronaldo's Club Goals") +
  xlab("") + ylab("")

Ronaldo has scored 449 goals for Real Madrid, the most among the 3 clubs he has played for. He also scored 118 goals for Manchester United, but only 5 for Sporting CP.
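
The same counts can also be read as shares of the total, which is what the pie chart displays. The short sketch below is an addition to the original analysis: it re-enters the three counts from the table by hand (the names club_goals and club_share are my own) and converts them to percentages in base R.

```r
# Club totals copied from the summary table above
club_goals <- c("Manchester United" = 118, "Real Madrid" = 449, "Sporting CP" = 5)

# Each club's share of all club goals, in percent, rounded to 1 decimal
club_share <- round(100 * club_goals / sum(club_goals), 1)
club_share
```

Under these numbers, Real Madrid accounts for roughly four out of every five of his club goals.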

5.2.2 Competition Goals

Below are the numerical and visual summaries of Ronaldo’s goals in leagues and cups. I stored the summary in a table called “Competition_Goals”, since I’ll use it again later on.

kable(Competition_Goals <- Goals %>% 
  group_by(Competition) %>% 
  summarise(Competition_Goals = n()))

Competition                  Competition_Goals
Copa del Rey                 22
English Football League Cup  3
English Premier League       84
FA Cup                       14
FIFA Club World Cup          7
La Liga                      310
Primeira Liga                3
Spanish Supercopa            4
Taca de Portugal             2
UEFA Champions League        121
UEFA Supercup                2

Competition_Goals %>% 
  ggplot(mapping = aes(x = "", y = Competition_Goals, fill = Competition)) +
  geom_bar(stat = "identity") +
  coord_polar("y", start = 0) + scale_fill_brewer(palette ="Set3") +  theme_minimal() +
  ggtitle("Ronaldo's Competition Goals") +
  xlab("") + ylab("")

Ronaldo has scored the most goals in La Liga (310). His next 2 highest-scoring competitions are the UEFA Champions League (121) and the English Premier League (84). On the other hand, his goal totals in smaller competitions such as the UEFA Supercup, Taca de Portugal and Primeira Liga are very small, primarily because he played very few games in those leagues and cups (which I’ll get to in just a bit).

5.2.3 Goals vs Appearances

Is there a connection between Ronaldo’s goals and the number of matches he has played in each competition? We’ll soon find out!

The table below shows the total number of games Ronaldo has played in each of the 11 leagues and cups (the data was collected from Wikipedia):

kable(Appearances <- tribble(
  ~Competition, ~Matches,
  "English Premier League", 196,
  "La Liga", 291,
  "Primeira Liga", 25,
  "Taca de Portugal", 3,
  "Copa del Rey", 30,
  "FA Cup", 26,
  "English Football League Cup", 12,
  "UEFA Supercup", 2,
  "Spanish Supercopa", 7,
  "FIFA Club World Cup", 8,
  "UEFA Champions League", 158))

Competition                  Matches
English Premier League       196
La Liga                      291
Primeira Liga                25
Taca de Portugal             3
Copa del Rey                 30
FA Cup                       26
English Football League Cup  12
UEFA Supercup                2
Spanish Supercopa            7
FIFA Club World Cup          8
UEFA Champions League        158

Now, let’s make a graph to explore the relationship between goals and games played. Before making the plot, I joined the 2 tables Competition_Goals and Appearances into a new table called “Goals_n_Matches”.

Here’s a look at that table:

kable(Goals_n_Matches <- full_join(Competition_Goals, Appearances, by = "Competition"))

Competition                  Competition_Goals  Matches
Copa del Rey                 22                 30
English Football League Cup  3                  12
English Premier League       84                 196
FA Cup                       14                 26
FIFA Club World Cup          7                  8
La Liga                      310                291
Primeira Liga                3                  25
Spanish Supercopa            4                  7
Taca de Portugal             2                  3
UEFA Champions League        121                158
UEFA Supercup                2                  2

Now I can use the table I just created to make my graph.

Goals_n_Matches %>% 
  ggplot(mapping = aes(x = Matches, y = Competition_Goals)) +
  geom_point(mapping = aes(color = Competition)) +
  stat_smooth(method = "lm", se = FALSE)

The plot reveals a strong, positive and roughly linear relationship between competition goals and matches. The overall trend is that the more games Ronaldo plays in a competition, the more goals he scores.
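
One way to put a number on “strong” is Pearson’s correlation coefficient. The sketch below is not part of the original analysis: it re-enters the goals and matches from the Goals_n_Matches table as two plain vectors (the names goals and matches are my own) and calls base R’s cor().

```r
# Values copied row by row from the Goals_n_Matches table
# (Copa del Rey first, UEFA Supercup last)
goals   <- c(22,  3,  84, 14, 7, 310,  3, 4, 2, 121, 2)
matches <- c(30, 12, 196, 26, 8, 291, 25, 7, 3, 158, 2)

# Pearson correlation between matches played and goals scored
r <- cor(matches, goals)
r
```

A value close to 1 agrees with the visual impression from the plot. Keep in mind that there are only 11 points and that the three biggest competitions carry most of the weight.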

lm(Competition_Goals ~ Matches, data = Goals_n_Matches)

Call:
lm(formula = Competition_Goals ~ Matches, data = Goals_n_Matches)

Coefficients:
(Intercept)      Matches  
    -9.4125       0.8912  

The regression equation is $\widehat{\text{CompGoals}} = 0.8912 \times \text{Matches} - 9.41$.

The slope of this equation is 0.8912, which indicates that each additional match is associated with an increase of about 0.89 goals. (For every 100 additional matches played in a competition, the predicted goal total rises by about 89.)
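
The difference between the predicted total and the predicted increase can be made concrete with a little arithmetic. The sketch below just plugs the coefficients printed by lm() into the fitted line; predict_goals is a hypothetical helper of my own, not a function from any package.

```r
# Coefficients as printed in the lm() output above
intercept <- -9.4125
slope     <- 0.8912

# Hypothetical helper: fitted line from the regression equation
predict_goals <- function(matches) intercept + slope * matches

total_100 <- predict_goals(100)                       # predicted goals in a 100-match competition
gain_100  <- predict_goals(200) - predict_goals(100)  # change from 100 additional matches
c(total_100 = total_100, gain_100 = gain_100)
```

Because of the negative intercept, the predicted total for 100 matches is a little under 80 goals, whereas 100 additional matches add about 89 goals to the prediction. The slope describes the second quantity, not the first.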

5.2.4 Goals per Season

Goals %>% 
  group_by(Season, Club) %>% 
  summarise(Total_Goals = n()) %>%
  ggplot(mapping = aes(x = as.factor(Season), y = Total_Goals)) +
  geom_point(mapping = aes(size = Total_Goals, color = Club)) +
  ggtitle("Ronaldo's Club Goals per Season") +
  xlab("Season") + ylab("Goals")

Based on the graph, Ronaldo’s scoring has improved over the course of his career. During the 6-year stretch from 2010 to 2015, he scored 40 or more goals in every single season. His highest-scoring season was 2014-15, when he netted more than 60 goals. His goal totals in his first 4 seasons are low, simply because he was still a “baby” back then and did not get much playing time.

5.2.5 Multiple Goals

Ronaldo is a great scorer, and we’ve seen him score multiple goals in a match many times. This table shows in how many matches Ronaldo has put the ball in the opponent’s net more than once. I used the following soccer lingo for the number of goals scored in a match:

  • Brace = 2 goals
  • Hat-trick = 3 goals
  • Poker = 4 goals
  • Glut = 5 goals
kable(MultipleGoals <- Goals %>%
  group_by(Season, Competition, Competition_Type, Club, Opponent, Opponents_Country, Treated_as, Final_Score) %>% 
  summarise(Scored = n()) %>% 
  filter(Scored > 1) %>%
  group_by(Scored) %>% 
  summarise(Total = n()) %>% 
  spread(key = Scored, value = Total) %>%
  rename(Brace = "2","Hat-trick" = "3", Poker = "4", Glut = "5") %>%
  gather(Brace, 'Hat-trick', Poker, Glut, key = Scored, value = Total))

Scored     Total
Brace      104
Hat-trick  37
Poker      6
Glut       2

Ronaldo has played against opponents from many different nations in his career. The next table, which is also named MultipleGoals (so it replaces the previous one), lists the opponents’ countries and the number of matches in which Ronaldo scored more than 1 goal against teams from each country. For example, if Country = France and HowManyTimes = 4, CR7 has scored multiple goals in 4 matches against French teams. (This is not the same as the number of goals he has scored against French teams!)
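
The distinction between counting goals and counting multi-goal matches is easy to miss, so here is a minimal sketch on made-up data. The toy_goals table and its Match identifier are hypothetical; the real dataset has no match ID, which is why the analysis groups by all the match-level columns instead.

```r
library(dplyr)

# Hypothetical data: one row per goal, with a made-up match identifier
toy_goals <- data.frame(
  Match   = c("m1", "m1", "m2", "m3", "m3", "m3"),
  Country = c("France", "France", "France", "Spain", "Spain", "Spain")
)

multi <- toy_goals %>%
  count(Match, Country, name = "Scored") %>%  # goals scored in each match
  filter(Scored > 1) %>%                      # keep only multi-goal matches
  count(Country, name = "HowManyTimes")       # number of such matches per country
multi
```

In this toy example there are 3 goals against French teams but only 1 multi-goal match, because the single goal in match m2 is filtered out.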

kable(MultipleGoals <- Goals %>%
  group_by(Season, Competition, Competition_Type, Club, Opponent, Opponents_Country, Treated_as, Final_Score) %>% 
  summarise(Scored = n()) %>% 
  filter(Scored > 1) %>% 
  group_by(Opponents_Country) %>%
  rename(Country = Opponents_Country) %>% 
  summarise(HowManyTimes = n()))

Country      HowManyTimes
Cyprus       3
Denmark      1
England      24
France       4
Germany      8
Italy        4
Japan        1
Netherlands  2
Portugal     1
Russia       1
Spain        92
Sweden       2
Switzerland  1
Turkey       2
Ukraine      3

The table below lists all the countries in the MultipleGoals table with their latitude (lat) and longitude (long). I used the lat and long of each nation’s capital or one of its biggest cities to represent the whole nation.

kable(Places <- tribble(
  ~Country, ~lat, ~long,
  "Cyprus", 35.2, 33.4,
  "Denmark", 55.7, 12.6,
  "England", 51.5, -0.1,
  "France", 48.9, 2.3,
  "Germany", 52.5, 13.4,
  "Italy", 41.9, 12.5,
  "Netherlands", 52.4, 4.9,
  "Portugal", 38.7, -9.1,
  "Russia", 55.8, 37.6,
  "Spain", 40.4, -3.7,
  "Sweden", 59.3, 18.1,
  "Switzerland", 47.6, 7.6,
  "Turkey", 41.0, 29.0,
  "Ukraine", 50.5, 30.5,
  "Japan", 35.7, 139.8))

Country      lat   long
Cyprus       35.2  33.4
Denmark      55.7  12.6
England      51.5  -0.1
France       48.9  2.3
Germany      52.5  13.4
Italy        41.9  12.5
Netherlands  52.4  4.9
Portugal     38.7  -9.1
Russia       55.8  37.6
Spain        40.4  -3.7
Sweden       59.3  18.1
Switzerland  47.6  7.6
Turkey       41.0  29.0
Ukraine      50.5  30.5
Japan        35.7  139.8

Next, I used full_join() to join the 2 tables Places and MultipleGoals and named the new table “Opponents”.

kable(Opponents <- full_join(Places, MultipleGoals, by = "Country"))

Country      lat   long   HowManyTimes
Cyprus       35.2  33.4   3
Denmark      55.7  12.6   1
England      51.5  -0.1   24
France       48.9  2.3    4
Germany      52.5  13.4   8
Italy        41.9  12.5   4
Netherlands  52.4  4.9    2
Portugal     38.7  -9.1   1
Russia       55.8  37.6   1
Spain        40.4  -3.7   92
Sweden       59.3  18.1   2
Switzerland  47.6  7.6    1
Turkey       41.0  29.0   2
Ukraine      50.5  30.5   3
Japan        35.7  139.8  1

World <- map_data("world")

This map shows the countries whose teams have conceded multiple goals to Ronaldo in a single match.

World %>% 
  ggplot(mapping = aes(x = long, y = lat)) +
  geom_polygon(mapping = aes(group = group), fill = "lightgrey") +
  geom_point(data = Opponents, mapping = aes(x = long, y = lat, size = HowManyTimes, color = Country)) +
  coord_fixed(ratio = 1.8, xlim = c(-5,138), ylim = c(33, 75)) +
  theme(legend.position = "none") 

The bigger the point, the more matches in which Ronaldo has scored more than 1 goal against teams from the country it represents. Spain (the very big blue point) and England (the big orange point) are the 2 nations whose teams have most often conceded more than 1 Ronaldo goal in a game. All but 1 of the countries are European; the exception is Japan (the lonely dot at the right of the map).

5.2.6 Types of Goal

Goals %>% 
  group_by(Type_of_Goal) %>% 
  summarise(count = n()) %>% 
  kable()

Type_of_Goal       count
Direct free kick   45
Header             91
Left-footed shot   92
Penalty            96
Right-footed shot  234
Tap-in             14

Goals %>% 
  ggplot(mapping = aes(Type_of_Goal)) +
  geom_bar(fill = c(7:12))

Overall, Ronaldo has scored more goals with right-footed shots (234) than with any other type of goal, which is not surprising because he is right-footed. He is also an all-around scorer: his headers (91), left-footed shots (92) and penalties (96) are about equal in number. Ronaldo has also turned 45 direct free kicks into goals, and tap-ins (14) are the rarest of the 6 types of goal.