library(tidyverse)
library(readr)
The dataset includes the candidates’ tweets posted between February 1st and March 10th. It was retrieved from the Twitter API with the help of the rtweet package. This code vignette explains how I downloaded the Twitter data. The raw data can be accessed through the GitHub repo for this work.
all_mayors_tweets <- readRDS("~/Desktop/tweets/rds_data/all_mayors_tweets.rds")
# Remove the retweets and replace the screen names with the candidates' full names
mayors_tweets <- all_mayors_tweets %>%
  filter(is_retweet == FALSE) %>%
  mutate(screen_name = recode(screen_name,
    "BA_Yildirim" = "Binali Yıldırım",
    "ekrem_imamoglu" = "Ekrem İmamoğlu",
    "mansuryavas06" = "Mansur Yavaş",
    "mehmetozhaseki" = "Mehmet Özhaseki",
    "ZeybekciNihat" = "Nihat Zeybekçi",
    "tuncsoyer" = "Tunç Soyer"
  ))
Let’s define a custom theme (we’ll use it later).
theme_custom1 <- function() {
  theme_minimal() +
    theme(
      text = element_text(family = "Proxima Nova", color = "gray25"),
      plot.title = element_text(face = "bold", size = 14),
      plot.subtitle = element_text(size = 13),
      axis.text.x = element_text(size = 11),
      axis.text.y = element_text(size = 11),
      plot.caption = element_text(size = 11, color = "gray30"),
      plot.background = element_rect(fill = "#f6f5f5"),
      legend.position = "none",
      strip.background = element_rect(colour = "#d9d9d9", fill = "#d9d9d9"),
      strip.text.x = element_text(size = 11, colour = "gray25", face = "bold")
    )
}
Lastly, let’s choose a color palette for the charts, with one color per alliance.
custom_col <- c(
  "Binali Yıldırım" = "#e67d31",
  "Ekrem İmamoğlu" = "#3c841d",
  "Mansur Yavaş" = "#3c841d",
  "Mehmet Özhaseki" = "#e67d31"
)
This analysis focuses only on the mayoral candidates in Ankara and İstanbul in the upcoming March 31st local elections in Turkey. Let’s first look at the average number of likes per tweet each candidate received. The table below shows that Mansur Yavaş has the highest average.
mayors_tweets %>%
  group_by(screen_name) %>%
  summarise(average_like = mean(favorite_count)) %>%
  arrange(average_like)
## # A tibble: 6 x 2
##   screen_name     average_like
##   <chr>                  <dbl>
## 1 Tunç Soyer              683.
## 2 Nihat Zeybekçi          978.
## 3 Binali Yıldırım        1412.
## 4 Mehmet Özhaseki        2429.
## 5 Ekrem İmamoğlu         2598.
## 6 Mansur Yavaş           2699.
Another calculation shows that Binali Yıldırım posted more tweets per day than any other mayoral candidate.
# 38 = number of days from February 1st to March 10th, inclusive
mayors_tweets %>%
  count(city, screen_name, sort = TRUE) %>%
  mutate(tweet_per_day = n / 38)
## # A tibble: 6 x 4
##   city     screen_name         n tweet_per_day
##   <chr>    <chr>           <int>         <dbl>
## 1 İstanbul Binali Yıldırım   520         13.7
## 2 İstanbul Ekrem İmamoğlu    356          9.37
## 3 Ankara   Mehmet Özhaseki   312          8.21
## 4 İzmir    Tunç Soyer        199          5.24
## 5 Ankara   Mansur Yavaş      156          4.11
## 6 İzmir    Nihat Zeybekçi    155          4.08
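The divisor 38 used for the tweets-per-day figure is the length of the collection window in days, counting both endpoints. Here is a minimal sketch of deriving it from the dates instead of hardcoding it. It assumes the tweets are from 2019, the year of the March 31st local elections; `window_start`, `window_end`, and `n_days` are illustrative names that are not part of the original analysis.

```r
# Assumption: the collection window is February 1 to March 10, 2019.
window_start <- as.Date("2019-02-01")
window_end   <- as.Date("2019-03-10")

# Inclusive day count: the difference in days, plus one for the first day.
n_days <- as.integer(window_end - window_start) + 1L
n_days

# Tweets per day for the most active account in the table (520 tweets).
round(520 / n_days, 1)
```

February 2019 had 28 days, so the window covers 28 + 10 = 38 days, which matches the constant in the original code.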
Let’s look at the likes each tweet received relative to the candidate’s follower count, plotted over time, for the mayoral candidates in İstanbul. As the chart below shows, İmamoğlu has the highest number of likes relative to his follower count.
mayors_tweets %>%
  filter(city == "İstanbul") %>%
  mutate(popularity = favorite_count / followers_count) %>%
  ggplot(aes(date2, popularity, color = screen_name)) +
  geom_line(size = 0.8, show.legend = FALSE) +
  geom_point(show.legend = FALSE) +
  facet_wrap(~screen_name) +
  theme_custom1() +
  scale_color_manual(values = custom_col)
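To illustrate why likes are divided by the follower count, here is a small sketch on made-up data. The account names and numbers are hypothetical and only show how the ranking can flip once audience size is taken into account.

```r
library(dplyr)

# Hypothetical toy data: two accounts with very different audience sizes.
toy_tweets <- tibble(
  screen_name     = c("account_a", "account_a", "account_b", "account_b"),
  favorite_count  = c(5000, 7000, 1200, 1800),
  followers_count = c(1000000, 1000000, 100000, 100000)
)

# Same normalization as in the chart code: likes per follower for each tweet.
toy_popularity <- toy_tweets %>%
  mutate(popularity = favorite_count / followers_count) %>%
  group_by(screen_name) %>%
  summarise(
    mean_likes      = mean(favorite_count),
    mean_popularity = mean(popularity)
  )

toy_popularity
```

In this toy example `account_a` collects more likes in absolute terms, but `account_b` gets more likes per follower, which is the quantity the charts compare.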
To see the trend in popularity (likes) over time, let’s also look at smoothed lines, which are local regression (LOESS) fits of popularity against date.
# Smoothing the lines (popularity is based on likes, as in the previous chart)
mayors_tweets %>%
  filter(city == "İstanbul") %>%
  mutate(popularity = favorite_count / followers_count) %>%
  ggplot(aes(date2, popularity, color = screen_name)) +
  geom_jitter(size = 3, alpha = 0.2, show.legend = FALSE) +
  theme_custom1() +
  facet_wrap(~screen_name) +
  geom_smooth(show.legend = FALSE) +
  scale_color_manual(values = custom_col) +
  labs(
    x = "", y = "",
    title = "Twitter Popularity of Candidates in İstanbul Based on Their Followers",
    subtitle = "Includes tweets between February 1st and March 10th"
  )
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
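The message above means that `geom_smooth()` picked LOESS (local regression) on its own, which is its default for groups with fewer than 1,000 observations. As a rough sketch of what that smoother computes, the same kind of fit can be produced directly with `stats::loess()`. The data below are simulated and hypothetical, not the candidates’ tweets, and `toy`, `fit`, and `smoothed` are illustrative names.

```r
set.seed(1)

# Hypothetical data: a noisy upward trend over a 38-day window.
toy <- data.frame(day = 1:38)
toy$popularity <- 0.001 * toy$day + rnorm(38, sd = 0.005)

# Local regression with span = 0.75, the default span of geom_smooth() for loess.
fit <- loess(popularity ~ day, data = toy, span = 0.75)

# Fitted values at the observed days; this is the curve geom_smooth() would draw.
toy$smoothed <- predict(fit)

head(toy)
```

A smaller `span` makes the curve follow the points more closely, and a larger one gives a smoother trend.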
We can do the same analysis by filtering the city to “Ankara” and repeating the same steps.
mayors_tweets %>%
  filter(city == "Ankara") %>%
  mutate(popularity = favorite_count / followers_count) %>%
  ggplot(aes(date2, popularity, color = screen_name)) +
  geom_line(size = 0.8, show.legend = FALSE) +
  geom_point(show.legend = FALSE) +
  facet_wrap(~screen_name) +
  theme_custom1() +
  scale_color_manual(values = custom_col)
To see the main trend, we again fit a local regression, which yields a smoothed trend over time.
mayors_tweets %>%
  filter(city == "Ankara") %>%
  mutate(popularity = favorite_count / followers_count) %>%
  ggplot(aes(date2, popularity, color = screen_name)) +
  geom_jitter(size = 3, alpha = 0.2, show.legend = FALSE) +
  theme_custom1() +
  facet_wrap(~screen_name) +
  geom_smooth(show.legend = FALSE) +
  scale_color_manual(values = custom_col) +
  labs(
    x = "", y = "",
    title = "Twitter Popularity of Candidates in Ankara Based on Their Followers",
    subtitle = "Includes tweets between February 1st and March 10th"
  )
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'