Blog Post 8: Visualizations & Tables + Further Data Exploration

2022-04-29

First, load and clean the data:

library(tidyverse)
source(
  here::here("static", "load_and_clean_data.R"),
  echo = FALSE
)

We plan several improvements to polish our visualizations. First, we will switch the theme to ggplot2's theme_minimal(), since we believe it shows the colored points for each country most clearly while keeping the gridlines as a reference for reading off values. Second, we will make sure the y axis tick labels are shown in scientific notation by passing a formatting function to scale_y_continuous(). Third, we will replace the column names from our dataset with clearer axis labels: log(GDP($)) on the x axis, since GDP is plotted on a log scale, and Total Passengers (count) on the y axis. We will also use the labs_pubr() function from the ggpubr package to improve the formatting of the labels. Finally, we will add a title with labs() and center it by setting hjust to 0.5 in theme(plot.title = element_text(...)).

library(ggpubr)

# Ten country entries with the highest GDP growth in 2019
high_growth <- plane_data_join_GDP %>%
  filter(Year == 2019) %>%
  group_by(Country) %>%
  transmute(FG_wac, GDPGrowth) %>%
  arrange(desc(GDPGrowth)) %>%
  unique() %>%
  head(10)

# Total passengers and log(GDP) per year for those countries, before 2020
p1 <- plane_data_join_GDP %>%
  filter(Year < 2020, FG_wac %in% high_growth$FG_wac) %>%
  group_by(Year, Country) %>%
  summarize(TotalPassengers = sum(Total), gdp = log(GDP), Country) %>%
  unique() %>%
  ggplot(aes(gdp, TotalPassengers)) +
  geom_point(aes(color = Country))

p1 +
  theme_minimal() +
  scale_y_continuous(labels = function(x) format(x, scientific = TRUE)) +
  labs_pubr(base_size = 11, base_family = "") +
  labs(
    x = "log(GDP($))",
    y = "Total Passengers (count)",
    title = "Log(GDP) vs Total Passengers per Country"
  ) +
  theme(plot.title = element_text(hjust = 0.5))
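
The plot above depends on our cleaned dataset, so here is a minimal sketch of the same styling steps on a small made-up data frame, in case it helps to try them in isolation. The `toy` data frame, its values, and the names `sci_labels` and `p_toy` are all hypothetical and are not part of our project.

```r
library(ggplot2)
library(ggpubr)

# Hypothetical toy data: the values are invented purely to exercise the styling.
toy <- data.frame(
  Country = rep(c("A", "B"), each = 3),
  gdp = log(c(1e9, 2e9, 3e9, 5e10, 6e10, 7e10)),
  TotalPassengers = c(1.2e5, 1.5e5, 1.9e5, 2.4e6, 2.9e6, 3.1e6)
)

# Same label formatter as in the plot above, pulled out into a named helper.
sci_labels <- function(x) format(x, scientific = TRUE)

p_toy <- ggplot(toy, aes(gdp, TotalPassengers)) +
  geom_point(aes(color = Country)) +
  theme_minimal() +
  scale_y_continuous(labels = sci_labels) +
  labs_pubr(base_size = 11, base_family = "") +
  labs(
    x = "log(GDP($))",
    y = "Total Passengers (count)",
    title = "Toy example: Log(GDP) vs Total Passengers"
  ) +
  theme(plot.title = element_text(hjust = 0.5))  # center the title

p_toy
```

Pulling the formatter out into a named function makes it easy to check what the tick labels will look like, for example by calling sci_labels(1e6) at the console, before applying it to the full plot.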