Blog Post 7

2022-04-22

First, load and clean the data.

library(tidyverse)
source(
  here::here("static", "load_and_clean_data.R"),
  echo = FALSE
)

In addition to the annual GDP data joined previously, we joined annual GDP growth data to the travel data. We then take the 10 countries with the highest GDP growth in 2019.

# Keep one row per country for 2019, then take the 10 highest GDP growth rates
high_growth <- plane_data_join_GDP %>%
  filter(Year == 2019) %>%
  distinct(Country, FG_wac, GDPGrowth) %>%
  arrange(desc(GDPGrowth)) %>%
  head(10)

(knitr::kable(high_growth))
|Country          | FG_wac| GDPGrowth|
|:----------------|------:|---------:|
|Ethiopia         |    522|  8.364086|
|Bangladesh       |    703|  8.152684|
|Armenia          |    405|  7.600000|
|Dominica         |    221|  7.571392|
|Cambodia         |    709|  7.054107|
|Vietnam          |    791|  7.017435|
|Marshall Islands |    844|  6.644664|
|Ghana            |    529|  6.507775|
|Philippines      |    766|  6.118526|
|China            |    713|  5.949714|

Plot total passenger travel against log(GDP) for these 10 countries. We take the log to account for the large differences in GDP between them.

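# Total passengers per country and year; GDP is joined per country and year,
# so keeping it in the grouping carries it through to the plot.
plane_data_join_GDP %>%
  filter(Year < 2020, FG_wac %in% high_growth$FG_wac) %>%
  group_by(Country, Year, GDP) %>%
  summarize(TotalPassengers = sum(Total), .groups = "drop") %>%
  mutate(gdp = log(GDP)) %>%
  ggplot(aes(gdp, TotalPassengers)) +
  geom_point(aes(color = Country)) +
  labs(x = "log(GDP)", y = "Total passengers")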
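To make the reason for the log concrete, here is a minimal sketch using two made-up GDP values. The numbers and the names `small_economy` and `large_economy` are assumptions for illustration only and are not taken from `plane_data_join_GDP`.

```r
# Hypothetical GDP values (in US dollars) for a small and a large economy;
# these are illustrative numbers, not values from plane_data_join_GDP.
gdp <- c(small_economy = 6e8, large_economy = 1.4e13)

# On the raw scale the large economy is tens of thousands of times bigger,
# so the small one would be squashed against the start of the axis.
raw_ratio <- unname(gdp["large_economy"] / gdp["small_economy"])

# On the natural-log scale (what log() computes) the same gap becomes an
# additive difference equal to log(raw_ratio), which fits on one axis.
log_gap <- unname(diff(log(gdp)))

print(raw_ratio)
print(log_gap)
```

A multiplicative gap in GDP becomes an additive gap after the log, which is why countries of very different sizes can share one axis.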

A tentative thesis suggested by the plot is that countries with high GDP growth show a stronger correlation between GDP and travel from the US. One possible issue is that the US doesn’t necessarily have direct flights to some of these countries.
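One way to put a number on this thesis would be to compute, for each country, the correlation between log(GDP) and total passengers across years. The sketch below is only an illustration under stated assumptions: `toy` is a hypothetical stand-in for a per-country, per-year summary, and its values are made up rather than drawn from our data.

```r
library(dplyr)

# Toy stand-in for a per-country, per-year summary. The countries "A" and "B"
# and all numbers are made up for illustration; they are NOT real data.
toy <- tibble(
  Country = rep(c("A", "B"), each = 4),
  Year = rep(2016:2019, times = 2),
  GDP = c(1e9, 2e9, 4e9, 8e9, 5.0e11, 5.2e11, 5.4e11, 5.6e11),
  TotalPassengers = c(100, 200, 300, 400, 900, 700, 800, 600)
)

# Pearson correlation between log(GDP) and passenger totals, one row per country
country_cor <- toy %>%
  group_by(Country) %>%
  summarize(cor_log_gdp_passengers = cor(log(GDP), TotalPassengers))

print(country_cor)
```

With the real data, the same `group_by()` and `cor()` pattern could be run on the high-growth countries and on a comparison group. With only a few years per country, any such correlation would be a rough indication at best.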