B Data Visualization

B.1 Barplots

  1. Using geom_bar(), create a barplot of the frequencies of cars at each level of carb from the dataset carDt (use load() to load the dataset first).
library(ggplot2)
load('HWdatasets/carDt.RDA')
# geom_bar() counts the rows at each value of carb
ggplot(carDt, aes(x=carb)) + geom_bar()
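
Since carDt is only available from the course files, the sketch below uses the built-in mtcars data as a stand-in (an assumption; mtcars happens to have a carb column too). It shows what geom_bar() computes by default: a row count per x value, which should agree with a plain frequency table.

```r
library(ggplot2)

# Stand-in for carDt (assumption): the built-in mtcars data, which has a carb column.
cars_demo <- mtcars

# geom_bar() uses stat = "count" by default: one bar per carb value,
# with height equal to the number of rows at that value.
p_count <- ggplot(cars_demo, aes(x = carb)) + geom_bar()

# The bar heights computed by ggplot2 ...
bar_heights <- layer_data(p_count)$count

# ... can be compared with a plain frequency table of the same column.
freq_table <- as.vector(table(cars_demo$carb))
```

Printing p_count draws the chart; layer_data() is only used here to inspect the numbers behind the bars.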

  2. Add to the barchart from question 1 a different fill color for each bar, mapped to the variable gear. Then:
     1. Add a title “Count of cars by each carb type”
     2. Add a subtitle “The count for each carb type is divided by number of gears”
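# If gear is stored as a number rather than a factor, map fill = factor(gear) instead
ggplot(carDt, aes(x=carb, fill=gear)) + geom_bar() +
  ggtitle('Count of cars by each carb type', 'The count for each carb type is divided by number of gears')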
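
A fill mapping only splits the bars when the mapped variable is discrete. The sketch below illustrates this with mtcars as a stand-in for carDt (an assumption), where gear is numeric and is therefore wrapped in factor(); whether carDt needs the same conversion depends on how its gear column is stored.

```r
library(ggplot2)

# Stand-in for carDt (assumption): mtcars, where gear is stored as a number.
cars_demo <- mtcars

# A numeric gear column is converted with factor() so that each gear level
# gets its own fill colour and its own stacked segment.
p_fill <- ggplot(cars_demo, aes(x = carb, fill = factor(gear))) +
  geom_bar() +                       # segments are stacked by default
  labs(
    title    = "Count of cars by each carb type",
    subtitle = "The count for each carb type is divided by number of gears",
    fill     = "gear"                # legend title, instead of "factor(gear)"
  )

# One stacked segment per carb/gear combination present in the data.
segments <- layer_data(p_fill)
```

labs(title = , subtitle = ) is an alternative to ggtitle() that also lets the legend title be set in the same call.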

B.2 Line Chart

  3. Use the beerDt dataset to plot a line chart of the alcohol consumption per capita by year.
     1. Add a title ‘Alcohol Consumption in US from 1850 to 2015’
     2. Add a subtitle ‘Quantity is expressed in gallons per capita’
     3. Change the color of the line to a color of your choice. Remember that non-data ink should not be declared within aes().
     4. Add a dot for each observation on the line using geom_point().
load('HWdatasets/beerDt.RDa')
ggplot(beerDt, aes(x = Year, y = GallonsCapita)) + 
  ggtitle('Alcohol Consumption in US from 1850 to 2015', 'Quantity is expressed in gallons per capita') +
  geom_line(color = 'firebrick3') + 
  geom_point() + 
  theme_minimal()
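
The difference between setting a color outside aes() and mapping it inside aes() can be checked directly. The sketch below uses a small hypothetical data frame in place of beerDt (the values are made up; only the column names follow the exercise).

```r
library(ggplot2)

# Hypothetical stand-in for beerDt: made-up values, only the column names match.
beer_demo <- data.frame(
  Year          = c(1850, 1900, 1950, 2000),
  GallonsCapita = c(2.1, 2.4, 2.0, 2.2)
)

# Constant colour set outside aes(): the line is drawn in exactly that colour.
p_outside <- ggplot(beer_demo, aes(x = Year, y = GallonsCapita)) +
  geom_line(color = "firebrick3")

# Same string inside aes(): it is treated as data, so ggplot2 builds a colour
# scale for a one-level variable, picks a default hue, and adds a legend.
p_inside <- ggplot(beer_demo, aes(x = Year, y = GallonsCapita)) +
  geom_line(aes(color = "firebrick3"))

# Colours actually used to draw each line.
colour_outside <- unique(layer_data(p_outside)$colour)
colour_inside  <- unique(layer_data(p_inside)$colour)
```

Only p_outside ends up with the requested color; p_inside is drawn with whatever the default color scale assigns.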

  4. Replicate the chart from question 3, adding a geom_line() for the GDP growth rate from the dataset growthDt. Remember that you are dealing with two different datasets, beerDt and growthDt, and that you cannot pass both of them in the same ggplot() call. You have two options:
     1. Pass one dataset to the ggplot() call and the second to the geom_line().
     2. Leave ggplot() empty and pass one dataset to each geom_*.
load('dataset/growthDt.RDA')
ggplot(beerDt, aes(x=Year, y=GallonsCapita)) + geom_line() +
  geom_point() +
  # color is a constant, so it goes outside aes()
  geom_line(data=growthDt, aes(x=Year, y=GrowthRate_s), color='red')
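
The solution above follows the first option. The second option, with an empty ggplot() call, might look like the sketch below; beer_demo and growth_demo are hypothetical stand-ins with made-up values, and only the column names follow the exercise.

```r
library(ggplot2)

# Hypothetical stand-ins for beerDt and growthDt (made-up values).
beer_demo   <- data.frame(Year = c(1990, 2000, 2010),
                          GallonsCapita = c(2.4, 2.2, 2.3))
growth_demo <- data.frame(Year = c(1990, 2000, 2010),
                          GrowthRate_s = c(1.9, 4.1, 2.6))

# ggplot() stays empty, so every layer names its own data and its own aes().
p_two <- ggplot() +
  geom_line(data = beer_demo,   aes(x = Year, y = GallonsCapita)) +
  geom_point(data = beer_demo,  aes(x = Year, y = GallonsCapita)) +
  geom_line(data = growth_demo, aes(x = Year, y = GrowthRate_s), color = "red")
```

This version is more verbose because nothing is inherited from ggplot(), but it makes explicit which dataset feeds which layer.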

B.3 Faceting

  5. Use the dataset titanicDt to create:
     1. A barchart of the survivors, where the fill color maps to the survivors’ gender. Remember that geom_bar() counts rows by default, while in this dataset the column n already contains the count. Change the stat argument to stat="identity" (the default is stat="count") or use geom_col() instead.
     2. Use facet_wrap() to create a different barchart for each level of the variable Class.
     3. Add a complete theme of your choice (e.g. theme_minimal(), theme_classic()).
     4. Add a title and a subtitle to the chart.
load('HWdatasets/titanicDt.RDA')
ggplot(titanicDt, aes(x = Sex, y = n, fill = Survived)) +
  ggtitle('The Titanic shipwreck', 'Victims by class') +
  geom_bar(stat='identity') +
  facet_wrap(~Class) +
  theme_classic()
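
To illustrate that geom_col() and geom_bar(stat = "identity") are interchangeable for pre-counted data, the sketch below uses the built-in Titanic table as a stand-in for titanicDt (an assumption; its count column is called Freq rather than n).

```r
library(ggplot2)

# Stand-in for titanicDt (assumption): the built-in Titanic table, flattened
# to a data frame with columns Class, Sex, Age, Survived and Freq.
titanic_demo <- as.data.frame(Titanic)

# The counts are already in the data, so the bars use Freq as-is.
p_identity <- ggplot(titanic_demo, aes(x = Sex, y = Freq, fill = Survived)) +
  geom_bar(stat = "identity") +
  facet_wrap(~Class)

# geom_col() is the shorthand for the same thing.
p_col <- ggplot(titanic_demo, aes(x = Sex, y = Freq, fill = Survived)) +
  geom_col() +
  facet_wrap(~Class)

# Bar geometry behind each plot, one row per stacked segment.
bars_identity <- layer_data(p_identity)
bars_col      <- layer_data(p_col)
```

Either plot can then take a complete theme and a title in the same way as the solution above.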