Loading R packages for Homework Assignment 1

library(tidyverse)



Question 1

Q1a

Download the compressed file, bikeshare-2011-01-01.zip, from the Files section in our Canvas web-site. Extract the file, bikeshare-2011-01-01.zip, so that you can use the file, bikeshare-2011-01-01.csv. Read the data file, bikeshare-2011-01-01.csv, as the data.frame object with the name, bikeshare2011_01_01, using (1) the read_csv() function and (2) the absolute path name of the file bikeshare_2011_01_01.csv from your local hard disk drive in your laptop.



Q1b

Report the mean, median, minimum, maximum, and standard deviation for each numeric variable in the data.frame bikeshare2011_01_01.




Question 2

Q2a

Read the data file, bikeshare_cleaned.csv, as the data.frame object with the name, bikeshare, using (1) the read_csv() function and (2) its URL, https://bcdanl.github.io/data/bikeshare_cleaned.csv.



Use the data.frame bikeshare for the rest of questions in Question 2.

Description of variables in the data file, bikeshare_cleaned.csv

The data set, bikeshare_cleaned.csv, includes 17376 observations of hourly counts from 2011 to 2012 for bike rides (rentals) in Washington D.C.


  • cnt: count of total bikes rented out
  • year: year
  • month: month
  • date: date
  • hr: hours
  • wkday: week day
  • holiday: holiday if holiday == 1; non-holiday otherwise
  • seasons: season
  • weather_cond: weather condition
  • temp: temperature, measured in standard deviations from average.
  • hum: humidity, measured in standard deviations from average.
  • windspeed: wind speed, measured in standard deviations from average.



Q2b

Provide both (1) ggplot codes and (2) a couple of sentences to describe the distribution of cnt.



Q2c

Provide both (1) ggplot codes and (2) a couple of sentences to describe the distribution of cnt by year and month.



Q2d

Provide both (1) ggplot codes and (2) a couple of sentences to describe the distribution of temp by year and month.



Q2e

Provide both (1) ggplot codes and (2) a couple of sentences to describe the distribution of hum by year and month.



Q2f

Provide both (1) ggplot codes and (2) a couple of sentences to describe the distribution of windspeed by year and month.



Q2g

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between temp and cnt.



Q2h

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between temp and cnt by year and month.



Q2i

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between weather_cond and cnt.



Q2j

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between weather_cond and cnt by hr.



Q2k

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between wkday and cnt.



Q2l

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between wkday and cnt by hr.




Question 3

Q3a

Read the data file, NY_school_enrollment_socioecon.csv, as the data.frame object with the name, NY_school_enrollment_socioecon, using (1) the read_csv() function and (2) its URL, https://bcdanl.github.io/data/NY_school_enrollment_socioecon.csv.



For description of variables in NY_school_enrollment_socioecon, refer to the file, ny_school_enrollment_socioecon_description.zip, which is in the Files section in our Canvas web-page. (I recommend you to extract the zip file, and then read the file, ny_school_enrollment_socioecon_description.csv, using Excel or Numbers.)



Q3b

Provide both (1) ggplot codes and (2) a couple of sentences to describe the relationship between college enrollment and educational attainment of population 45 to 64 years, and how such relationship varies by the type (public or private) of colleges.



Q3c

Provide both (1) ggplot codes and (2) a couple of sentences to describe how the relationships described in Q3b vary by gender of population 45 to 64 years.



Q3d

Provide both (1) ggplot codes and (2) a couple of sentences to describe how the relationships described in Q3b vary by gender of college enrollment.