  • Question 1
    • Q1a
    • Q1b
    • Q1c
    • Q1d
  • Question 2
    • Q2a
    • Q2b
  • Question 3
    • Q3a.
    • Q3b.
    • Q3c.
  • Question 4
    • Q4a
    • Q4b
    • Q4c
    • Q4d
    • Q4e
    • Q4f
    • Q4g
  • The following are the Python libraries we need for this homework:
import pandas as pd
import numpy as np

Question 1

  • The following is the list, a:
a = [0.1, 1.2, 2.3, 3.4, 4.5]

Q1a

Write Python code that uses the list, a, to create the following NumPy array:

array([0.1, 1.2, 2.3, 3.4, 4.5])


Answer

arr_a = np.array(a)

Q1b

Write Python code that uses the list, a, to create the following pandas Series:

########################
# Index      0

# 0         0.1
# 1         1.2
# 2         2.3
# 3         3.4
# 4         4.5

# dtype: float64    
########################


Answer

s = pd.Series(a)
s

Q1c

Write Python code that uses the list, a, and a custom index to create the following pandas Series:

########################
# Index      0

# a         0.1
# b         1.2
# c         2.3
# d         3.4
# e         4.5

# dtype: float64    
########################


Answer

s = pd.Series(a, 
              index = ['a','b','c','d','e'])
s

Q1d

Write Python code that uses (1) the Series created in Q1c and (2) Boolean indexing to get the following Series:

########################
# Index      0

# c         2.3
# d         3.4
# e         4.5

# dtype: float64    
########################  


Answer

s[ s > 2 ]
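
Boolean indexing works in two steps: the comparison builds a Boolean Series, and that Series then selects the rows whose value is True. The following is a minimal sketch that rebuilds the Series from Q1c; the names mask and selected are illustrative and not part of the homework.

```python
import pandas as pd

a = [0.1, 1.2, 2.3, 3.4, 4.5]
s = pd.Series(a, index=['a', 'b', 'c', 'd', 'e'])

# Step 1: the comparison returns a Boolean Series with the same index as s
mask = s > 2

# Step 2: passing the mask to [] keeps only the rows where mask is True
selected = s[mask]

print(mask)
print(selected)
```

Writing s[ s > 2 ] does both steps in one expression.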




Question 2

The next lines create a list of tuples. Each tuple holds a percentile and the household income at that percentile.

hh_income = [ (10, 14629), (20, 25600), (30, 37002),
              (40, 50000), (50, 63179), (60, 79542),
              (70, 100162), (80, 130000), (90, 184292) ]


Q2a

Write Python code that uses the list, hh_income, to create the following NumPy array and assign it to hh_income_array:

############################

# array([[    10,  14629],
#        [    20,  25600],
#        [    30,  37002],
#        [    40,  50000],
#        [    50,  63179],
#        [    60,  79542],
#        [    70, 100162],
#        [    80, 130000],
#        [    90, 184292]])

############################

Answer

hh_income_array = np.array(hh_income)


Q2b

Write Python code that uses the print() function to report the dimensions of hh_income_array and the number of elements in it, as follows:

Dimensions of the NumPy array, hh_income_array, is: (9, 2)
Number of elements in the NumPy array, hh_income_array, is: 18


Answer

# print() already inserts one space between its arguments, so the strings end at the colon
print("Dimensions of the NumPy array, hh_income_array, is:", hh_income_array.shape)
print("Number of elements in the NumPy array, hh_income_array, is:", hh_income_array.size)
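
As a quick check on the two attributes, the number of elements of a 2-D array equals the number of rows times the number of columns. The sketch below rebuilds hh_income_array from Q2a; the names n_rows, n_cols, shape_line, and size_line are illustrative, and the f-strings are just one possible way to produce the requested lines.

```python
import numpy as np

hh_income = [(10, 14629), (20, 25600), (30, 37002),
             (40, 50000), (50, 63179), (60, 79542),
             (70, 100162), (80, 130000), (90, 184292)]
hh_income_array = np.array(hh_income)

# .shape is a tuple (rows, columns); .size is the total number of elements
n_rows, n_cols = hh_income_array.shape
print(n_rows * n_cols == hh_income_array.size)

# f-strings give full control over the spacing of the printed line
shape_line = f"Dimensions of the NumPy array, hh_income_array, is: {hh_income_array.shape}"
size_line = f"Number of elements in the NumPy array, hh_income_array, is: {hh_income_array.size}"
print(shape_line)
print(size_line)
```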




Question 3

  • The following is the NumPy array, c:
c = np.array([ [1.0, 2], [3, 4] ])

Q3a.

Write Python code that uses the NumPy array, c, to create the following DataFrame:

############################

# index     0    1
# 0         1.0  2.0
# 1         3.0  4.0

############################


Answer

df = pd.DataFrame(c)
df


Q3b.

Write Python code that uses the NumPy array, c, to create the following DataFrame with named columns:

############################

# index     dogs    cats
# 0         1.0     2.0
# 1         3.0     4.0

############################


Answer

df = pd.DataFrame(c, columns=['dogs','cats'])
df


Q3c.

Write Python code that uses the NumPy array, c, to create the following DataFrame with named columns and a custom index:

############################

# index             dogs    cats
# byeong-hak        1.0     2.0
# your_first_name   3.0     4.0

############################


Answer

df = pd.DataFrame(c, 
                  columns=['dogs','cats'],
                  index = ['byeong-hak', 'your_first_name'])
df




Question 4

Download the file, US_state_GDP.zip, from the Files section in our Canvas. Extract the zip file, US_state_GDP.zip, to use the CSV file, US_state_GDP.csv.

Assign the absolute path of the file, US_state_GDP.csv, to path_csv as a string.

####################################################################################################################################
# For example

# path_csv = '/Users/byeong-hakchoe/Google Drive/suny-geneseo/teaching-materials/lecture-data/US_state_GDP.csv'
# path_csv = 'C:/byeong-hakchoe/Google Drive/suny-geneseo/teaching-materials/lecture-data/US_state_GDP.csv'

####################################################################################################################################

Q4a

Read the data file, US_state_GDP.csv, into a DataFrame named state_gdp using (1) path_csv and (2) the pd.read_csv() function.


Answer

# This is an example of the absolute path of the CSV file
path_csv = '/Users/byeong-hakchoe/Google Drive/suny-geneseo/teaching-materials/lecture-data/US_state_GDP.csv'
state_gdp = pd.read_csv(path_csv)


Q4b

Write Python code that uses the DataFrame, state_gdp, to create a DataFrame whose first five rows are as follows:

############################################

# index    state_code           state

# 0          AK                Alaska
# 1          AL               Alabama
# 2          AR              Arkansas
# 3          AZ               Arizona
# 4          CA            California

############################################


Answer

state_gdp[ [ 'state_code', 'state' ] ]


Q4c

Write Python code that uses (1) the DataFrame, state_gdp, and (2) state_gdp.columns to create a DataFrame whose first five rows are as follows:

############################################

#                    state  gdp_2009

# 0                 Alaska     44215
# 1                Alabama    149843
# 2               Arkansas     89776
# 3                Arizona    221405
# 4             California   1667152

############################################


Answer

cols = state_gdp.columns
state_gdp[ cols[1:3] ]
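
The answer relies on the column order of the file. The sketch below uses a small stand-in DataFrame, toy (a hypothetical name), built from the first rows shown above, and it assumes the columns appear in the order state_code, state, gdp_2009, as the expected outputs in Q4b and Q4c suggest. The real file has more columns than this.

```python
import pandas as pd

# Stand-in for state_gdp; the column order is an assumption
toy = pd.DataFrame({
    'state_code': ['AK', 'AL', 'AR', 'AZ', 'CA'],
    'state': ['Alaska', 'Alabama', 'Arkansas', 'Arizona', 'California'],
    'gdp_2009': [44215, 149843, 89776, 221405, 1667152],
})

# .columns is an Index of column names and supports positional slicing
cols = toy.columns
picked = cols[1:3]      # positions 1 and 2; position 3 is excluded

# Passing an Index (or list) of names to [] selects those columns
subset = toy[picked]
print(list(picked))
print(subset)
```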


Q4d

Write Python code to get the first three rows of the DataFrame, state_gdp:


Answer

# Positions start at 0 and the stop position is excluded, so 0:3 gives rows 0, 1, and 2
state_gdp[ 0:3 ]
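
Row slices with [] follow Python's usual convention: positions start at 0, the start position is included, and the stop position is excluded. The following sketch compares two slices on a small stand-in DataFrame, toy (a hypothetical name), built from the rows shown in Q4b and Q4c.

```python
import pandas as pd

# Stand-in for state_gdp with only the columns shown in the questions above
toy = pd.DataFrame({
    'state_code': ['AK', 'AL', 'AR', 'AZ', 'CA'],
    'state': ['Alaska', 'Alabama', 'Arkansas', 'Arizona', 'California'],
    'gdp_2009': [44215, 149843, 89776, 221405, 1667152],
})

first_three = toy[0:3]    # rows at positions 0, 1, 2
skips_first = toy[1:3]    # rows at positions 1, 2 only

print(first_three)
print(skips_first)

# head(3) is an equivalent way to get the first three rows
print(first_three.equals(toy.head(3)))
```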


Q4e

Write Python code to get all the rows of the DataFrame, state_gdp, for which the value of gdp_growth_2010 is less than 0.


Answer

state_long_recession = state_gdp['gdp_growth_2010'] < 0
state_gdp[ state_long_recession ]


Q4f

Write Python code that uses state_gdp.loc[] to get the following DataFrame:


############################################

#       state  gdp_growth_2010

# 0    Alaska             -1.7
# 3   Arizona             -0.2
# 33   Nevada             -0.4
# 50  Wyoming             -1.3

############################################

Answer

state_gdp.loc[ state_long_recession,['state', 'gdp_growth_2010'] ]
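
With .loc[], the first argument selects rows and the second selects columns, and a Boolean Series is accepted as the row selector. The following is a minimal sketch on a stand-in DataFrame, toy (a hypothetical name). The growth rates for Alaska and Arizona come from the expected output above; the values for Alabama and Arkansas are made-up placeholders, not data from the file.

```python
import pandas as pd

toy = pd.DataFrame({
    'state_code': ['AK', 'AL', 'AR', 'AZ'],
    'state': ['Alaska', 'Alabama', 'Arkansas', 'Arizona'],
    # 1.0 is a placeholder value for illustration only
    'gdp_growth_2010': [-1.7, 1.0, 1.0, -0.2],
})

# Row selector: True where growth in 2010 was negative
mask = toy['gdp_growth_2010'] < 0

# Rows come from the mask, columns from the list of labels
result = toy.loc[mask, ['state', 'gdp_growth_2010']]
print(result)
```

The selected rows keep their original index labels, which is why the expected output above shows 0, 3, 33, and 50 rather than 0 through 3.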


Q4g

Write Python code that uses state_gdp.iloc[] to get the following DataFrame:


############################################

#    state_code     state

# 10         GA   Georgia
# 11         HI    Hawaii
# 12         IA      Iowa
# 13         ID     Idaho
# 14         IL  Illinois

############################################

Answer

state_gdp.iloc[ 10:15, :2 ]
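
The difference between .iloc[] and .loc[] is easiest to see when index labels and positions differ: .iloc[] slices by integer position and excludes the stop, while .loc[] slices by label and includes the stop. The sketch below uses a stand-in DataFrame, toy (a hypothetical name), that holds only the five rows shown above and keeps their labels 10 to 14.

```python
import pandas as pd

toy = pd.DataFrame(
    {'state_code': ['GA', 'HI', 'IA', 'ID', 'IL'],
     'state': ['Georgia', 'Hawaii', 'Iowa', 'Idaho', 'Illinois']},
    index=[10, 11, 12, 13, 14],
)

# Position-based: rows at positions 0 and 1, first two columns
by_position = toy.iloc[0:2, :2]

# Label-based: rows labeled 10 through 11 (stop label included)
by_label = toy.loc[10:11, ['state_code', 'state']]

print(by_position)
print(by_position.equals(by_label))
```

In state_gdp the default index makes labels and positions coincide, so 10:15 in .iloc[] picks the rows labeled 10 through 14.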