## All Weeks Python Data Analysis Coursera Quiz Answers

### Python Data Analysis Coursera Quiz Answers

#### Week 1: Python Data Analysis

#### Quiz 1: Understanding the Data

Q1. How many columns does the dataset have?

- **26**
- 205

#### Quiz 2: Python Packages for Data Science

Q1. What is a Python library?

- A file that contains data.
- **A collection of functions and methods that allows you to perform many actions without writing the code yourself**

#### Quiz 3: Importing and Exporting Data in Python

Q1. What does the following method do to the dataframe `df`: `df.head(12)`?

- **Shows the first 12 rows of the dataframe.**
- Shows the bottom 12 rows of the dataframe.
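A minimal sketch of `head` on a made-up dataframe (the 20-row frame and the column name `n` are assumptions for illustration only); `tail` is shown for contrast:

```python
import pandas as pd

# Hypothetical 20-row dataframe, used only for illustration.
df = pd.DataFrame({"n": range(20)})

first_rows = df.head(12)  # the first 12 rows
last_rows = df.tail(12)   # the last 12 rows, for contrast

print(first_rows.shape)
print(last_rows["n"].iloc[0])
```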

#### Quiz 4: Getting Started Analyzing Data in Python

Q1. What is the correct output of the following?

`df.describe(include="all")`

**b)**

#### Quiz 5: Importing Datasets

Q1. What do we want to predict from the dataset?

- price
- colour
- make

Q2. Select the libraries you will use for this course.

- matplotlib
- pandas
- scikit-learn

Q3. What task does the following command perform?

`df.to_csv("A.csv")`

- change the name of the column to “A.csv”
- load the data from a csv file called “A” into a dataframe
- Save the dataframe df to a csv file called “A.csv”
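A short sketch of a CSV round trip, assuming a made-up two-row dataframe and a temporary directory so nothing is left on disk. Passing `index=False` is a choice made here, not part of the quiz command:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [13950, 16430]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "A.csv")
    df.to_csv(path, index=False)  # save the dataframe to a CSV file
    restored = pd.read_csv(path)  # load it back into a new dataframe

print(restored)
```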

Q4. Consider the segment of the following dataframe:

What is the type of the column make?

- int64
- float64
- object

Q5. How would you generate descriptive statistics for all the columns for the dataframe df?

- `df.describe()`
- `df.describe(include = "all")`
- `df.info`
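To see how these calls differ, here is a small sketch on a made-up dataframe with one text column and one numeric column (the data is an assumption for illustration):

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "make": ["audi", "bmw", "audi"],
    "price": [13950.0, 16430.0, 17450.0],
})

numeric_summary = df.describe()            # numeric columns only
full_summary = df.describe(include="all")  # every column, text included

print(list(numeric_summary.columns))
print(list(full_summary.columns))
df.info()  # prints column names, non-null counts, and dtypes
```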

#### Week 2: Python Data Analysis

#### Quiz 1: Dealing with Missing Values in Python

Q1. How would you access the column "body-style" from the dataframe `df`?

- `df["body-style"]`
- `df == "bodystyle"`

Q2. What is the correct symbol for missing data?

- nan
- no-data
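A brief sketch of both ideas on made-up data: selecting a column by name with brackets, and detecting missing entries, which pandas represents as NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "body-style": ["sedan", "wagon", "sedan"],
    "horsepower": [111.0, np.nan, 154.0],
})

body_style = df["body-style"]           # bracket access works with hyphenated names
missing_mask = df["horsepower"].isna()  # True where the value is NaN

print(body_style.tolist())
print(missing_mask.tolist())
```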

#### Quiz 2: Data Formatting in Python

Q1. How would you rename the column “city_mpg” to “city-L/100km”?

- **`df.rename(columns={"city_mpg": "city-L/100km"}, inplace=True)`**
- `df.rename(columns={"city_mpg": "city-L/100km"})`
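The difference between the two calls can be checked on a made-up one-column dataframe. Without `inplace=True`, `rename` returns a renamed copy and leaves `df` alone; with it, `df` itself changes and the call returns `None`:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"city_mpg": [21, 19]})

renamed_copy = df.rename(columns={"city_mpg": "city-L/100km"})
columns_after_copy = list(df.columns)  # df is still unchanged here

result = df.rename(columns={"city_mpg": "city-L/100km"}, inplace=True)

print(columns_after_copy)
print(list(df.columns), result)
```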

#### Quiz 3: Data Normalization in Python

Q1. What is the maximum value for feature scaling?

#### Quiz 4: Turning categorical variables into quantitative variables in Python

Q1. Why do we convert values of categorical variables into numerical values?

- **Most statistical models cannot take in objects or strings as inputs**
- To save memory
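One common way to do this conversion is one-hot encoding with `pd.get_dummies`. The sketch below assumes a made-up `fuel-type` column; the source does not say which method the quiz has in mind:

```python
import pandas as pd

# Hypothetical categorical column, for illustration only.
df = pd.DataFrame({"fuel-type": ["gas", "diesel", "gas"]})

# One indicator column per category; dtype=int gives 0/1 instead of booleans.
dummies = pd.get_dummies(df["fuel-type"], dtype=int)

print(dummies)
```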

#### Quiz 5: Data Wrangling

Q1. What task do the following lines of code perform?

`avg = df['horsepower'].mean(axis=0)`

`df['horsepower'].replace(np.nan, avg)`

- replace all the NaN values with the mean
- calculate the mean value for the ‘horsepower’ column and replace all the NaN values of that column by the mean value
- nothing; because the parameter inplace is not set to true
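A small sketch of what these two lines do on a made-up column. Here `replace` returns a new Series, so the original column keeps its NaN until the result is assigned back:

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"horsepower": [100.0, np.nan, 200.0]})

avg = df["horsepower"].mean(axis=0)             # NaN is skipped by default
filled = df["horsepower"].replace(np.nan, avg)  # returns a new Series

still_missing = int(df["horsepower"].isna().sum())  # original column is unchanged

df["horsepower"] = filled  # assign back to keep the result

print(avg, still_missing, df["horsepower"].tolist())
```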

Q2. How would you rename column name from “highway-mpg” to “highway-L/100km”?

- `df.rename(columns={"highway-mpg": "highway-L/100km"}, inplace=True)`
- `rename(df, columns={"highway-mpg": "highway-L/100km"})`

Q3. How would you cast the column “losses” to an integer?

- `df[["losses"]] = df[["losses"]].astype("int")`
- `df[["losses"]].astype("int")`
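A quick sketch of the difference, assuming a made-up float column named `losses`. `astype` returns a converted copy, so the dataframe only changes when the result is assigned back:

```python
import pandas as pd

# Hypothetical float column, for illustration only.
df = pd.DataFrame({"losses": [164.0, 158.0, 192.0]})

df[["losses"]].astype("int")       # result is discarded; df is unchanged
dtype_before = df["losses"].dtype

df[["losses"]] = df[["losses"]].astype("int")  # assigned back, so the cast sticks
dtype_after = df["losses"].dtype

print(dtype_before, dtype_after)
```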

Q4. The following code is an example of:

`(df["length"] - df["length"].mean()) / df["length"].std()`

- simple feature scaling
- min-max scaling
- z-score
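For comparison, the three normalization methods named in the options can be sketched side by side on a made-up `length` Series (the values are an assumption chosen to give round results):

```python
import pandas as pd

# Hypothetical values, for illustration only.
length = pd.Series([150.0, 175.0, 200.0])

# Simple feature scaling: divide by the maximum, so the largest value becomes 1.
simple_scaled = length / length.max()

# Min-max scaling: results lie between 0 and 1.
min_max_scaled = (length - length.min()) / (length.max() - length.min())

# Z-score: subtract the mean, divide by the (sample) standard deviation.
z_scores = (length - length.mean()) / length.std()

print(simple_scaled.tolist())
print(min_max_scaled.tolist())
print(z_scores.tolist())
```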

Q5. Consider the two columns ‘horsepower’, and ‘horsepower-binned’; from the dataframe df; how many categories are there in the ‘horsepower-binned’ column?

#### Week 3: Python Data Analysis

#### Quiz 1: Descriptive Statistics

Q1. What plot would you see after running the following lines of code?

`x = df["engine-size"]`

`y = df["price"]`

`plt.scatter(x, y)`

`plt.title("Scatterplot of Engine Size vs Price")`

`plt.xlabel("Engine Size")`

`plt.ylabel("Price")`

**a**

#### Quiz 2: GroupBy in Python

Q1. Which of the following tables representing number of drive wheels, body style and price is a Pivot Table?

**a)**

b)

#### Quiz 3: Correlation

Q1. Select the scatter plot with weak correlation:

**b**

a

#### Quiz 4: Correlation – Statistics

Q1. Select the plot with a negative correlation:

a

**b**

#### Quiz 5: Exploratory Data Analysis

Q1. What task does the method value_counts perform?

- Returns summary statistics
- Returns counts of unique values
- Returns the first five columns of a dataframe
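A minimal sketch of `value_counts` on a made-up `drive-wheels` column:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"drive-wheels": ["fwd", "rwd", "fwd", "4wd", "fwd"]})

# Counts how often each unique value occurs, most frequent first.
counts = df["drive-wheels"].value_counts()

print(counts)
```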

Q2. If we have 10 columns and 100 samples, how large is the output of df.corr()?

- 10 × 100
- 10 × 10
- 100 × 100
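The shape can be checked directly. The sketch below builds a made-up dataframe with 100 samples and 10 numeric columns (random values, hypothetical column names `c0` to `c9`):

```python
import numpy as np
import pandas as pd

# Hypothetical random data: 100 samples, 10 columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 10)), columns=[f"c{i}" for i in range(10)])

# One row and one column per numeric column of df.
corr_matrix = df.corr()

print(corr_matrix.shape)
```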

Q3. If the p-value of the Pearson Correlation is 1, then …

- The variables are correlated
- The variables are not correlated
- None of the above

Q4. Consider the following dataframe:

`df_test = df[['body-style', 'price']]`

The following operation is applied:

`df_grp = df_test.groupby(['body-style'], as_index=False).mean()`

What are the resulting values of `df_grp['price']`?

- The average price for each body style
- The average price
- The average body style
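The same two lines can be run on a made-up dataframe to see what the grouped result contains (the prices are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "body-style": ["sedan", "wagon", "sedan", "wagon"],
    "price": [10000.0, 12000.0, 14000.0, 16000.0],
})

df_test = df[["body-style", "price"]]
# One row per body style, with the mean of the remaining numeric column.
df_grp = df_test.groupby(["body-style"], as_index=False).mean()

print(df_grp)
```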

Q5. What is the Pearson Correlation between variables X and Y, if X=Y?

- -1
- 1
- 0
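A quick numerical check, using `np.corrcoef` as a stand-in for whichever Pearson routine the course uses (the source does not name one). The array values are made up:

```python
import numpy as np

# Hypothetical values, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0])

r_same = np.corrcoef(x, x)[0, 1]      # Y = X
r_negated = np.corrcoef(x, -x)[0, 1]  # Y = -X, for contrast

print(r_same, r_negated)
```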

#### Week 4: Python Data Analysis

#### Quiz 1: Linear Regression and Multiple Linear Regression

Q1. Consider the following lines of code. What is the name of the column that contains the target?

`from sklearn.linear_model import LinearRegression`

`lm = LinearRegression()`

`X = df[['highway-mpg']]`

`Y = df['price']`

`lm.fit(X, Y)`

`Yhat = lm.predict(X)`

- **'price'**
- 'highway-mpg'

Q2. Consider the following equation:

What is the parameter b_0 (b subscript 0)?

- the predictor or independent variable
- the target or dependent variable
- **the intercept**
- the slope
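A runnable sketch of a simple linear fit, on made-up data generated from price = 30000 - 500 × highway-mpg. It assumes the equation omitted above is the simple linear model y = b_0 + b_1·x, which the source does not show; under that assumption `lm.intercept_` plays the role of b_0 and `lm.coef_[0]` the role of b_1:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data lying exactly on price = 30000 - 500 * highway-mpg.
df = pd.DataFrame({
    "highway-mpg": [20, 25, 30, 35],
    "price": [20000.0, 17500.0, 15000.0, 12500.0],
})

lm = LinearRegression()  # create the regression object
X = df[["highway-mpg"]]  # predictor (independent variable)
Y = df["price"]          # target (dependent variable)
lm.fit(X, Y)             # estimate intercept and slope
Yhat = lm.predict(X)     # predictions for the training inputs

print(lm.intercept_, lm.coef_[0])
```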

#### Quiz 2: Model Evaluation using Visualization

Q1. Consider the following **Residual Plot**. Is our linear model correct?

- yes
- **no**

#### Quiz 3: Polynomial Regression and Pipelines

Q1. What is the order of the following polynomial?

- **3**
- 1
- 2

#### Quiz 4: Measures for In-Sample Evaluation

Q1. Of the following answer values, which one is the minimum value of **R^2**?

- 10
- 0
- 1

#### Quiz 5: Model Development

Q1. What does the following line of code do?

`lm = LinearRegression()`

- Fit a regression object lm
- Create a linear regression object
- Predict a value

Q2. What is the maximum value of R^2 that can be obtained?

- 10
- 0
- 1

Q3. We create a polynomial feature as follows: `PolynomialFeatures(degree=2)`. What is the order of the polynomial?

- 0
- 1
- 2
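A small sketch of what `PolynomialFeatures(degree=2)` produces for a single made-up input column. Each row gets a constant term, the original value, and its square:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical single-feature input, for illustration only.
x = np.array([[2.0], [3.0]])

pr = PolynomialFeatures(degree=2)
x_poly = pr.fit_transform(x)  # columns: 1, x, x^2

print(x_poly)
```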

Q4. What value of R^2 (coefficient of determination) indicates your model performs best?

- -1
- 1
- 0
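To get a feel for R^2 values, here is a hand-rolled sketch. The helper name `r_squared` and the data are assumptions made for illustration; the course itself may compute the score with a library function:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets, for illustration only.
y = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r_squared(y, y)                            # predictions match exactly
mean_only = r_squared(y, np.full_like(y, y.mean()))  # always predict the mean
reversed_fit = r_squared(y, y[::-1])                 # worse than predicting the mean

print(perfect, mean_only, reversed_fit)
```

A perfect fit scores 1 and predicting the mean scores 0; predictions worse than the mean give a negative score under this formula.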

Q5. Consider the following equation:

The variable y is what?

- The predictor or independent variable
- The target or dependent variable
- The intercept

#### Week 5: Python Data Analysis

#### Quiz 1: Model Evaluation

Q1. What is the correct use of the `train_test_split` function such that 90% of the data samples are used for training, the parameter `random_state` is set to zero, and the input variables for the features and targets are `x_data` and `y_data`, respectively?

- `train_test_split(x_data, y_data, test_size=0.9, random_state=0)`
- **`train_test_split(x_data, y_data, test_size=0.1, random_state=0)`**
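The split sizes can be confirmed on made-up arrays of 100 samples. With `test_size=0.1`, 10% is held out for testing and the remaining 90% is used for training:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical features and targets: 100 samples each.
x_data = np.arange(100).reshape(-1, 1)
y_data = np.arange(100)

x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.1, random_state=0
)

print(len(x_train), len(x_test))
```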

#### Quiz 2: Overfitting, Underfitting and Model Selection

Q1. In the following plot, the vertical axis shows the mean squared error and the horizontal axis represents the order of the polynomial. The red line represents the training error and the blue line the test error. Should you select the 16th-order polynomial?

- **no**
- yes

#### Quiz 3: Ridge Regression

Q1. The following models were all trained on the same data. Select the model with the lowest value for alpha:

- **a**
- b
- c

#### Week 6: Final Exam

Q1. What type of file allows data to be saved in a tabular format?

- csv
- html

Q2. What Python library is used for statistical modelling, including regression and classification?

- Numpy
- Matplotlib
- Scikit-learn

Q3. What path tells us where the data is stored?

- Encoding path
- Scheme path
- File path

Q4. What attribute or function returns the data types of each column?

- dtypes
- tail()
- head()

Q5. The Pandas library allows us to read what?

- Only headers
- Only rows
- Various datasets into a data frame

Q6. The Matplotlib library is mostly used for what?

- Data analysis
- Data visualization
- Machine learning

Q7. What would the following code segment output from a dataframe df?

`df.head(5)`

- It would return the last 5 rows of the dataframe
- It would return the first 5 rows of the dataframe
- It would return all of the rows of the dataframe

Q8. What is the function used to remove rows and columns with Null or NaN values?

- removena()
- replacena()
- dropna()
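A minimal sketch of `dropna` on a made-up dataframe, dropping either rows or columns that contain NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value, for illustration only.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

rows_dropped = df.dropna(axis=0)  # drop rows that contain NaN
cols_dropped = df.dropna(axis=1)  # drop columns that contain NaN

print(rows_dropped.shape)
print(list(cols_dropped.columns))
```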

Q9. How would you multiply each element in the column `df["c"]` by 5 and assign it back to the column `df["c"]`?

- `5*df["b"]`
- `df["a"] = df["c"]*5`
- `df["c"] = 5*df["c"]`

Q10. What does the code segment below give an example of for the column "length"?

`df["length"] = (df["length"] - df["length"].min()) / (df["length"].max() - df["length"].min())`

- It gives an example of the max-min method
- It gives an example of the z-score or standard score

Q11. What is it called when you subtract the mean and divide by the standard deviation?

- Min-max method
- Data standardization
- One-hot encoding

Q12. What segment of code calculates the mean of the column ‘peak-rpm’?

- `df['peak-rpm'].mean()`
- `mean(df['peak-rpm'])`
- `df.mean(['peak-rpm'])`
