All Weeks Python Data Analysis Coursera Quiz Answers
Python Data Analysis Coursera Quiz Answers
Week 1: Python Data Analysis
Quiz 1: Understanding the Data
Q1. How many columns does the dataset have?
- 26
- 205
Quiz 2: Python Packages for Data Science
Q1. What is a Python library?
- A file that contains data.
- A collection of functions and methods that allows you to perform lots of actions without writing your code
Quiz 3: Importing and Exporting Data in Python
Q1. What does the following method do to the dataframe df?
df.head(12)
- Show the first 12 rows of dataframe.
- Shows the bottom 12 rows of dataframe.
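As a quick check of the behaviour asked about above, here is a minimal sketch on a made-up 20-row dataframe (the column names `id` and `value` are assumptions, not part of the course dataset). It contrasts `head` with `tail`.

```python
import pandas as pd

# Hypothetical 20-row frame standing in for the course dataset
df = pd.DataFrame({"id": range(20), "value": [i * 2 for i in range(20)]})

first_rows = df.head(12)  # first 12 rows
last_rows = df.tail(12)   # last 12 rows, for contrast

print(first_rows.shape)
print(last_rows["id"].tolist())
```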
Quiz 4: Getting Started Analyzing Data in Python
Q1. What is the correct output of the following?
df.describe(include="all")
b)
Quiz 5: Importing Datasets
Q1. What do we want to predict from the dataset?
- price
- colour
- make
Q2. Select the libraries you will use for this course?
- matplotlib
- pandas
- scikit-learn
Q3. What task does the following command perform?
df.to_csv("A.csv")
- change the name of the column to “A.csv”
- load the data from a csv file called “A” into a dataframe
- Save the dataframe df to a csv file called “A.csv”
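The save step can be tried out with a small sketch like the one below. The dataframe contents and the temporary directory are assumptions made for illustration, and `index=False` is added here only so the file reads back without an extra index column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row frame; the values are made up
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [13950, 16430]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "A.csv")
    df.to_csv(path, index=False)     # save df to a CSV file called A.csv
    file_exists = os.path.exists(path)
    df_loaded = pd.read_csv(path)    # the reverse step: load the CSV into a dataframe

print(file_exists)
print(df_loaded)
```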
Q4. Consider the segment of the following dataframe:
What is the type of the column make?
- int64
- float64
- object
Q5. How would you generate descriptive statistics for all the columns for the dataframe df?
- df.describe()
- df.describe(include = "all")
- df.info
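The difference between the first two options can be seen in a small sketch, assuming a made-up dataframe with one text column and one numeric column.

```python
import pandas as pd

# Hypothetical frame with one text column and one numeric column
df = pd.DataFrame({
    "make": ["audi", "bmw", "audi"],
    "price": [13950.0, 16430.0, 17450.0],
})

numeric_only = df.describe()              # numeric columns only
all_columns = df.describe(include="all")  # every column, text ones included

print(list(numeric_only.columns))
print(list(all_columns.columns))
```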
Week 2: Python Data Analysis
Quiz 1: Dealing with Missing Values in Python
Q1. How would you access the column "body-style" from the dataframe df?
- df["body-style"]
- df=="bodystyle"
Q2. What is the correct symbol for missing data?
- nan
- no-data
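For illustration, the sketch below shows how NaN appears in a pandas Series and how it can be detected. The values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing entry
s = pd.Series([1.0, np.nan, 3.0])

missing_mask = s.isna()              # True where the value is NaN
n_missing = int(missing_mask.sum())  # number of missing entries

# NaN never compares equal to itself, so use isna() rather than ==
nan_equals_itself = (np.nan == np.nan)

print(missing_mask.tolist(), n_missing, nan_equals_itself)
```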
Quiz 2: Data Formatting in Python
Q1. How would you rename the column “city_mpg” to “city-L/100km”?
- df.rename(columns={"city_mpg": "city-L/100km"}, inplace=True)
- df.rename(columns={"city_mpg": "city-L/100km"})
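The two options differ only in `inplace=True`. A minimal sketch on a made-up one-column dataframe shows what that changes.

```python
import pandas as pd

# Hypothetical frame with the column to be renamed
df = pd.DataFrame({"city_mpg": [21, 19]})

# Without inplace=True a renamed copy is returned and df is left untouched
returned = df.rename(columns={"city_mpg": "city-L/100km"})
cols_before = list(df.columns)

# With inplace=True df itself is modified
df.rename(columns={"city_mpg": "city-L/100km"}, inplace=True)
cols_after = list(df.columns)

print(cols_before, cols_after)
```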
Quiz 3: Data Normalization in Python
Q1. What is the maximum value for feature scaling?
Quiz 4: Turning categorical variables into quantitative variables in Python
Q1. Why do we convert values of categorical variables into numerical values?
- Most statistical models cannot take in objects or strings as inputs
- To save memory
Quiz 5: Data Wrangling
Q1. What task do the following lines of code perform?
avg=df['horsepower'].mean(axis=0)
df['horsepower'].replace(np.nan, avg)
- replace all the NaN values with the mean
- calculate the mean value for the ‘horsepower’ column and replace all the NaN values of that column by the mean value
- nothing; because the parameter inplace is not set to true
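To see why the missing `inplace` argument matters here, the sketch below runs the same two calls on a made-up three-row column and then shows one way to keep the result (assigning it back).

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
df = pd.DataFrame({"horsepower": [100.0, np.nan, 200.0]})

avg = df["horsepower"].mean(axis=0)    # NaN is skipped by default
df["horsepower"].replace(np.nan, avg)  # returns a new Series; the result is discarded
still_missing = int(df["horsepower"].isna().sum())

# Assigning the result back is one way to actually store the replacement
df["horsepower"] = df["horsepower"].replace(np.nan, avg)

print(avg, still_missing, df["horsepower"].tolist())
```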
Q2. How would you rename column name from “highway-mpg” to “highway-L/100km”?
- df.rename(columns={'"highway-mpg"':'highway-L/100km'}, inplace=True)
- rename(df,columns={'"highway-mpg"':'highway-L/100km'})
Q3. How would you cast the column “losses” to an integer?
- df[["losses"]]=df[["losses"]].astype("int")
- df[["losses"]].astype("int")
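A short sketch of the cast, assuming a made-up float column named `losses`: calling `astype` alone returns a converted copy, so the column only changes once the result is assigned back.

```python
import pandas as pd

# Hypothetical float column to be cast to integer
df = pd.DataFrame({"losses": [164.0, 158.0]})

df[["losses"]].astype("int")             # converted copy is discarded
kind_before = df["losses"].dtype.kind    # still floating point

df[["losses"]] = df[["losses"]].astype("int")
kind_after = df["losses"].dtype.kind     # now an integer dtype

print(kind_before, kind_after)
```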
Q4. The following code is an example of:
(df["length"]-df["length"].mean())/df["length"].std()
- simple feature scaling
- min-max scaling
- z-score
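As a worked example of the expression in the question, the sketch below applies it to three made-up lengths. Note that pandas `std()` uses the sample standard deviation (ddof=1) by default.

```python
import pandas as pd

# Hypothetical lengths: mean 170, sample standard deviation 10
df = pd.DataFrame({"length": [160.0, 170.0, 180.0]})

# Subtract the mean, then divide by the standard deviation
z = (df["length"] - df["length"].mean()) / df["length"].std()

print(z.tolist())
```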
Q5. Consider the two columns ‘horsepower’, and ‘horsepower-binned’; from the dataframe df; how many categories are there in the ‘horsepower-binned’ column?
Week 3: Python Data Analysis
Quiz 1: Descriptive Statistics
Q1. What plot would you see after running the following lines of code?
x=df["engine-size"]
y=df["price"]
plt.scatter(x,y)
plt.title("Scatterplot of Engine Size vs Price")
plt.xlabel("Engine Size")
plt.ylabel("Price")
a
Quiz 2: GroupBy in Python
Q1. Which of the following tables representing number of drive wheels, body style and price is a Pivot Table?
a)
b)
Quiz 3: Correlation
Q1. Select the scatter plot with weak correlation:
b
a
Quiz 4: Correlation – Statistics
Q1. Select the plot with a negative correlation:
a
b
Quiz 5: Exploratory Data Analysis
Q1. What task does the method value_counts perform?
- Returns summary statistics
- Returns counts of unique values
- Returns the first five columns of a dataframe
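A small sketch of `value_counts` on a made-up Series of drive-wheel labels (the values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical categorical column
s = pd.Series(["fwd", "rwd", "fwd", "4wd", "fwd"], name="drive-wheels")

counts = s.value_counts()  # one entry per unique value, most frequent first

print(counts)
```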
Q2. If we have 10 columns and 100 samples, how large is the output of df.corr()?
- 10 x 100
- 10 x 10
- 100 x 100
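The shape can be checked directly. The sketch below builds a made-up dataframe of random numbers with 100 samples and 10 columns; the column names are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 100 samples (rows) and 10 columns
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 10)),
                  columns=[f"c{i}" for i in range(10)])

corr = df.corr()  # pairwise correlation between columns

print(corr.shape)
```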
Q3. If the p-value of the Pearson Correlation is 1, then …
- The variables are correlated
- The variables are not correlated
- None of the above
Q4. Consider the following dataframe:
df_test = df[['body-style', 'price']]
The following operation is applied:
df_grp = df_test.groupby(['body-style'], as_index=False).mean()
What are the resulting values of df_grp['price']?
- The average price for each body style
- The average price
- The average body style
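The same group-by can be run on a tiny made-up table to see what ends up in the `price` column. The three rows below are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df[['body-style', 'price']]
df_test = pd.DataFrame({
    "body-style": ["sedan", "sedan", "wagon"],
    "price": [10000.0, 20000.0, 12000.0],
})

df_grp = df_test.groupby(["body-style"], as_index=False).mean()

print(df_grp)
```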
Q5. What is the Pearson Correlation between variables X and Y, if X=Y?
- -1
- 1
- 0
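A quick numerical check of the X=Y case, using NumPy's `corrcoef` on made-up values; the negated copy is included only for contrast.

```python
import numpy as np

# Hypothetical data
x = np.array([1.0, 2.0, 4.0, 7.0])

r_same = np.corrcoef(x, x)[0, 1]   # correlation of a variable with itself
r_neg = np.corrcoef(x, -x)[0, 1]   # correlation with its negation

print(r_same, r_neg)
```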
Week 4: Python Data Analysis
Quiz 1: Linear Regression and Multiple Linear Regression
Q1. Consider the following lines of code. What is the name of the column that contains the target?
from sklearn.linear_model import LinearRegression
lm=LinearRegression()
X = df[['highway-mpg']]
Y = df['price']
lm.fit(X, Y)
Yhat=lm.predict(X)
- ‘price’
- ‘highway-mpg’
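The code in the question can be run end to end on a made-up dataframe. In the sketch below the four rows are assumptions, chosen to lie exactly on the line price = 50000 - 1000 * highway-mpg, so the fitted intercept and slope are easy to read off.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data lying exactly on price = 50000 - 1000 * highway-mpg
df = pd.DataFrame({
    "highway-mpg": [20, 25, 30, 35],
    "price": [30000.0, 25000.0, 20000.0, 15000.0],
})

lm = LinearRegression()
X = df[["highway-mpg"]]  # predictor (independent variable)
Y = df["price"]          # target (dependent variable)
lm.fit(X, Y)
Yhat = lm.predict(X)

print(lm.intercept_, lm.coef_[0])
```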
Q2. Consider the following equation:
What is the parameter b_0 (b subscript 0)?
- the predictor or independent variable
- the target or dependent variable
- the intercept
- the slope
Quiz 2: Model Evaluation using Visualization
Q1. Consider the following residual plot. Is our linear model correct?
- yes
- no
Quiz 3: Polynomial Regression and Pipelines
Q1. What is the order of the following polynomial?
- 3
- 1
- 2
Quiz 4: Measures for In-Sample Evaluation
Q1. Of the following answer values, which one is the minimum value of R^2?
- 10
- 0
- 1
Quiz 5: Model Development
Q1. What does the following line of code do?
lm = LinearRegression()
- Fit a regression object lm
- Create a linear regression object
- Predict a value
Q2. What is the maximum value of R^2 that can be obtained?
- 10
- 0
- 1
Q3. We create a polynomial feature as follows “PolynomialFeatures(degree=2)”; what is the order of the polynomial?
- 0
- 1
- 2
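To illustrate what `degree=2` produces, the sketch below transforms a single made-up feature with scikit-learn's `PolynomialFeatures`. With default settings the output columns are the bias term, x, and x squared.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical single-feature input
x = np.array([[2.0], [3.0]])

pr = PolynomialFeatures(degree=2)
x_poly = pr.fit_transform(x)  # columns: 1, x, x^2

print(x_poly)
```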
Q4. What value of R^2 (coefficient of determination) indicates your model performs best?
- -1
- 1
- 0
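A minimal sketch of the R^2 calculation in plain NumPy, on made-up values. The helper name `r_squared` is hypothetical. It uses the usual definition, 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Hypothetical targets
y = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(y, y)           # predictions match the targets exactly
baseline = r_squared(y, [2.5] * 4)  # always predicting the mean

print(perfect, baseline)
```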
Q5. Consider the following equation:
The variable y is what?
- The predictor or independent variable
- The target or dependent variable
- The intercept
Week 5: Python Data Analysis
Quiz 1: Model Evaluation
Q1. What is the correct use of the "train_test_split" function such that 90% of the data samples will be utilized for training, the parameter "random_state" is set to zero, and the input variables for the features and targets are x_data and y_data respectively?
- train_test_split(x_data, y_data, test_size=0.9, random_state=0)
- train_test_split(x_data, y_data, test_size=0.1, random_state=0)
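The effect of `test_size` can be checked on made-up arrays of 100 samples. The sketch below uses `test_size=0.1`, which holds out 10% for testing and leaves 90% for training.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical features and targets: 100 samples
x_data = np.arange(100).reshape(-1, 1)
y_data = np.arange(100)

x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.1, random_state=0
)

print(len(x_train), len(x_test))
```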
Quiz 2: Overfitting, Underfitting and Model Selection
Q1. In the following plot, the vertical axis shows the mean squared error and the horizontal axis represents the order of the polynomial. The red line represents the training error and the blue line the test error. Should you select the 16th-order polynomial?
- no
- yes
Quiz 3: Ridge Regression
Q1. The following models were all trained on the same data. Select the model with the lowest value for alpha:
- a
- b
- c
Week 6: Final Exam
Q1. What type of file allows data to be saved in a tabular format?
- csv
- html
Q2. What Python library is used for statistical modelling including regression and classification?
- Numpy
- Matplotlib
- Scikit-learn
Q3. What path tells us where the data is stored?
- Encoding path
- Scheme path
- File path
Q4. What attribute or function returns the data types of each column?
- dtypes
- tail()
- head()
Q5. The Pandas library allows us to read what?
- Only headers
- Only rows
- Various datasets into a data frame
Q6. The Matplotlib library is mostly used for what?
- Data analysis
- Data visualization
- Machine learning
Q7. What would the following code segment output from a dataframe df?
df.head(5)
- It would return the last 5 rows of the dataframe
- It would return the first 5 rows of the dataframe
- It would return all of the rows of the dataframe
Q8. What is the function used to remove rows and columns with Null or NaN values?
- removena()
- replacena()
- dropna()
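A short sketch of `dropna` on a made-up dataframe, showing both directions: `axis=0` drops rows containing NaN and `axis=1` drops columns containing NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "a"
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

rows_dropped = df.dropna(axis=0)  # remove rows that contain NaN
cols_dropped = df.dropna(axis=1)  # remove columns that contain NaN

print(rows_dropped.shape, list(cols_dropped.columns))
```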
Q9. How would you multiply each element in the column df["c"] by 5 and assign it back to the column df["c"]?
- 5*df["b"]
- df["a"]=df["c"]*5
- df["c"]=5*df["c"]
Q10. What does the below code segment give an example of for the column "length"?
df["length"] = (df["length"]-df["length"].min())/(df["length"].max()-df["length"].min())
- It gives an example of the max-min method
- It gives an example of the z-score or standard score
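Running the expression from the question on three made-up lengths shows the resulting range: the smallest value maps to 0 and the largest to 1.

```python
import pandas as pd

# Hypothetical lengths
df = pd.DataFrame({"length": [150.0, 175.0, 200.0]})

# Subtract the minimum, then divide by the range (max - min)
df["length"] = (df["length"] - df["length"].min()) / (
    df["length"].max() - df["length"].min()
)

print(df["length"].tolist())
```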
Q11. What is it called when you subtract the mean and divide by the standard deviation?
- Min-max method
- Data standardization
- One-hot encoding
Q12. What segment of code calculates the mean of the column ‘peak-rpm’?
- df['peak-rpm'].mean()
- mean( df['peak-rpm'])
- df.mean(['peak-rpm'])