Exploratory Data Analysis for Machine Learning Quiz Answers

Exploratory Data Analysis for Machine Learning Week 01 Quiz Answers

Quiz 01: Check for Understanding

Q1. (True/False) Machine Learning is a subset of Artificial Intelligence

False
True

Q2. (True/False) Deep Learning is a subset of Machine Learning

False
True

Q3. (True/False) Machine Learning consists in programming computers to learn from real-time human interactions

False
True

Q4. (True/False) Machine Learning is the same as Artificial Intelligence

False
True

Quiz 02: Check for Understanding

Q1. (True/False) AI Winters happened mostly due to the lack of understanding behind the theory of neural networks

True
False

Q2. Most applications that use computer vision rely on models that were trained using this discipline:

Machine Learning
Artificial Intelligence
Deep Learning

Q3. In the Machine Learning Workflow, the main goal of the Data Exploration and Preprocessing step is to:

Identify what data is best suited to find a solution to your business problem
Determine how to clean your data such that you can use it to train a model

Module 1 Quiz

Q1. Assume you have a data set that summarizes a marketing campaign with information related to prospective customers. The data set contains 100 observations with several columns that summarize information about the prospective customer. It also has a column that flags whether the prospect responded or not.

In this example, “Yes” or “No” are the possible values of the:

label
features
target
example

Q2. Assume you have a data set that summarizes a marketing campaign with information related to prospective customers. The data set contains 100 observations with several columns that summarize information about the prospective customer. It also has a column that flags whether the prospect responded or not.

In this context, observation is a synonym of:

label
features
target
example

Q3. Assume you have a data set that summarizes a marketing campaign with information related to prospective customers. The data set contains 100 observations with several columns that summarize information about the prospective customer. It also has a column that flags whether the prospect responded or not.

A machine learning model that predicts response, is using the column Responded as a:

label
features
target
example

Quiz 03: Check for Understanding

Q1. Which statement about the Pandas read_csv function is TRUE?

It can only read comma-delimited data.
It can read both tab-delimited and space-delimited data.
It reads data into a 2-dimensional NumPy array.
It allows only one argument: the name of the file.

Q2. Which of the following is an example of a file type that uses JavaScript Object Notation (JSON) formatting?

Python (.py files)
JavaScript (.js files)
SQL Database (.sql files)
Jupyter/iPython (.ipynb files)
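
As a rough illustration of what JSON formatting looks like in practice, the sketch below builds a tiny notebook-like structure by hand and round-trips it through Python's json module. The field names follow the general layout of a Jupyter notebook file, but the contents (a single cell with one line of source) are made up for the example.

```python
import json

# Hand-written, minimal notebook-like structure (illustrative contents only).
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('hello')"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# Serialize to JSON text, as it would be stored in a file on disk.
text = json.dumps(notebook, indent=1)

# Parse the text back into Python objects.
parsed = json.loads(text)
first_cell_type = parsed["cells"][0]["cell_type"]
print(first_cell_type)
```

The same json.loads call applied to the text of a real .ipynb file should yield a dictionary with a similar top-level shape.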

Q3. The data below appears in ‘data.txt’, and Pandas has been imported. Which Python command will read it correctly into a Pandas DataFrame?

63.03 22.55 39.61 40.48 98.67 -0.25 AB
39.06 10.06 25.02 29 114.41 4.56 AB
68.83 22.22 50.09 46.61 105.99 -3.53 AB

pandas.read_csv('data.txt')
pandas.read_csv('data.txt', header=None, sep=' ')
pandas.read_csv('data.txt', delim_whitespace=True)
pandas.read_csv('data.txt', header=0, delim_whitespace=True)
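
To see how the separator and header arguments change the result, here is a minimal sketch that reads the three rows from an in-memory string instead of 'data.txt'. It uses sep=r"\s+" rather than delim_whitespace=True, because recent pandas releases deprecate that flag in favor of the regex separator; treat that substitution as an assumption of this sketch, not as part of the quiz.

```python
import io
import pandas as pd

# The three rows shown in the question, held in memory instead of 'data.txt'.
raw = (
    "63.03 22.55 39.61 40.48 98.67 -0.25 AB\n"
    "39.06 10.06 25.02 29 114.41 4.56 AB\n"
    "68.83 22.22 50.09 46.61 105.99 -3.53 AB\n"
)

# Default call: splits on commas, so every line lands in a single column,
# and the first line is consumed as the header row.
default_df = pd.read_csv(io.StringIO(raw))

# Whitespace separator, but the first line is still treated as the header.
with_header_df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Whitespace separator and no header row: all three lines become data.
no_header_df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None)

print(default_df.shape)
print(with_header_df.shape)
print(no_header_df.shape)
```

Comparing the three shapes shows which combinations keep all three rows and all seven columns.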

Quiz 05: Check for Understanding

Q1. (True/False) Outliers must be very extreme to noticeably impact the fit of a statistical model.

True
False

Q2. (True/False) Outliers should always be replaced, since they never contain useful information about the data.

True
False

Q3. Which residual-based approach to identifying outliers compares running a model with all data to running the same model, but dropping a single observation?

Standardized residuals
Unstandardized residuals
Externally-studentized residuals
Abnormally-studentized residuals
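
The leave-one-out idea in this question can be sketched with NumPy. The function below is a hand-written illustration for simple linear regression, not a library routine: for each observation it refits the model without that point, estimates the residual spread from the remaining data, and scales the full-model residual by it. The data, including the planted outlier at index 5, are invented for the example.

```python
import numpy as np

def externally_studentized_residuals(x, y):
    """Residuals scaled by an error estimate that excludes each point in turn."""
    X = np.column_stack([np.ones_like(x), x])  # intercept + slope design matrix
    n, p = X.shape

    # Leverage values: diagonal of the hat matrix for the full model.
    leverage = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

    # Residuals from the model fitted on all observations.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta

    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Refit the same model with observation i dropped.
        beta_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        resid_i = y[keep] - X[keep] @ beta_i
        # Residual standard deviation estimated without observation i.
        s_i = np.sqrt(resid_i @ resid_i / (n - 1 - p))
        out[i] = resid[i] / (s_i * np.sqrt(1 - leverage[i]))
    return out

# Made-up data: a near-perfect line with one planted outlier at index 5.
x = np.arange(10, dtype=float)
noise = np.array([0.1, -0.2, 0.15, -0.05, 0.0, 0.1, -0.1, 0.2, -0.15, 0.05])
y = 2 * x + 1 + noise
y[5] += 10

t = externally_studentized_residuals(x, y)
print(np.round(t, 2))
```

Because the outlier is excluded when its own error scale is estimated, its scaled residual stands out far more than it would if the outlier had inflated that estimate.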

Quiz 06: Check for Understanding

Q1. From the options listed below, select the option that is NOT a valid exploratory data approach to visually confirm whether your data is ready for modeling or if it needs further cleaning or data processing:

Create a panel plot that shows distributions for the dependent variable and scatter plots for all independent variables
Train a model and identify the observations with the largest residuals
Create visualizations for scatter plots, histograms, box plots, and hexbin plots
Create a correlation heatmap to confirm the sign and magnitude of correlation across your features.

Q2. These are two of the most common libraries for data visualization:

matplotlib and seaborn
scipy and seaborn
numpy and matplotlib
scipy and numpy

Q3. (True/False) You can use the pandas library to create plots.

True
False

Quiz 07: Check for Understanding

Q1. (True/False) Classification models require that input features be scaled.

True
False

Q2. (True/False) Feature scaling allows better interpretation of distance-based approaches.

True
False

Q3. (True/False) Feature scaling reduces distortions caused by variables with different scales.

True
False

Module 2 Quiz

Q1. Which of the following statements about cloud data access using Pandas is TRUE?

With read_csv, the online file must be comma-delimited.
The read_csv function can read data directly from a website or URL.
With read_csv, the destination file must have column names in the first row.
A remote destination file must be downloaded locally before it can be read by Pandas.

Q2. In which case below is it most plausible to conclude that an observation includes an outlier for one of the features?

One feature has a deleted residual value above 3.
The observation includes the maximum target value.
The observation is missing values for several of the features.
One feature has an internally-studentized residual value above 3.

Q3. Which of these approaches to feature engineering will be impacted LEAST by extreme values?

RobustScaler
MinMaxScaler
LabelBinarizer
OneHotEncoder

Q4. Which of these approaches to feature engineering will be impacted MOST by extreme values?

RobustScaler
MinMaxScaler
LabelBinarizer
OneHotEncoder
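
To make the effect of an extreme value concrete, the sketch below applies hand-written versions of two scaling formulas to a small invented sample. The helper functions are illustrative stand-ins that mirror the documented default behavior of scikit-learn's MinMaxScaler and RobustScaler (min/max rescaling, and median centering with interquartile-range scaling); they are not the library classes themselves.

```python
import numpy as np

def min_max_scale(values):
    """Rescale so the minimum maps to 0 and the maximum maps to 1."""
    values = np.asarray(values, dtype=float)
    return (values - values.min()) / (values.max() - values.min())

def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return (values - median) / (q3 - q1)

# Invented sample: five ordinary values and one extreme value.
data = [1, 2, 3, 4, 5, 1000]

min_max = min_max_scale(data)
robust = robust_scale(data)

print(np.round(min_max, 4))
print(np.round(robust, 2))
```

In the min-max output the five ordinary values are squeezed into a tiny interval near zero, while the median-and-IQR version keeps them spread out and lets the extreme value fall well outside the range of the rest.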

Q5. (True/False) RobustScaler adapts MinMaxScaler to account for outliers.

True
False

Q6. (True/False) StandardScaler requires data that are normally distributed.

True
False

Q7. (True/False) Any features that have been transformed by StandardScaler, MinMaxScaler, or RobustScaler will take values in (0,1)

True
False

Q8. Which of the following assertions describes a good reason for using scatter plots to complement calculating the correlation coefficient between two variables?

A scatter plot helps you visualize whether outliers are inflating or deflating a correlation coefficient
A scatter plot will help you identify if the correlation is positive or negative
A scatter plot takes into account both the Spearman and the Pearson correlation coefficients in a single step.
It is computationally more efficient to produce a scatter plot first and then compute a correlation coefficient.
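
A short numeric sketch can show why a correlation coefficient alone may mislead. The data below are invented: eight points with no linear relationship, plus one extreme point added afterwards. Only NumPy's corrcoef is used.

```python
import numpy as np

# Invented data with no linear relationship between x and y.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([5, 3, 6, 4, 5, 3, 6, 4], dtype=float)

# Pearson correlation without the extreme point.
r_without = np.corrcoef(x, y)[0, 1]

# Append a single extreme point far from the rest of the data.
x_out = np.append(x, 30.0)
y_out = np.append(y, 30.0)

# Pearson correlation with the extreme point included.
r_with = np.corrcoef(x_out, y_out)[0, 1]

print(round(r_without, 3), round(r_with, 3))
```

A scatter plot of the second data set would make the lone point obvious at a glance, whereas the coefficient by itself suggests a strong relationship across the whole sample.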

Exploratory Data Analysis for Machine Learning Week 02 Quiz Answers

Quiz 01: Check for Understanding

Q1. (True/False) In general, the population parameters are unknown

True.
False.

Q2. (True/False) Parametric models have a finite number of parameters.

True.
False.

Q3. The most common way of estimating parameters in a parametric model is:

using the maximum likelihood estimation
using the central limit theorem
extrapolating a non-parametric model
extrapolating Bayesian statistics

Quiz 02: Check for Understanding

Q1. A p-value is

the smallest significance level at which the null hypothesis would be rejected
the probability of the null hypothesis being true
the probability of the null hypothesis being false
the smallest significance level at which the null hypothesis is accepted
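
The relationship between a p-value and the significance level can be sketched with an exact binomial calculation. The scenario is hypothetical: the null hypothesis is a fair coin, and 60 heads are observed in 100 flips. The helper function is written for this example and uses only the standard library.

```python
from math import comb

def one_sided_binomial_p_value(heads, flips, p_null=0.5):
    """P(X >= heads) when X ~ Binomial(flips, p_null)."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# Hypothetical observation: 60 heads in 100 flips of a supposedly fair coin.
p_value = one_sided_binomial_p_value(heads=60, flips=100)
print(round(p_value, 4))

# Compare the same p-value against several significance levels.
for alpha in (0.10, 0.05, 0.01):
    decision = "reject" if p_value <= alpha else "fail to reject"
    print(f"alpha={alpha:.2f}: {decision} the null hypothesis")
```

The decision flips from "reject" to "fail to reject" once the significance level drops below the computed p-value, which is the sense in which the p-value marks a threshold.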

Q2. A Type 1 Error is defined as:

Saying the null hypothesis is false, when it is actually true
Saying the null hypothesis is true, when it is actually false

Q3. You find through a graph that there is a strong correlation between Net Promoter Score and the time that customers spend on a website. Select the TRUE assertion:

There is an underlying factor that explains this correlation, but manipulating the time that customers spend on a website may not affect the Net Promoter Score they will give to the company
To boost the Net Promoter Score of a business, you need to increase the time that customers spend on a website.

Q4. (True/False) If you reject the null hypothesis, it means that the alternate hypothesis is true.

True
False

Module 3 Quiz

Q1. Type of Statistics in which the posterior probability is the updated belief on the probability of an event happening given the prior and the data observed.

Classical Statistics
Frequentist Statistics
Bayesian Statistics
Descriptive Statistics

Q2. (True/False) On a given hypothesis test, you obtain a p-value of 0.051. This can be interpreted as approaching significance or almost significant.

True
False

Q3. (True/False) When you reject the null hypothesis, then the alternate hypothesis must be true.

True
False

Get All Course Quiz Answers of IBM Machine Learning Professional Certificate

Exploratory Data Analysis for Machine Learning Quiz Answers

Supervised Machine Learning: Regression Quiz Answers

Supervised Machine Learning: Classification Coursera Quiz Answers

Unsupervised Machine Learning Coursera Quiz Answers

Deep Learning and Reinforcement Learning Quiz Answers

Specialized Models: Time Series and Survival Analysis Quiz Answers

Team Networking Funda
