Welcome to your complete guide for Supervised Machine Learning: Regression and Classification quiz answers! Whether you’re completing practice quizzes to enhance your understanding or preparing for graded quizzes to assess your knowledge, this guide has you covered.
Spanning all course modules, this resource will help you master supervised machine learning techniques, focusing on regression and classification models like linear regression, decision trees, and support vector machines.
Supervised Machine Learning: Regression and Classification Quiz Answers – Practice & Graded Quizzes for All Modules
Supervised Machine Learning: Regression and Classification Week 01 Quiz Answers
Quiz 1: Supervised vs. unsupervised learning Quiz Answers
Question 1: Which are the two common types of supervised learning? (Choose two)
Answer:
- Classification
- Regression
Explanation:
Supervised learning involves learning from labeled data. The two most common types are classification (predicting categorical labels) and regression (predicting continuous values).
Question 2: Which of these is a type of unsupervised learning?
Answer:
- Clustering
Explanation:
Unsupervised learning involves learning from data without labels. Clustering is a common unsupervised learning task where the goal is to group similar data points together.
Quiz 2: Regression Quiz Answers
Question 1: For linear regression, the model is f_{w,b}(x) = wx + b. Which of the following are the inputs, or features, that are fed into the model and with which the model is expected to make a prediction?
Answer:
- x
Explanation:
In linear regression, x represents the input features, or the independent variable(s), that are used to make predictions. w and b are parameters (weight and bias) of the model, and (x, y) represents the data points (input and output).
Question 2: For linear regression, if you find parameters w and b so that J(w,b) is very close to zero, what can you conclude?
Answer:
- The selected values of the parameters w and b cause the algorithm to fit the training set really well.
Explanation:
In linear regression, J(w,b) represents the cost function, which measures the error between the predicted and actual values. A cost close to zero indicates that the model’s predictions are very close to the true values, meaning it fits the training set well.
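To make this concrete, here is a minimal NumPy sketch (illustrative names, not the course’s official code) of the squared-error cost J(w,b) for one-feature linear regression:

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Squared-error cost J(w,b) = (1/2m) * sum((f_wb(x_i) - y_i)^2)."""
    m = x.shape[0]
    predictions = w * x + b          # f_wb(x) = w*x + b
    return np.sum((predictions - y) ** 2) / (2 * m)

# A perfect fit drives the cost to zero:
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                    # data generated by w=2, b=1
print(compute_cost(x, y, w=2.0, b=1.0))  # → 0.0
```

With the true parameters the predictions match the labels exactly, so J(w,b) = 0; any other choice of w and b yields a positive cost.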
Quiz 3: Train the model with gradient descent Quiz Answers
Question 1: When ∂J(w,b)/∂w is a negative number (less than zero), what happens to w after one update step?
Answer:
- w increases.
Explanation:
In gradient descent, the update step is computed as:
w = w − α ∂J(w,b)/∂w
If ∂J(w,b)/∂w is negative, then subtracting a negative value is equivalent to adding a positive value to w, causing w to increase.
Question 2: For linear regression, what is the update step for parameter b?
Answer:
- b = b − α (1/m) Σ_{i=1}^{m} (f_{w,b}(x^{(i)}) − y^{(i)})
Explanation:
The update rule for b in linear regression comes from gradient descent: the gradient of the cost function J(w,b) with respect to b is computed, scaled by the learning rate α, and subtracted from the current value of b. This expression correctly captures the update step for the bias parameter b.
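The update rules for w and b can be sketched together in a few lines of NumPy. This is an illustrative implementation with made-up data, not the course’s assignment code:

```python
import numpy as np

def gradient_descent_step(x, y, w, b, alpha):
    """One gradient descent update for linear regression f_wb(x) = w*x + b."""
    m = x.shape[0]
    error = (w * x + b) - y                 # f_wb(x_i) - y_i
    dj_dw = np.sum(error * x) / m           # ∂J/∂w
    dj_db = np.sum(error) / m               # ∂J/∂b
    return w - alpha * dj_dw, b - alpha * dj_db

x = np.array([1.0, 2.0])
y = np.array([5.0, 8.0])                    # generated by w=3, b=2
w, b = 0.0, 0.0
for _ in range(5000):
    w, b = gradient_descent_step(x, y, w, b, alpha=0.1)
print(round(w, 3), round(b, 3))             # → 3.0 2.0
```

Both parameters move opposite to their gradients each step, so the cost decreases until w and b recover the values that generated the data.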
Supervised Machine Learning: Regression and Classification Week 02 Quiz Answers
Quiz 1: Multiple Linear Regression Quiz Answers
Question 1: In the training set below, what is x_4^{(3)}? Please type in the number below (this is an integer such as 123, no decimal points).
Answer: 125
Question 2: Which of the following are potential benefits of vectorization? Please choose the best option.
Answer:
- All of the above
Explanation:
Vectorization can improve performance in multiple ways: it makes your code run faster by utilizing optimized matrix operations, can make the code shorter and more readable, and can enable parallelism on compute hardware like GPUs.
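The speed benefit is easy to see with a small timing sketch, assuming NumPy is available (exact timings vary by machine, which is why only the equality of results is checked):

```python
import numpy as np
import time

n = 1_000_000
rng = np.random.default_rng(0)
w = rng.random(n)
x = rng.random(n)
b = 1.5

# Unvectorized: explicit Python loop over every element
start = time.perf_counter()
f = 0.0
for j in range(n):
    f += w[j] * x[j]
f += b
loop_time = time.perf_counter() - start

# Vectorized: one optimized dot-product call
start = time.perf_counter()
f_vec = np.dot(w, x) + b
vec_time = time.perf_counter() - start

print(np.isclose(f, f_vec))   # → True: same result, far less code and time
```

The vectorized line is shorter, easier to read, and lets NumPy dispatch the work to optimized (and potentially parallel) routines.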
Question 3: True/False? To make gradient descent converge about twice as fast, a technique that almost always works is to double the learning rate α.
Answer:
- False
Explanation:
Doubling the learning rate α does not always make gradient descent converge faster. In fact, it can cause the algorithm to overshoot the optimal values, leading to divergence. Choosing an appropriate learning rate is crucial for effective convergence.
Quiz 2: Gradient descent in practice Quiz Answers
Question 1: Which of the following is a valid step used during feature scaling?
Answer:
- Subtract the mean (average) from each value and then divide by the (max − min).
Explanation:
Feature scaling involves transforming features to a common scale, and one common approach is subtracting the mean and dividing by the range (max – min) to normalize the data. This ensures all features have a similar scale for more effective learning.
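A minimal sketch of this mean-normalization step (the helper name and sample data are illustrative, not from the course):

```python
import numpy as np

def mean_normalize(X):
    """Mean normalization: (x - mean) / (max - min), applied per feature (column)."""
    mu = X.mean(axis=0)
    value_range = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / value_range

# Two features on very different scales (e.g. size in sq ft, number of bedrooms)
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [ 852.0, 2.0]])
X_norm = mean_normalize(X)
print(np.round(X_norm, 2))   # every value now lies within [-1, 1]
```

After scaling, both columns occupy a comparable range, so gradient descent no longer has to take tiny steps to accommodate the large-valued feature.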
Question 2: Suppose a friend ran gradient descent three separate times with three choices of the learning rate α and plotted the learning curves for each (cost J for each iteration). For which case, A or B, was the learning rate α likely too large?
Answer:
- Case B only
Explanation:
If the learning rate is too large, the cost function might oscillate or increase, indicating that the algorithm is overshooting the optimal point. Case B likely shows this behavior.
Question 3: Of the circumstances below, for which one is feature scaling particularly helpful?
Answer:
- Feature scaling is helpful when one feature is much larger (or smaller) than another feature.
Explanation:
When features have very different scales, gradient descent or other optimization algorithms can have trouble converging efficiently. Feature scaling helps by bringing all features into a similar scale, allowing the algorithm to work more effectively.
Question 4: You are helping a grocery store predict its revenue, and have data on its items sold per week, and price per item. What could be a useful engineered feature?
Answer:
- For each product, calculate the number of items sold times price per item.
Explanation:
The revenue for each product can be calculated by multiplying the number of items sold by the price per item. This would be a useful engineered feature that directly relates to the grocery store’s revenue.
Question 5: True/False? With polynomial regression, the predicted values f_{w,b}(x) do not necessarily have to be a straight-line (linear) function of the input feature x.
Answer:
- True
Explanation:
Polynomial regression extends linear regression by allowing the model to fit curves rather than just straight lines. This means the predicted values are not necessarily linear, even though the model is still based on a linear relationship between the coefficients and the input features.
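As a quick illustration, engineering the features x, x², x³ lets an otherwise linear model fit a curve. This sketch uses NumPy’s least-squares solver as a stand-in for running gradient descent; the data is synthetic:

```python
import numpy as np

# Target is a curve (y = x^2), not a line.
x = np.arange(0.0, 20.0, 1.0)
y = x ** 2

# Engineered polynomial features: x, x^2, x^3 become three "linear" features,
# plus a column of ones for the bias term b.
A = np.c_[x, x**2, x**3, np.ones_like(x)]
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coeffs, 3))   # weight on x^2 ≈ 1, all other coefficients ≈ 0
```

The model is still linear in its parameters w and b, but because one of the features is x², the fitted prediction is a parabola in the original input x.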
Supervised Machine Learning: Regression and Classification Week 03 Quiz Answers
Quiz 1: Classification with Logistic Regression Quiz Answers
Question 1: Which is an example of a classification task?
Answer:
- Based on the size of each tumor, determine if each tumor is malignant (cancerous) or not.
Explanation:
Classification tasks involve predicting a category or class. In this case, the task is to classify tumors as malignant or not, which is a binary classification problem.
Question 2: Recall the sigmoid function is g(z) = 1 / (1 + e^(−z)). If z is a large positive number, then:
Answer:
- g(z) will be near one (1)
Explanation:
For large positive values of z, the exponential term e^(−z) becomes very small, and thus the sigmoid function g(z) approaches 1.
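A quick sketch of the sigmoid’s behavior at the extremes (illustrative code, not course material):

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(100))   # ≈ 1.0 for large positive z
print(sigmoid(-100))  # ≈ 0.0 for large negative z
print(sigmoid(0))     # exactly 0.5
```

The output is squeezed into (0, 1), which is what lets logistic regression interpret g(z) as a probability.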
Question 3: A cat photo classification model predicts 1 if it’s a cat, and 0 if it’s not. For a particular photograph, the logistic regression model outputs g(z) (a number between 0 and 1). Which of these would be a reasonable criterion to decide whether to predict it’s a cat?
Answer:
- Predict it is a cat if g(z) ≥ 0.5
Explanation:
In binary classification using logistic regression, a common threshold is 0.5. If the model outputs a value greater than or equal to 0.5, it predicts class 1 (cat), otherwise, it predicts class 0 (not a cat).
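Thresholding the model output is a one-liner; this tiny sketch (hypothetical helper name) makes the decision rule explicit:

```python
def predict(g_z, threshold=0.5):
    """Predict class 1 (cat) when the model output meets the threshold."""
    return 1 if g_z >= threshold else 0

print(predict(0.7))   # → 1 (cat)
print(predict(0.3))   # → 0 (not a cat)
```

In practice the threshold can be moved away from 0.5 when the costs of false positives and false negatives differ.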
Question 4: True/False? No matter what features you use (including if you use polynomial features), the decision boundary learned by logistic regression will be a linear decision boundary.
Answer:
- False
Explanation:
The decision boundary is linear only in the feature space the model actually sees. With polynomial features (for example x₁² or x₁x₂), the boundary is linear in those engineered features but can be a curve, such as a circle or an ellipse, in the original input space. So logistic regression with polynomial features can learn non-linear decision boundaries.
Quiz 2: Cost function for logistic regression Quiz Answers
Question 1: In this lecture series, “cost” and “loss” have distinct meanings. Which one applies to a single training example?
Answer:
- Loss
Explanation:
Loss refers to the error made by the model on a single training example. The cost is typically the average of the losses over all examples in the dataset.
Question 2: For the simplified loss function, if the label y^{(i)} = 0, then what does this expression simplify to?
Answer:
- −log(1 − f_{w,b}(x^{(i)}))
Explanation:
For binary classification with a sigmoid output, the loss function is defined as:
L = −y^{(i)} log(f_{w,b}(x^{(i)})) − (1 − y^{(i)}) log(1 − f_{w,b}(x^{(i)}))
When y^{(i)} = 0, this simplifies to −log(1 − f_{w,b}(x^{(i)})), since the first term vanishes.
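This simplification can be checked numerically; a small sketch with an illustrative prediction value f = 0.2:

```python
import numpy as np

def logistic_loss(f_x, y):
    """L = -y*log(f) - (1-y)*log(1-f) for a single training example."""
    return -y * np.log(f_x) - (1 - y) * np.log(1 - f_x)

# With y = 0 the first term vanishes, leaving -log(1 - f):
f = 0.2
print(np.isclose(logistic_loss(f, 0), -np.log(1 - f)))  # → True
# Symmetrically, with y = 1 only -log(f) remains:
print(np.isclose(logistic_loss(f, 1), -np.log(f)))      # → True
```

Each branch penalizes confident wrong predictions heavily: the loss grows without bound as f approaches the wrong extreme.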
Quiz 3: Gradient descent for logistic regression Quiz Answers
Question 1: Which of the following two statements is more accurate about gradient descent for logistic regression?
Answer:
- The update steps look like the update steps for linear regression, but the definition of f_{w,b}(x^{(i)}) is different.
Explanation:
In both linear regression and logistic regression, the update steps follow the same pattern of adjusting the parameters based on the gradient of the cost function, and the gradient formulas even look identical. However, in logistic regression f_{w,b}(x^{(i)}) is the sigmoid of a linear combination of the inputs rather than the linear combination itself, so the same-looking update steps compute different values.
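This similarity is easiest to see in code. In the sketch below (illustrative, with a made-up separable dataset), only the line computing f differs from a linear-regression update:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd_step(X, y, w, b, alpha):
    """Same update form as linear regression, but f_wb is the sigmoid."""
    m = X.shape[0]
    f = sigmoid(X @ w + b)        # only this line differs from linear regression
    error = f - y
    dj_dw = X.T @ error / m
    dj_db = np.sum(error) / m
    return w - alpha * dj_dw, b - alpha * dj_db

# Tiny separable dataset: label 1 when the single feature is large
X = np.array([[1.0], [2.0], [8.0], [9.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = np.zeros(1), 0.0
for _ in range(1000):
    w, b = logistic_gd_step(X, y, w, b, alpha=0.1)
preds = (sigmoid(X @ w + b) >= 0.5).astype(int)
print(preds)   # → [0 0 1 1]
```

Swapping `sigmoid(X @ w + b)` for `X @ w + b` turns this into exactly the linear-regression update, which is the point of the quiz answer.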
Quiz 4: The problem of overfitting Quiz Answers
Question 1: Which of the following can address overfitting?
Answer:
- Apply regularization
Explanation:
Regularization helps prevent overfitting by penalizing large weights, thereby simplifying the model. Collecting more training data and selecting a smaller, more relevant set of features also address overfitting; removing a random set of training examples does not.
Question 2: You fit logistic regression with polynomial features to a dataset, and your model looks like this. What would you conclude?
Answer:
- The model has high variance (overfit). Thus, adding data is likely to help
Explanation:
A model with high variance (overfitting) typically fits the training data too well but fails to generalize to unseen data. Adding more data can help the model generalize better and reduce overfitting.
Question 3: Suppose you have a regularized linear regression model. If you increase the regularization parameter λ, what do you expect to happen to the parameters w_1, w_2, …, w_n?
Answer:
- This will reduce the size of the parameters w_1, w_2, …, w_n
Explanation:
Increasing the regularization parameter λ penalizes large weights in the model, leading to smaller parameter values. This is a key aspect of how regularization discourages overly complex models and reduces overfitting.
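The shrinking effect can be seen numerically. This sketch uses the closed-form solution for L2-regularized (ridge) linear regression rather than gradient descent, with made-up data and λ values chosen purely for illustration:

```python
import numpy as np

# Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([4.0, -2.0, 3.0]) + rng.standard_normal(50) * 0.1

for lam in [0.0, 10.0, 1000.0]:
    w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    print(lam, np.round(np.linalg.norm(w), 3))   # the norm of w shrinks as lambda grows
```

Each increase in λ pulls the solution further toward zero: the larger the penalty on the weights, the smaller the fitted parameters.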
Sources: Supervised Machine Learning: Regression and Classification
Frequently Asked Questions (FAQ)
Are the Supervised Machine Learning: Regression and Classification quiz answers accurate?
Yes, these answers have been carefully verified to ensure they align with the latest course content and concepts in supervised machine learning.
Can I use these answers for both practice and graded quizzes?
Absolutely! These answers are designed for both practice quizzes and graded assessments, ensuring thorough preparation for all quizzes.
Does this guide cover all modules of the course?
Yes, this guide includes answers for every module, offering complete coverage for the entire course.
Will this guide help me understand regression and classification models better?
Yes, beyond providing quiz answers, this guide reinforces key machine learning algorithms and techniques such as linear regression, logistic regression, decision trees, and k-nearest neighbors.
Conclusion
We hope this guide to Supervised Machine Learning: Regression and Classification Quiz Answers helps you excel in your course and gain a strong understanding of regression and classification models. Bookmark this page for easy access and share it with your classmates.
Ready to master supervised learning algorithms and ace your quizzes? Let’s get started!
Get All Course Quiz Answers of Machine Learning Specialization
Supervised Machine Learning: Regression and Classification Quiz Answers
Advanced Learning Algorithms Coursera Quiz Answers
Unsupervised Learning, Recommenders, Reinforcement Learning Quiz Answers