Design Thinking and Predictive Analytics for Data Products Quiz Answers

Get All Weeks Design Thinking and Predictive Analytics for Data Products Quiz Answers

This is the second course in the four-course specialization Python Data Products for Predictive Analytics, building on the data processing covered in Course 1 and introducing the basics of designing predictive models in Python. In this course, you will understand the fundamental concepts of statistical learning and learn various methods of building predictive models.

At each step in the specialization, you will gain hands-on experience in data manipulation and build your skills, eventually culminating in a capstone project encompassing all the concepts taught in the specialization.

Enroll on Coursera


Week 1 Quiz Answers

Quiz 1: Supervised Learning

Q1. What are some disadvantages of designing a system based on two different platforms?

  • This system may not adapt well to new settings.
  • This system may depend on false assumptions about how users relate to items.
  • This system does not rely on any data.

Q2. What do y, X, and theta represent in a linear regression equation?

  • y – weights or parameters we tune to make the prediction
    • X – features, the things we will use to predict
    • theta – labels, the thing we want to predict
  • y – weights or parameters we tune to make the prediction
    • X – labels, the thing we want to predict
    • theta – features, the things we will use to predict
  • y – labels, the thing we want to predict
    • X – weights or parameters we tune to make the prediction
    • theta – features, the things we will use to predict
  • y – labels, the thing we want to predict
    • X – features, the things we will use to predict
    • theta – weights or parameters we tune to make the prediction
  • y – features, the things we will use to predict
    • X – labels, the thing we want to predict
    • theta – weights or parameters we tune to make the prediction

Quiz 2: Review: Regression

Q1. Why did we implement the feature function with a 1 at the 0th index (like below)?

def feature(datum):
    feat = [1, datum[0], datum[1]]
    return feat

  • To initialize the model
  • To equalize all the parameters in the model
  • To balance outliers in the data
  • To account for the intercept parameter
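The constant 1 pairs with theta_0, so the fitted model has an intercept; without it, the line is forced through the origin. A minimal sketch with hypothetical data, using numpy's least squares to recover the intercept:

```python
import numpy as np

def feature(datum):
    # the leading 1 multiplies theta_0, the intercept parameter
    return [1, datum[0], datum[1]]

# hypothetical data generated from y = 3 + 2*x1 + 0.5*x2
data = [(1.0, 2.0), (2.0, 0.0), (0.0, 4.0), (3.0, 1.0)]
X = np.array([feature(d) for d in data])
y = np.array([3 + 2 * d[0] + 0.5 * d[1] for d in data])

theta, *_ = np.linalg.lstsq(X, y, rcond=None)
# theta[0] is the recovered intercept, ~3.0
```
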

Q2. Which of the following is true about autoregression?

  • It is fitted with logistic regression
  • It is fitted with least-squares
  • It can be described as a time-series model that is trained with and takes input from previous observations.
  • It allows us to capture periodic effects, such as weekly sales or website traffic
  • The past observations used as input are weighted by chronological order

Quiz 3: Supervised Learning & Regression

Q1. What is the main difference between supervised and unsupervised learning?

  • Supervised learning lets you understand the relationship between observations
  • Supervised learning is optimized for predictive tasks
  • Unsupervised learning finds patterns in data for accurate predictions
  • Unsupervised learning lets you understand the relationship between the response and predictors

Q2. What is linear regression?

  • Approximating a line to fit the data
  • Y = x * θ
  • Inflexible
  • All of the above

Q3. What is the biggest difference between first three methods (sliding window, linear decay, exponential decay) and the fourth model (autoregression) for time-series regression?

  • The first three can capture periodic effects, while the fourth cannot
  • The first three assign weights based on an arbitrary scheme, while the fourth learns the weights.
  • The first three are more efficient computationally than the fourth
  • The first three can be programmed more easily than the fourth

Q4. In the line of code pertaining to autoregression:

# This is Python, not JavaScript
# windowSize = 10
x = [float(d[5]) for d in dataset[ind-windowSize:ind]]
What does the variable x mean conceptually?
  • x is a vector of the past 5 and next 10 observations
  • x is a vector of the past 10 observations
  • More information is needed
  • x is a vector of the next 10 observations
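Concretely, the slice grabs the windowSize observations immediately before position ind. A toy sketch (in the course's data, column 5 held the observed value; here a plain list stands in):

```python
# toy stand-in for the course's time series
series = [float(v) for v in range(100)]

windowSize = 10
ind = 50
x = series[ind - windowSize:ind]  # the 10 observations just before ind
# the autoregression label to predict would be series[ind]
```
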

Week 2 Quiz Answers

Quiz 1: Getting Features

Q1. Why do we use one-hot encoding?

  • To correctly handle continuous variables in data analysis
  • Because label encoding assumes an order to categorical variables (i.e. the higher, the better)
  • To get a better representation of categorical variables in a dataset
  • Because two hots are too many

Q2. You want to build a piecewise model based on days of the week. Which of the following could be the one-hot encoding for this model’s features?

  • Monday: [1, 1, 1, 1, 1, 1, 0]
    • Tuesday: [1, 1, 1, 1, 1, 0, 0]
    • Wednesday: [1, 1, 1, 1, 0, 0, 0]
    • Thursday: [1, 1, 1, 0, 0, 0, 0]
    • Friday: [1, 1, 0, 0, 0, 0, 0]
    • Saturday: [1, 0, 0, 0, 0, 0, 0]
  • Monday: [1, 0, 1, 0, 0, 0, 0]
    • Tuesday: [1, 0, 1, 0, 0, 0, 0]
    • Wednesday: [1, 0, 0, 1, 0, 0, 0]
    • Thursday: [1, 0, 0, 1, 0, 0, 0]
    • Friday: [1, 0, 0, 0, 0, 1, 0]
    • Saturday: [1, 0, 0, 0, 0, 1, 0]
  • Monday: [1, 0, 0, 0, 0, 0, 0]
    • Tuesday: [0, 1, 0, 0, 0, 0, 0]
    • Wednesday: [0, 0, 1, 0, 0, 0, 0]
    • Thursday: [0, 0, 0, 1, 0, 0, 0]
    • Friday: [0, 0, 0, 0, 1, 0, 0]
    • Saturday: [0, 0, 0, 0, 0, 1, 0]
  • Monday: [1, 1, 0, 0, 0, 0, 0]
    • Tuesday: [1, 0, 1, 0, 0, 0, 0]
    • Wednesday: [1, 0, 0, 1, 0, 0, 0]
    • Thursday: [1, 0, 0, 0, 1, 0, 0]
    • Friday: [1, 0, 0, 0, 0, 1, 0]
    • Saturday: [1, 0, 0, 0, 0, 0, 1]
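In plain one-hot encoding, each category gets its own position and exactly one position is set; for a piecewise model, a leading constant 1 can be prepended for the intercept. A minimal sketch of the plain version:

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def one_hot(day):
    # exactly one position is "hot"; the rest are zero
    vec = [0] * len(days)
    vec[days.index(day)] = 1
    return vec
```
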

Quiz 2: Working with Features

Q1. Why is it okay if we transform features for linear models, but not the parameters?

  • Changing the parameters should be avoided at all costs
  • Parameters for linear models are not able to be modified
  • Changing the values of features results in negligible changes in the model
  • Linear models also need linear parameters
  • Arbitrary combinations of features will not modify the linearity of the parameters

Q2. Which of the following are reasonable estimates to replace missing values (for missing data imputation)?

  • The median of all instances of that feature
  • 0
  • The average of all numerical values in the dataset
  • The average of all instances of some subgroup
  • The average of all instances of that feature
  • A value placed by an additional prediction
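A minimal sketch of mean imputation, one of the estimates above (hypothetical helper, with None marking missing values):

```python
import statistics

def impute_mean(column):
    # fill each missing value with the mean of the observed values
    observed = [v for v in column if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in column]
```

Swapping statistics.median for statistics.mean gives median imputation instead.
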

Quiz 3: Features

Q1. Which of the following matrices could be used in one-hot encoding for a 5-dimensional dataset?

  • [0, 0, 1, 0]
  • [1, 1, 1, 0]
  • [0, 0, 0, 0]
  • [1, 0, 1, 0]

Q2. Why is it more difficult to create a linear regression model for categorical features?

  • We can’t create regression models with categorical features
  • Because there are too many data points, so a straight line is not flexible enough to fit the data
  • Because categorical features must be measured with two lines or more
  • Because the outputs are not quantitative

Q3. What is the limitation of feature transformations in linear models?

  • We can make different combinations of features, but not parameters
  • We can make different combinations of parameters, but not features
  • We can only transform quantitative features (i.e. non-categorical)
  • The models can only have linear features

Q4. Which of the following can be good strategies for dealing with missing data?

  • Modeling: change our regression/classification algorithms to explicitly deal with missing values.
  • Filtering: discard missing values
  • Missing Data Imputation: fill in missing data with reasonable estimates.

Q5. Why shouldn’t we simply discard every instance that has a missing value in the data?

  • Missing values are very influential in a dataset, so throwing them out means that our model will be extremely inaccurate.
  • If many features are missing, we can end up tossing out a large percentage or even all of the dataset.
  • We can simply replace missing values with 0 (or equivalent) to keep and use every instance of the data.

Week 3 Quiz Answers

Quiz 1: Classification and K-Nearest Neighbors

Q1. What is the best definition of classification below?

  • A model that infers how a categorical variable is related to other variables
  • A model that infers how a numerical variable is related to other variables
  • A model that seeks to predict the numerical value of some variable
  • A model that seeks to predict the category of some variable

Q2. Which of the following is/are classification problem(s)?

  • Predicting the gender of a person by his/her handwriting style
  • Predicting the number of copies of a music album that will be sold next month
  • Predicting house price based on area
  • Predicting whether the monsoon will be normal next year

Q3. A selected Flower has its neighbors as Tall, Short, Short, Tall, Tall, Short, from nearest to furthest in that order. What would the Flower be classified as, given a K of 5?

  • Short
  • Tall
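K-Nearest Neighbors classifies by majority vote among the K closest points, so with K = 5 only the first five labels count. A minimal sketch (labels assumed pre-sorted by distance):

```python
from collections import Counter

def knn_vote(neighbor_labels, k):
    # majority vote among the k nearest neighbors
    return Counter(neighbor_labels[:k]).most_common(1)[0][0]

neighbors = ["Tall", "Short", "Short", "Tall", "Tall", "Short"]
# with k=5 the vote is 3 Tall vs 2 Short
```
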

Quiz 2: Logistic Regression and Support Vector Machines

Q1. Which of the following are questions answerable with logistic regression (vs. linear regression)?

  • Will a given stock issue a dividend this year based on last year’s percent profit?
  • What is the probability that a student who studies for 40 hours and has an undergrad GPA of 3.5 gets an A in the class?
  • How well does square footage affect a house’s price?
  • Is at least one of weight, height, body mass index, or number of cigarettes smoked per week useful in predicting a heart attack?

Q2. What is the basic idea behind Support Vector Machines?

  • To take some percentage of the closest datapoints and use majority-rules to classify each point
  • To create vectors that each generally encompass a class
  • To find a border, or hyperplane, that best divides multidimensional data into classes
  • To calculate the probability of a predicted variable
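A minimal sketch of that hyperplane idea using scikit-learn's SVC on tiny synthetic data (assumes scikit-learn is installed; not the course's exact code):

```python
from sklearn.svm import SVC

# two well-separated clusters in 2-D
X = [[0, 0], [1, 1], [0, 1], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear")  # fit a separating hyperplane
clf.fit(X, y)
pred = clf.predict([[0.5, 0.5], [5.5, 5.5]])
```
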

Quiz 3: Classification

Q1. What is the difference between regression and classification?

  • In regression, we predict the output value with training data. In classification, we group the output into a class.
  • In regression, we predict future data from the given training data. In classification, we infer relationships between the given training data.
  • In regression, the dependent variables (outcomes, labels, etc.) are categorical and unordered. In classification, the dependent variables (outcomes, labels, etc.) are continuous values or ordered whole values.

Q2. How do we calculate the accuracy of a classification?

  • We record our model’s predictions and see how many of them match the actual outcomes of the data.
  • We record our model’s predictions and see how close each value is to the actual outcome.
  • We use the Mean Squared Error of the distance between predicted values and actual outcomes.

Q3. You have the following data and want to classify a point with X1 = 1, X2 = 2, X3 = 3 with K-Nearest Neighbors. What would this point be classified as if K = 3?

  • Dog
  • Cat

Q4. True or False: We use logistic regression to model the probability of a certain outcome

  • True
  • False

Q5. True or False: We can use K-Nearest Neighbors for both classification and regression.

  • False
  • True

Q6. What is the difference between linear and logistic regression?

  • The outcome (label) for linear regression is categorical, like red/green or dog/cat. The outcome (label) for logistic regression is continuous, like weight or number of hours.
  • Linear regression results in a line, whereas logistic regression gives us an S-shaped curve.
  • The outcome (label) for linear regression can have any one of an infinite set of values. The outcome (label) for logistic regression has a limited number of possible values.
  • The equation for linear regression is in the form of a line (y = mx + b), whereas the equation for logistic regression is given by the logistic function (y = e^X + e^(-X)).
  • The outcome (label) for linear regression is the actual value of the outcome. The outcome (label) for logistic regression is the probability of getting a certain outcome.
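The S-shaped curve comes from the logistic (sigmoid) function, which squashes any real-valued linear score into a probability in (0, 1):

```python
import math

def sigmoid(z):
    # maps any real number into the open interval (0, 1)
    return 1 / (1 + math.exp(-z))
```

A linear model's output is unbounded; passing it through the sigmoid is what turns it into a probability.
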

Q7. What is true of Support Vector Machines (SVMs)?

  • SVMs are effective in low-dimensional spaces
  • SVMs use kernel functions to divide classes/labels.
  • SVMs will never overfit if there are many more features than samples in the dataset.
  • SVMs are memory-efficient since they use a subset of training points rather than all the points

Week 4 Quiz Answers

Quiz 1: Classification and Training

Q1. Why is it important to test out your model on shuffled, unseen data?

  • Shuffled samples will help us avoid areas of the model that under/overfit the data.
  • If we evaluate a model based on data it has seen before, we may overestimate its performance.
  • Using shuffled samples ensures that we don’t know which datapoints are used for what.

Q2. What library do we import for easily creating a logistic regression model?

  • json
  • matplotlib
  • numpy
  • sklearn
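A minimal sketch of fitting a logistic regression with sklearn on toy data (assumes scikit-learn is installed; not the course's exact code):

```python
from sklearn.linear_model import LogisticRegression

# toy 1-D data: small values are class 0, large values are class 1
X = [[0], [1], [2], [10], [11], [12]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)
probs = model.predict_proba([[1], [11]])  # class probabilities per input
```
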

Quiz 2: Gradient Descent

Q1. Why do we initialize theta_0 (offset parameter) to the mean value of the output labels?

  • To help the model converge faster
  • To offset the gradient steps by a pre-given value
  • To match the mathematical equation given in lecture
  • To optimize the gradient descent algorithm

Q2. What are some pitfalls we should watch out for when using TensorFlow to compute a gradient descent?

  • We should compute the partial derivatives ourselves and pass them to the TensorFlow library
  • We should convert our vector of labels to a TensorFlow row vector, rather than a column vector
  • We should convert our vector of labels to a TensorFlow column vector, rather than a row vector
  • We should explicitly define our variables as variables, rather than leaving them as constants

Quiz 3: More on Classification

Q1. Which of the following are true about gradient descents?

  • The size of each iterative step in the direction of steepest descent is called the learning rate.
  • Mathematically, it is a partial derivative with respect to its inputs.
  • It is used as a way to update the parameters of your model.
  • It is an optimization procedure used with many machine learning algorithms to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient.
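That iterative procedure can be sketched on a toy one-parameter objective f(theta) = (theta - 4)^2, whose gradient is 2*(theta - 4):

```python
theta = 0.0
learning_rate = 0.1  # the size of each step

for _ in range(100):
    gradient = 2 * (theta - 4)          # derivative of the objective
    theta -= learning_rate * gradient   # move against the gradient

# theta converges toward the minimizer, 4
```
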

Q2. Why do we use separate training and testing datasets?

  • We use training sets to build our model and testing sets to make our model reach 100% accuracy on our overall dataset.
  • We use training sets to build our model and testing sets to see how well our model performs on never-before-seen data.
  • We should always use our training set as our testing set.

Q3. What does the TensorFlow method tf.constant() do for us?

  • It creates a randomized list of values to sort our dataset by.
  • It creates a tensor (generalized matrix) populated with the given values.
  • It creates a constant variable that cannot be changed for the rest of a Jupyter notebook cell.

Q4. What is the main advantage of TensorFlow for gradient descents?

  • TensorFlow computes the gradient with various built-in functions like optimizer.minimize() and tf.train.AdamOptimizer().
  • TensorFlow automatically assumes the Mean Squared Error as the model’s objective with reduce_mean().
  • TensorFlow always minimizes the error of the optimization.

Q5. By what factor is our Theta modified in each iteration of Gradient Descent in Python?

  • dTheta * Inner Product
  • Learning Rate * dTheta
  • Learning Rate * Norm

Q6. In a data set containing n variables, what is the maximum number of variables usable to calculate a specific feature?

  • n
  • n + 1
  • n – 1

Q7. What will happen if all values of theta_n = 0 for n > 0? (Say theta_0 = 1)

  • The function will be constant.
  • The function is independent of the other variables.
  • The function will be infinite.
Conclusion:

I hope these Design Thinking and Predictive Analytics for Data Products Coursera Quiz Answers helped you learn something new from this course. If they helped you, don’t forget to bookmark our site for more quiz answers.

This course is intended for audiences of all experience levels who are interested in learning new skills; there are no prerequisite courses.

Keep Learning!

Get All Course Quiz Answers of Python Data Products for Predictive Analytics Specialization

Basic Data Processing and Visualization Coursera Quiz Answers

Design Thinking and Predictive Analytics for Data Products Quiz Answers

Meaningful Predictive Modeling Coursera Quiz Answers

Deploying Machine Learning Models Coursera Quiz Answers
