Get All Weeks Machine Learning Foundations: A Case Study Approach Quiz Answers
Table of Contents
Week 1: Machine Learning Foundations: A Case Study Approach Quiz Answer
Quiz 1: SFrames
Q 1: Download the Wiki People SFrame. Then open a new Jupyter notebook, import TuriCreate, and read the SFrame data.
Answer: Click here
Q 2: How many rows are in the SFrame? (Do NOT use commas or periods.)
Answer: 59071
Q 3: Which name is in the last row?
- Conradign Netzer
- Cthy Caruth
- Fawaz Damrah
Q 4: Read the text column for Harpdog Brown. He was honored with:
- A Grammy award for his latest blues album.
- A gold harmonica to recognize his innovative playing style.
- A lifetime membership in the Hamilton Blues Society.
Q 5: Sort the SFrame according to the text column, in ascending order. What is the name entry in the first row?
- Zygfryd Szo
- Digby Morrell
- 007 James Bond
- 108 (artist)
- 8 Ball Aitken
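Q 5 comes down to plain lexicographic string ordering, in which digit characters sort before letters. A minimal pure-Python sketch (using the answer options above as a hypothetical sample) shows why an entry beginning with digits lands first:

```python
# Python's sorted() uses lexicographic (code-point) order,
# so strings starting with digits sort before strings starting with letters.
names = ["Zygfryd Szo", "Digby Morrell", "007 James Bond",
         "108 (artist)", "8 Ball Aitken"]
print(sorted(names)[0])  # entries starting with '0' come first
```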
Week 2: Machine Learning Foundations: A Case Study Approach Quiz Answer
Quiz 1: Regression
Q 2: True or false: The model that best minimizes training error is the one that will perform best for the task of prediction on new data.
- True
- False
Q 3: The following table illustrates the results of evaluating 4 models with different parameter choices on some data set. Which of the following models fits this data the best?
Model index | Parameters (intercept, slope) | Residual sum of squares (RSS) |
1 | (0,1.4) | 20.51 |
2 | (3.1,1.4) | 15.23 |
3 | (2.7, 1.9) | 13.67 |
4 | (0, 2.3) | 18.99 |
- Model 1
- Model 2
- Model 3
- Model 4
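"Fits the data best" here means lowest residual sum of squares; a minimal sketch picking the minimum-RSS row from the table above:

```python
# RSS values from the table; the best-fitting model minimizes RSS.
rss = {1: 20.51, 2: 15.23, 3: 13.67, 4: 18.99}
best_model = min(rss, key=rss.get)
print(best_model)  # model 3 has the smallest RSS
```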
Q 4: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

- w0
- w1
- w2
- none of the above
Q 5: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)
- w0
- w1
- w2
- none of the above
Q 6: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)
- w0
- w1
- w2
- none of the above
Q 7: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)
- w0
- w1
- w2
- none of the above
Q 8: Which of the following plots would you not expect to see as a plot of training and test error curves?
Answer:
Q 9: True or false: One always prefers to use a model with more features since it better captures the true underlying process.
- True
- False
Quiz 2: Predicting house prices
Q 1: Selection and summary statistics: We found the zip code with the highest average house price. What is the average house price of that zip code?
- $75,000
- $7,700,000
- $540,088
- $2,160,607
Q 2: Filtering data: What fraction of the houses have living space between 2000 sq.ft. and 4000 sq.ft.?
- Between 0.2 and 0.29
- Between 0.3 and 0.39
- Between 0.4 and 0.49
- Between 0.5 and 0.59
- Between 0.6 and 0.69
Q 3: Building a regression model with several more features: What is the difference in RMSE between the model trained with my_features and the one trained with advanced_features?
- the RMSE of the model with advanced_features lower by less than $25,000
- the RMSE of the model with advanced_features lower by between $25,001 and $35,000
- the RMSE of the model with advanced_features lower by between $35,001 and $45,000
- the RMSE of the model with advanced_features lower by between $45,001 and $55,000
- the RMSE of the model with advanced_features lower by more than $55,000
Week 3: Machine Learning Foundations: A Case Study Approach Quiz Answer
Quiz 1: Classification
Q 1: The simple threshold classifier for sentiment analysis described in the video (check all that apply):
- Must have pre-defined positive and negative attributes
- Must either count attributes equally or pre-define weights on attributes
- Defines a possibly non-linear decision boundary
Q 2: For a linear classifier classifying between “positive” and “negative” sentiment in a review x, Score(x) = 0 implies (check all that apply):
- The review is very clearly “negative”
- We are uncertain whether the review is “positive” or “negative”
- We need to retrain our classifier because an error has occurred
Q 3: For which of the following datasets would a linear classifier perform perfectly?
Answer:

Q 4: True or false: High classification accuracy always indicates a good classifier.
- True
- False
Q 5: True or false: For a classifier classifying between 5 classes, there always exists a classifier with an accuracy greater than 0.18.
- True
- False
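The reasoning behind Q 5: a classifier that always predicts the most frequent of the 5 classes is correct at least 1/5 = 0.2 of the time, so an accuracy above 0.18 is always achievable. A sketch with a hypothetical label sample:

```python
from collections import Counter

# Hypothetical test labels over 5 classes; by the pigeonhole principle
# the most frequent class covers at least 1/5 of any label set.
labels = [0, 1, 2, 3, 4, 4, 2, 1, 0, 4]
majority_count = Counter(labels).most_common(1)[0][1]
accuracy = majority_count / len(labels)
print(accuracy)           # 0.3 for this sample
print(accuracy >= 1 / 5)  # always True, and 1/5 = 0.2 > 0.18
```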
Q 6: True or false: A false negative is always worse than a false positive.
- True
- False
Q 7: Which of the following statements are true? (Check all that apply)
- Test error tends to decrease with more training data until a point, and then does not change (i.e., curve flattens out)
- Test error always goes to 0 with an unboundedly large training dataset
- Test error is never a function of the amount of training data
Quiz 2: Analyzing product sentiment
Q 1: Out of the 11 words in selected_words, which one is most used in the reviews in the dataset?
- awesome
- love
- hate
- bad
- great
Q 2: Out of the 11 words in selected_words, which one is least used in the reviews in the dataset?
- wow
- amazing
- terrible
- awful
- love
Q 3: Out of the 11 words in selected_words, which one got the most positive weight in the selected_words_model?
(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)
- amazing
- awesome
- love
- fantastic
- terrible
Q 4: Out of the 11 words in selected_words, which one got the most negative weight in the selected_words_model?
(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)
- horrible
- terrible
- awful
- hate
- love
Q 5: Which of the following ranges contains the accuracy of the selected_words_model on the test_data?
- 0.811 to 0.841
- 0.841 to 0.871
- 0.871 to 0.901
- 0.901 to 0.931
Q 6: Which of the following ranges contains the accuracy of the sentiment_model in the IPython Notebook from lecture on the test_data?
- 0.811 to 0.841
- 0.841 to 0.871
- 0.871 to 0.901
- 0.901 to 0.931
Q 7: Which of the following ranges contains the accuracy of the majority class classifier, which simply predicts the majority class on the test_data?
- 0.811 to 0.843
- 0.843 to 0.871
- 0.871 to 0.901
- 0.901 to 0.931
Q 8: How do you compare the different learned models with the baseline approach where we are just predicting the majority class?
- They all performed about the same.
- The model learned using all words performed much better than the one using only the selected_words. And, the model learned using the selected_words performed much better than just predicting the majority class.
- The model learned using all words performed much better than the other two. The other two approaches performed about the same.
- Simply predicting the majority class performed much better than the other two models.
Q 9: Which of the following ranges contains the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’, according to the sentiment_model from the IPython Notebook from lecture?
- Below 0.7
- 0.7 to 0.8
- 0.8 to 0.9
- 0.9 to 1.0
Q 10: Consider the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model from the IPython Notebook from lecture. Which of the following ranges contains the predicted_sentiment for this review, if we use the selected_words_model to analyze it?
- Below 0.7
- 0.7 to 0.8
- 0.8 to 0.9
- 0.9 to 1.0
Q 11: Why is the value of the predicted_sentiment for the most positive review found using the sentiment_model much more positive than the value predicted using the selected_words_model?
- The sentiment_model is just too positive about everything.
- The selected_words_model is just too negative about everything.
- This review was positive, but used too many of the negative words in selected_words.
- None of the selected words appeared in the text of this review.
Week 4: Machine Learning Foundations: A Case Study Approach Quiz Answer
Quiz 1: Clustering and Similarity
Question 1: A country, called Simpleland, has a language with a small vocabulary of just “the”, “on”, “and”, “go”, “round”, “bus”, and “wheels”. For a word count vector with indices ordered as the words appear above, what is the word count vector for a document that simply says “the wheels on the bus go round and round”?
Please enter the vector of counts as follows: If the counts were [“the”=1, “on”=3, “and”=2, “go”=1, “round”=2, “bus”=1, “wheels”=1], enter 1321211.
Answer: 2111211
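Counting the words of “the wheels on the bus go round and round” against the vocabulary order (the, on, and, go, round, bus, wheels) can be checked directly:

```python
vocab = ["the", "on", "and", "go", "round", "bus", "wheels"]
doc = "the wheels on the bus go round and round"
# Count each vocabulary word in the document, in vocabulary order.
counts = [doc.split().count(w) for w in vocab]
print("".join(str(c) for c in counts))  # 2111211
```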
Question 2: In Simpleland, a reader is enjoying a document with a representation: [1 3 2 1 2 1 1]. Which of the following articles would you recommend to this reader next?
- [7 0 2 1 0 0 1]
- [1 7 0 0 2 0 1]
- [1 0 0 0 7 1 2]
- [0 2 0 0 7 1 1]
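A reasonable way to pick the recommendation is cosine similarity between the reader's word count vector and each candidate article; a self-contained sketch:

```python
import math

def cosine(u, v):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

reader = [1, 3, 2, 1, 2, 1, 1]
candidates = [[7, 0, 2, 1, 0, 0, 1],
              [1, 7, 0, 0, 2, 0, 1],
              [1, 0, 0, 0, 7, 1, 2],
              [0, 2, 0, 0, 7, 1, 1]]
best = max(candidates, key=lambda v: cosine(reader, v))
print(best)  # [1, 7, 0, 0, 2, 0, 1] is most similar to the reader's document
```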
Question 3: A corpus in Simpleland has 99 articles. If you pick one article and perform a 1-nearest neighbor search to find the closest article to this query article, how many times must you compute the similarity between two articles?
- 98
- 98*2 = 196
- 98/2 = 49
- (98)^2
- 99
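A brute-force 1-nearest-neighbor search compares the query article against every other article exactly once, so with 99 articles that is 99 - 1 = 98 similarity computations:

```python
n_articles = 99
# The query article is compared against every article except itself.
n_comparisons = n_articles - 1
print(n_comparisons)  # 98
```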
Question 4: For the TF-IDF representation, does the relative importance of words in a document depend on the base of the logarithm used? For example, take the words “bus” and “wheels” in a particular document. Is the ratio between the TF-IDF values for “bus” and “wheels” different when computed using log base 2 versus log base 10?
- Yes
- No
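Changing the base of the logarithm rescales every IDF value by the same constant (log_b x = ln x / ln b), so the ratio between two TF-IDF values is unchanged. A numeric sketch with hypothetical term and document frequencies:

```python
import math

# Hypothetical corpus statistics for two words in one document.
n_docs = 1000
tf = {"bus": 3, "wheels": 5}     # term frequencies in the document
df = {"bus": 40, "wheels": 200}  # document frequencies in the corpus

def tfidf_ratio(base):
    # TF-IDF with the given log base; the ratio of two TF-IDF values
    # is independent of the base, since the base cancels.
    idf = {w: math.log(n_docs / df[w], base) for w in tf}
    return (tf["bus"] * idf["bus"]) / (tf["wheels"] * idf["wheels"])

print(round(tfidf_ratio(2), 10) == round(tfidf_ratio(10), 10))  # True
```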
Question 5: Which of the following statements are true? (Check all that apply):
- Deciding whether an email is spam or not spam using the text of the email and some spam / not spam labels is a supervised learning problem.
- Dividing emails into two groups based on the text of each email is a supervised learning problem.
- If we are performing clustering, we typically assume we either do not have or do not use class labels in training the model.
Question 6: Which of the following pictures represents the best k-means solution? (Squares represent observations, plus signs are cluster centers, and colors indicate assignments of observations to cluster centers.)
Answer

Quiz 2: Retrieving Wikipedia articles
Question 1: Top word count words for Elton John
- (the, john, singer)
- (england, awards, musician)
- (the, in, and)
- (his, the, since)
- (rock, artists, best)
Question 2: Top TF-IDF words for Elton John
- (furnish,elton,billboard)
- (john,elton,fivedecade)
- (the,of,has)
- (awards,rock,john)
- (elton,john,singer)
Question 3: The cosine distance between ‘Elton John’s and ‘Victoria Beckham’s articles (represented with TF-IDF) falls within which range?
- 0.1 to 0.29
- 0.3 to 0.49
- 0.5 to 0.69
- 0.7 to 0.89
- 0.9 to 1.0
Question 4: The cosine distance between ‘Elton John’s and ‘Paul McCartney’s articles (represented with TF-IDF) falls within which range?
- 0.1 to 0.29
- 0.3 to 0.49
- 0.5 to 0.69
- 0.7 to 0.89
- 0.9 to 1.0
Question 5: Who is closer to ‘Elton John’, ‘Victoria Beckham’ or ‘Paul McCartney’?
- Victoria Beckham
- Paul McCartney
Question 6: Who is the nearest cosine-distance neighbor to ‘Elton John’ using raw word counts?
- Billy Joel
- Cliff Richard
- Roger Daltrey
- George Bush
Question 7: Who is the nearest cosine-distance neighbor to ‘Elton John’ using TF-IDF?
- Roger Daltrey
- Rod Stewart
- Tommy Haas
- Elvis Presley
Question 8: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using raw word counts?
- Stephen Dow Beckham
- Louis Molloy
- Adrienne Corri
- Mary Fitzgerald (artist)
Question 9: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using TF-IDF?
- Mel B
- Caroline Rush
- David Beckham
- Carrie Reichardt
Week 5: Machine Learning Foundations: A Case Study Approach Quiz Answer
Quiz 1: Recommender Systems
Question 1: Recommending items based on global popularity can (check all that apply):
- provide personalization
- capture context (e.g., time of day)
- none of the above
Question 2: Recommending items using a classification approach can (check all that apply):
- provide personalization
- capture context (e.g., time of day)
- none of the above
Question 3: Recommending items using a simple count-based co-occurrence matrix can (check all that apply):
- provide personalization
- capture context (e.g., time of day)
- none of the above
Question 4: Recommending items using a featurized matrix factorization approach can (check all that apply):
- provide personalization
- capture context (e.g., time of day)
- none of the above
Question 5: Normalizing co-occurrence matrices is used primarily to account for:
- people who purchased many items
- items purchased by many people
- eliminating rare products
- none of the above
Question 6: A store has 3 customers and 3 products. Below are the learned feature vectors for each user and product. Based on this estimated model, which product would you recommend most highly to User #2?
User ID | Feature vector |
1 | (1.73, 0.01, 5.22) |
2 | (0.03, 4.41, 2.05) |
3 | (1.13, 0.89, 3.76) |
Product ID | Feature vector |
1 | (3.29, 3.44, 3.67) |
2 | (0.82, 9.71, 3.88) |
3 | (8.34, 1.72, 0.02) |
- Product #1
- Product #2
- Product #3
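In a matrix factorization model, a user's predicted affinity for a product is the dot product of their feature vectors; scoring User #2 against each product from the tables above:

```python
user2 = (0.03, 4.41, 2.05)
products = {
    1: (3.29, 3.44, 3.67),
    2: (0.82, 9.71, 3.88),
    3: (8.34, 1.72, 0.02),
}

def dot(u, v):
    # Predicted score: inner product of user and product feature vectors.
    return sum(a * b for a, b in zip(u, v))

scores = {pid: dot(user2, vec) for pid, vec in products.items()}
best = max(scores, key=scores.get)
print(best)  # Product 2 has the highest predicted score (~50.8)
```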
Question 7: For the liked and recommended items displayed below, calculate the recall and round to 2 decimal places. (As in the lesson, green squares indicate recommended items, and magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)
Answer: 0.33
Question 8: For the liked and recommended items displayed below, calculate the precision and round to 2 decimal places. (As in the lesson, green squares indicate recommended items, and magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)
Answer: 0.25
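Recall is the fraction of liked items that were recommended; precision is the fraction of recommended items that were liked. The answers above are consistent with, for example, 1 liked-and-recommended item out of 3 liked and 4 recommended (the exact counts come from the figure, which is not reproduced here):

```python
# Hypothetical counts consistent with the stated answers;
# the actual counts are read off the (omitted) figure.
liked_and_recommended = 1
n_liked = 3
n_recommended = 4

recall = liked_and_recommended / n_liked          # fraction of liked items recommended
precision = liked_and_recommended / n_recommended  # fraction of recommendations liked
print(round(recall, 2), round(precision, 2))  # 0.33 0.25
```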
Question 9: Based on the precision-recall curves in the figure below, which recommender would you use?
- RecSys #1
- RecSys #2
- RecSys #3
Quiz 2: Recommending songs
Question 1: Which of the artists below have had the most unique users listening to their songs?
- Kanye West
- Foo Fighters
- Taylor Swift
- Lady GaGa
Question 2: Which of the artists below is the most popular artist, the one with the highest total listen_count, in the data set?
- Taylor Swift
- Kings of Leon
- Coldplay
- Lady GaGa
Question 3: Which of the artists below is the least popular artist, the one with the smallest total listen_count, in the data set?
- William Tabbert
- Velvet Underground & Nico
- Kanye West
- The Cool Kids
Week 6: Machine Learning Foundations: A Case Study Approach Quiz Answer
Quiz 1: Deep Learning
Question 1: Which of the following statements are true? (Check all that apply)
- Linear classifiers are never useful, because they cannot represent XOR.
- Linear classifiers are useful, because, with enough data, they can represent anything.
- Having good non-linear features can allow us to learn very accurate linear classifiers.
- none of the above
Question 2: A simple linear classifier can represent which of the following functions? (Check all that apply)
- x1 OR x2 OR NOT x3
- x1 AND x2 AND NOT x3
- x1 OR (x2 AND NOT x3)
- none of the above
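A linear classifier over binary inputs predicts with sign(w·x + b); conjunctions and disjunctions (with negations) are linearly separable, while XOR-like combinations are not. A sketch verifying one hand-picked weight choice (an assumption for illustration) for x1 AND x2 AND NOT x3:

```python
from itertools import product

def linear_classifier(x, w, b):
    # Predict 1 when the score w.x + b is non-negative.
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b >= 0)

# Hand-picked weights for "x1 AND x2 AND NOT x3":
# score = x1 + x2 - x3 - 2 is >= 0 only when x1=1, x2=1, x3=0.
w, b = (1, 1, -1), -2

# Check the classifier against the truth table on all 8 binary inputs.
for x1, x2, x3 in product([0, 1], repeat=3):
    target = int(x1 == 1 and x2 == 1 and x3 == 0)
    assert linear_classifier((x1, x2, x3), w, b) == target
print("x1 AND x2 AND NOT x3 is linearly separable")
```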
Question 3: Which of the following neural networks can represent the following function? Select all that apply.
(x1 AND x2) OR (NOT x1 AND NOT x2)
Answer:

Question 4: Which of the following statements is true? (Check all that apply)
- Features in computer vision act like local detectors.
- Deep learning has had an impact in computer vision because it’s used to combine all the different hand-created features that already exist.
- By learning non-linear features, neural networks have allowed us to automatically learn detectors for computer vision.
- none of the above
Question 5: If you have lots of images of different types of plankton labeled with their species name and lots of computational resources, which would you expect to make better predictions:
- a deep neural network trained on this data.
- a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.
Question 6: If you have only a few images of different types of plankton labeled with their species name, which would you expect to make better predictions:
- a deep neural network trained on this data.
- a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.
Quiz 2: Deep features for image retrieval
Question 1: What’s the least common category in the training data?
- bird
- dog
- cat
- automobile
Question 2: Of the images below, which is the nearest ‘cat’ labeled image in the training data to the first image in the test data (image_test[0:1])?
Answer:

Question 3: Of the images below, which is the nearest ‘dog’ labeled image in the training data to the first image in the test data (image_test[0:1])?
Answer:

Question 4: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘cat’ in the training data?
- 33 to 35
- 35 to 37
- 37 to 39
- 39 to 41
- Above 41
Question 5: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘dog’ in the training data?
- 33 to 35
- 35 to 37
- 37 to 39
- 39 to 41
- Above 41
Question 6: On average, is the first image in the test data closer to its 5 nearest neighbors in the ‘cat’ data or in the ‘dog’ data?
- cat
- dog
Question 7: In what range is the accuracy of the 1-nearest neighbor classifier at classifying ‘dog’ images from the test set?
- 50 to 60
- 60 to 70
- 70 to 80
- 80 to 90
- 90 to 100
Machine Learning Foundations: A Case Study Approach Course Review
In our experience, Machine Learning Foundations: A Case Study Approach is worth enrolling in: you can take it for free and pick up new skills from professionals.
If you get stuck anywhere on a quiz or a graded assessment, just visit Networking Funda for the Machine Learning Foundations: A Case Study Approach Quiz Answers.
Get All Course Quiz Answers of Machine Learning Specialization
Machine Learning Foundations: A Case Study Approach Quiz Answer
Machine Learning: Regression Coursera Quiz Answers
Machine Learning: Classification Coursera Quiz Answers
Machine Learning: Clustering & Retrieval Quiz Answers