### Get All Weeks Unsupervised Learning, Recommenders, Reinforcement Learning Quiz Answers

In the third course of the Machine Learning Specialization, you will:

- Use unsupervised learning techniques, including clustering and anomaly detection.
- Build recommender systems with a collaborative filtering approach and a content-based deep learning method.
- Build a deep reinforcement learning model.

The Machine Learning Specialization is a foundational online program created in collaboration between DeepLearning.AI and Stanford Online. In this beginner-friendly program, you will learn the fundamentals of machine learning and how to use these techniques to build real-world AI applications.

### Week 01: Unsupervised Learning, Recommenders, Reinforcement Learning Quiz Answers

#### Clustering Quiz Answers

Q1. Which of these best describes unsupervised learning?

- A form of machine learning that finds patterns using labeled data (x, y)
- A form of machine learning that finds patterns without using a cost function.
- A form of machine learning that finds patterns using unlabeled data (x).
- A form of machine learning that finds patterns in data using only labels (y) but without any inputs (x) .

Q2. Which of these statements are true about K-means? Check all that apply.

- If you are running K-means with $K=3$ clusters, then each $c^{(i)}$ should be 1, 2, or 3.
- The number of cluster assignment variables $c^{(i)}$ is equal to the number of training examples.
- The number of cluster centroids $\mu_k$ is equal to the number of examples.
- If each example x is a vector of 5 numbers, then each cluster centroid $\mu_k$ is also going to be a vector of 5 numbers.

Q3. You run K-means 100 times with different initializations. How should you pick from the 100 resulting solutions?

- Pick randomly — that was the point of random initialization.
- Pick the last one (i.e., the 100th random initialization) because K-means always improves over time
- Pick the one with the lowest cost $J$
- Average all 100 solutions together.
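As a rough illustration of running K-means from several random initializations and comparing the resulting costs, here is a minimal NumPy sketch. The `kmeans` helper, the synthetic data, and the fixed iteration count are assumptions made for this example, not the course's lab code. Labels here are zero-based (0, 1, 2), whereas the quiz numbers clusters from 1.

```python
import numpy as np

def kmeans(X, K, rng, n_iters=20):
    """Run K-means once from a random initialization.

    Returns (centroids, labels, cost), where cost is the average squared
    distance between each example and its assigned centroid.
    """
    # Initialize centroids at K distinct training examples (fancy indexing copies).
    centroids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Assignment step: nearest centroid for every example.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned examples.
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    # Final assignment and cost for the returned centroids.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    cost = dists[np.arange(len(X)), labels].mean()
    return centroids, labels, cost

# Synthetic 2-D data around three made-up centers (illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 2))
               for c in ([0, 0], [3, 3], [0, 3])])

# 100 runs with different random initializations; compare them by cost J.
runs = [kmeans(X, K=3, rng=rng) for _ in range(100)]
best_centroids, best_labels, best_cost = min(runs, key=lambda run: run[2])
```

Selecting with `min(..., key=...)` compares the runs only by their final cost, so the order in which the runs happened plays no role.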

Q4. You run K-means and compute the value of the cost function $J(c^{(1)}, \dots, c^{(m)}, \mu_1, \dots, \mu_K)$ after each iteration. Which of these statements should be true?

- The cost can be greater or smaller than the cost in the previous iteration, but it decreases in the long run.
- The cost will either decrease or stay the same after each iteration.
- There is no cost function for the K-means algorithm.
- Because K-means tries to maximize cost, the cost is always greater than or equal to the cost in the previous iteration.
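For reference, the K-means cost (often called the distortion) can be written as below. This is the standard form, with $m$ training examples $x^{(i)}$. The assignment step minimizes $J$ over the $c^{(i)}$ with the centroids held fixed, and the update step minimizes $J$ over the $\mu_k$ with the assignments held fixed, so neither step can increase $J$.

```latex
J\left(c^{(1)}, \dots, c^{(m)}, \mu_1, \dots, \mu_K\right)
  = \frac{1}{m} \sum_{i=1}^{m} \left\lVert x^{(i)} - \mu_{c^{(i)}} \right\rVert^2
```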

Q5. In K-means, the elbow method is a method to

- Choose the maximum number of examples for each cluster
- Choose the number of clusters K
- Choose the best random initialization
- Choose the best number of samples in the dataset

#### Anomaly detection Quiz Answers

Q1. You are building a system to detect if computers in a data center are malfunctioning. You have 10,000 data points of computers functioning well, and no data from computers malfunctioning. What type of algorithm should you use?

- Anomaly detection
- Supervised learning

Q2. You are building a system to detect if computers in a data center are malfunctioning. You have 10,000 data points of computers functioning well, and 10,000 data points of computers malfunctioning. What type of algorithm should you use?

- Anomaly detection
- Supervised learning

Q3. Say you have 5,000 examples of normal airplane engines, and 15 examples of anomalous engines. How would you use the 15 examples of anomalous engines to evaluate your anomaly detection algorithm?

- Use it during training by fitting one Gaussian model to the normal engines, and a different Gaussian model to the anomalous engines.
- Put the data of anomalous engines (together with some normal engines) in the cross-validation and/or test sets to measure if the learned model can correctly detect anomalous engines.
- You cannot evaluate an anomaly detection algorithm because it is an unsupervised learning algorithm.
- Because you have data of both normal and anomalous engines, don’t use anomaly detection. Use supervised learning instead.

Q4. Anomaly detection flags a new input $x$ as an anomaly if $p(x) < \epsilon$. If we reduce the value of $\epsilon$, what happens?

- The algorithm is more likely to classify new examples as an anomaly.
- The algorithm is less likely to classify new examples as an anomaly.
- The algorithm is more likely to classify some examples as an anomaly, and less likely to classify some examples as an anomaly. It depends on the example $x$.
- The algorithm will automatically choose parameters $\mu$ and $\sigma$ to decrease $p(x)$ and compensate.
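To make the role of $\epsilon$ concrete, the sketch below fits one Gaussian per feature, multiplies the per-feature densities to get $p(x)$, and counts how many new points fall below two different thresholds. The data, the feature means, and both threshold values are made up for illustration.

```python
import numpy as np

def fit_gaussians(X):
    """Estimate a mean and variance for each feature (column) of X."""
    mu = X.mean(axis=0)
    var = X.var(axis=0)  # maximum-likelihood estimate (divides by m)
    return mu, var

def p(X, mu, var):
    """Density of each row of X under independent per-feature Gaussians."""
    dens = np.exp(-((X - mu) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)
    return dens.prod(axis=1)

rng = np.random.default_rng(0)
# Synthetic "normal" training data with two features (hypothetical values).
X_train = rng.normal(loc=[5.0, 10.0], scale=[0.5, 2.0], size=(1000, 2))
mu, var = fit_gaussians(X_train)

# New points drawn with a wider spread, so some of them land in the tails.
X_new = rng.normal(loc=[5.0, 10.0], scale=[1.0, 4.0], size=(200, 2))
probs = p(X_new, mu, var)

flagged_with_larger_eps = int((probs < 1e-2).sum())
flagged_with_smaller_eps = int((probs < 1e-4).sum())
```

Every point with $p(x)$ below the smaller threshold is also below the larger one, so shrinking $\epsilon$ can only keep the number of flagged points the same or reduce it.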

Q5. You are monitoring the temperature and vibration intensity on newly manufactured aircraft engines. You have measured 100 engines and fit the Gaussian model described in the video lectures to the data. The 100 examples and the resulting distributions are shown in the figure below.

- 0.0738 * 0.02288 = 0.00169
- 0.0738 + 0.02288 = 0.0966
- 17.5 + 48 = 65.5
- 17.5 * 48 = 840
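For context, the Gaussian model from the lectures treats the features as independent, so the density of a two-feature example is the product of the per-feature densities, as written below. The figure is not reproduced here, so which values to read off it is not shown. The arithmetic in the first option does check out: $0.0738 \times 0.02288 \approx 0.00169$.

```latex
p(x) = p\left(x_1; \mu_1, \sigma_1^2\right) \cdot p\left(x_2; \mu_2, \sigma_2^2\right),
\qquad
p\left(x_j; \mu_j, \sigma_j^2\right)
  = \frac{1}{\sqrt{2\pi}\,\sigma_j}
    \exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)
```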

### Week 02: Unsupervised Learning, Recommenders, Reinforcement Learning Quiz Answers

#### Collaborative Filtering Quiz Answers

Q1. You have the following table of movie ratings:

| Movie | Elissa | Zach | Barry | Terry |
| --- | --- | --- | --- | --- |
| Football Forever | 5 | 4 | 3 | ? |
| Pies, Pies, Pies | 1 | ? | 5 | 4 |
| Linear Algebra Live | 4 | 5 | ? | 1 |

Refer to the table above for questions 1 and 2. Assume numbering starts at 1 for this quiz, so the rating for Football Forever by Elissa is at (1,1).

What is the value of $n_u$?

Q2. What is the value of $r(2,2)$?
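One way to check quantities like these is to encode the table as a matrix. The sketch below assumes the lecture convention that rows index movies, columns index users, and $r(i,j) = 1$ when user $j$ has rated movie $i$. The helper `r` is a hypothetical name introduced only for this example.

```python
import numpy as np

# Rows are movies, columns are users (Elissa, Zach, Barry, Terry).
# NaN stands for the "?" entries in the table.
Y = np.array([
    [5, 4, 3, np.nan],        # Football Forever
    [1, np.nan, 5, 4],        # Pies, Pies, Pies
    [4, 5, np.nan, 1],        # Linear Algebra Live
])

# Indicator matrix: 1 where a rating exists, 0 where it is missing.
R = (~np.isnan(Y)).astype(int)

n_m, n_u = Y.shape  # number of movies, number of users

def r(i, j):
    """Look up r(i, j) with 1-based indices, matching the quiz's numbering."""
    return int(R[i - 1, j - 1])

print("n_u =", n_u)
print("r(2,2) =", r(2, 2))
```

With this encoding, position (2,2) is Zach's entry for "Pies, Pies, Pies", which is a "?" in the table.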


Q3. In which of the following situations will a collaborative filtering system be the most appropriate learning algorithm (compared to linear or logistic regression)?

- You run an online bookstore and collect the ratings of many users. You want to use this to identify what books are “similar” to each other (i.e., if a user likes a certain book, what are other books that they might also like?)
- You’re an artist and hand-paint portraits for your clients. Each client gets a different portrait (of themselves) and gives you 1-5 star rating feedback, and each client purchases at most 1 portrait. You’d like to predict what rating your next customer will give you.
- You manage an online bookstore and you have the book ratings from many users. You want to learn to predict the expected sales volume (number of books sold) as a function of the average rating of a book.
- You subscribe to an online video streaming service, and are not satisfied with their movie suggestions. You download all your viewing for the last 10 years and rate each item. You assign each item a genre. Using your ratings and genre assignment, you learn to predict how you will rate new movies based on the genre.

Q4. For recommender systems with binary labels $y$, which of these are reasonable ways for defining when $y$ should be 1 for a given user $j$ and item $i$? (Check all that apply.)

- $y$ is 1 if user $j$ purchases item $i$ (after being shown the item)
- $y$ is 1 if user $j$ has been shown item $i$ by the recommendation engine
- $y$ is 1 if user $j$ fav/likes/clicks on item $i$ (after being shown the item)
- $y$ is 1 if user $j$ has not yet been shown item $i$ by the recommendation engine

#### Recommender systems implementation Quiz Answers

Q1. The lecture described using ‘mean normalization’ to do feature scaling of the ratings. Which equation below best describes this algorithm?
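The answer options for this question are not reproduced here. As general background, mean normalization of ratings subtracts each movie's average rating $\mu_i$, computed only over the users who rated it, from that movie's ratings: $y_{norm}(i,j) = y(i,j) - \mu_i$. A minimal sketch follows, reusing the Week 2 ratings table purely as example data.

```python
import numpy as np

# Rows are movies, columns are users; NaN marks a missing rating.
Y = np.array([
    [5, 4, 3, np.nan],
    [1, np.nan, 5, 4],
    [4, 5, np.nan, 1],
])
R = ~np.isnan(Y)  # True where a rating exists

# Per-movie mean over rated entries only (shape: n_movies x 1).
mu = np.nanmean(Y, axis=1, keepdims=True)

# Subtract the mean where a rating exists; leave unrated entries at 0.
Y_norm = np.where(R, Y - mu, 0.0)
```

After training on `Y_norm`, a prediction for movie $i$ would add $\mu_i$ back, so a user with no ratings is predicted roughly the movie's average rather than 0.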


Q2. The implementation of collaborative filtering utilized a custom training loop in TensorFlow. Is it true that TensorFlow always requires a custom training loop?

- Yes. TensorFlow gains flexibility by providing the user primitive operations they can combine in many ways.
- No: TensorFlow provides simplified training operations for some applications.

Q3. Once a model is trained, the ‘distance’ between feature vectors gives an indication of how similar items are.

The squared distance between the two vectors $\mathbf{x}^{(k)}$ and $\mathbf{x}^{(i)}$ is: $\lVert \mathbf{x}^{(k)} - \mathbf{x}^{(i)} \rVert^2 = \sum_{l} \left(x_l^{(k)} - x_l^{(i)}\right)^2$

Using the table below, find the closest item to the movie “Pies, Pies, Pies”.

| Movie | User 1 | … | User n | $x_0$ | $x_1$ | $x_2$ |
| --- | --- | --- | --- | --- | --- | --- |
| Pastries for Supper | | | | 2.0 | 2.0 | 1.0 |
| Pies, Pies, Pies | | | | 2.0 | 3.0 | 4.0 |
| Pies and You | | | | 5.0 | 3.0 | 4.0 |

- Pies and You
- Pastries for Supper
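The comparison can be carried out directly. This sketch assumes the three numbers listed for each movie are its learned features $x_0, x_1, x_2$ (the per-user rating columns are empty in the table), and the function name `squared_distance` is just illustrative.

```python
import numpy as np

# Feature vectors (x_0, x_1, x_2) read from the table.
features = {
    "Pastries for Supper": np.array([2.0, 2.0, 1.0]),
    "Pies, Pies, Pies": np.array([2.0, 3.0, 4.0]),
    "Pies and You": np.array([5.0, 3.0, 4.0]),
}

def squared_distance(a, b):
    """Sum of squared element-wise differences between two vectors."""
    return float(np.sum((a - b) ** 2))

target = "Pies, Pies, Pies"
distances = {
    name: squared_distance(vec, features[target])
    for name, vec in features.items()
    if name != target
}
closest = min(distances, key=distances.get)

print(distances)  # {'Pastries for Supper': 10.0, 'Pies and You': 9.0}
print(closest)    # Pies and You
```

"Pastries for Supper" differs by 1 in $x_1$ and 3 in $x_2$ (squared distance 10), while "Pies and You" differs only by 3 in $x_0$ (squared distance 9).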

Q4. Which of these is an example of the cold start problem? (Check all that apply.)

- A recommendation system is unable to give accurate rating predictions for a new user that has rated few products.
- A recommendation system takes so long to train that users get bored and leave.
- A recommendation system is so computationally expensive that it causes your computer CPU to heat up, causing your computer to need to be cooled down and restarted.
- A recommendation system is unable to give accurate rating predictions for a new product that no users have rated.

#### Content-based filtering Quiz Answers

Q1. Vector $x_u$ and vector $x_m$ must be of the same dimension, where $x_u$ is the input features vector for a user (age, gender, etc.) and $x_m$ is the input features vector for a movie (year, genre, etc.). True or false?

- True
- False

Q2. If we find that two movies, $i$ and $k$, have vectors $v_m^{(i)}$ and $v_m^{(k)}$ that are similar to each other (i.e., $\lVert v_m^{(i)} - v_m^{(k)} \rVert$ is small), then which of the following is likely to be true? Pick the best answer.

- A user that has watched one of these two movies has probably watched the other as well.
- The two movies are similar to each other and will be liked by similar users.
- The two movies are very dissimilar.
- We should recommend to users one of these two movies, but not both.

Q3. Which of the following neural network configurations are valid for a content based filtering application? Please note carefully the dimensions of the neural network indicated in the diagram. Check all the options that apply:

- The user and item networks have 64 dimensional $v_u$ and $v_m$ vectors, respectively
- Both the user and the item networks have the same architecture
- The user and the item networks have different architectures
- The user vector $v_u$ is 32 dimensional, and the item vector $v_m$ is 64 dimensional
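The constraint behind this question is that the prediction is a dot product $v_u \cdot v_m$, so the two output vectors must have the same length, while the input sizes and hidden layers of the two networks are free to differ. The forward-pass sketch below uses untrained random weights, and all layer sizes are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def make_network(layer_sizes, rng):
    """Create random (W, b) pairs for a dense network with the given layer sizes."""
    return [
        (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(x, layers):
    """ReLU hidden layers followed by a linear output layer."""
    for W, b in layers[:-1]:
        x = relu(x @ W + b)
    W, b = layers[-1]
    return x @ W + b

# Different input sizes and different architectures, same output size (32).
user_net = make_network([10, 128, 32], rng)       # x_u has 10 features
item_net = make_network([20, 256, 64, 32], rng)   # x_m has 20 features

x_u = rng.normal(size=10)
x_m = rng.normal(size=20)

v_u = forward(x_u, user_net)
v_m = forward(x_m, item_net)

# The dot product is only defined because both vectors have length 32.
prediction = float(v_u @ v_m)
```

If one network ended in 32 units and the other in 64, the final `v_u @ v_m` would raise a shape error, which mirrors why that configuration is not valid.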

Q4. You have built a recommendation system to retrieve musical pieces from a large database of music, and have an algorithm that uses separate retrieval and ranking steps. If you modify the algorithm to add more musical pieces to the retrieved list (i.e., the retrieval step returns more items), which of these are likely to happen? Check all that apply.

- The system’s response time might increase (i.e., users have to wait longer to get recommendations)
- The quality of recommendations made to users should stay the same or improve.
- The quality of recommendations made to users should stay the same or worsen.
- The system’s response time might decrease (i.e., users get recommendations more quickly)

Q5. To speed up the response time of your recommendation system, you can pre-compute the vectors $v_m$ for all the items you might recommend. This can be done even before a user logs in to your website and even before you know the $x_u$ or $v_u$ vector. True/False?

- True
- False

### Week 03: Unsupervised Learning, Recommenders, Reinforcement Learning Quiz Answers

#### Reinforcement learning introduction Quiz Answers

Q1. You are using reinforcement learning to control a four legged robot. The position of the robot would be its _____.

- state
- action
- return
- reward

Q2. You are controlling a Mars rover. You will be very very happy if it gets to state 1 (significant scientific discovery), slightly happy if it gets to state 2 (small scientific discovery), and unhappy if it gets to state 3 (rover is permanently damaged). To reflect this, choose a reward function so that:

- R(1) > R(2) > R(3), where R(1), R(2) and R(3) are negative.
- R(1) > R(2) > R(3), where R(1), R(2) and R(3) are positive.
- R(1) < R(2) < R(3), where R(1) and R(2) are negative and R(3) is positive.
- R(1) > R(2) > R(3), where R(1) and R(2) are positive and R(3) is negative.

Q3. You are using reinforcement learning to fly a helicopter. Using a discount factor of 0.75, your helicopter starts in some state and receives rewards -100 on the first step, -100 on the second step, and 1000 on the third and final step (where it has reached a terminal state). What is the return?

- $-100 - 0.75 \cdot 100 + 0.75^2 \cdot 1000$
- $-100 - 0.25 \cdot 100 + 0.25^2 \cdot 1000$
- $-0.75 \cdot 100 - 0.75^2 \cdot 100 + 0.75^3 \cdot 1000$
- $-0.25 \cdot 100 - 0.25^2 \cdot 100 + 0.25^3 \cdot 1000$
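The return is the sum of rewards weighted by increasing powers of the discount factor, with the first reward undiscounted: $R_1 + \gamma R_2 + \gamma^2 R_3 + \dots$ A short sketch of that computation for the helicopter example follows (the function name is illustrative).

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, where the reward at step t (from 0) is weighted by gamma**t."""
    return sum((gamma ** t) * reward for t, reward in enumerate(rewards))

# Rewards from the question: -100, -100, then 1000 at the terminal state.
helicopter_return = discounted_return([-100, -100, 1000], gamma=0.75)
print(helicopter_return)  # 387.5
```

Term by term this is $-100 - 75 + 562.5 = 387.5$.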

Q4. Given the rewards and actions below, compute the return from state 3 with a discount factor of $\gamma = 0.25$.

- 0.39
- 25
- 6.25
- 0

#### State-action value function Quiz Answers

Q1. Which of the following accurately describes the state-action value function $Q(s,a)$?

- It is the return if you start from state $s$, take action $a$ (once), then behave optimally after that.
- It is the return if you start from state $s$ and repeatedly take action $a$.
- It is the return if you start from state $s$ and behave optimally.
- It is the immediate reward if you start from state $s$ and take action $a$ (once).

Q2. You are controlling a robot that has 3 actions: ← (left), → (right) and STOP. From a given state $s$, you have computed Q(s, ←) = -10, Q(s, →) = -20, Q(s, STOP) = 0.

What is the optimal action to take in state $s$?

- STOP
- ← (left)
- → (right)
- Impossible to tell
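Given $Q(s,a)$ for every action available in a state, the optimal action is the one with the largest value. A tiny sketch using the numbers from the question (the dictionary keys are illustrative labels):

```python
# Q(s, a) for the three actions in state s, as given in the question.
q_values = {"left": -10, "right": -20, "STOP": 0}

# The optimal action maximizes Q(s, a).
best_action = max(q_values, key=q_values.get)
print(best_action)  # STOP
```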

Q3. For this problem, $\gamma = 0.25$. The diagram below shows the return and the optimal action from each state. Please compute Q(5, ←).

- 0.625
- 0.391
- 1.25
- 2.5

#### Continuous state spaces Quiz Answers

Q1. The Lunar Lander is a continuous state Markov Decision Process (MDP) because:

- The state-action value $Q(s,a)$ function outputs continuous valued numbers
- The reward contains numbers that are continuous valued
- The state contains numbers such as position and velocity that are continuous valued.
- The state has multiple numbers rather than only a single number (such as position in the $x$-direction)

Q2. In the learning algorithm described in the videos, we repeatedly create an artificial training set to which we apply supervised learning where the input $x = (s,a)$ and the target, constructed using Bellman’s equations, is $y$ = _____?

- $y = \max\limits_{a'} Q(s',a')$ where $s'$ is the state you get to after taking action $a$ in state $s$
- $y = R(s')$ where $s'$ is the state you get to after taking action $a$ in state $s$
- $y = R(s)$
- $y = R(s) + \gamma \max\limits_{a'} Q(s',a')$ where $s'$ is the state you get to after taking action $a$ in state $s$
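As a sketch of how such a training set might be assembled, the snippet below computes Bellman-style targets $R(s) + \gamma \max_{a'} Q(s',a')$ for a small batch of made-up transitions. The function name, the `done` mask that drops the future term at terminal states, and all numbers are assumptions for illustration. In practice the next-state Q-values would come from the current Q-network.

```python
import numpy as np

def bellman_targets(rewards, next_q_values, done, gamma):
    """Targets y = R(s) + gamma * max_a' Q(s', a'), with y = R(s) when done == 1.

    rewards:        shape (batch,)            reward R(s) for each transition
    next_q_values:  shape (batch, n_actions)  current estimates of Q(s', a')
    done:           shape (batch,)            1.0 if the episode ended, else 0.0
    """
    max_next_q = next_q_values.max(axis=1)
    return rewards + gamma * max_next_q * (1.0 - done)

# Three hypothetical transitions with two possible actions each.
rewards = np.array([0.0, 1.0, 100.0])
next_q = np.array([
    [1.0, 2.0],
    [0.5, 0.25],
    [3.0, 4.0],
])
done = np.array([0.0, 0.0, 1.0])  # the last transition ends the episode

y = bellman_targets(rewards, next_q, done, gamma=0.5)
print(y)  # [  1.     1.25 100.  ]
```

Each pair of input $(s,a)$ and target $y$ then serves as one supervised example for fitting $Q$.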

Q3. You have reached the final practice quiz of this class! What does that mean? (Please check all the answers, because all of them are correct!)

- Andrew sends his heartfelt congratulations to you!
- The DeepLearning.AI and Stanford Online teams would like to give you a round of applause!
- You deserve to celebrate!
- What an accomplishment — you made it!

##### Conclusion:

I hope these Unsupervised Learning, Recommenders, Reinforcement Learning Coursera quiz answers help you learn something new from this course. If they helped, don’t forget to bookmark our site for more quiz answers.

This course is intended for audiences of all experiences who are interested in learning about new skills in a business context; there are no prerequisite courses.

Keep Learning!

#### Get All Course Quiz Answers of Machine Learning Specialization

Supervised Machine Learning: Regression and Classification Quiz Answers

Advanced Learning Algorithms Coursera Quiz Answers

Unsupervised Learning, Recommenders, Reinforcement Learning Quiz Answers