Advanced Learning Algorithms Coursera Quiz Answers

Advanced Learning Algorithms Week 01 Quiz Answers

Quiz 1: Neural Networks Intuition Quiz Answers

Q1. Which of these are terms used to refer to components of an artificial neural network?

layers

activation function

neurons

Q2. True/False? Neural networks take inspiration from, but do not very accurately mimic, how neurons in a biological brain learn.

True

Quiz 2: Neural Network Model Quiz Answers

Q1. For a neural network, here is the formula for calculating the activation of the third neuron in layer 2, given the activation vector from layer 1: a^{[2]}_{3} = g( \vec{w}^{[2]}_{3} \cdot \vec{a}^{[1]} + b^{[2]}_{3} ). Which of the following are correct statements?

1. The activation of layer 2 is determined using the activations from the previous layer.
2. Unit 3 (neuron 3) outputs a single number (a scalar).

Q2. For the binary classification for handwriting recognition, discussed in the lecture, which of the following statements is correct?

There is a single unit (neuron) in the output layer.

The output of the model can be interpreted as the probability that the handwritten image is of the number one “1”.

After choosing a threshold, you can convert the neural network’s output into a category of 0 or 1.
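
The thresholding step in the last statement can be sketched in a few lines of Python. The function name `to_category`, the 0.5 threshold, and the "greater than or equal" convention are assumptions made for this illustration, not something the quiz specifies.

```python
def to_category(probability, threshold=0.5):
    """Map a sigmoid output in [0, 1] to a class label of 0 or 1."""
    # Assumed convention: outputs at or above the threshold count as class 1.
    return 1 if probability >= threshold else 0

# Made-up network outputs for three images.
predictions = [to_category(p) for p in (0.03, 0.5, 0.92)]
```

A different threshold can be passed in when one kind of mistake is more costly than the other.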

Q3. For a neural network, what is the expression for calculating the activation of the third neuron in layer 2? Note, this is different from the question that you saw in the lecture video.

a^{[2]}_{3} = g( \vec{w}^{[2]}_{3} \cdot \vec{a}^{[1]} + b^{[2]}_{3} )
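
As a rough illustration of this formula, the sketch below computes one neuron's activation in plain Python, assuming g is the sigmoid function. The weights, bias, and layer 1 activations are made-up values, and the helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_activation(w, a_prev, b):
    """Return g(w . a_prev + b) for a single neuron (a scalar)."""
    z = sum(w_i * a_i for w_i, a_i in zip(w, a_prev)) + b
    return sigmoid(z)

# Hypothetical values: activations from layer 1 and parameters of unit 3 in layer 2.
a1 = [0.5, 0.1, 0.9]
w2_3 = [1.0, -2.0, 0.5]
b2_3 = -0.75

a2_3 = neuron_activation(w2_3, a1, b2_3)
```

With these particular numbers z works out to roughly 0, so the activation is close to 0.5. Whatever the inputs, the result is a single scalar.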

Q4. For the handwriting recognition task discussed in the lecture, what is the output a^{[3]}_1?

A number between 0 and 1: the estimated probability that the input image is of the number one “1”.

Quiz 3: TensorFlow Implementation Quiz Answers

Q1. For the following code:

model = Sequential([
    Dense(units=25, activation="sigmoid"),
    Dense(units=15, activation="sigmoid"),
    Dense(units=10, activation="sigmoid"),
    Dense(units=1, activation="sigmoid")])

This code will define a neural network with how many layers?

4

Q2. How do you define the second layer of a neural network that has 4 neurons and a sigmoid activation?

Dense(units=4, activation='sigmoid')

Q3. If the input features are temperature (in Celsius) and duration (in minutes), how do you write the code for the first feature vector x shown above?

x = np.array([[200.0, 17.0]])

Quiz 4: Neural network implementation in Python Quiz Answers

Q1. According to the lecture, how do you calculate the activation of the third neuron in the first layer using NumPy?

z1_3 = np.dot(w1_3, x) + b1_3
a1_3 = sigmoid(z1_3)

Q2. According to the lecture, when coding up the numpy array W, where would you place the w parameters for each neuron?

In the columns of W.

Q3. For the code above in the “dense” function that defines a single layer of neurons, how many times does the code go through the “for loop”? Note that W has 2 rows and 3 columns.

3 times

For each neuron in the layer, there is one column in the NumPy array W, and the number of rows in W equals the number of input features fed into that layer. The for loop computes the activation of one neuron per iteration, so with 3 columns it runs 3 times.
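
The loop can be sketched as follows. This is written in the spirit of the "dense" function the question refers to, not copied from the lecture; the sigmoid activation and all parameter values are assumptions chosen so that W has 2 rows and 3 columns.

```python
import numpy as np

def sigmoid(z):
    """Logistic function applied to a scalar or array."""
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_in, W, b):
    """Compute one layer's activations, one neuron (column of W) at a time."""
    units = W.shape[1]            # number of neurons = number of columns of W
    a_out = np.zeros(units)
    for j in range(units):        # runs once per neuron
        w = W[:, j]               # parameters of neuron j
        z = np.dot(w, a_in) + b[j]
        a_out[j] = sigmoid(z)
    return a_out

# Hypothetical parameters: 2 input features (rows), 3 neurons (columns).
W = np.array([[1.0, -3.0, 5.0],
              [2.0, 4.0, -6.0]])
b = np.array([-1.0, 1.0, 2.0])
x = np.array([-2.0, 4.0])

a_1 = dense(x, W, b)              # one activation per neuron
```

Because `units` is read from `W.shape[1]`, the loop body executes three times here and `a_1` has three entries.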

Advanced Learning Algorithms Week 02 Quiz Answers

Quiz 1: Neural Network Training Quiz Answers

Q1. Here is some code that you saw in the lecture:

```
model.compile(loss=BinaryCrossentropy())
```

For which type of task would you use the binary cross entropy loss function?

binary classification (classification with exactly 2 classes)

Q2. Here is code that you saw in the lecture:

model.fit(X,y,epochs=100)

Which line of code updates the network parameters in order to reduce the cost?

model.fit(X,y,epochs=100)

Quiz 2: Activation Functions Quiz Answers

Q1. Which of the following activation functions is the most common choice for the hidden layers of a neural network?

ReLU (rectified linear unit)

Q2. For the task of predicting housing prices, which activation functions could you choose for the output layer? Choose the 2 options that apply.

linear

ReLU

Q3. True/False? A neural network with many layers but no activation function (in the hidden layers) is not effective; that’s why we should instead use the linear activation function in every hidden layer.

False
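
One way to see why the statement is false: stacking layers that use a linear activation still gives a linear function of the input, so the extra layers add nothing. The sketch below checks this for a tiny network with one unit per layer; the weights and biases are arbitrary values picked for the example.

```python
def linear_layer(w, b):
    """Return a one-unit layer with a linear activation: a = w * x + b."""
    return lambda x: w * x + b

layer1 = linear_layer(2.0, 1.0)    # a1 = 2 * x + 1
layer2 = linear_layer(-3.0, 4.0)   # a2 = -3 * a1 + 4

def network(x):
    """Two stacked linear layers."""
    return layer2(layer1(x))

def collapsed(x):
    """Single linear function with w = (-3)(2) = -6 and b = (-3)(1) + 4 = 1."""
    return -6.0 * x + 1.0

inputs = [-2.0, 0.0, 0.5, 3.0]
matches = all(abs(network(x) - collapsed(x)) < 1e-12 for x in inputs)
```

The two-layer network and the single linear function agree on every input, which is why a nonlinear activation such as ReLU is used in hidden layers instead.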

Quiz 3: Multiclass Classification

Q1. For a multiclass classification task that has 4 possible outputs, the sum of all the activations adds up to 1. For a multiclass classification task that has 3 possible outputs, the sum of all the activations should add up to ….

1

The softmax activations add up to 1 whether the number of possible classes is 3, 4, 5, or any other number. One way to see this: if e^{z_1}=10, e^{z_2}=20, e^{z_3}=30, then a_1 + a_2 + a_3 = \frac{e^{z_1} + e^{z_2} + e^{z_3}}{e^{z_1} + e^{z_2} + e^{z_3}}, which is 1.
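
The same property can be checked numerically. Below is a minimal softmax sketch in plain Python; the input values are made up, and subtracting the maximum before exponentiating is a common numerical-stability step that does not change the result.

```python
import math

def softmax(z):
    """Return softmax activations for a list of logits z."""
    z_max = max(z)
    exps = [math.exp(z_j - z_max) for z_j in z]   # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

a_three = softmax([1.0, 2.0, 3.0])        # 3 classes
a_four = softmax([0.5, -1.0, 2.0, 0.0])   # 4 classes

sum_three = sum(a_three)
sum_four = sum(a_four)
```

Both sums come out as 1 up to floating-point rounding, regardless of how many classes there are.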

Q2. For multiclass classification, the cross entropy loss is used for training the model. If there are 4 possible classes for the output, and for a particular training example, the true class of the example is class 3 (y=3), then what does the cross entropy loss simplify to? [Hint: This loss should get smaller when a_3 gets larger.]

-log(a_3)
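
A small sketch of this loss for a single example, assuming classes are numbered from 1 as in the question. The function name and the activation values are hypothetical.

```python
import math

def cross_entropy_loss(a, y):
    """Loss for one example: -log of the activation for the true class y (1-indexed)."""
    return -math.log(a[y - 1])

# Two made-up softmax outputs for an example whose true class is 3.
confident = cross_entropy_loss([0.05, 0.05, 0.85, 0.05], y=3)
unsure = cross_entropy_loss([0.30, 0.30, 0.10, 0.30], y=3)
```

The loss is smaller when a_3 is larger, which matches the hint in the question.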

Q3. For multiclass classification, the recommended way to implement softmax regression is to set from_logits=True in the loss function, and also to define the model’s output layer with…

a 'linear' activation

Quiz 4: Additional Neural Network Concepts Quiz Answers

Q1. The Adam optimizer is the recommended optimizer for finding the optimal parameters of the model. How do you use the Adam optimizer in TensorFlow?

When calling model.compile, set optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3).

Q2. The lecture covered a different layer type where each single neuron of the layer does not look at all the values of the input vector that is fed into that layer. What is the name of the layer type discussed in the lecture?

A convolutional layer

Advanced Learning Algorithms Week 03 Quiz Answers

Quiz 1: Advice for applying machine learning

Q1. In the context of machine learning, what is a diagnostic?

A test that you run to gain insight into what is/isn’t working with a learning algorithm.

Q2. True/False? It is always true that the better an algorithm does on the training set, the better it will do on generalizing to new data.

False

Q3. For a classification task, suppose you train three different models using three different neural network architectures. Which data do you use to evaluate the three models in order to choose the best one?

The cross-validation set

Quiz 2: Bias and Variance Quiz Answers

Q1. If the model’s cross-validation error J_{cv} is much higher than the training error J_{train}, this is an indication that the model has…

high variance

Q2. Which of these is the best way to determine whether your model has high bias (has underfit the training data)?

Compare the training error to the baseline level of performance.

Q3. You find that your algorithm has a high bias. Which of these seem like good options for improving the algorithm’s performance? Hint: two of these are correct.

Decrease the regularization parameter \lambda

Collect additional features or add polynomial features

Q4. You find that your algorithm has a training error of 2%, and a cross-validation error of 20% (much higher than the training error). Based on the conclusion you would draw about whether the algorithm has a high bias or high variance problem, which of these seem like good options for improving the algorithm’s performance? Hint: two of these are correct.

1. Increase the regularization parameter \lambda
2. Collect more training data
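
The reasoning behind these bias and variance questions can be sketched as a small diagnostic. This is only an illustration: the function name, the `gap` tolerance of 0.05, and the baseline error of 0 used in the example are all assumptions, not values from the course.

```python
def diagnose(j_train, j_cv, baseline, gap=0.05):
    """Return a rough label from training, cross-validation, and baseline error.

    `gap` is an arbitrary tolerance chosen for this sketch.
    """
    high_bias = (j_train - baseline) > gap       # training error far above baseline
    high_variance = (j_cv - j_train) > gap       # cross-validation error far above training error
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "neither"

# Numbers from Q4: 2% training error and 20% cross-validation error.
q4_case = diagnose(j_train=0.02, j_cv=0.20, baseline=0.0)
```

For the Q4 numbers this sketch reports high variance, which is why a larger regularization parameter and more training data are the suggested remedies.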

Quiz 3: Machine Learning Development Process Quiz Answers

Q1. Which of these is a way to do error analysis?

Manually examine a sample of the training examples that the model misclassified in order to identify common traits and trends.

Q2. We sometimes take an existing training example and modify it (for example, by rotating an image slightly) to create a new example with the same label. What is this process called?

Data augmentation
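
A toy sketch of the idea, using a one-pixel shift instead of a rotation because it is easy to write without an image library. The image, its label, and the helper name `shift_right` are made up for the example.

```python
def shift_right(image, pixels=1, fill=0):
    """Return a copy of a 2D image (list of rows) shifted right, padding with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]

# A tiny made-up 3x3 "image" and its label.
image = [[0, 1, 0],
         [1, 1, 0],
         [0, 1, 0]]
label = 1

# The augmented dataset keeps the original and adds a modified copy with the same label.
augmented = [(image, label), (shift_right(image), label)]
```

The modified copy is a new training example, but its label is unchanged because the transformation does not alter what the image depicts.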

Q3. What are two possible ways to perform transfer learning? Hint: two of the four choices are correct.

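You can choose to train all parameters of the model, including the output layers, as well as the earlier layers.

You can choose to train just the output layers’ parameters and leave the other parameters of the model fixed.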

Advanced Learning Algorithms Week 04 Quiz Answers

Quiz 1: Decision Trees Quiz Answers

Q1. Based on the decision tree shown in the lecture, if an animal has floppy ears, a round face shape, and has whiskers, does the model predict that it’s a cat or not a cat?

Cat

Q2. Take a decision tree learning to classify between spam and non-spam email. There are 20 training examples at the root node, comprising 10 spam and 10 non-spam emails. If the algorithm can choose from among four features, resulting in four corresponding splits, which would it choose (i.e., which has the highest purity)?

Left split: 10 of 10 emails are spam. Right split: 0 of 10 emails are spam.

Quiz 2: Decision tree learning

Q1. Recall that entropy was defined in the lecture as H(p_1) = -p_1 log_2(p_1) - p_0 log_2(p_0), where p_1 is the fraction of positive examples and p_0 the fraction of negative examples.

At a given node of a decision tree, 6 of 10 examples are cats and 4 of 10 are not cats. Which expression calculates the entropy H(p_1) of this group of 10 animals?

-(0.6) log_2(0.6) - (0.4) log_2(0.4)
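
This expression can be evaluated with a short function. The sketch below follows the entropy definition quoted in the question and treats a pure node (p_1 of 0 or 1) as having zero entropy, which is the usual convention.

```python
import math

def entropy(p1):
    """Binary entropy H(p1) = -p1*log2(p1) - p0*log2(p0), with p0 = 1 - p1."""
    if p1 in (0.0, 1.0):
        return 0.0                 # a pure node has zero entropy by convention
    p0 = 1.0 - p1
    return -p1 * math.log2(p1) - p0 * math.log2(p0)

h_node = entropy(0.6)              # 6 of 10 examples are cats
h_even = entropy(0.5)              # an evenly split node, for comparison
```

For the node in the question the entropy is roughly 0.97, slightly below the maximum of 1 reached at an even split.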

Q2. Recall that information gain was defined as follows:

H(p_1^{root}) - \left( w^{left} H(p_1^{left}) + w^{right} H(p_1^{right}) \right)

Before a split, the entropy of a group of 5 cats and 5 non-cats is H(5/10). After splitting on a particular feature, a group of 7 animals (4 of which are cats) has an entropy of H(4/7). The other group of 3 animals (1 is a cat) has an entropy of H(1/3). What is the expression for information gain?

H(0.5) - \left( \frac{7}{10} * H(4/7) + \frac{3}{10} * H(1/3) \right)
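
The weights in this expression are the fractions of examples that land in each branch (7 of 10 and 3 of 10). The sketch below computes the information gain for this split; the function names and the (count, positives) tuple layout are choices made for the example.

```python
import math

def entropy(p1):
    """Binary entropy H(p1), with a pure node defined as zero entropy."""
    if p1 in (0.0, 1.0):
        return 0.0
    p0 = 1.0 - p1
    return -p1 * math.log2(p1) - p0 * math.log2(p0)

def information_gain(p_root, left, right):
    """Each of left/right is a (num_examples, num_positive) tuple."""
    n_left, pos_left = left
    n_right, pos_right = right
    n = n_left + n_right
    w_left, w_right = n_left / n, n_right / n    # fraction of examples per branch
    weighted = w_left * entropy(pos_left / n_left) + w_right * entropy(pos_right / n_right)
    return entropy(p_root) - weighted

# Split from the question: 7 animals (4 cats) on one side, 3 animals (1 cat) on the other.
gain = information_gain(0.5, left=(7, 4), right=(3, 1))
```

For this split the gain is small, roughly 0.035, because neither branch is much purer than the root.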

Q3. To represent 3 possible values for the ear shape, you can define 3 features for ear shape: pointy ears, floppy ears, and oval ears. For an animal whose ears are not pointy, not floppy, but are oval, how can you represent this information as a feature vector?

[0, 0, 1]
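
This one-hot encoding can be produced with a one-line helper. The category order (pointy, floppy, oval) follows the question; the function name is hypothetical.

```python
def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    return [1 if value == category else 0 for category in categories]

EAR_SHAPES = ["pointy", "floppy", "oval"]

oval_vector = one_hot("oval", EAR_SHAPES)
floppy_vector = one_hot("floppy", EAR_SHAPES)
```

Exactly one entry is 1 for any valid ear shape, so each of the three features is binary.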

Q4. For a continuous-valued feature (such as the weight of the animal), there are 10 animals in the dataset. According to the lecture, what is the recommended way to find the best split for that feature?

Choose the 9 mid-points between the 10 examples as possible splits, and find the split that gives the highest information gain.
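
A sketch of how the candidate thresholds could be generated, assuming the 10 values are distinct. The weights are made-up numbers, and evaluating the information gain of each candidate is left out.

```python
def candidate_thresholds(values):
    """Return the mid-points between consecutive sorted values."""
    ordered = sorted(values)
    return [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]

# Hypothetical weights for 10 animals.
weights = [7.2, 8.4, 9.2, 10.2, 11.0, 15.0, 17.0, 18.0, 20.0, 8.8]

thresholds = candidate_thresholds(weights)
```

Ten distinct values yield nine mid-points; each would then be scored by information gain and the best one kept.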

Q5. Which of these are commonly used criteria to decide to stop splitting? (Choose two.)

1. When the number of examples in a node is below a threshold
2. When the tree has reached a maximum depth

Quiz 3: Tree Ensembles Quiz Answers

Q1. For the random forest, how do you build each individual tree so that they are not all identical to each other?

Sample the training data with replacement

Q2. You are choosing between a decision tree and a neural network for a classification task where the input x is a 100×100 resolution image. Which would you choose?

A neural network, because the input is unstructured data, and neural networks typically work better with unstructured data.

Q3. What does sampling with replacement refer to?

Drawing a sequence of examples where, when picking the next example, first replace all previously drawn examples into the set we are picking from.
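
Sampling with replacement can be sketched with Python's standard `random` module. The example identifiers, the helper name, and the seed are arbitrary choices for illustration.

```python
import random

def sample_with_replacement(examples, k, seed=None):
    """Draw k examples; every draw picks from the full list, so repeats are possible."""
    rng = random.Random(seed)
    return [rng.choice(examples) for _ in range(k)]

# Hypothetical training set of 10 example identifiers.
training_set = [f"example_{i}" for i in range(10)]

# A new training set of the same size, as would be built for one tree in a random forest.
bootstrap = sample_with_replacement(training_set, k=len(training_set), seed=0)
```

Because each draw is made from the complete set, the new training set usually contains some examples more than once and omits others, which is what makes the individual trees differ.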

Team Networking Funda

We are Team Networking Funda, a group of passionate authors and networking enthusiasts committed to sharing our expertise and experiences in networking and team building. With backgrounds in Data Science, Information Technology, Health, and Business Marketing, we bring diverse perspectives and insights to help you navigate the challenges and opportunities of professional networking and teamwork.
