### Deep Learning and Reinforcement Learning Week 01 Quiz Answers

#### Quiz 01: Check for Understanding

Q1. Neural networks and Deep Learning are behind many of the AI applications that are part of our daily lives.

- **True**
- False

Q2. Select the best definition of an activation function:

- An activation function is a linear function that transforms the output from one layer into input for another layer.
- **An activation function is a non-linear function that transforms the output from one layer into input for another layer.**
- An activation function is a linear function that transforms the input from one layer into output for another layer.
- An activation function is a non-linear function that transforms the input from one layer into output for another layer.

Q3. This is a characteristic that neural networks and logistic regression have in common:

- both models retain easy explainability for their computational outcomes.
- both models use only linear functions.
- **the weights, inputs, and bias of neural networks are equivalent to the coefficients, variables, and constant of a logistic regression.**
- both models make use of layers of units of computation.

#### Quiz 02: Check for Understanding

Q1. Select the method or methods that best help you find the same results as using matrix linear algebra to solve the equation $\theta = (X^T X)^{-1} X^T y$:

- Use stochastic gradient descent
- Use scikit-learn to build a linear regression model
- Train a neural network model
- **All the above**
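
As a rough illustration of why these approaches can agree, the sketch below fits a one-feature linear model two ways: with the closed-form normal equation and with plain full-batch gradient descent (used here instead of stochastic gradient descent so the run is deterministic). The toy data, learning rate, and step count are assumptions chosen for this example only.

```python
# Toy data generated from y = 1 + 2x (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]


def normal_equation(xs, ys):
    """Solve theta = (X^T X)^{-1} X^T y for X = [1, x] via the 2x2 inverse."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # determinant of X^T X
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


def gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Minimize mean squared error with full-batch gradient descent."""
    b, w = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [b + w * x - y for x, y in zip(xs, ys)]
        grad_b = 2.0 * sum(errors) / n
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * grad_b
        w -= lr * grad_w
    return b, w


closed_form = normal_equation(xs, ys)
iterative = gradient_descent(xs, ys)
print(closed_form, iterative)
```

On this toy problem both methods should recover an intercept near 1 and a slope near 2, up to numerical tolerance.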

Q2. (True/False) Neurons can be used as logic gates

- **True**
- False
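
A minimal sketch of the idea, assuming a single neuron with a step activation: with suitable weights and bias it reproduces AND, OR, and NOT. The specific weight and bias values are hand-picked for illustration, not taken from the course.

```python
def neuron(inputs, weights, bias):
    """Single neuron with a step activation: fires (1) when the weighted sum is positive."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0


# Hand-picked (hypothetical) parameters for each gate.
def and_gate(a, b):
    return neuron([a, b], [1.0, 1.0], -1.5)


def or_gate(a, b):
    return neuron([a, b], [1.0, 1.0], -0.5)


def not_gate(a):
    return neuron([a], [-1.0], 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```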

Q3. (True/False) The feed-forward computation of a neural network can be thought of as matrix calculations and activation functions.

- **True**
- False
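
The statement above can be sketched in a few lines: each layer is a matrix product followed by an element-wise activation. The layer sizes (2 inputs, 3 hidden units, 1 output), the weights, and the choice of sigmoid are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(x, W):
    """Row vector x (length n) times matrix W (n x m), giving a list of length m."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def dense(x, W, b):
    """One layer: matrix product, add bias, then apply the activation."""
    return [sigmoid(z + bias) for z, bias in zip(matvec(x, W), b)]


# Hypothetical weights: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[0.1, -0.2, 0.3],
      [0.4, 0.5, -0.6]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.7], [-0.8], [0.9]]
b2 = [0.05]


def forward(x):
    hidden = dense(x, W1, b1)
    return dense(hidden, W2, b2)


output = forward([1.0, 2.0])
print(output)
```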

#### End of Module Quiz

Q1. What is another name for the “neuron” on which all neural networks are based?

- deep neuron
- sigmoid
- neutron
- **perceptron**

Q2. What is an advantage of using a network of neurons?

- The network is not limited to using only the sigmoid function as an activation function.
- **A network of neurons can represent a non-linear decision boundary.**
- Feedforward capabilities are limited.
- The output of neurons can be averaged.

Q3. A dataset with 8 features would have how many nodes in the input layer?

- 10
- 2
- 4
- **8**

Q4. For a single data point, the weights between an input layer with 3 nodes and a hidden layer with 4 nodes can be represented by a:

- 4 x 3 matrix
- **3 x 4 matrix**
- 4 x 4 matrix
- 3 x 3 matrix

Q5. Use the following image for reference. How many hidden layers are in this Neural Network?

- **Two**
- Four
- Eight
- Fourteen

Q6. Use the following image for reference. How many hidden units are in this Neural Network?

- Two
- Four
- **Eight**
- Fourteen

Q7. Which statement is TRUE about the relationship between Neural Networks and Logistic Regression?

- A Neural Network is less likely to overfit to training data than Logistic Regression.
- A Neural Network with two or more deep layers will likely outperform Logistic Regression.
- A Multi-Layer Perceptron is equivalent to Logistic Regression if all activation functions are the same.
- **A single-layer Neural Network can be parameterized to generate results equivalent to Linear or Logistic Regression.**

### Deep Learning and Reinforcement Learning Week 02 Quiz Answers

#### Quiz 01: Check for Understanding

Q1. True/False. Multi-layer perceptrons always have a hidden layer.

- **True**
- False

Q2. True/False. Multi-layer perceptrons are considered a type of feedforward neural network.

- **True**
- False

Q3. Select the correct rule of thumb regarding training a neural network. In general, as you train a neural network:

- The log loss decreases and the accuracy decreases
- **The log loss decreases and the accuracy increases**
- The log loss increases and the accuracy decreases
- The log loss increases and the accuracy increases

#### End of Module Quiz

Q1. What is the main function of backpropagation when training a Neural Network?

- Preprocess the input layer
- **Make adjustments to the weights**
- Make adjustments to the loss function
- Propagate the output on the output layer

Q2. (True/False) The “vanishing gradient” problem can be solved using a different activation function.

- **True**
- False

Q3. (True/False) Every node in a neural network has an activation function.

- **True**
- False

Q4. These are all activation functions except:

- Sigmoid
- Hyperbolic tangent
- **Leaky hyperbolic tangent**
- ReLu
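
For reference, the commonly used activation functions named here are short enough to write out directly. This is a minimal sketch in plain Python. The leaky variant shown is Leaky ReLU, and its slope `alpha=0.01` is an assumed default for illustration.

```python
import math


def sigmoid(z):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes any real number into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: passes positives, zeroes out negatives."""
    return max(0.0, z)


def leaky_relu(z, alpha=0.01):
    """Like ReLU, but keeps a small slope (alpha) for negative inputs."""
    return z if z > 0 else alpha * z


for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z), leaky_relu(z))
```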

Q5. Deep Learning uses deep Neural Networks for all these uses, except

- As an alternative to manual feature engineering
- To uncover usually unobserved relationships in the data
- **Cases in which explainability is the main objective**
- As a classification and regression technique

Q6. These are all regularization techniques except:

- Regularization penalty in cost function
- Dropout
- Early stopping
- **Pruning**

Q7. (True/False) Optimizer approaches for Deep Learning Regularization use gradient descent:

- True
- **False**

Q8. Stochastic gradient descent is this type of batching method:

- **online learning**
- mini batch
- full batch
- stochastic batch

Q9. (True/False) The main purpose of data shuffling during the training of a Neural Network is to aid convergence and use the data in a different order each epoch.

- **True**
- False

Q10. This is a high-level library that is commonly used to train deep learning models and runs on either TensorFlow or Theano:

- PyTorch
- **Keras**
- Watson Studio
- Deep Learning

### Deep Learning and Reinforcement Learning Week 03 Quiz Answers

#### Quiz 01: Check for Understanding

Q1. Given the syntax below, select the option that will best improve a CNN model that you are trying to fit

`model.fit(x_train, y_train, batch_size=batch_size, epochs=100, validation_data=(x_test, y_test))`

- Remove the validation_data option.
- Increase the number of epochs to 100.
- Decrease the number of epochs to 50.
- Add shuffling, by adding “, shuffle=True” at the end.

Q2. Which of the following statements is **TRUE** about a kernel in a Convolutional Layer applied to an image?

- Kernels allow the convolutional layers to perform nonlinear transformations.
- Kernels detect local features in an image such as lines, corners, and edges.
- Kernels identify which channel in the input data contains the most information.
- Kernels ease computation by reducing the number of dimensions in an image that must be processed.
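
To make the role of a kernel concrete, here is a minimal sketch of a 2D convolution (strictly, cross-correlation, as most deep learning libraries implement it) with no padding and stride 1. The tiny image and the vertical-edge kernel are assumptions chosen so the result is easy to check by hand.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x6 image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# A simple vertical-edge kernel: responds where intensity rises left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0]]
```

The response is zero over the flat regions and non-zero only around the dark-to-bright boundary.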

#### Quiz 02: Check for Understanding

Q1. This concept came as a solution to CNNs in which each layer is turned into branches of convolutions:

- Inception
- Workload portion
- Hebbian Principle
- Network Concatenation

Q2. Which CNN Architecture is considered the flash point for modern Deep Learning?

- AlexNet
- VGG
- Inception
- ResNet
- LeNet

Q3. Which CNN Architecture can be described as a “simplified, deeper LeNet” in which the more layers, the better?

- Deep Lenet
- AlexNet
- VGG
- Inception
- ResNet

Q4. Which CNN Architecture is the precursor of using convolutions to obtain better features and was first used to solve the MNIST data set?

- AlexNet
- VGG
- Inception
- ResNet
- LeNet

Q5. The motivation behind this CNN Architecture was to solve the inability of deep neural networks to fit or overfit the training data better when adding layers.

- LeNet
- AlexNet
- VGG
- Inception
- ResNet

Q6. This CNN Architecture keeps passing both the initial unchanged information and the transformed information to the next layer.

- LeNet
- AlexNet
- VGG
- Inception
- ResNet

Q7. This concept came as a solution to CNNs in which each layer is turned into branches of convolutions:

- Inception
- Hebbian Principle
- Workload portion
- Network Concatenation

#### End of Module Quiz

Q1. What is the main function of backpropagation when training a Neural Network?

- Preprocess the input layer
- Make adjustments to the weights
- Make adjustments to the loss function
- Propagate the output on the output layer

Q2. (True/False) The “vanishing gradient” problem can be solved using a different activation function.

- True
- False

Q3. (True/False) Every node in a neural network has an activation function.

- True
- False

Q4. These are all activation functions except:

- Sigmoid
- Hyperbolic tangent
- Leaky hyperbolic tangent
- ReLu

Q5. Deep Learning uses deep Neural Networks for all these uses, except:

- As an alternative to manual feature engineering
- To uncover usually unobserved relationships in the data
- Cases in which explainability is the main objective
- As a classification and regression technique

Q6. These are all regularization techniques for CNN, except:

- Regularization penalty in cost function
- Dropout
- Early stopping
- Pruning

Q7. (True/False) Optimizer approaches for Deep Learning Regularization use gradient descent:

- True
- False

Q8. Stochastic gradient descent is this type of batching method:

- online learning
- mini batch
- full batch
- stochastic batch

Q9. (True/False) The main purpose of data shuffling during the training of a Neural Network is to aid convergence and use the data in a different order each epoch.

- True
- False

Q10. Which of the following IS NOT a benefit of Transfer Learning?

- Reducing time required to tune hyper-parameters
- Reducing the impact of the vanishing gradient problem on early layers
- Improving the speed at which large models can be trained from scratch
- Conveying computational benefits when problems share similar primitive features.

Q11. Which of the following statements about using a Pooling Layer is TRUE?

- Pooling can reduce both computational complexity and overfitting.
- Pooling can reduce computational complexity, at the cost of overfitting.
- Pooling increases computational complexity, but helps with overfitting.
- Pooling reduces the likelihood of overfitting, but generally does not impact computational complexity.
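
For context, a minimal sketch of what a pooling layer computes, assuming 2x2 max pooling with stride 2. The feature-map values are made up for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in cols
        ]
        for r in rows
    ]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input becomes a 2x2 output, so later layers receive a quarter as many values.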

### Deep Learning and Reinforcement Learning Week 04 Quiz Answers

#### Quiz 01: Check for Understanding

Q1. (True/False) Recurrent Neural Networks are a class of neural networks that allow previous outputs to be used as inputs while having hidden states.

- True
- False

Q2. (True/False) Recurrent Neural Networks are well suited in applications in which the context is important and needs to be incorporated in the prediction.

- True
- False

Q3. These are the two main outputs of a recurrent neural network:

- Prediction and state
- Prediction and parameters
- Prediction and recurrence
- Prediction and learning rate
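
As background for the questions above, here is a minimal sketch of a single recurrent step with scalar inputs, assuming a tanh activation. The weights `w_xh`, `w_hh`, and `w_hy` are hypothetical values. At each time step the cell takes the current input and the previous hidden state, and it returns a prediction along with the updated hidden state that is carried forward.

```python
import math


def rnn_step(x, h_prev, w_xh=0.5, w_hh=0.8, w_hy=1.0):
    """One recurrent step: mix the input with the previous hidden state."""
    h = math.tanh(w_xh * x + w_hh * h_prev)  # updated hidden state
    y = w_hy * h                             # prediction for this time step
    return y, h


def run(sequence):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = 0.0
    predictions = []
    for x in sequence:
        y, h = rnn_step(x, h)
        predictions.append(y)
    return predictions, h


predictions, final_state = run([1.0, 0.0, -1.0])
print(predictions, final_state)
```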

#### End of Module Quiz

Q1. (True/False) RNN models are mostly used in the fields of natural language processing and speech recognition.

- True
- False

Q2. (True/False) GRUs and LSTM are a way to deal with the vanishing gradient problem encountered by RNNs

- True
- False

Q3. (True/False) GRUs will generally perform about as well as LSTMs with shorter training time, especially for smaller datasets.

- True
- False

Q4. (True/False) The main idea of Seq2Seq models is to improve accuracy by keeping necessary information in the hidden state from one sequence to the next.

- True
- False

Q5. (True/False) The main parts of a Seq2Seq model are: an encoder, a hidden state, a sequence state, and a decoder.

- True
- False

Q6. Select the correct option, in the context of Seq2Seq models:

- The **Greedy Search** algorithm selects one best candidate as an input sequence for each time step while the **Beam Search** produces multiple different hypotheses based on **the output from the encoder**.
- The **Beam Search** algorithm selects one best candidate as an input sequence for each time step while the **Greedy Search** produces multiple different hypotheses based on **the output from the encoder**.
- The **Greedy Search** algorithm selects one best candidate as an input sequence for each time step while the **Beam Search** produces multiple different hypotheses based on **conditional probability**.
- The **Beam Search** algorithm selects one best candidate as an input sequence for each time step while the **Greedy Search** produces multiple different hypotheses based on **conditional probability**.
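
The two decoding strategies can be compared on a toy example. In this sketch the table `PROBS` stands in for a trained decoder. Its tokens and probabilities are invented for illustration, and no encoder is involved.

```python
import math

# Hypothetical conditional probabilities P(next token | previous token).
PROBS = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
}


def greedy_search(steps=2):
    """Keep only the single most probable next token at each step."""
    seq, logp = ["<s>"], 0.0
    for _ in range(steps):
        token, p = max(PROBS[seq[-1]].items(), key=lambda kv: kv[1])
        seq.append(token)
        logp += math.log(p)
    return seq[1:], logp


def beam_search(beam_width=2, steps=2):
    """Keep the beam_width best partial sequences (hypotheses) at each step."""
    beams = [(["<s>"], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for token, p in PROBS[seq[-1]].items():
                candidates.append((seq + [token], logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return [(seq[1:], logp) for seq, logp in beams]


greedy_seq, greedy_logp = greedy_search()
beam_results = beam_search()
print(greedy_seq, math.exp(greedy_logp))
print(beam_results[0][0], math.exp(beam_results[0][1]))
```

With these made-up numbers, greedy search commits to `a` first and ends with a sequence of probability 0.30, while beam search keeps `b` alive and finds `b, x` with probability 0.36.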

Q7. Which is the gating mechanism for RNNs that include a reset gate and an update gate?

- GRUs
- LSTMs
- Refined Gate
- Complex Gate

Q8. LSTM models are among the most common Deep Learning models used in forecasting. These are other common uses of LSTM models, except:

- Speech Recognition
- Machine Translation
- Image Captioning
- Generating Images
- Anomaly Detection
- Robotic Control

### Deep Learning and Reinforcement Learning Week 05 Quiz Answers

#### Quiz 01: Check for Understanding

Q1. (True/False) Autoencoders learn a compressed representation of the input by first compressing the input (encoding) and decompressing it back (decoding) to match the original input.

- True
- False

Q2. All of these are examples of applications of Autoencoders, except:

- Anomaly detection
- Machine translation
- Time series forecasting
- Recommender systems
- Image-related applications (generation, denoising, processing and compression)
- Popularity prediction for social media posts

Q3. Which is the main goal of Variational Autoencoders?

- Add variations to the encoding section of the autoencoder
- Generate images using the decoder
- Decrease the decoding error
- Decrease the decoding time

#### Quiz 02: Check for Understanding

Q1. (True/False) A common characteristic of both Autoencoders and Variational Autoencoders is that both have one neural network for encoding and another one for decoding.

- True
- False

Q2. These are all additional steps that you need to consider when using Variational Autoencoders, except:

- Consider parameters within a distribution
- Incorporate a logarithmic term into the loss function
- Remove a KL loss function
- Use binary crossentropy

Q3. Choose the right assertion in the context of comparing the reconstruction error of Autoencoders and Variational Autoencoders:

- The reconstruction error of **autoencoders** can be **lower** because autoencoders are designed to maximize the interpretability of the latent space, not to minimize the reconstruction error.
- The reconstruction error of **variational autoencoders** can be **higher** because variational autoencoders are designed to maximize the interpretability of the latent space, not to minimize the reconstruction error.
- The reconstruction error of **autoencoders** can be **higher** because autoencoders are designed to maximize the interpretability of the latent space, not to minimize the reconstruction error.
- The reconstruction error of **variational autoencoders** can be **lower** because variational autoencoders are designed to maximize the interpretability of the latent space, not to minimize the reconstruction error.

#### End of Module Quiz

Q1. (True/False) An Autoencoder is a form of unsupervised learning

- True
- False

Q2. Select the right assertion:

- Autoencoders learn from a compressed representation of the data, while variational autoencoders learn from a probability distribution representing the data.
- Variational autoencoders learn from a compressed representation of the data, while autoencoders learn from a probability distribution representing the data.
- Autoencoders and Principal Component Analysis can be used interchangeably.
- Variational Autoencoders and Principal Component analysis can be used interchangeably.

Q3. (True/False) Variational autoencoders are generative models.

- True
- False

Q4. When comparing the results of Autoencoders and Principal Component Analysis, which approach might best improve the results from Autoencoders?

- Add labels to the data
- Add layers and epochs
- Add a Variational Autoencoder
- Reduce the dimensions of the data

Q5. (True/False) KL loss is used in Variational Autoencoders to represent the measure of the difference between two distributions.

- True
- False
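
For reference, a minimal sketch of the KL term as it is commonly written in VAE implementations, assuming a diagonal Gaussian encoder output compared against a standard normal prior. The function name and the `mu` / `log_var` parameterization are assumptions for this example.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


# Identical distributions give zero divergence.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
# Shifting the mean away from 0 increases the divergence.
print(kl_to_standard_normal([1.0], [0.0]))
```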

Q6. (True/False) A good way to compare the inputs and outputs of a Variational Autoencoder is to calculate the mean of a reconstruction function based on binary crossentropy.

- True
- False
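
A minimal sketch of such a reconstruction measure, assuming inputs scaled to [0, 1] and flattened to a list. The clipping constant `eps` is an assumption added to avoid taking the log of zero.

```python
import math


def mean_binary_crossentropy(x, x_hat, eps=1e-7):
    """Mean binary crossentropy between inputs x and reconstructions x_hat."""
    total = 0.0
    for target, pred in zip(x, x_hat):
        pred = min(max(pred, eps), 1.0 - eps)  # keep log() finite
        total += -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))
    return total / len(x)


# A near-perfect reconstruction scores lower than an uninformative one.
good = mean_binary_crossentropy([1.0, 0.0], [0.99, 0.01])
poor = mean_binary_crossentropy([1.0, 0.0], [0.5, 0.5])
print(good, poor)
```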

### Deep Learning and Reinforcement Learning Week 06 Quiz Answers

#### Quiz 01: Check for Understanding

Q1. The development of Generative Adversarial Networks was motivated, in part, by

- the need for faster computation across multiple platforms.
- the need to simultaneously generate differing types of output.
- the vulnerability of standard Deep Learning approaches to input manipulation
- the inability of standard Deep Learning approaches to implement backpropagation.

Q2. (True/False) GANs are a way of training two neural networks simultaneously.

- True
- False

Q3. (True/False) GANs are probably behind some applications like FaceApp and applications that can make you look older.

- True
- False

#### Quiz 02: Check for Understanding

Q1. Relative to problems suitable for Deep Learning, Reinforcement Learning allows for analysis of problems in which:

- agents control the actions taken but do not observe outcomes.
- agents observe outcomes but cannot control the actions taken over time.
- agents use dropout to supplement the results of separate analyses.
- agents control actions taken and learn to optimize outcomes over time.

Q2. Which of the following examples would NOT be suitable for Reinforcement Learning?

- Training a robot to move through a maze
- Developing a strategy to play a video game
- Estimating the directional impact of wind on drone movement
- Identifying approaches to maximize profit through algorithmic trading

Q3. Which of the following statements about the environment in a Reinforcement Learning problem is TRUE?

- At each stage, rewards available in the environment are clearly defined.
- The environment is defined by a set of rules and remains fixed over time.
- The timing of expected rewards can impact the policy rule selected by the agent.
- A unique policy solution exists whenever an agent can obtain perfect information about rewards and actions.

#### Module 6 Quiz

Q1. (True/False) Simulation is a common approach for Reinforcement Learning applications that are complex or computing intensive.

- True
- False

Q2. (True/False) Discounting rewards refers to an agent reducing the value of the reward based on its uncertainty.

- True
- False
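
For background, in the usual formulation a reward received t steps in the future is weighted by gamma raised to the power t, where gamma is a discount factor between 0 and 1. A minimal sketch, with the reward sequence and gamma chosen for illustration:

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (return from later steps).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```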

Q3. (True/False) Successful Reinforcement Learning approaches are often limited by extreme sensitivity to hyperparameters.

- True
- False

Q4. (True/False) Reinforcement Learning approaches are often limited by excessive computation resources and data requirements.

- True
- False

Q5. Which type of Deep Learning approach is most commonly used for image recognition?

- Autoencoders
- Multi-Layer Perceptron
- Recurrent Neural Network
- Convolutional Neural Network

Q6. Which type of Deep Learning approach is most commonly used for forecasting problems?

- Autoencoders
- Multi-Layer Perceptron
- Recurrent Neural Network
- Convolutional Neural Network

Q7. Which type of Deep Learning approach is most commonly used for generating artificial images?

- Autoencoders
- Multi-Layer Perceptron
- Recurrent Neural Network
- Convolutional Neural Network

Q8. The main parts of GANs architecture are:

- generator and discriminator
- loss error and random noise
- generated and adversarial neurons
- adversarial and non adversarial neurons

Q9. (True/False) One of the main advantages of GANs over other adversarial networks is that it does not spend any time evaluating whether an input or image is fake or real. It only computes probability of being fake.

- True
- False
