Art and Science of Machine Learning Week 01 Quiz Answers
The Art of ML: Regularization Quiz Answers
Q1. Regularization is useful because it can:
- Limit overfitting
- Make models smaller
Hyperparameter Tuning Quiz Answers
Q1. If searching among a large number of hyperparameters, you should do a systematic grid search rather than start from random values, so that you are not relying on chance. True or False?
- False
Q2. It is a good idea to use the training loss itself as the hyperparameter tuning metric. True or False?
- False
Q3. Hyperparameter tuning in Cloud ML Engine involves adding the appropriate TensorFlow function call to your model code. True or False?
- False
Q4. You are creating a model to predict the outcome (final score difference) of a basketball game between Team A and Team B. Your initial model is a neural network with [64, 32] nodes, learning_rate = 0.05, batch_size = 32. The input features include whether the game was played “at home” for Team A, the fraction of the last 7 games that Team A won, the average number of points scored by Team A in its last 7 games, the average score of Team A’s opponents in its last 7 games, etc.
Which of these are hyperparameters to the model?
- The number of nodes in each layer of the DNN
- The learning rate
- The batch size
- The number of layers in the DNN
- The number of previous games that the input features are averaged over
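The hyperparameters listed above can be explored by sampling random combinations and scoring each one on held-out data. The following is a minimal sketch only: the search ranges, the `games_averaged` key, and the `fake_validation_loss` stand-in are all hypothetical and are not part of the quiz or of any Cloud ML Engine API.

```python
import random

# Hypothetical search space; every range here is an illustrative assumption.
SEARCH_SPACE = {
    # Each entry fixes both the number of layers and the nodes per layer.
    "hidden_units": [[64, 32], [128, 64], [32, 16, 8]],
    "learning_rate": (0.001, 0.1),
    "batch_size": [16, 32, 64, 128],
    # Feature-engineering hyperparameter: games the input features average over.
    "games_averaged": [3, 5, 7, 10],
}


def sample_config(rng):
    """Draw one random combination of hyperparameters."""
    low, high = SEARCH_SPACE["learning_rate"]
    return {
        "hidden_units": rng.choice(SEARCH_SPACE["hidden_units"]),
        "learning_rate": rng.uniform(low, high),
        "batch_size": rng.choice(SEARCH_SPACE["batch_size"]),
        "games_averaged": rng.choice(SEARCH_SPACE["games_averaged"]),
    }


def random_search(evaluate, num_trials, seed=0):
    """Return the sampled config with the lowest score (lower is better)."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def fake_validation_loss(config):
    """Hypothetical stand-in for training a model and scoring it on held-out data."""
    return abs(config["learning_rate"] - 0.02) + abs(config["batch_size"] - 64) / 1000


best, score = random_search(fake_validation_loss, num_trials=20)
print(best, score)
```

In a real run, `evaluate` would train the model with the given configuration and return a metric computed on evaluation data rather than on the training set.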
Learning Rate & Batch Size Quiz Answers
Q1. What is the key reason that we want to penalize models for over-complexity?
- Overly complex models may not be generalizable to real-world scenarios on unseen data
Q2. If your learning rate is too small, your loss function will:
- Converge very slowly
Q3. If your learning rate is too high, your loss function will:
- Converge rapidly, but not reach the lowest error value possible
Q4. If your batch size is too high, your loss function will:
- Converge slowly
Q5. If your batch size is too low, your loss function will:
- Oscillate wildly
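The effect of the learning rate can be illustrated on a toy problem. The sketch below assumes a one-dimensional loss f(w) = w**2 and plain gradient descent; the specific learning rates are chosen for illustration, and real models will behave less cleanly than this.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimize f(w) = w**2 and return the loss after each step."""
    w = start
    losses = []
    for _ in range(steps):
        grad = 2 * w  # derivative of w**2
        w -= learning_rate * grad
        losses.append(w * w)
    return losses


slow = gradient_descent(0.01)     # tiny steps: loss falls, but very slowly
good = gradient_descent(0.4)      # reasonable steps: loss drops quickly toward 0
too_high = gradient_descent(1.0)  # each step overshoots the minimum at w = 0

print(slow[-1], good[-1], too_high[-1])
```

With a learning rate of 1.0 on this particular function, each update jumps from w to -w, so the loss never gets any closer to its minimum.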
Art and Science of Machine Learning Week 02 Quiz Answers
L1 Regularization Quiz Answers
Q1. Which type of regularization is more likely to lead to zero weights?
- L1 regularization
Q2. Which type of regularization penalizes large weight values more?
- L2 regularization
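The difference between the two penalties can be seen in how each one shrinks a weight. This is a minimal sketch, assuming an L1 penalty of sum(|w|) handled with a soft-thresholding step and an L2 penalty of sum(w**2) handled with a plain gradient step; the weights and step size are made up for illustration.

```python
def l1_penalty(weights):
    """Sum of absolute values."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """Sum of squares."""
    return sum(w * w for w in weights)


def l1_shrink(w, step):
    """Soft-thresholding: move w toward 0 by a fixed amount, clipping at 0."""
    if w > step:
        return w - step
    if w < -step:
        return w + step
    return 0.0


def l2_shrink(w, step):
    """Gradient step on w**2: the shrinkage is proportional to w itself."""
    return w - step * 2 * w


weights = [0.05, -0.3, 2.0]  # illustrative values
step = 0.1

after_l1 = [l1_shrink(w, step) for w in weights]
after_l2 = [l2_shrink(w, step) for w in weights]

print(after_l1)  # the small weight is set exactly to zero
print(after_l2)  # every weight shrinks, the largest one by the most
```

The L1 step removes the same amount from every weight, so small weights reach exactly zero, while the L2 step removes an amount proportional to the weight, so large weights are reduced the most and none become exactly zero.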
Logistic Regression Quiz Answers
Q1. You are training your classification model and are using Logistic Regression. Which is true?
- Your last layer has no weights that can be tuned
Multi Class Neural Network Quiz Answers
Q1. If you have a classification problem with multiple labels, how does the neural network architecture change?
- Have a logistic layer for each label, and send the outputs of the logistic layer to a softmax layer
Q2. If you have thousands of classes, computing the cross-entropy loss can be very slow. Which of these is a way to help address that problem?
- Use a noise-contrastive loss function
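To see why a large number of classes makes the loss expensive, it can help to look at a plain softmax cross-entropy. The sketch below is a generic pure-Python version, not the course's TensorFlow code: the normalizing sum runs over every class, which is the cost that sampled losses such as noise-contrastive estimation are designed to avoid.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)      # this sum touches every class
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[true_class])


probs = softmax([2.0, 1.0, 0.1])
uniform_loss = cross_entropy([0.0, 0.0, 0.0], true_class=0)

print(probs, uniform_loss)
```

With three equal logits, each class gets probability 1/3, so the loss equals log(3).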
Training Neural Network Quiz Answers
Q1. Which of these is a common way that neural network training can fail?
- Gradients can explode if the learning rate is too high
- Entire layers can die with all their weights becoming zero
- Gradients can vanish, making it harder to train networks the deeper they are
Q2. If you see a dead layer (fraction of zero weights close to 1), what is a reasonable thing to try?
- Lower the learning rate
- Try using ReLU activation function
Art and Science of Machine Learning Week 03 Quiz Answers
Custom Estimator Quiz Answers
Q1. What is the benefit of using a pre-canned Estimator?
- It can give us a quick ML model
Q2. What is the recommended way to create distributed Keras models?
- Write a Keras model as normal, and use the model_to_estimator function to convert it into an Estimator for train_and_evaluate
Q3. In the model function for a custom estimator, you can customize:
- The set of evaluation metrics
- The loss metric that is optimized
- The optimizer that is used
- The predictions that are returned
Embeddings Quiz Answers
Q1. What does the word “embedding” mean in the context of Machine Learning?
- It means converting words into vectors. This allows you to do calculations on them and find similarities between them. Well-trained models with word embeddings have shown a powerful understanding of language.
Q2. Which of these statements are true?
- Embeddings can be used to project data to a lower dimensional representation
- Embeddings learned on one problem can be reused in another problem
- Creating embeddings can be the first step to solving a clustering problem
- Embeddings can be learned directly from the data
- Embeddings learned on one problem can be used as a starting point when training a related problem
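As a rough illustration of doing calculations on embeddings, the sketch below compares words by cosine similarity. The three-dimensional vectors are hand-picked, hypothetical values; real embeddings are learned from data and have many more dimensions.

```python
import math

# Hypothetical embedding table; the values are made up for illustration.
EMBEDDINGS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word):
    """Return the other word whose vector is closest to `word`."""
    target = EMBEDDINGS[word]
    others = [w for w in EMBEDDINGS if w != word]
    return max(others, key=lambda w: cosine_similarity(target, EMBEDDINGS[w]))


print(most_similar("king"))
```

The same nearest-neighbor comparison is what makes embeddings usable as a first step toward clustering.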