# Introduction to Machine Learning Coursera Quiz Answers 2021

## All Weeks Introduction to Machine Learning Coursera Quiz Answers

This course will provide you with a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) as well as demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction.

In addition, we have designed practice exercises that will give you hands-on experience implementing these data science models on data sets. These practice exercises will teach you how to implement machine learning algorithms with PyTorch, an open-source library used by leading tech companies in the machine learning field (e.g., Google, NVIDIA, Coca-Cola, eBay, Snapchat, Uber, and many more).

## Introduction to Machine Learning Coursera Quiz Answers

**Week 01: Comprehensive**

Q1. Which of the following are necessary for supervised machine learning? (Choose all that are correct)

- **A model**
- **Learning from data**
- **Labeled training data**
- Human to teach the machine

Q2. What decision boundary can logistic regression provide?

- Arbitrarily complex functions
- Jagged edges
- Smooth curves
- **Linear**
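To see why the boundary in Q2 is linear, here is a minimal sketch. It assumes a two-feature input and uses hypothetical weights `w` and bias `b` that are not taken from the course. Logistic regression predicts `sigmoid(w·x + b)`, so the probability is exactly 0.5 wherever `w·x + b = 0`, which is a straight line in two dimensions.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Probability of the positive class for a 2-D point x."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return sigmoid(z)

# Hypothetical weights and bias, chosen only for illustration.
w = (2.0, -1.0)
b = 0.5

# Points that satisfy 2*x0 - x1 + 0.5 = 0 lie on the decision boundary.
on_boundary = [(0.0, 0.5), (1.0, 2.5), (-1.0, -1.5)]
boundary_probs = [predict_proba(x, w, b) for x in on_boundary]

# Points on either side of the line fall into different classes.
above_half = predict_proba((1.0, 0.0), w, b)   # z = 2.5
below_half = predict_proba((0.0, 2.0), w, b)   # z = -1.5
```

Every point on the line gets probability 0.5, and the predicted class flips only when a point crosses that line.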

Q3. What is the primary advantage of using multiple filters?

- More complexity is always better.
- This requires less compute power.
- **This allows the model to look for subtypes of the classification.**
- This is simpler to implement.

Q4. Which one of the following best describes transfer learning in the context of document analysis?

- All parameters of the model are different between individuals.
- **Parameters at the bottom of the model are transferable across all people and documents, while the parameters at the top are different between individuals.**
- All parameters of the model are transferable across all people and documents.
- Parameters at the top of the model are transferable across all people and documents, while the parameters at the bottom are different between individuals.

Q5. Given the following image of data classifications, which of the following models would you choose?

- **Logistic regression**
- Multilayer perceptron

Q6. What new feature did neural networks acquire in 2010?

- A new computational platform: the GPU
- A new application: image search
- A new operation: convolution
- **A new name: Deep Learning**

Q7. Which of the following is convolved with layer 2 features, or sub-motifs?

- Layer 2 feature map
- **Layer 1 feature map**
- Layer 3 feature map

Q8. Which of the following gives the best conceptual meaning of convolution?

- Surveying a feature map for high-level motif.
- Selecting an atomic element from an image.
- Stacking a collection of feature maps.
- **Shifting a filter to every location in an image.**
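The idea in Q8 can be sketched in plain Python. This is an illustrative example, not the course's implementation: the function name `slide_filter`, the tiny image, and the kernel are all made up, and the sketch assumes stride 1 and no padding. At each location it sums the elementwise products of the filter and the image patch, which is the operation deep learning libraries usually call convolution.

```python
def slide_filter(image, kernel):
    """Shift `kernel` to every valid location of `image` (stride 1, no padding)
    and record the sum of elementwise products at each location."""
    image_h, image_w = len(image), len(image[0])
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(image_h - kernel_h + 1):
        row = []
        for c in range(image_w - kernel_w + 1):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Made-up 4x4 image containing a 2x2 bright square.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Made-up 2x2 filter that responds to bright 2x2 patches.
kernel = [
    [1, 1],
    [1, 1],
]
feature_map = slide_filter(image, kernel)
# feature_map == [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
```

The largest response (4) appears where the filter sits exactly on top of the square.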

Q9. What does transfer learning mean in the context of medical imaging?

- Just as assigning categories to images in ImageNet required millions of images, so too does analyzing medical images require millions of labeled medical images.
- Sufficient labeled radiological images can be used to learn all of the model parameters, so they can be used for ophthalmological or dermatological images.
- Once the convolutional layers are learned from labeled medical images, the top layers can be inferred from the parameters found with data from ImageNet.
- **Weights of convolutional layers learned from ImageNet transfer to medical images, so we only need to learn new parameters at the top of the network.**

Q10. What is the primary advantage of having a deep architecture?

- There is a higher probability that each motif is used in the classifier.
- **The model shares knowledge between motifs through their shared substructures.**
- A model can learn each top-level motif in isolation.
- The parameters of a deep architecture are less expensive to compute.

**Week 02: Comprehensive**

Q1. What does the equation for the loss function do conceptually?

- Mathematically define network outputs
- **Penalize overconfidence**
- Ignore historical statistical developments
- Reward indecision
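A small numeric illustration of "penalize overconfidence" follows. It assumes the loss in question is the binary cross-entropy (log loss); the quiz does not state the equation, so treat that as an assumption. The probabilities are made up.

```python
import math

def log_loss(p_predicted, y_true):
    """Binary cross-entropy for one example; y_true is 0 or 1,
    p_predicted is the predicted probability of class 1."""
    return -(y_true * math.log(p_predicted)
             + (1 - y_true) * math.log(1 - p_predicted))

# The true label is 1 in all three cases.
confident_right = log_loss(0.99, 1)   # confident and correct: tiny loss
unsure = log_loss(0.50, 1)            # undecided: moderate loss, ln(2)
confident_wrong = log_loss(0.01, 1)   # confident and wrong: large loss
```

The loss grows without bound as a wrong prediction approaches certainty, so a confidently wrong model is punished far more than an undecided one.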

Q2. What is overfitting?

- Overfitting refers to the fact that more complexity is always better, which is why deep learning works.
- **Model complexity fits too well to training data and will not generalize in the real-world.**
- Model complexity is perfectly matched to the data.
- Model complexity is not enough to capture the nuance of the data and will under-perform in the real-world.

Q3. Why should the test set only be used once?

- **More than one use can lead to bias.**
- More than one use can lead to overfitting.
- The model cannot learn anything new from subsequent uses.
- It is expensive to use more than once.

Q4. Which two of the following describe the purpose of a validation set?

- To estimate the performance of a model.
- **To pick the best performing model.**
- To test the performance in lieu of real-world data.
- To learn the model parameters.

Q5. How do we learn our network?

- **Gradient descent**
- Downhill skiing
- Monte Carlo simulation
- Analytically determine global minimum

Q6. What technique is used to minimize loss for a large data set?

- Newton’s method
- Taylor series expansion
- **Stochastic gradient descent**
- Gradient descent

Q7. Which of the following are benefits of stochastic gradient descent?

- **With stochastic gradient descent, the update time does not scale with data size.**
- Stochastic gradient descent finds the solution more accurately.
- **Stochastic gradient descent can update many more times than gradient descent.**
- Stochastic gradient descent gets near the solution quickly.
- Stochastic gradient descent finds a more exact gradient than gradient descent.
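The cost difference between the two methods can be sketched on a toy problem. Everything here is an assumption for illustration: a one-parameter model `y = w * x`, synthetic noiseless data generated from `w = 3`, a mean squared error loss, and hand-picked values for the learning rate and batch size.

```python
import random

def full_gradient(w, data):
    """Gradient of mean squared error over ALL points.
    The cost of one call grows with len(data)."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def minibatch_gradient(w, data, batch_size, rng):
    """Noisy gradient estimate from a random mini-batch.
    The cost of one call depends only on batch_size."""
    batch = rng.sample(data, batch_size)
    return sum(2 * (w * x - y) * x for x, y in batch) / batch_size

rng = random.Random(0)
# Synthetic data from y = 3x with x drawn uniformly from [0, 1).
data = [(x, 3.0 * x) for x in (rng.random() for _ in range(10_000))]

w = 0.0
learning_rate = 0.1
num_updates = 500
batch_size = 10
for _ in range(num_updates):
    w -= learning_rate * minibatch_gradient(w, data, batch_size, rng)

# Total points examined by SGD: 500 updates * 10 points = 5,000,
# fewer than a single full-gradient evaluation over 10,000 points.
points_examined = num_updates * batch_size
```

In this toy setting, 500 stochastic updates examine half as many points as one full gradient computation, yet `w` ends up very close to 3.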

Q8. Why is gradient descent computationally expensive for large data sets?

- Large data sets do not permit computing the loss function, so a more expensive measure is used.
- **Calculating the gradient requires looking at every single data point.**
- Large data sets require deeper models, which have more parameters.
- There are too many local minima for an algorithm to find.

Q9. What are the two main benefits of early stopping?

- **It helps save computation cost.**
- **It performs better in the real world.**
- It improves the training loss.
- There is rigorous statistical theory on it.
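Early stopping can be sketched as a simple rule over per-epoch validation losses. The `patience` parameter, the function name, and the loss values below are assumptions for illustration; the course may define the stopping rule differently.

```python
def early_stopping_epoch(validation_losses, patience):
    """Return (best_epoch, epochs_run) for a list of per-epoch validation losses.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(validation_losses):
        epochs_run = epoch + 1
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, epochs_run

# Made-up losses: validation loss improves, then rises as the model overfits.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70, 0.80, 0.95]
best_epoch, epochs_run = early_stopping_epoch(losses, patience=2)
# best_epoch == 3, epochs_run == 6
```

Only 6 of the 10 epochs are run (saving computation), and the model kept is the one from epoch 3, before validation loss started to rise.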

Q10. Why are optimization and validation at odds?

- **Optimization seeks to do as well as possible on a training set, while validation seeks to generalize to the real world.**
- Optimization seeks to generalize to the real world, while validation seeks to do as well as possible on a validation set.
- Optimization seeks to do as well as possible on a training set, while validation seeks to do as well as possible on a validation set.
- They are not at odds; they have the same goal.

**Week 03: Comprehensive**

Q1. Which of the following indicates whether a doctor or machine is doing well at finding positive examples in a data set?

- Positive Predictive Value
- Likelihood Ratio
- **Sensitivity**
- Specificity

Q2. Which of the following is used to distinguish the false positive rate from the false negative rate?

- Sensitivity
- False Negative
- Negative Predictive Value
- **Specificity**
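The two metrics from Q1 and Q2 follow directly from the counts in a confusion matrix. The counts below are made up for illustration.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that were found (true positive rate)."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that were correctly rejected (true negative rate)."""
    return true_negatives / (true_negatives + false_positives)

# Made-up counts: 100 actual positives and 100 actual negatives.
tp, fn = 80, 20
tn, fp = 90, 10

sens = sensitivity(tp, fn)          # 0.8
spec = specificity(tn, fp)          # 0.9
false_negative_rate = 1 - sens      # share of positives that were missed
false_positive_rate = 1 - spec      # share of negatives that were flagged
```

Sensitivity only looks at the actual positives, while specificity only looks at the actual negatives, which is why the two can move independently.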

Q3. Which of the following is the best conceptual definition of one dimensional convolution?

- “Inverting” of a shape, where the inversion matches a feature.
- **“Sliding” of two signals, where a matched feature gives a high value of convolution.**
- “Intertwining” of two signals, where one wraps around the other to form a feature.
- “Distortion” of one signal, according to the feature shape
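The "sliding" picture in Q3 can be sketched as follows. The signal and feature values are made up, and the sketch uses the sliding dot product without flipping the feature, which is how the term convolution is typically used in deep learning.

```python
def slide_1d(signal, feature):
    """Slide `feature` along `signal` and record the dot product at each offset."""
    num_offsets = len(signal) - len(feature) + 1
    return [
        sum(signal[i + j] * feature[j] for j in range(len(feature)))
        for i in range(num_offsets)
    ]

# Made-up signal with a bump of shape [1, 2, 1] starting at index 2.
signal = [0, 0, 1, 2, 1, 0, 0, 0]
feature = [1, 2, 1]

scores = slide_1d(signal, feature)
best_offset = scores.index(max(scores))
# scores == [1, 4, 6, 4, 1, 0], best_offset == 2
```

The score peaks at the offset where the feature lines up with the matching bump in the signal.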

Q4. Which of the following can a user choose when designing a convolutional layer? (Choose all that are correct.)

- **Filter depth**
- **Filter size**
- **Filter number**
- **Filter stride**
- Filter weights
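To make the distinction in Q4 concrete, here is a rough sketch of how the design choices determine the shape of a layer, while the weights themselves are left to training. The helper names and the example numbers are hypothetical, and the sketch assumes square filters, no padding, and one bias per filter.

```python
def conv_output_size(input_size, filter_size, stride):
    """Output width along one dimension, assuming no padding."""
    return (input_size - filter_size) // stride + 1

def conv_parameter_count(filter_size, filter_depth, num_filters):
    """Number of learned values: each filter has
    filter_size * filter_size * filter_depth weights plus one bias."""
    return num_filters * (filter_size * filter_size * filter_depth + 1)

# Hypothetical layer: 28x28 single-channel input, 16 filters of size 5x5, stride 1.
output_size = conv_output_size(input_size=28, filter_size=5, stride=1)               # 24
parameter_count = conv_parameter_count(filter_size=5, filter_depth=1, num_filters=16)  # 416

# A larger stride shrinks the output without changing the parameter count.
strided_output_size = conv_output_size(input_size=28, filter_size=4, stride=2)       # 13
```

The designer fixes the size, depth, number, and stride of the filters; the 416 values inside them are what gradient descent learns.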

Q5. What is a fully connected readout?

- A layer with ten classifications.
- A layer with connections to all feature maps.
- The vectorization of a pooling layer.
- **A layer with a single neuron for each output class.**

Q6. Why are nonlinear activation functions preferable?

- Nonlinear activation functions are preferable because they are used in generalized linear models in statistics.
- **Nonlinear activation functions increase the functional capacity of the neural network by allowing the representation of nonlinear relationships between features in input.**
- Nonlinear activation functions are preferable because they have been used historically.
- Nonlinear activation functions are NOT preferable to linear ones, as they lose information in systems with high variance.

Q7. Which of the following are benefits of pooling? (Choose all that are correct.)

- **Decreases bias.**
- **Combats overfitting.**
- **Vectorizes the data.**
- **Encourages translational invariance.**
- Reduces computational complexity.

Q8. How are parameters that minimize the loss function found in practice?

- Fractal geometry
- Gradient descent
- Simplex algorithm
- **Stochastic gradient descent**

Q9. Which of the following is an advantage of hierarchical representation of image features?

- Eliminating bias.
- Decreasing the computational complexity.
- **Better leveraging all training data.**
- Decreasing variance in the model.

Q10. Why does transfer learning work?

- **Top-level features are specialized for a particular task, while low-level features are universal to all images.**
- All layers of filters can be learned by studying the mammalian receptive fields.
- Low-level features are specialized for a particular task, while top-level features are universal to all images.
- All images are composed of pixels with three color channels.

**Week 04: Comprehensive**

Q1. What is meant by “word vector”?

- The latitude and longitude of the place a word originated.
- **A vector of numbers associated with a word.**
- Assigning a corresponding number to each word.
- A vector consisting of all words in a vocabulary.

Q2. Which word is a synonym for “word vector”?

- Norm
- Array
- **Embedding**
- Stack

Q3. What is the term for a set of vectors, with one vector for each word in the vocabulary?

- Space
- Array
- **Codebook**
- Embedding
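The terms from Q1 to Q3 fit together as in the following toy sketch. The words, the three-dimensional vectors, and the `embed` helper are all made up for illustration; real word vectors are learned from data and are much longer.

```python
# A codebook: one vector (embedding) for each word in the vocabulary.
# The numbers are invented; in practice they are learned.
codebook = {
    "king":  [0.8, 0.3, 0.1],
    "queen": [0.7, 0.4, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def embed(sentence):
    """Replace each word in a sentence with its word vector from the codebook."""
    return [codebook[word] for word in sentence.split()]

vectors = embed("king queen apple")
```

Each word maps to a full vector of numbers, not a single number, and the codebook is simply the collection of those vectors.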

Q4. What is natural language processing?

- Making natural text conform to formal language standards.
- Translating natural text characters to unicode representations.
- Translating human-readable code to machine-readable instructions.
- **Taking natural text and making inferences and predictions.**

Q5. What is the goal of learning word vectors?

- Find the hidden or latent features in a text.
- Labelling a text corpus, so a human doesn’t have to do it.
- Determine the vocabulary in the codebook.
- **Given a word, predict which words are in its vicinity.**

Q6. What function is the generalization of the logistic function to multiple dimensions?

- Hyperbolic tangent function
- Exponential log likelihood
- Squash function
- **Softmax function**
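The relationship in Q6 can be checked numerically. In this sketch (the input scores are arbitrary), a two-class softmax over the scores `[z, 0]` gives the same probability as the logistic function applied to `z`.

```python
import math

def sigmoid(z):
    """Logistic function for a single score."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Softmax over a list of scores; subtracting the max keeps exp() stable."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

z = 1.7  # arbitrary score
two_class = softmax([z, 0.0])              # two_class[0] matches sigmoid(z)
three_class = softmax([2.0, 1.0, 0.1])     # probabilities over three classes
```

With more than two classes, softmax still returns non-negative values that sum to 1, which is what makes it usable as a multi-class output layer.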

Q7. What is the continuous bag of words (CBOW) approach?

- **Vectors for the neighborhood of words are averaged and used to predict word n.**
- Word n is used to predict the words in the neighborhood of word n.
- Word n is learned from a large corpus of words, which a human has labeled.
- The code for word n is fed through a CNN and categorized with a softmax.

Q8. What is the Skip-Gram approach?

- **Word n is used to predict the words in the neighborhood of word n.**
- The code for word n is fed through a CNN and categorized with a softmax.
- Word n is learned from a large corpus of words, which a human has labeled.
- Vectors for the neighborhood of words are averaged and used to predict word n.
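CBOW (Q7) and Skip-Gram (Q8) use the same neighborhoods but predict in opposite directions. The sketch below only builds the training examples for each approach; it does not learn any vectors. The function names, the example sentence, and the window size are assumptions for illustration.

```python
def skip_gram_pairs(words, window):
    """(center, neighbor) pairs: word n is used to predict each word near it."""
    pairs = []
    for n, center in enumerate(words):
        start = max(0, n - window)
        stop = min(len(words), n + window + 1)
        for k in range(start, stop):
            if k != n:
                pairs.append((center, words[k]))
    return pairs

def cbow_examples(words, window):
    """(neighbors, center) examples: the neighborhood is used to predict word n.
    In CBOW the neighbors' vectors would then be averaged before prediction."""
    examples = []
    for n, center in enumerate(words):
        start = max(0, n - window)
        stop = min(len(words), n + window + 1)
        neighbors = [words[k] for k in range(start, stop) if k != n]
        examples.append((neighbors, center))
    return examples

words = "the cat sat on the mat".split()
sg = skip_gram_pairs(words, window=1)
cbow = cbow_examples(words, window=1)
# For the word "sat":
#   Skip-Gram produces ("sat", "cat") and ("sat", "on")
#   CBOW produces (["cat", "on"], "sat")
```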

Q9. What is the goal of the recurrent neural network?

- Learn a series of images that form a video.
- Predict words more efficiently than Skip-Gram.
- **Synthesize a sequence of words.**
- Classify an unlabeled image.

Q10. Which model is the state-of-the-art for text synthesis?

- **Long short-term memory**
- CNN
- Multilayer perceptron
- CBOW

##### Introduction to Machine Learning Coursera Course Review:

In our experience, the Introduction to Machine Learning Coursera course is worth enrolling in: you can gain new skills from professionals completely free, and we assure you it will be worth it.

The Introduction to Machine Learning course is available on Coursera for free. If you get stuck on a quiz or graded assessment, visit Networking Funda for the Introduction to Machine Learning Coursera quiz answers.

##### Conclusion:

I hope these Introduction to Machine Learning Coursera quiz answers help you learn something new from this course. If they helped you, don’t forget to bookmark our site for more Coursera quiz answers.

This course is intended for learners of all experience levels who are interested in learning about Data Science in a business context; there are no prerequisite courses.

Keep Learning!
