## About the Process Mining: Data Science in Action Course

Process mining is the missing link between model-based process analysis and data-oriented analysis techniques. Through concrete data sets and easy-to-use software, the course provides data science knowledge that can be applied directly to analyze and improve processes in a variety of domains.

This course starts with an overview of approaches and technologies that use event data to support decision making and business process (re)design. Then the course focuses on process mining as a bridge between data mining and business process modeling. The course is at an introductory level with various practical assignments.

## Process Mining: Data Science in Action Coursera Quiz Answers

Q1. The four V’s of big data are Volume, Velocity, Variety and Veracity.

Which of these four V’s is applicable when we talk about the problem that you cannot be sure that the data is fully accurate?

- Volume
- **Veracity**
- Velocity
- Variety

Q2. When we talk about replay, we mean the process where…

- **we start from both a process model and a collection of observed behavior, e.g. traces, and compare these.**
- we start from a process model and generate behavior, e.g. traces.
- we start from event data and generate a process model, e.g. a Petri net.

Q3. We would like to learn the influence of someone’s weight and drinking behavior on their smoking behavior. What are the response and predictor variables?

- Variable weight is the response variable and drinker and smoker are the predictor variables.
- Variable drinker is the response variable and weight and smoker are the predictor variables.
- **Variable smoker is the response variable and drinker and weight are the predictor variables.**
- Variables drinker and weight are the response variables and smoker is the predictor variable.
- Variables drinker and smoker are the response variables and weight is the predictor variable.
- Variables smoker and weight are the response variables and drinker is the predictor variable.

Q4. There are two types of learning: supervised and unsupervised. Which of the following statements are true for **unsupervised** learning?

- The goal is to explain a response variable in terms of the predictor variables.
- **An example is the detection of patterns in the data.**
- The data is labeled such that for each element its class is known.
- **An example is to cluster similar data together.**
- An example is classification of data, e.g. learning a decision tree.

Q5. Consider a node in a decision tree with 100 instances of type A and 50 of type B. What is the entropy of this node?
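
The answer options for this question are not reproduced here, but the value can be worked out from the counts in the question. A minimal sketch, assuming the usual base-2 Shannon entropy over class proportions (the helper name `entropy` is ours, not from the course):

```python
import math

def entropy(counts):
    """Base-2 Shannon entropy of a node, given per-class instance counts."""
    total = sum(counts)
    # Classes with zero instances contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 100 instances of type A and 50 of type B, as stated in the question.
node_entropy = entropy([100, 50])
print(round(node_entropy, 4))  # 0.9183
```

This agrees with the starting entropy of 0.9183 quoted in the options of Q6.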

Q6. Consider the two decision trees depicted below (a tree with just one node, and a tree where this node is split based on the age attribute). Does it make sense to split the tree?

- **Yes, since the entropy of the entire tree goes from 0.9183 to 0.7453.**
- No, since the entropy of the entire tree goes from 0.9183 to 0.7453.
- Yes, since the entropy of the entire tree goes from 0.9183 to 1.1716.
- Yes, since the entropy of the entire tree goes from 0.9183 to 1.7453.

Q7. What is the formula to calculate the **support** that X implies Y given that

$N$ is the number of instances

$N_X$ is the number of instances covering X

$N_{X \land Y} = N_{X \cup Y}$ is the number of instances covering both X and Y
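
The answer options are not reproduced here. For reference, the support of a rule "X implies Y" is commonly defined as $N_{X \land Y} / N$, and its confidence as $N_{X \land Y} / N_X$. A small sketch of both, assuming each instance is a set of items; the data and the function names below are made up for illustration:

```python
def support(transactions, x, y):
    """Fraction of all instances that cover both X and Y: N_{X and Y} / N."""
    n = len(transactions)
    n_xy = sum(1 for t in transactions if x <= t and y <= t)
    return n_xy / n

def confidence(transactions, x, y):
    """Fraction of instances covering X that also cover Y: N_{X and Y} / N_X."""
    n_x = sum(1 for t in transactions if x <= t)
    n_xy = sum(1 for t in transactions if x <= t and y <= t)
    return n_xy / n_x

# Hypothetical instances, not taken from the quiz.
transactions = [
    {"tea", "muffin"},
    {"tea"},
    {"coffee", "muffin"},
    {"tea", "muffin", "coffee"},
]
x, y = {"tea"}, {"muffin"}

print(support(transactions, x, y))     # 2 of 4 instances cover both
print(confidence(transactions, x, y))  # 2 of the 3 instances covering X
```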

Q8. Assume a data set with two variables that we would like to cluster using k-means with k=3. See the following centroids. Which one could be the end result of applying k-means?

Q9. Given the classification provided below, what is the corresponding recall?

- 0.8741
- 0.8310
- 0.9091
- **0.1690**
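
The classification table for this question is not reproduced here, so the marked option cannot be checked against it. As a reminder of the definition, recall is the share of actual positives that were classified as positive. A minimal sketch with made-up counts (the numbers below are not the quiz data):

```python
def recall(tp, fn):
    """True positives divided by all actual positives."""
    return tp / (tp + fn)

def precision(tp, fp):
    """True positives divided by all predicted positives."""
    return tp / (tp + fp)

# Hypothetical confusion-matrix counts, for illustration only.
tp, fn, fp = 40, 10, 5

print(recall(tp, fn))     # 40 / 50
print(precision(tp, fp))  # 40 / 45
```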

Q10. Please check the statements that are true for k-fold cross validation.

- **The learning algorithm can only use k-1 data sets during each of the runs to learn the model from.**
- The learning algorithm is applied k-1 times on different combinations of training and test data sets.
- **Within a run, the quality of the model learned by the algorithm is evaluated on the one data set not used for learning the model.**
- **The data set is split into k smaller data sets.**
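
The mechanics behind these statements can be sketched in a few lines. This is only an illustration of how k-fold splitting works, assuming a simple round-robin assignment of items to folds; the function name `k_fold_splits` is hypothetical:

```python
def k_fold_splits(data, k):
    """Yield k (training, test) pairs; each fold serves as the test set once."""
    # Split the data set into k smaller data sets (folds).
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Learn from the other k-1 folds only.
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, test

splits = list(k_fold_splits(list(range(10)), k=5))
for train, test in splits:
    print(len(train), len(test))
```

There are k runs in total, one per fold, and each run trains on k-1 folds and evaluates on the remaining one.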
