## All Weeks Understanding and Visualizing Data with Python Coursera Quiz Answers

### Understanding and Visualizing Data with Python Coursera Quiz Answers

#### Week 1: Understanding and Visualizing Data with Python

#### Quiz 1: Variable Types

Q1. The following five questions concern a startup company. Management wants to gain insight into its current employees and has sent them a short anonymous survey. Read each prompt carefully and determine the variable type for each question.

Employees were asked to report their typical daily commute time, in minutes. What type of variable would their response be considered?

- Categorical Nominal
- Categorical Ordinal
- **Quantitative Continuous**
- Quantitative Discrete

Q2. Employees were asked to report their typical daily mode of transportation to and from work (i.e. Car, Bike, Bus, etc.). What type of variable would their response be considered?

- **Categorical Nominal**
- Categorical Ordinal
- Quantitative Continuous
- Quantitative Discrete

Q3. The company wanted to know how employees perceived the work of upper management. Employees were asked to report their satisfaction with upper management using a 1 to 5 scale (with the following representations: 1 – Extremely Unsatisfied, 2 – Unsatisfied, 3 – Neutral, 4 – Satisfied, 5 – Extremely Satisfied). What type of variable would their response be considered?

- Categorical Nominal
- **Categorical Ordinal**
- Quantitative Continuous
- Quantitative Discrete

Q4. It was reported that Fridays were generally lighter in terms of number of meetings held. Employees were asked to report the number of scheduled meetings they attended the previous Friday. What type of variable would their response be considered?

- Categorical Nominal
- Categorical Ordinal
- Quantitative Continuous
- **Quantitative Discrete**

Q5. Management was playing around with the idea of having a food truck visit the office once a week and was trying to gauge how much employees would spend to help entice various food truck owners. Employees were asked to report the amount of money they believe they would spend on lunch (in $XX.XX) if a food truck came to the office once a week. What type of variable would their response be considered?

- Categorical Nominal
- Categorical Ordinal
- **Quantitative Continuous**
- Quantitative Discrete

#### Quiz 2: Different Data Types

Q1. The following five questions are focused around a public library. Staff members are trying to gain insight on current library card holders and have sent out a short survey to them. Read each prompt carefully and determine the variable type for each question.

Library card holders were asked whether or not they have checked out a book from the library in the past month (yes or no). What type of variable would their response be considered?

- **Categorical Nominal**
- Categorical Ordinal
- Quantitative Continuous
- Quantitative Discrete

Q2. Library card holders were asked to report the amount of late fees they have been charged in the past year (input in the form of $XX.XX). What type of variable would their response be considered?

- Categorical Nominal
- Categorical Ordinal
- **Quantitative Continuous**
- Quantitative Discrete

Q3. Library card holders were asked to reflect on their most recent book they checked out and report the genre that it most closely represented (i.e. Science Fiction, Action, Romance, Mystery, etc.). What type of variable would their response be considered?

- **Categorical Nominal**
- Categorical Ordinal
- Quantitative Continuous
- Quantitative Discrete

Q4. The library recently added a new online checkout/renewal system. Library card holders were asked how many times they have used the new online system. What type of variable would their response be considered?

- Categorical Nominal
- Categorical Ordinal
- Quantitative Continuous
- **Quantitative Discrete**

Q5. Library card holders were asked to report their satisfaction with their library experience during their last visit using a 1 to 5 scale (with the following representations: 1 – Extremely Unsatisfied, 2 – Unsatisfied, 3 – Neutral, 4 – Satisfied, 5 – Extremely Satisfied). What type of variable would their response be considered?

- Categorical Nominal
- **Categorical Ordinal**
- Quantitative Continuous
- Quantitative Discrete

#### Week 2: Understanding and Visualizing Data with Python

#### Quiz 1: Summarizing Graphs in Words

Q1. The following graphs show the distributions for number of sandwiches sold per day at delis managed by the same owner and in the same region. Match these graphs to the appropriate sentence describing them.

Which of these statements best summarizes Deli A?

- Statement 1: The number of sandwiches sold is bimodal with peaks around 300 and 400. The number of sandwiches varies from around 250 to around 450. There do not appear to be outliers.
- Statement 2: The number of sandwiches sold is slightly skewed left and unimodal, with a center around 500 and a range from around 200 to around 800. There do not appear to be outliers.
- Statement 3: The number of sandwiches sold is skewed right with a peak around 200. The number of sandwiches varies from around 125 to around 325. There do not appear to be outliers.
- Statement 4: The number of sandwiches sold has a bell-shaped distribution, with a peak around 500 and with values varying from around 460 to around 540. There do not appear to be outliers.

Q2. Which of these statements best summarizes Deli B?

- Statement 1: The number of sandwiches sold is bimodal with peaks around 300 and 400. The number of sandwiches varies from around 250 to around 450. There do not appear to be outliers.
- **Statement 2: The number of sandwiches sold is slightly skewed left and unimodal, with a center around 500 and a range from around 200 to around 800. There do not appear to be outliers.**
- Statement 3: The number of sandwiches sold is skewed right with a peak around 200. The number of sandwiches varies from around 125 to around 325. There do not appear to be outliers.
- Statement 4: The number of sandwiches sold has a bell-shaped distribution, with a peak around 500 and with values varying from around 460 to around 540. There do not appear to be outliers.

Q3. Which of these statements best summarizes Deli C?

- Statement 1: The number of sandwiches sold is bimodal with peaks around 300 and 400. The number of sandwiches varies from around 250 to around 450. There do not appear to be outliers.
- Statement 2: The number of sandwiches sold is slightly skewed left and unimodal, with a center around 500 and a range from around 200 to around 800. There do not appear to be outliers.
- **Statement 3: The number of sandwiches sold is skewed right with a peak around 200. The number of sandwiches varies from around 125 to around 325. There do not appear to be outliers.**
- Statement 4: The number of sandwiches sold has a bell-shaped distribution, with a peak around 500 and with values varying from around 460 to around 540. There do not appear to be outliers.

Q4. Which of these statements best summarizes Deli D?

Q5. Which Deli has the smallest range?

- **Deli A**
- Deli B
- Deli C
- Deli D
- Can’t tell

Q6. Which is larger for Deli C?

- **The mean**
- The median
- They are the same
- Can’t tell

Q7. The following graph shows the service time for customers at five delis. What approximate proportion of customers are waiting for more than 12 minutes for their sandwich at Deli A?

- 25%
- 30%
- **50%**
- 75%
- Can’t tell

Q8. Which deli’s service time distribution has the smallest mean?

- Deli A
- Deli B
- **Deli C**
- Deli D
- Deli E

Q9. Which deli’s service time distribution has the smallest median?

- Deli A
- Deli B
- **Deli C**
- Deli D
- Deli E
- Can’t tell

Q10. What kind of distribution does Deli A’s service time follow?

- Bell-shaped (normal)
- Skewed right
- Uniform
- **Can’t tell**

Q11. The manager is also interested in looking at the distribution of the types of sandwiches ordered before placing the next order for food supplies. Which is the distribution shape of the types of sandwiches ordered?

- **Uniform**
- Bell shaped (normal)
- Skewed right
- Skewed left
- Can’t tell

Q12. The manager is also interested in looking at the distribution of the types of sandwiches ordered before placing the next order for food supplies. Which type of sandwich is ordered the most?

- **Turkey**
- Roast Beef
- Ham
- Vegetarian
- Can’t tell

#### Quiz 2: Numerical Summaries

Q1. Yanis is training for a stair climbing competition. He’s interested in information from his training and those of his competitors. The below histogram shows the number of stair climbing competitions a random sample of 50 stair climbers entered in the past year.

What is the shape of this distribution?

- **Uniform**
- Skewed right
- Skewed left
- Bimodal
- Can’t tell

Q2. For his most recent training session, Yanis kept track of the time to complete each flight of stairs (16 steps). His distribution is shown here.

Which will be larger for the time to complete each flight of stairs: the mean or the median?

- The mean
- **The median**
- They are the same
- Can’t tell

Q3. Yanis trains by climbing stairs 4 days out of the week. These box plots show the distribution of the number of flights of stairs climbed during his workouts for the past year. Which day of the week has the largest third quartile?

- Monday
- Wednesday
- Friday
- **Sunday**
- Can’t tell

Q4. Yanis is interested in figuring out on which days his training is least consistent. Looking at the box plot of number of stairs climbed by day of the week, which day has the largest IQR?

- Monday
- Wednesday
- **Friday**
- Sunday
- Can’t tell

Q5. Yanis is able to see how many flights of stairs his competitors climb in a week. The box plot shows the distribution of the flights of stairs climbed by his competitors over the last week. Yanis climbed 1375 flights of stairs last week. Approximately what proportion of competitors climbed more flights of stairs than Yanis?

- 50%
- 30%
- 25%
- 10%
- 0%
- Can’t tell

#### Quiz 3: Univariate Analysis

Q1. Using the NHANES data and the previous notebook, the following questions concern the variable BPXSY2 (with missing values removed). Round your answer to the nearest tenth. (ex: 2.33 should be 2.3, 2.15 should be 2.2)

What is the median?

Answer: 122.0

Q2. What is the mean?

Round your answer to the nearest tenth. (ex: 2.33 should be 2.3, 2.15 should be 2.2)

Answer: 124.8

Q3. What is the standard deviation?

Round your answer to the nearest tenth. (ex: 2.33 should be 2.3, 2.15 should be 2.2)

Answer: 18.5

Q4. What is the max?

Round your answer to the nearest tenth. (ex: 2.33 should be 2.3, 2.15 should be 2.2)

Answer: 238.0

Q5. What is the Interquartile Range (IQR)?

Round your answer to the nearest tenth. (ex: 2.33 should be 2.3, 2.15 should be 2.2)

Answer: 22.0
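
The five summaries asked for in Q1 to Q5 can each be computed with a pandas Series method. The following is a minimal sketch, assuming a small made-up Series in place of the real BPXSY2 column (the values and the name `bp` are hypothetical, not NHANES data), so its results will not match the answers above.

```python
import numpy as np
import pandas as pd

# Hypothetical readings standing in for the BPXSY2 column (not NHANES data).
bp = pd.Series([110, 122, np.nan, 130, 118, 142, np.nan, 126], name="BPXSY2")

# Remove missing values before summarizing, as the quiz instructs.
clean = bp.dropna()

median = clean.median()
mean = clean.mean()
std = clean.std()  # sample standard deviation (ddof=1 by default)
maximum = clean.max()
iqr = clean.quantile(0.75) - clean.quantile(0.25)  # Q3 minus Q1

# Round each summary to the nearest tenth.
for label, value in [("median", median), ("mean", mean), ("std", std),
                     ("max", maximum), ("IQR", iqr)]:
    print(label, round(value, 1))
```

With the real data, the same calls would be applied to the BPXSY2 column of the NHANES DataFrame loaded in the course notebook.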

Q6. Which of these will return descriptive statistics for a numeric Series ‘s’?

- s.descriptive_stats()
- **s.describe()**
- Series.describe()
- describe(s)
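
As a short illustration of why the method is called on the Series instance itself, here is a sketch with a made-up Series (the values are hypothetical):

```python
import pandas as pd

# Hypothetical numeric Series used only for illustration.
s = pd.Series([1, 2, 3, 4, 5])

# describe() is an instance method: it returns a new Series of summary statistics.
summary = s.describe()
print(summary)
```

The result is indexed by `count`, `mean`, `std`, `min`, `25%`, `50%`, `75%` and `max`.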

Q7. Select all that apply: Which will produce a histogram of the numeric Series ‘s’

- **sns.distplot(a=s)**
- sns.hist(a=s).set(title="Histogram of s")
- sns.hist(a=s)
- **sns.distplot(s)**
- sns.distplot(a=s).set(title="Histogram of s")
- sns.hist(s)

Q8. How many rows of the DataFrame ‘df’ are shown with the following code:

`df.head()`

**Answer: 5**

Q9. What data is shown when the following code is run?

`df.head(2)`

- Columns 1 and 2
- Columns 0 and 1
- Rows 1 and 2
- **Rows 0 and 1**
- All rows containing the value ‘2’
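
The behavior of `head()` in Q8 and Q9 can be checked on any DataFrame. A minimal sketch, assuming a made-up ten-row DataFrame with the default integer index (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical ten-row DataFrame with the default 0-based integer index.
df = pd.DataFrame({"x": range(10), "y": list("abcdefghij")})

first_five = df.head()   # no argument: the first 5 rows
first_two = df.head(2)   # n=2: the first 2 rows, labeled 0 and 1

print(len(first_five))
print(first_two.index.tolist())
```

Note that `head(n)` selects rows, never columns, and the labels start at 0 because of the default index.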

#### Week 3: Understanding and Visualizing Data with Python

#### Quiz 1: Multivariate Data

Q1. A bicycle rental company has counted the number of bicycle rentals in each season (spring, summer, fall, winter) for the past two years.

Additionally, the company has collected weather data (temperature, wind speed and humidity).

Use the data for bicycle rentals and weather presented in the tables and graphs below to answer these practice quiz questions.

Which proportion describes the most popular season for renting bicycles in Year 1?

| Season | Year 1 | Year 2 | Total |
|---|---|---|---|
| Spring | 150,000 | 321,348 | 471,348 |
| Summer | 347,316 | 571,273 | 918,589 |
| Fall | 419,650 | 641,479 | 1,061,129 |
| Winter | 326,137 | 515,476 | 841,613 |
| Total | 1,243,103 | 2,049,576 | 3,292,679 |

- 150,000 / 1,243,103
- 641,479 / 2,049,576
- **419,650 / 1,243,103**
- 1,061,129 / 3,292,679

Q2. Which proportion describes the least popular season for renting bicycles in Year 2?

| Season | Year 1 | Year 2 | Total |
|---|---|---|---|
| Spring | 150,000 | 321,348 | 471,348 |
| Summer | 347,316 | 571,273 | 918,589 |
| Fall | 419,650 | 641,479 | 1,061,129 |
| Winter | 326,137 | 515,476 | 841,613 |
| Total | 1,243,103 | 2,049,576 | 3,292,679 |

- 321,348 / 471,348
- 2,049,576 / 3,292,679
- **321,348 / 2,049,576**
- 471,348 / 3,292,679
- 641,479 / 2,049,576

Q3. Which statement best describes the meaning of 326,137 / 841,613?

| Season | Year 1 | Year 2 | Total |
|---|---|---|---|
| Spring | 150,000 | 321,348 | 471,348 |
| Summer | 347,316 | 571,273 | 918,589 |
| Fall | 419,650 | 641,479 | 1,061,129 |
| Winter | 326,137 | 515,476 | 841,613 |
| Total | 1,243,103 | 2,049,576 | 3,292,679 |

- The proportion of Year 1 rentals that occurred in Winter.
- **The proportion of Total Winter rentals that occurred in Year 1.**
- The proportion of Total rentals that occurred in Year 1.
- The proportion of Total rentals that occurred in Winter.

Q4. How does the proportion of rides in the Summer compare between Year 1 and Year 2?

| Season | Year 1 | Year 2 | Total |
|---|---|---|---|
| Spring | 150,000 | 321,348 | 471,348 |
| Summer | 347,316 | 571,273 | 918,589 |
| Fall | 419,650 | 641,479 | 1,061,129 |
| Winter | 326,137 | 515,476 | 841,613 |
| Total | 1,243,103 | 2,049,576 | 3,292,679 |

- The proportion is higher in Year 2 because 571,273 is larger than 347,316.
- The proportion is higher in Year 1 because 2,049,576 is larger than 1,243,103.
- **Can’t tell without doing additional calculations**
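
The "additional calculations" amount to dividing each count by the appropriate total. The sketch below rebuilds the rentals table in pandas (the variable names are illustrative, not from the course) and computes both kinds of proportion used in Q1 to Q4: each season's share within a year, and each year's share within a season.

```python
import pandas as pd

# Seasonal rental counts copied from the table above.
rentals = pd.DataFrame(
    {
        "Year 1": [150_000, 347_316, 419_650, 326_137],
        "Year 2": [321_348, 571_273, 641_479, 515_476],
    },
    index=["Spring", "Summer", "Fall", "Winter"],
)

# Share of each season within a year: divide each column by its column total.
within_year = rentals / rentals.sum()

# Share of each year within a season: divide each row by its row total.
within_season = rentals.div(rentals.sum(axis=1), axis=0)

print(within_year.round(4))
print(within_season.round(4))
```

Comparing the Summer row of `within_year` across the two columns answers Q4 directly, which raw counts alone cannot do because the yearly totals differ.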

Q5. The company suspects that they will have a larger increase in rentals from registered riders compared to non-registered riders over the two years. They make this bar chart to see how the numbers compare. For which group does the increase in riders seem larger?

- They look the same
- Registered
- Non-registered
- Can’t tell from this graph

Q6. What kind of graph is this?

- Bar chart
- **Side-by-side bar chart**
- Stacked bar chart
- Mosaic plot

Q7. The bicycle company is interested in knowing how rides are affected by various weather conditions. To start with, they want to examine the registered wind speeds (after a normalization).

Is wind speed a discrete or continuous variable?

- Discrete
- **Continuous**
- Can’t tell

Q8. The company wants to consider how weather patterns affect the bicycle rentals. They first consider how the measured temperature compares to the apparent temperature, or the temperature that humans perceive it to be. The temperatures have been normalized to fall on a scale between 0 and 1. Yesterday the normalized real temperature was 0.4. Today the normalized real temperature is 0.8. Which day would you expect to have a higher apparent temperature?

- Yesterday
- **Today**
- Can’t tell

Q9. The scatterplot between temperature and apparent temperature is linear. What is the strength of the scatterplot between temperature and apparent temperature?

- **Strong**
- Moderate
- Weak
- Can’t tell

Q10. Eventually, the bicycle company wants to think about how bicycle rides vary based on weather. After looking at humidity, they think that the humidity might be associated with the general weather conditions. They consider weather situations of 1 = clear to partly cloudy, 2 = misty with no to some clouds, 3 = light rain and light snow, and 4 = heavy rain, snow, thunderstorms, and other extreme weather. Based on the side-by-side boxplots below, which weather condition has the highest mean humidity?

- 1 = Clear to partly cloudy
- 2 = Misty with no to some clouds
- 3 = Light rain and light snow
- 4 = Heavy rain, snow, thunderstorms, and other extreme weather
- **Can’t tell**

#### Quiz 3: Multivariate Analysis

Q1. Using information about the Cartwheel dataset from the previous assignments, answer the following questions.

Is the relationship between ‘Height’ and ‘Wingspan’ linear?

- **Yes**
- No

Q2. Is the relationship between ‘Wingspan’ and ‘Height’ linear for each gender?

- **Yes**
- No

Q3. Is the interquartile range of ‘CWDistance’ similar to ‘Wingspan’?

- Yes
- **No**

Q4. Looking at the barplot of ‘Glasses’ and ‘CWDistance’, which glasses condition has a (slightly) larger estimate of cartwheel distance?

- Glasses-Y
- **Glasses-N**

Q5. Looking at the barplot of ‘Glasses’ and ‘CWDistance’ split by ‘Gender’, which glasses condition has a (slightly) larger estimate of cartwheel distance?

- Glasses-Y
- Glasses-N
- **The results are different for each gender.**

#### Week 4: Understanding and Visualizing Data with Python

#### Quiz 1: Distinguishing Between Probability & Non-Probability Samples

Q1. In each of the questions in this assessment, you’ll read a description of a sample, and decide whether it is a probability or non-probability sample.

A random sample of U.S. households is selected from a population address list, and households in lower-income areas are randomly sampled at a higher rate than households in higher income areas. Both sampling rates are known. The households are then mailed a paper survey asking about employment status.

- **Probability**
- Non-Probability

Q2. The telephone surveying technique known as random digit dialing (RDD) is used to select a random sample of households from two different lists: a list of randomly generated landline telephone numbers, and a list of randomly generated mobile phone numbers. Mobile phone numbers are sampled at a higher rate than landline numbers. Both rates are known.

- **Probability**
- Non-Probability

Q3. A doctoral student in psychology wants to collect opinion information from the general campus population, but doesn’t have a large budget for her research. She decides to go out on one of the busiest campus streets, wait on the corner, and ask people walking by if they would like to answer a few brief questions. She ultimately speaks with 100 people and analyzes the data.

- Probability
- **Non-Probability**

Q4. A University survey research center selects a random sample of counties in the U.S. using probability proportionate to size, with a certain number of counties to be sampled from each of the four major regions of the United States. One hundred housing units are then selected at random within each randomly selected county, from all available housing units within each county, and one adult is selected at random within each household and invited to participate in a survey.

- **Probability**
- Non-Probability

Q5. After randomly selected adults from the randomly sampled housing units in Question 4 above have completed a survey, they are invited to join a web panel that will receive invitations to complete web surveys throughout the year.

- **Probability**
- Non-Probability

Q6. Visitors to a sports information web site click on an advertisement that says they can get paid for providing their opinions about current events.

- Probability
- **Non-Probability**

Q7. A researcher visits a homeless shelter in a nearby city, tells some individuals currently residing in the center that they can receive compensation for participating in a survey, and indicates that they should also tell everyone in their social networks about this opportunity.

- Probability
- **Non-Probability**

Q8. A psychology professor is told by a statistical consultant that she needs 100 males and 100 females to have enough statistical power to compare the two groups in terms of mean scores on a new scale that she is developing. She actively recruits student volunteers for the study, and turns away all interested volunteers after she hits her target of 100 people in each group.

- Probability
- **Non-Probability**

Q9. Facebook wishes to estimate the proportion of Facebook users living in Washington, D.C. that has tweeted about Donald Trump in the past week. They select a random sample of 100,000 posts from all of these identified users in the past week, and analyze the content of the posts.

- **Probability**
- Non-Probability

Q10. A large University wishes to collect information about incidents of sexual assault from its students. They obtain a list of all undergraduate students from the registrar’s office, and randomly select 2,500 males from all possible male undergraduates, and 2,500 females from all possible female undergraduates.

- **Probability**
- Non-Probability

#### Quiz 2: Generating Random Data and Samples

Q1. In the code block below, generate 3 normal random variables with mean 100 and standard deviation 1.

This will require about 4 lines of code. Use the functions provided in this outline.

- Import the numpy library
- Set the seed to 123 to initialize environment so random variables are replicated according to the grader. (hint: np.random.seed(?))
- Generate three random normal variables with mean 100 and standard deviation 1 and assign them to a variable named sample. (hint: sample = np.random.normal(?,?,?))
- Print the variable sample.

The question marks in the hints indicate input parameters.

Choose the answer that matches your result to three decimal places.

- 100.915 99.997 101.283
- 99.822 100.093 100.719
- **98.914 100.997 100.283**
- 99.922 100.103 100.819
- 99.914 101.937 100.282
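
The four steps in the outline can be sketched as follows. This assumes NumPy's legacy global random state (`np.random.seed`), which is what the hints point to; the newer `np.random.default_rng` generator would produce different numbers for the same seed.

```python
import numpy as np

# Seed the legacy global random state so the draw is reproducible.
np.random.seed(123)

# Three draws from a normal distribution: mean (loc) 100, standard deviation (scale) 1.
sample = np.random.normal(100, 1, 3)

# Values rounded to three decimals: 98.914, 100.997, 100.283
print(sample)
```

The three arguments to `np.random.normal` are, in order, the mean, the standard deviation, and the number of values to generate.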

Q2. Generating random samples from a population lies at the heart of statistics. In the code block below, draw a sample of size 10 from a set containing the integers 1 through 100.

This will require about 5 lines of code. Use the functions provided in this outline.

- Import the numpy library
- Set the seed to 123 to initialize environment so random variables are replicated according to the grader. (hint: np.random.seed(?))
- Create a vector called population, and put the numbers 1-100 into the population list. (hint: np.arange(?,?))
- Generate a sample with length 10 from the population and assign the output to a variable named sample. (hint: sample = np.random.choice(?, ?))
- Print the variable sample.

The question marks in the hints above indicate input parameters.

Select the answer matching your sample below.

- 9 25 68 88 80 49 11 95 53 99
- 12 14 57 79 70 72 36 25 67 9
- -0.2144699617662135 0.4160333636063626 0.02927226924712613 -0.5072293848619751 2.6014747539872567 0.17141327084834654 -0.21195901381927462 -0.37671989689029883 0.1799644167541328 -0.8515596897956541
- **67 93 99 18 84 58 87 98 97 48**
- 0.70579387 -0.69160146 1.12461493 0.36499493 0.19864388 -0.85155969 -2.88011494 -0.77227959 0.36499493 0.809468
- 110 67 93 99 103 18 84 107 58 87
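
A minimal sketch of the outlined steps, again assuming NumPy's legacy global random state as suggested by the hints:

```python
import numpy as np

# Seed the legacy global random state so the draw is reproducible.
np.random.seed(123)

# np.arange excludes its stop value, so 101 is needed to include 100.
population = np.arange(1, 101)

# Draw 10 values; np.random.choice samples with replacement by default.
sample = np.random.choice(population, 10)

print(sample)
```

Two details explain the distractor options: every sampled value must be an integer between 1 and 100, so lists containing decimals or values above 100 cannot be correct, and passing `replace=False` would be needed if repeated values were not allowed.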
