## All Weeks Statistics for Data Science with Python Coursera Quiz Answers

### Statistics for Data Science with Python Week 1 Quiz Answers

#### Quiz 1: Introduction to Descriptive Statistics

Q1. What is the difference between the maximum and minimum data entries in the set?

- Mean
- Variance
- **Range**
- Mode

Q2. Find the median of the data set: 3, 8, 9, 11, 12, 15

- 9
- **10**
- 11
- 12
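
The marked answer can be checked with a minimal sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import median

data = [3, 8, 9, 11, 12, 15]

# With an even number of values, median() averages the two middle values (9 and 11).
result = median(data)
print(result)
```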

Q3. The measurement of the spread or scatter of individual values around the central point is called:

- **Measures of dispersion**
- Measures of central tendency
- Measure of skewness
- Measures of central tendency and Measures of dispersion

#### Quiz 2: Introduction and Descriptive Statistics

Q1. Which of the following is an example of time series data?

- Annual average housing price in New York
- Batting average of a baseball player
- Number of trees in Jardin du Luxembourg in Paris
- Number of dolphins in the Pacific Ocean

Q2. What is the 75th percentile of the following data set?

1, 3, 3, 4, 5, 6, 6, 7, 8, 8

- 5.5
- **7**
- 8
- 3
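
Percentile definitions vary. The sketch below assumes the nearest-rank definition, which reproduces the marked answer; interpolating definitions can give a slightly different value for the same data.

```python
import math

data = sorted([1, 3, 3, 4, 5, 6, 6, 7, 8, 8])

# Nearest-rank method: take the value at position ceil(p * n), counting from 1.
rank = math.ceil(0.75 * len(data))
p75 = data[rank - 1]
print(rank, p75)
```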

Q3. Which of the following is a measure of variability?

- Median
- Mode
- **Variance**
- Mean

Q4. Which of the following measures of central tendency will always change if a single value in the data changes?

- Mean
- Mode
- Median
- All of the above

Q5. Which of the following data sets has a mean of 10 and standard deviation of 0?

- 10, 10, 10
- 0, 10, 20
- 15, 15, 15
- 0, 0, 0
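
One way to check the options is to compute the mean and standard deviation of each candidate. This is a minimal sketch; the dictionary keys simply mirror the option labels.

```python
from statistics import mean, pstdev

options = {
    "10, 10, 10": [10, 10, 10],
    "0, 10, 20": [0, 10, 20],
    "15, 15, 15": [15, 15, 15],
    "0, 0, 0": [0, 0, 0],
}

# Keep only the data sets with mean 10 and no spread at all.
matches = [
    label
    for label, values in options.items()
    if mean(values) == 10 and pstdev(values) == 0
]
print(matches)
```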

Q6. What is meta data?

- The metabolism data in a clinical trial
- The data about metamorphism
- Data about metal fatigue
- **It’s the data about data**

Q7. Which of the following is an example of categorical data?

- Length of the river Nile
- Number of children at a kindergarten
- **Mode of travel to work**
- Number of fire hydrants in a city

Q8. Median represents a value in the data set where:

- Half of the observations are known and the other half not known
- Most observations are positive
- Most observations are negative
- Half of the observations are above the median and the other half below it

Q9. If the variance of a dataset is correctly computed with the formula using (n – 1) in the denominator, which of the following options is true?

- Data contains other variables with categorical data
- Data is from an unknown source
- Data is a population
- **Data is a sample**
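
The (n – 1) denominator is the sample variance, while dividing by n gives the population variance. A small sketch with the standard library, using an arbitrary illustrative data set:

```python
from statistics import pvariance, variance

data = [3, 8, 9, 11, 12, 15]  # illustrative values only
n = len(data)

sample_var = variance(data)       # divides the sum of squares by n - 1
population_var = pvariance(data)  # divides the sum of squares by n

print(sample_var, population_var)
```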

Q10. Which of the following is NOT a descriptive statistic?

- t-test
- Mean
- Median
- Standard Deviation

### Statistics for Data Science with Python Week 2 Quiz Answers

#### Quiz 1: Data Visualization

Q1. What’s the best way to display median and outliers?

- A scatter plot
- **A box plot**
- A bubble chart
- A time series plot

Q2. What is a suitable way to display the average basketball scores between two teams?

- **A bar chart**
- A pie chart
- A histogram
- A scatter plot

#### Quiz 2: Data Visualization

Q1. Which of the following is the suitable way to display the average income earned by men and women in a city?

- A histogram
- **A bar chart**
- A scatter plot
- A pie chart

Q2. What is a suitable way to display the relationship between two continuous variables?

- A pie chart
- A histogram
- A bar chart
- **A scatter plot**

Q3. When the sum of two or more categories equals 100, what chart type is ideally suited for displaying data?

- A line chart
- A box plot
- A histogram
- **A pie chart**

Q4. Which of the following will return a scatterplot of age and evaluation scores differentiated by gender?

- `sns.scatterplot(x='age', y='eval', hue='gender', data=ratings_df)`
- `sns.boxplot(x='credits', y='beauty', data=ratings_df)`
- `sns.distplot(ratings_df['eval'], kde=False)`
- `ratings_df.groupby('division')[['eval']].mean().reset_index()`

Q5. When multiple observations are reported for each respondent in the data set, to compute statistics for variables about the respondents, one must:

- Ignore the presence of duplicates and compute statistics as usual
- Weight data by duplicates
- Remove duplicates before running analysis
- **None of the above**

### Statistics for Data Science with Python Week 3 Quiz Answers

#### Quiz 1: Introduction to Probability Distribution

Q1. Given the histograms below,

which histogram most closely depicts a normal distribution?

- **A**
- B
- C
- D

Q2. If you got a 75 on a test in a class with a mean score of 85 and a standard deviation of 5, the z-score of your test score would be

- 2
- -3
- 3
- **-2**
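
The z-score is the distance from the mean measured in standard deviations. A minimal sketch of the arithmetic, with illustrative variable names:

```python
score = 75
class_mean = 85
class_sd = 5

# z = (x - mean) / standard deviation
z = (score - class_mean) / class_sd
print(z)
```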

Q3. The spread of the normal curve depends upon the value of:

- **Standard Deviation**
- Median
- Mean
- 1st quartile

#### Quiz 2: Introduction to Probability Distribution

Q1. For the normal distributions below, which of the following options holds true? σ1, σ2 and σ3 represent the standard deviations for curves 1, 2 and 3 respectively.

- σ1> σ2> σ3
- σ1< σ2< σ3
- σ1= σ2= σ3
- None

Q2. A test is administered annually. The test has a mean score of 150 and a standard deviation of 20. If Chioma’s z-score is 1.50, what was her score on the test?

- **180**
- 150
- 130
- 30

Q3. If a negatively skewed distribution (i.e. skewed to the left) has a median of 50, which of the following statements are true? (Select all that apply)

- None of the above
- **Mode is greater than 50**
- **Mean is less than 50**
- Mean is greater than 50

Q4. What is the probability of getting two heads when two coins are flipped?

- **1/4**
- 1/2
- 1/8
- 1

Q5. The probability of getting a double by rolling TWO six-sided dice (with sides labeled as 1, 2, 3, 4, 5, 6) is:

- 1/36
- **1/6**
- 2/36
- 1
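
The probability of a double can be confirmed by enumerating all 36 equally likely outcomes. This is a brute-force sketch rather than the only way to reason about it:

```python
from fractions import Fraction
from itertools import product

# Every ordered pair (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

# A double means both dice show the same face.
doubles = [roll for roll in rolls if roll[0] == roll[1]]

p_double = Fraction(len(doubles), len(rolls))
print(p_double)
```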

Q6. What is the area under a conditional Cumulative Density Function?

- 2
- **0.5**
- 0
- 1

Q7. Which of the following is a possible alternative hypothesis H1 for a two-tailed test?

- µ is equal to 85
- µ is less than 85
- µ is greater than 85
- **µ is not equal to 85**

Q8. Green sea turtles have normally distributed weights, measured in kilograms, with a mean of 134.5 and a variance of 49.0. A particular green sea turtle’s weight has a z-score of -2.4. What is the weight of this green sea turtle? Round to the nearest whole number.

- **118kg**
- 151kg
- 252kg
- 17kg
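
Note that the question gives the variance, so the standard deviation is its square root before the z-score is converted back to a weight. A minimal sketch:

```python
import math

mean_kg = 134.5
variance_kg2 = 49.0
z = -2.4

sd_kg = math.sqrt(variance_kg2)  # standard deviation is the square root of the variance

# Invert z = (x - mean) / sd to recover x.
weight = mean_kg + z * sd_kg
print(round(weight))
```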

Q9. A normal distribution can best be described as which of the following? (Select all that apply)

- Bell-shaped
- Skewed
- Uniform
- Symmetric

Q10. In its **standardized** form, the normal distribution

- has an area equal to 0.5.
- has a mean of 1 and a variance of 0.
- has a mean of 0 and a standard deviation of 1.
- cannot be used to approximate discrete probability distributions.

### Statistics for Data Science with Python Week 4 Quiz Answers

#### Quiz 1: Hypothesis Testing

Q1. The weekly earnings of bus drivers are normally distributed with a mean of $395. If only 0.84% of the bus drivers have a weekly income of more than $429.35, the standard deviation of the weekly earnings of the bus drivers is approximately

- **14.37**
- 17
- 34.83
- 2.39
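
A sketch of the calculation using `statistics.NormalDist` from the standard library: find the z-value that leaves 0.84% in the upper tail, then solve z = (x - mean) / sd for the standard deviation.

```python
from statistics import NormalDist

mean_income = 395
cutoff = 429.35
upper_tail = 0.0084  # 0.84% of drivers earn more than the cutoff

# z-value with 0.84% of the standard normal distribution above it (about 2.39).
z = NormalDist().inv_cdf(1 - upper_tail)

sd = (cutoff - mean_income) / z
print(round(z, 2), round(sd, 2))
```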

Q2. Assume the following samples follow a normal distribution with equal variance. We would like to know whether there is a difference between the two sample means. If we perform a two-sample t-test for independent samples, what is the p-value for the test statistic?

Sample1 = 9, 11, 10, 11, 10, 12, 9, 11, 12, 9, 10

Sample2 = 10, 13, 10, 13, 12, 9, 11, 12, 12, 12, 13

- **0.0384**
- 2.21
- 0.0885
- 0.975
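
The pooled-variance t statistic can be sketched with the standard library alone. The p-value itself requires a t distribution with 20 degrees of freedom, which the standard library does not provide; in practice a call such as `scipy.stats.ttest_ind(sample1, sample2)` would return both values, but that call is not run here. The option 2.21 appears to be the magnitude of the test statistic rather than a p-value.

```python
import math
from statistics import mean, variance

sample1 = [9, 11, 10, 11, 10, 12, 9, 11, 12, 9, 10]
sample2 = [10, 13, 10, 13, 12, 9, 11, 12, 12, 12, 13]
n1, n2 = len(sample1), len(sample2)

# Pooled variance, assuming both groups share the same population variance.
dof = n1 + n2 - 2
pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / dof

# Standard error of the difference in means, then the t statistic.
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(sample1) - mean(sample2)) / std_err
print(round(t_stat, 3), dof)
```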

Q3. What test is used to test the equality of variance?

- t-test
- **Levene’s test**
- ANOVA
- z-test

#### Quiz 2: Hypothesis Testing

Q1. Using the teacher’s rating data, is there an association between native (native English speakers) and the number of credits taught? What test will you use?

- Z-test
- ANOVA
- T-test
- Chi-Square Test for Association

Q2. If I wanted to test for association using a chi-square test, whether there is an association between gender (Male or Female) and tenure-ship (tenured or not tenured), what will be my degrees of freedom?

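For a chi-square test of association, the degrees of freedom depend only on the shape of the contingency table: (rows - 1) × (columns - 1). A minimal sketch, where `chi_square_dof` is a hypothetical helper name:

```python
def chi_square_dof(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for a chi-square test on an n_rows x n_cols table."""
    return (n_rows - 1) * (n_cols - 1)

# Gender has 2 levels and tenure status has 2 levels, giving a 2 x 2 table.
dof = chi_square_dof(2, 2)
print(dof)
```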

Q3. Consider a normally distributed data set with mean μ = 63.18 inches and standard deviation σ= 13.27 inches. What is the z-score when x = 91.54 inches? (To 3 decimal places)

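The z-score for Q3 can be worked out with `NormalDist.zscore` (available in Python 3.9 and later), which computes (x - μ) / σ:

```python
from statistics import NormalDist

heights = NormalDist(mu=63.18, sigma=13.27)

# zscore(x) returns (x - mu) / sigma.
z = heights.zscore(91.54)
print(round(z, 3))
```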

Q4. Battery life of smartphones is of great concern to customers. A consumer group tested four brands of smartphones to determine the battery life. Samples of phones of each brand were fully charged and left to run until the battery died. The table above displays the number of hours each of the batteries lasted. What test will we be using to test the difference in means?

- Chi-square Test
- Pearson Correlation Test
- T-test
- ANOVA

Q5. A room in a laboratory is only considered safe if the mean radiation level is 400 or less. When a sample of 10 radiation measurements was taken, the mean value of the radiation was 414 with a standard deviation of 17. There are concerns that mean radiation is above 414. Radiation levels in the lab are known to follow a normal distribution with standard deviation 22. We would like to conduct a hypothesis test at the 5% level of significance to determine whether there is evidence that the laboratory is unsafe.

What will be the appropriate test?

- z-test
- t-test
- ANOVA
- Chi-square
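
Because the population standard deviation (22) is stated as known, a one-sample z-test is the usual choice here. The sketch below shows what that test would look like against the safety threshold of 400; the variable names are illustrative.

```python
import math
from statistics import NormalDist

threshold = 400      # safe mean radiation level under the null hypothesis
sample_mean = 414
population_sd = 22   # known population standard deviation
n = 10

# z statistic for a one-sample test with known population standard deviation.
z = (sample_mean - threshold) / (population_sd / math.sqrt(n))

# Right-tailed p-value: probability of a z at least this large under H0.
p_value = 1 - NormalDist().cdf(z)
print(round(z, 3), round(p_value, 4))
```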

Q6. The mineral content of a particular brand of supplement pills is normally distributed with mean 490 mg and variance of 400. What is the probability that a randomly selected pill contains at least 500 mg of minerals?

- 0.3085
- 0.2023
- 0.0525
- 0.7967
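
A sketch of the upper-tail calculation. The variance of 400 corresponds to a standard deviation of 20, so 500 mg sits half a standard deviation above the mean.

```python
import math
from statistics import NormalDist

sd = math.sqrt(400)  # variance 400 means a standard deviation of 20
pills = NormalDist(mu=490, sigma=sd)

# P(X >= 500) is the area to the right of 500.
p_at_least_500 = 1 - pills.cdf(500)
print(round(p_at_least_500, 4))
```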

Q7. The P-value for a normally distributed right-tailed test is P=0.042. Which of the following is **INCORRECT**?

- The z-score test statistic is approximately z=1.73
- We will reject H0 at α=0.05, but not at α=0.01
- The P-value for a two-tailed test based on the same sample would be P=0.084
- The P-value for a left-tailed test based on the same sample would be P= -0.042
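
Each statement can be checked against the standard normal distribution. The sketch below assumes the same test statistic is reused for the other tails; note that a p-value is a probability, so it can never be negative.

```python
from statistics import NormalDist

p_right = 0.042

# z statistic that leaves 4.2% of the standard normal distribution to its right.
z = NormalDist().inv_cdf(1 - p_right)

p_two_tailed = 2 * p_right   # both tails, by symmetry
p_left = 1 - p_right         # area to the left of the same z

print(round(z, 2), p_two_tailed, p_left)
```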

Q8. The time X taken by a cashier in a grocery store express lane to complete a transaction follows a normal distribution with mean 90 seconds and standard deviation 20 seconds. What is the first quartile of the distribution of X (in seconds)?

- **76.6**
- 88.0
- 73.8
- 81.2
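
The first quartile is the value with 25% of the distribution below it. A minimal sketch using the inverse CDF gives about 76.5 seconds; the closest listed option is 76.6, which is what a z-table value of -0.67 produces.

```python
from statistics import NormalDist

transaction_time = NormalDist(mu=90, sigma=20)

# First quartile: the 25th percentile of the distribution.
q1 = transaction_time.inv_cdf(0.25)
print(round(q1, 2))
```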

Q9. A man accused of committing a crime is taking a polygraph (lie detector) test. The polygraph is essentially testing the hypotheses

H0: The man is telling the truth vs. Ha: The man is not telling the truth.

Suppose we use a 5% level of significance. Based on the man’s responses to the questions asked, the polygraph determines a P-value of 0.08. We conclude that:

- The probability that the man is not telling the truth is 0.08.
- We reject the null hypothesis as there is sufficient evidence that the man is telling the truth.
- The probability that the man is telling the truth is 0.08.
- **We fail to reject the null hypothesis as there is insufficient evidence that the man is not telling the truth.**

Q10. The average hourly wage at a fast-food restaurant is $5.85 with a standard deviation of $0.35. Assume that the wages are normally distributed. The probability that a selected worker earns more than $6.90 is

- 0
- 0.4987
- **0.0013**
- 0.9987
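
A wage of $6.90 is three standard deviations above the mean, so the probability of earning more is the small upper-tail area, about 0.0013. The value 0.9987 is the complementary probability of earning less. A minimal sketch:

```python
from statistics import NormalDist

wages = NormalDist(mu=5.85, sigma=0.35)

p_below = wages.cdf(6.90)  # probability of earning less than $6.90
p_above = 1 - p_below      # probability of earning more than $6.90
print(round(p_below, 4), round(p_above, 4))
```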

### Statistics for Data Science with Python Week 5 Quiz Answers

#### Quiz 1: Regression Analysis

Q1. We run a regression analysis between two continuous variables: the amount of food eaten and the amount of calories burnt. If the coefficient for the amount of food eaten is -0.33 and the R-squared value is 0.81, what is the correlation coefficient?

- 0.66
- **-0.9**
- -0.66
- 0.9
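
In simple linear regression with one predictor, the correlation coefficient is the square root of R-squared with the sign of the slope. A minimal sketch, where `correlation_from_simple_regression` is a hypothetical helper name:

```python
import math

def correlation_from_simple_regression(slope: float, r_squared: float) -> float:
    """Return r for a one-predictor regression: sqrt(R^2) with the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)

r = correlation_from_simple_regression(slope=-0.33, r_squared=0.81)
print(round(r, 2))
```

With a positive slope such as 0.33 and the same R-squared, the same helper returns 0.9.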

Q2. In the simple linear regression equation, the term B0 represents the:

- explanatory variable
- estimated slope
- estimated or predicted response
- **estimated intercept**

Q3. Pearson correlation is concerned with:

- the relationship between a categorical explanatory variable and a quantitative response variable.
- the relationship between a quantitative explanatory variable and a categorical response variable
- **the relationship between two quantitative variables**
- the relationship between two categorical variables

#### Quiz 2: Regression Analysis

Q1. Does running an ANOVA give the same p-value results as running a regression analysis when testing the difference in group means?

- True
- False

Q2. Given the results of the regression analysis below, what is the correlation coefficient?

- 0.19
- 0.036
- 0.034
- 17.08

Q3. Given the results for tenure-ship vs teaching evaluation, if our null hypothesis is that there is no difference in mean evaluation scores for professors who are tenured vs professors who are not tenured. What will be the conclusion of the t-test statistics?

- P-value is less than 0.05, which means that there is a difference in mean values for professors who are tenured versus professors who are not tenured.
- P-value is less than 0.05, we will fail to reject the null hypothesis.
- There is no conclusive evidence in the results above.

Q4. We run a regression analysis in place of a t-test to test if there is a difference in the number of students enrolled in classes with professors who are native English speakers (English_speakers = 1) vs professors who are not (English_speakers = 0). The table is shown below. What does the coefficient for English_speakers mean?

- Professors who are English speakers get about 27 more students enrolled on average
- **Professors who are English speakers get about 30 more students enrolled on average**
- Professors who are English speakers get about 27 fewer students enrolled on average
- We can’t conclude because the error is too large and if factored in could change the conclusion of the results

Q5. Which of these are correct about correlation coefficient? (Select all that apply)

- **The correlation coefficient (r) ranges from -1 to 1**
- A correlation coefficient of -0.9 indicates a weak linear relationship
- **A correlation coefficient of -0.9 indicates a strong linear relationship**
- The correlation coefficient (r) ranges from 0 to 1

Q6. Which of these options is most likely to be the null hypothesis for testing correlation between two variables?

- There is a partial association between an instructor’s looks and teaching evaluation score.
- There is an association between an instructor’s looks and teaching evaluation score.
- **There is no association between an instructor’s looks and teaching evaluation score.**

Q7. We run a regression analysis between two continuous variables: the amount of time spent running on a treadmill and the amount of calories burnt. If the coefficient for the amount of time spent running is 0.33 and the R-squared value is 0.81, what is the correlation coefficient?

- 0.66
- 0.81
- 0.9
- 0.77

Q8. Which of the following best explains a scatter plot?

- A two-dimensional graph of data values.
- A two-dimensional graph of a straight line.
- A two-dimensional graph of a curved line.
- A one-dimensional graph of randomly scattered data.

### Statistics for Data Science with Python Week 6 Quiz Answers

#### Quiz 1: Opt-in to receive your badge!

Q1. Learners who complete all courses of this Specialization/Professional Certificate are eligible to earn a digital credential from Credly and IBM.

Would you like to receive a digital credential to recognize the skills you learned in this Specialization/Professional Certificate?

- **Yes, I would like to receive a badge upon completion of this Specialization/Professional Certificate. By selecting yes, I authorize Coursera to share my name and my email with Credly for the purpose of badge administration only.**
- No, I would not like to receive a badge and do not authorize Coursera to share my personal contact information with Credly.
