Statistics for Data Science with Python Coursera Quiz Answers

All Weeks Statistics for Data Science with Python Coursera Quiz Answers

Statistics for Data Science with Python Week 1 Quiz Answers

Quiz 1: Introduction to Descriptive Statistics

Q1. What is the difference between the maximum and minimum data entries in the set?

Mean
Variance
Range
Mode

Q2. Find the median of the data set: 3, 8, 9, 11, 12, 15
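
As a minimal sketch of how the median in Q2 can be checked, the snippet below uses Python's standard `statistics` module. The variable names are illustrative; with an even number of values, the median is the mean of the two middle values.

```python
import statistics

# Data set from Q2; it has an even number of entries,
# so the median is the mean of the two middle values (9 and 11).
data = [3, 8, 9, 11, 12, 15]

median_value = statistics.median(data)
print(median_value)
```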

Q3. The measurements of spread or scatter of the individual values around the central point is called:

Measures of dispersion
Measures of central tendency
Measure of skewness
Measures of central tendency and Measures of dispersion

Quiz 2: Introduction and Descriptive Statistics

Q1. Which of the following is an example of time series data?

Annual average housing price in New York
Batting average of a baseball player
Number of trees in Jardin du Luxembourg in Paris
Number of dolphins in the Pacific Ocean

Q2. What is the 75th percentile of the following data set?

1, 3, 3, 4, 5, 6, 6, 7, 8, 8
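
Percentiles can be computed under several interpolation conventions, and the result for this data set depends on which one is used. The sketch below, which assumes only Python's standard `statistics` module, shows two common conventions side by side. Which one the course expects is not stated in the question.

```python
import statistics

data = [1, 3, 3, 4, 5, 6, 6, 7, 8, 8]

# quantiles(..., n=4) returns the three quartile cut points;
# the last one (index 2) is the 75th percentile.
q3_inclusive = statistics.quantiles(data, n=4, method="inclusive")[2]
q3_exclusive = statistics.quantiles(data, n=4, method="exclusive")[2]

print(q3_inclusive, q3_exclusive)
```

Both conventions land close to 7, so a quiz that expects a whole number is probably rounding one of these.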

Q3. Which of the following is a measure of variability?

Median
Mode
Variance
Mean

Q4. Which of the following measures of central tendency will always change if a single value in the data changes?

Mean
Mode
Median
All of the above

Q5. Which of the following data sets has a mean of 10 and standard deviation of 0?

10, 10, 10
0, 10, 20
15, 15, 15
0, 0, 0
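
One way to reason about Q5 is to compute the mean and standard deviation of each candidate directly. The following is a small sketch using the standard library; a standard deviation of 0 means every value equals the mean.

```python
import statistics

# The four candidate data sets from Q5.
candidates = {
    "10, 10, 10": [10, 10, 10],
    "0, 10, 20": [0, 10, 20],
    "15, 15, 15": [15, 15, 15],
    "0, 0, 0": [0, 0, 0],
}

# Map each candidate to (mean, population standard deviation).
summary = {
    label: (statistics.mean(values), statistics.pstdev(values))
    for label, values in candidates.items()
}

for label, (mean, sd) in summary.items():
    print(f"{label}: mean={mean}, sd={sd}")
```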

Q6. What is metadata?

The metabolism data in a clinical trial
The data about metamorphism
Data about metal fatigue
It’s the data about data

Q7. Which of the following is an example of categorical data?

Length of the river Nile
Number of children at a kindergarten
Mode of travel to work
Number of fire hydrants in a city

Q8. Median represents a value in the data set where:

Half of the observations are known and the other half not known
Most observations are positive
Most observations are negative
Half of the observations are above the median and the other half below it

Q9. If the variance of a dataset is correctly computed with the formula using (n – 1) in the denominator, which of the following options is true?

Data contains other variables with categorical data
Data is from an unknown source
Data is a population
Data is a sample
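
The difference between the two denominators in Q9 can be illustrated with a short sketch. The data values below are hypothetical and chosen only for easy arithmetic; `statistics.variance` divides by n - 1 (sample) while `statistics.pvariance` divides by n (population).

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, not from the quiz
n = len(values)
mean = sum(values) / n
sum_sq_dev = sum((x - mean) ** 2 for x in values)

sample_variance = sum_sq_dev / (n - 1)   # treats the data as a sample
population_variance = sum_sq_dev / n     # treats the data as the full population

# The standard library functions follow the same two conventions.
assert math.isclose(sample_variance, statistics.variance(values))
assert math.isclose(population_variance, statistics.pvariance(values))

print(sample_variance, population_variance)
```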

Q10. Which of the following is NOT a descriptive statistic?

t-test
Mean
Median
Standard Deviation

Statistics for Data Science with Python Week 2 Quiz Answers

Quiz 1: Data Visualization

Q1. What’s the best way to display median and outliers?

A scatter plot
A box plot
A bubble chart
A time series plot

Q2. What is a suitable way to display the average basketball scores between two teams?

A bar chart
A pie chart
A histogram
A scatter plot

Quiz 2: Data Visualization

Q1. Which of the following is the suitable way to display the average income earned by men and women in a city?

A histogram
A bar chart
A scatter plot
A pie chart

Q2. What is a suitable way to display the relationship between two continuous variables?

A pie chart
A histogram
A bar chart
A scatter plot

Q3. When the sum of two or more categories equals 100, what chart type is ideally suited for displaying data?

A line chart
A box plot
A histogram
A pie chart

Q4. Which of the following will return a scatterplot of age and evaluation scores differentiated by gender?

sns.scatterplot(x='age', y='eval', hue='gender', data=ratings_df)
sns.boxplot(x='credits', y='beauty', data=ratings_df)
sns.distplot(ratings_df['eval'], kde=False)
ratings_df.groupby('division')[['eval']].mean().reset_index()

Q5. When multiple observations are reported for each respondent in the data set, to compute statistics for variables about the respondents, one must:

Ignore the presence of duplicates and compute statistics as usual
Weight data by duplicates
Remove duplicates before running analysis
None of the above

Statistics for Data Science with Python Week 3 Quiz Answers

Quiz 1: Introduction to Probability Distribution

Q1. Given the histograms below, which histogram most closely depicts a normal distribution?

[Figure: candidate histograms]

Q2. If you got a 75 on a test in a class with a mean score of 85 and a standard deviation of 5, the z-score of your test score would be

2
-3
3
-2
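
The z-score in Q2 follows from the definition z = (x - mean) / standard deviation. A minimal sketch, with a hypothetical helper name:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Test score of 75 in a class with mean 85 and standard deviation 5.
z = z_score(75, mean=85, sd=5)
print(z)
```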

Q3. The spread of the normal curve depends upon the value of:

Standard Deviation
Median
Mean
1st quartile

Quiz 2: Introduction to Probability Distribution

Q1. For the normal distributions below, which of the following options holds true? σ1, σ2 and σ3 represent the standard deviations for curves 1, 2 and 3 respectively.

[Figure: three normal curves labeled 1, 2 and 3]

σ1> σ2> σ3
σ1< σ2< σ3
σ1= σ2= σ3
None

Q2. A test is administered annually. The test has a mean score of 150 and a standard deviation of 20. If Chioma’s z-score is 1.50, what was her score on the test?

180
150
130
30

Q3. If a negatively skewed distribution (i.e. skewed to the left) has a median of 50, which of the following statements are true? (Select all that apply)

None of the above
Mode is greater than 50
Mean is less than 50
Mean is greater than 50

Q4. What is the probability of getting two heads when two coins are flipped?

1/4
1/2
1/8
1

Q5. The probability of getting a double by rolling TWO six-sided dice (with sides labeled as 1, 2, 3, 4, 5, 6) is:

1/36
1/6
2/36
1
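
For small sample spaces like the one in Q5, a probability can be checked by enumerating every equally likely outcome. The sketch below does this for two six-sided dice using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# A "double" means both dice show the same face.
doubles = [roll for roll in outcomes if roll[0] == roll[1]]

p_double = Fraction(len(doubles), len(outcomes))
print(p_double)
```

The same enumeration approach applies to the two-coin question in Q4 by replacing the die faces with "H" and "T".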

Q6. What is the area under a conditional Cumulative Density Function?

2
0.5
0
1

Q7. Which of the following is a possible alternative hypothesis H1 for a two-tailed test?

µ is equal to 85
µ is less than 85
µ is greater than 85
µ is not equal to 85

Q8. Green sea turtles have normally distributed weights, measured in kilograms, with a mean of 134.5 and a variance of 49.0. A particular green sea turtle’s weight has a z-score of -2.4. What is the weight of this green sea turtle? Round to the nearest whole number.

118kg
151kg
252kg
17kg
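
Q8 gives the variance rather than the standard deviation, so the z-score formula has to be inverted after taking a square root: x = mean + z · sd. A short sketch of that arithmetic, with illustrative variable names:

```python
import math

mean_kg = 134.5
variance = 49.0
z = -2.4

# The z-score is defined in terms of the standard deviation,
# so convert the variance first.
sd_kg = math.sqrt(variance)

# Invert z = (x - mean) / sd to recover the raw weight.
weight_kg = mean_kg + z * sd_kg
print(round(weight_kg))
```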

Q9. A normal distribution can best be described as which of the following? (Select all that apply)

Bell-shaped
Skewed
Uniform
Symmetric

Q10. In its standardized form, the normal distribution

has an area equal to 0.5.
has a mean of 1 and a variance of 0.
has a mean of 0 and a standard deviation of 1.
cannot be used to approximate discrete probability distributions.

Statistics for Data Science with Python Week 4 Quiz Answers

Quiz 1: Hypothesis Testing

Q1. The weekly earnings of bus drivers are normally distributed with a mean of $395. If only 0.84% of the bus drivers have a weekly income of more than $429.35, the standard deviation of the weekly earnings of the bus drivers is approximately

14.37
17
34.83
2.39
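
Q1 can be approached by finding the z-score that leaves 0.84% in the upper tail and then solving z = (x - mean) / sd for the standard deviation. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mean = 395.0
cutoff = 429.35
upper_tail = 0.0084  # 0.84% of drivers earn more than the cutoff

# z-score with 0.84% of the standard normal distribution above it.
z = NormalDist().inv_cdf(1 - upper_tail)

# Solve z = (cutoff - mean) / sd for sd.
sd = (cutoff - mean) / z
print(round(z, 2), round(sd, 2))
```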

Q2. Assume the following samples are normally distributed with equal variances. We would like to know whether the two sample means differ. If we perform a two-sample t-test for independent samples, what is the p-value for the test statistic?

Sample1 = 9, 11, 10, 11, 10, 12, 9, 11, 12, 9, 10

Sample2 = 10, 13, 10, 13, 12, 9, 11, 12, 12, 12, 13

0.0384
2.21
0.0885
0.975
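
A two-sample t-test with pooled variance, as described in Q2, can be run with SciPy. This is a sketch that assumes SciPy is installed; `equal_var=True` matches the equal-variance assumption stated in the question.

```python
from scipy import stats

sample1 = [9, 11, 10, 11, 10, 12, 9, 11, 12, 9, 10]
sample2 = [10, 13, 10, 13, 12, 9, 11, 12, 12, 12, 13]

# Independent two-sample t-test with pooled (equal) variance.
# The returned p-value is two-sided by default.
t_stat, p_value = stats.ttest_ind(sample1, sample2, equal_var=True)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

Note that one of the answer options is close to the absolute value of the t statistic rather than the p-value, so it helps to print both.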

Q3. What test is used to test the equality of variance?

t-test
Levene’s test
ANOVA
z-test

Quiz 2: Hypothesis Testing

Q1. Using the teacher’s rating data, is there an association between native (native English speakers) and the number of credits taught? What test will you use?

Z-test
ANOVA
T-test
Chi-Square Test for Association

Q2. If I wanted to use a chi-square test to check for an association between gender (Male or Female) and tenure-ship (tenured or not tenured), what would be the degrees of freedom?
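
For a chi-square test of association on a contingency table, the degrees of freedom are (rows - 1) × (columns - 1). A minimal sketch, with a hypothetical helper name, applied to the 2 × 2 table implied by the question:

```python
def chi_square_dof(n_rows, n_cols):
    """Degrees of freedom for a chi-square test on an n_rows x n_cols table."""
    return (n_rows - 1) * (n_cols - 1)

# Gender has 2 levels and tenure-ship has 2 levels.
dof = chi_square_dof(2, 2)
print(dof)
```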

Q3. Consider a normally distributed data set with mean μ = 63.18 inches and standard deviation σ= 13.27 inches. What is the z-score when x = 91.54 inches? (To 3 decimal places)

Q4. Battery life of smartphones is of great concern to customers. A consumer group tested four brands of smartphones to determine the battery life. Samples of phones of each brand were fully charged and left to run until the battery died. The table above displays the number of hours each of the batteries lasted. What test will we be using to test the difference in means?

Chi-square Test
Pearson Correlation Test
T-test
ANOVA

Q5. A room in a laboratory is only considered safe if the mean radiation level is 400 or less. When a sample of 10 radiation measurements was taken, the mean value of the radiation was 414 with a standard deviation of 17. There are concerns that mean radiation is above 414. Radiation levels in the lab are known to follow a normal distribution with standard deviation 22. We would like to conduct a hypothesis test at the 5% level of significance to determine whether there is evidence that the laboratory is unsafe.

What will be the appropriate test?

z-test
t-test
ANOVA
Chi-square

Q6. The mineral content of a particular brand of supplement pills is normally distributed with mean 490 mg and variance of 400. What is the probability that a randomly selected pill contains at least 500 mg of minerals?

0.3085
0.2023
0.0525
0.7967
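
The upper-tail probability in Q6 can be checked with `statistics.NormalDist` (Python 3.8+). Note that the question gives the variance, so the sketch takes a square root to obtain the standard deviation first.

```python
import math
from statistics import NormalDist

# Mineral content: mean 490 mg, variance 400, so sd = sqrt(400) = 20 mg.
pills = NormalDist(mu=490, sigma=math.sqrt(400))

# P(X >= 500) is the area to the right of 500.
p_at_least_500 = 1 - pills.cdf(500)
print(round(p_at_least_500, 4))
```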

Q7. The P-value for a normally distributed right-tailed test is P=0.042. Which of the following is INCORRECT?

The z-score test statistic is approximately z=1.73
We will reject H0 at α=0.05, but not at α=0.01
The P-value for a two-tailed test based on the same sample would be P=0.084
The P-value for a left-tailed test based on the same sample would be P= -0.042
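
The statements in Q7 can be checked numerically. The sketch below assumes a z-test and uses `statistics.NormalDist`; it recovers the test statistic from the right-tailed P-value and derives the two-tailed and left-tailed P-values from it. Any P-value is a probability, so it must lie between 0 and 1.

```python
from statistics import NormalDist

p_right = 0.042
std_normal = NormalDist()

# Test statistic that leaves an area of 0.042 in the right tail.
z = std_normal.inv_cdf(1 - p_right)

# Two-tailed P-value doubles the single-tail area.
p_two_tailed = 2 * p_right

# Left-tailed P-value is the area to the left of the same statistic.
p_left = std_normal.cdf(z)

print(round(z, 2), p_two_tailed, round(p_left, 3))
```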

Q8. The time X taken by a cashier in a grocery store express lane to complete a transaction follows a normal distribution with mean 90 seconds and standard deviation 20 seconds. What is the first quartile of the distribution of X (in seconds)?

76.6
88.0
73.8
81.2
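
The first quartile is the value below which 25% of the distribution lies, so it can be read off the inverse CDF. A sketch using `statistics.NormalDist`:

```python
from statistics import NormalDist

# Transaction time: mean 90 seconds, standard deviation 20 seconds.
transaction_time = NormalDist(mu=90, sigma=20)

# The first quartile is the 25th percentile.
first_quartile = transaction_time.inv_cdf(0.25)
print(round(first_quartile, 2))
```

A printed z-table rounds the 25th-percentile z-score to -0.67, which gives 90 - 0.67 × 20 = 76.6, slightly different from the unrounded result.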

Q9. A man accused of committing a crime is taking a polygraph (lie detector) test. The polygraph is essentially testing the hypotheses

H0: The man is telling the truth vs. Ha: The man is not telling the truth.

Suppose we use a 5% level of significance. Based on the man’s responses to the questions asked, the polygraph determines a P-value of 0.08. We conclude that:

The probability that the man is not telling the truth is 0.08.
We reject the null hypothesis as there is sufficient evidence that the man is telling the truth.
The probability that the man is telling the truth is 0.08.
We fail to reject the null hypothesis as there is insufficient evidence that the man is not telling the truth.

Q10. The average hourly wage at a fast-food restaurant is $5.85 with a standard deviation of $0.35. Assume that the wages are normally distributed. The probability that a selected worker earns more than $6.90 is

0
0.4987
0.0013
0.9987

Statistics for Data Science with Python Week 5 Quiz Answers

Quiz 1: Regression analysis

Q1. We run a regression analysis between two continuous variables: the amount of food eaten vs. the amount of calories burnt. If we get a coefficient of -0.33 for the amount of food eaten and an R-squared value of 0.81, what is the correlation coefficient?

0.66
-0.9
-0.66
0.9
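
In simple linear regression with a single predictor, R-squared is the square of the Pearson correlation coefficient, and the correlation has the same sign as the slope. The sketch below applies that relationship; the helper name is hypothetical, and the shortcut does not carry over to regressions with several predictors.

```python
import math

def correlation_from_r_squared(r_squared, slope):
    """Pearson r for a one-predictor regression: sqrt(R^2) with the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)

# Values from Q1: slope of -0.33 and R-squared of 0.81.
r = correlation_from_r_squared(0.81, slope=-0.33)
print(round(r, 2))
```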

Q2. In the simple linear regression equation, the term B0 represents the:

explanatory variable
estimated slope
estimated or predicted response
estimated intercept

Q3. Pearson correlation is concerned with:

the relationship between a categorical explanatory variable and a quantitative response variable.
the relationship between a quantitative explanatory variable and a categorical response variable
the relationship between two quantitative variables
the relationship between two categorical variables

Quiz 2: Regression Analysis

Q1. Does running an ANOVA give the same p-value results as running a regression analysis when testing the difference in group means?

True
False

Q2. Given the results of the regression analysis below, what is the correlation coefficient?

0.19
0.036
0.034
17.08

Q3. Given the results for tenure-ship vs. teaching evaluation, suppose our null hypothesis is that there is no difference in mean evaluation scores between professors who are tenured and professors who are not. What is the conclusion of the t-test?

P-value is less than 0.05, which means that there is a difference in mean values for professors who are tenured versus professors who are not tenured.
P-value is less than 0.05, we will fail to reject the null hypothesis.
There is no conclusive evidence in the results above.

Q4. We run a regression analysis in place of a t-test to test whether there is a difference in the number of students enrolled in classes with professors who are native English speakers (English_speakers = 1) vs. professors who are not (English_speakers = 0). The table is shown below. What does the coefficient for English_speakers mean?

Professors who are English speakers get about 27 more students enrolled on average
Professors who are English speakers get about 30 more students enrolled on average
Professors who are English speakers get about 27 fewer students enrolled on average
We can’t conclude because the error is too large and if factored in could change the conclusion of the results

Q5. Which of these are correct about correlation coefficient? (Select all that apply)

The correlation coefficient (r) ranges from -1 to 1
A correlation coefficient of -0.9 indicates a weak linear relationship
A correlation coefficient of -0.9 indicates a strong linear relationship
The correlation coefficient (r) ranges from 0 to 1

Q6. Which of these options is most likely to be the null hypothesis for testing correlation between two variables?

There is a partial association between an instructor’s looks and teaching evaluation score.
There is an association between an instructor’s looks and teaching evaluation score.
There is no association between an instructor’s looks and teaching evaluation score.

Q7. We run a regression analysis between two continuous variables: the amount of time spent running on a treadmill vs. the amount of calories burnt. If we get a coefficient of 0.33 for the amount of time running on the treadmill and an R-squared value of 0.81, what is the correlation coefficient?

0.66
0.81
0.9
0.77

Q8. Which of the following best explains a scatter plot?

A two-dimensional graph of data values.
A two-dimensional graph of a straight line.
A two-dimensional graph of a curved line.
A one-dimensional graph of randomly scattered data.

Statistics for Data Science with Python Week 6 Quiz Answers

Quiz 1: Opt-in to receive your badge!

Q1. Learners who complete all courses of this Specialization/Professional Certificate are eligible to earn a digital credential from Credly and IBM.

Would you like to receive a digital credential to recognize the skills you learned in this Specialization/Professional Certificate?

Yes, I would like to receive a badge upon completion of this Specialization/Professional Certificate. By selecting yes, I authorize Coursera to share my name and my email with Credly for the purpose of badge administration only.
No, I would not like to receive a badge and do not authorize Coursera to share my personal contact information with Credly.
