# Introduction to Statistics & Data Analysis in Public Health Quiz Answers

### Get All Weeks Introduction to Statistics & Data Analysis in Public Health Quiz Answers

Welcome to Introduction to Statistics & Data Analysis in Public Health!

This course will teach you the core building blocks of statistical analysis – types of variables, common distributions, hypothesis testing – but, more than that, it will enable you to take a data set you’ve never seen before, describe its key features, get to know its strengths and quirks, run some vital basic analyses and then formulate and test hypotheses based on means and proportions.

You’ll then have a solid grounding to move on to more sophisticated analysis and take the other courses in the series. You’ll learn the popular, flexible and completely free software R, used by statistics and machine learning practitioners everywhere. It’s hands-on, so you’ll first learn about how to phrase a testable hypothesis via examples of medical research as reported by the media.

Then you’ll work through a data set on fruit and vegetable eating habits: data that are realistically messy, because that’s what public health data sets are like in reality. There will be mini-quizzes with feedback along the way to check your understanding. The course will sharpen your ability to think critically and not take things for granted: in this age of uncontrolled algorithms and fake news, these skills are more important than ever.


### Introduction to Statistics & Data Analysis in Public Health Quiz Answers

#### Quiz 1: Parkinson’s Disease Study Issues Quiz Answers

Q1. Patient selection, i.e. how patients were chosen to take part in the study

• Issue
• Potential issue but difficult to assess
• Not an issue

Q2. Treatment allocation, i.e. how patients were chosen to get which treatment

• Issue
• Potential issues but difficult to assess
• Not an issue

Q3. Small sample size (the size of the sample of the trial)

• Issue
• Potential issues but difficult to assess
• Not an issue

Q4. Blinded treatment group (this means that people – patients and/or staff – did not know who got which treatment until after the data were analysed)

• Issue
• Potential issues but difficult to assess
• Not an issue

Q5. Length of follow-up (the length of time that patients were followed up)

• Issue
• Potential issues but difficult to assess
• Not an issue

Q6. Outcome measure (the outcome of interest in the study)

• Issue
• Potential issues but difficult to assess
• Not an issue

Q7. Effect size – clinical vs statistical significance

• Issue
• Potential issues, but difficult to assess
• Not an issue

Q8. Side-effects (side-effects of the drug)

• Issue
• Potential issues, but difficult to assess
• Not an issue

Q9. Patients withdrawing from the study (patients leaving the study)

• Issue
• Potential issues, difficult to assess
• Not an issue

Q10. Replication of the study (repeating the study)

• Issue
• Potential issues, difficult to assess
• Not an issue

#### Quiz 2: Research Question Formulation Quiz Answers

Q1. In the following articles, try to identify the research question you believe the original study set out to answer. When you’re doing this it might help if you first identify the following: population, intervention, control, outcome and timeframe.

You’re going to look again at the BBC report on Parkinson’s. You might notice the text in the article differs from that in the activity, Parkinson’s Disease Treatment Reports. The text was edited so as to hide some of the issues. At the above link you will find the unedited full report.

Using the information found in the article, what do you think the research question was?

`What do you think?`

Q2. Using the information found in the article, what do you think the research question was?

`What do you think?`

Q3. The articles above reference the same study discussed in the Mirror article in Question 2.

After reading these articles, what are your thoughts on the study’s research question now?

`What do you think?`

#### Quiz 1: Special case of age Quiz Answers

Q1. Pablo requests the birth records for every individual in his region. He is told that the data set contains everyone’s date of birth so he will be able to calculate their age in days if he wishes. What sort of data will Pablo have:

• Continuous
• Integer
• Ordinal

Q2. When Pablo receives the data set he finds that in fact the version of the data set that he has been given contains age group rather than dates of birth. Each individual has been classified as <18 years, 18-44, 45-64 and 65+ years of age. What sort of data does Pablo actually have:

• Continuous
• Binary
• Ordinal
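To make the contrast between Q1 and Q2 concrete, here is a minimal sketch in R. The dates of birth and the reference date are invented for illustration and are not Pablo's data. It shows how an age in days differs from the banded age groups listed in Q2.

```r
# Hypothetical dates of birth and reference date (assumptions for illustration)
dob <- as.Date(c("2010-06-01", "1990-01-15", "1970-03-30", "1950-12-25"))
ref <- as.Date("2024-01-01")

# Age in whole days: a count that can take a very wide range of values
age_days <- as.numeric(ref - dob)

# Approximate age in years, then the four bands from Q2 as an ordered factor
age_years <- age_days / 365.25
age_group <- cut(age_years,
                 breaks = c(0, 18, 45, 65, Inf),
                 labels = c("<18", "18-44", "45-64", "65+"),
                 right = FALSE, ordered_result = TRUE)

print(age_days)
print(age_group)
```

The banded variable keeps the ordering of the groups but loses the exact ages within each band.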

Q3. Meghan downloads the following death rate data for the population of England and Wales:

The death rate for males aged 65 or older in England and Wales is 42.11.

• True
• False

Q4. The death rate in England and Wales remains constant at 42.11 deaths per 1000 people for ages 0 to 64.

• True
• False

#### Quiz 2: Well-behaved Distributions Quiz Answers

Q1. Match this distribution to the plot:

Normal with mean 50 and standard deviation 12

Note: There are two correct answers

• a)
• b)
• c)
• d)
• e)
• f)

Q2. Match this distribution to the plot:

Poisson with mean 4

Note: There are two correct answers

• a)
• b)
• c)
• d)
• e)

Q3. Match this distribution to the plot:

Normal with mean 50 and standard deviation 4

Note: There are two correct answers

• a)
• b)
• c)
• d)
• e)

Q4. Which of the following plots shows the distribution with the biggest standard deviation?

• a)
• b)

Q5. What proportion of the data lies in the shaded area on the plot below?

• 68%
• 95%
• 50%

Q6. What proportion of the data lies in the shaded area on the plot below?

• 68%
• 95%
• 50%
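The plots for Q5 and Q6 are not reproduced here, so the following is only a general sketch of where the 68% and 95% figures come from for a normal distribution. The helper name `within_sd` is made up for this example.

```r
# Proportion of a normal distribution lying within k standard deviations of the mean
within_sd <- function(k) pnorm(k) - pnorm(-k)

within_1sd <- within_sd(1)       # roughly 0.68
within_196sd <- within_sd(1.96)  # roughly 0.95

# The same proportion on the original scale, e.g. Normal(mean = 50, sd = 12):
# the interval 38 to 62 is the mean plus or minus one standard deviation
p_50_12 <- pnorm(62, mean = 50, sd = 12) - pnorm(38, mean = 50, sd = 12)

print(round(c(within_1sd, within_196sd, p_50_12), 3))
```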

Q7. A drug is given to 100 migraine sufferers to prevent the onset of new migraines. 40% experience a new migraine after taking the drug. What distribution does the outcome (new migraine) follow:

• Binomial
• Normal
• Poisson

Q8. A new drug is given to 100 asthma sufferers to reduce the number of hospital admissions due to asthma attack over a 12-month period. After 12 months, the mean number of hospital admissions is 2. What distribution does the outcome (hospital admissions) follow:

• Binomial
• Normal
• Poisson

Q9. The normal distribution is a:

• Discrete distribution
• Continuous distribution

Q10. The Poisson distribution is a:

• Discrete distribution
• Continuous distribution

Q11. The Binomial distribution is a:

• Discrete distribution
• Continuous distribution

Q12. Which of these does not follow a Poisson distribution:

• Asthma exacerbations over a 12-month period
• Patients arriving at a hospital emergency department in a one hour time period
• Number of patients in disease remission
• Number of patient falls on a geriatric ward over a twelve-hour shift.
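As a rough illustration of the distributions named in this quiz, the sketch below evaluates R's built-in probability functions using the numbers quoted in Q7 and Q8 (100 patients with probability 0.4, and a mean count of 2). It is a sketch for exploring the shapes, not the course's own code.

```r
# Binomial: number of patients out of n = 100 with the event, each with p = 0.4
k <- 0:100
binom_probs <- dbinom(k, size = 100, prob = 0.4)
binom_mean <- sum(k * binom_probs)   # equals n * p

# Poisson: a count of events in a fixed period, with mean 2
counts <- 0:10
pois_probs <- dpois(counts, lambda = 2)

# Both of the above put probability only on whole numbers (discrete),
# whereas the normal density is defined for any real value (continuous)
normal_density <- dnorm(50.5, mean = 50, sd = 12)

print(binom_mean)
print(round(pois_probs, 3))
print(normal_density)
```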

#### Quiz 3: Ways of Dealing with Weird Data Quiz Answers

Q1. The video introduced the idea that data do not always fit well-behaved distributions. However, this matters to a greater or lesser extent depending on how you plan to use the data. The following will test your understanding of this and the potential solutions available to you when you have “weird” data.

Dev has collected information on the average number of times a month that people viewed a particular public health information website (he has no information on people who did not access the website at all). He plots the data and observes the following:

Dev wants to describe website access in his sample. What would be the best approach for him to do this?

• Try transforming the data to see if it makes the distribution more normal and analyse as a normal distribution.
• Dichotomise the data into high and low usage using a cut point such as 5 or more times a month on average and analyse as a binomial distribution.
• Present a simple summary table of frequencies and proportion of people by average number of logins.

Q2. Ji-woo is conducting a study that is looking at the effects of a new drug on vision compared with a group that receive standard care. The vision outcome is measured by the ETDRS (a visual acuity scale), which has a range from 0-100 (complete sight loss to perfect vision). She collects the ETDRS at baseline before the drug/standard care is administered and 6 months later. At baseline, the sample contains patients with very poor vision, including some with complete vision loss. The literature shows that the baseline scores are likely to be positively skewed. Ji-woo wants to compare change scores on the ETDRS between baseline and 6 months across the two treatment groups. How should Ji-woo proceed?

In thinking about your answer, one of the things you should consider is how the doctor might most easily communicate the information to the patient.

• Present the mean change scores by group.
• Dichotomise the change scores so that the data follows a binomial distribution.

Q3. Nisha has data that contain each person’s average daily fruit and vegetable consumption over the course of a year for the last ten years. An extract is given in the table below.

A histogram of the data for year 1 is shown below:

She wants to draw a graph of the trend over this 10-year period. She decides she needs to get a summary measure for each year to compare over time. How can she best summarise the data per year to make a comparison over time?

• Calculate the mean average daily fruit and vegetable consumption for each year.
• Calculate the proportion per year that eat above the daily recommended amount.
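The two summaries offered in Q3 can both be computed per year in R. The sketch below uses a small made-up data frame (`fv`, with hypothetical columns `year` and `portions`) and assumes a target of 5 portions a day. Nisha's actual data and histogram are not reproduced here.

```r
# Hypothetical long-format data: one row per person per year
fv <- data.frame(
  year = rep(1:3, each = 5),
  portions = c(1, 2, 2, 3, 9,
               2, 2, 3, 5, 8,
               2, 3, 5, 6, 9)
)

# Summary 1: mean daily portions in each year
mean_by_year <- tapply(fv$portions, fv$year, mean)

# Summary 2: proportion at or above the assumed target of 5 portions a day
prop_by_year <- tapply(fv$portions >= 5, fv$year, mean)

print(mean_by_year)
print(prop_by_year)
```

With skewed data a few very high values can pull the mean upwards, which is worth bearing in mind when choosing between the two summaries.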

#### Quiz 4: Sampling Quiz Answers

Q1. Which one of the following defines the standard error of a mean?

• The difference between the population mean and the sample mean
• The average difference between the population mean and the sample mean
• The average difference between the individual observations and the sample mean

Q2. Lucy takes a sample of BMI values across her class of 35 students. The sample mean and standard deviation are 23.2 and 2 respectively. What is the estimated standard error of Lucy’s sample:

• 0.06
• 0.34
• 3.92

Q3. Lucy wants to calculate the 95% confidence interval for the sample mean. What is Lucy’s estimated 95% confidence interval:

• (22.53, 23.87)
• (19.28, 27.12)
• (21.20, 25.20)
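The calculations behind Q2 and Q3 can be sketched in R as follows, using the figures given for Lucy's sample. This assumes the usual normal-approximation interval (mean plus or minus about 1.96 standard errors). A t-based interval would be slightly wider.

```r
n <- 35
sample_mean <- 23.2
sample_sd <- 2

# Standard error of the mean: standard deviation divided by the square root of n
se <- sample_sd / sqrt(n)

# 95% confidence interval using the normal quantile (about 1.96)
z <- qnorm(0.975)
ci <- sample_mean + c(-1, 1) * z * se

print(round(se, 2))
print(round(ci, 2))
```

Rounding the standard error to two decimal places before multiplying, as you might do by hand, can shift the limits by about 0.01.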

#### Quiz 1: Distributions and Medians Quiz Answers

Q1. Match the below plot with the correct distribution.

• Normal (75, 10)
• Poisson(4)
• Uniform (0,100)
• Binomial (100, 0.5)

Q2. Match the below plot with the correct distribution.

• Binomial (100, 0.5)
• Uniform (0,100)
• Poisson(4)
• Normal (75, 10)

Q3. Match the below plot with the correct distribution.

• Normal (75, 10)
• Binomial (100, 0.5)
• Poisson(4)
• Uniform (0,100)

Q4. Match the below plot with the correct distribution.

• Binomial (100, 0.5)
• Poisson(4)
• Uniform (0,100)
• Normal (75, 10)

Q5. For the sequence of numbers 3, 4, 5, 5, 7, 36, what is the Mean?

• 6
• 10
• 3
• 5
• 4

Q6. For the sequence of numbers 3, 4, 5, 5, 7, 36, what is the Median?

• 3
• 10
• 6
• 4
• 5

Q7. For the sequence of numbers 7, 7, 5, 3, 2, 12, what is the Mean?

• 6
• 4
• 3
• 5
• 10

Q8. For the sequence of numbers 7, 7, 5, 3, 2, 12, what is the Median?

• 3
• 6
• 4
• 10
• 5
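Once you have worked Q5 to Q8 out by hand, a short R check like the one below can confirm your arithmetic. The variable names are arbitrary.

```r
x1 <- c(3, 4, 5, 5, 7, 36)
x2 <- c(7, 7, 5, 3, 2, 12)

# The mean is the sum divided by the number of values
mean_x1 <- mean(x1)
mean_x2 <- mean(x2)

# The median is the middle value after sorting; with an even number of
# values it is the average of the two middle values
median_x1 <- median(x1)
median_x2 <- median(x2)

print(c(mean_x1, median_x1))
print(c(mean_x2, median_x2))
```

In the first sequence the single large value (36) pulls the mean well above the median, while the second sequence has no such outlier.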

#### Quiz 2: Calculations: Percentiles by Hand Quiz Answers

Q1. Throughout this quiz, I’d like you to calculate the values by hand. It’s more effective learning than getting the computer to do it for you.

For the values 4, 5, 20, 22, 22, 24, 24, 26, 27, 29, 29, calculate the:

Mean (to one decimal place)

`Enter answer here`

Q2. For the values 4, 5, 20, 22, 22, 24, 24, 26, 27, 29, 29, calculate the:

Median

`Enter answer here`

Q3. For the values 4, 5, 20, 22, 22, 24, 24, 26, 27, 29, 29, calculate the:

25th Percentile

`Enter answer here`

Q4. For the values 4, 5, 20, 22, 22, 24, 24, 26, 27, 29, 29, calculate the:

75th Percentile

`Enter answer here`

Q5. For the values 4, 5, 20, 22, 22, 24, 24, 26, 27, 29, 29, calculate the IQR (interquartile range). Express it as a single number, i.e. the upper minus the lower quartile.

`Enter answer here`

Q6. For the values 7, 3, 5, 12, 6, 7, 25, 9, 23, 9, 12, 3, 12, 23, calculate the mean.

`Enter answer here`

Q7. For the values 7, 3, 5, 12, 6, 7, 25, 9, 23, 9, 12, 3, 12, 23, calculate the median.

`Enter answer here`

Q8. For the values 7, 3, 5, 12, 6, 7, 25, 9, 23, 9, 12, 3, 12, 23, calculate the 25th Percentile.

`Enter answer here`

Q9. For the values 7, 3, 5, 12, 6, 7, 25, 9, 23, 9, 12, 3, 12, 23, calculate the 75th Percentile.

`Enter answer here`

Q10. For the values 7, 3, 5, 12, 6, 7, 25, 9, 23, 9, 12, 3, 12, 23, calculate the IQR (expressed as single number).

`Enter answer here`
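After doing the calculations by hand, you may want to compare them with R. Be aware that there are several conventions for percentiles. The sketch below shows R's default method alongside `type = 6`, which places the p-th percentile at position (n + 1) × p in the sorted data. Whether that matches the hand method taught in the course is an assumption here.

```r
x <- c(4, 5, 20, 22, 22, 24, 24, 26, 27, 29, 29)

x_mean <- round(mean(x), 1)
x_median <- median(x)

# Percentiles at position (n + 1) * p in the sorted data
q_hand <- quantile(x, probs = c(0.25, 0.75), type = 6)

# R's default method (type = 7), which interpolates differently
q_default <- quantile(x, probs = c(0.25, 0.75))

# Interquartile range under each convention
iqr_hand <- unname(diff(q_hand))
iqr_default <- IQR(x)

print(c(mean = x_mean, median = x_median))
print(q_hand)
print(q_default)
print(c(iqr_hand = iqr_hand, iqr_default = iqr_default))
```

The two conventions give different quartiles for this small data set, so a mismatch with software output does not necessarily mean a hand calculation is wrong.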

#### Quiz 1: Hypothesis Testing Quiz Answers

Q1. In the following three questions, you’ll be asked to write out some text. This isn’t a graded assessment, so only my version of the correct answer will be given once you’ve answered. You need to check that the sense of what you’ve written matches mine.

In the example in the video you found that 20% (10/50) of those with cancer met the target of eating 5 portions of fruit and vegetables per day, let’s represent this proportion as pc = 0.2. The equivalent proportion in the sample of people without cancer was 30% (30/100): let’s represent this proportion as pnc = 0.3. You want to test whether there is a true difference in these proportions or whether the observed difference is due to random variation. You can do this with a hypothesis test.

Using the notation of pc and pnc set up the null hypothesis for this test. Can you suggest what the alternative hypothesis might be?

`What do you think?`

Q2. After choosing the appropriate test and running it you find you have a p-value of 0.02. Following the convention, you choose an a priori significance level (also known as the alpha value) of 0.05. What does this p-value allow you to conclude?

`What do you think?`

Q3. What does a p-value actually represent? Think about probabilities.

`What do you think?`

#### Quiz 2: The Coin Tossing Experiment: Evaluation Quiz Answers

`What do you think?`

Q2. Does your p value support or go against your null hypothesis?

• Supports the null hypothesis
• Goes against the null hypothesis

Q3. If yes, why? If not, why not?

Please write out an explanation of how your p-value relates to your null hypothesis. For example, if it goes against the null hypothesis and leads you to reject the null and accept the alternative hypothesis, what does this say about how likely it is that your coin is fair?

`What do you think?`

#### Quiz 3: Results: Running a New Hypothesis Test Quiz Answers

Q1. Suppose you want to compare the proportions of overweight and cancer. First, define your variables:

``cancer <- g$cancer``

``overweight <- ifelse(g$bmi >= 25, 1, 0)``

Have a look at your new variable to check everything makes sense:

``table(overweight)``

overweight

0 1

Next, perform a chi-squared test. As best practice, assign the explanatory variable to x and the dependent variable to y. The “dependent variable” is so named because we are hypothesising that its value depends at least partly on some other variable(s) – called the “explanatory variable(s)”.

``chisq.test(x = overweight, y = cancer)``

What did you get? What do you conclude?

Enter the p value in the box below (to 2 decimal places) and tick which of the given options for the conclusion you agree with.

`.65`

Q2. Tick which of the below given options for the conclusion you agree with.

• Being overweight gives you cancer
• Being overweight protects you from getting cancer
• Being overweight does not give you cancer
• There is no association between being overweight and cancer
• There is good evidence of an association between being overweight and cancer
• There is no evidence of an association between being overweight and cancer anywhere in the world
• There is no evidence of an association between being overweight and cancer in this data set
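The course data set `g` is not included on this page, so the steps from Q1 cannot be run as given. The sketch below uses a simulated stand-in for `g` (an assumption, with made-up BMI and cancer values) purely to show the sequence of commands. Its p-value will not match the one from the real data.

```r
set.seed(1)

# Simulated stand-in for the course data set (hypothetical values)
g <- data.frame(
  bmi = rnorm(200, mean = 26, sd = 4),
  cancer = rbinom(200, size = 1, prob = 0.3)
)

cancer <- g$cancer
overweight <- ifelse(g$bmi >= 25, 1, 0)

# Check the new variable and its cross-tabulation with cancer
print(table(overweight))
print(table(overweight, cancer))

# Explanatory variable as x, dependent variable as y
result <- chisq.test(x = overweight, y = cancer)
print(round(result$p.value, 2))
```

A large p-value from such a test indicates a lack of evidence of association in the data analysed. It does not show that no association exists.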

#### Quiz 3: Hypothesis Testing Quiz Answers

Q1. In each of the following six questions, you’ll be asked to choose the single correct answer.

David takes 5 samples of 10 patients from the National Cancer Registry. He calculates mean BMI values for each of these 5 samples and obtains the following results – 24.3, 27.9, 25.2, 26.7, 26.4. Why are David’s sample means all different?

• Measurement error
• Population variation
• Sampling variation
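The situation in Q1 can be mimicked with a quick simulation. The population below is invented (normally distributed BMI values with an assumed mean of 26 and standard deviation of 4) and is not the National Cancer Registry.

```r
set.seed(42)

# Hypothetical population of BMI values
population <- rnorm(10000, mean = 26, sd = 4)

# Five independent samples of 10 patients, each summarised by its mean
sample_means <- replicate(5, mean(sample(population, 10)))

print(round(sample_means, 1))
```

Each sample contains different individuals, so each gives a somewhat different mean even though the population itself has not changed.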

Q2. Charlotte wants to test the mean BMI value in the National Cancer Registry based on a sample of 100 patients. She hypothesizes that the mean BMI value in her sample will be 27. Before she conducts her experiment, her boss points out an error in her hypothesis. What is wrong with Charlotte’s statement?

• The hypothesis should relate to the population value.
• 27 is an unreasonable value for mean BMI.
• She hasn’t specified her alpha value.

Q3. Charlotte corrects her hypotheses and randomly selects her sample of 100 patients. She has decided to use a two-sided alpha value of 0.01 instead of the conventional value of 0.05 because she believes that this will decrease her risk of making the wrong conclusion. Will this lower value reduce her risk of concluding the mean population BMI is 27 when in fact it isn’t?

• No
• Yes

Q4. Charlotte’s colleague repeats her experiment but chooses a two-sided alpha value of 0.05. What happens to the chance area (or probability of making a type I error)?

• Becomes larger
• Stays the same
• Becomes smaller
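One way to see how the alpha value relates to type I error is to simulate many studies in which the null hypothesis is true and count how often it is rejected. The sketch below assumes normally distributed BMI with a true mean of 27 and a standard deviation of 4. Both the standard deviation and the number of simulated studies are arbitrary choices.

```r
set.seed(123)

# 2000 simulated studies of 100 patients in which the null hypothesis
# (population mean BMI = 27) is actually true
p_values <- replicate(2000, t.test(rnorm(100, mean = 27, sd = 4), mu = 27)$p.value)

# Proportion of studies that wrongly reject the null at each alpha value
type1_at_05 <- mean(p_values < 0.05)
type1_at_01 <- mean(p_values < 0.01)

print(c(alpha_0.05 = type1_at_05, alpha_0.01 = type1_at_01))
```

The rejection rate tracks the chosen alpha value. This controls the risk of rejecting a true null hypothesis, which is a different error from failing to reject a false one.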

Q5. How many degrees of freedom will Charlotte’s test have?

• 0.05
• 100
• 99
• 0.01

Q6. Noah has the following data and wants to test whether age-group is associated with the presence or absence of cancer. He decides to perform a chi-squared test.

How many degrees of freedom does his test have?

• 4
• 10
• 919
• 920
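R reports the degrees of freedom for both kinds of test, which can be a useful check on the rules of thumb: n − 1 for a one-sample t-test, and (rows − 1) × (columns − 1) for a chi-squared test on a table. Noah's table is not reproduced on this page, so the sketch below uses simulated BMI values and a made-up 3 × 2 table with hypothetical group labels instead.

```r
set.seed(7)

# One-sample t-test on a simulated sample of 100 BMI values
bmi_sample <- rnorm(100, mean = 27, sd = 4)
t_df <- unname(t.test(bmi_sample, mu = 27)$parameter)   # n - 1

# Chi-squared test on a hypothetical 3 x 2 table of counts
counts <- matrix(c(20, 80,
                   35, 65,
                   50, 50),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(group = c("A", "B", "C"),
                                 cancer = c("yes", "no")))
chi_df <- unname(chisq.test(counts)$parameter)   # (rows - 1) * (cols - 1)

print(c(t_df = t_df, chi_df = chi_df))
```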

#### Quiz 4: End-of-course Assessment Quiz Answers

Q1. Part of the success of the UN’s Millennium Development Goals was due to the statistical monitoring of data on measures such as infant mortality and living in extreme poverty.

• True
• False

Q2. As long as a research question is interesting, it is scientifically testable as a hypothesis – the more interesting, the more testable.

• true
• false

Q3. In the study published in the Journal of the American College of Cardiology on the effect of taking supplements of vitamins and minerals that you read earlier in this course, they concluded that, in simple terms, there’s no health benefit in taking such supplements (with the exception of folic acid) and there might even be some risk.

• true
• false

Q4. That phrase that I wrote in the previous question, “there’s no health benefit in taking supplements”, uses precise enough language to be used in a hypothesis test.

• true
• false

Q5. The responsibility for accurate reporting of medical research always lies solely with the journalist. If there’s a misinterpretation of the results, the scientist is never to blame.

• true
• false

Q6. The next set of questions concern data types and exploratory analyses in R. A histogram is a useful but rough way to assess whether a variable is normally distributed.

• true
• false

Q7. When undertaking a t-test in R, it is fine to use “t.test” before “hist” and “summary”

• true
• false

Q8. You want to see whether patients with cancer have different mean BMIs from those without, and you type t.test(cancer~bmi). Which of the following are true? Please select all that apply.

• You have written “cancer” and “bmi” the wrong way round
• BMI should be roughly normally distributed for a t-test to be valid
• You should have done a chi-squared test instead
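For reference, R's formula interface to `t.test` takes the form outcome ~ group, where the group variable has two levels. The sketch below uses simulated `bmi` and `cancer` vectors (made-up values, not the course data) and looks at the data before testing.

```r
set.seed(2)

# Simulated stand-ins for the course variables (hypothetical values)
cancer <- rbinom(150, size = 1, prob = 0.3)
bmi <- rnorm(150, mean = 26, sd = 4)

# Look at the continuous variable first
print(summary(bmi))
h <- hist(bmi, plot = FALSE)   # drop plot = FALSE to draw the histogram

# Compare mean BMI between the two cancer groups:
# the continuous outcome goes on the left, the grouping variable on the right
res <- t.test(bmi ~ cancer)
print(res$estimate)
print(res$p.value)
```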

Q9. Your boss reminds you that BMI is often categorised, with underweight, normal weight etc as categories. Which of the following is/are correct?

• Making categories from a normally distributed variable loses a lot of information, and it’s more efficient to compare means instead of proportions
• If you did categorise BMI, you could do a chi-squared test using “chisq.test” in R
• The chi-squared statistic that R gives you in the output is really useful and should always be reported

Q10. You decide to turn BMI into categories because they are of public interest, even though it loses information. Before running the above chi-squared test, you make the variable “bmi.group”. You should run these commands in R first and for the reason given…

• table(cancer) in order to check how many values “cancer” has
• table(bmi.group, exclude=NULL) to check your code for grouping BMI gives sensible results
• hist(bmi) to check that BMI is roughly normally distributed
• summary(bmi) to check that BMI is roughly normally distributed
• table(bmi) in order to see how common each BMI value is
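A grouped variable such as “bmi.group” could be built with `cut()`, as sketched below. The BMI values are invented, and the cut points (18.5, 25 and 30) are the commonly used category boundaries, assumed here because the course's own grouping code is not shown.

```r
# Made-up BMI values, including one missing value
bmi <- c(17.2, 21.5, 24.9, 25.0, 28.3, 31.7, NA)

# Group BMI into categories; right = FALSE means each interval includes
# its lower limit, so a BMI of exactly 25 counts as overweight
bmi.group <- cut(bmi,
                 breaks = c(0, 18.5, 25, 30, Inf),
                 labels = c("underweight", "normal", "overweight", "obese"),
                 right = FALSE)

# exclude = NULL keeps missing values in the table so they are not silently dropped
group_counts <- table(bmi.group, exclude = NULL)
print(group_counts)
```

Checking the table against a few values you have classified by hand is a quick way to catch boundary mistakes in the grouping code.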

Q11. The next set of questions concern the interpretation of official mortality figures from India. These figures are publicly available from https://data.gov.in/catalog/estimated-age-specific-death-rates-sex and in the reading before this test (pdf download) and give the rates of death per 1,000 population in each age-gender group.

True or false?

These data were published in April 2014, but they only go up to 2011. Such delays in releasing official data are common in many countries.

• True
• False

Q12. In 2011 in India according to official statistics, the estimated death rate for girls aged under 1 was higher than that for every older age group until 75-79.

• true
• false

Q13. The lack of a 95% confidence interval for either of these estimates means that we can safely say that 49.7 is statistically significantly higher than 42.5.

• true
• false

Q14. To see whether these two rates (42.5 and 49.7 per 1,000) are statistically significantly different from one another, we would carry out a t-test and interpret its p value.

• true
• false

Q15. As these rates are in fact based on proportions (they’re proportions of the population of each age group that died that are then multiplied by 1,000 to make them easier to read), the appropriate test is a chi-squared test. We have enough information to carry out this test.

• true
• false
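To see why the underlying counts matter when comparing two rates, the sketch below runs R's `prop.test` on the two rates from Q13 and Q14 (roughly 42.5 and 49.7 per 1,000) with two invented population sizes. The population sizes and the rounded death counts are assumptions made purely for illustration. They are not taken from the Indian data.

```r
# Hypothetical scenario 1: 1,000 people in each group (deaths rounded to whole numbers)
small <- prop.test(x = c(43, 50), n = c(1000, 1000))

# Hypothetical scenario 2: 100,000 people in each group, same rates
large <- prop.test(x = c(4250, 4970), n = c(100000, 100000))

p_small <- small$p.value
p_large <- large$p.value

print(c(p_small = p_small, p_large = p_large))
```

The same pair of rates can give very different p-values depending on the number of people behind them, so a rate on its own does not determine the result of the test.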
