## Get All Weeks Introduction to Probability and Data with R Coursera Quiz Answers

### Week 02: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Practice Quiz

Q1. Which of the following classifications of variable types is false?

Q2. True or False: If subjects are randomly assigned to treatments, conclusions can be generalized to the population.

Q3. As part of a statistics project, Andrea would like to collect data on household size in her city. To do so, she asks each person in her statistics class for the size of their household and reports that her sample is a simple random sample. However, this is not a simple random sample. Which of the following is the best reasoning for why this is not a random sample that is appropriate for this research question?

Q4. Which of the following is not one of the four principles of experimental design?

Q5. True or False: Stratified sampling allows for controlling for possible confounders in the sampling stage while blocking allows for controlling for such variables during the random assignment.

#### Quiz 2: Week 1 Quiz

Q1. Consider the table below describing a data set of individuals who have registered to volunteer at a public school. Which of the choices below lists categorical variables?

Q2. The General Social Survey conducted annually in the United States asks how many friends people have and how they would rate their happiness level (very happy, pretty happy, not too happy). In order to evaluate the relationship between these two variables a researcher calculates the average number of friends for people who categorize themselves as very happy, pretty happy, and not too happy. Which of the following correctly identifies the variables used in the study as explanatory and response?

response: number of friends

Q3. In a study published in 2011 in The Proceedings of the National Academy of Sciences, researchers randomly assigned 120 elderly men and women who volunteered to be a part of this study (average age mid-60s) to one of two exercise groups. One group walked around a track three times a week; the other did a variety of less aerobic exercises, including yoga and resistance training with bands. After a year, brain scans showed that among the walkers, the hippocampus (part of the brain responsible for forming memories) had increased in volume by about 2% on average; in the others, it had declined by about 1.4%. Which of the following is false?

Q4. An extraneous variable that is related to the explanatory and response variables and that prevents us from deducing causal relationships based on observational studies is called a ______ (use all lowercase letters in your answer, please).

Q5. For your political science class, you’d like to take a survey from a sample of all the Catholic Church members in your town. Your town is divided into 17 neighborhoods, each with similar socio-economic status distribution and ethnic diversity, and each contains a Catholic Church. Rather than trying to obtain a list of all members of all these churches, you decide to pick 3 churches at random. For these churches, you’ll ask to get a list of all current members and contact 100 members at random. What kind of design have you used?

Q6. In an experiment, what purpose does blocking serve?

Q7. Which of the following is one of the four principles of experimental design?

### Week 3: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Introduction to R and RStudio

Q1. How many variables are included in this data set (data set: Arbuthnot)?

Q2. What command would you use to extract just the counts of girls born?

Q3. Which of the following best describes the number of girls baptized over the years included in this dataset?

Q4. How many variables are included in this data set (data set: present)?

Q5. Calculate the total number of births for each year and store these values in a new variable called total in the present dataset. Then, calculate the proportion of boys born each year and store these values in a new variable called prop_boys in the same dataset. Plot these values over time and based on the plot determine if the following statement is true or false: The proportion of boys born in the US has decreased over time.

Q6. Create a new variable called more_boys which contains the value of either TRUE if that year had more boys than girls, or FALSE if that year did not. Based on this variable which of the following statements is true?

Q7. Calculate the boy-to-girl ratio each year, and store these values in a new variable called prop_boy_girl in the present dataset. Plot these values over time. Which of the following best describes the trend?

Q8. In what year did we see the most total number of births in the U.S.?

### Week 4: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Practice Quiz

Q1. Which of the below data sets has the lowest standard deviation? You do not need to calculate the exact standard deviations to answer this question.

Q2. True or False: The statistic mean/median (mean divided by median) can be used as a measure of skewness (either right or left). Suppose we are dealing with a distribution where the minimum is 0.5. If this statistic (mean/median) is less than 1, the distribution is most likely left skewed.

Q3. True or False: You are going to collect income data from a right-skewed distribution of the incomes of politicians. If you take a large enough sample from that distribution, the sample mean and the sample median will always have the same value.

Q4. True or False: A mosaic plot is useful for visualizing the relationship between a numerical and a categorical variable.

Q5. Does meditation cure insomnia? Researchers randomly divided 400 people into two equal-sized groups. One group meditated daily for 30 minutes, the other group attended a 2-hour information session on insomnia. At the beginning of the study, the average difference between the number of minutes slept between the two groups was about 0. After the study, the average difference was about 32 minutes, and the meditation group had a higher average number of minutes slept. To test whether an average difference of 32 minutes could be attributed to chance, a statistics student decided to conduct a randomization test. She wrote the number of minutes slept by each subject in the study on an index card. She shuffled the cards together very well, and then dealt them into two equal-sized groups. Which of the following best describes the outcome?

#### Quiz 2: Week 2 Quiz

Q1. Which of the below data sets has the highest standard deviation? You do not need to calculate the exact standard deviations to answer this question.

Q2. The distribution of exam scores (ranging from 0 – 100%) where the mean score is 75%, the standard deviation is 12%, and the median is 78% is most likely

Q3. Two distributions (A and B) are shown on the box plot below. Which of the following statements is not supported by the plot?

Q4. Which is more affected by extreme observations, the mean or median? And how about the standard deviation or IQR?

Q5. Phi Delta Kappa (PDK) is an international professional organization for educators that, in collaboration with Gallup, has been conducting polls on the public’s attitudes toward public schools since 1969. The following was one of the questions on the 2011 poll:

Q6. In 1948, Austin Bradford Hill designed a study to test a new treatment for tuberculosis. At the beginning of the study, there was no evidence on whether the treatment would be any better or worse than bed rest. He randomly assigned some patients who volunteered to be a part of this study to receive the treatment of Streptomycin, an antibiotic. The other patients received only bed rest as the control group. Hill then observed the patients’ outcomes: which patients died and which recovered. The results of the study are shown below.

We use the following simulation to test whether there is a difference between the recovery rates under the two treatments: We write “died” on 18 index cards and “survived” on 89 index cards to indicate whether or not a patient died. Next, we shuffle the cards and deal them into two groups of 52 and 55, for control and treatment, respectively. We then calculate the simulated difference between the recovery rates in the Streptomycin and control groups (p̂Streptomycin − p̂Control), and record this value. We repeat this simulation 100 times. The histogram below shows the distribution of the simulated differences between the recovery rates in these 100 simulations.

Which of the following is correct? Choose all that apply (there are multiple correct answers).

2. The alternative hypothesis is that the Streptomycin treatment is more effective than bed rest.
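The card-shuffling procedure described in Q6 can be sketched in a few lines. The block below is a minimal illustration in Python (not the course's R code), using only the counts stated in the question: 18 "died" cards, 89 "survived" cards, and groups of 52 (control) and 55 (treatment). The function name, the seed, and the variable names are assumptions made for this example. The observed study table is not reproduced on this page, so the sketch only generates the null distribution and does not compute a p-value.

```python
import random

def simulate_differences(n_died=18, n_survived=89, n_control=52,
                         n_sims=100, seed=0):
    """Shuffle outcome cards and deal them into control and treatment groups.

    Returns one simulated difference in recovery rates
    (treatment minus control) per simulation.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    cards = ["died"] * n_died + ["survived"] * n_survived
    diffs = []
    for _ in range(n_sims):
        rng.shuffle(cards)
        control = cards[:n_control]      # first 52 cards
        treatment = cards[n_control:]    # remaining 55 cards
        p_control = control.count("survived") / len(control)
        p_treatment = treatment.count("survived") / len(treatment)
        diffs.append(p_treatment - p_control)
    return diffs

diffs = simulate_differences()
mean_diff = sum(diffs) / len(diffs)
print(f"mean simulated difference: {mean_diff:.3f}")
print(f"range: {min(diffs):.3f} to {max(diffs):.3f}")
```

Because the shuffle ignores group labels, the simulated differences should center near 0, which is what the null hypothesis of no treatment effect predicts.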

### Week 5: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Introduction to Data

Q1. Create a new data frame that includes flights headed to SFO in February, and save this data frame as sfo_feb_flights. How many flights meet these criteria?

Q2. Make a histogram and calculate appropriate summary statistics for arrival delays of sfo_feb_flights. Which of the following is false?

Q3. Calculate the median and interquartile range for arr_delays of flights in the sfo_feb_flights data frame, grouped by carrier. Which carrier has the highest IQR of arrival delays?

Q4. Considering the data from all the NYC airports, which month has the highest average departure delay?

Q5. Which month has the highest median departure delay from an NYC airport?

Q6. Is the mean or the median a more reliable measure for deciding which month(s) to avoid flying if you really dislike delayed flights, and why?

Q7. If you were selecting an airport simply based on on-time departure percentage, which NYC airport would you choose to fly out of?

Q8. Mutate the data frame so that it includes a new variable that contains the average speed, avg_speed, traveled by the plane for each journey (in mph). What is the tail number of the plane with the fastest avg_speed? Hint: Average speed can be calculated as distance divided by the number of hours of travel, and note that air_time is given in minutes. If you just want to show the avg_speed and tailnum and none of the other variables, use the select function at the end of your pipe to select just these two variables with select(avg_speed, tailnum). You can google this tail number to find out more about the aircraft.

Q9. Make a scatterplot of avg_speed vs. distance. Which of the following is true about the relationship between average speed and distance?

Q10. Suppose you define a flight to be “on time” if it gets to the destination on time or earlier than expected, regardless of any departure delays. Mutate the data frame to create a new variable called arr_type with levels “on time” and “delayed” based on this definition. Also mutate to create a new variable called dep_type with levels “on time” and “delayed” depending on whether the flight was delayed for fewer than 5 minutes or 5 minutes or more, respectively. In other words, if arr_delay is 0 minutes or fewer, arr_type is “on time”. If dep_delay is less than 5 minutes, dep_type is “on time”. Then, determine the on time arrival percentage based on whether the flight departed on time or not. What fraction of flights that were “delayed” departing arrive “on time”? (Enter the answer as a decimal, like 0.xx)

### Week 6: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Practice Quiz

Q2. Which of the following is false about probability distributions?

Q3. Last semester, out of 170 students taking a particular statistics class, 71 students were majoring in social sciences and 53 students were majoring in pre-medical studies. There were 6 students who were majoring in both pre-medical studies and social sciences. What is the probability that a randomly chosen student is majoring in social sciences, given that s/he is majoring in pre-medical studies?
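The conditional probability in Q3 follows from the definition P(A | B) = P(A and B) / P(B). The sketch below is a small arithmetic check in Python, using only the counts given in the question. The variable names are assumptions made for this example.

```python
from fractions import Fraction

n_total = 170    # students in the class
n_premed = 53    # majoring in pre-medical studies
n_both = 6       # majoring in both pre-med and social sciences

# P(social sciences | pre-med) = P(both) / P(pre-med)
p_both = Fraction(n_both, n_total)
p_premed = Fraction(n_premed, n_total)
p_social_given_premed = p_both / p_premed

print(p_social_given_premed)                   # exact fraction
print(round(float(p_social_given_premed), 4))  # decimal approximation
```

The class size cancels out, so the result depends only on the 6 students in both majors and the 53 pre-med students.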

#### Quiz 2: Week 3 Quiz

Q1. Which of the following states that the proportion of occurrences with a particular outcome converges to the probability of that outcome?

Q2. Shown below are four Venn diagrams. In which of the diagrams does the shaded area represent A and B but not C?

Q3. Each choice below shows a suggested probability distribution for the method of access to online course materials (desktop computer, laptop computer, tablet, smartphone). Determine which is a proper probability distribution.

Q4. Assortative mating is a nonrandom mating pattern where individuals with similar genotypes and/or phenotypes mate with one another more frequently than what would be expected under a random mating pattern. Researchers studying this topic collected data on eye colors of 204 Scandinavian men and their female partners. The table below summarizes the results. For simplicity, assume heterosexual relationships. What is the probability that a randomly chosen couple is comprised of a male and female with blue eyes?

(Reference: Laeng, Bruno, Ronny Mathisen, and Jan-Are Johnsen. “Why do blue-eyed men prefer women with the same eye color?.” Behavioral Ecology and Sociobiology 61.3 (2007): 371-384.)

Q5. Which of the following statements is false?

### Week 7: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Probability

Q1. Fill in the blank: A streak length of 1 means one ______ followed by one miss.

Q2. Fill in the blank: A streak length of 0 means one ______ which must occur after a miss that ended the preceding streak.

Q3. Which of the following is false about the distribution of Kobe’s streak lengths from the 2009 NBA finals?

Q4. If you were to run the simulation of the independent shooter a second time, how would you expect its streak distribution to compare to the distribution from the exercise above?

Q5. How does Kobe Bryant’s distribution of streak lengths compare to the distribution of streak lengths for the simulated shooter? Using this comparison, do you have evidence that the hot hand model fits Kobe’s shooting patterns?

### Week 8: Introduction to Probability and Data with R Coursera Quiz Answers

#### Quiz 1: Practice Quiz

Q1. Heights of 10-year-olds, regardless of gender, closely follow a normal distribution with a mean of 55 inches and a standard deviation of 6 inches. Which of the following is true?

Q2. While it is often assumed that the probabilities of having a boy or a girl are the same, the actual probability of having a boy is slightly higher at 0.51. Suppose a couple plans to have 3 children. What is the probability that exactly 2 of them will be boys?
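Q2 is a binomial probability with n = 3 trials, k = 2 successes, and success probability p = 0.51. The sketch below is a quick Python check of the formula C(n, k) · p^k · (1 − p)^(n − k), assuming the births are independent. The helper name `binomial_pmf` is hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_two_boys = binomial_pmf(k=2, n=3, p=0.51)
print(round(p_two_boys, 4))  # 0.3823
```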

Q3. You are about to take a multi-day tour through a national park which is famous for its wildlife. The tour guide tells you that on any given day there’s a 61% chance that a visitor will see at least one “big game” animal and a 39% chance they’ll see no big game animals. When the tour guide says “big game”, he refers to either a moose or a bear. The guide assures you that big game sightings on a single day are independent of any other day’s sightings. Given the information from the tour guide, which of the following calculations cannot be performed using a binomial distribution?

Q4. Your friend is about to begin an introductory chemistry course at his university. The course has collected data from students on their study habits for many years, and the professor reports that study times (in hours) for the final exam closely follow a normal distribution with a mean of 24 and a standard deviation of 4. What percentage of students study 34 hours or more?
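For Q4, the cutoff of 34 hours sits z = (34 − 24) / 4 = 2.5 standard deviations above the mean, so the question asks for the upper tail of the standard normal beyond 2.5. The sketch below computes that tail in Python with the complementary error function from the standard library. The helper name `normal_upper_tail` is hypothetical.

```python
import math

def normal_upper_tail(x, mean, sd):
    """P(X >= x) for X ~ Normal(mean, sd)."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

p_34_or_more = normal_upper_tail(34, mean=24, sd=4)
print(f"{100 * p_34_or_more:.2f}%")  # 0.62%
```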

Q5. Which of the following is false? Hint: It might be useful to sketch the distributions.

Q6. About 30% of human twins are identical, and the rest are fraternal. Identical twins are necessarily the same sex, half are males and the other half are females. One-quarter of fraternal twins are both males, one-quarter are both female and one-half are mixed: one male, one female. You have just become a parent of twins and are told they are both girls. Given this information, what is the probability that they are identical?
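Q6 is an application of Bayes' theorem: P(identical | both girls) = P(both girls | identical) · P(identical) / P(both girls). The sketch below works through the arithmetic in Python using only the percentages stated in the question. The variable names are assumptions made for this example.

```python
p_identical = 0.30
p_fraternal = 1 - p_identical             # 0.70

p_girls_given_identical = 0.50            # half of identical pairs are female
p_girls_given_fraternal = 0.25            # one-quarter of fraternal pairs are both female

# Law of total probability for P(both girls)
p_girls = (p_girls_given_identical * p_identical
           + p_girls_given_fraternal * p_fraternal)

p_identical_given_girls = p_girls_given_identical * p_identical / p_girls
print(round(p_identical_given_girls, 4))  # 0.4615
```

Learning that both twins are girls raises the probability of identical twins from 0.30 to about 0.46, because same-sex pairs are more common among identical twins.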

Q7. Which of the following probabilities can be calculated using the normal approximation to the binomial distribution?

#### Quiz 2: Week 4 Quiz

Q1. Suppose that scores on a national entrance exam are normally distributed with a mean of 1000 and a standard deviation of 100. Which of the following is false?

Q2. A 2005 survey found that 7% of teenagers (ages 13 to 17) suffer from an extreme fear of spiders (arachnophobia). At a summer camp, there are 10 teenagers sleeping in each tent. Assume that these 10 teenagers are independent of each other. What is the probability that at least one of them suffers from arachnophobia?
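"At least one" problems like Q2 are usually easiest through the complement: P(at least one) = 1 − P(none). With independent teenagers, P(none) = (1 − 0.07)^10. The sketch below checks the arithmetic in Python. The variable names are assumptions made for this example.

```python
p_phobia = 0.07
n_teens = 10

# Complement rule: subtract the probability that nobody has arachnophobia
p_none = (1 - p_phobia) ** n_teens
p_at_least_one = 1 - p_none

print(round(p_at_least_one, 3))  # 0.516
```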

Q3. Your roommate loves to eat Chinese food for dinner. He estimates that on any given night, there’s a 30% chance he’ll choose to eat Chinese food. Although he loves Chinese food, he doesn’t like to eat it too much in a short period of time, so on most weeks he eats several different kinds of foods for dinner. Suppose you wanted to calculate the probability that, over the next 7 days, your friend eats Chinese food at least 3 times. Which of the following is the most accurate statement about calculating this probability?

Q4. Which of the following, on its own, is the least useful method for assessing if the data follow a normal distribution?

Q5. Which of the following is true? Hint: It might be useful to sketch the distributions.

Q6. More than three-quarters of the nation’s colleges and universities now offer online classes, and about 23% of college graduates have taken a course online. 39% of those who have taken a course online believe that online courses provide the same educational value as one taken in person, a view shared by only 27% of those who have not taken an online course. At a coffee shop, you overhear a recent college graduate discussing that she doesn’t believe that online courses provide the same educational value as one taken in person. What’s the probability that she has taken an online course before?
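Q6 is another Bayes' theorem problem, with one twist: the graduate does not believe online courses have the same value, so the complements of the stated percentages are needed (61% of online-course takers and 73% of non-takers). The sketch below works through the arithmetic in Python using only the figures in the question. The variable names are assumptions made for this example.

```python
p_online = 0.23
p_not_online = 1 - p_online              # 0.77

# Complements of the "same value" percentages given in the question
p_disagree_given_online = 1 - 0.39       # 0.61
p_disagree_given_not_online = 1 - 0.27   # 0.73

# Law of total probability for P(disagree)
p_disagree = (p_disagree_given_online * p_online
              + p_disagree_given_not_online * p_not_online)

p_online_given_disagree = p_disagree_given_online * p_online / p_disagree
print(round(p_online_given_disagree, 3))  # 0.2
```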

Q7. One strange phenomenon that sometimes occurs at U.S. airport security gates is that an otherwise law-abiding passenger is caught with a gun in his/her carry-on bag. Usually the passenger claims he/she forgot to remove the handgun from a rarely-used bag before packing it for airline travel. It’s estimated that every day 3,000,000 gun owners fly on domestic U.S. flights. Suppose the probability a gun owner will mistakenly take a gun to the airport is 0.00001. What is the probability that tomorrow more than 35 domestic passengers will accidentally get caught with a gun at the airport? Choose the closest answer.
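One way to approach Q7 is the normal approximation to the binomial, with mean np and standard deviation sqrt(np(1 − p)). The sketch below applies it in Python, assuming passengers act independently and using no continuity correction. Both are assumptions made for this example, and a continuity correction would give a somewhat smaller value.

```python
import math

n = 3_000_000   # gun owners flying per day
p = 0.00001     # chance each one mistakenly brings a gun

mean = n * p                       # expected count: 30
sd = math.sqrt(n * p * (1 - p))    # about 5.48

# Normal approximation to P(X > 35), no continuity correction
z = (35 - mean) / sd
p_more_than_35 = 0.5 * math.erfc(z / math.sqrt(2))

print(round(mean, 2), round(sd, 2), round(z, 2))
print(round(p_more_than_35, 2))  # 0.18
```

The approximation is reasonable here because both np = 30 and n(1 − p) are well above 10.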

**Conclusion:**

In conclusion, our journey through the Introduction to Probability and Data with R course has been a fascinating exploration of the fundamental concepts that underpin the world of data analysis and statistics. We’ve delved into the principles of probability, statistical inference, and data visualization, equipping ourselves with essential tools for making sense of the vast sea of information that surrounds us.