Design Principles: an Introduction Coursera Quiz Answers

Designing, Running, and Analyzing Experiments Coursera Quiz Answers

Week 1: Understanding the Basics

Q1. Why can’t we confidently compare only means when trying to assess differences between experiment conditions?

  • We can, provided the means are really big.
  • We can, provided the means are really different.
  • We can, but we need to know every score comprising those means as well.
  • We can, but we need to know the spread of scores around those means as well.
  • None of the above.
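
Why the spread matters can be seen with a small sketch in Python (the course itself uses R). The scores below are invented for illustration: both pairs of conditions differ by the same two points in their means, but the scores in the second pair are scattered so widely that the same difference is far less convincing.

```python
from statistics import mean, stdev

# Hypothetical scores, invented for illustration only.
tight_a, tight_b = [9, 10, 11], [11, 12, 13]
wide_a, wide_b = [1, 10, 19], [3, 12, 21]


def summarize(scores):
    """Return (mean, sample standard deviation) for a list of scores."""
    return mean(scores), stdev(scores)


for label, a, b in [("tight", tight_a, tight_b), ("wide", wide_a, wide_b)]:
    (mean_a, sd_a), (mean_b, sd_b) = summarize(a), summarize(b)
    # Same mean difference in both cases; only the spread changes.
    print(f"{label}: mean difference = {mean_b - mean_a}, SDs = {sd_a:.1f}, {sd_b:.1f}")
```

The means alone are identical across the two cases; only the standard deviations reveal how much the conditions overlap.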

Q2. When is having more participants customarily most helpful in an experiment?

  • When differences are present but relatively small, more participants will give more statistical power to detect those differences.
  • When differences are present but relatively large, more participants will ensure those differences are detected.
  • When differences are absent and we need to be sure that we don’t incorrectly claim a difference anyway.
  • When differences are absent and we want to be sure that the results are convincing.
  • None of the above.

Q3. A practically significant difference can be determined in just the same way that a statistically significant difference can.

  • True
  • False

Q4. Which of the following is a type of probability sampling?

  • Snowball sampling
  • Random stratified sampling
  • Convenience sampling
  • Purposive sampling
  • None of the above.

Q5. Which of the following is best described as an exclusion criterion?

  • An experiment will only include males of age 18-25.
  • An experiment will only include people living in developing nations.
  • An experiment will exclude people who have used an existing website before.
  • An experiment will exclude everyone not between the ages of 18-25.
  • None of the above.

Q6. The question of whether or not to capture data by writing computer-generated log files is a question about which aspect of experiment design?

  • Participants
  • Apparatus
  • Procedure
  • Design & Analysis
  • None of the above.

Q7. Informed consent is important because it gives participants the power to choose whether to proceed with taking part in an experiment.

  • True
  • False

Q8. Which of the following pertain to the procedure of an experiment? (Mark all that apply.)

  • How many trials a participant performs
  • How long it should take a participant to complete each trial
  • A virtual-reality headset worn by participants
  • A think-aloud protocol followed by participants
  • The pointing device used by participants

Week 2: Designing, Running, and Analyzing Experiments

Quiz 1: Understanding Tests of Proportions

Q1. If your Subject column is filled with numbers, why is it necessary to recode it using the “factor” function?

  • The Subject column is a nominal factor, not a numeric one, despite sometimes being encoded as a number.
  • The Subject column is a categorical factor, not a numeric one, despite sometimes being encoded as a number.
  • The Subject column is a nominal factor, not a scalar one, despite sometimes being encoded as a number.
  • The Subject column is a categorical factor, not a scalar one, despite sometimes being encoded as a number.
  • All of the above.

Q2. Which of the following variable type names are grouped with synonyms? (Mark all that apply.)

  • Categorical, nominal, factor
  • Ordinal, ordered
  • Numeric, continuous, scalar
  • Numeric, ordinal, factor
  • Categorical, binomial, scalar

Q3. What is the correct R command for viewing preference proportions?

  • plot(data$Pref)
  • None of the above.

Q4. Which of the following is the most precise way of saying what a one-sample test of proportions tells us?

  • Whether the proportions of counts in each response category are significantly different.
  • Whether the proportions of counts across the response categories are significantly different from each other.
  • Whether any of the proportions of counts in each response category are significantly different from chance.
  • Whether any of the proportions of counts in each response category are significantly different from each other.
  • None of the above.

Q5. Which of the following is the most proper way to report a Chi-Square test result?

  • χ²(1,20) = 4.12, p<.05
  • χ²(1,20) = 4.12, p=.04257
  • χ²(1,N=20) = 4.12, p<.05
  • χ²(1,N=20) = 4.12, p=.04257
  • None of the above.

Q6. What does “n.s.” mean in place of a p-value?

  • Non-significant
  • Not statistical
  • Insignificant
  • Nae significaté
  • Not shown

Q7. Which of the following best describes the main purpose for which we employ inferential statistical tests?

  • With statistical tests, we can prove that two things are different.
  • With statistical tests, we can prove that two things are equal.
  • With statistical tests, we can provide evidence that two things are different.
  • With statistical tests, we can provide evidence that two things are not detectably different.
  • With statistical tests, we can prove that two things are not detectably different.

Q8. As opposed to an asymptotic test, what does an exact test compute?

  • An exact p-value
  • An exact Chi value
  • An exact degrees of freedom
  • An exact binomial value
  • None of the above.

Q9. The binomial test is used in tests of proportions with two response categories.

  • True
  • False

Q10. The multinomial test generalizes the binomial test to more than two response categories.

  • True
  • False

Q11. For a one-sample test of proportions with four response categories, what would be the R vector of probabilities representing no significant preference (i.e., lack of any detectable preference) for any of the categories?

  • c(1/4, 1/4, 1/4)
  • c(1/3, 1/3, 1/3)
  • c(1/2, 1/4, 1/4)
  • c(1/2, 1/2)
  • c(1/4, 1/4, 1/4, 1/4)

Q12. An omnibus test is when two levels of a three-level factor are directly compared.

  • True
  • False

Q13. Which of the following are true statements about post hoc tests? (Mark all that apply.)

  • They are justified after a significant overall test.
  • They are justified after a significant omnibus test.
  • They can compare specific levels of a factor.
  • They are often pairwise comparisons, pitting one level of a factor against another.
  • They may be performed with a different statistical test than was used by the omnibus test.

Q14. Which is the best explanation for post hoc adjustments and why they are necessary?

  • Post hoc adjustments are adjustments to p-values designed to make them bigger.
  • Post hoc adjustments are adjustments for multiple comparisons designed to improve the chances of finding statistical significance.
  • Post hoc adjustments are adjustments for multiple comparisons designed to reduce the chances of incorrectly finding statistical significance, also known as a Type I error.
  • Post hoc adjustments are adjustments for multiple comparisons designed to reduce the chances of incorrectly failing to find statistical significance, also known as a Type II error.
  • Post hoc adjustments are adjustments to p-values designed to make them smaller.

Q15. The post hoc adjustment indicated with “holm” in R stands for what?

  • The Bonferroni correction
  • The Holm-Bonferroni school of statistical thought
  • Holm’s statistical test
  • Holm’s sequential Bonferroni procedure
  • None of the above.
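
As a rough illustration of how a sequential Bonferroni adjustment works, here is a minimal Python sketch (the course uses R, where the adjustment is requested by name). The function name holm_adjust and the raw p-values are hypothetical. The logic is intended to match what R's p.adjust computes with method "holm", but treat that equivalence as an assumption rather than a guarantee.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Keep the adjusted values monotone: never below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

Adjusted p-values are never smaller than the raw ones, which is how the procedure guards against falsely declaring significance across several comparisons.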

Q16. Which of the following indicate two-sample proportions? (Mark all that apply.) Hint: Do not confuse the number of samples with the number of categories within a sample. The number of categories within a sample has no bearing on the number of samples one has in the first place.

  • Designers’ choices among three options by highest degree attained
  • Designers’ choices among two options
  • Designers’ choices among three options, then take the top two, and then designers’ choices between those remaining two options
  • Designers’ choices among three options by citizenship
  • Designers’ choices among two options by highest degree attained and by citizenship

Q17. Which of the following are tests of proportions reviewed in lecture? (Mark all that apply.)

  • Chi-Square test
  • A/B test
  • Binomial test
  • G-test
  • Friedman test

Q18. The G-test can be thought of as a newer version of essentially which asymptotic test?

  • Chi-Square test
  • A/B test
  • Binomial test
  • t-test
  • Turing test

Quiz 2: Doing Tests of Proportions

Q1. Download the file deviceprefs.csv from the course materials. This file describes a study in which participants indicated their preference for touchpads or trackballs as computer input devices. Participants also had the option to disclose whether they have a disability. You will use R to analyze this file to answer the questions in this quiz. With this and every quiz in this course, you can find what you need by understanding and mimicking coursera.R, the R code file used in lecture. This first question gives you credit for getting R, RStudio, and deviceprefs.csv ready to go. Are you ready to proceed? (Hint: If you need help getting R, RStudio, or the course materials, be sure to read all posts on the README First! discussion forum.)

  • Yep, I’m ready to go!
  • Nope, I’m not ready.

Q2. How many subjects’ preferences were recorded? (Hint: Be sure you read and understand the comments about installing packages and loading libraries on lines 12-14 atop coursera.R. Also, be sure you have read the entire README First! discussion forum, starting with the README very first! thread. Also, it is vital to read this thread for understanding the answer code snippets revealed to you when you miss an answer.)

Enter answer here

Q3. Does the data table indicate a one-sample proportion or a two-sample proportion?

  • One-sample proportion
  • Two-sample proportion

Q4. The data table shows input device preferences of some people who have and who have not disclosed that they have a disability. How many people disclosed that they have a disability?

Enter answer here

Q5. Ignoring for a moment disability status, perform a one-sample Chi-Square test to see whether the proportion of subjects who preferred the trackball (or touchpad) differed significantly from chance. To the nearest hundredth (two digits), what is the Chi-Square statistic? Hint: Note that this question is not asking for the p-value!

Enter answer here
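
For readers who want to see the arithmetic behind a one-sample Chi-Square statistic, here is a minimal Python sketch (not the course's R code). The counts are invented and are not taken from deviceprefs.csv; the function name chi_square_statistic is hypothetical. With no expected probabilities supplied, every category is assumed equally likely, which is the "chance" baseline.

```python
def chi_square_statistic(observed, expected_probs=None):
    """Pearson Chi-Square statistic for one sample of category counts."""
    n = sum(observed)
    k = len(observed)
    if expected_probs is None:
        # Chance baseline: each of the k categories is equally likely.
        expected_probs = [1 / k] * k
    expected = [n * p for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 14 subjects prefer one device, 6 prefer the other.
counts = [14, 6]
stat = chi_square_statistic(counts)
df = len(counts) - 1
print(f"chi-square({df}, N={sum(counts)}) = {stat:.2f}")
```

The p-value would still have to be looked up from a Chi-Square distribution with the given degrees of freedom; this sketch computes only the statistic.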

Q6. For people without disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: Run a binomial test comparing the sum of rows of people without disabilities who prefer the touchpad, against the number of all rows of people without disabilities. With two possible preferences, touchpad and trackball, the chance probability would be 1/2. Do not correct for multiple comparisons; consider this a single test on a subset of the data.

Enter answer here

Q7. For people with disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: Run a binomial test comparing the sum of rows of people with disabilities who prefer the touchpad, against the number of all rows of people with disabilities. With two possible preferences, touchpad and trackball, the chance probability would be 1/2. Do not correct for multiple comparisons; consider this a single test on a subset of the data.

Enter answer here
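
The exact binomial test described in the two hints above can be sketched in a few lines of Python (the course uses R). The function name binomial_two_sided_p and the example counts are hypothetical, not values from deviceprefs.csv. The two-sided p-value here sums the probabilities of all outcomes that are no more likely than the observed one, which is assumed to be the same rule R's binom.test applies.

```python
from math import comb


def binomial_two_sided_p(successes, n, p=0.5):
    """Exact two-sided binomial p-value for `successes` out of `n` trials."""
    # Probability of every possible number of successes under the null.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance so outcomes tied with the observed one count.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical example: 8 of 10 subjects prefer the touchpad; chance is 1/2.
print(binomial_two_sided_p(8, 10))
```

Because the test enumerates every outcome, the p-value is exact rather than an asymptotic approximation.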

Q8. Conduct a two-sample Chi-Square test of proportions on preferences by disability status. To the nearest hundredth (two digits), what is the Chi-Square statistic?

Enter answer here

Q9. Perform a two-sample G-test on preferences by disability status. To the nearest hundredth (two digits), what is the G statistic? Hint: Use the RVAideMemoire library and its G.test function. (Note: Since there have been complications with installing RVAideMemoire, the answer to this question is given to you: 7.7933.)

Enter answer here

Q10. Perform Fisher’s exact test on preferences by disability status. To the nearest ten-thousandth (four digits), what is the p-value?

Enter answer here
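
Fisher's exact test on a 2x2 table can also be worked out by hand. The Python sketch below (the course uses R) treats the row and column totals as fixed and sums the hypergeometric probabilities of every table that is no more likely than the observed one. The function name fisher_exact_two_sided and the example table are hypothetical, and the table is not the deviceprefs.csv data.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = prob(a)
    total_p = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, total_p)


# Hypothetical table: rows are disability status, columns are preference.
print(fisher_exact_two_sided([[3, 1], [1, 3]]))
```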

Week 3: Understanding Experiment Designs

Q1. What might account for random error in an experimental measure?

  • Natural variation among and within subjects
  • A systematic flaw in the logging software
  • A pattern of dropped data for every fifth subject
  • Biased observations
  • None of the above.

Q2. Which of the following would be an ordinal response? (Mark all that apply.)

  • Responses on a Likert-type scale
  • Height in centimeters of each subject
  • Favorite color of each subject
  • How spicy each subject prefers their Thai food using 1-5 stars
  • The number of heads resulting from one-hundred coin flips

Q3. In an experiment, factors are the independent variables manipulated by the experimenter, and levels are the specific values a factor can take on.

  • True
  • False

Q4. A between-subjects factor is most precisely defined by which of the following characteristics?

  • Each subject experiences more than one level of the factor.
  • Each subject experiences only one level of the factor.
  • Each subject experiences all levels of the factor.
  • Each subject experiences all but one level of the factor.
  • None of the above.

Q5. A within-subjects factor is most precisely defined by which of the following characteristics?

  • Each subject experiences more than one level of the factor.
  • Each subject experiences only one level of the factor.
  • Each subject experiences all levels of the factor.
  • Each subject experiences all but one level of the factor.
  • None of the above.

Q6. If a given factor has four levels and subjects experience two of the four levels, that factor is most precisely described as:

  • A within-subjects factor
  • A between-subjects factor
  • A partial within-subjects factor
  • A partial between-subjects factor
  • None of the above.

Q7. Balanced experimental designs are where every subject experiences every level of every factor.

  • True
  • False

Q8. The most common use of an independent-samples t-test is to examine which of the following?

  • One set of subjects that all does the same thing.
  • One set of subjects that does two different things.
  • Two sets of subjects that do the exact same thing.
  • Two sets of subjects that do different things.
  • None of the above.

Q9. Which of the following is the most proper way to report a t-test result?

  • t(14) = 2.76, p=.015
  • t(14) = 2.76, p<.05
  • t(1,14) = 2.76, p=.015
  • t(1,14) = 2.76, p<.05
  • None of the above.

Q10. A t-test is a test suited to one factor with two levels.

  • True
  • False

Week 4: Designing, Running, and Analyzing Experiments

Quiz 1: Understanding Validity

Q1. What is experimental control?

  • Ensuring that nothing happens in an experiment without the experimenter knowing about it.
  • Ensuring that every subject gets to experience every condition in the experiment.
  • Ensuring that measures are made correctly and precisely.
  • Ensuring that systematic differences in observed responses can be attributed to systematic changes in manipulated factors.
  • None of the above.

Q2. Which of the following are examples of potential confounds? (Mark all that apply.)

  • In a website A/B test, every visitor was different from every other visitor.
  • In a website A/B test, designers all saw website “A” and scientists all saw website “B”.
  • In a website A/B test, every visitor hitting the site before noon saw website “A”, while every visitor hitting the site after noon saw website “B”.
  • In a website A/B test, site “A” was different from site “B”.
  • In a website A/B test, sites “A” and “B” were measured a second time with a new batch of visitors, just to be sure.

Q3. Generally speaking, ecological validity and experimental control cannot both be maximized.

  • True
  • False

Q4. Which of the following was not an option discussed in lecture for handling a potential confound?

  • Manipulate it — systematically vary it to see if doing so causes systematic changes in the response.
  • Control for it — ensure that its effects are spread evenly across all subjects.
  • Measure it — at least record its value so it can be later examined for possibly having had an effect.
  • Hide it — don’t let subjects encounter it in the first place.
  • All of the above are options.

Q5. Which of the following is not another term for the response in an experiment?

  • Dependent variable
  • Measure
  • Outcome
  • Y
  • Factor

Q6. Which of the following are assumptions of ANOVA? (Mark all that apply.)

  • Reliability of residuals
  • Normality
  • Homoscedasticity
  • Independence
  • Homogeneity of variance

Q7. Which of the following was not a common data distribution reviewed in lecture?

  • Normal
  • Lognormal
  • Bimodal
  • Exponential
  • Gamma
  • Poisson
  • Binomial
  • Multinomial

Q8. For what kind of experiment would a multinomial distribution be relevant?

  • For an experiment in which the response is categorical with more than two categories.
  • For an experiment in which the response is bimodal.
  • For an experiment in which the response is scalar.
  • For an experiment in which the response is Poisson.
  • None of the above.

Q9. Most precisely, parametric analyses differ from nonparametric analyses in what way?

  • Parametric analyses operate on ranks.
  • Parametric analyses make assumptions about the spread of data.
  • Parametric analyses make assumptions about the distribution of the response within the population.
  • Parametric analyses are easier to use.
  • None of the above.

Q10. Typically, an advantage of parametric analyses over nonparametric analyses is statistical power, i.e., the ability to detect differences.

  • True
  • False

Q11. Nonparametric analyses must meet the three assumptions of ANOVA.

  • True
  • False

Q12. Nonparametric analyses typically operate on ranks.

  • True
  • False

Quiz 2: Doing Tests of Assumptions

Q1. Download the file designtime.csv from the course materials. This file describes a study in which designers used Adobe Illustrator or Adobe InDesign to create a benchmark set of classic children’s illustrations. The amount of time they took was recorded, in minutes. How many subjects took part in this study?

Enter answer here

Q2. Create a boxplot of the task time data for each tool. At a glance, which of the following conclusions seems to be most likely?

  • Illustrator and InDesign have similar median task times, with similar variances.
  • Illustrator has a higher median task time than InDesign, with similar variances.
  • Illustrator has a higher median task time than InDesign, with dissimilar variances.
  • InDesign has a higher median task time than Illustrator, with similar variances.
  • InDesign has a higher median task time than Illustrator, with dissimilar variances.

Q3. Conduct a Shapiro-Wilk test on the Time response for each of the tools. To the nearest ten-thousandth (four digits), what is the p-value of this test for Illustrator?

Enter answer here

Q4. Conduct a Shapiro-Wilk normality test on the residuals of Time by Tool. To the nearest ten-thousandth (four digits), what is the W value displayed? Hint: Use aov to fit a model and then run shapiro.test on the model residuals.

Enter answer here

Q5. In light of your normality tests, would you conclude the data do or do not violate normality?

  • The data do violate normality.
  • The data do not violate normality.

Q6. Conduct a Brown-Forsythe test of homoscedasticity. To the nearest hundredth (two digits), what is the F statistic for the test? Hint: Use the car library and its leveneTest function with center=median.

Enter answer here

Q7. Fit a lognormal distribution to the Time response of each of the design tools. Conduct a Kolmogorov-Smirnov goodness-of-fit test. To the nearest ten-thousandth (four digits), what is the exact p-value of the test for the Illustrator data? Hint: Use the MASS library and its fitdistr function with "lognormal" to acquire a fit estimate. Then use ks.test with "plnorm" passing the acquired fit values as meanlog and sdlog. Request an exact fit.

Enter answer here

Q8. Create a new column that is the log-transformed Time response. Compute the mean of this log-transformed response for each drawing tool. To the nearest hundredth (two digits), what is the mean of the log-transformed response for InDesign?

Enter answer here

Q9. Conduct an independent-samples t-test on the log-transformed Time response. Use the Welch version for unequal variances. To the nearest hundredth (two digits), what is the t statistic for the test?

Enter answer here
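
The two steps in the preceding questions, log-transforming the response and then running a Welch t-test, can be sketched in Python as follows (the course uses R). The task times are invented and are not from designtime.csv; the function name welch_t is hypothetical. The sketch returns the t statistic and the Welch-Satterthwaite degrees of freedom, but not the p-value.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(sample_a), len(sample_b)
    # Each group's contribution to the squared standard error.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (na - 1) + se2_b**2 / (nb - 1))
    return t, df


# Hypothetical task times in minutes, invented for illustration.
times_a = [12.0, 15.0, 20.0, 31.0]
times_b = [25.0, 40.0, 38.0, 70.0, 55.0]
log_a = [math.log(x) for x in times_a]
log_b = [math.log(x) for x in times_b]
t, df = welch_t(log_a, log_b)
print(f"t({df:.2f}) = {t:.2f}")
```

Unlike the equal-variance t-test, the degrees of freedom here are generally not a whole number, which is why Welch results are often reported with fractional df.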

Q10. As an alternative to log-transforming the Time response, leave Time as it is and conduct an exact nonparametric Mann-Whitney U test on it. To the nearest ten-thousandth (four digits), what is the Z statistic that results from this test? Hint: Use the coin library and its wilcox_test function with distribution="exact".

Enter answer here

Week 5: Designing, Running, and Analyzing Experiments

Quiz 1: Understanding Oneway Designs

Q1. The issue that requires an experimenter to use a oneway ANOVA instead of a t-test is when there are more than two response categories available.

  • True
  • False

Q2. Which of the following is the equivalent nonparametric analysis to a parametric oneway ANOVA?

  • F-test
  • t-test
  • Kruskal-Wallis test
  • Mann-Whitney U test
  • None of the above.

Q3. Typically, an ANOVA uses which distribution and test statistic?

  • F
  • t
  • Chi-Square
  • Kolmogorov-Smirnov
  • Poisson

Q4. If an omnibus oneway ANOVA for a three-level factor is statistically significant, it does not mean that post hoc pairwise comparisons are allowed.

  • True
  • False

Q5. Which of the following is the most proper way to report an F-test result?

  • F(14) = 9.06, p=.009
  • F(14) = 9.06, p<.01
  • F(1,14) = 9.06, p=.009
  • F(1,14) = 9.06, p<.01
  • None of the above.

Q6. A oneway ANOVA is characterized by which experimental design?

  • An experiment with a single between-subjects factor of exactly two levels.
  • An experiment with a single between-subjects factor of two or more levels.
  • An experiment with a single within-subjects factor of exactly two levels.
  • An experiment with a single within-subjects factor of two or more levels.
  • None of the above.

Q7. In a between-subjects experiment, each participant uses only one of the systems being compared. True or False? Select one.

  • True
  • False

Quiz 2: Doing Oneway ANOVAs

Q1. Download the file alphabets.csv from the course materials. This file describes a study in which people used a pen-based stroke alphabet to enter a set of text phrases. How many different stroke alphabets are being compared?

Enter answer here

Q2. To the nearest hundredth (two digits), what was the average text entry speed in words per minute (WPM) of the EdgeWrite alphabet?

Enter answer here

Q3. Conduct Shapiro-Wilk normality tests on the WPM response for each Alphabet. Which of the following, if any, violate the normality test? (Mark all that apply.)

  • Unistrokes
  • Graffiti
  • EdgeWrite
  • None of the above.

Q4. Conduct a Shapiro-Wilk normality test on the residuals of a WPM by Alphabet model. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Fit a model with aov and then run shapiro.test on the model residuals.

Enter answer here

Q5. Conduct a Brown-Forsythe homoscedasticity test on WPM by Alphabet. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Use the car library and its leveneTest function with center=median.

Enter answer here

Q6. Conduct a oneway ANOVA on WPM by Alphabet. To the nearest hundredth (two digits), what is the F statistic from such a test?

Enter answer here
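
To make the F statistic less of a black box, here is a minimal Python sketch of the oneway between-subjects ANOVA arithmetic (the course uses R). The function name oneway_f and the example scores are hypothetical and are not from alphabets.csv. F is the ratio of the between-group mean square to the within-group mean square.

```python
from statistics import mean


def oneway_f(groups):
    """Return (F, df_between, df_within) for a oneway between-subjects ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical scores for three groups, invented for illustration.
f, df1, df2 = oneway_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f"F({df1},{df2}) = {f:.2f}")
```

The two degrees of freedom returned are the ones that appear in parentheses when an F-test result is reported.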

Q7. Perform simultaneous pairwise comparisons among levels of Alphabet using the Tukey approach. Adjust for multiple comparisons using Holm’s sequential Bonferroni procedure. To the nearest ten-thousandth (four digits), what is the corrected p-value for the comparison of Unistrokes to Graffiti? Hint: Use the multcomp library and its mcp function called from within its glht function.

Enter answer here

Q8. According to the results of the simultaneous pairwise comparisons, which of the following levels of Alphabet are significantly different in terms of WPM? (Mark all that apply.)

  • Unistrokes vs. Graffiti
  • Unistrokes vs. EdgeWrite
  • Graffiti vs. EdgeWrite
  • None of the above.

Q9. Conduct a Kruskal-Wallis test on WPM by Alphabet. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Use the coin library and its kruskal_test function with distribution="asymptotic".

Enter answer here

Q10. Conduct nonparametric post hoc pairwise comparisons of WPM among all levels of Alphabet manually using three separate Mann-Whitney U tests. Adjust the p-values using Holm’s sequential Bonferroni procedure. To the nearest ten-thousandth (four digits), what is the corrected p-value for Unistrokes vs. Graffiti? Hint: The coin library’s wilcox_test only takes a model formula specification. For this, you need wilcox.test with paired=FALSE (and to avoid warnings, exact=FALSE).

Enter answer here

Week 6: Designing, Running, and Analyzing Experiments

Quiz 1: Understanding Oneway Repeated Measures Designs

Q1. What primarily distinguishes a oneway repeated measures ANOVA from a oneway ANOVA?

  • The presence of multiple factors.
  • The presence of a between-subjects factor.
  • The presence of a within-subjects factor.
  • The presence of both between- and within-subjects factors.
  • None of the above.

Q2. All else being equal, which of the following is a reason to use a within-subjects factor instead of a between-subjects factor?

  • The data is more reliable.
  • The data exhibits less variance.
  • The factors are easier to analyze.
  • The exposure to confounds is less.
  • Less time from each subject is required.

Q3. In a repeated measures experiment, why should we encode an Order factor and test whether it is statistically significant? (Mark all that apply.)

  • To examine whether the presentation order of conditions exerts a statistically significant effect on the response.
  • To examine whether any counterbalancing strategies we used were effective.
  • To examine whether an order confound has affected our results.
  • To examine whether our factors cause changes in our response.
  • To examine whether our experiment discovered any differences.

Q4. How many subjects would be needed to fully counterbalance a repeated measures factor with four levels?

  • 4
  • 8
  • 16
  • 24
  • 32
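
Full counterbalancing means using every possible presentation order of the levels, so the number of orders is the factorial of the number of levels. A minimal Python sketch, with hypothetical level names A through D:

```python
from itertools import permutations
from math import factorial

# Hypothetical labels for the four levels of a within-subjects factor.
levels = ["A", "B", "C", "D"]

# Every possible presentation order of the four levels.
orders = list(permutations(levels))

# The count of orders equals 4! = 4 * 3 * 2 * 1.
print(len(orders), factorial(len(levels)))
```

Each order needs at least one subject, so the number of orders is also the minimum number of subjects (or a multiple of it) for a fully counterbalanced design.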

Q5. For an even number of conditions, a balanced Latin Square contains more sequences than a Latin Square.

  • True
  • False

Q6. For a within-subjects factor of five levels, a balanced Latin Square would distribute which of the following number of subjects evenly across all sequences?

  • 5
  • 15
  • 20
  • 25
  • 35
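
One common way to build a balanced Latin Square is sketched below in Python. This is a standard textbook construction and is an assumption here, not necessarily the one shown in lecture; the function name balanced_latin_square is hypothetical. For an even number of levels it yields as many sequences as levels; for an odd number it appends the mirror image of each sequence, doubling the count.

```python
def balanced_latin_square(n):
    """Return a list of sequences (lists of level indices 0..n-1)."""
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    take_low = True
    while len(first) < n:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low
    # Every other row shifts the first row by a constant, modulo n.
    rows = [[(c + shift) % n for c in first] for shift in range(n)]
    if n % 2 == 1:
        # Odd n: add each row reversed so first-order carryover balances out.
        rows += [list(reversed(r)) for r in rows]
    return rows


for sequence in balanced_latin_square(5):
    print(sequence)
```

Subjects are distributed evenly when their number is a multiple of the number of sequences the function returns.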

Q7. Which is the key property of a long-format data table?

  • Each row contains only one data point per response for a given subject.
  • Each row contains all of the data points per response for a given subject.
  • Each row contains all of the dependent variables for a given subject.
  • Multiple columns together encode all levels of a single factor.
  • Multiple columns together encode all measures for a given subject.
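
The difference between wide and long format is easiest to see by converting one into the other. In the Python sketch below (the course uses R), the column names and values are invented for illustration and the function name wide_to_long is hypothetical. The wide table spreads the levels of one factor across columns; the long table gives each measurement its own row.

```python
def wide_to_long(wide_rows, id_column, level_columns, factor_name, response_name):
    """Turn one row per subject into one row per subject-level measurement."""
    long_rows = []
    for row in wide_rows:
        for level in level_columns:
            long_rows.append({
                id_column: row[id_column],
                factor_name: level,          # which level this row describes
                response_name: row[level],   # the single measurement for it
            })
    return long_rows


# Hypothetical wide-format data: one row per subject, one column per level.
wide = [
    {"Subject": 1, "EngineA": 12, "EngineB": 9},
    {"Subject": 2, "EngineA": 15, "EngineB": 11},
]
long_table = wide_to_long(wide, "Subject", ["EngineA", "EngineB"], "Engine", "Searches")
for row in long_table:
    print(row)
```

Two subjects with two levels each produce four long-format rows, each holding exactly one response value.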

Q8. Which is not a reason why Likert-type responses often do not satisfy the assumptions of ANOVA for parametric analyses?

  • Despite having numbers on a scale, the response is not actually numeric.
  • Responses may violate normality.
  • The response distribution cannot be calculated.
  • The response is ordinal.
  • The response is bound to within, say, a 5- or 7-point scale.

Q9. When is the Greenhouse-Geisser correction necessary?

  • When a within-subjects factor of 2+ levels violates sphericity
  • When a within-subjects factor of 2+ levels exhibits sphericity
  • When a within-subjects factor of 3+ levels violates sphericity
  • When a within-subjects factor of 3+ levels exhibits sphericity
  • None of the above.

Q10. If an omnibus Friedman test is non-significant, post hoc pairwise comparisons should be carried out with Wilcoxon signed-rank tests.

  • True
  • False

Quiz 2: Doing Oneway Repeated Measures ANOVAs

Q1. Download the file websearch2.csv from the course materials. This file describes a study in which participants were asked to find 100 distinct facts on the web using different search engines. The number of searches required and a subjective effort rating for each search engine were recorded. How many participants took part in this experiment?

Enter answer here

Q2. To the nearest hundredth (two digits), what was the average number of searches required for the search engine that had the greatest average overall?

Enter answer here

Q3. Conduct an order effect test on Searches using a paired-samples t-test assuming equal variances. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Use the reshape2 library and the dcast function to create a wide-format table with columns for each level of Order.

Enter answer here

Q4. Conduct a paired-samples t-test, assuming equal variances, on Searches by Engine. To the nearest hundredth (two digits), what is the absolute value of the t statistic for such a test? Hint: Use the reshape2 library and the dcast function to create a wide-format table with columns for each level of Engine.

Enter answer here

Q5. Conduct a nonparametric Wilcoxon signed-rank test on the Effort Likert-type ratings. Calculate an exact p-value. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Use the coin library and its wilcoxsign_test function with distribution="exact".

Enter answer here

Q6. Download the file websearch3.csv from the course materials. This file describes a study just like the one from websearch2.csv, except that now three search engines were used instead of two. Once again, the number of searches required and a subjective effort rating for each search engine were recorded. How many subjects took part in this new experiment?

Enter answer here

Q7. To the nearest hundredth (two digits), what was the average number of searches required for the search engine that had the greatest average overall?

Enter answer here

Q8. Conduct a repeated measures ANOVA to determine if there was an order effect on Searches. First determine whether there is a violation of sphericity. To the nearest ten-thousandth (four digits), what is the value of Mauchly’s W criterion? Hint: Use the ez library and its ezANOVA function passing within=Order, among other things, to test for order effects.

Enter answer here

Q9. Interpret the result of Mauchly’s test of sphericity, and then interpret the appropriate repeated measures ANOVA result. To the nearest ten-thousandth (four digits), what is the p-value from the appropriate F-test?

Enter answer here

Q10. Conduct a repeated measures ANOVA on Searches by Engine. First determine whether there is a violation of sphericity. To the nearest ten-thousandth (four digits), what is the value of Mauchly’s W criterion? Hint: Use the ez library and its ezANOVA function passing within=Engine, among other things, to test for a significant main effect.

Enter answer here

Q11. Interpret the result of Mauchly’s test of sphericity, and then interpret the appropriate repeated measures ANOVA result. To the nearest ten-thousandth (four digits), what is the p-value from the appropriate F-test?

Enter answer here

Q12. Strictly speaking, given the result of the repeated measures ANOVA examining Searches by Engine, are post hoc pairwise comparisons among levels of Engine warranted?

  • Yes
  • No

Q13. Whatever your previous answer, proceed to do post hoc pairwise comparisons. Conduct manual pairwise comparisons of Searches among levels of Engine using paired-samples t-tests, assuming equal variances and using Holm’s sequential Bonferroni procedure to correct for multiple comparisons. To the nearest ten-thousandth (four digits), what is the smallest corrected p-value resulting from this set of tests? Hint: Use the reshape2 library and dcast function to create a wide-format table.

Enter answer here
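The workflow in the Q13 hint (reshape to wide format, run paired-samples t-tests, then apply Holm's correction) can be sketched as follows. This is a minimal illustration on made-up numbers, not the course's websearch3.csv: the data frame `long`, the engine labels A, B, and C, and all values are hypothetical, and base R's reshape function stands in for reshape2's dcast so that the sketch runs without extra packages.

```r
# Hypothetical long-format data: 5 subjects x 3 engines (not the course data).
long <- data.frame(
  Subject  = factor(rep(1:5, each = 3)),
  Engine   = factor(rep(c("A", "B", "C"), times = 5)),
  Searches = c(150, 160, 172,  148, 155, 180,  152, 161, 169,
               149, 158, 175,  151, 157, 171)
)

# Wide format: one row per subject, one Searches column per engine.
# Base R reshape() is used here in place of reshape2::dcast.
wide <- reshape(long, idvar = "Subject", timevar = "Engine", direction = "wide")

# Three paired-samples t-tests, one per pair of engines.
p.ab <- t.test(wide$Searches.A, wide$Searches.B, paired = TRUE)$p.value
p.ac <- t.test(wide$Searches.A, wide$Searches.C, paired = TRUE)$p.value
p.bc <- t.test(wide$Searches.B, wide$Searches.C, paired = TRUE)$p.value

# Holm's sequential Bonferroni correction over the family of three tests.
p.holm <- p.adjust(c(AB = p.ab, AC = p.ac, BC = p.bc), method = "holm")
print(p.holm)
print(min(p.holm))
```

Passing all three uncorrected p-values to p.adjust in one call matters, because Holm's procedure scales each p-value according to its rank within the whole family.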

Q14. Conduct a nonparametric Friedman test on the Effort Likert-type ratings. Calculate an asymptotic p-value. To the nearest ten-thousandth (four digits), what is the Chi-Square statistic from such a test? Hint: Use the coin library and the friedman_test function.

Enter answer here

Q15. Strictly speaking, given the result of the Friedman test examining Effort by Engine, are post hoc pairwise comparisons among levels of Engine warranted?

  • Yes
  • No

Q16. Whatever your previous answer, proceed to do post hoc pairwise comparisons. Conduct manual pairwise comparisons of Effort among levels of Engine with Wilcoxon signed-rank tests, using Holm’s sequential Bonferroni procedure to correct for multiple comparisons. To the nearest ten-thousandth (four digits), what is the smallest corrected p-value resulting from this set of tests? Hint: Use the reshape2 library and dcast function to create a wide-format table. Then use the wilcox.test function with paired=TRUE (and to avoid warnings, exact=FALSE).

Enter answer here

Week 7: Designing, Running, and Analyzing Experiments

Quiz 1: Understanding Factorial Designs

Q1. Interaction effects explore which of the following?

  • How the response changes as factors change
  • How the response is affected by levels of a factor
  • How the response is differentially affected by levels of one factor based on levels of another factor
  • How the response is differentially affected by levels of one factor depending on the different levels the factor can take on
  • None of the above.

Q2. Which of the following is a description of a 3 × 2 factorial design? (Mark all that apply.)

  • Comparison of task completion times using each of three drawing tools while at a standing desk and a sitting desk
  • Comparison of task completion times while working at one of three desk types with one of two drawing tools
  • Comparison of task completion times using a drawing tool at a desk
  • Comparison of task completion times using a drawing tool at a standing or sitting desk three times
  • Comparison of task completion times using a drawing tool at a standing or sitting desk

Q3. Parallel lines on an interaction plot often indicate the presence of a statistically significant interaction effect.

  • True
  • False

Q4. Assuming small variances in comparison to means, which effects are most likely present given the following interaction plot? (Mark all that apply.)

  • A significant main effect of Numbers
  • A significant main effect of Letters
  • A significant Numbers × Letters interaction

Q5. Assuming small variances in comparison to means, which effects are most likely present given the following interaction plot? (Mark all that apply.)

  • A significant main effect of Numbers
  • A significant main effect of Letters
  • A significant Numbers × Letters interaction

Q6. Assuming small variances in comparison to means, which effects are most likely present given the following interaction plot? (Mark all that apply.)

  • A significant main effect of Numbers
  • A significant main effect of Letters
  • A significant Numbers × Letters interaction

Q7. Assuming small variances in comparison to means, which effects are most likely present given the following interaction plot? (Mark all that apply.)

  • A significant main effect of Numbers
  • A significant main effect of Letters
  • A significant Numbers × Letters interaction

Q8. Assuming small variances in comparison to means, which effects are most likely present given the following interaction plot? (Mark all that apply.)

  • A significant main effect of Numbers
  • A significant main effect of Letters
  • A significant Numbers × Letters interaction

Q9. Assuming small variances in comparison to means, which effects are most likely present given the following interaction plot? (Mark all that apply.)

  • A significant main effect of Numbers
  • A significant main effect of Letters
  • A significant Numbers × Letters interaction

Q10. The aligned rank transform (ART) generally refers to which procedure?

  • For each main and interaction effect, align the data and conduct an ANOVA.
  • For each main and interaction effect, rank the data and conduct an ANOVA.
  • For each main and interaction effect, align the data, rank it, and conduct an ANOVA.
  • For each main and interaction effect, rank the data, align it, and conduct an ANOVA.
  • None of the above.

Quiz 2: Doing Factorial ANOVAs

Q1. Download the file avatars.csv from the course materials. This file describes a study in which tall and short participants were shown a single virtual human avatar that was itself rendered as either tall or short. Participants were asked to craft a persona and write a day-in-the-life scenario for that avatar. The number of positive sentiments expressed was counted by a panel of judges who knew neither the height of the participant nor of the avatar. Examine the data. What kind of experiment design was this?

  • A 2×2 between-subjects design with factors for Height (tall, short) and Avatar (tall, short).
  • A 2×2 within-subjects design with factors for Height (tall, short) and Avatar (tall, short).
  • A 2×2 mixed factorial design with a between-subjects factor for Height (tall, short) and a within-subjects factor for Avatar (tall, short).
  • None of the above.

Q2. How many subjects took part in this experiment?

Enter answer here

Q3. To the nearest hundredth (two digits), on average how many positive sentiments were expressed for the most positive combination of Height and Avatar?

Enter answer here

Q4. Create an interaction plot with Height on the X-axis and Avatar as the traces. Do the lines cross?

  • Yes
  • No
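One way to answer a "do the lines cross?" question is to draw the plot and then confirm the answer from the cell means that the plot displays. The sketch below assumes a made-up 2×2 between-subjects data frame; the values are hypothetical and are not those in avatars.csv.

```r
# Hypothetical 2x2 between-subjects data (not the course's avatars.csv).
df <- data.frame(
  Height    = factor(rep(c("short", "tall"), each = 6)),
  Avatar    = factor(rep(rep(c("short", "tall"), each = 3), times = 2)),
  Positives = c(80, 84, 82,   98, 102, 100,   # short participants
                105, 99, 102,  88, 90, 92)    # tall participants
)

# Interaction plot: Height on the X-axis, one trace per Avatar level.
with(df, interaction.plot(Height, Avatar, Positives,
                          ylab = "Mean Positives"))

# The plotted points are the cell means; rows = Height, columns = Avatar.
cell.means <- with(df, tapply(Positives, list(Height, Avatar), mean))
print(cell.means)

# With two X-axis levels, the two traces cross when the gap between them
# changes sign from the first X level to the second.
gap <- cell.means[, "tall"] - cell.means[, "short"]
lines.cross <- unname(gap[1] * gap[2] < 0)
print(lines.cross)
```

Swapping the first two arguments of interaction.plot puts the other factor on the X-axis, which is what the follow-up question asks for.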

Q5. Create an interaction plot with Avatar on the X-axis and Height as the traces. Do the lines cross?

  • Yes
  • No

Q6. Conduct a factorial ANOVA on Positives by Height and Avatar. To the nearest hundredth (two digits), what is the largest F statistic from such a test? Hint: Use the ez library and its ezANOVA function. Pass both Height and Avatar as the between parameter using a vector created with the “c” function. If you aren’t sure how to use the “c” function, look it up, as with any function, by using the question-mark operator, like so:

?c

Enter answer here

Q7. Which effects are statistically significant in the factorial ANOVA of Positives by Height and Avatar? (Mark all that apply.)

  • Main effect of Height
  • Main effect of Avatar
  • Height × Avatar interaction
  • None of the above.

Q8. Conduct two planned pairwise comparisons using independent-samples t-tests. The first question is whether short participants produced different numbers of positive sentiments for tall avatars versus short avatars. The second question is whether tall participants produced different numbers of positive sentiments for tall avatars versus short avatars. Assuming equal variances, and using Holm’s sequential Bonferroni procedure to correct for multiple comparisons, what to within a ten-thousandth (four digits) is the lowest corrected p-value from these tests?

Hint: You will need conjunctions with ampersands (&) to select the necessary rows for your t.test functions. By now, you have seen certain responses (Y) obtained by selecting rows with a single factor (X) matching a given level value:

df[df$X == "value",]$Y


Now, you can see how to request responses (Y) from rows having two factors (X1, X2) that match certain level values. And with more ampersands (&), you can easily expand to three factors, four factors, etc.:

df[df$X1 == "value1" & df$X2 == "value2",]$Y

Enter answer here
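Putting the row-selection idiom together with t.test and p.adjust, the two planned comparisons might be sketched like this. The data frame and its values are hypothetical stand-ins for avatars.csv, and the variable names (short.tall, p1, and so on) are introduced only for illustration.

```r
# Hypothetical 2x2 between-subjects data (not the course's avatars.csv).
df <- data.frame(
  Height    = rep(c("short", "tall"), each = 8),
  Avatar    = rep(rep(c("short", "tall"), each = 4), times = 2),
  Positives = c(80, 84, 82, 86,    98, 102, 100, 95,
                105, 99, 102, 97,  88, 90, 92, 101)
)

# Planned comparison 1: short participants, tall vs. short avatars.
short.tall  <- df[df$Height == "short" & df$Avatar == "tall",  ]$Positives
short.short <- df[df$Height == "short" & df$Avatar == "short", ]$Positives
p1 <- t.test(short.tall, short.short, var.equal = TRUE)$p.value

# Planned comparison 2: tall participants, tall vs. short avatars.
tall.tall  <- df[df$Height == "tall" & df$Avatar == "tall",  ]$Positives
tall.short <- df[df$Height == "tall" & df$Avatar == "short", ]$Positives
p2 <- t.test(tall.tall, tall.short, var.equal = TRUE)$p.value

# Holm's sequential Bonferroni correction over the two planned tests.
p.holm <- p.adjust(c(p1, p2), method = "holm")
print(p.holm)
```

Setting var.equal = TRUE requests the pooled-variance (Student's) t-test that the question asks for; t.test otherwise defaults to Welch's test.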

Q9. Which of the following conclusions are supported by the examination of means and the planned pairwise comparisons just conducted? (Mark all that apply.)

  • Short participants made significantly more positive sentiments about tall avatars than they did about short avatars.
  • Short participants made significantly more positive sentiments about short avatars than they did about tall avatars.
  • Tall participants made significantly more positive sentiments about tall avatars than they did about short avatars.
  • Tall participants made significantly more positive sentiments about short avatars than they did about tall avatars.
  • None of the above.

Q10. Download the file notes.csv from the course materials. This file describes a study in which iPhone and Android smartphone owners used their phone’s built-in note-taking app and then switched to an add-on third-party app, or vice-versa. The number of words they wrote in their notes apps over the course of a week was recorded. Examine the data and indicate what kind of experiment design this was.

  • A 2 × 2 between-subjects design with factors for Phone (iPhone, Android) and Notes (Built-in, Add-on).
  • A 2 × 2 within-subjects design with factors for Phone (iPhone, Android) and Notes (Built-in, Add-on).
  • A 2 × 2 mixed factorial design with a between-subjects factor for Phone (iPhone, Android) and a within-subjects factor for Notes (Built-in, Add-on).
  • None of the above.

Q11. How many subjects took part in this experiment?

Enter answer here

Q12. To the nearest whole number, on average how many words were recorded with the most heavily used combination of Phone and Notes?

Enter answer here

Q13. Create an interaction plot with Phone on the X-axis and Notes as the traces. Do the lines cross?

  • Yes
  • No

Q14. Create an interaction plot with Notes on the X-axis and Phone as the traces. Do the lines cross?

  • Yes
  • No

Q15. Conduct a factorial ANOVA to test for any order effect that the presentation order of the Notes factor may have had. To the nearest ten-thousandth (four digits), what is the p-value for the Order factor from such a test? Hint: Use the ez library and its ezANOVA function, passing one between parameter and Order as the within parameter.

Enter answer here

Q16. In our test of possible order effects, Mauchly’s test of sphericity is irrelevant because our within-subjects factor only has two levels, which cannot present a sphericity violation.

  • True
  • False

Q17. Conduct a factorial ANOVA on Words by Phone and Notes. To the nearest hundredth (two digits), what is the largest F statistic produced by such a test? Hint: Use the ez library and its ezANOVA function, passing one between parameter and one within parameter.

Enter answer here

Q18. Conduct two planned pairwise comparisons using paired-samples t-tests. The first question is whether iPhone users entered different numbers of words using the built-in notes app versus the add-on notes app. The second question is whether Android users entered different numbers of words using the built-in notes app versus the add-on notes app. Assuming equal variances, and using Holm’s sequential Bonferroni procedure to correct for multiple comparisons, what to within a ten-thousandth (four digits) is the lowest p-value from these tests? Hint: Use the reshape2 library and its dcast function to make a wide-format table with columns for Subject, Phone, Add-on, and Built-in, and then within each Phone type, do a paired-samples t-test between the Add-on and Built-in columns.

Enter answer here

Q19. Which of the following conclusions are supported by the planned pairwise comparisons just conducted? (Mark all that apply.)

  • Android users entered significantly more words using the built-in notes app than the add-on notes app.
  • Android users entered significantly more words using the add-on notes app than the built-in notes app.
  • iPhone users entered significantly more words using the built-in notes app than the add-on notes app.
  • iPhone users entered significantly more words using the add-on notes app than the built-in notes app.
  • None of the above.

Q20. Download the file socialvalue.csv from the course materials. This file describes a study of people viewing a positive or negative film clip before going onto social media and then judging the value of the first 100 posts they see there. The number of valued posts was recorded. Examine the data and indicate what kind of experiment design this was.

  • A 2 × 2 between-subjects design with factors for Clip (positive, negative) and Social (Facebook, Twitter).
  • A 2 × 2 within-subjects design with factors for Clip (positive, negative) and Social (Facebook, Twitter).
  • A 2 × 2 mixed factorial design with a between-subjects factor for Clip (positive, negative) and a within-subjects factor for Social (Facebook, Twitter).
  • A 2 × 2 mixed factorial design with a within-subjects factor for Clip (positive, negative) and a between-subjects factor for Social (Facebook, Twitter).
  • None of the above.

Q21. How many subjects took part in this experiment?

Enter answer here

Q22. To the nearest hundredth (two digits), on average how many posts out of 100 were valued for the most valued combination of Clip and Social?

Enter answer here

Q23. Create an interaction plot with Social on the X-axis and Clip as the traces. Do the lines cross?

  • Yes
  • No

Q24. Create an interaction plot with Clip on the X-axis and Social as the traces. Do the lines cross?

  • Yes
  • No

Q25. Conduct a factorial ANOVA to test for any order effects that the presentation order of the Clip factor and/or the Social factor may have had. To the nearest ten-thousandth (four digits), what is the p-value for the ClipOrder main effect? Hint: Use the ez library and its ezANOVA function. Pass both ClipOrder and SocialOrder as the within parameter using a vector created with the “c” function.

Enter answer here

Q26. Conduct a factorial ANOVA on Valued by Clip and Social. To the nearest hundredth (two digits), what is the largest F statistic produced by such a test? Hint: Use the ez library and its ezANOVA function. Pass both Clip and Social as the within parameter using a vector created with the “c” function.

Enter answer here

Q27. Conduct two planned pairwise comparisons using paired-samples t-tests. The first question is whether on Facebook, the number of valued posts was different after people saw a positive film clip versus a negative film clip. The second question is whether on Twitter, the number of valued posts was different after people saw a positive film clip versus a negative film clip. Assuming equal variances, and using Holm’s sequential Bonferroni procedure to correct for multiple comparisons, what to within a ten-thousandth (four digits) is the lowest p-value from these tests? Hint: Use the reshape2 library and its dcast function to make a wide-format table with columns for Subject and the combination of Social × Clip, and then do a paired-samples t-test between columns with the same Social level.

Enter answer here

Q28. Which of the following conclusions are supported by a comparison of means and the planned pairwise comparisons just conducted? (Mark all that apply.)

  • On Facebook, people valued significantly more posts after seeing a positive film clip than a negative film clip.
  • On Facebook, people valued significantly more posts after seeing a negative film clip than a positive film clip.
  • On Twitter, people valued significantly more posts after seeing a positive film clip than a negative film clip.
  • On Twitter, people valued significantly more posts after seeing a negative film clip than a positive film clip.
  • None of the above.

Q29. Continue using the file socialvalue.csv from the course materials. Conduct a nonparametric Aligned Rank Transform procedure on Valued by Clip and Social. To the nearest hundredth (two digits), what is the largest F statistic produced by this procedure? Hint: Use the ARTool library and its art function with the formula:

Valued ~ Clip * Social + (1|Subject)

The above formula indicates that Subject is to be treated as a random effect. (Random effects will be covered in Module 9.)

Enter answer here

Q30. For question 23, you created an interaction plot with Social on the X-axis, Clip as the traces, and Valued on the Y-axis. Create this plot again. Is the number of valued social media posts after viewing positive and negative film clips closer for Facebook or Twitter?

  • Facebook
  • Twitter

Q31. Compare the number of valued posts on Facebook after viewing positive and negative film clips using post hoc contrast tests. To the nearest ten-thousandth (four digits), what is the p-value for this comparison? Correct for multiple comparisons in the family of contrasts using Holm’s sequential Bonferroni procedure. Hint: Use the Aligned Rank Transform contrast testing function art.con. Assuming “m” is the ART model you built in question 29, this is the code to use:

art.con(m, ~ Clip*Social, adjust="holm")

Enter answer here

Q32. Compare the number of valued posts on Twitter after viewing positive and negative film clips using post hoc contrast tests. To the nearest ten-thousandth (four digits), what is the p-value for this comparison? Correct for multiple comparisons in the family of contrasts using Holm’s sequential Bonferroni procedure. Hint: Use the Aligned Rank Transform contrast testing function art.con. Assuming “m” is the ART model you built in question 29, this is the code to use:

art.con(m, ~ Clip*Social, adjust="holm")

Enter answer here

Week 8: Designing, Running, and Analyzing Experiments

Quiz 1: Understanding Generalized Linear Models

Q1. What do generalized linear models (GLMs) generalize?

  • The linear model, which encompasses the ANOVA
  • The linear model, which is a subset of the ANOVA
  • The general model, which supersedes the ANOVA
  • The general model, which is a subset of the ANOVA
  • None of the above.

Q2. Generalized linear models (GLMs) handle only between-subjects factors.

  • True
  • False

Q3. Poisson regression is an example of a generalized linear model (GLM) with a Poisson distribution for the response and a log link function.

  • True
  • False

Q4. Which of the following is not an example of a generalized linear model (GLM)?

  • Poisson regression
  • Binomial regression
  • Gamma regression
  • Ordinal logistic regression
  • All are GLMs.

Q5. The link function in a generalized linear model (GLM) most precisely relates what to what?

  • Factors to each of the responses
  • Factors to the mean of the response
  • Factors to the distribution of the response
  • Factors to the error in the response
  • None of the above.

Q6. Nominal logistic regression can also be known as multinomial regression.

  • True
  • False

Q7. Multinomial regression with the cumulative logit link function is also known as:

  • Nominal logistic regression
  • Ordinal logistic regression
  • Poisson regression
  • Binomial regression
  • None of the above.

Q8. Poisson regression is often appropriate for analyzing which kind of data?

  • Error rates
  • Success percentages
  • Logarithmic distributions
  • Count data
  • None of the above.

Q9. Exponential regression is a special case of which generalized linear model (GLM)?

  • Poisson regression
  • Binomial regression
  • Ordinal logistic regression
  • Gamma regression
  • None of the above.

Q10. The generalized linear model (GLM) can be used in place of the linear model (LM) for between-subjects designs.

  • True
  • False

Quiz 2: Doing Generalized Linear Models

Q1. Download the file deviceprefsSr.csv from the course materials. This file describes the same study as in our deviceprefs.csv file, but now augmented with a column for Senior (1, 0), indicating whether a person is 64 years of age or older. It also still contains a column for Disability (1, 0). The research question is how preference for either touchpads or trackballs differs by disability status and senior citizen status. How many subjects took part in this study?

Enter answer here

Q2. Use binomial regression to examine Pref by Disability and Senior. To the nearest ten-thousandth (four digits), what is the p-value of the Disability × Senior interaction? Hint: Create a model with glm using family=binomial. Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for both Disability and Senior.

Enter answer here

Q3. Multinomial regression generalizes binomial regression to dependent variables with more than two categories, so it can handle just two categories as well. Use multinomial regression to examine Pref by Disability and Senior. To the nearest ten-thousandth (four digits), what is the p-value of the Disability × Senior interaction? Hint: Use the nnet library and its multinom function. Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for both Disability and Senior.

Enter answer here

Q4. Let us examine whether there was a significant preference for touchpads or trackballs within each Disability × Senior combination. Conduct such an exploration using post hoc binomial tests. Adjust for multiple comparisons using Holm’s sequential Bonferroni procedure. What is the lowest corrected p-value produced by such an exploration? Hint: Conduct four separate tests with binom.test. The four tests correspond to the four combinations of Disability and Senior. For each combination, test the sum of rows preferring “touchpad” against all rows having that same Disability × Senior combination. Since there are only two devices, a test for touchpad is implicitly a test for trackball, and vice versa.

Enter answer here
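The four post hoc binomial tests described in the Q4 hint could be organized as below. This is a sketch under stated assumptions: the 40 rows are invented for illustration (they are not deviceprefsSr.csv), and the helper function cell.p is a hypothetical name, not part of any library.

```r
# Hypothetical data: 10 made-up rows per Disability x Senior combination.
df <- data.frame(
  Disability = rep(c(0, 0, 1, 1), each = 10),
  Senior     = rep(c(0, 1, 0, 1), each = 10),
  Pref       = c(rep(c("touchpad", "trackball"), times = c(9, 1)),  # D=0, S=0
                 rep(c("touchpad", "trackball"), times = c(6, 4)),  # D=0, S=1
                 rep(c("touchpad", "trackball"), times = c(2, 8)),  # D=1, S=0
                 rep(c("touchpad", "trackball"), times = c(5, 5)))  # D=1, S=1
)

# Exact binomial test of touchpad preference within one combination,
# against the null hypothesis of no preference (p = 0.5).
cell.p <- function(d, s) {
  cell <- df[df$Disability == d & df$Senior == s, ]
  binom.test(sum(cell$Pref == "touchpad"), nrow(cell), p = 0.5)$p.value
}

# One uncorrected p-value per combination, then Holm's correction.
p.raw <- c(D0.S0 = cell.p(0, 0), D0.S1 = cell.p(0, 1),
           D1.S0 = cell.p(1, 0), D1.S1 = cell.p(1, 1))
p.holm <- p.adjust(p.raw, method = "holm")
print(p.holm)
```

Because only two devices are involved, counting "touchpad" rows in each cell is enough; testing "trackball" counts instead would give the same two-sided p-values.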

Q5. Download the file hwreco.csv from the course materials. This file describes a study of three handwriting recognizers (A, B, C) and subjects who were either right-handed or left-handed. The response is the number of incorrectly recognized handwritten words out of every 100 handwritten words. The research questions are how each recognizer fared overall and whether a given recognizer performed better for right-handed or left-handed writers. How many subjects took part in this study?

Enter answer here

Q6. Create an interaction plot with Recognizer on the X-axis and Hand as the traces. How many times, if any, do the two traces cross?

Enter answer here

Q7. Fit Poisson distributions to the Errors of each of the three Recognizer levels and test those fits with goodness-of-fit tests. To the nearest ten-thousandth (four digits), what is the lowest p-value produced by these tests? Hint: To fit a Poisson distribution, use the fitdistrplus library and its fitdist function. Then test the fit with the gofstat function.

Enter answer here

Q8. Use Poisson regression to examine Errors by Recognizer and Hand. To the nearest ten-thousandth (four digits), what is the p-value of the Recognizer × Hand interaction? Hint: Create a model with glm using family=poisson. Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for both Recognizer and Hand.

Enter answer here
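The steps in the Q8 hint (sum-to-zero contrasts, then a Poisson glm, then a test of the interaction) can be sketched with base R alone. Note the substitution: instead of the car library's Anova function named in the hint, this sketch compares nested models with anova, which gives a likelihood-ratio test of the interaction term only. The data are made up and are not hwreco.csv; the names m.full and m.main are illustrative.

```r
# Hypothetical count data: 3 recognizers x 2 hands, 4 subjects per cell
# (made up for illustration; not the course's hwreco.csv).
df <- data.frame(
  Recognizer = factor(rep(c("A", "B", "C"), each = 8)),
  Hand       = factor(rep(rep(c("Left", "Right"), each = 4), times = 3)),
  Errors     = c(30, 34, 29, 33,  40, 38, 43, 41,   # A: left, then right
                 35, 37, 33, 36,  34, 36, 38, 33,   # B: left, then right
                 42, 45, 40, 44,  31, 29, 33, 30)   # C: left, then right
)

# Sum-to-zero contrasts, as the hint asks, set before fitting.
contrasts(df$Recognizer) <- "contr.sum"
contrasts(df$Hand)       <- "contr.sum"

# Poisson regression: Poisson response with the default log link.
m.full <- glm(Errors ~ Recognizer * Hand, data = df, family = poisson)
m.main <- glm(Errors ~ Recognizer + Hand, data = df, family = poisson)

# Likelihood-ratio test of the Recognizer x Hand interaction, obtained by
# comparing the models with and without the interaction term.
lrt <- anova(m.main, m.full, test = "Chisq")
print(lrt)
p.interaction <- lrt[2, "Pr(>Chi)"]
print(p.interaction)
```

For the quiz itself, follow the hint and use the car library's Anova function with type=3, which reports the main effects as well as the interaction in one table.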

Q9. Conduct three planned comparisons between left- and right-handed recognition errors within each recognizer. Adjust for multiple comparisons using Holm’s sequential Bonferroni procedure. To the nearest ten-thousandth (four digits), what is the lowest corrected p-value from such tests? Hint: Use the multcomp and emmeans (formerly lsmeans) libraries and the emm (formerly lsm) formulation of the glht function. Because we only have three planned pairwise comparisons, use "none" for the initial multiple comparisons adjustment to avoid correcting for all possible pairwise comparisons. Instead, just find the three planned and as-yet uncorrected p-values and then pass them manually to p.adjust with method="holm".

Enter answer here

Q10. Which of the following conclusions are supported by the analyses you performed on hwreco.csv? (Mark all that apply.)

  • The handwriting error counts seemed to be Poisson-distributed.
  • There was a significant main effect of Recognizer on Errors.
  • There was a significant main effect of Hand on Errors.
  • There was a significant Recognizer × Hand interaction.
  • For recognizer “A”, there were significantly more errors for right-handed writers than left-handed writers.
  • For recognizer “B”, there were significantly more errors for left-handed writers than right-handed writers.
  • For recognizer “C”, there were significantly more errors for right-handed writers than left-handed writers.

Q11. Download the file bookflights.csv from the course materials. This file describes a survey in which website visitors booked a flight on either Expedia, Orbitz, or Priceline. Whether they booked a domestic or international flight was also recorded. The survey response was a 1-7 ordinal rating for Ease on a Likert-type scale, with “7” being easiest. The research question is which site felt easiest to use overall, and specifically for domestic vs. international bookings. How many subjects took part in this study? Hint: You will need to encode Ease as an ordinal response, but instead of using the factor function with which you’re familiar, you will need the ordered function.

Enter answer here
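The difference between factor and ordered that the Q11 hint points to can be seen in a few lines. The ratings below are invented for illustration and are not taken from bookflights.csv.

```r
# Hypothetical 1-7 ratings (made up; not the course's bookflights.csv).
ease.raw <- c(7, 5, 6, 3, 4, 7, 2, 5)

# factor() yields unordered categories; ordered() records that 1 < 2 < ... < 7.
ease.nominal <- factor(ease.raw, levels = 1:7)
ease.ordinal <- ordered(ease.raw, levels = 1:7)

print(is.ordered(ease.nominal))  # FALSE
print(is.ordered(ease.ordinal))  # TRUE

# Order comparisons are only meaningful on the ordered version.
print(ease.ordinal[1] > ease.ordinal[2])

# Functions that need numbers (mean, interaction.plot) want as.numeric().
# This returns the level index, which equals the rating here only because
# the levels were declared as 1:7.
print(mean(as.numeric(ease.ordinal)))
```

Declaring levels = 1:7 explicitly is a deliberate choice in this sketch: if a rating value never occurs in the data and the levels are left implicit, as.numeric would return positions among the observed values rather than the ratings themselves.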

Q12. Create an interaction plot with Website on the X-axis and International as the traces. How many times, if any, do the two traces cross? Hint: If you already encoded Ease as an ordinal response, you must use as.numeric when passing it to interaction.plot.

Enter answer here

Q13. Use ordinal logistic regression to examine Ease by Website and International. To the nearest ten-thousandth (four digits), what is the p-value of the Website main effect? Hint: Use the MASS library and its polr function with Hess=TRUE to create the ordinal logistic model. Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for both Website and International. And recall that Ease needs to have been encoded as an ordinal response with the ordered function.

Enter answer here

Q14. Conduct three planned comparisons of Ease ratings between domestic and international bookings for each website. Adjust for multiple comparisons using Holm’s sequential Bonferroni procedure. To the nearest ten-thousandth (four digits), what is the highest p-value from such tests? Hint: Because we only have three planned pairwise comparisons, use "none" for the multiple comparisons adjustment to avoid correcting for all possible pairwise comparisons. Instead, just find the three planned and as-yet uncorrected p-values and pass them manually to p.adjust with method="holm". Since the formulation for simultaneous comparisons is a little tricky for ordinal responses, we give that aspect of the code to you here:

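library(multcomp) # glht machinery and adjusted()
library(emmeans)  # emmeans(), pairs(), as.glht()
# assuming m = polr(Ease ~ Website * International, data=df, Hess=TRUE), with polr from MASS
summary(as.glht(pairs(emmeans(m, ~ Website * International))), test=adjusted(type="none"))

Enter answer here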

Q15. Which of the following conclusions are supported by the analyses you performed on bookflights.csv? (Mark all that apply.)

  • There was a significant main effect of Website on Ease.
  • There was a significant main effect of International on Ease.
  • There was a significant Website × International interaction.
  • Expedia was perceived as significantly easier for booking international flights than domestic flights.
  • Orbitz was perceived as significantly easier for booking domestic flights than international flights.
  • Priceline was perceived as significantly easier for booking domestic flights than international flights.

Week 9: Designing, Running, and Analyzing Experiments

Quiz 1: Designing, Running, and Analyzing Experiments

Q1. A mixed model is “mixed” because it contains both between-subjects and within-subjects factors.

  • True
  • False

Q2. Which of the following best describes fixed effects?

  • Fixed effects are manipulated factors whose chosen levels are of explicit interest.
  • Fixed effects are manipulated factors whose levels are sampled randomly from a larger population of interest.
  • Fixed effects are random factors whose chosen levels are of explicit interest.
  • Fixed effects are random factors whose levels are sampled randomly from a larger population of interest.
  • None of the above.

Q3. Random effects are called “random” in part because their levels are randomly sampled from a larger population about which we wish to generalize.

  • True
  • False

Q4. Linear mixed models (LMMs) can handle Poisson response distributions.

  • True
  • False

Q5. Which is not an advantage of a linear mixed model (LMM)?

  • The ability to handle within-subjects factors.
  • The ability to handle unbalanced designs.
  • The ability to handle missing data.
  • The ability to handle non-normal response distributions.
  • The ability to handle violations of sphericity.

Q6. Analyses of variance using linear mixed models (LMMs) tend to produce smaller residual degrees of freedom than traditional fixed-effects ANOVAs.

  • True
  • False

Q7. Nesting is useful when the levels of a factor are not meaningful when pooled across all levels of the other factors.

  • True
  • False

Q8. Nesting is necessary when we wish to calculate the means and variances of a nested factor’s levels only within the levels of the other factors, that is, the nesting factors.

  • True
  • False

Q9. Linear mixed models (LMMs) generalize the linear model (LM) to non-normal response distributions.

  • True
  • False

Q10. Generalized linear mixed models (GLMMs) generalize the linear mixed model (LMM) to non-normal response distributions.

  • True
  • False

Q11. Why are planned pairwise comparisons important? (Mark all that apply.)

  • Planned pairwise comparisons enable experimenters to communicate more effectively with the public.
  • Planned pairwise comparisons force experimenters to consider their hypotheses before the data arrives to prevent revisions.
  • Planned pairwise comparisons should be based on a priori hypotheses and therefore prevent “fishing expeditions” for significant p-values.
  • Planned pairwise comparisons ensure that research funds are used only for anticipated purposes.
  • Planned pairwise comparisons guarantee that significant differences, if they exist, will be found eventually.

Q12. Generalized linear mixed models (GLMMs) are capable of handling repeated measures factors via random effects and non-normal response distributions.

  • True
  • False

Quiz 2: Doing Mixed Effects Models

Q1. Recall our file websearch3.csv. If you have not done so already, please download it from the course materials. This file describes a study of the number of searches people did with various search engines to successfully find 100 facts on the web. You originally analyzed this data with a one-way repeated measures ANOVA. Now you will use a linear mixed model (LMM). Let’s refresh our memory: How many subjects took part in this study?

Enter answer here

Q2. To the nearest hundredth (two digits), how many searches on average did subjects require with the Google search engine?

Enter answer here

Q3. Conduct a linear mixed model (LMM) analysis of variance on Searches by Engine. To the nearest ten-thousandth (four digits), what is the p-value of such a test? Hint: Use the lme4 library and its lmer function with Subject as a random effect. Also load the lmerTest library. Then use the car library and its Anova function with type=3 and test.statistic="F". Prior to either, set sum-to-zero contrasts for Engine.

Enter answer here

Q4. In light of your p-value result, are post hoc pairwise comparisons among levels of Engine justified, strictly speaking?

  • Yes
  • No

Q5. Regardless of your answer to the previous question, conduct simultaneous pairwise comparisons among all levels of Engine. Correct your p-values with Holm’s sequential Bonferroni procedure. To the nearest ten-thousandth (four digits), what is the lowest corrected p-value resulting from such tests? Hint: Use the multcomp library and its mcp function within a call to its glht function.

Enter answer here

Q6. Because the omnibus linear mixed model (LMM) analysis of variance did not result in a significant main effect of Engine on Searches, post hoc pairwise comparisons were not justified. As a result, despite one such comparison having p<.05, strictly speaking this “finding” must be disregarded.

  • True
  • False

Q7. Recall our file socialvalue.csv. If you have not done so already, please download it from the course materials. This file describes a study of people viewing a positive or negative film clip before going onto social media and then judging the value of the first 100 posts they see there. The number of valued posts was recorded. You originally analyzed this data with a 2×2 within-subjects ANOVA. Now you will use a linear mixed model (LMM). Let’s refresh our memory: How many subjects took part in this study?

Enter answer here

Q8. On average and to the nearest whole number, how many more posts were valued on Facebook than on Twitter after seeing a positive film clip?

Enter answer here

Q9. Conduct a linear mixed model (LMM) analysis of variance on Valued by Social and Clip. To the nearest ten-thousandth (four digits), what is the p-value of the interaction effect? Hint: Use the lme4 library and its lmer function with Subject as a random effect. Then use the car library and its Anova function with type=3 and test.statistic="F". Prior to either, set sum-to-zero contrasts for both Social and Clip.

Enter answer here

Q10. Conduct two planned pairwise comparisons of how the film clips may have influenced judgments about the value of social media. The first question is whether on Facebook, the number of valued posts was different after people saw a positive film clip versus a negative film clip. The second question is whether on Twitter, the number of valued posts was different after people saw a positive film clip versus a negative film clip. Correcting for these two planned comparisons using Holm’s sequential Bonferroni procedure, to the nearest ten-thousandth (four digits), what is the lowest corrected p-value of the two tests? Hint: Use the multcomp and emmeans (formerly lsmeans) libraries and the emm (formerly lsm) function within the glht function. Do not correct for multiple comparisons yet as only two planned comparisons are of interest. After retrieving the two as-yet uncorrected p-values of interest, manually pass them to p.adjust for correction.

Enter answer here
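
The manual correction step described in the hint can be sketched as below. The two p-values are hypothetical placeholders chosen for illustration, not results from socialvalue.csv. The snippet also recomputes Holm's correction by hand to show what the step-down procedure does.

```r
# Two hypothetical uncorrected p-values (illustrative only, not quiz answers)
p_raw <- c(0.030, 0.020)

# Built-in Holm correction
p_holm <- p.adjust(p_raw, method = "holm")

# The same correction by hand: sort ascending, multiply the i-th smallest
# p-value by (m - i + 1), enforce monotonicity, cap at 1, restore the order
m <- length(p_raw)
o <- order(p_raw)
p_manual <- pmin(1, cummax((m - seq_len(m) + 1) * p_raw[o]))[order(o)]

print(p_holm)
print(min(p_holm))  # the lowest corrected p-value
```

For these placeholder inputs both corrected values come out to 0.04: the smaller p-value is doubled, and the larger one is raised to match it so that the corrected values stay in order.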

Q11. Download the file teaser.csv from the course materials. This file describes a survey in which respondents recruited online saw five different teaser trailers for upcoming movies of different genres. Respondents simply indicated whether they liked each teaser or not. The research question is whether trailers from certain film genres were liked more than others. How many respondents took part in this survey?

Enter answer here

Q12. By viewing the data table, discern which counterbalancing scheme was used for the Teaser factor, if any:

  • Full counterbalancing
  • Latin Square
  • Balanced Latin Square
  • Random
  • None of the above.

Q13. Create a plot of Liked by Teaser. Which teaser trailer genre was liked the most?

  • action
  • comedy
  • horror
  • romance
  • thriller

Q14. Using a generalized linear mixed model (GLMM), conduct a test of order effects on Liked to ensure counterbalancing worked. To the nearest ten-thousandth (four digits), what is the p-value for the Order main effect? Hint: Use the lme4 library and its glmer function with family=binomial and Subject as a random effect. Use the default value for nAGQ, which is 1; since this is the default, you do not need to specify nAGQ, or you can set nAGQ=1. (Higher values of nAGQ produce better estimates [max. 25], but increase execution time considerably.) Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for Order.

Enter answer here

Q15. Using a generalized linear mixed model (GLMM), conduct a test of Liked by Teaser. To the nearest hundredth (two digits), what is the Chi-Square statistic for the Teaser main effect? Hint: Use the lme4 library and its glmer function with family=binomial and Subject as a random effect. Use the default value for nAGQ, which is 1; since this is the default, you do not need to specify nAGQ, or you can set nAGQ=1. (Higher values of nAGQ produce better estimates [max. 25], but increase execution time considerably.) Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for Teaser.

Enter answer here

Q16. Conduct simultaneous post hoc pairwise comparisons among levels of Teaser. Be sure to use Holm’s sequential Bonferroni procedure. How many of the tests are statistically significant? Hint: Use the multcomp library and its mcp function called from within its glht function.

Enter answer here

Q17. Download the file vocab.csv from the course materials. This file describes a study in which 50 recent posts by heavy and light users of social media were analyzed for how many unique words they used, i.e., the size of their operational vocabulary. The research question is whether heavy and light users’ online vocabularies differ on each of three social media platforms. How many subjects took part in this study?

Enter answer here

Q18. Create an interaction plot with Social on the X-axis and Heavy as the traces. How many times, if any, do these lines cross?

Enter answer here

Q19. Perform three Kolmogorov-Smirnov goodness-of-fit tests on Vocab for each level of Social using exponential distributions. To the nearest ten-thousandth (four digits), what is the lowest p-value of these three tests? Hint: Use the MASS library and its fitdistr function on Vocab separately for each level of Social. Use "exponential" as the distribution type. Save the estimate as a fit. Then use ks.test with "pexp" passing fit[1] as the rate and requesting an exact test. Ignore any warnings produced about ties.

Enter answer here
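
The fit-then-test steps in the hint can be sketched for a single group as follows. The data here are simulated (an assumption made so the snippet runs on its own) and do not reproduce the contents of vocab.csv; the variable name vocab is likewise a stand-in.

```r
library(MASS)  # for fitdistr

set.seed(42)  # reproducible simulated data
# Simulated stand-in for Vocab at one level of Social (not the course data)
vocab <- rexp(30, rate = 1/100)

# Maximum-likelihood estimate of the exponential rate
fit <- fitdistr(vocab, "exponential")$estimate

# Exact Kolmogorov-Smirnov test against the fitted exponential distribution
res <- ks.test(vocab, "pexp", rate = fit[1], exact = TRUE)
print(res$p.value)
```

With the course data, the same three lines would be repeated on the subset of Vocab for each level of Social, and the three p-values compared.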

Q20. Use a generalized linear mixed model (GLMM) to conduct a test of order effects on Vocab to ensure counterbalancing worked. To the nearest ten-thousandth (four digits), what is the p-value for the Order main effect? Hint: Use the lme4 library and its glmer function with family=Gamma(link="log") and Subject as a random effect. Use the default value for nAGQ, which is 1; since this is the default, you do not need to specify nAGQ, or you can set nAGQ=1. (Higher values of nAGQ produce better estimates [max. 25], but increase execution time considerably.) Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for Heavy and Order.

Enter answer here

Q21. Use a generalized linear mixed model (GLMM) to conduct a test of Vocab by Heavy and Social. To the nearest ten-thousandth (four digits), what is the p-value for the interaction effect? Hint: Use the lme4 library and its glmer function with family=Gamma(link="log") and Subject as a random effect. Use the default value for nAGQ, which is 1; since this is the default, you do not need to specify nAGQ, or you can set nAGQ=1. (Higher values of nAGQ produce better estimates [max. 25], but increase execution time considerably.) Then use the car library and its Anova function with type=3. Prior to either, set sum-to-zero contrasts for Heavy and Social.

Enter answer here

Q22. The only significant effect on Vocab was Social. Therefore, perform post hoc pairwise comparisons among levels of Social adjusted with Holm’s sequential Bonferroni procedure. To the nearest ten-thousandth (four digits), what is the p-value of the only non-significant pairwise comparison? Hint: Use the multcomp library and its mcp function called from within its glht function. Ignore any warnings produced.

Enter answer here

Q23. In Module 8, you employed a generalized linear model (GLM) for ordinal logistic regression using the polr function from the MASS library. You also conducted a GLM for nominal logistic regression using the multinom function from the nnet library. It is reasonable to wonder whether variants of such functions exist for generalized linear mixed models (GLMMs), i.e., variants that can handle random effects and therefore repeated measures. Unfortunately, although certain approaches exist, they are somewhat arcane, and the R community has not converged upon any approach to categorical response models with random effects. Our lectures did not venture into such territory, but as a final topic pointing toward the future, here is a brief treatment of mixed ordinal logistic regression. Let’s begin by revisiting our file websearch3.csv from the course materials. Effort is a Likert-type response. How many ordered categories does Effort have? Hint: Recode Effort as an ordinal response.

Enter answer here
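
The recoding step in the hint can be sketched in base R as below. The ratings are hypothetical values invented for illustration and are not the Effort column of websearch3.csv.

```r
# Hypothetical Likert-type ratings (not the websearch3.csv values)
effort <- c(3, 5, 1, 4, 3, 2, 5)

# Recode as an ordered factor; levels are the sorted distinct values
effort_ord <- ordered(effort)

print(levels(effort_ord))   # the distinct categories, in order
print(nlevels(effort_ord))  # how many ordered categories are present
```

Note that nlevels counts only the categories that actually occur in the data, so a scale point nobody selected would not be counted unless the levels are specified explicitly.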

Q24. Use a generalized linear mixed model (GLMM) for ordinal logistic regression to examine Effort by Engine. Specifically, we will use what is called a “cumulative link mixed model” (CLMM). We find the clmm function in the ordinal library. To produce significance tests, we use a special version of the Anova function from the RVAideMemoire library.

There are two quirks. One is that we must remake our data frame before passing it to clmm. The second is that the output of Anova.clmm always indicates Type II tests regardless of whether the type parameter is 2 or 3. (With a Type II ANOVA, if an interaction is present, then main effects are ignored; not an issue for our one-way analysis of Effort by Engine here.)

To the nearest ten-thousandth (four digits), what is the p-value of the Engine main effect? Hint: The code to use is below. (Note: Since there have been complications with installing RVAideMemoire, the answer to this question is given to you: 0.0174.)

# assuming df contains websearch3.csv
# assuming Subject has been coded as nominal
# assuming Engine has been coded as nominal
# assuming Effort has been coded as ordinal
library(ordinal) # for clmm
library(RVAideMemoire) # for Anova.clmm
df2 = as.data.frame(df) # copy
m = clmm(Effort ~ Engine + (1|Subject), data=df2)
Anova.clmm(m, type=3) # output says Type II but is Type III

Enter answer here

Q25. In light of the significant main effect of Engine on Effort, post hoc pairwise comparisons are justified among the three levels of Engine. (For simplicity, we’ll treat Effort now as a numeric response.) Plot the Effort ratings by Engine and perform pairwise comparisons with the following code. To the nearest ten-thousandth (four digits), what is the p-value of the one non-significant pairwise comparison?

# following the code from Q24
plot(as.numeric(Effort) ~ Engine, data=df2)
library(lme4)
library(multcomp)
m = lmer(as.numeric(Effort) ~ Engine + (1|Subject), data=df2)
summary(glht(m, mcp(Engine="Tukey")), test=adjusted(type="holm"))

Enter answer here

Team Networking Funda