Welcome to your ultimate guide for Regression Models quiz answers! Whether you’re tackling practice quizzes to strengthen your understanding or preparing for graded quizzes to assess your knowledge, this guide is here to help.
Spanning all course modules, this resource will help you master key regression techniques, including linear regression, logistic regression, and model evaluation methods, critical for predictive analytics and data-driven decision making.
Regression Models Quiz Answers for All Modules
Regression Models Week 01 Quiz Answers
Q1. Consider the data set given below:
x <- c(0.18, -1.54, 0.42, 0.95)
w <- c(2, 1, 3, 1)
Give the value of μ that minimizes the least squares equation:
∑ᵢ₌₁ⁿ wᵢ (xᵢ − μ)²
Correct Answer: 0.1471
Explanation: The value of μ that minimizes the weighted least squares criterion is the weighted mean: $\mu = \frac{\sum_i w_i x_i}{\sum_i w_i}$
Here ∑ wᵢxᵢ = 1.03 and ∑ wᵢ = 7, so μ = 1.03 / 7 ≈ 0.1471.
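As a quick check, the weighted mean can be computed directly in R. This is a minimal sketch using the `x` and `w` vectors from the question; the names `mu_hat`, `weighted_sse`, and `mu_numeric` are illustrative and not part of the quiz.

```r
x <- c(0.18, -1.54, 0.42, 0.95)
w <- c(2, 1, 3, 1)

# Closed-form minimizer of sum(w * (x - mu)^2): the weighted mean
mu_hat <- sum(w * x) / sum(w)

# Numerical cross-check: minimize the weighted sum of squares directly
weighted_sse <- function(mu) sum(w * (x - mu)^2)
mu_numeric <- optimize(weighted_sse, interval = range(x))$minimum

round(mu_hat, 4)  # 0.1471
```

The built-in `weighted.mean(x, w)` should return the same value.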
Q2. Consider the following data set:
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)
Fit the regression through the origin and get the slope treating y as the outcome and x as the regressor.
Correct Answer: 0.8263
Explanation: Fitting the regression through the origin (i.e., with no intercept) gives the slope:
$\hat{\beta}_1 = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$
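The formula can be compared against `lm()` with the intercept removed. A minimal sketch using the question's data; `slope_formula` and `slope_lm` are illustrative names.

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

# Closed-form slope for regression through the origin
slope_formula <- sum(x * y) / sum(x^2)

# Same model via lm(); "- 1" drops the intercept
slope_lm <- unname(coef(lm(y ~ x - 1)))

round(slope_formula, 4)  # 0.8263
```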
Q3. Do data(mtcars) from the datasets package and fit the regression model with mpg as the outcome and weight as the predictor. Give the slope coefficient.
Correct Answer: -5.344
Explanation: The slope coefficient is found by running the linear regression model between mpg and weight. The negative value indicates an inverse relationship between weight and mpg.
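A minimal sketch of the fit, assuming the weight variable is the `wt` column of `mtcars` (weight in 1,000 lbs):

```r
data(mtcars)

# Simple linear regression of mpg on weight
fit <- lm(mpg ~ wt, data = mtcars)
slope <- unname(coef(fit)["wt"])

round(slope, 3)  # -5.344
```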
Q4. Consider data with an outcome (Y) and a predictor (X). The standard deviation of the predictor is one half that of the outcome. The correlation between the two variables is 0.5. What would be the value of the slope coefficient for the regression model with Y as the outcome and X as the predictor?
Correct Answer: 1
Explanation: The slope of a simple linear regression is:
$\hat{\beta}_1 = \mathrm{Cor}(Y, X) \times \frac{\mathrm{SD}(Y)}{\mathrm{SD}(X)}$
Since the standard deviation of X is half that of Y, SD(Y)/SD(X) = 2, so the slope is 0.5 × 2 = 1.
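The identity between the slope and the correlation can be checked on real data. The sketch below uses `mtcars` purely as an illustration (it is not part of this question) and then plugs in the quiz values.

```r
data(mtcars)

# Slope from the correlation identity: Cor(Y, X) * SD(Y) / SD(X)
slope_from_cor <- cor(mtcars$mpg, mtcars$wt) * sd(mtcars$mpg) / sd(mtcars$wt)

# Slope from lm() for comparison
slope_from_lm <- unname(coef(lm(mpg ~ wt, data = mtcars))["wt"])

# Quiz values: correlation 0.5, SD(X) = 0.5 * SD(Y)
quiz_slope <- 0.5 * (1 / 0.5)
quiz_slope  # 1
```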
Q5. Students were given two hard tests and scores were normalized to have empirical mean 0 and variance 1. The correlation between the scores on the two tests was 0.4. What would be the expected score on Quiz 2 for a student who had a normalized score of 1.5 on Quiz 1?
Correct Answer: 0.6
Explanation: The expected score on Quiz 2 can be calculated as:
$\text{Expected Quiz 2} = \text{mean of Quiz 2} + \text{correlation} \times (\text{Quiz 1 score})$
Since both scores have mean 0 and variance 1, the expected score is simply the correlation multiplied by the Quiz 1 score: 0.4 × 1.5 = 0.6.
Q6. Consider the data given by the following:
x <- c(8.58, 10.46, 9.01, 9.64, 8.86)
What is the value of the first measurement if x were normalized (to have mean 0 and variance 1)?
Correct Answer: -0.9719
Explanation: Normalization involves subtracting the mean and dividing by the standard deviation. The formula is:
$z_i = \frac{x_i - \bar{x}}{s_x}$
Here the mean is 9.31 and the standard deviation is about 0.751, so the first measurement becomes (8.58 − 9.31) / 0.751 ≈ −0.9719.
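A minimal sketch of the normalization in R, using the sample standard deviation from `sd()`; `z` is an illustrative name.

```r
x <- c(8.58, 10.46, 9.01, 9.64, 8.86)

# Center by the mean, then scale by the sample standard deviation
z <- (x - mean(x)) / sd(x)

round(z[1], 4)  # -0.9719
```

The base function `scale(x)` performs the same centering and scaling.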
Regression Models Week 02 Quiz Answers
Q1. Consider the following data with x as the predictor and y as the outcome:
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)
Give a P-value for the two-sided hypothesis test of whether β₁ from a linear regression model is 0 or not.
Correct Answer: 0.05296
Explanation: The P-value is obtained from the t-test for testing whether the slope β₁ is significantly different from zero. A P-value greater than 0.05 indicates that we fail to reject the null hypothesis that β₁ = 0.
Q2. Consider the previous problem, give the estimate of the residual standard deviation.
Correct Answer: 0.223
Explanation: The residual standard deviation (or standard error of the regression) is calculated as the square root of the sum of squared residuals divided by the degrees of freedom. It provides an estimate of the variability of the data points around the fitted regression line.
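Both the P-value from Q1 and the residual standard deviation from Q2 can be read off `summary()`. A minimal sketch using the data from Q1; the object names are illustrative.

```r
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)

fit <- lm(y ~ x)
fit_summary <- summary(fit)

# Two-sided P-value for H0: beta1 = 0
p_value <- fit_summary$coefficients["x", "Pr(>|t|)"]

# Residual standard deviation, from summary() and by hand
sigma_hat <- fit_summary$sigma
sigma_manual <- sqrt(sum(resid(fit)^2) / df.residual(fit))

round(p_value, 5)    # 0.05296
round(sigma_hat, 3)  # 0.223
```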
Q3. In the mtcars data set, fit a linear regression model of weight (predictor) on mpg (outcome). Get a 95% confidence interval for the expected mpg at the average weight. What is the lower endpoint?
Correct Answer: 18.991
Explanation: At the average weight the fitted line passes through the mean mpg (about 20.09), and the standard error of the fitted mean at that point is σ̂/√n. The resulting 95% confidence interval runs from about 18.99 to 21.19, so the lower endpoint is 18.991.
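One way to obtain this interval is `predict()` with `interval = "confidence"`. A minimal sketch, assuming the weight variable is `wt`; `new_car` and `ci` are illustrative names.

```r
data(mtcars)
fit <- lm(mpg ~ wt, data = mtcars)

# Confidence interval for the expected mpg at the average weight
new_car <- data.frame(wt = mean(mtcars$wt))
ci <- predict(fit, newdata = new_car, interval = "confidence", level = 0.95)

lower <- unname(ci[1, "lwr"])
round(lower, 3)  # 18.991
```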
Q4. Refer to the previous question. Read the help file for mtcars. What is the weight coefficient interpreted as?
Correct Answer: The estimated expected change in mpg per 1,000 lb increase in weight.
Explanation: The weight coefficient in a regression model with mpg as the outcome and weight as the predictor represents the expected change in mpg for each unit increase in weight. Since weight is measured in 1,000 lbs, this interpretation is for each 1,000 lb increase.
Q5. Consider again the mtcars data set and a linear regression model with mpg as predicted by weight (1,000 lbs). A new car is coming weighing 3000 pounds. Construct a 95% prediction interval for its mpg. What is the upper endpoint?
Correct Answer: 27.57
Explanation: The prediction interval for a new observation is wider than the confidence interval for the mean because it accounts for both the uncertainty in the regression parameters and the individual variability around the regression line.
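The same `predict()` call with `interval = "prediction"` gives the interval for a single new car. A minimal sketch; note that 3000 pounds corresponds to `wt = 3` because `wt` is recorded in 1,000 lbs.

```r
data(mtcars)
fit <- lm(mpg ~ wt, data = mtcars)

# 3000 lbs is wt = 3 in the units of the data set
new_car <- data.frame(wt = 3)
pred_int <- predict(fit, newdata = new_car, interval = "prediction", level = 0.95)
conf_int <- predict(fit, newdata = new_car, interval = "confidence", level = 0.95)

upper <- unname(pred_int[1, "upr"])
round(upper, 2)  # 27.57
```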
Q6. Consider again the mtcars data set and a linear regression model with mpg as predicted by weight (in 1,000 lbs). A “short” ton is defined as 2,000 lbs. Construct a 95% confidence interval for the expected change in mpg per 1 short ton increase in weight. Give the lower endpoint.
Correct Answer: -12.973
Explanation: To calculate the change in mpg per 1 short ton, we multiply the slope by 2 (since a short ton is 2,000 lbs). The lower endpoint gives the lower bound of the expected change in mpg for a 1 short ton increase.
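A minimal sketch of two equivalent routes: refit with weight expressed in short tons, or double the interval for the slope per 1,000 lbs. Object names are illustrative.

```r
data(mtcars)

# Route 1: express weight in short tons (wt / 2) and refit
fit_ton <- lm(mpg ~ I(wt / 2), data = mtcars)
lower_ton <- unname(confint(fit_ton, level = 0.95)[2, 1])

# Route 2: double the confidence limits of the per-1,000-lb slope
fit_klb <- lm(mpg ~ wt, data = mtcars)
lower_doubled <- unname(2 * confint(fit_klb, level = 0.95)["wt", 1])

round(lower_ton, 3)  # -12.973
```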
Q7. If my X from a linear regression is measured in centimeters and I convert it to meters, what would happen to the slope coefficient?
Correct Answer: It would get multiplied by 100
Explanation: Converting from centimeters to meters divides X by 100. The fitted values must stay the same, so the slope is multiplied by 100: a 1 m increase in X is the same as one hundred 1 cm increases.
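The effect of changing units can be seen with a small made-up example. The data below are hypothetical and chosen only for illustration.

```r
# Hypothetical predictor in centimeters and an arbitrary outcome
x_cm <- c(150, 160, 170, 180, 190)
y <- c(50, 56, 61, 70, 74)

# Same predictor expressed in meters
x_m <- x_cm / 100

slope_cm <- unname(coef(lm(y ~ x_cm))["x_cm"])
slope_m <- unname(coef(lm(y ~ x_m))["x_m"])

# The ratio is 100, up to floating-point error
slope_m / slope_cm
```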
Q8. I have an outcome, Y, and a predictor, X, and fit a linear regression model with Y = β₀ + β₁X + ϵ to obtain β̂₀ and β̂₁. What would be the consequence to the subsequent slope and intercept if I were to refit the model with a new regressor, X + c for some constant, c?
Correct Answer: The new intercept would be β̂₀ − cβ̂₁
Explanation: Rewrite the model as Y = (β₀ − cβ₁) + β₁(X + c) + ϵ. The slope on the new regressor X + c is still β₁, and the intercept absorbs the shift, so the new intercept is β̂₀ − cβ̂₁.
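A minimal numerical check, reusing the x and y data from Q1 of this week. The constant `shift` (playing the role of c) is a hypothetical value chosen for illustration.

```r
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)
shift <- 10  # hypothetical constant c

fit <- lm(y ~ x)
b0 <- unname(coef(fit)[1])
b1 <- unname(coef(fit)[2])

# Refit with the shifted regressor X + c
x_shifted <- x + shift
fit_shifted <- lm(y ~ x_shifted)
b0_shifted <- unname(coef(fit_shifted)[1])
b1_shifted <- unname(coef(fit_shifted)[2])

# The slope is unchanged; the intercept moves to b0 - shift * b1
c(b0_shifted, b0 - shift * b1)
```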
Q9. Refer back to the mtcars data set with mpg as an outcome and weight (wt) as the predictor. About what is the ratio of the sum of squared errors, ∑ᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² when comparing a model with just an intercept (denominator) to the model with the intercept and slope (numerator)?
Correct Answer: 0.25
Explanation: The denominator is the total sum of squares (the error of the intercept-only model) and the numerator is the residual sum of squares of the model with intercept and slope. Their ratio equals 1 − R², the proportion of variance left unexplained by weight. For this model R² is about 0.75, so the ratio is about 0.25.
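A minimal sketch of the ratio, computed from the two fitted models; object names are illustrative.

```r
data(mtcars)

fit_slope <- lm(mpg ~ wt, data = mtcars)      # intercept and slope
fit_intercept <- lm(mpg ~ 1, data = mtcars)   # intercept only

# Ratio of sums of squared errors: slope model over intercept-only model
sse_ratio <- sum(resid(fit_slope)^2) / sum(resid(fit_intercept)^2)

# This equals 1 - R^2 of the slope model
one_minus_r2 <- 1 - summary(fit_slope)$r.squared

round(sse_ratio, 2)  # 0.25
```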
Q10. Do the residuals always have to sum to 0 in linear regression?
Correct Answer: If an intercept is included, then they will sum to 0.
Explanation: In linear regression with an intercept, the residuals always sum to zero. Setting the derivative of the sum of squared residuals with respect to the intercept to zero gives ∑ eᵢ = 0. Without an intercept this condition is not imposed, so the residuals need not sum to zero.
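This can be checked numerically. The sketch below reuses the x and y data from Q1 of this week and compares a fit with an intercept to one through the origin.

```r
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)

fit_with <- lm(y ~ x)         # intercept included
fit_without <- lm(y ~ x - 1)  # regression through the origin

sum_with <- sum(resid(fit_with))        # zero up to floating-point error
sum_without <- sum(resid(fit_without))  # not zero for these data

c(sum_with, sum_without)
```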
Regression Models Week 03 Quiz Answers
Q1. Consider the mtcars data set. Fit a model with mpg as the outcome that includes the number of cylinders as a factor variable and weight as a confounder. Give the adjusted estimate for the expected change in mpg comparing 8 cylinders to 4.
Correct Answer: -6.071
Explanation: With 4 cylinders as the reference level, the coefficient for the 8-cylinder level is the adjusted difference in expected mpg between 8 and 4 cylinders at a fixed weight. That coefficient is −6.071, so after adjusting for weight, 8-cylinder cars are estimated to get about 6.07 fewer mpg than 4-cylinder cars. (The 6-cylinder coefficient, −4.256, compares 6 cylinders to 4.)
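A minimal sketch of the adjusted model. With `factor(cyl)`, R uses the lowest level (4 cylinders) as the reference, so the 8-cylinder coefficient is the comparison of interest.

```r
data(mtcars)

# Cylinders as a factor, adjusting for weight
fit <- lm(mpg ~ factor(cyl) + wt, data = mtcars)

# Adjusted difference in expected mpg: 8 cylinders versus 4
cyl8_vs_cyl4 <- unname(coef(fit)["factor(cyl)8"])

round(cyl8_vs_cyl4, 3)  # -6.071
```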
Q2. Consider the previous problem, give the estimate of the residual standard deviation.
Correct Answer: 2.557
Explanation: The residual standard deviation is the square root of the residual sum of squares divided by the residual degrees of freedom (here 32 − 4 = 28). For the model in the previous problem it is about 2.557, which describes how much the mpg values typically vary around the fitted values.
Q3. Consider the mtcars data set. Fit a model with mpg as the outcome that considers the number of cylinders as a factor variable and weight as a confounder. Now fit a second model with mpg as the outcome that considers the interaction between the number of cylinders (as a factor variable) and weight. Give the P-value for the likelihood ratio test comparing the two models and suggest a model using 0.05 as a type I error rate significance benchmark.
Correct Answer: The P-value is larger than 0.05. So, according to our criterion, we would fail to reject, which suggests that the interaction terms may not be necessary.
Explanation: The test compares the additive model with the model that adds the cylinder-by-weight interaction. The P-value exceeds 0.05, so the data do not provide enough evidence that the interaction improves the fit, and the simpler additive model is preferred under this criterion.
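A minimal sketch of the comparison. It uses `anova()` on the two nested linear models, which reports an F-test; this is used here as a stand-in for the likelihood ratio test named in the question, on the assumption that both lead to the same decision at the 0.05 level.

```r
data(mtcars)

fit_additive <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
fit_interaction <- lm(mpg ~ factor(cyl) * wt, data = mtcars)

# Nested-model comparison; the second row holds the test of the interaction terms
comparison <- anova(fit_additive, fit_interaction)
p_value <- comparison[2, "Pr(>F)"]

p_value > 0.05  # TRUE: fail to reject the additive model
```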
Q4. Consider the mtcars data set. Fit a model with mpg as the outcome that includes the number of cylinders as a factor variable and weight included in the model as:
lm(mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)
How is the wt coefficient interpreted?
Correct Answer: The estimated expected change in MPG per one ton increase in weight for a specific number of cylinders (4, 6, 8).
Explanation: wt is recorded in units of 1,000 lbs, so wt * 0.5 is weight in units of 2,000 lbs, i.e., one (short) ton. A one-unit increase in the regressor is therefore a one ton increase in weight, and the coefficient is the expected change in mpg per ton, holding the number of cylinders constant.
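The rescaling can be checked by comparing the coefficient with the one from the model that uses `wt` directly. A minimal sketch; object names are illustrative.

```r
data(mtcars)

fit_half <- lm(mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)
fit_raw <- lm(mpg ~ wt + factor(cyl), data = mtcars)

coef_per_ton <- unname(coef(fit_half)[2])    # per 2,000 lbs
coef_per_klb <- unname(coef(fit_raw)["wt"])  # per 1,000 lbs

# Halving the regressor doubles the coefficient
c(coef_per_ton, 2 * coef_per_klb)
```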
Q5. Consider the following data set:
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)
Give the hat diagonal for the most influential point.
Correct Answer: 0.9946
Explanation: The hat diagonal is a measure of leverage in the regression analysis, indicating how much influence a data point has on the fitted values. The point with the highest leverage has a value of 0.9946.
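A minimal sketch using `hatvalues()` on the simple regression of y on x; object names are illustrative.

```r
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)

fit <- lm(y ~ x)

# Diagonal of the hat matrix (leverage) for each observation
leverage <- hatvalues(fit)
max_leverage <- unname(max(leverage))

which.max(leverage)     # observation 5, the point with x = 11.72
round(max_leverage, 4)  # 0.9946
```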
Q6. Consider the following data set:
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)
Give the slope dfbeta for the point with the highest hat value.
Correct Answer: -134
Explanation: The slope dfbeta measures how much the slope of the regression line changes when a particular data point is removed, here on the standardized scale reported by dfbetas(). The point with the highest hat value (the fifth, with x = 11.72) has a slope dfbeta of about −134, which indicates an extreme influence on the fitted slope.
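A minimal sketch, assuming the quiz refers to the standardized values from `dfbetas()` (the same quantity that `influence.measures()` reports); object names are illustrative.

```r
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)

fit <- lm(y ~ x)

# Locate the observation with the highest leverage
top_point <- unname(which.max(hatvalues(fit)))

# Standardized change in the slope when that observation is deleted
dfb_slope <- unname(dfbetas(fit)[top_point, "x"])

round(dfb_slope)  # -134
```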
Q7. Consider a regression relationship between Y and X with and without adjustment for a third variable Z. Which of the following is true about comparing the regression coefficient between Y and X with and without adjustment for Z?
Correct Answer: It is possible for the coefficient to reverse sign after adjustment. For example, it can be strongly significant and positive before adjustment and strongly significant and negative after adjustment.
Explanation: Adjustment for a confounding variable (Z) can change the direction and magnitude of the relationship between Y and X, as it accounts for the influence of Z on both Y and X. This can lead to a reversal of the coefficient sign after adjustment.
Regression Models Week 04 Quiz Answers
Q1. Consider the space shuttle data (?shuttle) in the MASS library. Consider modeling the use of the autolander as the outcome (use). Fit a logistic regression model with autolander (auto) use (labeled as “auto” 1) versus not (0) as predicted by wind sign (wind). Give the estimated odds ratio for autolander use comparing head winds, labeled as “head” in the variable headwind (numerator) to tail winds (denominator).
Correct Answer: 0.969
Explanation: The odds ratio is the odds of autolander use under head winds divided by the odds under tail winds, obtained by exponentiating the difference in log-odds from the fitted model. The estimate is 0.969, so the odds of autolander use under head winds are about 3% lower than under tail winds, which is a very small difference.
Q2. Consider the previous problem. Give the estimated odds ratio for autolander use comparing head winds (numerator) to tail winds (denominator) adjusting for wind strength from the variable magn.
Correct Answer: 0.969
Explanation: When adjusting for wind strength (magn), the odds ratio is essentially unchanged at 0.969, meaning that after accounting for wind strength, the odds of autolander use in head winds are still about 0.969 times the odds in tail winds.
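Both odds ratios can be reproduced with `glm()`. This is a minimal sketch that assumes the MASS package is installed; the 0/1 column `auto` is created here for illustration and is not part of the original data frame. Because `head` is the reference level of `wind`, the head-versus-tail odds ratio is the exponential of minus the `windtail` coefficient.

```r
library(MASS)  # provides the shuttle data set

shuttle_data <- shuttle
# 1 = autolander used, 0 = not used
shuttle_data$auto <- as.integer(shuttle_data$use == "auto")

# Unadjusted model: wind sign only
fit_wind <- glm(auto ~ wind, family = binomial, data = shuttle_data)
or_unadjusted <- unname(exp(-coef(fit_wind)["windtail"]))

# Adjusted model: wind sign plus wind strength
fit_wind_magn <- glm(auto ~ wind + magn, family = binomial, data = shuttle_data)
or_adjusted <- unname(exp(-coef(fit_wind_magn)["windtail"]))

round(c(or_unadjusted, or_adjusted), 3)  # 0.969 0.969
```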
Q3. If you fit a logistic regression model to a binary variable, for example, use of the autolander, then fit a logistic regression model for one minus the outcome (not using the autolander), what happens to the coefficients?
Correct Answer: The coefficients reverse their signs.
Explanation: When you switch the binary outcome from “1” (using autolander) to “0” (not using autolander), the coefficients reverse their signs. This is because the log-odds for the two outcomes are negatives of each other.
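The sign reversal can be checked on the shuttle data. A minimal sketch, assuming the MASS package is installed; the columns `auto` and `not_auto` are created here for illustration.

```r
library(MASS)  # provides the shuttle data set

shuttle_data <- shuttle
shuttle_data$auto <- as.integer(shuttle_data$use == "auto")
shuttle_data$not_auto <- 1 - shuttle_data$auto

fit_auto <- glm(auto ~ wind, family = binomial, data = shuttle_data)
fit_not_auto <- glm(not_auto ~ wind, family = binomial, data = shuttle_data)

# Side by side: each coefficient is the negative of its counterpart
cbind(auto = coef(fit_auto), not_auto = coef(fit_not_auto))
```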
Q4. Consider the insect spray data (InsectSprays). Fit a Poisson model using spray as a factor level. Report the estimated relative rate comparing spray A (numerator) to spray B (denominator).
Correct Answer: 0.9457
Explanation: The estimated relative rate comparing spray A to spray B is 0.9457, indicating that the rate of events for spray A is about 94.57% of the rate of events for spray B.
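A minimal sketch of the Poisson fit. Spray A is the reference level, so the coefficient `sprayB` is the log rate of B relative to A, and the A-versus-B rate is its negative exponentiated.

```r
data(InsectSprays)

fit <- glm(count ~ spray, family = poisson, data = InsectSprays)

# Relative rate, spray A (numerator) versus spray B (denominator)
rate_a_vs_b <- unname(exp(-coef(fit)["sprayB"]))

round(rate_a_vs_b, 4)  # 0.9457
```

In this model the estimate equals the ratio of the two group means, `mean(count[spray == "A"]) / mean(count[spray == "B"])`.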
Q5. Consider a Poisson glm with an offset, t. So, for example, a model of the form glm(count ~ x + offset(t), family = poisson) where x is a factor variable comparing a treatment (1) to a control (0) and t is the natural log of a monitoring time. What is the impact of the coefficient for x if we fit the model glm(count ~ x + offset(t2), family = poisson) where t2 <- log(10) + t?
Correct Answer: The coefficient estimate is unchanged
Explanation: Adding a constant (log(10)) to the offset only changes the units of the monitoring time. The constant is absorbed by the intercept, which shifts by −log(10), so the coefficient for x is unchanged.
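A minimal numerical illustration. The question gives no data, so the sketch borrows `InsectSprays` and attaches hypothetical monitoring times; the columns `log_time` and `log_time2` are assumptions made for this example only.

```r
data(InsectSprays)
spray_data <- InsectSprays

# Hypothetical monitoring times (not part of the data set)
spray_data$log_time <- log(rep(c(1, 2, 4), length.out = nrow(spray_data)))
spray_data$log_time2 <- log(10) + spray_data$log_time

fit1 <- glm(count ~ spray + offset(log_time), family = poisson, data = spray_data)
fit2 <- glm(count ~ spray + offset(log_time2), family = poisson, data = spray_data)

# Spray coefficients agree; only the intercept moves, by -log(10)
cbind(fit1 = coef(fit1), fit2 = coef(fit2))
```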
Q6. Consider the data:
x <- -5:5
y <- c(5.12, 3.93, 2.67, 1.87, 0.52, 0.08, 0.93, 2.05, 2.54, 3.87, 4.97)
Using a knot point at 0, fit a linear model that looks like a hockey stick with two lines meeting at x=0. Include an intercept term, x and the knot point term. What is the estimated slope of the line after 0?
Correct Answer: 1.013
Explanation: The slope after the knot at 0 is the sum of the coefficient for x (the slope before the knot, about −1.024) and the coefficient for the knot term (the change in slope at 0, about 2.037). Their sum is about 1.013, so y increases by roughly 1.013 per unit of x for values of x greater than 0.
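A minimal sketch of the hockey-stick fit. The knot term is defined here as `(x > 0) * x`, one common way to build a linear spline term with a knot at 0; `knot_term` is an illustrative name.

```r
x <- -5:5
y <- c(5.12, 3.93, 2.67, 1.87, 0.52, 0.08, 0.93, 2.05, 2.54, 3.87, 4.97)

# Knot term: zero up to the knot, equal to x after it
knot_term <- (x > 0) * x

fit <- lm(y ~ x + knot_term)

slope_before <- unname(coef(fit)["x"])
slope_after <- unname(coef(fit)["x"] + coef(fit)["knot_term"])

round(slope_after, 3)  # 1.013
```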
Frequently Asked Questions (FAQ)
Are the “Regression Models” quiz answers accurate?
Yes, these answers have been carefully reviewed and align with the latest course material and best practices in regression modeling.
Can I use these answers for both practice and graded quizzes?
Absolutely! These answers are designed for both practice quizzes and graded assessments, helping you prepare thoroughly for all evaluations.
Does this guide cover all modules of the course?
Yes, this guide provides answers for every module, ensuring complete coverage of the entire course content.
Will this guide help me improve my regression modeling skills?
Yes, beyond providing quiz answers, this guide reinforces key concepts such as understanding linear and logistic regression, model fitting, interpretation of coefficients, and evaluation of model performance using metrics like R-squared, AUC, and confusion matrices.
Conclusion
We hope this guide to Regression Models Quiz Answers helps you excel in mastering regression techniques and succeed in your course. Bookmark this page for easy access and share it with your peers. Ready to enhance your predictive modeling skills and ace your quizzes? Let’s get started!
Sources: Regression Models