Get All Weeks Linear Regression for Business Statistics Coursera Quiz Answers
Week 01: Linear Regression for Business Statistics Coursera Quiz Answers
Quiz 1: Practice Quiz
Q1. In a regression, the variable of interest is also known as which of the following? Mark all that apply.
- independent variable
- dependent variable
- response variable
- Y variable
Q2. Which one of the following linear equations best represents an explanatory relationship in which hours worked in a year and the number of employees can be used to explain changes in yearly production volume?
- Production Volume = β0 + β1Hours Worked + β2Employees
- Production Volume = β1Hours Worked + β2Employees
- Number of Employees = β0 + β1Hours Worked + β2Production Volume
- Hours Worked = β0 + β1Production Volume + β2Employees
Quiz 2: Practice Quiz
Q1. Which of the following statements regarding regression are true? Mark all that apply.
- Multiple regression uses only one explanatory variable
- Only simple regression is a linear regression
- Both simple regression and multiple regression are linear regressions
- Multiple regression uses more than one explanatory variable
- Simple regression uses only one explanatory variable
Q2. Now that we have developed our model, we will estimate the model using software. Let’s continue the example from the previous lesson, in which our regression equation is Production Volume = β0 + β1Hours Worked + β2Employees.
What is the value for the coefficient β2, rounded to two decimal places?
- 501.21
- 2.25
- 142.26
Quiz 3: Practice Quiz
Q1. Continue with the same example from the previous lesson. The regression equation is Production Volume = β0 + β1Hours Worked + β2Employees with the following estimates:
What is the value of the Y variable (rounded to two decimal places) when all X variables are zero?
- 0
- 501.21
- 142.26
- 2.25
Q2. Notice that the value calculated in the previous question is β0 in the regression equation. Does the interpretation of β0 have managerial significance?
- No
- Yes
Quiz 4: Practice Quiz
Q1. Continue with the same example from the previous lesson. The regression equation is Production Volume = β0 + β1Hours Worked + β2Employees with the following estimates:
A manager wants to estimate the production volume for various numbers of employees and hours worked. Using the regression output, what is the best estimate for production volume if there are 4000 hours worked and 300 employees during a year?
- 49305
- 52170
- 41857
- 46559
Q2. In our regression model, assume a base case of 4000 hours worked and 300 employees for the year.
The manager has the opportunity to change the number of employees and hours worked for the year. Which of the following changes leads to the greatest predicted production volume?
- Let go of 10 employees and increase total hours by 500
- Keep the same number of employees and increase total hours by 300
- Hire 10 additional employees and increase total hours by 100
- Hire 20 additional employees and keep total hours the same
Quiz 5: Practice Quiz
Q1. Which of the following statements is true?
- If a regression has errors, then the equation is not good enough and a new one should be found
- The true relationship between two variables can usually be determined from regression
- Regression is a perfect process that has no errors
- Regression is a process that has errors
Q2. An R-square value of 1 indicates which of the following? Mark all that apply.
- The residuals are zero
- The predicted y-values equal the actual values
- The predicted y-values do not equal the actual values
- The residuals are large
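To see why an R-square of 1 goes together with zero residuals, here is a minimal Python sketch. The function name `r_squared` and the sample numbers are made up for illustration and are not course data.

```python
def r_squared(actual, predicted):
    """R-square = 1 - SSE / SST."""
    mean_y = sum(actual) / len(actual)
    # Sum of squared residuals (actual minus predicted)
    sse = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total sum of squares around the mean of y
    sst = sum((y - mean_y) ** 2 for y in actual)
    return 1 - sse / sst

actual = [10.0, 12.0, 15.0, 19.0]      # hypothetical observed values
perfect = list(actual)                 # predictions equal the actual values
imperfect = [11.0, 12.0, 14.0, 19.0]   # two predictions are off by 1

r2_perfect = r_squared(actual, perfect)      # residuals are all zero
r2_imperfect = r_squared(actual, imperfect)  # non-zero residuals lower R-square
print(r2_perfect, round(r2_imperfect, 4))
```

When every predicted value equals the actual value, the sum of squared residuals is zero and the ratio drops out, leaving exactly 1.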
Quiz 6: Practice Quiz
Q1. The residuals from a regression follow a _ distribution centered around _.
- t; x
- binomial; 0
- Poisson; x
- normal; 0
Q2. The expression (b0 – β0)/Sb0 follows a t distribution with n-k-1 degrees of freedom. What is Sb0?
- the standard error of b0
- the variance of b0
- the degrees of freedom of b0
- the average value of b0
Quiz 7: Regression Analysis: An Introduction
Q1. Download Grocery Store Sales, which provides data in the following categories: Sales per Square Foot, Size of Store (in Square Feet), Advertising Dollars (in thousands), and Number of Products Offered in Store, from a sample size of 70 grocery stores.
We want to see how changes in our independent variables affect Sales per Square Foot.
Please run one multiple regression including all independent variables to estimate the coefficients for each of our independent variables.
What is the coefficient for the Size of the Store? Please round to three decimal places.
This value is read from the Size of Store row in the coefficients table of the regression output; it is not reproduced here.
Q2. What is the coefficient for Advertising Dollars, rounded to three decimal places?
This value is read from the Advertising Dollars row in the coefficients table of the regression output; it is not reproduced here.
Q3. Based on the sign of the coefficient for the Number of Products in Store, how will changes in the Number of Products likely increase or decrease the Sales per Square Foot?
- As the Number of Products increases, the Sales per Square Foot will decrease.
- As the Number of Products decreases, the Sales per Square Foot will decrease.
- As the Number of Products increases, the Sales per Square Foot will increase.
- As the Number of Products decreases, the Sales per Square Foot will increase.
Q4. What is the Sales per Square Foot if all of our X variables are zero (in $) ? Please round to one decimal place.
If all independent variables are zero, the Sales per Square Foot would be equal to the intercept (β0).
Q5. What would be the expected Sales per Square Foot if the Size of Store was 60,000 square feet, they spent $70,000 in Advertising Dollars, and offered 30,000 products (in $) ? Please round to two decimal places.
To calculate the expected Sales per Square Foot with specific values for the independent variables, you need to use the regression equation and plug in those values. The expected value is the predicted value from the regression equation.
Q6. R square helps explain the goodness of fit of the model. What is the R square for this regression model? Round to two decimal places.
The R-squared (R²) value indicates the proportion of the variance in the dependent variable that is explained by the independent variables in the model, and so helps assess the goodness of fit. It is reported in the regression output and is not reproduced here.
Q7. How might one improve the goodness of fit for this model? Select all that apply.
- Include additional variables.
- Remove one or two of the independent variables.
- Consider that the relationship between the independent and dependent variables may not be linear.
- Remove some of the sample data at random.
Q8. What are some assumptions made about errors in a regression equation?
- Errors are not normally distributed with a mean of zero.
- Errors are normally distributed with a mean of zero.
- Errors are typically distributed equally above and below the regression line.
- Errors are not typically distributed equally above and below the regression line.
Q9. What is the residual degrees of freedom for the regression model?
The residual degrees of freedom for the regression model are calculated as n - k - 1, where "n" is the sample size and "k" is the number of independent variables. With 70 stores and three independent variables, that gives 70 - 3 - 1 = 66.
Q10. In utilizing notations, what are the primary differences in a regression model between b and β?
- The true value of β is never known.
- The true value of β is always known.
- The value of b is not normally distributed around the actual value of β.
- The value of b is normally distributed around the actual value of β.
Week 2 Quiz Answers
Quiz 1: Practice Quiz
Q1. From the video, the estimated coefficient produced from the regression for promotional expenditures is 1802.61 with a standard error of 392.85. However, the manager believes that the true value is 2000. To test this claim, we decide to run a hypothesis test. Which of the following is the correct calculation for the t-statistic?
- (1802.61 – 392.85) / 2000
- (2000 – 1802.61) / 392.85
- (1802.61 – 2000) / 392.85
- (2000 – 392.85) / 1802.61
Q2. Now that we have the t-statistic, we then calculate the value for t-cutoff. From the video, the t-cutoff is +/- 2.086. Do we reject the null hypothesis?
- We reject the null hypothesis because the t-statistic lies inside the rejection region
- We do not reject the null hypothesis because the t-statistic lies outside the rejection region
- We reject the null hypothesis because the t-statistic lies outside the rejection region
- We do not reject the null hypothesis because the t-statistic lies inside the rejection region
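The two questions above can be checked with a few lines of Python. This is a sketch that only uses the figures quoted in the questions (estimate 1802.61, standard error 392.85, claimed value 2000, t-cutoff 2.086); the variable names are illustrative.

```python
estimate = 1802.61   # estimated coefficient from the regression
claimed = 2000.0     # value stated in the null hypothesis
std_error = 392.85   # standard error of the coefficient
t_cutoff = 2.086     # two-tailed cutoff quoted in the question

# t-statistic = (estimate - hypothesized value) / standard error
t_stat = (estimate - claimed) / std_error

# Reject only if the t-statistic falls beyond the cutoff on either side
reject = abs(t_stat) > t_cutoff
print(round(t_stat, 2), reject)
```

The t-statistic comes out at roughly -0.50, which sits between -2.086 and +2.086, so it does not land in the rejection region.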
Quiz 2: Practice Quiz
Q1. In a one-tail test, the rejection region contains the probability of _. In a two-tail test, each rejection region contains a probability of _.
- α; α/2
- α; α
- α/2; α/2
- α/2; α
Q2. We will continue with the hypothesis test on the coefficient for promotional expenditures. The estimated coefficient is 1802.61 with a standard error of 392.85, and the claim is that the true value is 2000. The residual degrees of freedom obtained in the regression output is 20.
What is the p-value for this hypothesis test?
- 0.65
- 0.62
- 0.03
- 0.06
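As a rough check on the p-value, the sketch below integrates the t density numerically in plain Python. In practice one would more likely use a spreadsheet function such as Excel's T.DIST.2T or a statistics library; the helper names `t_pdf` and `two_tail_p` and the choice of Simpson's rule are assumptions made here to keep the example self-contained.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tail_p(t_stat, df, steps=1000):
    """Two-tailed p-value: 1 minus the area between -|t| and +|t|."""
    a = abs(t_stat)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_a = total * h / 3
    return 1 - 2 * area_zero_to_a  # the distribution is symmetric around 0

t_stat = (1802.61 - 2000) / 392.85   # figures quoted in the question
p_value = two_tail_p(t_stat, df=20)  # residual degrees of freedom = 20
print(round(p_value, 2))
```

With a t-statistic of about -0.50 and 20 degrees of freedom, the two-tailed p-value is about 0.62, far above the usual α of 0.05.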
Quiz 3: Practice Quiz
Q1. Review the video again to find the 95% confidence interval for the coefficient for promotional expenditures. What can we conclude about the claim that the true value of the coefficient is 2000?
- Because 2000 lies within the 95% confidence interval, we cannot reject the null hypothesis.
- Because 2000 lies within the 95% confidence interval, we can reject the null hypothesis.
- Because 2000 lies outside the 95% confidence interval, we cannot reject the null hypothesis.
- Because 2000 lies outside the 95% confidence interval, we can reject the null hypothesis.
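The interval itself can be rebuilt from the figures quoted in the earlier questions, as estimate ± t-cutoff × standard error. This is a sketch under that assumption; the video's output may show slightly different rounding.

```python
estimate = 1802.61   # estimated coefficient for promotional expenditures
std_error = 392.85   # its standard error
t_cutoff = 2.086     # cutoff for 95% confidence with 20 residual df

margin = t_cutoff * std_error
lower = estimate - margin
upper = estimate + margin

claimed = 2000.0
inside = lower <= claimed <= upper   # True means we cannot reject the claim
print(round(lower, 2), round(upper, 2), inside)
```

The interval runs from roughly 983 to roughly 2622, so the claimed value of 2000 falls inside it.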
Q2. The p-value provided by Excel for each coefficient corresponds to the hypothesis test as to whether each coefficient is zero. Suppose the p-value for a coefficient is greater than our α value. What can we conclude about the estimated coefficient?
- It is not significant because we can reject the claim that the true value is zero.
- It is significant because we can reject the claim that the true value is zero.
- It is not significant because we cannot reject the claim that the true value is zero.
- It is significant because we cannot reject the claim that the true value is zero.
Quiz 4: Practice Quiz
Q1. Refer to the regression output from the video lesson. The coefficient for the annual income is 0.4891. What is an appropriate interpretation for this value? Mark all that apply.
- For every dollar increase in income, the home price increases by 0.4891 dollars, all other variables remaining the same.
- For every dollar increase in home price, the income increases by 0.4891 dollars, all other variables remaining the same.
- For every dollar increase in income, the home price decreases by 0.4891 dollars, all other variables remaining the same.
- For every 1000 dollar increase in income, the home price increases by 489.1 dollars, all other variables remaining the same.
Q2. What does the estimated value of 0.4891 tell us about the true value of the coefficient?
- The true value could be greater than or less than 0.4891.
- The true value must be greater than 0.4891.
- The true value must be less than 0.4891.
- The true value must equal 0.4891.
Quiz 5: Practice Quiz
Q1. True or false: the R-square value indicates the proportion of total sum of squares explained by the regression.
- True
- False
Q2. When explanatory variables are added to a regression, the R-square value _ increases whereas the adjusted R-square value _ increases.
- always; always
- sometimes; never
- always; sometimes
- never; always
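The difference between the two measures is easier to see with numbers. The sketch below uses the standard formula adjusted R-square = 1 - (1 - R²)(n - 1)/(n - k - 1); the sample size and R-square values are hypothetical and chosen only to show a case where R-square rises while adjusted R-square falls.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-square for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 30  # hypothetical sample size

# Hypothetical model with 2 explanatory variables
r2_before = 0.800
adj_before = adjusted_r_squared(r2_before, n, k=2)

# Adding a third, nearly useless variable nudges R-square up only slightly
r2_after = 0.801
adj_after = adjusted_r_squared(r2_after, n, k=3)

print(round(adj_before, 4), round(adj_after, 4))
```

The penalty for the extra variable outweighs the tiny gain in fit here, so the adjusted value drops even though R-square itself went up.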
Quiz 6: Practice Quiz
Q1. Which of the following could be appropriate categorical variables?
- gender
- weight
- profession
- eye color
- height
Q2. A categorical variable that has five different categories requires __ dummy variables.
A categorical variable with five different categories requires four dummy variables.
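A short Python sketch of this rule: one category is chosen as the reference and gets no column, and each remaining category gets its own 0/1 column. The function name `make_dummies` and the five category labels are hypothetical.

```python
def make_dummies(values, reference):
    """Build one 0/1 column per category, skipping the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical categorical variable with five categories
regions = ["North", "South", "East", "West", "Central"]
dummies = make_dummies(regions, reference="Central")

for name, column in dummies.items():
    print(name, column)
print(len(dummies), "dummy variables")
```

The row for the reference category ("Central" in this example) is all zeros across the four columns, which is how the regression identifies it.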
Week 3 Quiz Answers
Quiz 1: Practice Quiz
Q1. Refer to the example shown in the video; the region is represented by two separate dummy variables, REGA and REGB, such that region C is the reference category. Which combination of values for REGA and REGB are valid? Select all that apply.
- REGA = 0; REGB = 0
- REGA = 1; REGB = 2
- REGA = 1; REGB = 0
- REGA = 1; REGB = 1
- REGA = 0; REGB = 1
Q2. Continue with the same example. To denote that delivery is made to region C, what should the values of REGA and REGB be?
- REGA = 0; REGB = 1
- REGA = 0; REGB = 0
- REGA = 1; REGB = 1
- REGA = 1; REGB = 0
Quiz 2: Practice Quiz
Q1. Refer to the regression from the video. Which of the following regions can be used as the reference category? Select all that apply.
- Region A
- Region B
- Region C
Q2. Suppose we choose region A as the reference category. We run the regression and obtain the following equation:
Minutes = β0 + β1REGB + β2REGC+ β3Parcels + β4TruckAge.
What does β2 represent?
- The difference between the fixed time to deliver to Region C versus the fixed time to deliver to Region A
- The fixed time it takes to deliver to Region C
- The difference between the fixed time to deliver to Region C versus the fixed time to deliver to Region B
- The time it takes to deliver to region C when all other explanatory variables are 0
Quiz 3: Practice Quiz
Q1. Refer to the regression from the video with the following estimated equation:
Minutes = -34.76 + 107.71REGA + 1.21REGB+ 9.92Parcels + 3.68TruckAge.
Approximately how long does it take to deliver 50 parcels to Region A using a truck that is 5 years old? Round your answer to the lowest integer.
- 307
- 480
- 587
- 569
Q2. Suppose the truck driver is on a tight schedule and wants to reduce the time of delivery by at least 100 minutes. Which of the following changes made to the delivery in question 1 would accomplish this goal? Select all that apply.
- Use a brand new truck instead of a 5-year-old truck
- Deliver the same number of parcels to Region B instead of Region A
- Deliver 30 parcels instead of 50 parcels to region A
- Deliver the same number of parcels to Region C instead of Region A
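Both questions in this quiz come down to plugging values into the estimated equation. The sketch below does that in Python using the coefficients quoted in Q1; the function name `minutes` and the scenario labels are illustrative.

```python
import math

def minutes(rega, regb, parcels, truck_age):
    """Estimated delivery time from the equation quoted in Q1."""
    return -34.76 + 107.71 * rega + 1.21 * regb + 9.92 * parcels + 3.68 * truck_age

# Q1: 50 parcels to Region A (REGA = 1, REGB = 0) with a 5-year-old truck
base = minutes(1, 0, 50, 5)
print(math.floor(base))

# Q2: time saved by each proposed change relative to the base delivery
scenarios = {
    "new truck": minutes(1, 0, 50, 0),
    "region B": minutes(0, 1, 50, 5),
    "30 parcels": minutes(1, 0, 30, 5),
    "region C": minutes(0, 0, 50, 5),
}
savings = {name: base - value for name, value in scenarios.items()}
meets_goal = [name for name, saved in savings.items() if saved >= 100]
print(meets_goal)
```

The base delivery comes to about 587 minutes. Switching to a brand new truck saves only about 18 minutes, while the other three changes each save more than 100.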
Quiz 4: Practice Quiz
Q1. Refer to the video lesson. When the first regression using CoolSize is changed to the second regression using RefSize, why must the column for CoolSize be moved to the far right?
- After performing the calculation to obtain RefSize, CoolSize must be moved to the end to allow the formula to work correctly.
- When running the regression in Excel, all explanatory variables must be placed side by side, so CoolSize must be moved to the end.
- CoolSize is moved to the end strictly for aesthetic purposes.
Q2. Refer to the regression using FreezeSize and RefSize. When interpreting a unit increase in the coefficient for FreezeSize, we assume that all other variables remain the same. What does this imply about the change in CoolSize (hint: CoolSize is not a variable in this regression, but can be derived from the values of FreezeSize and RefSize)?
- CoolSize must increase if FreezeSize increases
- CoolSize must decrease if FreezeSize increases
- CoolSize must stay the same if FreezeSize increases
- The change in CoolSize cannot be inferred
Quiz 5: Practice Quiz
Q1. Refer to the regressions on refrigerator price. How many dollars does the price of the refrigerator increase by when the freezer size increases by 1 cubic foot and the cooler size remains the same?
- 213.88
- 76.50
- 137.38
Q2. How many dollars does the price of the refrigerator increase by when the freezer size increases by 1 cubic foot and the cooler size decreases by 1 cubic foot?
- 76.50
- 137.38
- 213.88
Quiz 6: Practice Quiz
Q1. Which of the following is true regarding a regression with a high level of multicollinearity? Select all that apply.
- The regression might still be able to predict the dependent variable accurately
- The regression will not be able to predict the dependent variable accurately
- The regression can be used to interpret the impact of coefficients accurately
- The regression cannot be used to interpret the impact of coefficients accurately
Q2. Which of the following pairs of explanatory variables likely has the highest amount of correlation?
- length of right foot and length of left foot of a person
- height and weight of a person
- height and salary of a person
- weight and salary of a person
Week 4 Quiz Answers
Quiz 1: Practice Quiz
Q1. Please select all that ‘Mean-centering of variables’ does for a regression model.
- Mean centering is never useful in a regression model.
- It allows the intercept to be interpreted more meaningfully.
- It improves R-square.
- It improves prediction using the regression model.
Q2. One could center variables at a value other than the mean. True or False?
- False
- True
Q3. Mean-centering the Y variable helps in the interpretation of the intercept in the regression model. True or False?
- True
- False
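A small Python sketch can show what mean-centering an X variable does. It fits a one-variable regression twice, once on the raw X and once on the centered X. The data and the helper name `fit_line` are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for one explanatory variable: (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

b0, b1 = fit_line(x, y)

# Mean-center X by subtracting its average from every observation
mean_x = sum(x) / len(x)
x_centered = [a - mean_x for a in x]
c0, c1 = fit_line(x_centered, y)

mean_y = sum(y) / len(y)
print(round(b0, 3), round(b1, 3))
print(round(c0, 3), round(c1, 3), round(mean_y, 3))
```

The slope is the same in both fits, so predictions and fit quality do not change. Only the intercept moves: after centering it equals the predicted Y at the average X, which is usually a more meaningful number than the predicted Y at X = 0.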
Quiz 2: Practice Quiz
Q1. Please choose all that apply.
- The formula for the confidence interval for a predicted value uses the ‘standard error’ of regression produced below the adjusted R-squared.
- The predicted value (the point prediction) is exactly at the center of the confidence interval.
- All values in the confidence interval are equally likely.
- The confidence interval for the predicted value is a way of incorporating uncertainty in our prediction.
Q2. What is the correct formula for the margin of error for constructing a 95% confidence interval for the predicted value?
- |T.INV(0.025,residual df)|*std error of regression
- |T.DIST(0.025,residual df)|*std error of regression
- |T.INV(0.05,residual df)|*std error of regression
- T.INV(0.025,residual df)*std error of regression
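The margin-of-error formula can be mirrored in Python. T.INV(0.025, df) returns a negative left-tail cutoff, which is why the absolute value is needed. The sketch below uses a hand-rolled inverse t function (numerical integration plus bisection) purely to stay self-contained; the helper names and the standard error of 1000 are assumptions, not values from the course.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Left-tail probability via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inv(p, df):
    """Left-tail inverse (same role as Excel's T.INV), found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

residual_df = 20
std_error_of_regression = 1000.0  # hypothetical value

cutoff = t_inv(0.025, residual_df)            # negative number
margin = abs(cutoff) * std_error_of_regression
print(round(cutoff, 3), round(margin, 1))
```

For 20 residual degrees of freedom the cutoff comes out near -2.086, matching the ±2.086 quoted in the Week 2 hypothesis test.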
Quiz 3: Practice Quiz
Q1. Which of the following are true in regard to interaction variables? Please select all that apply.
- Interaction variables allow you to study the impact of one variable at different levels of another variable.
- Interaction variables are created by adding the variables.
- Interaction variables are created by multiplying the variables.
- Interaction variables are created by either adding or multiplying the variables.
Q2. Following is a regression equation developed using salary data for employees at a company:
Salary = β0 + β1Male + β2Age + β3Male*Age
Salary is measured in dollars. Age is measured in years and Male is a dummy variable representing the categorical variable Gender.
What is the interpretation of β2 ? Please mark the most appropriate answer.
- It is the salary change with each additional year of age.
- It is the change in age with every dollar increase in salary.
- It is the change in salary with each additional year of age for a female employee.
- It is the change in salary with each additional year of age for a male employee.
Quiz 4: Practice Quiz
Q1. When creating an interaction variable, one of the variables has to be a dummy variable. Is this statement True or False?
- True
- False
Q2. Following is a regression equation relating salary to gender and years of experience.
Salary = β0 + β1Male + β2Years_of_Experience + β3Male*Years_of_Experience
Salary is measured in dollars. Years_of_Experience is measured in years and Male is a dummy variable representing the categorical variable Gender.
What is the interpretation of β3? Please mark the most appropriate answer.
- It is the salary change for male employees.
- It is the change in years of experience at different salary levels.
- It is the change in salary with each additional year of experience.
- It is the ‘extra’ change in salary with each additional year of experience for a male employee as compared to a female employee.
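To make the roles of the coefficients concrete, the sketch below evaluates the salary equation with an interaction term in Python. All four coefficient values are hypothetical; only the structure of the equation comes from the question.

```python
# Hypothetical coefficients for
# Salary = b0 + b1*Male + b2*Years_of_Experience + b3*Male*Years_of_Experience
b0, b1, b2, b3 = 30000.0, 2000.0, 1500.0, 400.0

def salary(male, years):
    """Predicted salary; the interaction term is the product male * years."""
    return b0 + b1 * male + b2 * years + b3 * male * years

# Change in salary for one more year of experience, by gender
slope_female = salary(0, 6) - salary(0, 5)   # Male = 0, so only b2 applies
slope_male = salary(1, 6) - salary(1, 5)     # Male = 1, so b2 + b3 applies

extra_for_male = slope_male - slope_female
print(slope_female, slope_male, extra_for_male)
```

With the dummy set to 0 the interaction term vanishes, so the per-year change is b2 alone. With the dummy set to 1 it is b2 + b3, and the gap between the two is exactly b3.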
Quiz 5: Practice Quiz
Q1. Please select all statements that apply.
- Transforming variables in a regression implies adding additional variables to the regression model.
- Transforming variables in a regression may improve the R-square of the model.
- Natural log transformation is a common transformation used in regression.
- There are transformations other than the natural log that can be used in the regression.
- Transforming variables in a regression may improve the linearity of the model.
Q2. In the following regression model, what is the correct interpretation of β1?
LN(Y) = β0 + β1X1 + β2X2
Please select all that apply.
- For every unit increase in X1, the natural log of the Y variable increases by β1 units, all other variables are kept at the same level.
- For every % increase in X1, the Y variable increases by β1 %, all other variables are kept at the same level.
- For every % increase in X1, the natural log of the Y variable increases by β1 %, all other variables are kept at the same level.
- For every unit increase in X1, the natural log of the Y variable increases by 100*β1 %, all other variables are kept at the same level.
- For every unit increase in X1, the Y variable increases by 100*β1 %, all other variables are kept at the same level.
Q3. In the following regression model, what is the correct interpretation of β2?
LN(Y) = β0 + β1ln(X1) + β2ln(X2)
Please select all that apply.
- For every % increase in X2, the Y variable increases by β2 %, and all other variables are kept at the same level.
- For every unit increase in X2, the Y variable increases by 100*β2 %, all other variables are kept at the same level.
- For every unit increase in X2, the natural log of the Y variable increases by β2 units, all other variables are kept at the same level.
- For every % increase in X2, the natural log of the Y variable increases by β2 %, all other variables are kept at the same level.
- For every unit increase in X2, the natural log of the Y variable increases by 100*β2 %, all other variables are kept at the same level.
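The percentage interpretations in Q2 and Q3 are approximations, and a quick Python check shows how close they are. The coefficient values 0.03 and 1.2 are hypothetical and chosen only for illustration.

```python
import math

# Semi-log model: LN(Y) = b0 + b*X
# A one-unit increase in X multiplies Y by exp(b).
b_semilog = 0.03  # hypothetical coefficient
exact_semilog = 100 * (math.exp(b_semilog) - 1)   # exact % change in Y
approx_semilog = 100 * b_semilog                  # rule of thumb: 100*b %

# Log-log model: LN(Y) = b0 + b*LN(X)
# A 1% increase in X multiplies Y by 1.01 ** b.
b_loglog = 1.2  # hypothetical coefficient (an elasticity)
exact_loglog = 100 * (1.01 ** b_loglog - 1)       # exact % change in Y
approx_loglog = b_loglog                          # rule of thumb: b %

print(round(exact_semilog, 3), approx_semilog)
print(round(exact_loglog, 3), approx_loglog)
```

For small coefficients the exact and approximate figures nearly coincide, which is why the coefficient in a semi-log model is read as roughly 100*β % per unit of X and the coefficient in a log-log model as roughly β % per 1% change in X.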
Quiz 6: Practice Quiz
Q1. Which of the following is the right function to calculate the natural log in Excel?
- =LOG( )
- =LN( )
- =NLOG( )
- =NATLOG( )
Q2. The coefficients in a log-log model can directly be interpreted as:
- Growth rates
- Elasticities
- Exponential decays
- Sum of squares
Q3. Which of the following are reasons to take a natural log transformation of variables in a regression model?
Select all that apply.
- To make the regression complicated.
- To improve the R-square measure.
- It is popular to take the natural log transformation.
- To interpret the beta coefficients directly as elasticities or growth rates.
Quiz 7: Regression Analysis: Various Extensions
Q1. Data for Questions 1 through 5 are contained in the file realestate.xlsx. Please download this file.
The data contains information about apartment prices and characteristics for a sought-after area in a large metropolitan city in the USA. The data include sale price (PRICE) in $, floor area (SQFT) in square feet, number of bedrooms (BED), number of bathrooms (BATH), number of floors in the building (FLOORS), and distance from a centrally located city park (DIST) in meters.
You need to establish a relationship between PRICE and these other characteristics. Specifically, estimate the following regression model,
LN(PRICE) = β0 + β1LN(SQFT) + β2BED + β3BATH + β4FLOORS + β5DIST
Notice that in the regression you need to take a log transformation of the PRICE and SQFT variables. Report the estimated value of β4, and round the answer to four decimal digits.
0.0001
Q2. How do you interpret the coefficient estimate of β1 ?
- When the size of the apartment increases by 1%, then the Price increases by 1.013%, all other variables remaining at the same level.
- When the Price increases by 100,000$, the size of the apartment increases by 1.013*100 = 101.3 sqft.
- When the size of the apartment increases by 1 unit, then the Price increases by 1.013 units, all other variables remaining at the same level.
- When the size of the apartment increases by 1 unit, then the Price increases by 1.013 %, all other variables remaining at the same level.
Q3. What is the impact of an additional Bathroom on apartment price?
- All other variables being held constant, an additional Bathroom does not significantly impact the price.
- All other variables being held constant, an additional Bathroom raises the apartment price by 0.0293%.
- All other variables being held constant, an additional Bathroom raises the apartment price by 29,300$.
- All other variables being held constant, an additional Bathroom raises the apartment price by 2.93%.
Q4. Using the estimated regression model, predict the price in dollars of an apartment that is 1000 sqft in size, has 2 Bedrooms, 2 Bathrooms, is in a building with 8 Floors, and is 1.2 Km from the City Park. Round your answer to a whole number, and input the answer without any “$” or “,” sign.
440032
Q5. Calculate a 95% confidence interval for your predicted price from Question 4.
Report the lower limit of the confidence interval (in dollars), and round your answer to a whole number. Input the answer without any “$” or “,” sign.
313916
Q6. Data for Questions 6 through 11 is contained in the file Majors.xlsx. Please download this file.
The data contains information about the starting salary of a sample of 50 undergraduate students at a Business school. The data consists of the starting salary (SALARY) in dollars; the field of study of the student (MAJOR), which is either ‘Finance’ or ‘International Business’; and UGPA, the undergraduate Grade Point Average of the student.
Estimate a regression model linking starting salary to the field of study and UGPA as follows,
SALARY = β0 + β1IB + β2UGPA
In the above regression, IB is a dummy variable that takes a value =1 when the MAJOR is IB, otherwise, it takes a value 0.
Report the estimated value of β1, and round the answer to a whole number.
11495
Q7. Now, mean-center the UGPA variable. That is, subtract the mean value of UGPA from all the data points. Denote this mean-centered variable as [UGPA].
Run a regression as follows,
SALARY = β0 + β1IB + β2[UGPA]
Round the estimated value of β0 to a whole number and interpret it. Please mark all that apply.
- 60,630$ is the salary of a FINANCE Major with a UGPA equal to the average UGPA observed in the data.
- 60,630$ is the salary of an IB Major with a UGPA equal to the average UGPA observed in the data.
- 60,630$ is the salary of a FINANCE Major with 0 UGPA
- 60,630$ is the value of the Y variable when all X variables are zero.
Q8. Based on the regression carried out in Question 7, how much less salary (in dollars) does an IB Major get as compared to a FINANCE Major, when they have the same UGPAs? Round your answer to a whole number. Input the answer without any “$” or “,” sign.
10412
Q9. There is a belief among students that a higher UGPA is more important in terms of impacting the starting salary for IB undergraduates as compared to FINANCE undergraduates.
You can empirically check for this belief by introducing an interaction variable in your regression model constructed in Question 7 and then checking the estimated coefficient for that variable.
To introduce the interaction variable, which variables would you interact?
- Intercept and IB
- IB and [UGPA]
- Intercept and [UGPA]
Q10. Introduce an interaction effect in your data and estimate the model. Report the estimate of the coefficient on the interaction variable. Please round your answer to a whole number.
1215
Q11. How do you interpret the coefficient on the interaction effect?
- The coefficient is the differential impact of UGPA on the starting salary of FINANCE majors as compared to IB majors.
- The coefficient is the impact of UGPA on the starting salary of IB majors.
- The coefficient is the impact of UGPA on the starting salary of FINANCE majors.
- The coefficient is the differential impact of UGPA on the starting salary of IB majors as compared to FINANCE majors.
- The coefficient is the impact of UGPA on starting salary.