Linear Regression for Business Statistics Coursera Quiz Answers


Regression Analysis is perhaps the single most important Business Statistics tool used in the industry. Regression is the engine behind a multitude of data analytics applications used for many forms of forecasting and prediction.

This is the fourth course in the specialization, “Business Statistics and Analysis”. The course introduces you to the very important tool known as Linear Regression. You will learn to apply various procedures such as dummy variable regressions, transforming variables, and interaction effects. All these are introduced and explained using easy-to-understand examples in Microsoft Excel.

The focus of the course is on understanding and application, rather than detailed mathematical derivations. Note: This course uses the ‘Data Analysis’ toolbox which is standard with the Windows version of Microsoft Excel. It is also standard with the 2016 or later Mac version of Excel. However, it is not standard with earlier versions of Excel for Mac.



Week 1 Quiz Answers

Quiz 1: Practice Quiz

Q1. In a regression, the variable of interest is also known as which of the following? Mark all that apply.

  • independent variable
  • dependent variable
  • response variable
  • Y variable

Q2. Which one of the following linear equations best represents an explanatory relationship in which hours worked in a year and number of employees can be used to explain changes in yearly production volume?

  • Production Volume = β0 + β1Hours Worked + β2Employees
  • Production Volume = β1Hours Worked + β2Employees
  • Number of Employees = β0 + β1Hours Worked + β2Production Volume
  • Hours Worked = β0 + β1Production Volume + β2Employees

Quiz 2: Practice Quiz

Q1. Which of the following statements regarding regression are true? Mark all that apply.

  • Multiple regression uses only one explanatory variable
  • Only simple regression is a linear regression
  • Both simple regression and multiple regression are linear regressions
  • Multiple regression uses more than one explanatory variable
  • Simple regression uses only one explanatory variable

Q2. Now that we have developed our model, we will estimate the model using software. Let’s continue the example from the previous lesson, in which our regression equation is Production Volume = β0 + β1Hours Worked + β2Employees.

Which is the value for the coefficient β2, rounded to two decimal places?

  • 501.21
  • 2.25
  • 142.26

Quiz 3: Practice Quiz

Q1. Continue with the same example from the previous lesson. The regression equation is Production Volume = β0 + β1Hours Worked + β2Employees with the following estimates:

What is the value of the Y variable (rounded to two decimal points) when all X variables are zero?

  • 0
  • 501.21
  • 142.26
  • 2.25

Q2. Notice that the value calculated in the previous question is β0 in the regression equation. Does the interpretation of β0 have managerial significance?

  • No
  • Yes

Quiz 4: Practice Quiz

Q1. Continue with the same example from the previous lesson. The regression equation is Production Volume = β0 + β1Hours Worked + β2Employees with the following estimates:

A manager wants to estimate the production volume for various numbers of employees and hours worked. Using the regression output, what is the best estimate for production volume if there are 4000 hours worked and 300 employees during a year?

  • 49305
  • 52170
  • 41857
  • 46559

Q2. In our regression model, assume a base case of 4000 hours worked and 300 employees for the year.

The manager has the opportunity to change the number of employees and hours worked for the year. Which of the following changes leads to the greatest predicted production volume?

  • Let go of 10 employees and increase total hours by 500
  • Keep the same number of employees and increase total hours by 300
  • Hire 10 additional employees and increase total hours by 100
  • Hire 20 additional employees and keep total hours the same

Quiz 5: Practice Quiz

Q1. Which of the following statements is true?

  • If a regression has errors, then the equation is not good enough and a new one should be found
  • The true relationship between two variables can usually be determined from regression
  • Regression is a perfect process that has no errors
  • Regression is a process that has errors

Q2. An R-square value of 1 indicates which of the following? Mark all that apply.

  • The residuals are zero
  • The predicted y-values equal the actual values
  • The predicted y-values do not equal the actual values
  • The residuals are large

Quiz 6: Practice Quiz

Q1. The residuals from a regression follow a _ distribution centered around _.

  • t; x
  • binomial; 0
  • Poisson; x
  • normal; 0

Q2. The expression (b0 – β0)/Sb0 follows a t distribution with n-k-1 degrees of freedom. What is Sb0?

  • the standard error of b0
  • the variance of b0
  • the degrees of freedom of b0
  • the average value of b0

Quiz 7: Regression Analysis: An Introduction

Q1. Download Grocery Store Sales, which provides data in the following categories: Sales per Square Foot, Size of Store (in Square Feet), Advertising Dollars (in thousands), and Number of Products Offered in Store, from a sample size of 70 grocery stores.

We want to see how changes in our independent variables affect Sales per Square Foot.

Please run one multiple regression including all independent variables to estimate the coefficients for each of our independent variables.

What is the coefficient for Size of Store? Please round to three decimal places.

Enter answer here

Q2. What is the coefficient for Advertising Dollars, rounded to three decimal places?

Enter answer here

Q3. Based on the sign of the coefficient for Number of Products in Store, how will changes in Number of Products likely increase or decrease the Sales per Square Foot?

  • As the Number of Products increases, the Sales per Square Foot will decrease.
  • As the Number of Products decreases, the Sales per Square Foot will decrease.
  • As the Number of Products increases, the Sales per Square Foot will increase.
  • As the Number of Products decreases, the Sales per Square Foot will increase.

Q4. What is the Sales per Square Foot if all of our X variables are zero (in $)? Please round to one decimal place.

Enter answer here

Q5. What would be the expected Sales per Square Foot if the Size of Store was 60,000 square feet, they spent $70,000 in Advertising Dollars, and offered 30,000 products (in $)? Please round to two decimal places.

Enter answer here

Q6. R square helps explain the goodness of fit of the model. What is the R square for this regression model? Round to two decimal places.

Enter answer here

Q7. How might one improve the goodness of fit for this model? Select all that apply.

  • Include additional variables.
  • Remove one or two of the independent variables.
  • Consider that the relationship between the independent and dependent variables may not be linear.
  • Remove some of the sample data at random.

Q8. What are some assumptions made about errors in a regression equation?

  • Errors are not normally distributed with a mean of zero.
  • Errors are normally distributed with a mean of zero.
  • Errors are typically distributed equally above and below the regression line.
  • Errors are not typically distributed equally above and below the regression line.

Q9. What is the residual degrees of freedom for the regression model?

Enter answer here

Q10. In utilizing notations, what are the primary differences in a regression model between b and β?

  • The true value of β is never known.
  • The true value of β is always known.
  • The value of b is not normally distributed around the actual value of β.
  • The value of b is normally distributed around the actual value of β.

Week 2 Quiz Answers

Quiz 1: Practice Quiz

Q1. From the video, the estimated coefficient produced from the regression for promotional expenditures is 1802.61 with a standard error of 392.85. However, the manager believes that the true value is 2000. To test this claim, we decide to run a hypothesis test. Which of the following is the correct calculation for the t-statistic?

  • (1802.61 – 392.85) / 2000
  • (2000 – 1802.61) / 392.85
  • (1802.61 – 2000) / 392.85
  • (2000 – 392.85) / 1802.61

Q2. Now that we have the t-statistic, we then calculate the value for t-cutoff. From the video, the t-cutoff is +/- 2.086. Do we reject the null hypothesis?

  • We reject the null hypothesis because the t-statistic lies inside the rejection region
  • We do not reject the null hypothesis because the t-statistic lies outside the rejection region
  • We reject the null hypothesis because the t-statistic lies outside the rejection region
  • We do not reject the null hypothesis because the t-statistic lies inside the rejection region
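
The two questions above can be checked with a few lines of Python. This is only a sketch using the numbers quoted in the questions (estimate 1802.61, standard error 392.85, claimed value 2000, cutoff 2.086); the variable names are illustrative.

```python
estimate = 1802.61      # estimated coefficient, from Q1
std_error = 392.85      # its standard error, from Q1
claimed_value = 2000    # the manager's hypothesized true value
t_cutoff = 2.086        # two-tailed cutoff quoted in Q2

# t-statistic: (estimate - hypothesized value) / standard error
t_stat = (estimate - claimed_value) / std_error

# Reject the null only if the statistic falls beyond the cutoff in either tail.
reject_null = abs(t_stat) > t_cutoff
print(round(t_stat, 2), reject_null)
```

The statistic is about -0.50, well inside the interval between -2.086 and +2.086, so it lies outside the rejection region.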

Quiz 2: Practice Quiz

Q1. In a one-tail test, the rejection region contains probability of _. In a two-tail test, each rejection region contains probability of _.

  • α; α/2
  • α; α
  • α/2; α/2
  • α/2; α

Q2. We will continue with the hypothesis test on the coefficient for promotional expenditures. The estimated coefficient is 1802.61 with a standard error of 392.85, and the claim is that the true value is 2000. The residual degrees of freedom obtained in the regression output is 20.

What is the p-value for this hypothesis test?

  • 0.65
  • 0.62
  • 0.03
  • 0.06
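
For readers without Excel at hand, the p-value can be approximated with the Python standard library alone. The sketch below integrates the Student's t density numerically with Simpson's rule; the helper names `t_pdf` and `two_tailed_p` are made up for this example, and in Excel the same quantity would normally come from a function such as T.DIST.2T.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on the interval [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3           # P(0 < T < |t|)
    return 1 - 2 * central_half            # probability in both tails combined

t_stat = (1802.61 - 2000) / 392.85         # values from the question
p_value = two_tailed_p(t_stat, df=20)      # 20 residual degrees of freedom
print(round(p_value, 2))
```

Because the claim is a two-sided statement about the coefficient, the sketch adds the probability in both tails.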

Quiz 3: Practice Quiz

Q1. Review the video again to find the 95% confidence interval for the coefficient for promotional expenditures. What can we conclude about the claim that the true value of the coefficient is 2000?

  • Because 2000 lies within the 95% confidence interval, we cannot reject the null hypothesis.
  • Because 2000 lies within the 95% confidence interval, we can reject the null hypothesis.
  • Because 2000 lies outside the 95% confidence interval, we cannot reject the null hypothesis.
  • Because 2000 lies outside the 95% confidence interval, we can reject the null hypothesis.
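
The interval itself is shown in the video, but it can be approximated from numbers quoted earlier on this page. The sketch below assumes the interval is built as estimate ± t-cutoff × standard error, using the cutoff of 2.086 from Quiz 1; small rounding differences from the video's figures are possible.

```python
estimate = 1802.61
std_error = 392.85
t_cutoff = 2.086        # 95% two-tailed cutoff for 20 residual degrees of freedom
claimed_value = 2000

margin = t_cutoff * std_error
lower, upper = estimate - margin, estimate + margin

# The claim cannot be rejected at the 5% level if it lies inside the interval.
claim_inside = lower <= claimed_value <= upper
print(round(lower, 2), round(upper, 2), claim_inside)
```

This gives the same conclusion as the t-test, since both use the same estimate, standard error, and cutoff.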

Q2. The p-value provided by Excel for each coefficient corresponds to the hypothesis test as to whether each coefficient is zero. Suppose the p-value for a coefficient is greater than our α value. What can we conclude about the estimated coefficient?

  • It is not significant because we can reject the claim that the true value is zero.
  • It is significant because we can reject the claim that the true value is zero.
  • It is not significant because we cannot reject the claim that the true value is zero.
  • It is significant because we cannot reject the claim that the true value is zero.

Quiz 4: Practice Quiz

Q1. Refer to the regression output from the video lesson. The coefficient for the annual income is 0.4891. What is an appropriate interpretation for this value? Mark all that apply.

  • For every dollar increase in income, the home price increases by 0.4891 dollars, all other variables remaining the same.
  • For every dollar increase in home price, the income increases by 0.4891 dollars, all other variables remaining the same.
  • For every dollar increase in income, the home price decreases by 0.4891 dollars, all other variables remaining the same.
  • For every 1000 dollar increase in income, the home price increases by 489.1 dollars, all other variables remaining the same.

Q2. What does the estimated value of 0.4891 tell us about the true value of the coefficient?

  • The true value could be greater than or less than 0.4891.
  • The true value must be greater than 0.4891.
  • The true value must be less than 0.4891.
  • The true value must equal 0.4891.

Quiz 5: Practice Quiz

Q1. True or false: the R-square value indicates the proportion of total sum of squares explained by the regression.

  • True
  • False

Q2. When explanatory variables are added to a regression, the R-square value _ increases whereas the adjusted R-square value _ increases.

  • always; always
  • sometimes; never
  • always; sometimes
  • never; always
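
To see why the two measures can move differently, here is a small sketch of the usual adjusted R-square formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of explanatory variables. The R-square values and sample size below are hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-square for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical case: adding a fourth variable raises R-square only slightly.
before = adjusted_r_squared(0.800, n=40, k=3)
after = adjusted_r_squared(0.801, n=40, k=4)
print(round(before, 4), round(after, 4))
```

In this made-up case R-square rises from 0.800 to 0.801, yet the adjusted value falls, because the penalty for the extra variable outweighs the small gain in fit.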

Quiz 6: Practice Quiz

Q1. Which of the following could be appropriate categorical variables?

  • gender
  • weight
  • profession
  • eye color
  • height

Q2. A categorical variable that has five different categories requires __ dummy variables.

Enter answer here

Quiz 7: Regression Analysis: Hypothesis Testing and Goodness of Fit

Q1. Download the file (Final Exam Scores.xlsx), which provides data for the following variables: Final Exam Score, Attended Review Session, Mid-Term Score, and Homework Score, from a sample size of 40 students.

Please run a multiple regression with ‘Final Exam Score’ as the dependent variable and the remaining variables as independent variables. Remember, for categorical variable(s), you will need to create dummy or indicator variable(s).

Which of the following variables is not statistically significant? Assume an alpha level of .05.

  • Intercept
  • Attended Review Session
  • Mid-Term Score
  • Homework Score

Q2. If a student received scores of 0 on both the Mid-Term and Homework, and did not attend the Review Session, what would one predict his or her score on the final exam to be? Please round to the nearest whole number.

Enter answer here

Q3. There is a belief among students that if they attend the Review Session, it will increase their Final Exam Scores by 10 points. You need to evaluate this belief by setting up an appropriate hypothesis test.

First, please calculate the t-statistic for this hypothesis test. Please round to two decimal points.

Enter answer here

Q4. Please calculate the t-cutoff for this hypothesis testing; round the answer to two decimal points. Assume α = .05.

What is the absolute value of the t-cutoff?

Enter answer here

Q5. Based on the t-statistic and the t-cutoff calculated, what is your conclusion regarding the belief held by students?

  • The t-statistic falls in the rejection region, therefore we reject the belief.
  • The t-statistic falls in the rejection region, therefore we fail to reject the belief.
  • The t-statistic does not fall in the rejection region, therefore we reject the belief.
  • The t-statistic does not fall in the rejection region, therefore we fail to reject the belief.

Q6. Now, please test the same hypothesis, this time using the appropriate confidence interval. What is the lower limit of the confidence interval, rounded to two decimal places?

Enter answer here

Q7. What is the upper limit of the confidence interval, rounded to two decimal places?

Enter answer here

Q8. Utilizing the range of the confidence interval, what is your conclusion regarding the belief held by students?

  • The belief falls outside the range of the confidence interval, therefore we do not reject it.
  • The belief falls within the range of the confidence interval, therefore we do not reject it.
  • The belief falls outside the range of the confidence interval, therefore we reject it.
  • The belief falls within the range of the confidence interval, therefore we reject it.

Q9. R-square helps explain goodness of fit, but one can increase the R-square for any regression model just by adding more X variables. Adjusted R-square is an attempt to account for this phenomenon. What is the Adjusted R-square for this regression model, rounded to two decimal places?

Enter answer here

Q10. A common misconception is that a low R-square is of no use. When might you NOT want to use a model with a lower R-square?

  • You want to understand the relationship between the dependent and independent variables
  • You want to see the effects of changes in one variable on another
  • You need to make an accurate prediction

Week 3 Quiz Answers

Quiz 1: Practice Quiz

Q1. Refer to the example shown in the video; region is represented by two separate dummy variables, REGA and REGB, such that region C is the reference category. Which combination of values for REGA and REGB are valid? Select all that apply.

  • REGA = 0; REGB = 0
  • REGA = 1; REGB = 2
  • REGA = 1; REGB = 0
  • REGA = 1; REGB = 1
  • REGA = 0; REGB = 1

Q2. Continue with the same example. To denote that a delivery is made to region C, what should the values of REGA and REGB be?

  • REGA = 0; REGB = 1
  • REGA = 0; REGB = 0
  • REGA = 1; REGB = 1
  • REGA = 1; REGB = 0
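
The dummy coding described in these two questions can be written out as a short Python sketch. The function name `encode_region` is hypothetical; the coding itself (two dummies, region C as the reference category) follows the question text.

```python
def encode_region(region):
    """Return (REGA, REGB) with region C as the reference category."""
    if region not in ("A", "B", "C"):
        raise ValueError(f"unknown region: {region}")
    rega = 1 if region == "A" else 0
    regb = 1 if region == "B" else 0
    return (rega, regb)

# Each region maps to exactly one valid combination of the two dummies.
codes = {region: encode_region(region) for region in "ABC"}
print(codes)
```

A delivery belongs to only one region, so both dummies can never be 1 at once, and a dummy never takes a value other than 0 or 1.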

Quiz 2: Practice Quiz

Q1. Refer to the regression from the video. Which of the following regions can be used as the reference category? Select all that apply.

  • Region A
  • Region B
  • Region C

Q2. Suppose we choose region A as the reference category. We run the regression and obtain the following equation:

Minutes = β0 + β1REGB + β2REGC+ β3Parcels + β4TruckAge.

What does β2 represent?

  • The difference between the fixed time to deliver to region C versus the fixed time to deliver to region A
  • The fixed time it takes to deliver to region C
  • The difference between the fixed time to deliver to region C versus the fixed time to deliver to region B
  • The time it takes to deliver to region C when all other explanatory variables are 0

Quiz 3: Practice Quiz

Q1. Refer to the regression from the video with the following estimated equation:

Minutes = -34.76 + 107.71REGA + 1.21REGB+ 9.92Parcels + 3.68TruckAge.

Approximately how long does it take to deliver 50 parcels to region A using a truck that is 5 years old? Round your answer to the lowest integer.

  • 307
  • 480
  • 587
  • 569

Q2. Suppose the truck driver is on a tight schedule and wants to reduce the time of delivery by at least 100 minutes. Which of the following changes made to the delivery in question 1 would accomplish this goal? Select all that apply.

  • Use a brand new truck instead of a 5-year old truck
  • Deliver the same number of parcels to region B instead of region A
  • Deliver 30 parcels instead of 50 parcels to region A
  • Deliver the same number of parcels to region C instead of region A
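
Both questions can be worked through by plugging values into the estimated equation. The sketch below does so in Python; the function name `delivery_minutes` and the labels in `savings` are illustrative, while the coefficients come from the equation given in Q1.

```python
def delivery_minutes(region, parcels, truck_age):
    """Predicted delivery time from the estimated equation (region C is the reference)."""
    rega = 1 if region == "A" else 0
    regb = 1 if region == "B" else 0
    return -34.76 + 107.71 * rega + 1.21 * regb + 9.92 * parcels + 3.68 * truck_age

# Q1: 50 parcels to region A with a 5-year-old truck.
base = delivery_minutes("A", 50, 5)

# Q2: minutes saved by each proposed change relative to the base case.
savings = {
    "new truck": base - delivery_minutes("A", 50, 0),
    "region B": base - delivery_minutes("B", 50, 5),
    "30 parcels": base - delivery_minutes("A", 30, 5),
    "region C": base - delivery_minutes("C", 50, 5),
}
meets_goal = [name for name, saved in savings.items() if saved >= 100]
print(int(base), meets_goal)
```

The base case comes to roughly 587 minutes after rounding down. A new truck saves only 3.68 × 5 = 18.4 minutes, whereas the other three changes each save more than 100 minutes.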

Quiz 4: Practice Quiz

Q1. Refer to the video lesson. When the first regression using CoolSize is changed to the second regression using RefSize, why must the column for CoolSize be moved to the far right?

  • After performing the calculation to obtain RefSize, CoolSize must be moved to the end to allow the formula to work correctly.
  • When running the regression in Excel, all explanatory variables must be placed side by side, so CoolSize must be moved to the end.
  • CoolSize is moved to the end strictly for aesthetic purposes.

Q2. Refer to the regression using FreezeSize and RefSize. When interpreting a unit increase in the coefficient for FreezeSize, we assume that all other variables remain the same. What does this imply about the change in CoolSize (hint: CoolSize is not a variable in this regression, but can be derived from the values of FreezeSize and RefSize)?

  • CoolSize must increase if FreezeSize increases
  • CoolSize must decrease if FreezeSize increases
  • CoolSize must stay the same if FreezeSize increases
  • The change in CoolSize cannot be inferred

Quiz 5: Practice Quiz

Q1. Refer to the regressions on refrigerator price. How many dollars does the price of the refrigerator increase by when the freezer size increases by 1 cubic foot and the cooler size remains the same?

  • 213.88
  • 76.50
  • 137.38

Q2. How many dollars does the price of the refrigerator increase by when the freezer size increases by 1 cubic foot and the cooler size decreases by 1 cubic foot?

  • 76.50
  • 137.38
  • 213.88

Quiz 6: Practice Quiz

Q1. Which of the following is true regarding a regression with a high level of multicollinearity? Select all that apply.

  • The regression might still be able to predict the dependent variable accurately
  • The regression will not be able to predict the dependent variable accurately
  • The regression can be used to interpret the impact of coefficients accurately
  • The regression cannot be used to interpret the impact of coefficients accurately

Q2. Which of the following pairs of explanatory variables likely has the highest amount of correlation?

  • length of right foot and length of left foot of a person
  • height and weight of a person
  • height and salary of a person
  • weight and salary of a person

Quiz 7: Regression Analysis: Model Application and Multicollinearity

Q1. For questions 1-5, download the file ‘Sales by Territory.xlsx’, which provides data in the following categories: Sales Revenue (in $), Territory, Quantity of Orders, and Number of Sales Calls, from a sample of size 38.

Please run a multiple regression with Sales Revenue as your dependent variable to estimate the coefficients for each of the independent variables; remember, for categorical variables, you will need to create dummy or indicator variables. Please use “West” Territory as your reference variable, and assume an alpha of .05.

Which of the following variables are statistically significant according to the p-values?

  • Number of Sales Calls
  • Intercept
  • Quantity of Orders

Q2. What is the value of your Y variable when all X variables are zero? That is, what is the value of ‘Sales Revenue’ when all X variables are zero. Round your answer to a single decimal.

Enter answer here

Q3. For Territory “South”, what is the coefficient, rounded to two decimal places?

Enter answer here

Q4. Based on the p-value of Territory “South”, which of the following statements are true?

  • The variable is solely there to fit the model.
  • There would be no managerial significance.
  • Sales in “South” territory are less than sales in “West” territory, all other variables held at the same level.
  • The coefficient for Territory “South” is statistically different from Territory “West”, all other variables remaining at the same level.

Q5. Please estimate what Sales Revenue would be for a salesperson covering the “South” Territory, making 28 Sales Calls, and taking 200 Orders, rounding the answer to two decimal points.

Enter answer here

Q6. Which of the following could be a sign that a regression model has multicollinearity issues? Please mark all that apply.

  • The independent variables have low p-values, yet the overall fit of the model is high.
  • The independent variables have high p-values, yet the overall fit of the model is high.
  • The independent variables have low p-values, yet the overall fit of the model is low.
  • The signs on some of the coefficients are contrary to common sense.

Q7. Which of the following statements are false regarding multicollinearity in a regression model?

  • Multicollinearity is always a problem; to correct, one should remove X variables causing high correlation.
  • Multicollinearity matters more when interpreting coefficient impacts; to correct, one should remove X variables causing high correlation.
  • Multicollinearity should always be corrected when one is using a model for predictive purposes.
  • Multicollinearity may not need to be corrected for if one is only utilizing a model for predictive purposes.

Q8. What is the R-Square for this model, rounding to two decimal points?

Enter answer here

Q9. Within our regression, we notice that some of our independent variables have statistically insignificant p-values, yet as calculated above, the R-Square for the model is okay. This may imply some multicollinearity. Please calculate the correlation between “Number of Sales Calls” and “Quantity of Orders”, rounding to two decimal points.

Enter answer here

Q10. Does your model suffer from multicollinearity? Please mark all that apply.

  • Most likely, multicollinearity is an issue in the regression model.
  • Low correlation drives multicollinearity in this model.
  • High correlation drives multicollinearity in this model.
  • Most likely, multicollinearity is not an issue in the regression model.

Week 4 Quiz Answers

Quiz 1: Practice Quiz

Q1. Please select all that ‘Mean-centering of variables’ does for a regression model.

  • Mean centering is never useful in a regression model.
  • It allows the intercept to be interpreted more meaningfully.
  • It improves R-square.
  • It improves prediction using the regression model.

Q2. One could center variables at a value other than the mean. True or False?

  • False
  • True

Q3. Mean-centering the Y variable helps in the interpretation of the intercept in the regression model. True or False?

  • True
  • False
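
A small numerical sketch may help with the mean-centering questions above. It fits a one-variable least-squares line before and after mean-centering the explanatory variable; the data are hypothetical and the helper `simple_ols` is written only for this illustration.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data for illustration only.
x = [2.0, 3.0, 4.0, 5.0, 6.0]
y = [11.0, 14.0, 15.0, 19.0, 21.0]
x_mean = sum(x) / len(x)
x_centered = [a - x_mean for a in x]

b0_raw, b1_raw = simple_ols(x, y)
b0_centered, b1_centered = simple_ols(x_centered, y)
print(b0_raw, b1_raw, b0_centered, b1_centered)
```

The slope is the same in both fits, so the fitted values, R-square, and predictions do not change. Only the intercept moves: after centering it equals the predicted Y at the average X, which is usually easier to interpret than the predicted Y at X = 0.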

Quiz 2: Practice Quiz

Q1. Please choose all that apply.

  • The formula for the confidence interval for a predicted value uses the ‘standard error’ of regression produced below the adjusted R-Square.
  • The predicted value (the point prediction) is exactly at the center of the confidence interval.
  • All values in the confidence interval are equally likely.
  • The confidence interval for the predicted value is a way of incorporating uncertainty in our prediction.

Q2. What is the correct formula for the margin of error for constructing a 95% confidence interval for the predicted value?

  • |T.INV(0.025,residual df)|*std error of regression
  • |T.DIST(0.025,residual df)|*std error of regression
  • |T.INV(0.05,residual df)|*std error of regression
  • T.INV(0.025,residual df)*std error of regression

Quiz 3: Practice Quiz

Q1. Which of the following are true in regard to interaction variables? Please select all that apply.

  • Interaction variables allow you to study the impact of one variable at different levels of another variable.
  • Interaction variables are created by adding the variables.
  • Interaction variables are created by multiplying the variables.
  • Interaction variables are created by either adding or multiplying the variables.

Q2. Following is a regression equation developed using salary data for employees at a company:

Salary = β0 + β1Male + β2Age + β3Male*Age

Salary is measured in dollars. Age is measured in years and Male is a dummy variable representing the categorical variable Gender.

What is the interpretation of β2? Please mark the most appropriate answer.

  • It is the salary change with each additional year of age.
  • It is the change in age with every dollar increase in salary.
  • It is the change in salary with each additional year of age for a female employee.
  • It is the change in salary with each additional year of age for a male employee.

Quiz 4: Practice Quiz

Q1. When creating an interaction variable, one of the variables has to be a dummy variable. Is this statement True or False?

  • True
  • False

Q2. Following is a regression equation equating salary to gender and years of experience.

Salary = β0 + β1Male + β2Years_of_Experience + β3Male*Years_of_Experience

Salary is measured in dollars. Years_of_Experience is measured in years and Male is a dummy variable representing the categorical variable Gender.

What is the interpretation of β3? Please mark the most appropriate answer.

  • It is the salary change for male employees.
  • It is the change in years of experience at different salary levels.
  • It is the change in salary with each additional year of experience.
  • It is the ‘extra’ change in salary with each additional year of experience for a male employee as compared to a female employee.
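
The role of each coefficient in the interaction model can be seen by evaluating the equation for a female and a male employee one year apart. The coefficients below are hypothetical, chosen only to make the algebra visible; they are not estimates from the course data.

```python
def salary(male, years, b0, b1, b2, b3):
    """Salary = b0 + b1*Male + b2*Years + b3*Male*Years."""
    return b0 + b1 * male + b2 * years + b3 * male * years

# Hypothetical coefficients for illustration only.
coef = dict(b0=30000.0, b1=2000.0, b2=1500.0, b3=250.0)

# Change in salary for one extra year of experience, by group.
female_slope = salary(0, 6, **coef) - salary(0, 5, **coef)   # equals b2
male_slope = salary(1, 6, **coef) - salary(1, 5, **coef)     # equals b2 + b3
extra_for_male = male_slope - female_slope                   # equals b3
print(female_slope, male_slope, extra_for_male)
```

With Male = 0 the interaction term drops out, so the per-year change for a female employee is the coefficient on the continuous variable alone; for a male employee the interaction coefficient is added on top.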

Quiz 5: Practice Quiz

Q1. Please select all statements that apply.

  • Transforming variables in a regression implies adding additional variables to the regression model.
  • Transforming variables in a regression may improve R-square of the model.
  • Natural log transformation is a common transformation used in regression.
  • There are transformations other than the natural log that can be used in regression.
  • Transforming variables in a regression may improve the linearity of the model.

Q2. In the following regression model, what is the correct interpretation of β1?

LN(Y) = β0 + β1X1 + β2X2

Please select all that apply.

  • For every unit increase in X1, the natural log of Y variable increases by β1 units, all other variables kept at the same level.
  • For every % increase in X1, the Y variable increases by β1 %, all other variables kept at the same level.
  • For every % increase in X1, the natural log of Y variable increases by β1 %, all other variables kept at the same level.
  • For every unit increase in X1, the natural log of Y variable increases by 100*β1 %, all other variables kept at the same level.
  • For every unit increase in X1, the Y variable increases by 100*β1 %, all other variables kept at the same level.

Q3. In the following regression model, what is the correct interpretation of β2?

LN(Y) = β0 + β1LN(X1) + β2LN(X2)

Please select all that apply.

  • For every % increase in X2, the Y variable increases by β2 %, all other variables kept at the same level.
  • For every unit increase in X2, the Y variable increases by 100*β2 %, all other variables kept at the same level.
  • For every unit increase in X2, the natural log of Y variable increases by β2 units, all other variables kept at the same level.
  • For every % increase in X2, the natural log of Y variable increases by β2 %, all other variables kept at the same level.
  • For every unit increase in X2, the natural log of Y variable increases by 100*β2 %, all other variables kept at the same level.
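
The percentage interpretations in the two questions above are approximations that work well when the coefficient is small. The sketch below checks them numerically in Python with hypothetical coefficient values.

```python
import math

# Hypothetical coefficients for illustration only.
beta_semilog = 0.05   # model of the form LN(Y) = b0 + beta * X
beta_loglog = 0.8     # model of the form LN(Y) = b0 + beta * LN(X)

# Semi-log model: a one-unit increase in X multiplies Y by exp(beta),
# which is close to a 100*beta percent increase when beta is small.
pct_change_per_unit = (math.exp(beta_semilog) - 1) * 100

# Log-log model: a 1% increase in X multiplies Y by 1.01 ** beta,
# which is close to a beta percent increase (an elasticity).
pct_change_per_pct = (1.01 ** beta_loglog - 1) * 100

print(round(pct_change_per_unit, 3), round(pct_change_per_pct, 3))
```

The exact changes come out near 5.13% and 0.80%, close to the rule-of-thumb values of 100 × 0.05 = 5% and 0.8%.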

Quiz 6: Practice Quiz

Q1. Which of the following is the right function to calculate the natural log in Excel?

  • =LOG( )
  • =LN( )
  • =NLOG( )
  • =NATLOG( )

Q2. The coefficients in a log-log model can directly be interpreted as:

  • Growth rates
  • Elasticities
  • Exponential decays
  • Sum of squares

Q3. Which of the following are reasons to take a natural log transformation of variables in a regression model?

Select all that apply.

  • To make the regression complicated.
  • To improve the R-square measure.
  • It is popular to take the natural log transformation.
  • To interpret the beta coefficients directly as elasticities or growth rates.

Quiz 7: Regression Analysis: Various Extensions

Q1. Data for Questions 1 through 5 are contained in the file realestate.xlsx. Please download this file.

The data contains information about apartment prices and characteristics for a sought after area in a large metropolitan city in the USA. The data include sale price (PRICE) in $, floor area (SQFT) in square feet, number of bedrooms (BED), number of bathrooms (BATH), number of floors in the building (FLOORS), and distance from a centrally located city park (DIST) in meters.

You need to establish a relationship between PRICE and these other characteristics. Specifically, estimate the following regression model,

LN(PRICE) = β0 + β1LN(SQFT) + β2BED + β3BATH + β4FLOORS + β5DIST

Notice that in the regression you need to take a log transformation of PRICE and SQFT variables. Report the estimated value of β4, round the answer to four decimal digits.

Enter answer here

Q2. How do you interpret the coefficient estimate of β1?

  • When the size of the apartment increases by 1%, then the Price increases by 1.013%, all other variables remaining at the same level.
  • When the Price increases by $100,000, the size of the apartment increases by 1.013*100 = 101.3 sqft.
  • When the size of the apartment increases by 1 unit, then the Price increases by 1.013 units, all other variables remaining at the same level.
  • When the size of the apartment increases by 1 unit, then the Price increases by 1.013 %, all other variables remaining at the same level.

Q3. What is the impact of an additional Bathroom on apartment price?

  • All other variables being held constant, an additional Bathroom does not significantly impact the price.
  • All other variables being held constant, an additional Bathroom raises the apartment price by 0.0293%.
  • All other variables being held constant, an additional Bathroom raises the apartment price by 29,300$.
  • All other variables being held constant, an additional Bathroom raises the apartment price by 2.93%.

Q4. Using the estimated regression model, predict the price in dollars of an apartment that is 1000 sqft in size, has 2 Bedrooms and 2 Bathrooms, is in a building with 8 Floors, and is 1.2 km from the City Park. Round your answer to a whole number. Input the answer without any “$” or “,” sign.

Enter answer here
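
Two details are easy to miss in this prediction: DIST is measured in meters, so 1.2 km must be entered as 1200, and the model predicts LN(PRICE), so the result has to be exponentiated to get dollars. The sketch below illustrates both steps with hypothetical coefficients; they are placeholders, not the estimates from realestate.xlsx.

```python
import math

# Hypothetical coefficients, NOT the estimates from realestate.xlsx.
coef = {
    "const": 6.0,
    "ln_sqft": 1.0,
    "bed": 0.02,
    "bath": 0.03,
    "floors": 0.005,
    "dist": -0.0001,
}

def predict_price(sqft, bed, bath, floors, dist_m, coef):
    """Predict PRICE in dollars from a model fitted on LN(PRICE)."""
    log_price = (
        coef["const"]
        + coef["ln_sqft"] * math.log(sqft)  # SQFT enters in logs
        + coef["bed"] * bed
        + coef["bath"] * bath
        + coef["floors"] * floors
        + coef["dist"] * dist_m             # DIST is in meters
    )
    return math.exp(log_price)              # undo the log on PRICE

# 1.2 km converted to meters before it goes into the model.
price = predict_price(1000, 2, 2, 8, 1.2 * 1000, coef)
print(round(price))
```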

Q5. Calculate a 95% confidence interval for your predicted price from Question 4.

Report the lower limit of the confidence interval (in dollars), round your answer to a whole number. Input the answer without any “$” or “,” sign.

Enter answer here
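
One common way to build such an interval is to form it on the log scale and then exponentiate both limits. The sketch below follows that approach under two stated assumptions: the log-scale prediction and its standard error are hypothetical placeholder values, and a normal critical value is used as a large-sample stand-in for the t value. The course may expect a t cutoff and a specific standard error from the Excel output, so treat this only as an outline of the mechanics.

```python
import math
from statistics import NormalDist

def log_scale_interval(log_pred, std_err, level=0.95):
    """Interval for Y when the model predicts LN(Y)."""
    # Normal critical value (about 1.96 for 95%); an approximation to the t value.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower_log = log_pred - z * std_err
    upper_log = log_pred + z * std_err
    # Exponentiate the limits to return to the dollar scale.
    return math.exp(lower_log), math.exp(upper_log)

# Hypothetical inputs, NOT taken from the quiz data.
lower, upper = log_scale_interval(log_pred=12.9, std_err=0.05)
print(round(lower), round(upper))
```

Note that the resulting interval is not symmetric around the predicted price in dollars, because the exponential stretches the upper side more than the lower side.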

Q6. Data for Questions 6 through 11 are contained in the file Majors.xlsx. Please download this file.

The data contain information about the starting salary of a sample of 50 undergraduate students at a Business school. The data consist of the starting salary (SALARY) in dollars and the field of study of the student (MAJOR), which is either ‘Finance’ or ‘International Business’. Finally, the variable UGPA is the undergraduate Grade Point Average of the student.

Estimate a regression model linking starting salary to the field of study and UGPA as follows,

SALARY = β0 + β1IB + β2UGPA

In the above regression, IB is a dummy variable that takes the value 1 when the MAJOR is International Business (IB) and 0 otherwise.

Report the estimated value of β1, round the answer to a whole number.

Enter answer here
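
As an illustration of the dummy-variable setup, the sketch below builds the IB column from a text MAJOR column and fits the regression with NumPy. The data are synthetic and the salary formula is a made-up generating process, so the estimates it prints are not the quiz answer.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Synthetic stand-in for Majors.xlsx (NOT the course data).
major = rng.choice(["Finance", "International Business"], size=n)
ugpa = rng.uniform(2.5, 4.0, n)

# Dummy variable: 1 for International Business, 0 for Finance.
ib = (major == "International Business").astype(float)

# Hypothetical generating process, used only to produce fake salaries.
salary = 40000 - 5000 * ib + 7000 * ugpa + rng.normal(0, 2000, n)

# SALARY = b0 + b1*IB + b2*UGPA, estimated by ordinary least squares.
X = np.column_stack([np.ones(n), ib, ugpa])
b0, b1, b2 = np.linalg.lstsq(X, salary, rcond=None)[0]

print(f"b0={b0:.0f}  b1={b1:.0f}  b2={b2:.0f}")
```

Here b1 measures the salary gap between an IB major and a Finance major with the same UGPA, since Finance is the category coded as 0.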

Q7. Now, mean-center the UGPA variable. That is, subtract the mean value of UGPA from all the data points. Denote this mean-centered variable as [UGPA].

Run a regression as follows,

SALARY = β0 + β1IB + β2[UGPA]

Round the estimated value of β0 to a whole number and interpret it. Please mark all that apply.

  • $60,630 is the salary of a FINANCE Major with a UGPA equal to the average UGPA observed in the data.
  • $60,630 is the salary of an IB Major with a UGPA equal to the average UGPA observed in the data.
  • $60,630 is the salary of a FINANCE Major with 0 UGPA.
  • $60,630 is the value of the Y variable when all X variables are zero.
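
To see what mean-centering does to the estimates, the sketch below fits the model twice on synthetic data (not Majors.xlsx), once with raw UGPA and once with [UGPA]. The helper name `ols` and the generating process are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50

# Synthetic stand-in for Majors.xlsx (NOT the course data).
ib = rng.integers(0, 2, n).astype(float)
ugpa = rng.uniform(2.5, 4.0, n)
salary = 40000 - 5000 * ib + 7000 * ugpa + rng.normal(0, 2000, n)

def ols(*columns, y):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(X, y, rcond=None)[0]

raw = ols(ib, ugpa, y=salary)
centered = ols(ib, ugpa - ugpa.mean(), y=salary)

# The slopes on IB and UGPA are unchanged; only the intercept moves.
# New intercept = old intercept + (UGPA slope) * mean(UGPA).
print("raw:     ", np.round(raw))
print("centered:", np.round(centered))
```

After centering, a value of 0 for [UGPA] corresponds to the average UGPA in the sample, which is what makes the intercept meaningful.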

Q8. Based on the regression carried out in Question 7, how much less salary (in dollars) does an IB Major get compared to a FINANCE Major when they have the same UGPA? Round your answer to a whole number. Input the answer without any “$” or “,” sign.

Enter answer here

Q9. There is a belief among students that higher UGPA is more important in terms of impacting starting salary for IB undergraduates as compared to FINANCE undergraduates.

You can empirically check this belief by introducing an interaction variable in the regression model constructed in Question 7 and then checking the estimated coefficient for that variable.

To introduce the interaction variable, which variables would you interact?

  • Intercept and IB
  • IB and [UGPA]
  • Intercept and [UGPA]

Q10. Introduce an interaction effect in your data and estimate the model. Report the estimate of the coefficient on the interaction variable. Please round your answer to a whole number.

Enter answer here
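
The mechanics of adding an interaction term can be sketched as follows. The interaction column is simply the product of the two variables being interacted, here taken to be the major dummy and the centered UGPA as one way to let the UGPA slope differ by major. The data are synthetic and the generating process is hypothetical, so the printed estimates are not the quiz answer.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50

# Synthetic stand-in for Majors.xlsx (NOT the course data).
ib = rng.integers(0, 2, n).astype(float)
ugpa = rng.uniform(2.5, 4.0, n)
ugpa_c = ugpa - ugpa.mean()  # mean-centered UGPA, i.e. [UGPA]

# Hypothetical generating process, used only to produce fake salaries.
salary = (60000 - 5000 * ib + 7000 * ugpa_c
          + 3000 * ib * ugpa_c + rng.normal(0, 2000, n))

# Interaction variable: element-wise product of the two columns.
interaction = ib * ugpa_c

X = np.column_stack([np.ones(n), ib, ugpa_c, interaction])
b0, b1, b2, b3 = np.linalg.lstsq(X, salary, rcond=None)[0]

# Implied UGPA slope for each group.
slope_finance = b2       # IB = 0
slope_ib = b2 + b3       # IB = 1

print(f"interaction coefficient: {b3:.0f}")
print(f"UGPA slope, Finance: {slope_finance:.0f}   IB: {slope_ib:.0f}")
```

In this setup the coefficient on the product column is the difference between the two group slopes, so its sign and significance speak to whether UGPA matters more for one major than the other.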

Q11. How do you interpret the coefficient on the interaction effect?

  • The coefficient is the differential impact of UGPA on starting salary of FINANCE majors as compared to IB majors.
  • The coefficient is the impact of UGPA on starting salary of IB majors.
  • The coefficient is the impact of UGPA on starting salary of FINANCE majors.
  • The coefficient is the differential impact of UGPA on starting salary of IB majors as compared to FINANCE majors.
  • The coefficient is the impact of UGPA on starting salary.

Conclusion:

I hope these Linear Regression for Business Statistics Coursera Quiz Answers help you learn something new from this course. If they helped you, don’t forget to bookmark our site for more Quiz Answers.

This course is intended for audiences of all experiences who are interested in learning about new skills in a business context; there are no prerequisite courses.

Keep Learning!

Get All Course Quiz Answers of Business Statistics and Analysis Specialization

Introduction to Data Analysis Using Excel Coursera Quiz Answers

Basic Data Descriptors, Statistical Distributions, and Application to Business Decisions Quiz Answers

Business Applications of Hypothesis Testing and Confidence Interval Estimation Quiz Answers

Linear Regression for Business Statistics Quiz Answers
