Multiple Linear Regression


Introduction

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. The goal of MLR is to model the linear relationship between the explanatory (independent) variables and the response (dependent) variable. In this post I introduce some basic concepts and procedures and discuss the limitations of the multiple linear regression model.



Model Specification





MLR Model: 

$$E(Y \mid X_1 = x_1, X_2 = x_2, \ldots, X_p = x_p) = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p.$$

Thus

$$Y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_p x_{pi} + e_i,$$

where $e_i$ is the random fluctuation (or error) in $Y_i$, such that $E(e_i \mid X) = 0$. In this case the response variable $Y$ is predicted from $p$ predictor (or explanatory) variables $X_1, X_2, \ldots, X_p$, and the relationship between $Y$ and $X_1, X_2, \ldots, X_p$ is linear in the parameters $b_0, b_1, b_2, \ldots, b_p$.


Matrix formulation: 

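A convenient way to study the properties of the least squares estimates $\hat{b}_0, \hat{b}_1, \hat{b}_2, \ldots, \hat{b}_p$ is to use matrix and vector notation. Define the $(n \times 1)$ vector $Y$, the $n \times (p + 1)$ matrix $X$, the $(p + 1) \times 1$ vector $b$ of unknown regression parameters, and the $(n \times 1)$ vector $e$ of random errors by

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & x_{21} & \cdots & x_{p1} \\ 1 & x_{12} & x_{22} & \cdots & x_{p2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1n} & x_{2n} & \cdots & x_{pn} \end{pmatrix}, \quad
b = \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_p \end{pmatrix}, \quad
e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix},$$

so that row $i$ of $X$ is $(1, x_{1i}, x_{2i}, \ldots, x_{pi})$.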
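In matrix form the model reads $Y = Xb + e$, where $X$ carries a leading column of ones for the intercept. When $X^\top X$ is invertible, the least squares estimate is $\hat{b} = (X^\top X)^{-1} X^\top Y$, the solution of the normal equations $X^\top X \hat{b} = X^\top Y$. The sketch below works through that calculation in Python using only the standard library (the analysis in this post was done in R). The helper names `solve_linear_system` and `fit_ols` and the toy data are assumptions made for illustration; they do not come from Sheather (2009).

```python
def solve_linear_system(a, rhs):
    """Solve a * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (predictors are perfectly collinear)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ols(predictors, y):
    """Return [b0, b1, ..., bp] given rows of predictor values and responses y."""
    X = [[1.0] + list(row) for row in predictors]  # prepend the intercept column
    k = len(X[0])
    # Build X'X and X'Y, then solve the normal equations.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data (hypothetical): generated without noise from y = 1 + 2*x1 + 3*x2,
# so the fit should recover those three coefficients up to rounding error.
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]
coefficients = fit_ols(predictors, y)
print([round(b, 6) for b in coefficients])
```

Solving the normal equations directly keeps the sketch close to the formula. Statistical software usually relies on a numerically more stable decomposition instead.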








R Code

(from Sheather, 2009)

Data from surveys of customers of 168 Italian restaurants in the target area are available. The data are in the form of the average of customer views on

  • $Y$ = Price = the price (in US dollars) of dinner (including 1 drink & a tip)
  • $x_1$ = Food = customer rating of the food (out of 30)
  • $x_2$ = Décor = customer rating of the decor (out of 30)
  • $x_3$ = Service = customer rating of the service (out of 30)
  • $x_4$ = East = dummy variable = 1 (0) if the restaurant is east (west) of Fifth Avenue

The data are given on the book web site in the file nyc.csv. The source of the data is the following restaurant guide book 

Zagat Survey 2001: New York City Restaurants, Zagat, New York 

We shall begin by considering the following model:

$$Y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4 + e$$













  • The initial regression model is
    Price = – 24.02 + 1.54 Food + 1.91 Decor – 0.003 Service + 2.07 East

    At this point we shall leave the variable Service in the model even though its regression coefficient is not statistically significant.
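To see how the fitted equation is used, the sketch below plugs a set of ratings into it. It is written in Python rather than R. The coefficients are the ones quoted above. The ratings (Food 20, Decor 18, Service 19) are made-up values for illustration and do not describe any restaurant in the data. `predict_price` is a hypothetical helper name.

```python
# Coefficients copied from the fitted equation quoted in the post.
COEFFICIENTS = {
    "Intercept": -24.02,
    "Food": 1.54,
    "Decor": 1.91,
    "Service": -0.003,
    "East": 2.07,
}


def predict_price(food, decor, service, east):
    """Predicted dinner price (US dollars) from the fitted equation.

    east is the dummy variable: 1 if east of Fifth Avenue, 0 if west.
    """
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["Food"] * food
        + COEFFICIENTS["Decor"] * decor
        + COEFFICIENTS["Service"] * service
        + COEFFICIENTS["East"] * east
    )


# Hypothetical ratings, chosen only to illustrate the arithmetic.
price_east = predict_price(food=20, decor=18, service=19, east=1)
price_west = predict_price(food=20, decor=18, service=19, east=0)

print(f"Predicted price, east side: {price_east:.2f}")
print(f"Predicted price, west side: {price_west:.2f}")
print(f"East minus west:            {price_east - price_west:.2f}")
```

With the ratings held fixed, the two predictions differ by exactly the East coefficient. That is how a dummy variable's coefficient is read: the estimated price difference between otherwise identical restaurants east and west of Fifth Avenue.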

The Difference Between Simple and Multiple Linear Regression

Simple linear regression, fitted by ordinary least squares (OLS), models the response of a dependent variable to a change in a single explanatory variable. However, it is rare that a dependent variable is explained by only one variable. In that case, an analyst uses multiple regression, which attempts to explain a dependent variable using more than one independent variable.

Multiple regression is based on the assumption that there is a linear relationship between the dependent variable and the independent variables. It also assumes no major correlation between the independent variables: when predictors are strongly correlated (multicollinearity), the individual coefficient estimates become unstable and hard to interpret.
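One informal way to check for correlation among the independent variables is to compute their pairwise Pearson correlations before fitting. The sketch below does this in Python using only the standard library. The rating values are invented for illustration and are not taken from nyc.csv. `pearson_correlation` is a hypothetical helper name.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented ratings for six restaurants (illustration only).
food = [18, 20, 22, 24, 19, 23]
service = [17, 19, 22, 23, 18, 22]
decor = [20, 15, 22, 17, 24, 16]

r_food_service = pearson_correlation(food, service)
r_food_decor = pearson_correlation(food, decor)

print(f"Food vs Service: {r_food_service:.3f}")
print(f"Food vs Decor:   {r_food_decor:.3f}")
```

A correlation close to 1 or -1 between two predictors suggests that their separate effects will be hard to tell apart. Pairwise correlations are only a first look, because they cannot reveal a predictor that is nearly a combination of several others.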



References

https://www.investopedia.com/terms/m/mlr.asp

Sheather, S. J. (2009). A Modern Approach to Regression with R. Springer.

https://www.facebook.com/qlik/photos/a.10150386567645203/10161259674855203/?type=3


Comments

  1. Your summary is very specific. I learned a lot.

  2. I like the way you summarize the knowledge! This code really helps me a lot!

  3. Very impressive! I have learned this in my statistics class as well; this blog enhances my understanding of multiple linear regression.

  4. Very detailed procedure, I like it!