Multiple Linear Regression
Introduction
Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. The goal of MLR is to model the linear relationship between the explanatory (independent) variables and the response (dependent) variable. Today I will introduce some basic concepts and procedures, and discuss the limits of the multiple linear regression model.
Model Specification
MLR Model:
E(Y | X1 = x1, X2 = x2, ..., Xp = xp) = b0 + b1 x1 + b2 x2 + ... + bp xp
Thus
Yi = b0 + b1 x1i + b2 x2i + ... + bp xpi + ei
where ei is the random fluctuation (or error) in Yi, such that E(ei | X) = 0. In this case the response variable Y is predicted from p predictor (or explanatory) variables X1, X2, ..., Xp, and the relationship between Y and X1, X2, ..., Xp is linear in the parameters b0, b1, b2, ..., bp.
Matrix formulation:
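A convenient way to study the properties of the least squares estimates b̂0, b̂1, b̂2, ..., b̂p is to use matrix and vector notation. Define the (n × 1) vector Y of responses, the n × (p + 1) matrix X, the (p + 1) × 1 vector b of unknown regression parameters, and the (n × 1) vector e of random errors, so that the model can be written as Y = Xb + e. The first column of X is all ones (for the intercept b0), and column j + 1 holds the n observed values of the predictor xj.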
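To make the matrix formulation concrete, here is a minimal sketch in plain Python of how the least squares estimates can be obtained by solving the normal equations (X'X) b̂ = X'Y. The function name fit_ols and the small data set are hypothetical and only for illustration. In practice you would use a statistics package (for example lm in R) rather than hand-written elimination.

```python
def fit_ols(rows, y):
    """Return least squares estimates [b0, b1, ..., bp] via the normal equations."""
    # Design matrix X: a leading 1 for the intercept, then the p predictors.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n, k = len(X), len(X[0])

    # Augmented system [X'X | X'Y], one row per parameter.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; the predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    # A is now diagonal, so each estimate is a simple ratio.
    return [A[r][k] / A[r][r] for r in range(k)]


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (no error term),
# so the fit should recover those coefficients up to rounding error.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
print(fit_ols(rows, y))
```

If two predictors are exact copies of each other, X'X has no inverse and the sketch raises an error, which is the matrix view of why strongly correlated predictors cause trouble.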
R Code
(from Sheather, 2009)
Data from surveys of customers of 168 Italian restaurants in the target area are available. The data are in the form of the average of customer views on:
Y = Price = the price (in $US) of dinner (including 1 drink & a tip)
x1 = Food = customer rating of the food (out of 30)
x2 = Décor = customer rating of the decor (out of 30)
x3 = Service = customer rating of the service (out of 30)
x4 = East = dummy variable = 1 (0) if the restaurant is east (west) of Fifth Avenue
The data are given on the book web site in the file nyc.csv. The source of the data is the following restaurant guide book:
Zagat Survey 2001: New York City Restaurants, Zagat, New York
We shall begin by considering the following model:
Y = b0 + b1 x1 + b2 x2 + b3 x3 + b4 x4 + e
The initial regression model is
Price = -24.02 + 1.54 Food + 1.91 Decor - 0.003 Service + 2.07 East
At this point we shall leave the variable Service in the model even though its regression coefficient is not statistically significant.
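As a quick illustration of how to read the fitted equation, the sketch below plugs ratings into it. The coefficients are the ones reported above. The function name predict_price and the example ratings (Food 20, Decor 18, Service 19) are hypothetical and are not taken from nyc.csv.

```python
# Coefficients of the initial fitted model reported above.
COEFS = {
    "Intercept": -24.02,
    "Food": 1.54,
    "Decor": 1.91,
    "Service": -0.003,
    "East": 2.07,
}


def predict_price(food, decor, service, east):
    """Predicted dinner price (in $US) from the fitted equation.

    east is the dummy variable: 1 if east of Fifth Avenue, 0 if west.
    """
    return (
        COEFS["Intercept"]
        + COEFS["Food"] * food
        + COEFS["Decor"] * decor
        + COEFS["Service"] * service
        + COEFS["East"] * east
    )


# Hypothetical restaurant, once east and once west of Fifth Avenue.
east_price = predict_price(food=20, decor=18, service=19, east=1)
west_price = predict_price(food=20, decor=18, service=19, east=0)
print(round(east_price, 2), round(west_price, 2))
print(round(east_price - west_price, 2))  # equals the East coefficient
```

With all ratings held fixed, the two predictions differ by exactly the East coefficient, which is how a dummy variable is interpreted in this model.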
The Difference Between Linear and Multiple Regression
Simple linear regression, fitted by ordinary least squares (OLS), models the response of a dependent variable to a change in a single explanatory variable. However, it is rare that a dependent variable is explained by only one variable. In that case, an analyst uses multiple regression, which attempts to explain a dependent variable using more than one independent variable.
Multiple regression is based on the assumption that there is a linear relationship between the dependent variable and the independent variables. It also assumes that the independent variables are not strongly correlated with each other.
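One simple way to screen for correlated predictors is to look at the pairwise Pearson correlations before fitting. The sketch below does this in plain Python. The ratings are made-up numbers, not the nyc.csv data, and the cutoff of 0.8 is an arbitrary assumption chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def flag_correlated(predictors, cutoff=0.8):
    """Return predictor-name pairs whose absolute correlation exceeds cutoff."""
    return [
        (name_a, name_b)
        for name_a, name_b in combinations(predictors, 2)
        if abs(pearson(predictors[name_a], predictors[name_b])) > cutoff
    ]


# Hypothetical ratings for five restaurants (illustrative only).
predictors = {
    "Food": [18, 20, 22, 24, 19],
    "Decor": [20, 15, 18, 22, 16],
    "Service": [17, 19, 22, 23, 18],
}
print(flag_correlated(predictors))
```

A flagged pair does not invalidate the model by itself, but it is a signal that the individual coefficients of those predictors may be estimated imprecisely.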
References
https://www.investopedia.com/terms/m/mlr.asp
Sheather, S. J. (2009). A Modern Approach to Regression with R. New York: Springer.
https://www.facebook.com/qlik/photos/a.10150386567645203/10161259674855203/?type=3



