
R language multiple regression


May 12, 2021 R language tutorial




Multiple regression extends linear regression to relationships among more than two variables. In simple linear regression, we have one predictor and one response variable, but in multiple regression, we have several predictors and one response variable.

The general mathematical equation for multiple regression is -

y = a + b1x1 + b2x2 + ... + bnxn

The following is a description of the parameters used -

  • y is the response variable.

  • a, b1, b2, ..., bn are the coefficients (a is the intercept).

  • x1, x2, ..., xn are the predictor variables.
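
As a minimal sketch of how the equation is evaluated, the snippet below plugs made-up numbers into it for two predictors. The values of a, b and x are hypothetical and chosen only for illustration.

```r
# Hypothetical intercept, coefficients and predictor values (illustration only).
a <- 2
b <- c(0.5, -1.2)   # b1, b2
x <- c(10, 3)       # x1, x2

# y = a + b1*x1 + b2*x2
y <- a + sum(b * x)
print(y)
```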

We use the lm() function in the R language to create a regression model. The model uses the input data to determine the values of the coefficients. Next, we can use these coefficients to predict the value of the response variable for a given set of predictor values.

lm() function

This function creates a model of the relationship between predictors and response variables.

Syntax

The basic syntax of the lm() function in multiple regression is -

lm(y ~ x1+x2+x3...,data)

The following is a description of the parameters used -

  • formula is a symbolic description of the relationship between the response variable and the predictor variables.

  • data is the data frame containing the variables used in the formula.
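
The two arguments can be illustrated with a small sketch. The data frame df and its columns y, x1 and x2 are hypothetical, with values made up only so that the call runs.

```r
# Hypothetical data frame: names and values are made up for illustration.
df <- data.frame(
  y  = c(3.1, 4.9, 7.2, 8.8, 11.1, 13.0),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# The formula names columns of df: y is the response, x1 and x2 the predictors.
fit <- lm(y ~ x1 + x2, data = df)

# One intercept plus one coefficient per predictor.
print(coef(fit))
```

Because the formula refers to column names, the same call works on any data frame that has columns with those names.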

Example

Input data

Consider the data set "mtcars" available in the R environment. It compares different car models in terms of miles per gallon ("mpg"), engine displacement ("disp"), horsepower ("hp"), car weight ("wt") and some other parameters.

The goal of the model is to establish a relationship between "mpg" as a response variable and "disp", "hp" and "wt" as predictors. To do this, we create a subset of these variables from the mtcars data set.

input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))

When we execute the code above, it produces the following results -

                   mpg   disp   hp    wt
Mazda RX4          21.0  160    110   2.620
Mazda RX4 Wag      21.0  160    110   2.875
Datsun 710         22.8  108     93   2.320
Hornet 4 Drive     21.4  258    110   3.215
Hornet Sportabout  18.7  360    175   3.440
Valiant            18.1  225    105   3.460
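
Before fitting, it can help to look at the size of the subset and how the variables relate to each other. The following is an optional sketch, not a step of the original walkthrough; the variable name corr is introduced here for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Number of rows (car models) and columns (variables).
print(dim(input))

# Column types and a short preview of each variable.
str(input)

# Pairwise correlations, rounded for readability.
corr <- round(cor(input), 2)
print(corr)
```

Strongly correlated predictors can make individual coefficients harder to interpret, so the correlation matrix is a reasonable first check.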

Create a relationship model and get the coefficients

input <- mtcars[,c("mpg","disp","hp","wt")]

# Create the relationship model.
model <- lm(mpg~disp+hp+wt, data = input)

# Show the model.
print(model)

# Get the Intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ","\n")

a <- coef(model)[1]
print(a)

Xdisp <- coef(model)[2]
Xhp <- coef(model)[3]
Xwt <- coef(model)[4]

print(Xdisp)
print(Xhp)
print(Xwt)

When we execute the code above, it produces the following results -

Call:
lm(formula = mpg ~ disp + hp + wt, data = input)

Coefficients:
(Intercept)         disp           hp           wt  
  37.105505    -0.000937    -0.031157    -3.800891  

# # # # The Coefficient Values # # # 
(Intercept) 
   37.10551 
         disp 
-0.0009370091 
         hp 
-0.03115655 
       wt 
-3.800891 
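
Beyond the raw coefficients, summary() reports how well the model fits. The sketch below is one possible way to read those values; the names b_wt and fit_summary are introduced here for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# Coefficients can also be selected by name instead of by position.
b_wt <- coef(model)[["wt"]]
print(b_wt)

fit_summary <- summary(model)

# Estimate, standard error, t value and p-value for each term.
print(fit_summary$coefficients)

# Proportion of the variance in mpg explained by the model.
print(fit_summary$r.squared)
```

Selecting by name avoids mistakes if the order of the predictors in the formula changes.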

Create an equation for the regression model

Based on the intercept and coefficient values above, we create the mathematical equation.

Y = a+Xdisp*x1+Xhp*x2+Xwt*x3
or
Y = 37.105+(-0.000937)*x1+(-0.0311)*x2+(-3.8008)*x3

Apply the equation to predict new values

Given a new set of displacement, horsepower, and weight values, we can use the regression equation created above to predict mileage.
For a car with disp = 221, hp = 102 and wt = 2.91, the predicted mileage is -

Y = 37.105+(-0.000937)*221+(-0.0311)*102+(-3.8008)*2.91 = 22.6654
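
The same prediction can be made in R without copying coefficients by hand. This is a minimal sketch using predict() from base R; the names new_car, pred and manual are introduced here for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# New observation; column names must match the predictors in the formula.
new_car <- data.frame(disp = 221, hp = 102, wt = 2.91)

# Prediction from the fitted model, using full-precision coefficients.
pred <- predict(model, newdata = new_car)
print(pred)

# The same value computed directly from the equation: a + b1*x1 + b2*x2 + b3*x3.
manual <- sum(coef(model) * c(1, 221, 102, 2.91))
print(manual)
```

Both values should come out at roughly 22.66. They differ slightly from the hand calculation because the written equation uses rounded coefficients.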