
R language linear regression


May 12, 2021 R language tutorial




Regression analysis is a very widely used statistical tool for modeling the relationship between two variables. One of these variables is called the predictor variable, and its values are gathered through experiments. The other is called the response variable, and its value is derived from the predictor variable.

In linear regression, the two variables are related through an equation in which the exponent (power) of both variables is 1. Mathematically, a linear relationship is a straight line when plotted as a graph. A nonlinear relationship, in which the exponent of some variable is not equal to 1, produces a curve.

The general mathematical equation for linear regression is -

y = ax + b

The following is a description of the parameters used -

  • y is the response variable.

  • x is a predictor.

  • a and b are constants called the coefficients: a is the slope of the line and b is its intercept.
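
As a quick illustration of the equation, the sketch below evaluates y = ax + b for a few values of x. The coefficient values are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from any data set.

```r
# Hypothetical coefficients, chosen only for illustration.
a <- 2  # slope: change in y for a one-unit change in x
b <- 1  # intercept: value of y when x is 0

# A few predictor values.
x <- c(0, 1, 2, 3)

# Evaluate the linear equation for every x at once.
y <- a * x + b
print(y)

# Consecutive y values differ by the slope, which is why the graph is a straight line.
print(diff(y))
```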

Steps to establish a regression

A simple example of regression is predicting a person's weight when their height is known. To do this, we need a relationship between a person's height and weight.

The steps to create a relationship are -

  • Conduct experiments to collect samples of observations of height and corresponding weight.

  • Create a relationship model using the lm() function in the R language.

  • Find the coefficients from the model you created and use them to write the mathematical equation.

  • Get a summary of the relationship model to understand the average error in the predictions, also known as the residuals.

  • To predict the weight of a new person, use the predict() function in R.

Input data

Below is the sample data representing the observations -

# Values of height
151, 174, 138, 186, 128, 136, 179, 163, 152, 131

# Values of weight.
63, 81, 56, 91, 47, 57, 76, 72, 62, 48
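
Before fitting a model, it can help to check that the two sets of values line up and move together. The sketch below stores the observations above in a data frame and computes their correlation. The names `observations`, `height`, and `weight` are illustrative choices, not part of the original tutorial.

```r
# Sample observations from above, stored as one row per person.
observations <- data.frame(
  height = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131),
  weight = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
)

# Show the structure and a numeric summary of each column.
str(observations)
print(summary(observations))

# Pearson correlation between height and weight.
# A value close to 1 suggests a strong positive linear relationship.
height_weight_cor <- cor(observations$height, observations$weight)
print(height_weight_cor)
```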

lm() function

This function creates a model of the relationship between the predictor and the response variable.

Syntax

The basic syntax of the lm() function in linear regression is -

lm(formula,data)

The following is a description of the parameters used -

  • formula is a symbolic description of the relationship between x and y, for example y ~ x.

  • data is an optional data frame containing the variables named in the formula.
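
The examples in this tutorial pass plain vectors to lm(). As a sketch of how the data argument can be used instead, the code below fits the same model from a data frame. The names `people`, `height`, `weight`, and `model` are assumptions made for this illustration.

```r
# Hypothetical data frame holding the sample observations.
people <- data.frame(
  height = c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131),
  weight = c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
)

# The formula names columns of the data frame: response ~ predictor.
model <- lm(weight ~ height, data = people)

# The coefficients are labelled with the column name used in the formula.
print(coef(model))
```

With this form, the slope is reported under the name height rather than x, which tends to make the output easier to read.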

Create a relationship model and get the coefficients

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y~x)

print(relation)

When we execute the code above, it produces the following results -

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x  
   -38.4551          0.6746 
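
The printed coefficients can also be pulled out programmatically with coef() and used to write the fitted equation. The sketch below does that and, as a sanity check, recomputes both values with the standard closed-form least-squares formulas. The variable names are illustrative only.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# Extract the fitted coefficients by name.
intercept <- coef(relation)[["(Intercept)"]]
slope <- coef(relation)[["x"]]

# Write out the fitted equation in the form y = ax + b.
cat(sprintf("y = %.4f * x + (%.4f)\n", slope, intercept))

# Closed-form least-squares estimates for simple linear regression.
slope_manual <- cov(x, y) / var(x)
intercept_manual <- mean(y) - slope_manual * mean(x)

# Both comparisons should report TRUE.
print(all.equal(slope, slope_manual))
print(all.equal(intercept, intercept_manual))
```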

Get a summary of the relationship

x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y~x)

print(summary(relation))

When we execute the code above, it produces the following results -

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.3002 -1.6629  0.0412  1.8944  3.9775 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -38.45509    8.04901  -4.778  0.00139 ** 
x             0.67461    0.05191  12.997 1.16e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.253 on 8 degrees of freedom
Multiple R-squared:  0.9548,    Adjusted R-squared:  0.9491 
F-statistic: 168.9 on 1 and 8 DF,  p-value: 1.164e-06
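
Individual figures from the summary can be read directly from the summary object rather than copied from the printed text. The following sketch extracts the residuals, the R-squared value, and the residual standard error, and recomputes the last one by hand to show what it measures. The variable names are assumptions for this illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)
fit_summary <- summary(relation)

# Residuals: observed weight minus the weight predicted by the model.
res <- residuals(relation)
print(round(res, 4))

# Proportion of the variance in y explained by the model.
r_squared <- fit_summary$r.squared
print(r_squared)

# Residual standard error as reported in the summary.
residual_se <- fit_summary$sigma

# The same quantity computed by hand: the square root of the residual
# sum of squares divided by the residual degrees of freedom.
manual_se <- sqrt(sum(res^2) / df.residual(relation))
print(c(reported = residual_se, manual = manual_se))
```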

The predict() function

Syntax

The basic syntax of predict() in linear regression is -

predict(object, newdata)

The following is a description of the parameters used -

  • object is the model that has already been created using the lm() function.

  • newdata is a data frame containing the new values of the predictor variable, in a column with the same name as the predictor in the model.

Predict the weight of a new person

# The predictor vector.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

# The response vector.
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y~x)

# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <-  predict(relation,a)
print(result)

When we execute the code above, it produces the following results -

       1 
76.22869 
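
predict() can also score several new values at once and, through its interval argument, report a range around each point estimate. The sketch below requests 95% prediction intervals for three heights. The heights 160 and 180 and the name `new_heights` are chosen only for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# The column must be named x so that it matches the predictor in the model.
new_heights <- data.frame(x = c(160, 170, 180))

# interval = "prediction" adds lower and upper bounds for a single new
# observation at each height; level sets the coverage.
predictions <- predict(relation, new_heights,
                       interval = "prediction", level = 0.95)

# The result is a matrix with columns fit, lwr and upr, one row per height.
print(predictions)
```

The fit column for a height of 170 matches the single prediction shown above. The lwr and upr columns give a sense of how much an individual's weight might deviate from it.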

Visualize the regression graphically

# Create the predictor and response variable.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)

# Give the chart file a name.
png(file = "linearregression.png")

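# Plot the chart: height (predictor) on the x-axis, weight (response) on the y-axis.
plot(x, y, col = "blue", main = "Height & Weight Regression",
     cex = 1.3, pch = 16, xlab = "Height in cm", ylab = "Weight in Kg")

# Draw the fitted regression line from the model on top of the points.
abline(relation)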

# Save the file.
dev.off()

When we execute the code above, it produces the following results -

[Figure: scatter plot of the height and weight observations with the fitted regression line]