May 12, 2021 R language tutorial
Logistic regression is a regression model in which the response variable (dependent variable) has categorical values such as True/False or 0/1. It measures the probability of a binary response as the value of the response variable, based on the mathematical equation relating it to the predictor variables.
The general mathematical equation for logistic regression is -
y = 1/(1+e^-(a+b1x1+b2x2+b3x3+...))
The following is a description of the parameters used -
y is the response variable.
x is the predictor variable.
a and b are the coefficients, which are numeric constants.
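The equation above can be sketched directly in R. The values of a, b1, and x1 below are made-up illustrative numbers, not coefficients from any fitted model; the point is that the result always falls between 0 and 1.

```r
# Logistic function sketch with hypothetical values.
a  <- 0.5    # intercept (hypothetical)
b1 <- 1.2    # coefficient (hypothetical)
x1 <- 2.0    # predictor value (hypothetical)

# y = 1 / (1 + e^-(a + b1*x1))
y <- 1 / (1 + exp(-(a + b1 * x1)))
print(y)     # a probability between 0 and 1
```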
The function used to create the regression model is the glm() function.
The basic syntax of the glm() function in logistic regression is -
glm(formula,data,family)
The following is a description of the parameters used -
formula is a symbolic description of the relationship between the variables.
data is a dataset that gives the values of these variables.
family is an R object that specifies the details of the model. Its value is binomial for logistic regression.
The built-in dataset "mtcars" describes different models of cars with their various engine specifications. In the "mtcars" dataset, the transmission mode (automatic or manual) is described by the column am, which is a binary value (0 or 1). We can create a logistic regression model between the column "am" and 3 other columns (hp, wt, and cyl).
# Select some columns from mtcars.
input <- mtcars[,c("am","cyl","hp","wt")]
print(head(input))
When we execute the code above, it produces the following results -
                  am cyl  hp    wt
Mazda RX4          1   6 110 2.620
Mazda RX4 Wag      1   6 110 2.875
Datsun 710         1   4  93 2.320
Hornet 4 Drive     0   6 110 3.215
Hornet Sportabout  0   8 175 3.440
Valiant            0   6 105 3.460
We use the glm() function to create a regression model and get a summary of it for analysis.
input <- mtcars[,c("am","cyl","hp","wt")]
am.data = glm(formula = am ~ cyl + hp + wt, data = input, family = binomial)
print(summary(am.data))
When we execute the code above, it produces the following results -
Call:
glm(formula = am ~ cyl + hp + wt, family = binomial, data = input)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.17272  -0.14907  -0.01464   0.14116   1.27641

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 19.70288    8.11637   2.428   0.0152 *
cyl          0.48760    1.07162   0.455   0.6491
hp           0.03259    0.01886   1.728   0.0840 .
wt          -9.14947    4.15332  -2.203   0.0276 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 43.2297  on 31  degrees of freedom
Residual deviance:  9.8415  on 28  degrees of freedom
AIC: 17.841

Number of Fisher Scoring iterations: 8
In the summary, the p-values (the last column) for the variables "cyl" and "hp" are greater than 0.05, so we consider their contribution to the value of "am" insignificant. Only weight (wt) has a significant effect on "am" in this regression model.
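Once the model is fitted, it can be used to estimate the probability of a manual transmission for a new car with R's standard predict() function. The specifications of new_car below (cyl, hp, wt values) are made-up numbers for illustration.

```r
# Fit the same model as above on the mtcars columns.
input   <- mtcars[,c("am","cyl","hp","wt")]
am.data <- glm(formula = am ~ cyl + hp + wt, data = input, family = binomial)

# Hypothetical new car: 4 cylinders, 95 hp, weight 2.2 (1000 lbs).
new_car <- data.frame(cyl = 4, hp = 95, wt = 2.2)

# type = "response" returns the probability that am = 1 (manual).
prob <- predict(am.data, newdata = new_car, type = "response")
print(prob)
```

A probability above 0.5 would classify the car as manual (am = 1), below 0.5 as automatic (am = 0).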