Multiple linear regression extends simple linear regression by using more than one independent variable (X) to predict a single dependent variable (Y). The predicted value of Y is a linear transformation of the X variables, chosen so that the sum of squared deviations between the observed and predicted Y is a minimum. The computations are more complex than in simple regression, however, because the interrelationships among all the variables must be considered in the weights assigned to the variables. The interpretation of the results of a multiple regression analysis is also more complex for the same reason. With two independent variables, the prediction of Y is expressed by the following equation:
$$ Y'_i = b_0 + b_1 X_{1i} + b_2 X_{2i} $$
This transformation is similar to the linear transformation of two variables discussed in the previous chapter, except that the w’s have been replaced with b’s and X’i has been replaced with Y’i.
The “b” values are called regression weights and are computed in a way that minimizes the sum of squared deviations between the observed and predicted values of Y.
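As a rough illustration of how regression weights for two independent variables might be computed, the sketch below uses NumPy's least-squares solver. The function name fit_two_predictor_regression and the small data set are made up for this example (the Y values are generated exactly from Y = 1 + 2*X1 + 3*X2, so the recovered weights are easy to check); statistical packages may use other numerical routines.

```python
import numpy as np

def fit_two_predictor_regression(x1, x2, y):
    """Return (b0, b1, b2) that minimize the sum of squared deviations
    between observed Y and predicted Y' = b0 + b1*X1 + b2*X2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones for the intercept b0, then X1 and X2.
    design = np.column_stack([np.ones_like(x1), x1, x2])
    # lstsq solves the least-squares problem min ||design @ b - y||^2.
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights

# Hypothetical data, constructed so that Y = 1 + 2*X1 + 3*X2 exactly.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictor_regression(x1, x2, y)
print(b0, b1, b2)  # close to 1, 2, 3 up to floating-point rounding
```

Because X1 and X2 enter the same design matrix, each weight is estimated jointly with the others, which is why the interrelationships among the variables affect the result.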
Multiple linear regression for Obesity, Inactivity, and diabetics
This example examines the relationship between two independent variables, “inactive” and “obesity”, and a dependent variable, “diabetics”. R-squared is a statistical measure used in regression analysis to evaluate the goodness of fit of a regression model. It indicates the proportion of variance in the dependent variable that can be explained by the independent variables in the model. An R-squared value of 0.34 in a multiple linear regression indicates that the independent variables included in the model collectively explain about 34% of the variability in the dependent variable.
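To make the definition of R-squared concrete, the following sketch computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The function name r_squared and the numbers are hypothetical and are not taken from the obesity, inactivity, and diabetics data; in practice the predicted values would come from a fitted regression model with an intercept.

```python
import numpy as np

def r_squared(y_observed, y_predicted):
    """Proportion of variance in Y explained by the predictions."""
    y_observed = np.asarray(y_observed, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    # Variation left unexplained by the model.
    ss_residual = np.sum((y_observed - y_predicted) ** 2)
    # Total variation of Y around its mean.
    ss_total = np.sum((y_observed - y_observed.mean()) ** 2)
    return 1.0 - ss_residual / ss_total

# Hypothetical observed and predicted values, for illustration only.
y_obs = [2.0, 4.0, 6.0, 8.0]
y_pred = [3.0, 3.0, 7.0, 7.0]

r2 = r_squared(y_obs, y_pred)
print(round(r2, 2))  # 0.8: residual SS is 4, total SS is 20
```

Read the same way, a value of 0.34 would mean the residual sum of squares is about 66% of the total sum of squares.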