P-Value:
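A p-value (short for “probability value”) is a statistical measure used in hypothesis testing to quantify the strength of evidence against a null hypothesis. It is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one actually observed, so a smaller p-value means stronger evidence against the null hypothesis. The null hypothesis is the statement that there is no effect or relationship in a given set of data.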
- For example, when testing a coin, the null hypothesis (H0) would be that the coin is fair, meaning it has an equal chance of landing on heads or tails (probability of each = 0.5). An alternative hypothesis (Ha) would be that the coin is biased towards tails, meaning it is more likely to land on tails than heads.
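The coin example can be made concrete with a small calculation. The sketch below computes a one-sided p-value for the "biased towards tails" alternative using the exact binomial distribution. The function name `binomial_p_value_upper` and the counts (100 flips, 60 tails) are made up for illustration and do not come from any dataset in these notes.

```python
from math import comb

def binomial_p_value_upper(n_flips, n_tails, p_null=0.5):
    """One-sided p-value: P(X >= n_tails) when X ~ Binomial(n_flips, p_null).

    Under H0 (fair coin, p_null = 0.5) this is the probability of seeing
    at least as many tails as were actually observed.
    """
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(n_tails, n_flips + 1)
    )

# Hypothetical experiment (illustrative numbers only): 100 flips, 60 tails.
p_value = binomial_p_value_upper(100, 60)
print(f"p-value = {p_value:.4f}")  # about 0.028
```

With a conventional significance level of 0.05, a p-value of about 0.028 would lead to rejecting H0 in favour of Ha. The threshold itself is a convention, not something the calculation provides.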
A linear regression plot of two variables, % OBESE and % DIABETIC, helps you visualize how the two are related linearly. The fitted line has the formula y = c + b*x, where y = estimated dependent variable score, c = constant (the intercept), b = coefficient (the slope), and x = score on the independent variable. Regression analysis draws a line through the data points that minimizes their overall distance from the line. More specifically, least squares regression minimizes the sum of the squared vertical differences between the data points and the line.
Following the usual practice in statistics, the Y-axis displays the dependent variable, % DIABETIC, and the X-axis shows the independent variable, % OBESE.
The Pearson correlation coefficient measures the strength of the linear association between obesity and diabetes. The coefficient here is about 0.385, which indicates a moderate positive correlation, as the correlation matrix below shows:
array([[1. , 0.38532577], [0.38532577, 1. ]])
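To show where the line y = c + b*x and the Pearson coefficient come from, here is a minimal sketch that computes both from the closed-form least squares formulas. The matrix above looks like the output of NumPy's `np.corrcoef`, but this sketch uses only the standard library so that each step is visible. The function name `fit_line_and_correlation` and the data values are hypothetical and are not the obesity/diabetes data used in these notes.

```python
from math import sqrt

def fit_line_and_correlation(x, y):
    """Return (c, b, r): intercept, slope and Pearson r for y = c + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope that minimizes the squared vertical errors
    c = mean_y - b * mean_x    # the least squares line passes through the means
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return c, b, r

# Made-up illustrative values, NOT the data behind the matrix above.
pct_obese = [15.0, 18.5, 20.1, 22.4, 25.0, 27.3]
pct_diabetic = [6.1, 7.4, 6.8, 8.9, 8.2, 9.7]

c, b, r = fit_line_and_correlation(pct_obese, pct_diabetic)
print(f"y = {c:.3f} + {b:.3f}*x, r = {r:.3f}")
```

The slope b and the correlation r always share the same sign, because both are built from the same cross-product term `sxy`.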
The Breusch-Pagan test is a statistical test used to detect heteroskedasticity in a regression model. Heteroskedasticity occurs when the variance of the residuals is not constant across all levels of the independent variables, which violates the constant-variance (homoskedasticity) assumption of linear regression. The null hypothesis of the test is that the residual variance is constant, so a small p-value is evidence of heteroskedasticity.
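The mechanics of the test can be sketched for a model with a single independent variable. The code below implements the studentized (Koenker) variant: fit the regression, regress the squared residuals on x, and compare n * R² from that auxiliary regression against a chi-square distribution with 1 degree of freedom. This is an illustration under those assumptions, not necessarily the procedure used for the analysis in these notes. The helper names and the synthetic data are hypothetical. In practice a library routine such as `het_breuschpagan` in statsmodels would normally be used.

```python
from math import erfc, sqrt

def ols(x, y):
    """Return (intercept, slope) of the least squares line y = c + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def r_squared(x, y):
    """R^2 of the least squares regression of y on x."""
    c, b = ols(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (c + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 0.0 if ss_tot == 0 else 1 - ss_res / ss_tot

def breusch_pagan(x, y):
    """Return (LM statistic, p-value) for one regressor (Koenker variant)."""
    c, b = ols(x, y)
    squared_resid = [(yi - (c + b * xi)) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression: do the squared residuals depend on x?
    lm = max(len(x) * r_squared(x, squared_resid), 0.0)
    # Survival function of a chi-square with 1 degree of freedom
    p_value = erfc(sqrt(lm / 2))
    return lm, p_value

# Synthetic data (illustrative only): the error size grows with x.
x = list(range(1, 21))
y = [2 + 0.5 * xi + (-1) ** xi * 0.1 * xi for xi in x]

lm, p_value = breusch_pagan(x, y)
print(f"LM = {lm:.2f}, p-value = {p_value:.4g}")
```

Because the spread of the errors in this synthetic example increases with x, the test should report a small p-value. Data with a constant error spread should give a large one.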