When working with bivariate data, which of these are useful when deciding whether it’s appropriate to use a linear model?
I. the scatterplot
II. the residuals plot
III. the correlation coefficient
A) I only
B) II only
C) III only
D) I and II only
E) I, II, and III
CoormaBak9
Answered question
2020-11-09
Answer & Explanation
Tuthornt
Skilled, 2020-11-10, added 107 answers
Step 1
Scatterplot:
A scatterplot is a type of data display that shows the relationship between two numerical variables. Each member of the data set is plotted as a point whose (x, y) coordinates are its values for the two variables.
When the y variable tends to increase as the x variable increases, it can be said that there is a positive correlation between the variables. In other words, when the points on the scatterplot produce a lower left to upper right pattern, there is a positive correlation between the variables.
When the y variable tends to decrease as the x variable increases, it can be said that there is a negative correlation between the variables. In other words, when the points on the scatterplot produce an upper left to lower right pattern, there is a negative correlation between the variables.
When all the points on a scatterplot lie exactly on a straight line that is not horizontal, there is a perfect linear correlation between the two variables.
A scatterplot in which the points show no linear trend (either positive or negative) is said to show zero or near-zero correlation.
If the regression line runs from lower left to upper right, the relation is positive, which indicates a positive correlation.
If the regression line runs from upper left to lower right, the relation is negative, which indicates a negative correlation.
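As a rough illustration of the last two statements, the sketch below computes the slope of the least-squares line and reads the direction of the relation from its sign. The function names and the data values are hypothetical, chosen only for this example.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def trend_direction(xs, ys, tol=1e-12):
    """Classify the linear trend by the sign of the fitted slope."""
    slope = least_squares_slope(xs, ys)
    if slope > tol:
        return "positive"   # lower left to upper right
    if slope < -tol:
        return "negative"   # upper left to lower right
    return "none"           # flat line, no linear trend


# Hypothetical data: the same x values paired with three different y patterns.
xs = [1, 2, 3, 4, 5]
rising = [52, 58, 61, 70, 74]
falling = [74, 70, 61, 58, 52]
flat = [60, 60, 60, 60, 60]

for name, ys in [("rising", rising), ("falling", falling), ("flat", flat)]:
    print(name, trend_direction(xs, ys))
```

The sign of the slope always matches the sign of the correlation coefficient, which is why either one can be used to describe the direction.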
Step 2
Residual plot:
A residual plot is a graph of the residuals (the differences between the observed and predicted values of the response variable) against a predictor variable or against the predicted values of the response variable, according to need.
A careful inspection of a residual plot shows how the residuals are scattered relative to the values they are plotted against. If the assumptions for linear regression inference are met by the predictor and response variables, each plot should be roughly centered on and symmetric about the horizontal axis, with no curve or other systematic pattern.
Thus, residual plots are the plots used to decide whether it is reasonable to presume that the assumptions for linear regression inference are met. With bivariate data there is a single predictor, so a single residual plot is enough.
The conditions to check are:
- A residual plot drawn against the predictor variable must be roughly centered on and symmetric about the horizontal axis.
- A residual plot drawn against the predicted values of the response variable must be roughly centered on and symmetric about the horizontal axis.
- The normal probability plot of the residuals must be linear or close to linear.
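The sketch below shows what these checks look for, using two small hypothetical data sets: one roughly linear and one that follows a curve. It fits a least-squares line to each and prints the sign of every residual in order of x. The helper names and the data are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def residuals(xs, ys):
    """Observed minus predicted value for each point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def signs(values):
    """Sign of each value, as a compact way to see a pattern without a plot."""
    return ["+" if v > 0 else "-" for v in values]


xs = [1, 2, 3, 4, 5, 6]
roughly_linear = [2.0, 4.1, 5.9, 8.2, 9.8, 12.0]  # hypothetical measurements
curved = [x ** 2 for x in xs]                      # follows y = x^2, not a line

linear_res = residuals(xs, roughly_linear)
curved_res = residuals(xs, curved)

print(signs(linear_res))  # signs alternate: no systematic pattern
print(signs(curved_res))  # positive at both ends, negative in the middle
```

For the roughly linear data the signs alternate, which is what a patternless residual plot looks like. For the curved data the residuals are positive at both ends and negative in the middle, a U shape that signals a linear model is not appropriate even though a line can still be fitted.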
Step 3
Correlation coefficient:
Correlation is a measure of the “go-togetherness” of two data sets. The correlation coefficient is denoted by r, and its value lies between -1 and +1. A value of +1 indicates a perfect linear relationship in which both variables move in the same direction. A value of -1 indicates a perfect linear relationship in which the variables move in opposite directions. It is zero when there is no linear relationship between the two data sets; a strong curved relationship can still give r close to zero.
The linear correlation coefficient measures the strength and the direction of a linear relationship between two variables.
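To make these values concrete, here is a minimal sketch that computes r from the usual formula, r = Sxy / sqrt(Sxx * Syy), for three hypothetical data sets. The function name `pearson_r` and the data are assumptions for this example, not part of the original question.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient r = Sxy / sqrt(Sxx * Syy).

    Assumes neither variable is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]
increasing_line = [2 * x + 1 for x in xs]   # points exactly on a rising line
decreasing_line = [-3 * x + 4 for x in xs]  # points exactly on a falling line
parabola = [x ** 2 for x in xs]             # strong relationship, but not linear

print(pearson_r(xs, increasing_line))
print(pearson_r(xs, decreasing_line))
print(pearson_r(xs, parabola))
```

The two lines give r of +1 and -1, while the parabola gives r of 0 even though y is completely determined by x. This is why r is helpful for judging the strength of a linear association but should be read together with the scatterplot and the residuals plot.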
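All three are useful: the scatterplot shows whether the overall pattern looks roughly linear, the residuals plot reveals curvature or other patterns that a fitted line misses, and the correlation coefficient measures how strong the linear association is.
Answer: E) I, II, and III.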