Can you use bivariate analysis in a multiple regression problem?

shiya43

Answered question

2022-10-29

I have one response variable and around 10 potential explanatory variables. I am looking for simple ways to visualize and explore my data before modelling. As well as carrying out PCA and looking for correlations between my explanatory variables, I wanted to look at bivariate scatter plots of the response against each explanatory variable, but I have been told this won't work.
This is the way I am picturing it:
Imagine you have n independent, uncorrelated explanatory variables.
You propose to use these variables in a multiple regression. You want to simplify the analysis by not including more variables than is necessary.
So, you want to know whether each explanatory variable has any correlation with the response variable, in order to decide whether to include it in the regression.
My feeling is 1) that you can look at bivariate plots of the response variable against each explanatory variable to see if there is a correlation.
2) If a single explanatory variable shows no correlation with the response in a bivariate analysis, it cannot have any effect on the response if included in the multiple regression.
(The only way this variable could affect the response variable would be through an interaction with another explanatory variable, but it has already been determined that all the explanatory variables are uncorrelated.)
I have been told that the above is incorrect, and that a variable showing no correlation in a bivariate plot can affect the response variable in a multiple regression.
Could someone help me see where I am going wrong? Many Thanks.

Answer & Explanation

Jimena Torres

Beginner | 2022-10-30 | Added 20 answers

If the $p$ explanatory variables are pairwise uncorrelated (and centered), the columns of the design matrix $X$ are orthogonal, i.e.,
$$x_j^T x_k = 0$$
for all $j \neq k$. For the sake of simplicity, let each vector $x_j$ also be normalized to unit length, i.e., $x_j^T x_j = 1$, so that $X^T X = I$.
The OLS vector is given by
$$\hat{\beta} = (X^T X)^{-1} X^T y = X^T y.$$
Namely,
$$\hat{\beta}_j = \sum_{i=1}^{n} x_{ij} y_i,$$
where $x_{ij}$ is the $i$th observation of the $j$th variable, meaning that the coefficient of the $j$th explanatory variable does not depend on the other variables. Note that in the simple linear regression of $y$ on $x_j$ alone, the OLS coefficient will be
$$\frac{\sum_i x_{ij} y_i}{\sum_i x_{ij}^2} = \sum_i x_{ij} y_i,$$
namely, the same as in the multiple regression.
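The equality of the two coefficients can be checked numerically. The following is a minimal sketch, not part of the original answer: it assumes NumPy, simulated data, and an arbitrary illustrative coefficient vector `beta_true`. Centered, orthonormal columns are built with a QR decomposition, and the regression is fitted without an intercept to match the formula for the OLS vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4  # illustrative sample size and number of predictors (assumed)

# Center random columns, then orthonormalize them with QR.
# The columns of Q are linear combinations of centered columns,
# so they stay centered: orthogonal here also means uncorrelated.
Z = rng.normal(size=(n, p))
Z -= Z.mean(axis=0)
X, _ = np.linalg.qr(Z)  # X.T @ X is the identity (up to rounding)

# Hypothetical true coefficients; the second predictor has no effect.
beta_true = np.array([2.0, 0.0, -1.5, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Multiple regression on all predictors at once (no intercept).
beta_multi, *_ = np.linalg.lstsq(X, y, rcond=None)

# Simple regression of y on each predictor separately.
beta_simple = np.array(
    [(X[:, j] @ y) / (X[:, j] @ X[:, j]) for j in range(p)]
)

print(np.allclose(beta_multi, beta_simple))
```

If the columns of `X` were correlated instead, `X.T @ X` would no longer be the identity and the two sets of coefficients would generally differ.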
PCA in this case will result in $p$ orthogonal eigenvectors, which can be taken to be the initial axes, and all the eigenvalues will equal 1.
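The PCA statement can be illustrated in the same way. This is a small sketch under the same assumptions (NumPy, simulated centered and orthonormal columns, arbitrary sizes). It computes the eigenvalues of $X^T X$ and checks that the original axes are valid eigenvectors. Because all the eigenvalues coincide, the principal directions are not unique, so the basis a numerical routine returns may differ from the original axes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 3  # illustrative sizes (assumed)

# Centered, orthonormal columns, built as a QR factor of centered data.
Z = rng.normal(size=(n, p))
Z -= Z.mean(axis=0)
X, _ = np.linalg.qr(Z)

# PCA on these columns amounts to an eigendecomposition of X.T @ X.
gram = X.T @ X
eigvals, eigvecs = np.linalg.eigh(gram)

# All eigenvalues equal 1 (up to floating-point rounding).
print(np.allclose(eigvals, 1.0))

# Each original axis e_j satisfies gram @ e_j = 1 * e_j,
# so the original axes form a valid set of eigenvectors.
axes = np.eye(p)
print(np.allclose(gram @ axes, axes))
```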
