Laila Murphy

2022-11-20

Correlation bound

Let x and y be two random variables such that:

Corr(x,y) = b, where Corr(x,y) denotes the correlation between x and y, and b is a scalar in the range [-1, 1]. Let y' be an estimate of y. An example could be y' = y + (rand(0,1) - 0.5) * 0.1, where rand(0,1) returns a uniform random number between 0 and 1; that is, I am adding some noise to the data.

My questions are:

Is there a way to bound the correlation between x and y', i.e. Corr(x,y')? I mentioned y' in the context of random perturbation, but I would also like to know what happens if I don't have that information and only know that y' is an estimate of y. Is there any literature that covers this?

Julius Haley

2022-11-21

Let $e = y' - y$. Assuming that $e$ is independent of $x$ and $y$ with $\mu_e = E(e) = 0$, then $\mu_{y'} = E(y') = E(y) = \mu_y$ and:

$\begin{array}{rl}\mathrm{Corr}(x,y') &= \frac{E((x-\mu_x)(y'-\mu_y))}{\sigma_x\,\sigma_{y'}}\\ &= \frac{E((x-\mu_x)(y-\mu_y)) + E((x-\mu_x)e)}{\sigma_x\,\sigma_{y'}}\\ &= \mathrm{Corr}(x,y)\,\frac{\sigma_y}{\sigma_{y'}}\end{array}$

$E((x-{\mu}_{x})e)=E(x-{\mu}_{x})E(e)=0$ since x and e are independent.

Now, ${\sigma}_{{y}^{\prime}}=\sqrt{{\sigma}_{y}^{2}+{\sigma}_{e}^{2}}$, again by independence, so:

$\mathrm{Corr}(x,y') = \mathrm{Corr}(x,y)\,\frac{1}{\sqrt{1+\left(\frac{\sigma_e}{\sigma_y}\right)^2}}$

So we always have $|\mathrm{Corr}(x,y')| < |\mathrm{Corr}(x,y)|$ whenever $\sigma_e > 0$.

For the specific $e$ you have given, which is uniform on $[-0.05, 0.05]$, we have $\sigma_e = \frac{0.1}{\sqrt{12}} \approx 0.029$.
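As a sanity check, the attenuation formula is easy to verify numerically. The sketch below is my own illustration: it assumes jointly normal $x, y$ with $b = 0.8$ (the value of $b$ and the sample size are arbitrary choices) and uses the uniform noise from the question.

```python
# Monte Carlo check of Corr(x, y') = Corr(x, y) / sqrt(1 + (sigma_e/sigma_y)^2).
# Assumptions (mine, for illustration): jointly normal x, y with b = 0.8 and
# unit variances; e is the uniform noise (rand(0,1) - 0.5) * 0.1 from the question.
import numpy as np

rng = np.random.default_rng(0)
n = 500_000
b = 0.8

# Jointly normal (x, y) with Corr(x, y) = b.
cov = [[1.0, b], [b, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Uniform noise on [-0.05, 0.05], so sigma_e = 0.1 / sqrt(12).
e = (rng.random(n) - 0.5) * 0.1
y_prime = y + e

sigma_e = 0.1 / np.sqrt(12)
predicted = b / np.sqrt(1 + sigma_e**2)  # sigma_y = 1 here
observed = np.corrcoef(x, y_prime)[0, 1]
print(predicted, observed)  # the two values should agree to about 3 decimals
```

Note that with such small noise the attenuation is tiny (the factor is about 0.9996), which is why a large sample is needed to see it at all.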

"Estimation" has no technical meaning here: you can always say that $e = y' - y$ is just another random variable. If you don't know that $e$ is independent of $x$, you don't know what $E((x-\mu_x)(e-\mu_e))$ is. If you don't know that $e$ and $y$ are independent, you can't express $\sigma_{y'}$ in terms of $\sigma_e$ and $\sigma_y$; in particular, you don't know that $\sigma_{y'} > \sigma_y$.

A simple example: if $y' = x$, then $\mathrm{Corr}(x,y') = 1$. So if $x$ and $y$ are close enough that $x$ can be said to be an "estimate of $y$", then $\mathrm{Corr}(x,y') = \mathrm{Corr}(x,x) = 1 > \mathrm{Corr}(x,y)$ whenever $b < 1$.
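To make the caveat concrete, here is a small simulation (my own illustration; the value $b = 0.9$ is an arbitrary choice) where $y'$ is built from $x$ rather than from independent noise, so the attenuation bound fails:

```python
# Sketch of the caveat: if y' is allowed to depend on x, the attenuation
# bound fails. Take y' = x itself, a plausible "estimate" of y when x and y
# are highly correlated. Then e = y' - y is NOT independent of x.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
b = 0.9
cov = [[1.0, b], [b, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

y_prime = x  # "estimate" of y constructed from x
corr_xy = np.corrcoef(x, y)[0, 1]              # close to 0.9
corr_xyp = np.corrcoef(x, y_prime)[0, 1]       # 1, up to floating point
print(corr_xy, corr_xyp)
```

Here the correlation with the "estimate" is larger than with $y$ itself, the opposite of what independent noise produces.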

mxty42ued

2022-11-22

Let $X, Y$ be random variables with a given correlation $b$. Let $Z$ be any random variable independent of $\sigma(X,Y)$ with $\mathbb{E}[Z] = 0$ and strictly positive, finite variance. Here $Z$ is 'noise' that will contribute to $Y' = Y + Z$.

Notice that $Cov(X,Z)=0$ and

$Cov(X,Y') = Cov(X,Y) + Cov(X,Z) = Cov(X,Y)$, since $Z$ is independent of $X$. Moreover, $Var(Y') = Var(Y) + Var(Z) > Var(Y)$ by independence. We conclude that

$|Corr(X,{Y}^{\prime})|=\left|\frac{Cov(X,{Y}^{\prime})}{\sqrt{Var(X)Var({Y}^{\prime})}}\right|<\left|\frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}\right|=|Corr(X,Y)|$

We conclude that the addition of any zero-expectation, independent noise of finite variance will diminish the correlation.
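A short simulation (my own sketch; the Laplace noise, $b = -0.6$, and sample size are arbitrary choices) illustrates that the shrinkage holds for any independent zero-mean noise with finite variance, not just small uniform perturbations, and for negative correlations too:

```python
# The conclusion does not depend on the shape of the noise: any independent,
# zero-mean, finite-variance Z shrinks |Corr|. Here Z is Laplace with
# Var(Z) = 2, so the factor is 1/sqrt(1 + Var(Z)/Var(Y)) = 1/sqrt(3).
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
b = -0.6  # works for negative correlation as well
cov = [[1.0, b], [b, 1.0]]
X, Y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

Z = rng.laplace(loc=0.0, scale=1.0, size=n)  # Var(Z) = 2 * scale^2 = 2
Y_prime = Y + Z

abs_corr_before = abs(np.corrcoef(X, Y)[0, 1])        # close to 0.6
abs_corr_after = abs(np.corrcoef(X, Y_prime)[0, 1])   # close to 0.6/sqrt(3)
print(abs_corr_before, abs_corr_after)
```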
