 Laila Murphy

2022-11-20

Correlation bound
Let x and y be two random variables such that:
Corr(x, y) = b, where Corr(x, y) denotes the correlation between x and y, and b is a scalar in the range [-1, 1]. Let y' be an estimate of y. An example could be y' = y + (rand(0,1) - 0.5) * 0.1, where rand(0,1) gives a random number uniformly distributed between 0 and 1; that is, I am adding some noise to the data.
My questions are:
Is there a way to bound the correlation between x and y', i.e. Corr(x, y')? I mentioned y' in light of random perturbation; I would also like to know what happens if I don't have that information and only know that y' is an estimate of y. Is there any literature that covers this?
Julius Haley

Let $e = y' - y$. Assuming that $e$ is independent of $x$ and $y$ with $\mu_e = E(e) = 0$, then $\mu_{y'} = E(y') = E(y) = \mu_y$ and:
$\begin{aligned}\operatorname{Corr}(x, y') &= \frac{E\big((x-\mu_x)(y'-\mu_y)\big)}{\sigma_x \sigma_{y'}}\\ &= \frac{E\big((x-\mu_x)(y-\mu_y)\big) + E\big((x-\mu_x)e\big)}{\sigma_x \sigma_{y'}}\\ &= \operatorname{Corr}(x, y)\,\frac{\sigma_y}{\sigma_{y'}}\end{aligned}$
Here $E\big((x-\mu_x)e\big) = E(x-\mu_x)\,E(e) = 0$, since $x$ and $e$ are independent.
Now, $\sigma_{y'} = \sqrt{\sigma_y^2 + \sigma_e^2}$, again by independence, so:
$\operatorname{Corr}(x, y') = \operatorname{Corr}(x, y)\,\frac{1}{\sqrt{1 + \left(\frac{\sigma_e}{\sigma_y}\right)^2}}$
So, whenever $\sigma_e > 0$ and $\operatorname{Corr}(x, y) \ne 0$, we definitely have $|\operatorname{Corr}(x, y')| < |\operatorname{Corr}(x, y)|$.
For the specific $e$ you have given, $e$ is uniform on $[-0.05, 0.05]$, so $\sigma_e = \frac{0.1}{\sqrt{12}}$.
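A quick Monte Carlo sketch confirms the attenuation formula; the choice of $y = x + \text{Gaussian noise}$ is an arbitrary illustrative setup, not part of the question:

```python
import numpy as np

# Monte Carlo check of Corr(x, y') = Corr(x, y) / sqrt(1 + (sigma_e/sigma_y)^2).
rng = np.random.default_rng(0)
n = 1_000_000

x = rng.normal(size=n)
y = x + rng.normal(size=n)            # Corr(x, y) is roughly 1/sqrt(2)

# The perturbation from the question: e = (rand(0,1) - 0.5) * 0.1,
# i.e. uniform on [-0.05, 0.05], so sigma_e = 0.1 / sqrt(12).
e = (rng.random(n) - 0.5) * 0.1
y_prime = y + e

corr_xy = np.corrcoef(x, y)[0, 1]
corr_xyp = np.corrcoef(x, y_prime)[0, 1]

sigma_e = 0.1 / np.sqrt(12)
predicted = corr_xy / np.sqrt(1 + (sigma_e / y.std()) ** 2)

print(corr_xy, corr_xyp, predicted)
```

The sample correlation with the perturbed variable should match the predicted attenuated value up to Monte Carlo error.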
Technically, "estimation" by itself carries no information: you can always say that $e = y' - y$ is just another random variable. If you don't know that $e$ is independent of $x$, you don't know what $E\big((x-\mu_x)(e-\mu_e)\big)$ is. If you don't know that $e$ and $y$ are independent, you cannot express $\sigma_{y'}$ in terms of $\sigma_e$ and $\sigma_y$; in particular, you don't know that $\sigma_{y'} > \sigma_y$.
A simple example: if $y' = x$ then $\operatorname{Corr}(x, y') = 1$. So if $x$ and $y$ are close enough that $x$ can be said to be an "estimate of $y$", then $\operatorname{Corr}(x, x) = 1 > \operatorname{Corr}(x, y)$. mxty42ued
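The counterexample above can be checked numerically: taking $y' = x$ itself as the "estimate" (a deliberately degenerate choice, not the question's perturbation) yields correlation 1 regardless of $\operatorname{Corr}(x, y)$, so "being an estimate" alone bounds nothing:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = x + rng.normal(size=100_000)   # Corr(x, y) is strictly below 1

y_prime = x                        # x itself used as an "estimate" of y
corr = np.corrcoef(x, y_prime)[0, 1]
print(corr)                        # equals 1 up to floating point
```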

Let $X, Y$ be random variables with a given correlation $b$. Let $Z$ be any random variable independent of $\sigma(X, Y)$ with strictly positive, finite variance and $\mathbb{E}[Z] = 0$. Here $Z$ is "noise" that contributes to $Y' = Y + Z$.
Notice that $Cov\left(X,Z\right)=0$ and
$Cov(X, Y') = Cov(X, Y) + Cov(X, Z) = Cov(X, Y)$ since $Z$ is independent of $X$. Moreover, $Var(Y') = Var(Y) + Var(Z) > Var(Y)$ by independence. We conclude that
$|Corr\left(X,{Y}^{\prime }\right)|=|\frac{Cov\left(X,{Y}^{\prime }\right)}{\sqrt{Var\left(X\right)Var\left({Y}^{\prime }\right)}}|<|\frac{Cov\left(X,Y\right)}{\sqrt{Var\left(X\right)Var\left(Y\right)}}|=|Corr\left(X,Y\right)|$
We conclude that adding any zero-mean, independent noise with finite positive variance strictly diminishes the correlation in absolute value (whenever $Corr(X, Y) \ne 0$; if it is zero, it stays zero).
