How to compute Bias and Variance for the given scenarios?

rigliztetbf

Answered question

2022-06-12

I'm currently studying the "Learning from Data" course by Professor Yaser Abu-Mostafa, and I do not get the "bias-variance tradeoff" part of it. Actually, the concepts are fine; the math is the problem.
In Lecture 08, he defined bias and variance as follows:
$\text{Bias} = E_x\left[(\bar{g}(x) - f(x))^2\right]$, where $\bar{g}(x) = E_D\left[g^{(D)}(x)\right]$
$\text{Var} = E_x\left[E_D\left[(g^{(D)}(x) - \bar{g}(x))^2\right]\right]$
To clarify the notation:
$D$ means the data set $(x_1, y_1), \dots, (x_n, y_n)$.
$g$ is the function that approximates $f$; i.e., I'm estimating $f$ by using $g$. In this case, $g$ is chosen by an algorithm $A$ from the hypothesis set $H$.
After that, he proposed an example that was stated in the following manner:
Example: Let $f(x) = \sin(\pi x)$ and a data set $D$ of size $N = 2$. We sample $x$ uniformly in $[-1, 1]$ to generate $(x_1, y_1)$ and $(x_2, y_2)$. Now, suppose that I have two models, $H_0$ and $H_1$.
$H_0$: $h(x) = b$
$H_1$: $h(x) = ax + b$
For $H_0$, let $b = \frac{y_1 + y_2}{2}$. For $H_1$, choose the line that passes through $(x_1, y_1)$ and $(x_2, y_2)$.
Simulating the process as described, he states that:
For $H_0$, Bias $\approx 0.50$ and Var $\approx 0.25$.
For $H_1$, Bias $\approx 0.21$ and Var $\approx 1.69$.
Here is my main question: How can one get these results analytically?
I've tried to solve the integrals that come from the $E[\cdot]$ (it didn't work), but I'm not sure if I'm interpreting correctly which distribution is which. For example, how do I evaluate $E_D\left[g^{(D)}(x)\right]$? It is the same as evaluating $E_D[b]$ or $E_D[ax + b]$ for $H_0$ and $H_1$, respectively, right? The random variable that has a uniform distribution over $[-1, 1]$ is $x$, right? Thus $E_x[\cdot]$ is evaluated with respect to a random variable that follows the $U[-1, 1]$ distribution, right?
If anyone could help me understand at least one of the two scenarios by deriving the provided numbers for the Bias and Var quantities, it would be extremely helpful.
Thanks in advance,
André

Answer & Explanation

Cristian Hamilton

Beginner · 2022-06-13 · Added 23 answers

The answer to all your questions is “yes”. (Where you write “evaluating $E_D[b]$ or $E_D[ax + b]$ for $H_0$ and $H_1$”, $a$ and $b$ need to be computed from the data as given in the problem statement, e.g. $b = \frac{y_1 + y_2}{2}$.)
I'll calculate the bias and variance for $H_0$.
We have
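$$\bar{g}(x) = E_D\left[g^{(D)}(x)\right] = \int_{-1}^{1} \frac{dx_1}{2} \int_{-1}^{1} \frac{dx_2}{2} \, \frac{\sin \pi x_1 + \sin \pi x_2}{2} = 0,$$
so the bias is
$$E_x\left[(\bar{g}(x) - f(x))^2\right] = E_x\left[f(x)^2\right] = \int_{-1}^{1} \frac{dx}{2} \, \sin^2 \pi x = \frac{1}{2}$$
and the variance is
$$E_x\left[E_D\left[(g^{(D)}(x) - \bar{g}(x))^2\right]\right] = \int_{-1}^{1} \frac{dx}{2} \int_{-1}^{1} \frac{dx_1}{2} \int_{-1}^{1} \frac{dx_2}{2} \left(\frac{\sin \pi x_1 + \sin \pi x_2}{2}\right)^2 = \int_{-1}^{1} \frac{dx_1}{2} \int_{-1}^{1} \frac{dx_2}{2} \left(\frac{\sin \pi x_1 + \sin \pi x_2}{2}\right)^2 = \frac{1}{4}.$$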
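The last equality can be checked by expanding the square. This is a short sketch that assumes only that $x_1$ and $x_2$ are independent and uniform on $[-1, 1]$, so that $E[\sin \pi x_i] = 0$ and $E[\sin^2 \pi x_i] = \frac{1}{2}$:

$$E_D\left[\left(\frac{\sin \pi x_1 + \sin \pi x_2}{2}\right)^2\right] = \frac{1}{4}\left(E\left[\sin^2 \pi x_1\right] + 2\,E[\sin \pi x_1]\,E[\sin \pi x_2] + E\left[\sin^2 \pi x_2\right]\right) = \frac{1}{4}\left(\frac{1}{2} + 0 + \frac{1}{2}\right) = \frac{1}{4}$$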
I don't know why they're given with ≈, as these are their exact values.
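As a sanity check on the $H_0$ numbers, a Monte Carlo simulation along the following lines should land close to $\frac{1}{2}$ and $\frac{1}{4}$. This is only a sketch: the seed, the number of simulated data sets, the grid size, and all variable names are my own choices, not something from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n_datasets = 200_000             # number of simulated data sets D (assumption)

# Each row is one data set D = {(x1, y1), (x2, y2)} with x ~ U[-1, 1].
x_train = rng.uniform(-1.0, 1.0, size=(n_datasets, 2))
y_train = np.sin(np.pi * x_train)

# H0: the fitted constant is b = (y1 + y2) / 2, one value per data set.
b = y_train.mean(axis=1)

# Midpoint grid on [-1, 1], used to approximate E_x[...].
n_grid = 1000
x = -1.0 + (np.arange(n_grid) + 0.5) * (2.0 / n_grid)
f = np.sin(np.pi * x)

g_bar = b.mean()                    # average hypothesis (constant in x)
bias0 = np.mean((g_bar - f) ** 2)   # E_x[(g_bar - f(x))^2]
var0 = np.mean((b - g_bar) ** 2)    # E_D[(b - g_bar)^2]; no x dependence

print(f"H0: bias ~ {bias0:.3f}, var ~ {var0:.3f}")
```

The estimates fluctuate a little with the seed and sample size, which may be why the lecture quotes rounded values.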
For $H_1$, you'll have more involved integrations, since you get $x_2 - x_1$ in the denominator:
$$a = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\sin \pi x_2 - \sin \pi x_1}{x_2 - x_1}$$
and
$$b = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1} = \frac{x_2 \sin \pi x_1 - x_1 \sin \pi x_2}{x_2 - x_1}.$$
Also, in this case you have an actual dependence on $x$, whereas for $H_0$ the integration over $x$ for the variance was trivial since $g$ was constant.
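If the integrals for $H_1$ are too unwieldy, the quoted values can at least be approximated numerically. Below is a minimal Monte Carlo sketch using the formulas for $a$ and $b$ above; the seed, sample size, grid size, and variable names are assumptions of mine, and the result is an estimate rather than an exact value.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n_datasets = 40_000              # number of simulated data sets D (assumption)

# Draw the two training points of every data set, x ~ U[-1, 1].
x_train = rng.uniform(-1.0, 1.0, size=(n_datasets, 2))
x1, x2 = x_train[:, 0], x_train[:, 1]
y1, y2 = np.sin(np.pi * x1), np.sin(np.pi * x2)

# H1: line through (x1, y1) and (x2, y2).
# x1 == x2 has probability zero for continuous draws, so it is not handled here.
a = (y2 - y1) / (x2 - x1)
b = (x2 * y1 - x1 * y2) / (x2 - x1)

# Midpoint grid on [-1, 1], used to approximate E_x[...].
n_grid = 100
x = -1.0 + (np.arange(n_grid) + 0.5) * (2.0 / n_grid)
f = np.sin(np.pi * x)

# g[i, j] is the hypothesis fitted on data set i, evaluated at x[j].
g = a[:, None] * x[None, :] + b[:, None]

g_bar = g.mean(axis=0)              # estimate of E_D[g^(D)(x)] at each x
bias1 = np.mean((g_bar - f) ** 2)   # E_x[(g_bar(x) - f(x))^2]
var1 = np.mean((g - g_bar) ** 2)    # E_x[E_D[(g^(D)(x) - g_bar(x))^2]]

print(f"H1: bias ~ {bias1:.2f}, var ~ {var1:.2f}")
```

Because $E_x[x] = 0$ and $E_x[x^2] = \frac{1}{3}$ under $U[-1, 1]$, the variance here also reduces to $\frac{1}{3}\operatorname{Var}_D(a) + \operatorname{Var}_D(b)$, which gives a cheaper cross-check that needs no grid over $x$.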
