How to compute Bias and Variance for the given scenarios?
rigliztetbf
2022-06-12
I'm currently studying the "Learning from Data" course by Professor Yaser Abu-Mostafa, and I do not get the "bias-variance tradeoff" part of it. Actually, the concepts are fine; the math is the problem.
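In Lecture 08, he defined bias and variance as follows:
$$\text{Bias} = \mathbb{E}_x\left[\left(\bar{g}(x) - f(x)\right)^2\right], \qquad \text{Var} = \mathbb{E}_x\left[\mathbb{E}_{\mathcal{D}}\left[\left(g^{(\mathcal{D})}(x) - \bar{g}(x)\right)^2\right]\right],$$
where $\bar{g}(x) = \mathbb{E}_{\mathcal{D}}\left[g^{(\mathcal{D})}(x)\right]$.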
To clarify the notation:
$\mathcal{D}$ means the data set $(x_1, y_1), \dots, (x_N, y_N)$.
$g$ is the function that approximates $f$; i.e., I'm estimating $f$ by using $g$. In this case, $g$ is chosen by an algorithm in the hypothesis set $\mathcal{H}$.
After that, he proposed an example that was stated in the following manner:
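Example: Let $f(x) = \sin(\pi x)$ and a data set $\mathcal{D}$ of size $N=2$. We sample $x$ uniformly in $[-1,1]$ to generate $(x_1, y_1)$ and $(x_2, y_2)$. Now, suppose that I have two models, $\mathcal{H}_0: h(x) = b$ and $\mathcal{H}_1: h(x) = ax + b$.
For $\mathcal{H}_0$, let $b = \frac{y_1 + y_2}{2}$. For $\mathcal{H}_1$, choose the line that passes through $(x_1, y_1)$ and $(x_2, y_2)$.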
Simulating the process as described, he states that:
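For $\mathcal{H}_0$, $\text{Bias} \approx 0.50$ and $\text{Var} \approx 0.25$.
For $\mathcal{H}_1$, $\text{Bias} \approx 0.21$ and $\text{Var} \approx 1.69$.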
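For reference, here is a minimal sketch of how the simulation could be set up in Python with NumPy. It assumes that $\mathbb{E}_{\mathcal{D}}$ is an average over many independently drawn two-point data sets and that $\mathbb{E}_x$ is an average over an evenly spaced grid of test points in $[-1,1]$. The function name `estimate_bias_variance`, the number of data sets, the grid size and the seed are illustrative choices, not taken from the lecture.

```python
import numpy as np


def target(x):
    """Target function f(x) = sin(pi * x) from the example."""
    return np.sin(np.pi * x)


def estimate_bias_variance(n_datasets=20_000, n_grid=200, seed=0):
    """Monte Carlo estimate of (bias, var) for the two models.

    n_datasets, n_grid and seed are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)

    # One row per data set D = {(x1, y1), (x2, y2)}, with x drawn from U[-1, 1].
    x = rng.uniform(-1.0, 1.0, size=(n_datasets, 2))
    y = target(x)

    # H0: h(x) = b, with b the average of the two y values.
    slope0 = np.zeros(n_datasets)
    intercept0 = y.mean(axis=1)

    # H1: h(x) = a*x + b, the line through the two points.
    slope1 = (y[:, 1] - y[:, 0]) / (x[:, 1] - x[:, 0])
    intercept1 = y[:, 0] - slope1 * x[:, 0]

    # Midpoint grid on [-1, 1]; averaging over it approximates E_x.
    x_test = np.linspace(-1.0, 1.0, n_grid, endpoint=False) + 1.0 / n_grid

    results = {}
    for name, a, b in (("H0", slope0, intercept0), ("H1", slope1, intercept1)):
        # g[i, j] = hypothesis learned from data set i, evaluated at x_test[j].
        g = a[:, None] * x_test[None, :] + b[:, None]
        g_bar = g.mean(axis=0)  # average hypothesis, one value per test point
        bias = np.mean((g_bar - target(x_test)) ** 2)
        var = np.mean(g.var(axis=0))  # E_x[ E_D[(g - g_bar)^2] ]
        results[name] = (float(bias), float(var))
    return results


if __name__ == "__main__":
    for name, (bias, var) in estimate_bias_variance().items():
        print(f"{name}: bias ~ {bias:.2f}, var ~ {var:.2f}")
```

With enough data sets the estimates should land close to the figures quoted in the lecture, up to Monte Carlo noise.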
Here is my main question: How can one get these results analytically?
I've tried to solve the integrals that came from the definitions (it didn't work), but I'm not sure if I'm interpreting in the right way which distribution is which. For example, how to evaluate $\mathbb{E}_{\mathcal{D}}\left[g^{(\mathcal{D})}(x)\right]$ (it is the same as evaluating $\mathbb{E}_{\mathcal{D}}[b]$ or $\mathbb{E}_{\mathcal{D}}[ax + b]$, for $\mathcal{H}_0$ and $\mathcal{H}_1$, respectively, right?)? The random variable which has uniform distribution over $[-1,1]$ is $x$, right? Thus $\mathbb{E}_x[\cdot]$ is evaluated with respect to a random variable that follows a $U[-1,1]$ distribution, right?
If anyone could help me understand at least one of the two scenarios by deriving the provided numbers for the Bias and Var quantities, it would be extremely helpful.
Thanks in advance,
André
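
As a sketch of how the constant model could be worked out by hand, assume that $x_1$, $x_2$ and the test point $x$ are independent and uniform on $[-1,1]$, that $f(x) = \sin(\pi x)$, that $\mathbb{E}_{\mathcal{D}}$ averages over $(x_1, x_2)$, and that $\mathbb{E}_x$ averages over the test point. Under those assumptions, with $g^{(\mathcal{D})}(x) = b = \tfrac{1}{2}\left(\sin(\pi x_1) + \sin(\pi x_2)\right)$:

$$
\begin{aligned}
\bar{g}(x) &= \mathbb{E}_{\mathcal{D}}[b] = \tfrac{1}{2}\left(\mathbb{E}[\sin(\pi x_1)] + \mathbb{E}[\sin(\pi x_2)]\right) = 0, \quad \text{since } \tfrac{1}{2}\int_{-1}^{1} \sin(\pi t)\,dt = 0, \\
\text{Bias} &= \mathbb{E}_x\left[\left(0 - \sin(\pi x)\right)^2\right] = \tfrac{1}{2}\int_{-1}^{1} \sin^2(\pi x)\,dx = \tfrac{1}{2}, \\
\text{Var} &= \mathbb{E}_x\left[\mathbb{E}_{\mathcal{D}}\left[b^2\right]\right] = \tfrac{1}{4}\left(\mathbb{E}[\sin^2(\pi x_1)] + \mathbb{E}[\sin^2(\pi x_2)] + 2\,\mathbb{E}[\sin(\pi x_1)]\,\mathbb{E}[\sin(\pi x_2)]\right) = \tfrac{1}{4}\left(\tfrac{1}{2} + \tfrac{1}{2} + 0\right) = \tfrac{1}{4}.
\end{aligned}
$$

The cross term factorizes because $x_1$ and $x_2$ are independent, and $b$ does not depend on the test point, so the outer $\mathbb{E}_x$ in the variance has no effect. The linear model follows the same recipe with $g^{(\mathcal{D})}(x) = ax + b$, but there the expectations over $(x_1, x_2)$ are double integrals that are harder to evaluate in closed form.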