How to compute Bias and Variance for the given scenarios?

I'm currently studying the "Learning from Data" course by Professor Yaser Abu-Mostafa, and I do not get the "bias-variance tradeoff" part of it. Actually, the concepts are fine; the math is the problem.

In Lecture 08, he defined bias and variance as follows:

$$\text{Bias} = E_x\left[\left(\bar g(x) - f(x)\right)^2\right], \quad \text{where } \bar g(x) = E_{\mathcal D}\left[g^{(\mathcal D)}(x)\right]$$

$$\text{Var} = E_x\left[E_{\mathcal D}\left[\left(g^{(\mathcal D)}(x) - \bar g(x)\right)^2\right]\right]$$

To clarify the notation:

$\mathcal D$ means the data set $(x_1, y_1), \dots, (x_n, y_n)$.

$g$ is the function that approximates $f$; i.e., I'm estimating $f$ by using $g$. In this case, $g$ is chosen by an algorithm $\mathcal A$ in the hypothesis set $\mathcal H$.

After that, he proposed an example that was stated in the following manner:

Example: Let $f(x) = \sin(\pi x)$ and a data set $\mathcal D$ of size $N = 2$. We sample $x$ uniformly in $[-1, 1]$ to generate $(x_1, y_1)$ and $(x_2, y_2)$. Now, suppose that I have two models, $\mathcal H_0$ and $\mathcal H_1$:

$$\mathcal H_0: h(x) = b$$

$$\mathcal H_1: h(x) = ax + b$$

For $\mathcal H_0$, let $b = \frac{y_1 + y_2}{2}$. For $\mathcal H_1$, choose the line that passes through $(x_1, y_1)$ and $(x_2, y_2)$.

Simulating the process as described, he states that:

For $\mathcal H_0$, $\text{Bias} \approx 0.50$ and $\text{Var} \approx 0.25$.

For $\mathcal H_1$, $\text{Bias} \approx 0.21$ and $\text{Var} \approx 1.69$.

Here is my main question: how can one get these results analytically?

I've tried to solve the integrals that came from the $E[\cdot]$ (it didn't work), but I'm not sure if I'm interpreting in the right way which distribution is which. For example, how to evaluate $E_{\mathcal D}\left[g^{(\mathcal D)}(x)\right]$? It is the same as evaluating $E_{\mathcal D}[b]$ or $E_{\mathcal D}[ax + b]$, for $\mathcal H_0$ and $\mathcal H_1$, respectively, right? The random variable which has uniform distribution over $[-1, 1]$ is $x$, right? Thus $E_x[\cdot]$ is evaluated with respect to a random variable that follows the $U[-1, 1]$ distribution, right?

If anyone could help me to understand at least one of the two scenarios, by achieving the provided numbers for the Bias and Var quantities, it would be extremely helpful.

Thanks in advance,
André


Cristian Hamilton · Accepted Answer

The answer to all your questions is “yes”. (Where you write “evaluating $E_{\mathcal D}[b]$ or $E_{\mathcal D}[ax + b]$ for $\mathcal H_0$ and $\mathcal H_1$”, $a$ and $b$ need to be computed from the data as given in the problem statement, e.g. $b = \frac{y_1 + y_2}{2}$.)

I'll calculate the bias and variance for $\mathcal H_0$. We have

$$\begin{aligned}
\bar g(x) &= E_{\mathcal D}\left[g^{(\mathcal D)}(x)\right] \\
&= \int_{-1}^{1} \frac{dx_1}{2} \int_{-1}^{1} \frac{dx_2}{2} \, \frac{\sin \pi x_1 + \sin \pi x_2}{2} \\
&= 0,
\end{aligned}$$

so the bias is

$$\begin{aligned}
E_x\left[\left(\bar g(x) - f(x)\right)^2\right] &= E_x\left[f(x)^2\right] \\
&= \int_{-1}^{1} \frac{dx}{2} \, \sin^2 \pi x \\
&= \frac{1}{2}
\end{aligned}$$

and the variance is

$$\begin{aligned}
E_x\left[E_{\mathcal D}\left[\left(g^{(\mathcal D)}(x) - \bar g(x)\right)^2\right]\right] &= \int_{-1}^{1} \frac{dx}{2} \int_{-1}^{1} \frac{dx_1}{2} \int_{-1}^{1} \frac{dx_2}{2} \left(\frac{\sin \pi x_1 + \sin \pi x_2}{2}\right)^2 \\
&= \int_{-1}^{1} \frac{dx_1}{2} \int_{-1}^{1} \frac{dx_2}{2} \left(\frac{\sin \pi x_1 + \sin \pi x_2}{2}\right)^2 \\
&= \frac{1}{4}.
\end{aligned}$$

I don't know why they're given with ≈, as these are their exact values.

For $\mathcal H_1$, you'll have more involved integrations, since you get $x_2 - x_1$ in the denominator:

$$a = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\sin \pi x_2 - \sin \pi x_1}{x_2 - x_1}$$

and

$$b = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1} = \frac{x_2 \sin \pi x_1 - x_1 \sin \pi x_2}{x_2 - x_1}.$$

Also, in this case you have an actual dependence on $x$, whereas for $\mathcal H_0$ the integration over $x$ for the variance was trivial since $g$ was constant.
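As a numerical cross-check of the values above, here is a minimal Monte Carlo sketch in Python with NumPy. It is not the lecture's own simulation. The function names (`bias_variance`, `fit_h0`, `fit_h1`), the number of data sets, the grid size, and the seed are all assumptions chosen for illustration. It draws many data sets of size 2, fits each model, and averages over a midpoint grid on $[-1, 1]$ to approximate $E_x$.

```python
import numpy as np


def bias_variance(fit, n_datasets=20_000, n_grid=100, seed=0):
    """Monte Carlo estimate of (bias, variance) for a model fitted on N = 2 points.

    `fit(x1, y1, x2, y2)` is a hypothetical helper returning arrays (a, b),
    the slope and intercept of the fitted hypothesis for each data set.
    """
    rng = np.random.default_rng(seed)

    # Draw n_datasets independent data sets D = {(x1, y1), (x2, y2)}.
    x1 = rng.uniform(-1.0, 1.0, n_datasets)
    x2 = rng.uniform(-1.0, 1.0, n_datasets)
    y1, y2 = np.sin(np.pi * x1), np.sin(np.pi * x2)
    a, b = fit(x1, y1, x2, y2)

    # Midpoint grid on [-1, 1]; averaging over it approximates E_x for x ~ U[-1, 1].
    x = np.linspace(-1.0, 1.0, n_grid, endpoint=False) + 1.0 / n_grid
    f = np.sin(np.pi * x)

    g = a[:, None] * x[None, :] + b[:, None]  # g^(D)(x), shape (n_datasets, n_grid)
    g_bar = g.mean(axis=0)                    # sample estimate of the average hypothesis
    bias = np.mean((g_bar - f) ** 2)          # E_x[(g_bar(x) - f(x))^2]
    var = np.mean((g - g_bar) ** 2)           # E_x[E_D[(g^(D)(x) - g_bar(x))^2]]
    return bias, var


def fit_h0(x1, y1, x2, y2):
    # H0: constant hypothesis h(x) = b with b = (y1 + y2) / 2, so the slope is zero.
    b = (y1 + y2) / 2
    return np.zeros_like(b), b


def fit_h1(x1, y1, x2, y2):
    # H1: line through both points (x1 == x2 has probability zero for continuous draws).
    a = (y2 - y1) / (x2 - x1)
    b = (x2 * y1 - x1 * y2) / (x2 - x1)
    return a, b


bias_h0, var_h0 = bias_variance(fit_h0)
bias_h1, var_h1 = bias_variance(fit_h1)
print(f"H0: bias={bias_h0:.2f}, var={var_h0:.2f}")
print(f"H1: bias={bias_h1:.2f}, var={var_h1:.2f}")
```

With enough data sets, the estimates should land close to the figures quoted in the question, up to sampling noise. For $\mathcal H_0$ they should also agree with the exact values $\frac{1}{2}$ and $\frac{1}{4}$ derived above.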
