vittorecostao1

2022-06-21

I'm reading The Elements of Statistical Learning. I have a question about the curse of dimensionality.

In section 2.5, p.22:

Consider N data points uniformly distributed in a p-dimensional unit ball centered at the origin. Suppose we consider a nearest-neighbor estimate at the origin. The median distance from the origin to the closest data point is given by the expression:

$d(p,N)={(1-\frac{1}{{2}^{1/N}})}^{1/p}.$

For N=500, p=10, $d(p,N)\approx 0.52$, more than halfway to the boundary. Hence most data points are closer to the boundary of the sample space than to any other data point.

I accept the equation. My question is, how do we deduce this conclusion?

jarakapak7

2022-06-22

This is exercise 2.3, which they refer to.

PDF stands for probability density function, and CDF for cumulative distribution function.

The former is the derivative of the latter, since we are dealing with continuous distributions.

The volume of a ball of radius r in ${\mathbb{R}}^{p}$ is ${\omega}_{p}{r}^{p}$, where ${\omega}_{p}$ is a constant depending only on p, given by the shorthand

${\omega}_{p}=\frac{{\pi}^{p/2}}{(p/2)!},$

where $(p/2)!$ is read as $\mathrm{\Gamma}(p/2+1)$ when p is odd.

As a result, the probability that a point, taken uniformly in the unit ball, is within distance x of the origin is the volume of that ball divided by the volume of the unit ball. The factors of ${\omega}_{p}$ cancel, so we get the CDF

$F(x)={x}^{p},\quad 0\le x\le 1.$

The corresponding PDF is the derivative,

$f(x)=p{x}^{p-1},\quad 0\le x\le 1.$
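The CDF above can be checked numerically. The following is only a rough sketch, not part of the book or of Hogg and Craig: it draws uniform points in the unit ball by rejection sampling from the enclosing cube (practical only for small p, so p=3 is assumed here) and compares the fraction within distance x with ${x}^{p}$. The function name `empirical_cdf_in_ball` and the sample size are made up for illustration.

```python
import random


def empirical_cdf_in_ball(x, p, n_samples=20000, seed=0):
    """Fraction of uniform points in the unit p-ball within distance x of the origin.

    Points are drawn uniformly in the cube [-1, 1]^p and kept only if they
    fall inside the unit ball (rejection sampling), so this is slow for large p.
    """
    rng = random.Random(seed)
    accepted = 0
    inside = 0
    while accepted < n_samples:
        point = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        r2 = sum(c * c for c in point)  # squared distance from the origin
        if r2 <= 1.0:
            accepted += 1
            if r2 <= x * x:
                inside += 1
    return inside / accepted


p = 3  # assumed small dimension so that rejection sampling stays cheap
for x in (0.25, 0.5, 0.75):
    # empirical fraction next to the claimed CDF value x**p
    print(x, empirical_cdf_in_ball(x, p), x ** p)
```

The two printed columns should agree up to sampling noise; in 10 dimensions the same check would need a different sampler, because almost every cube sample gets rejected.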

From page 150, section 4.6 of Introduction to Mathematical Statistics by Hogg and Craig, we are told that the marginal (individual) PDF for ${y}_{1}$, the smallest order statistic (the minimum) of n points with CDF F and PDF f, is

$g(y)=n{(1-F(y))}^{n-1}f(y).$

In our case that gives

$g(y)=n{(1-{y}^{p})}^{n-1}p{y}^{p-1},$

which is easily integrated to give the CDF

$G(y)=1-{(1-{y}^{p})}^{n}.$
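As a cross-check that does not rely on the quoted order-statistic density, the same CDF follows directly from independence: the minimum ${y}_{1}$ exceeds y exactly when all n points do, so

$P({y}_{1}>y)={(1-F(y))}^{n}={(1-{y}^{p})}^{n},$

and therefore

$G(y)=1-P({y}_{1}>y)=1-{(1-{y}^{p})}^{n},\quad 0\le y\le 1.$

Differentiating this G with respect to y recovers the density $g(y)=n{(1-{y}^{p})}^{n-1}p{y}^{p-1}$.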

The expected value of y, the mean, is a messy integral. Instead, for a continuous variable, the median is simply defined as the value of y such that G(y)=1/2. If you repeated the experiment, the probability of getting a minimum less than the median is 50%, and the probability of getting a minimum greater than the median is also 50%. The median and mean are probably quite close for the usual bell curve. I'm not sure whether the median and mean are necessarily close to one another in this case, because the polynomial in question is restricted to a bounded interval.

I don't understand how you could ever read this book without having taken a complete semester of calculus-based quantitative statistics.

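Solving G(y)=1/2 gives ${(1-{y}^{p})}^{n}=1/2$, so ${y}^{p}=1-\frac{1}{{2}^{1/n}}$ and $y={(1-\frac{1}{{2}^{1/n}})}^{1/p}$, which is their expression with n=N.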
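To see the numbers, here is a small sketch (an illustration only) that evaluates the closed-form median for the book's case N=500, p=10 and compares it with a simulated median of the minimum distance. The simulation assumes the CDF $F(x)={x}^{p}$ derived above, so each distance is drawn as ${U}^{1/p}$ with U uniform on [0, 1]; the function names and the number of trials are made up.

```python
import random
import statistics


def median_nn_distance(p, n):
    """Closed-form median distance from the origin to the nearest of n points."""
    return (1.0 - 0.5 ** (1.0 / n)) ** (1.0 / p)


def simulated_median(p, n, trials=2000, seed=0):
    """Median over repeated experiments of the smallest of n distances.

    Assumption: each distance has CDF F(x) = x**p on [0, 1], so it can be
    sampled by inverse transform as U ** (1 / p) with U uniform on [0, 1].
    """
    rng = random.Random(seed)
    minima = []
    for _ in range(trials):
        minima.append(min(rng.random() ** (1.0 / p) for _ in range(n)))
    return statistics.median(minima)


p, n = 10, 500  # the case quoted from the book
print("closed form:", round(median_nn_distance(p, n), 4))
print("simulated:  ", round(simulated_median(p, n), 4))
```

Both values should come out close to the 0.52 quoted in the book, with the simulated one differing only by sampling noise.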
