Kendal Day

2022-04-08

Trouble Understanding the Formal Definition of a Confidence Interval
A $1-\alpha$ confidence interval for a parameter $\theta$ is an interval ${C}_{n}=\left(a,b\right)$ where $a=a\left({X}_{1},\dots ,{X}_{n}\right)$ and $b=b\left({X}_{1},\dots ,{X}_{n}\right)$ are functions of the data such that
${\mathbb{P}}_{\theta }\left(\theta \in {C}_{n}\right)\ge 1-\alpha$ for all $\theta \in \Theta$.
If $\theta$ is a vector then we use a confidence set (such as a sphere or an ellipse) instead of an interval.
Question:
While I understand conceptually what a confidence interval is (i.e., a 95% CI means that 95% of experiments will trap the parameter in the interval), I don't understand how this formal definition captures that concept.
In particular, I don't understand what is meant by ${\mathbb{P}}_{\theta }\left(\theta \in {C}_{n}\right)$. What is the sample space that ${\mathbb{P}}_{\theta }$ is defined on? What is the set $\left\{\theta \in {C}_{n}\right\}$? It seems that $\theta$ is being treated both as a fixed value (in the notation ${\mathbb{P}}_{\theta }$) and as a random variable (in the notation $\theta \in {C}_{n}$).

${C}_{n}$ is an interval with random endpoints, denoted $a$ and $b$. Both endpoints are functions of your sample ${X}_{1},{X}_{2},\dots ,{X}_{n}$, and the joint distribution of the $X$'s is parametrized by $\theta$, hence the subscript on ${\mathbb{P}}_{\theta }$. The parameter $\theta$ that governs this joint distribution is nonrandom and generally unknown (the mission of the CI is to capture this unknown parameter). The set $\left\{\theta \in {C}_{n}\right\}$ is shorthand for
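$\left\{\omega :a\left({X}_{1}\left(\omega \right),{X}_{2}\left(\omega \right),\dots ,{X}_{n}\left(\omega \right)\right)<\theta <b\left({X}_{1}\left(\omega \right),{X}_{2}\left(\omega \right),\dots ,{X}_{n}\left(\omega \right)\right)\right\}\phantom{\rule{2em}{0ex}}\left(1\right)$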
Viewed this way, the event (1) is more a statement about the random endpoints $a$ and $b$ than about the parameter $\theta$: it asks whether the random left endpoint is less than the number $\theta$ and the random right endpoint is greater than the number $\theta$. In the frequentist treatment of confidence intervals, $\theta$ is not a random variable; in other treatments it is possible to regard $\theta$ as the observed value of a random variable, but that's not what this definition appears to be about.
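To see that the randomness lives in the endpoints rather than in $\theta$, it can help to simulate. The sketch below is only an illustration under assumptions that are not part of the question: the data are i.i.d. normal with unknown mean $\theta$ and known standard deviation `sigma`, and the interval is the textbook z-interval $\bar{X}\pm z_{\alpha/2}\,\sigma/\sqrt{n}$. The function names `interval` and `coverage` are made up for this example. Each repetition plays the role of one $\omega$: $\theta$ stays fixed, a fresh sample is drawn, and we record whether $a<\theta<b$.

```python
import random
from statistics import NormalDist, fmean


def interval(sample, sigma, alpha):
    """Return the endpoints (a, b) of a z-interval for the mean.

    Both endpoints are functions of the sample only; theta is never used here.
    Assumes the standard deviation sigma is known (an illustrative assumption).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / len(sample) ** 0.5
    center = fmean(sample)
    return center - half_width, center + half_width


def coverage(theta, sigma=1.0, n=25, alpha=0.05, reps=10_000, seed=0):
    """Estimate P_theta(a < theta < b) by repeating the experiment reps times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # One "omega": a new sample drawn from the distribution indexed by theta.
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        a, b = interval(sample, sigma, alpha)
        # theta is a fixed number; only a and b change between repetitions.
        hits += a < theta < b
    return hits / reps


if __name__ == "__main__":
    # The fraction of intervals that trap theta should be close to 0.95
    # whichever fixed theta generated the data.
    for theta in (-3.0, 0.0, 10.0):
        print(theta, coverage(theta))
```

The estimated fraction is a Monte Carlo approximation of ${\mathbb{P}}_{\theta }\left(\theta \in {C}_{n}\right)$, and running it for several values of $\theta$ mirrors the "for all $\theta$" part of the definition.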
