Mathematical Statistics I

UChicago STAT 30100, Winter 2024

Basic Statistics

We model the outcome of a statistical experiment by some P_\theta, where \theta \in \Theta is a parameter space; we take some X_1, \dots, X_n \sim P_\theta, and we ask: can we summarize X_1, \dots, X_n by some statistic T = T(X_1, \dots, X_n) without losing information?

Def: A sufficient statistic is a statistic T = T(X_1, \dots, X_n) such that the conditional distribution of X \mid T does not depend on \theta.

The intuition is that a sufficient statistic retains all information about \theta. In fact, if we sample (Y_1, \dots, Y_n) from the conditional distribution of (X_1, \dots, X_n) given T — which by sufficiency requires no knowledge of \theta — then Y_1, \dots, Y_n has the same joint distribution as X_1, \dots, X_n.
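This resampling property can be checked numerically in a hypothetical Bernoulli example (not from the notes): for X_1, \dots, X_n \sim \mathrm{Bern}(\theta) with T = \sum_i X_i, the conditional law of X given T = t is uniform over arrangements of t ones, so independently permuting the entries of each sample is a draw from X \mid T that never consults \theta.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta, reps = 5, 0.3, 200_000

# Draw reps samples X ~ Bernoulli(theta)^n and record T = sum(X).
X = (rng.random((reps, n)) < theta).astype(int)
T = X.sum(axis=1)

# Resample Y | T: given T = t, X | T is uniform over arrangements of
# t ones, so a uniform permutation of each row is a draw from the
# conditional distribution -- note theta is never used here.
Y = rng.permuted(X, axis=1)
assert (Y.sum(axis=1) == T).all()  # resampling preserves T exactly

# The joint law of Y should match that of X; compare the empirical
# probability of the particular outcome (1, 0, ..., 0), whose true
# value is theta * (1 - theta)^(n-1) = 0.3 * 0.7^4.
p_x = np.mean((X[:, 0] == 1) & (X[:, 1:] == 0).all(axis=1))
p_y = np.mean((Y[:, 0] == 1) & (Y[:, 1:] == 0).all(axis=1))
print(p_x, p_y)
```

Both empirical frequencies land near \theta(1-\theta)^{n-1} \approx 0.072, as the sufficiency argument predicts.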

Example: Let X_1, \dots, X_n \sim N(\theta, 1); then T = n^{-1} \sum_{i=1}^n X_i = \bar X is a sufficient statistic. In fact, \left[\begin{matrix}X_1 \\ \vdots \\ X_n\end{matrix}\right] \mid T \sim N \left( \left[\begin{matrix} \bar X \\ \vdots \\ \bar X\end{matrix}\right], I - n^{-1} U \right) where U = \mathbf{1}\mathbf{1}^{\top} is the n \times n all-ones matrix.
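The conditional covariance can be verified by the standard Gaussian conditioning formula, using \operatorname{Cov}(X, \bar X) = n^{-1}\mathbf{1} and \operatorname{Var}(\bar X) = n^{-1}:

```latex
\operatorname{Cov}(X \mid \bar X)
  = \operatorname{Cov}(X)
    - \frac{\operatorname{Cov}(X, \bar X)\,\operatorname{Cov}(X, \bar X)^{\top}}
           {\operatorname{Var}(\bar X)}
  = I - \frac{(n^{-1}\mathbf{1})(n^{-1}\mathbf{1})^{\top}}{n^{-1}}
  = I - n^{-1}\,\mathbf{1}\mathbf{1}^{\top}.
```

Note that neither the conditional mean \bar X \mathbf{1} nor this covariance involves \theta, which is exactly what sufficiency requires.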

Def: Let X_1, \dots, X_n \sim P_\theta for some \theta \in \Theta; the order statistic is T = (X_{(1)}, \dots, X_{(n)}), where X_{(1)} \leq \dots \leq X_{(n)} is the sorted list of observations. This is a sufficient statistic whenever X_1, \dots, X_n are exchangeable.

Theorem (Factorization): Suppose P_\theta is either discrete or continuous; then T = T(X) is sufficient if and only if p(x \mid \theta) = g_\theta(T(x)) h(x) for all x and \theta, where p is the density (or mass) function corresponding to P_\theta.
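To illustrate the factorization in the normal example above, expanding the square in the N(\theta, 1) joint density gives

```latex
p(x \mid \theta)
  = (2\pi)^{-n/2} \exp\!\Big( -\tfrac{1}{2} \sum_{i=1}^n (x_i - \theta)^2 \Big)
  = \underbrace{\exp\!\big( n\theta \bar x - \tfrac{n\theta^2}{2} \big)}_{g_\theta(T(x))}
    \cdot
    \underbrace{(2\pi)^{-n/2} \exp\!\Big( -\tfrac{1}{2} \sum_{i=1}^n x_i^2 \Big)}_{h(x)},
```

so the \theta-dependent factor involves x only through T(x) = \bar x, confirming that \bar X is sufficient.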

Proof: The continuous case is similar to the discrete one, so we sketch the discrete case. The backwards direction is a direct computation with conditional probabilities. For the forwards direction, note that P_\theta(X = x) = P_\theta(X = x, T(X) = T(x)) = P(X = x \mid T(X) = T(x)) \, P_\theta(T(X) = T(x)); by sufficiency the first factor does not depend on \theta, so we may take h(x) = P(X = x \mid T(X) = T(x)) and g_\theta(T(x)) = P_\theta(T(X) = T(x)).
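A discrete example (not from the notes) makes the backwards-direction computation concrete: for X_1, \dots, X_n \sim \mathrm{Bern}(\theta) and T = \sum_i X_i, the joint mass function factors as \theta^{t}(1-\theta)^{n-t} with t = \sum_i x_i, and the conditional law given T is

```latex
P_\theta(X = x \mid T = t)
  = \frac{\theta^{t} (1-\theta)^{n-t}}
         {\binom{n}{t}\, \theta^{t} (1-\theta)^{n-t}}
  = \binom{n}{t}^{-1},
```

which is free of \theta, so T is sufficient, matching the factorization g_\theta(t) = \theta^{t}(1-\theta)^{n-t}, h(x) = 1.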