Suppose that we have data points (X_1, y_1), \dots, (X_n, y_n); throughout, we treat the X_i as fixed, i.e. we always implicitly condition on them. The generalized linear model (GLM) lets us move beyond continuous real-valued responses to more diverse outputs, such as counts or binary labels. A particularly common case is y_i drawn from an exponential family.

Def: A link function g describes how the mean \mu_i = E[y_i] (or, more precisely, E[y_i \mid X_i]) depends on X_i. In particular, we will have
g(E[y_i]) = g(\mu_i) = X_i^T \beta.
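
To make the definition concrete, here is a minimal sketch in Python with NumPy. It assumes g is invertible, so that \mu_i = g^{-1}(X_i^T \beta). The design matrix, the coefficients, the `LINKS` table, and the helper `mean_from_linear_predictor` are all hypothetical names and values chosen for illustration; the three links shown (identity, logit, log) are common textbook choices, not ones fixed by these notes.

```python
import numpy as np

# Hypothetical design matrix (n = 3 rows, intercept plus one feature)
# and hypothetical coefficient vector beta.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
beta = np.array([-1.0, 1.0])

# Each entry maps a link name to the pair (g, g_inverse).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda mu: np.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + np.exp(-eta))),
    "log": (np.log, np.exp),
}


def mean_from_linear_predictor(X, beta, link):
    """Return mu = g^{-1}(X beta) for the named link."""
    _, g_inv = LINKS[link]
    eta = X @ beta  # linear predictor, one entry X_i^T beta per row of X
    return g_inv(eta)


for name, (g, _) in LINKS.items():
    mu = mean_from_linear_predictor(X, beta, name)
    # Applying g to the mean recovers the linear predictor X_i^T beta.
    print(name, mu, np.allclose(g(mu), X @ beta))
```

With the identity link the mean can be any real number, the logit link keeps \mu_i in (0, 1), and the log link keeps \mu_i positive, which is why the choice of g is tied to the kind of response being modeled.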