I am studying a tutorial on Maximum Likelihood Estimation in Linear Regression and I have a question.
When we have more than one regressor (a.k.a. multiple linear regression), the model comes in its matrix form
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \tag{1}$$
where $\mathbf{y}$ is the response vector, $\mathbf{X}$ is the design matrix with each of its rows specifying under what design or conditions the corresponding response is observed (hence the name), $\boldsymbol{\beta}$ is the vector of regression coefficients, and $\boldsymbol{\varepsilon}$ is the residual vector, distributed as a zero-mean multivariate Gaussian with diagonal covariance matrix $\sigma^2 \mathbf{I}_N$, where $\mathbf{I}_N$ is the $N \times N$ identity matrix. Therefore
$$\mathbf{y} \sim \mathcal{N}\!\left(\mathbf{X}\boldsymbol{\beta},\, \sigma^2 \mathbf{I}_N\right), \tag{2}$$
meaning that the linear combination $\mathbf{X}\boldsymbol{\beta}$ explains (or predicts) the response $\mathbf{y}$ with uncertainty characterized by a variance of $\sigma^2$.
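For concreteness, if I expand the density that (2) implies using the standard multivariate Gaussian formula (this expansion is mine, not a formula quoted from the tutorial), I get
$$p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\!\left(-\frac{1}{2\sigma^2}\,(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^\top (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})\right).$$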
Assume $\mathbf{y} \in \mathbb{R}^N$, $\mathbf{X} \in \mathbb{R}^{N \times p}$, and $\boldsymbol{\beta} \in \mathbb{R}^p$. Under the model assumptions, we aim to estimate the unknown parameters ($\boldsymbol{\beta}$ and $\sigma^2$) from the data available ($\mathbf{X}$ and $\mathbf{y}$).
Maximum likelihood (ML) estimation is the most common approach. We maximize the log-likelihood w.r.t. $\boldsymbol{\beta}$ and $\sigma^2$.
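As I understand it, taking the log of the Gaussian density above gives the matrix-form log-likelihood being maximized (my own writing-out; the tutorial's notation may differ slightly):
$$\ell(\boldsymbol{\beta}, \sigma^2) = \log p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}, \sigma^2) = -\frac{N}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\,\lVert \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \rVert_2^2.$$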
I am trying to understand how the log-likelihood, $\ell(\boldsymbol{\beta}, \sigma^2)$, is formed. Normally, I have seen these problems when we have $\mathbf{x}_i$ as a vector of size $d$ (where $d$ is the number of parameters for each data point). Specifically, when $\mathbf{x}_i$ is a vector, I wrote the log-likelihood as
$$\ell(\boldsymbol{\beta}, \sigma^2) = \sum_{i=1}^{N} \log \mathcal{N}\!\left(y_i \mid \mathbf{x}_i^\top \boldsymbol{\beta},\, \sigma^2\right) = -\frac{N}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{N}\left(y_i - \mathbf{x}_i^\top \boldsymbol{\beta}\right)^2.$$
But in the case shown in this tutorial, there is no index $i$ over which to apply the summation.
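To check my understanding numerically, here is a short Python sketch (with synthetic data; the sizes N, p and the values of beta and sigma2 are arbitrary choices of mine) comparing the matrix form with my per-observation sum:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
N, p = 10, 3                      # arbitrary sample size and number of regressors
X = rng.normal(size=(N, p))       # synthetic design matrix
beta = rng.normal(size=p)         # synthetic coefficient vector
sigma2 = 0.5                      # synthetic noise variance
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=N)  # simulate model (1)

# Matrix form: log N(y | X beta, sigma^2 I_N), with no index i anywhere
ll_matrix = multivariate_normal.logpdf(y, mean=X @ beta, cov=sigma2 * np.eye(N))

# Per-observation form: sum_i log N(y_i | x_i^T beta, sigma^2)
ll_sum = norm.logpdf(y, loc=X @ beta, scale=np.sqrt(sigma2)).sum()

print(ll_matrix, ll_sum)  # the two numbers agree up to floating-point error
```

The two values do agree, so I suspect that the diagonal covariance $\sigma^2 \mathbf{I}_N$ makes the joint density factor into $N$ independent univariate Gaussians, which is where the summation comes from once the log is taken. Is that the right way to see it?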