Three Questions on Interpreting the Outcome of the Probability Density Function
Armeninilu
2022-06-11
I have three questions: What is the interpretation of the value of the probability density function (PDF) at a particular point? How is this value related to probability? What exactly are we doing when we maximize the likelihood?

To better explain my questions:

(i) Consider a continuous random variable X with a normal distribution with mean $\mu$ and variance $\sigma^2$. If we evaluate the PDF at a particular point, say 3.4, using the formula

$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(X-\mu)^2}{2\sigma^2}\right)$$

we get $f(3.4) = 0.1144$. How do we interpret this value?

(ii) I previously read that the result of 0.1144 is not necessarily the probability that X takes the value 3.4. But how is this result related to the concept of probability?

(iii) Consider a sample $x_1, \dots, x_N$ of size $N$ from the continuous random variable X, with mean $\mu$ and variance $\sigma^2$. We can use this sample to maximize the log-likelihood:

$$\ln L(\mu, \sigma^2) = \sum_{i=1}^{N} \ln f(x_i) = -\frac{N}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{N}(x_i-\mu)^2$$
If f(X) is not exactly a probability, what are we maximizing? Some texts say that "we are maximizing the probability that a model (set of parameters) reproduces the original data". Is this statement incorrect?
Answer & Explanation
Leland Ochoa
2022-06-12
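I'm not an expert, but this is the way I understand it. Denote the cumulative distribution function (CDF) by

$$F(x) = P(X \le x).$$

(i) For small ε,

$$f(x) \approx \frac{F(x+\varepsilon) - F(x)}{\varepsilon},$$

so

$$P(x < X \le x+\varepsilon) = F(x+\varepsilon) - F(x) \approx \varepsilon f(x). \qquad (*)$$

That is, the PDF $f$ gives the "rate of change" of the CDF $F$. To illustrate using your example, we can approximate $P(3.4 < X \le 3.4+\varepsilon)$ for a small ε using (∗) above. We get

$$P(3.4 < X \le 3.4+\varepsilon) \approx \varepsilon f(3.4) = 0.1144\,\varepsilon.$$

Using Excel to compute the exact probability $F(3.4+\varepsilon) - F(3.4)$ gives a value that is very close to our approximation. (I also used Excel to compute $f(3.4)$.)

(ii) As you said, $f(3.4)$ is not the probability that X=3.4. In fact, the probability that X=3.4 if X is a continuous random variable is 0, since

$$P(X = 3.4) = \int_{3.4}^{3.4} f(x)\,dx = 0.$$

In general, the probability that X=x, where x is any real number, is 0.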
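The approximation (∗) can be checked numerically without Excel. The following is only a minimal sketch: the parameter values μ = 1.5 and σ² = 2 are my own assumption (chosen because they reproduce the quoted value 0.1144 at 3.4), as are ε = 0.01 and the helper names `pdf`, `cdf` and `interval_prob`.

```python
import math

# Assumed parameters (not given in the text above); they reproduce f(3.4) ~ 0.1144.
MU = 1.5
SIGMA2 = 2.0


def pdf(x, mu=MU, sigma2=SIGMA2):
    """Normal density f(x): a rate of probability per unit of x, not a probability."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def cdf(x, mu=MU, sigma2=SIGMA2):
    """Normal CDF F(x) = P(X <= x), written with the error function."""
    return 0.5 * (1 + math.erf((x - mu) / math.sqrt(2 * sigma2)))


def interval_prob(x, eps):
    """Exact probability P(x < X <= x + eps) = F(x + eps) - F(x)."""
    return cdf(x + eps) - cdf(x)


if __name__ == "__main__":
    x, eps = 3.4, 0.01  # eps is an arbitrary small width picked for illustration
    print("f(x)            =", pdf(x))
    print("eps * f(x)      =", eps * pdf(x))
    print("F(x+eps) - F(x) =", interval_prob(x, eps))
    print("P(X = x)        =", interval_prob(x, 0.0))  # zero-width interval
```

Shrinking `eps` brings the two interval values closer together relative to their size, while the probability itself shrinks to 0, which is the point of (ii).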
Tristian Velazquez
2022-06-13
The probability density function is the derivative of the cumulative distribution function; it is never negative, because the cumulative distribution function never decreases.
It may be considered the "gradient of the tangent" of that curve; that is, the "rate of change" of the accumulation of probability as the value of the continuous random variable increases. So when you maximise the likelihood, you are choosing the parameters that make probability accumulate as fast as possible in the immediate neighbourhood of the observed data points.
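To make "what are we maximising" concrete, here is a rough sketch of maximum likelihood for a normal model. Everything specific in it is assumed for illustration: the simulated sample (25 draws with mean 1.5 and variance 2), the seed, and the function names. It uses the standard closed-form maximisers for the normal case (sample mean, and variance with divisor N) and also shows a density value above 1, which a probability could never be.

```python
import math
import random


def normal_logpdf(x, mu, sigma2):
    """Log of the normal density at x."""
    return -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)


def log_likelihood(sample, mu, sigma2):
    """Sum of log densities at the data points: the quantity being maximised."""
    return sum(normal_logpdf(x, mu, sigma2) for x in sample)


def normal_mle(sample):
    """Closed-form maximisers for the normal model (variance uses divisor N)."""
    n = len(sample)
    mu_hat = sum(sample) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, sigma2_hat


# Hypothetical data: 25 simulated draws; none of these numbers come from the question.
random.seed(0)
sample = [random.gauss(1.5, math.sqrt(2.0)) for _ in range(25)]

mu_hat, sigma2_hat = normal_mle(sample)
best = log_likelihood(sample, mu_hat, sigma2_hat)

# A density is not bounded by 1: a narrow normal has a tall peak.
peak_density = math.exp(normal_logpdf(0.0, 0.0, 0.01))

if __name__ == "__main__":
    print("estimates:", mu_hat, sigma2_hat)
    print("log-likelihood at the estimates:", best)
    print("log-likelihood at shifted mean :", log_likelihood(sample, mu_hat + 0.1, sigma2_hat))
    print("peak density with variance 0.01:", peak_density)
```

Since each factor f(x_i) times a small width ε approximates the probability of landing within ε of x_i, maximising the product of densities amounts to maximising the approximate probability of observing data this close to the sample, for any fixed ε.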