Kellen Perkins

Answered question

2022-05-26

Is the Bayesian Prior representing a hypothesis with no data, or with all data?
I have a conceptual question about Bayes' theorem. In
$$p(z \mid x) = \frac{p(x \mid z)\, p(z)}{p(x)},$$
the term p(z) is usually interpreted as the prior probability distribution of a hypothesis z before observing any data x.
However, if we write p(z) as the marginal
$$p(z) = \int p(z, x)\, dx = \int p(z \mid x)\, p(x)\, dx = \mathbb{E}_{x \sim p(x)}\left[ p(z \mid x) \right],$$
then the term p(z) seems to contain the knowledge about all data x.
Therefore, is the prior really representing the hypothesis with no data, or with all data?
Are we really no smarter with all the data than we are with no data?
Or is it a question of perspective?
How should I understand the prior correctly?
Thank you!

Answer & Explanation

Harley Fitzpatrick

2022-05-27

The prior encodes the asker's existing belief about the state of the world, before the data at hand are taken into account. It can be understood either in the context of specific prior knowledge that's given to you (if the question rests on certain assumptions) or as an entire philosophy.
For the former, suppose you've been told, before flipping a coin, that it has previously been observed to come up heads 99% of the time. Depending on how strongly you decide to weight this as evidence, you may decide it should count "as if" you had already seen several extra flips. This leads to the concept of conjugate priors, which are mathematically convenient choices that give the posterior the same form as the prior. They expose the underlying correspondence: Bayesian inference updates your prior with additional evidence.
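The "extra flips" reading can be sketched with the Beta-Binomial conjugate pair. This is only an illustration: the choice to weight the 99% report like 100 earlier flips, the experiment of 10 new flips, and the function names are all assumptions made for the example.

```python
from fractions import Fraction

def beta_binomial_update(prior_heads, prior_tails, heads, tails):
    """Conjugate update: Beta(a, b) prior plus coin flips gives Beta(a + heads, b + tails)."""
    return prior_heads + heads, prior_tails + tails

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, i.e. the expected probability of heads."""
    return Fraction(a) / (Fraction(a) + Fraction(b))

# Assumption: the "99% heads" report is weighted like 100 earlier flips (99 heads, 1 tail).
prior = (99, 1)
# Hypothetical experiment: 10 new flips, 4 heads and 6 tails.
posterior = beta_binomial_update(*prior, heads=4, tails=6)

print(posterior)              # (103, 7)
print(beta_mean(*prior))      # 99/100
print(beta_mean(*posterior))  # 103/110
```

The prior and the observed flips enter the update in exactly the same way, as counts, which is the sense in which the prior behaves like previously seen data.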
You may, at one extreme, decide this is conclusive information and assign it infinite weight, in which case you will in effect ignore any and all evidence to the contrary. At the other extreme, a frequentist analysis would completely ignore the pre-existing information and estimate the coin's bias purely from what was observed in the experiment.
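The two extremes can be seen as limits of a single formula. The sketch below assumes a Beta prior parametrized by its mean and a "weight" in pseudo-flips (a parametrization chosen here for illustration, not taken from the discussion above). Weight 0 is a limiting case that corresponds to an improper prior.

```python
def posterior_mean(prior_mean, prior_weight, heads, flips):
    """Posterior mean for a Beta prior with mean `prior_mean` worth `prior_weight` pseudo-flips."""
    return (prior_mean * prior_weight + heads) / (prior_weight + flips)

# Hypothetical experiment: 4 heads in 10 flips, so the data-only estimate is 4 / 10.
heads, flips = 4, 10

# Sweep the weight given to the "99% heads" report, from none to overwhelming.
for weight in (0, 10, 100, 1e9):
    print(weight, round(posterior_mean(0.99, weight, heads, flips), 3))
```

With weight 0 the estimate equals the observed frequency, and as the weight grows the estimate approaches the prior mean of 0.99 regardless of what the 10 flips showed.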
As a philosophy, Bayesian statistics fundamentally rejects the frequentist assumption that there is a fully objective description of the world independent of the asker's experience and belief. See, for example, this XKCD for a humorous comparison.
Some choices of prior, such as the Jeffreys prior, may be better suited to expressing complete ignorance, but even for these it is arguable whether "minimizing information" truly embodies "ignorance".
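As for the marginal written in the question: it averages the posterior over every outcome x that could occur, weighted by p(x), so it does not condition on any outcome that was actually observed. By the law of total probability that average is the prior itself. A small exact check, assuming a hypothetical two-hypothesis coin (the hypotheses and all numbers are invented for illustration):

```python
from fractions import Fraction as F

# Hypothetical setup: z is "fair" or "biased", x is the result of one flip.
prior = {"fair": F(1, 2), "biased": F(1, 2)}                 # p(z)
likelihood = {"fair": {"H": F(1, 2), "T": F(1, 2)},          # p(x | z)
              "biased": {"H": F(99, 100), "T": F(1, 100)}}

# Evidence: p(x) = sum over z of p(x | z) p(z)
evidence = {x: sum(likelihood[z][x] * prior[z] for z in prior) for x in ("H", "T")}

# Posterior from Bayes' theorem: p(z | x) = p(x | z) p(z) / p(x)
posterior = {x: {z: likelihood[z][x] * prior[z] / evidence[x] for z in prior}
             for x in evidence}

# Average the posterior over all possible outcomes, weighted by p(x)
averaged = {z: sum(evidence[x] * posterior[x][z] for x in evidence) for z in prior}

print(posterior["H"]["biased"])  # 99/149
print(averaged == prior)         # True
```

Observing a particular flip moves the belief (here to 99/149 after one head), while averaging over flips that have not happened yet returns the prior unchanged. In that sense the marginal contains no more knowledge than the prior does.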
