Proportionality is used to simplify analysis
Bayesian analysis is generally done via an even simpler statement of Bayes' theorem, where we work only in terms of proportionality with respect to the parameter of interest. For a standard IID model with sampling density $f(\mathbf{x}|\theta)$ we can express this as:

$$p(\theta|\mathbf{x}) \propto L_\mathbf{x}(\theta) \cdot p(\theta)
\qquad \qquad
L_\mathbf{x}(\theta) \propto \prod_{i=1}^n f(x_i|\theta).$$
This statement of Bayesian updating works in terms of proportionality with respect to the parameter $\theta$. It uses two proportionality simplifications: one in the use of the likelihood function (proportional to the sampling density) and one in the posterior (proportional to the product of likelihood and prior). Since the posterior is a density function (in the continuous case), the norming rule then sets the multiplicative constant that is required to yield a valid density (i.e., to make it integrate to one).
This use of proportionality has the advantage of allowing us to ignore any multiplicative elements of the functions that do not depend on the parameter $\theta$. This tends to simplify the problem by letting us sweep away unnecessary parts of the mathematics and obtain simpler statements of the updating mechanism. This is not a mathematical requirement (since Bayes' rule works in its non-proportional form too), but it makes things simpler for our tiny animal brains.
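To make this concrete, here is a minimal numerical sketch of proportional updating (my own illustration in Python, not part of the derivation; the function name `posterior_on_grid` is invented). We evaluate the posterior kernel only up to a multiplicative constant, and apply the norming rule at the very end:

```python
import numpy as np

def posterior_on_grid(log_prior, log_likelihood, grid):
    """Grid approximation of Bayesian updating in proportional form.

    We only evaluate the unnormalised log-kernel log L + log prior; any
    additive constant in log-space (i.e., multiplicative constant on the
    density scale) is harmless because normalisation removes it.
    """
    log_kernel = log_likelihood(grid) + log_prior(grid)
    kernel = np.exp(log_kernel - log_kernel.max())   # shift for numerical stability
    dx = grid[1] - grid[0]
    return kernel / (kernel.sum() * dx)              # the norming rule

# Illustrative usage with the normal model analysed below (n = 5, xbar = 0.8,
# prior precision 2); both inputs deliberately omit their theta-free constants.
grid = np.linspace(-3.0, 3.0, 2001)
post = posterior_on_grid(
    log_prior=lambda t: -0.5 * 2.0 * t**2,
    log_likelihood=lambda t: -0.5 * 5 * (t**2 - 2 * 0.8 * t),
    grid=grid,
)
```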
An applied example: Consider an IID model with observed data $X_1, ..., X_n \sim \text{IID N}(\theta, 1)$. To facilitate our analysis we define the statistics $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$ and $\bar{\bar{x}} = \frac{1}{n} \sum_{i=1}^n x_i^2$, which are the first two sample moments. For this model we have sampling density:
$$\begin{aligned}
f(\mathbf{x}|\theta)
&= \prod_{i=1}^n f(x_i|\theta)
 = \prod_{i=1}^n \text{N}(x_i|\theta, 1)
 = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}} \exp\Big( -\frac{1}{2} (x_i - \theta)^2 \Big) \\[6pt]
&= (2\pi)^{-n/2} \exp\Big( -\frac{1}{2} \sum_{i=1}^n (x_i - \theta)^2 \Big) \\[6pt]
&= (2\pi)^{-n/2} \exp\Big( -\frac{n}{2} (\theta^2 - 2\bar{x}\theta + \bar{\bar{x}}) \Big) \\[6pt]
&= (2\pi)^{-n/2} \exp\Big( -\frac{n\bar{\bar{x}}}{2} \Big) \cdot \exp\Big( -\frac{n}{2} (\theta^2 - 2\bar{x}\theta) \Big).
\end{aligned}$$
Now, we can work directly with this sampling density if we want to. But notice that the first two factors in this density are multiplicative constants that do not depend on $\theta$. It is annoying to have to keep track of these terms, so let's just get rid of them, leaving us with the likelihood function:
$$L_\mathbf{x}(\theta) = \exp\Big( -\frac{n}{2} (\theta^2 - 2\bar{x}\theta) \Big).$$
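As a quick sanity check (an addition of mine, using simulated data, not part of the original working), we can confirm numerically that the full sampling density and this simplified likelihood differ only by a multiplicative factor that is free of $\theta$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.7, 1.0, size=10)        # illustrative data only
n, xbar, xbarbar = len(x), x.mean(), (x**2).mean()

def f(theta):
    """Full sampling density, evaluated on an array of theta values."""
    return (2 * np.pi) ** (-n / 2) * np.exp(-0.5 * np.sum((x[:, None] - theta) ** 2, axis=0))

def L(theta):
    """Simplified likelihood with the theta-free factors removed."""
    return np.exp(-0.5 * n * (theta ** 2 - 2 * xbar * theta))

thetas = np.linspace(-1.0, 2.0, 5)
print(f(thetas) / L(thetas))                                # constant in theta ...
print((2 * np.pi) ** (-n / 2) * np.exp(-n * xbarbar / 2))   # ... equal to the dropped factor
```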
That simplifies things a little bit, since we don't have to keep track of an additional term. Now, we could apply Bayes' rule using its full equation form, including the integral in the denominator. But again, this requires us to keep track of another annoying multiplicative constant that does not depend on $\theta$ (all the more annoying because we have to solve an integral to get it). So let's just apply Bayes' rule in its proportional form. Using the conjugate prior $\theta \sim \text{N}(0, \lambda_0)$, where $\lambda_0 > 0$ is a known precision parameter (we parameterise the normal distribution by its precision throughout), we get the following result (by completing the square):
$$\begin{aligned}
p(\theta|\mathbf{x})
&\propto L_\mathbf{x}(\theta) \cdot p(\theta) \\[6pt]
&= \exp\Big( -\frac{n}{2} (\theta^2 - 2\bar{x}\theta) \Big) \cdot \text{N}(\theta|0, \lambda_0) \\[6pt]
&\propto \exp\Big( -\frac{n}{2} (\theta^2 - 2\bar{x}\theta) \Big) \cdot \exp\Big( -\frac{\lambda_0}{2} \theta^2 \Big) \\[6pt]
&= \exp\Big( -\frac{1}{2} \big( n\theta^2 - 2n\bar{x}\theta + \lambda_0\theta^2 \big) \Big) \\[6pt]
&= \exp\Big( -\frac{1}{2} \big( (n+\lambda_0)\theta^2 - 2n\bar{x}\theta \big) \Big) \\[6pt]
&= \exp\Big( -\frac{n+\lambda_0}{2} \Big( \theta^2 - \frac{2n\bar{x}}{n+\lambda_0}\theta \Big) \Big) \\[6pt]
&\propto \exp\Big( -\frac{n+\lambda_0}{2} \Big( \theta - \frac{n}{n+\lambda_0} \cdot \bar{x} \Big)^2 \Big) \\[6pt]
&\propto \text{N}\Big( \theta \,\Big|\, \frac{n}{n+\lambda_0} \cdot \bar{x}, \; n+\lambda_0 \Big).
\end{aligned}$$
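Completing the square is the one step above where it is easy to slip up, so here is a short symbolic check (again my own addition, using `sympy`) that the exponents before and after that step differ only by a quantity free of $\theta$:

```python
import sympy as sp

theta, xbar = sp.symbols('theta xbar', real=True)
n, lam0 = sp.symbols('n lambda0', positive=True)

# Exponent before completing the square (without the leading minus sign).
before = sp.Rational(1, 2) * ((n + lam0) * theta**2 - 2 * n * xbar * theta)
# Exponent after completing the square.
after = (n + lam0) / 2 * (theta - n * xbar / (n + lam0))**2

# The difference should be free of theta, i.e. a multiplicative constant
# once exponentiated -- exactly what proportionality lets us absorb.
diff = sp.simplify(before - after)
assert theta not in diff.free_symbols
print(diff)   # -> -n**2*xbar**2/(2*(lambda0 + n))
```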
So, from this working we can see that the posterior distribution is proportional to a normal density. Since the posterior must be a density, this implies that the posterior is that normal density:
$$p(\theta|\mathbf{x}) = \text{N}\Big( \theta \,\Big|\, \frac{n}{n+\lambda_0} \cdot \bar{x}, \; n+\lambda_0 \Big).$$
Hence, we see that a posteriori the parameter $\theta$ is normally distributed with posterior mean and variance given by:
$$\mathbb{E}(\theta|\mathbf{x}) = \frac{n}{n+\lambda_0} \cdot \bar{x}
\qquad \qquad
\mathbb{V}(\theta|\mathbf{x}) = \frac{1}{n+\lambda_0}.$$
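These moments can be checked by brute force; the sketch below (my addition, with simulated data and illustrative parameter values) normalises likelihood times prior on a grid and compares the resulting mean and variance to the analytic formulas:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, lam0 = 50, 2.0                          # illustrative sample size and prior precision
x = rng.normal(1.3, 1.0, size=n)           # simulated data with known unit variance
xbar = x.mean()

# Analytic posterior moments from the conjugate result above.
post_mean = n * xbar / (n + lam0)
post_var = 1.0 / (n + lam0)

# Brute-force check: normalise likelihood * prior on a grid.
grid, dx = np.linspace(-2.0, 4.0, 20001, retstep=True)
log_kernel = (-0.5 * n * (grid**2 - 2.0 * xbar * grid)
              + norm.logpdf(grid, loc=0.0, scale=1.0 / np.sqrt(lam0)))
post = np.exp(log_kernel - log_kernel.max())   # stabilise before exponentiating
post /= post.sum() * dx                        # the norming rule

print(post_mean, (grid * post).sum() * dx)                    # means agree
print(post_var, ((grid - post_mean)**2 * post).sum() * dx)    # variances agree
```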
Now, the posterior distribution we have derived has a constant of integration out the front of it (which we can find easily by looking up the form of the normal distribution). But notice that we did not have to worry about this multiplicative constant: all our working removed (or brought in) multiplicative constants whenever this simplified the mathematics. The same result can be derived while keeping track of the multiplicative constants, but it is a lot messier.