Steps for figuring out a posterior distribution when it might be simple enough to have an analytic form?

This was also asked on Computational Science.

I am trying to compute a Bayesian estimate of some coefficients for an autoregression with 11 data samples:

$Y_{i} = \mu + \alpha\cdot{}Y_{i-1} + \epsilon_{i}$

where $\epsilon_{i}$ is Gaussian with mean 0 and variance $\sigma_{e}^{2}$. The prior distribution on the vector $(\mu, \alpha)^{t}$ is Gaussian with mean $(0,0)$ and a diagonal covariance matrix with diagonal entries equal to $\sigma_{p}^{2}$.

Based on the autoregression formula, this means that each data point $Y_{i}$ is normally distributed with mean $\mu + \alpha\cdot{}Y_{i-1}$ and variance $\sigma_{e}^{2}$. Thus, the joint density of all the data points $(Y)$ (assuming independence, which is fine for the program I am writing) would be:

$p(Y \mid (\mu, \alpha)^{t}) = \prod_{i=2}^{11}\frac{1}{\sqrt{2\pi\sigma_{e}^{2}}}\exp{\frac{-(Y_{i} - \mu - \alpha\cdot{}Y_{i-1})^{2}}{2\sigma_{e}^{2}}}.$

By Bayes' theorem, we can take the product of the above density with the prior density, and then we only need the normalizing constant. My hunch is that this should work out to be a Gaussian distribution, so we can worry about the normalizing constant at the end rather than computing it explicitly with integrals over $\mu$ and $\alpha$.

This is the part I'm having trouble with. How do I compute the product of the prior density (which is multivariate) and this product of univariate data densities? The posterior needs to be purely a density of $\mu$ and $\alpha$, but I can't see how you get that out of such a product.

Any pointers are really helpful, even if you just point me in the right direction and I then need to go do the messy algebra myself (which is what I've already tried several times).

As a starting point, here is the form of the numerator from Bayes' rule:

$\frac{1}{(2\pi\sigma_{e}^{2})^{5}\cdot{}2\pi\sigma_{p}^{2}} \exp{\biggl [ -\frac{1}{2\sigma_{e}^{2}}\sum_{i=2}^{11}(Y_{i} - \mu - \alpha\cdot{}Y_{i-1})^{2} - \frac{\mu^{2}}{2\sigma_{p}^{2}} - \frac{\alpha^{2}}{2\sigma_{p}^{2}} \biggr ] }.$

The issue is how to see that this reduces to a Gaussian density in $(\mu, \alpha)^{t}$.
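For checking the algebra numerically, the logarithm of the Bayes-rule numerator (likelihood times prior) can be evaluated directly. The following is only a minimal sketch under the assumptions stated in the question (known variances, zero-mean prior); the function name `log_unnormalized_posterior` and its argument names are hypothetical.

```python
import math


def log_unnormalized_posterior(mu, alpha, y, sigma_e2, sigma_p2):
    """Log of likelihood times prior for Y_i = mu + alpha * Y_{i-1} + eps_i.

    y        : list of observations Y_1..Y_n (n = 11 in the question)
    sigma_e2 : known noise variance sigma_e^2
    sigma_p2 : known prior variance sigma_p^2 (prior mean is (0, 0))
    """
    n_terms = len(y) - 1  # one likelihood factor per i = 2..n
    resid_ss = sum((y[i] - mu - alpha * y[i - 1]) ** 2 for i in range(1, len(y)))
    log_lik = -0.5 * n_terms * math.log(2 * math.pi * sigma_e2) - resid_ss / (2 * sigma_e2)
    # Bivariate Gaussian prior with covariance sigma_p2 * I
    log_prior = -math.log(2 * math.pi * sigma_p2) - (mu ** 2 + alpha ** 2) / (2 * sigma_p2)
    return log_lik + log_prior
```

Because only differences of this quantity between two parameter values matter for the shape of the posterior, the constant prefactor could also be dropped.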

Added

Ultimately, this boils down to the following general problem. If you have a quadratic expression such as

$A\mu^{2} + B\mu\alpha + C\alpha^{2} + J\mu + K\alpha + L$

how do you put it into a quadratic form

$(\mu-\hat{\mu},\alpha-\hat{\alpha})Q(\mu-\hat{\mu},\alpha-\hat{\alpha})^{t}$

for a 2x2 matrix $Q$? It's easy enough in simple cases, but what process do you use to obtain the mean estimates $\hat{\mu}$ and $\hat{\alpha}$?

Note that I've tried the straightforward option of expanding the matrix formula and then trying to equate coefficients as above. The problem, in my case, is that the constant $L$ is zero, and then I end up with three equations in two unknowns, so it is underdetermined to just match coefficients (even if I assume a symmetric quadratic form matrix).

bayesian mathematical-statistics posterior

— ely

My answer to [this question](stats.stackexchange.com/questions/22852/…) may be helpful. Note that you will need a prior for your first observation; the iterations stop there.

— probabilityislogic

I don't see why I need it in this case. I am supposed to treat the time steps as if they were conditionally independent given the observation. Note that the product in the joint density runs only over $i=2..11$. I don't think I am supposed to get a sequentially updated formula here, just one single formula for the posterior $p((\mu,\alpha)^{t} \mid Y)$.

— ely

The "multivariate" in the prior $p(\alpha,\mu)$ does not conflict with the "univariate" in the data densities, because those are densities in the $y_i$.

— Xi'an

My answer to the previous question was meant to show how I integrated out the parameters, because you will use exactly the same integrals here. Your question assumes that the variance parameters are known, so they are constants. You only need to look at how the numerator depends on $\alpha,\mu$. To see this, note that we can write:

$p(\mu,\alpha|Y)=\frac{p(\mu,\alpha)p(Y|\mu,\alpha)}{\int\int p(\mu,\alpha)p(Y|\mu,\alpha)d\mu d\alpha}$

$=\frac{\frac{1}{(2\pi\sigma_{e}^{2})^{5}\cdot{}2\pi\sigma_{p}^{2}} \exp{\biggl [ -\frac{1}{2\sigma_{e}^{2}}\sum_{i=2}^{11}(Y_{i} - \mu - \alpha\cdot{}Y_{i-1})^{2} - \frac{\mu^{2}}{2\sigma_{p}^{2}} - \frac{\alpha^{2}}{2\sigma_{p}^{2}} \biggr ] }}{\int\int \frac{1}{(2\pi\sigma_{e}^{2})^{5}\cdot{}2\pi\sigma_{p}^{2}} \exp{\biggl [ -\frac{1}{2\sigma_{e}^{2}}\sum_{i=2}^{11}(Y_{i} - \mu - \alpha\cdot{}Y_{i-1})^{2} - \frac{\mu^{2}}{2\sigma_{p}^{2}} - \frac{\alpha^{2}}{2\sigma_{p}^{2}} \biggr ] }d\mu d\alpha}$

Notice how we can pull the first factor $\frac{1}{(2\pi\sigma_{e}^{2})^{5}\cdot{}2\pi\sigma_{p}^{2}}$ out of the double integral in the denominator, where it cancels with the numerator. We can also pull out the sum of squares $\exp{\biggl [ -\frac{1}{2\sigma_{e}^{2}}\sum_{i=2}^{11}Y_{i}^{2} \biggr ]}$, and it cancels as well. The expression we are left with is now (after expanding the quadratic term):

$=\frac{\exp{\biggl [ -\frac{10\mu^2+\alpha^2\sum_{i=1}^{10}Y_{i}^{2}-2\mu\sum_{i=2}^{11}Y_i-2\alpha\sum_{i=2}^{11}Y_{i}Y_{i-1}+2\mu\alpha\sum_{i=1}^{10}Y_i}{2\sigma_{e}^{2}} - \frac{\mu^{2}}{2\sigma_{p}^{2}} - \frac{\alpha^{2}}{2\sigma_{p}^{2}} \biggr ] }}{\int\int \exp{\biggl [ -\frac{10\mu^2+\alpha^2\sum_{i=1}^{10}Y_{i}^{2}-2\mu\sum_{i=2}^{11}Y_i-2\alpha\sum_{i=2}^{11}Y_{i}Y_{i-1}+2\mu\alpha\sum_{i=1}^{10}Y_i}{2\sigma_{e}^{2}} - \frac{\mu^{2}}{2\sigma_{p}^{2}} - \frac{\alpha^{2}}{2\sigma_{p}^{2}} \biggr ] }d\mu d\alpha}$

Now we can use a general result from the normal pdf (valid for $a>0$, integrating over the whole real line):

$\int \exp\left(-az^2+bz-c\right)dz=\sqrt{\frac{\pi}{a}}\exp\left(\frac{b^2}{4a}-c\right)$

This follows from completing the square on $-az^2+bz$ and noting that $c$ does not depend on $z$. Note that the inner integral over $\mu$ is of this form with

$a=\frac{10}{2\sigma^2_e}+\frac{1}{2\sigma^2_p}$,

$b=\frac{\sum_{i=2}^{11}Y_i-\alpha\sum_{i=1}^{10}Y_i}{\sigma_{e}^{2}}$, and

$c=\frac{\alpha^2\sum_{i=1}^{10}Y_{i}^{2}-2\alpha\sum_{i=2}^{11}Y_{i}Y_{i-1}}{2\sigma_{e}^{2}}+ \frac{\alpha^{2}}{2\sigma_{p}^{2}}$.

After doing this integral, you will find that the remaining integral over $\alpha$ is also of this form, so you can use this formula again, with a different $a,b,c$. Then you should be able to write your posterior in the form

$\frac{1}{2\pi|V|^{\frac{1}{2}}}\exp\left[-\frac{1}{2}(\mu-\hat{\mu},\alpha-\hat{\alpha})V^{-1}(\mu-\hat{\mu},\alpha-\hat{\alpha})^T\right]$

where $V$ is a $2\times 2$ matrix.
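The Gaussian integral identity used here is easy to sanity-check numerically. The sketch below compares the closed form against a plain midpoint-rule sum; the function names, the integration window, and the step count are illustrative assumptions, not part of the derivation.

```python
import math


def gaussian_integral_closed_form(a, b, c):
    """sqrt(pi / a) * exp(b^2 / (4a) - c), valid for a > 0."""
    return math.sqrt(math.pi / a) * math.exp(b * b / (4 * a) - c)


def gaussian_integral_numeric(a, b, c, half_width=20.0, steps=20_000):
    """Midpoint-rule approximation of the integral of exp(-a z^2 + b z - c).

    The window is centred on the peak z = b / (2a); half_width is assumed
    large enough that the tails outside it are negligible.
    """
    centre = b / (2 * a)
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps):
        z = centre - half_width + (k + 0.5) * h
        total += math.exp(-a * z * z + b * z - c)
    return total * h
```

Calling both functions with the same `a`, `b`, `c` (for example `1.3, 0.7, 0.2`) should give values that agree to many digits.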

Let me know if you need more clues.

update

(note: correct formula, should be $10\mu^2$ instead of $\mu^2$ )

If we look at the quadratic form you've written in the update, we notice there are $5$ coefficients ($L$ is irrelevant for the posterior, as we can always add any constant, which will cancel in the denominator). We also have $5$ unknowns $\hat{\mu},\hat{\alpha},Q_{11},Q_{12}=Q_{21},Q_{22}$. Hence this is a "well posed" problem so long as the equations are linearly independent. If we expand the quadratic $(\mu-\hat{\mu},\alpha-\hat{\alpha})Q(\mu-\hat{\mu},\alpha-\hat{\alpha})^{t}$ we get:

$Q_{11}(\mu-\hat{\mu})^2+Q_{22}(\alpha-\hat{\alpha})^2+2Q_{12}(\mu-\hat{\mu})(\alpha-\hat{\alpha})$

$=Q_{11}\mu^{2} + 2Q_{12}\mu\alpha + Q_{22}\alpha^{2} - (2Q_{11}\hat{\mu}+2Q_{12}\hat{\alpha})\mu - (2Q_{22}\hat{\alpha}+2Q_{12}\hat{\mu})\alpha$

$+Q_{11}\hat{\mu}^2+Q_{22}\hat{\alpha}^2+2Q_{12}\hat{\mu}\hat{\alpha}$

Comparing second-order coefficients, we get $A=Q_{11},B=2Q_{12},C=Q_{22}$, which tells us what the (inverse) covariance matrix looks like. We also get two slightly more complicated equations for $\hat{\alpha},\hat{\mu}$ after substituting for $Q$. These can be written in matrix form as:

$-\begin{pmatrix}2A & B \\ B & 2C\end{pmatrix} \begin{pmatrix}\hat{\mu} \\ \hat{\alpha}\end{pmatrix} = \begin{pmatrix}J \\ K\end{pmatrix}$

Thus the estimates are given by:

$\begin{pmatrix}\hat{\mu} \\ \hat{\alpha}\end{pmatrix} = -\begin{pmatrix}2A & B \\ B & 2C\end{pmatrix}^{-1}\begin{pmatrix}J \\ K\end{pmatrix}=\frac{1}{4AC-B^2}\begin{pmatrix}BK-2JC \\ BJ-2KA\end{pmatrix}$

Showing that we do not have unique estimates unless $4AC\neq B^2$ . Now we have:

$\begin{array}{ccc} A=\frac{10}{2\sigma^2_e}+\frac{1}{2\sigma^2_p} & B=\frac{\sum_{i=1}^{10}Y_i}{\sigma_{e}^{2}} & C=\frac{\sum_{i=1}^{10}Y_{i}^{2}}{2\sigma^2_e}+\frac{1}{2\sigma^2_p} \\ J=-\frac{\sum_{i=2}^{11}Y_i}{\sigma_{e}^{2}} & K=-\frac{\sum_{i=2}^{11}Y_{i}Y_{i-1}}{\sigma_{e}^{2}} & \end{array}$
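The coefficient matching can be written as a small routine. This is a sketch only: the name `complete_square` is hypothetical, and the positive-definiteness check is an added assumption (it is what makes the result a proper density, which is slightly stronger than $4AC\neq B^2$). Since the exponent of the posterior is minus the quadratic, comparing with $-\frac{1}{2}(\cdot)V^{-1}(\cdot)^T$ gives $V^{-1}=2Q$, which is how the covariance is obtained below.

```python
def complete_square(A, B, C, J, K):
    """Centre and covariance of the Gaussian proportional to
    exp(-(A mu^2 + B mu alpha + C alpha^2 + J mu + K alpha)).

    Returns ((mu_hat, alpha_hat), V) with V a 2x2 tuple of tuples.
    """
    det = 4 * A * C - B * B
    if A <= 0 or det <= 0:
        # Added check: Q must be positive definite for a proper density.
        raise ValueError("quadratic form is not positive definite")
    mu_hat = (B * K - 2 * J * C) / det
    alpha_hat = (B * J - 2 * K * A) / det
    # V = (2Q)^{-1}, where 2Q = [[2A, B], [B, 2C]]
    V = ((2 * C / det, -B / det), (-B / det, 2 * A / det))
    return (mu_hat, alpha_hat), V
```

A quick check on any output is that the gradient of the quadratic vanishes at the centre, i.e. $2A\hat{\mu}+B\hat{\alpha}+J=0$ and $B\hat{\mu}+2C\hat{\alpha}+K=0$.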

Note that if we define $X_i=Y_{i-1}$ for $i=2,\dots,11$ and take the limit $\sigma^2_p\to\infty$ then the estimates for $\mu,\alpha$ are given by the usual least squares estimate $\hat{\alpha}=\frac{\sum_{i=2}^{11}(Y_{i}-\overline{Y})(X_{i}-\overline{X})}{\sum_{i=2}^{11}(X_{i}-\overline{X})^2}$ and $\hat{\mu}=\overline{Y}-\hat{\alpha}\overline{X}$ where $\overline{Y}=\frac{1}{10}\sum_{i=2}^{11}Y_i$ and $\overline{X}=\frac{1}{10}\sum_{i=2}^{11}X_i=\frac{1}{10}\sum_{i=1}^{10}Y_i$ . So the posterior estimates are a weighted average between the OLS estimates and the prior estimate $(0,0)$ .
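The OLS limit can be checked numerically. The sketch below rewrites the matrix equation for $(\hat{\mu},\hat{\alpha})$ after multiplying through by $\sigma_e^2$, which gives ridge-type normal equations with penalty $\lambda=\sigma_e^2/\sigma_p^2$; that rescaling, the function names, and any data passed in are assumptions made for illustration.

```python
def ols_estimates(y):
    """Least squares fit of Y_i on X_i = Y_{i-1}; returns (mu_hat, alpha_hat)."""
    x, t = y[:-1], y[1:]
    n = len(t)
    x_bar, t_bar = sum(x) / n, sum(t) / n
    sxy = sum((ti - t_bar) * (xi - x_bar) for xi, ti in zip(x, t))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    alpha_hat = sxy / sxx
    return t_bar - alpha_hat * x_bar, alpha_hat


def posterior_mean(y, sigma_e2, sigma_p2):
    """Posterior mean of (mu, alpha) under the zero-mean Gaussian prior."""
    x, t = y[:-1], y[1:]
    n = len(t)
    lam = sigma_e2 / sigma_p2  # prior strength relative to the noise
    # Normal equations (X'X + lam * I) theta = X'Y for the design [1, x]
    m11, m12, m22 = n + lam, sum(x), sum(xi * xi for xi in x) + lam
    r1, r2 = sum(t), sum(xi * ti for xi, ti in zip(x, t))
    det = m11 * m22 - m12 * m12
    return (m22 * r1 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det
```

With a very large `sigma_p2` the two functions should return nearly identical values, and with a small `sigma_p2` the posterior mean is pulled toward the prior mean $(0,0)$.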

— probabilityislogic

This isn't particularly helpful because I mentioned specifically that it's not the denominator that matters here. The denominator is just a normalizing constant, which will be obvious once you reduce the numerator to a Gaussian form. So tricks for evaluating the integrals in the denominator are mathematically really cool, but just not needed for my application. The only issue I need resolution with is manipulating the numerator.

— ely

This answer gives you both numerator and denominator. The numerator exhibits the proper second-degree polynomial in $(\alpha,\mu)$ that leads to the normal quadratic form, as stressed by probabilityislogic.

— Xi'an

@ems - by calculating the normalising constant you will construct the quadratic form required. It will contain the terms needed to complete the square.

— probabilityislogic

I don't understand how this gives you the quadratic form. I've worked out the two integrals in the denominator using the Gaussian integral identity that you posted. In the end, I just get a huge, messy constant. There doesn't seem to be any clear way to take that constant and turn it into something times a determinant to the 1/2 power, etc. Not to mention I don't see how any of this explains how to calculate the new 'mean vector' $(\hat{\mu},\hat{\alpha})^{t}$. This is what I was asking for help with in the original question.

— ely

Thanks tremendously for the detailed addition. I was making some silly errors when trying to do the algebra to figure out the quadratic form. Your comments about the relation to the OLS estimator are highly interesting and appreciated as well. I think this will speed up my code because I'll be able to draw samples from an analytic form that has built-in, optimized methods. My original plan was to use Metropolis-Hastings to sample from this, but it was very slow. Thanks!

— ely
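Drawing samples from the analytic posterior, as mentioned in the last comment, needs only a mean vector and a $2\times 2$ covariance matrix. The following is a rough sketch using a hand-written Cholesky factor and the standard library; the function name `sample_bivariate_normal` and the way the mean and covariance are passed in are assumptions, and an optimized library routine would normally be preferred.

```python
import math
import random


def sample_bivariate_normal(mean, cov, n, rng):
    """Draw n samples from N(mean, cov) for a 2x2 positive definite cov.

    mean : (m1, m2)
    cov  : ((v11, v12), (v12, v22))
    rng  : a random.Random instance
    """
    (v11, v12), (_, v22) = cov
    # Lower-triangular Cholesky factor L with L L^T = cov
    l11 = math.sqrt(v11)
    l21 = v12 / l11
    l22 = math.sqrt(v22 - l21 * l21)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return draws
```

Each draw is an independent sample, so unlike Metropolis-Hastings there is no burn-in or autocorrelation to account for.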