Constructing the Dirichlet distribution from Gamma distributions

Let $X_1,\dots,X_{k+1}$ be mutually independent random variables, each having a Gamma distribution with parameter $\alpha_i$, $i=1,2,\dots,k+1$. Show that $Y_i=\frac{X_i}{X_1+\cdots+X_{k+1}}$, $i=1,\dots,k$, have a joint distribution $\text{Dirichlet}(\alpha_1,\alpha_2,\dots,\alpha_k;\alpha_{k+1})$.

The joint pdf of $(X_1,\dots,X_{k+1})$ is $\frac{e^{-\sum_{i=1}^{k+1}x_i}x_1^{\alpha_1-1}\dots x_{k+1}^{\alpha_{k+1}-1}}{\Gamma(\alpha_1)\Gamma(\alpha_2)\dots \Gamma(\alpha_{k+1})}$. To find the joint pdf of $(Y_1,\dots,Y_{k+1})$ I then need the Jacobian, i.e. $J(\frac{x_1,\dots,x_{k+1}}{y_1,\dots,y_{k+1}})$, which I cannot find.

— Argha

Have a look at pages 13-14 of this document.

@Procrastinator Thank you very much, your document is the best answer to my question.

— Argha,

@Procrastinator - Perhaps you should post this as an answer, since the OP is satisfied with it, and add a couple of sentences so that you don't trigger the "we want more than a one-sentence answer" warning.

— jbowman

That document cannot serve as an answer now, because the link returns a 404.

— whuber

Wayback Machine to the rescue: pdf

— Mobeets

Jacobians (the absolute determinants of the change of variable function) appear formidable and can be complicated. Nevertheless, they are an essential and unavoidable part of calculating a multivariate change of variable. It would seem there is nothing for it but to write down a $k+1$ by $k+1$ matrix of derivatives and do the calculation.

There is a better way. It is shown at the end, in the "Solution" section. Because the purpose of this post is to introduce statisticians to what may be a new method for many, much of it is devoted to explaining the machinery behind the solution. This is the algebra of differential forms. (Differential forms are the things that one integrates in multiple dimensions.) A detailed, worked example is included to help make this more familiar.

Background

Over a century ago, mathematicians developed the theory of differential algebra to work with the "higher order derivatives" that occur in multi-dimensional geometry. The determinant is a special case of the basic objects manipulated by such algebras, which typically are alternating multilinear forms. The beauty of this lies in how simple the calculations can become.

Here's all you need to know.

1. A differential is an expression of the form "$dx_i$". It is the concatenation of "$d$" with any variable name.

2. A one-form is a linear combination of differentials, such as $dx_1+dx_2$ or even $x_2 dx_1 - \exp(x_2) dx_2$. That is, the coefficients are functions of the variables.

3. Forms can be "multiplied" using a wedge product, written $\wedge$. This product is anti-commutative (also called alternating): for any two one-forms $\omega$ and $\eta$,

$\omega \wedge \eta = -\eta \wedge \omega.$

This multiplication is linear and associative; that is, it works in the familiar way. An immediate consequence is that $\omega \wedge \omega = -\omega \wedge \omega$, which implies that the square of any one-form is always zero. That makes multiplication extremely easy!

4. For the purpose of manipulating the integrands that appear in probability calculations, an expression like $dx_1 dx_2 \cdots dx_{k+1}$ can be understood as $|dx_1\wedge dx_2 \wedge \cdots \wedge dx_{k+1}|$.

5. When $y = g(x_1, \ldots, x_n)$ is a function, its differential is given by differentiation:

$dy = dg(x_1, \ldots, x_n) = \frac{\partial g}{\partial x_1}(x_1, \ldots, x_n) dx_1 + \cdots + \frac{\partial g}{\partial x_n}(x_1, \ldots, x_n) dx_n.$

The connection with Jacobians is this: the Jacobian of a transformation $(y_1, \ldots, y_n) = F(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n))$ is, up to sign, simply the coefficient of $dx_1\wedge \dots \wedge dx_n$ that appears in computing

$dy_1 \wedge \cdots \wedge dy_n = df_1(x_1,\ldots, x_n)\wedge \cdots \wedge df_n(x_1, \ldots, x_n)$

after expanding each of the $df_i$ as a linear combination of the $dx_j$ as in rule (5).

Example

The simplicity of this definition of a Jacobian is appealing. Not yet convinced it's worthwhile? Consider the well-known problem of converting two-dimensional integrals from Cartesian coordinates $(x, y)$ to polar coordinates $(r,\theta)$ , where $(x,y) = (r\cos(\theta), r\sin(\theta))$ . The following is an utterly mechanical application of the preceding rules, where " $(*)$ " is used to abbreviate expressions that will obviously disappear by virtue of rule (3), which implies $dr\wedge dr = d\theta\wedge d\theta = 0$ .

$\eqalign{ dx dy &= |dx\wedge dy| = |d(r\cos(\theta)) \wedge d(r\sin(\theta))| \\ &= |(\cos(\theta)dr - r\sin(\theta)d\theta) \wedge (\sin(\theta)dr + r\cos(\theta)d\theta)| \\ &= |(*)dr\wedge dr + (*) d\theta\wedge d\theta - r\sin(\theta)d\theta\wedge \sin(\theta)dr + \cos(\theta)dr \wedge r\cos(\theta) d\theta| \\ &= |0 + 0 + r\sin^2(\theta) dr\wedge d\theta + r\cos^2(\theta) dr\wedge d\theta| \\ &= |r(\sin^2(\theta) + \cos^2(\theta)) dr\wedge d\theta| \\ &= r\ dr d\theta }.$

The point of this is the ease with which such calculations can be performed, without messing about with matrices, determinants, or other such multi-indicial objects. You just multiply things out, remembering that wedges are anti-commutative. It's easier than what is taught in high school algebra.
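For readers who like to see the rules executed, here is a minimal Python sketch of the wedge product, applied to the polar-coordinate example at one sample point. It is an illustration only: the dictionary representation of forms, the function name `wedge`, and the sample values of `r` and `theta` are assumptions of this sketch, not part of the original argument.

```python
import math

def wedge(a, b):
    """Wedge product of two forms stored as {sorted index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            idx = list(ia + ib)
            if len(set(idx)) < len(idx):
                continue  # a repeated differential makes the term vanish
            # Sort the indices, flipping the sign once per adjacent swap
            sign = 1
            for i in range(len(idx)):
                for j in range(len(idx) - 1 - i):
                    if idx[j] > idx[j + 1]:
                        idx[j], idx[j + 1] = idx[j + 1], idx[j]
                        sign = -sign
            key = tuple(idx)
            out[key] = out.get(key, 0.0) + sign * ca * cb
    return out

# Hypothetical sample point; index 0 stands for dr, index 1 for dtheta
r, theta = 2.0, 0.7
dx = {(0,): math.cos(theta), (1,): -r * math.sin(theta)}  # d(r cos(theta))
dy = {(0,): math.sin(theta), (1,): r * math.cos(theta)}   # d(r sin(theta))

area = wedge(dx, dy)            # coefficient of dr ^ dtheta
jacobian = abs(area[(0, 1)])
print(jacobian, r)              # the two values should agree up to rounding
```

Swapping the factors, `wedge(dy, dx)`, flips the sign of the coefficient, which is the anti-commutativity from rule (3).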

Preliminaries

Let's see this differential algebra in action. In this problem, the PDF of the joint distribution of $(X_1, X_2, \ldots, X_{k+1})$ is the product of the individual PDFs (because the $X_i$ are assumed to be independent). In order to handle the change to the variables $Y_i$ we must be explicit about the differential elements that will be integrated. These form the term $dx_1 dx_2 \cdots dx_{k+1}$ . Including the PDF gives the probability element

$\eqalign{ f_\mathbf{X}(\mathbf{x},\mathbf{\alpha})dx_1 \cdots dx_{k+1} &\propto \left(x_1^{\alpha_1-1}\exp\left(-x_1\right)\right)\cdots \left(x_{k+1}^{\alpha_{k+1}-1}\exp\left(-x_{k+1}\right) \right)dx_1 \cdots dx_{k+1} \\ &= x_1^{\alpha_1-1}\cdots x_{k+1}^{\alpha_{k+1}-1}\exp\left(-\left(x_1+\cdots+x_{k+1}\right)\right)dx_1 \cdots dx_{k+1}. }$

(The normalizing constant has been ignored; it will be recovered at the end.)

Staring at the definitions of the $Y_i$ a few seconds ought to reveal the utility of introducing the new variable

$Z = X_1 + X_2 + \cdots + X_{k+1},$

giving the relationships

$X_i = Y_i Z.$

This suggests making the change of variables $x_i \to y_i z$ in the probability element. The intention is to retain the first $k$ variables $y_1, \ldots, y_k$ along with $z$ and then integrate out $z$ . To do so, we have to re-express all the $dx_i$ in terms of the new variables. This is the heart of the problem. It's where the differential algebra takes place. To begin with,

$dx_i = d(y_i z) = y_i dz + z dy_i.$

Note that since $Y_1+Y_2+\cdots+Y_{k+1}=1$ , then

$0 = d(1) = d(y_1 + y_2 + \cdots + y_{k+1}) = dy_1 + dy_2 + \cdots + dy_{k+1}.$

Consider the one-form

$\omega = dx_1 + \cdots + dx_k = z(dy_1 + \cdots + dy_k) + (y_1+\cdots + y_k) dz.$

It appears in the differential of the last variable:

$\eqalign{ dx_{k+1} &= z dy_{k+1} + y_{k+1}dz \\ &= -z(dy_1 + \cdots + dy_k) + (1-y_1-\cdots-y_k)dz \\ &= dz - \omega. }$

The value of this lies in the observation that

$dx_1 \wedge \cdots \wedge dx_k \wedge \omega = 0$

because, when you expand this product, there is one term containing $dx_1 \wedge dx_1 = 0$ as a factor, another containing $dx_2 \wedge dx_2 = 0$ , and so on: they all disappear. Consequently,

$\eqalign{ dx_1 \wedge \cdots \wedge dx_k \wedge dx_{k+1} &= dx_1 \wedge \cdots \wedge dx_k \wedge dz - dx_1 \wedge \cdots \wedge dx_k \wedge \omega \\ &= dx_1 \wedge \cdots \wedge dx_k \wedge dz. }$

Whence (because all products $dz\wedge dz$ disappear),

$\eqalign{ dx_1 \wedge \cdots \wedge dx_{k+1} &= (z dy_1 + y_1 dz) \wedge \cdots \wedge (z dy_k + y_k dz) \wedge dz \\ &= z^k dy_1 \wedge \cdots \wedge dy_k \wedge dz. }$

The Jacobian is simply $|z^k| = z^k$ , the coefficient of the differential product on the right hand side.
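As an independent cross-check, one can fall back on the brute-force matrix approach that the wedge calculation avoids. The sketch below builds the $(k+1)\times(k+1)$ matrix of partial derivatives of $x_i = y_i z$ (for $i \le k$) and $x_{k+1} = z(1-y_1-\cdots-y_k)$ with respect to $(y_1,\ldots,y_k,z)$ and compares the absolute determinant with $z^k$. The helper names `det` and `jacobian_matrix` and the sample values are assumptions made for illustration.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        # Pick the row with the largest pivot in column c
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
    return d

def jacobian_matrix(y, z):
    """Partial derivatives of (x_1..x_{k+1}) with respect to (y_1..y_k, z)."""
    k = len(y)
    rows = []
    for i in range(k):
        # x_i = y_i z: derivative z in column i, y_i in the z column
        rows.append([z if j == i else 0.0 for j in range(k)] + [y[i]])
    # x_{k+1} = z (1 - y_1 - ... - y_k)
    rows.append([-z] * k + [1.0 - sum(y)])
    return rows

# Hypothetical sample point with k = 3
y = [0.1, 0.25, 0.3]
z = 1.7
k = len(y)
jac = abs(det(jacobian_matrix(y, z)))
print(jac, z ** k)  # the two values should agree up to rounding
```

The determinant route gives the same number, but the bookkeeping grows with $k$, whereas the wedge calculation above is the same few lines for every $k$.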

Solution

The transformation $(x_1, \ldots, x_k, x_{k+1})\to (y_1, \ldots, y_k, z)$ is one-to-one: its inverse is given by $x_i = y_i z$ for $1\le i\le k$ and $x_{k+1} = z(1-y_1-\cdots-y_k)$ . Therefore we don't have to fuss any more about the new probability element; it simply is

$\eqalign{ &(z y_1)^{\alpha_1-1}\cdots (z y_k)^{\alpha_k-1}\left(z(1-y_1-\cdots-y_k)\right)^{\alpha_{k+1}-1}\exp\left(-z\right)|z^k dy_1 \wedge \cdots \wedge dy_k \wedge dz| \\ &= \left(z^{\alpha_1+\cdots+\alpha_{k+1}-1}\exp\left(-z\right) dz\right)\left( y_1^{\alpha_1-1} \cdots y_k^{\alpha_k-1}\left(1-y_1-\cdots-y_k\right)^{\alpha_{k+1}-1}dy_1 \cdots dy_k\right). }$

That is manifestly a product of a Gamma $(\alpha_1+\cdots+\alpha_{k+1})$ distribution (for $Z$ ) and a Dirichlet $(\mathbf\alpha)$ distribution (for $(Y_1,\ldots, Y_k)$ ). In fact, the original normalizing constant was the reciprocal of the product of the $\Gamma(\alpha_i)$ , and the Gamma factor for $Z$ accounts for a normalizing constant of $1/\Gamma(\alpha_1+\cdots+\alpha_{k+1})$ . We deduce immediately that the constant left for the $Y$ factor is the original one divided by this, that is, multiplied by $\Gamma(\alpha_1+\cdots+\alpha_{k+1})$ , enabling the PDF to be written

$f_\mathbf{Y}(\mathbf{y},\mathbf{\alpha}) = \frac{\Gamma(\alpha_1+\cdots+\alpha_{k+1})}{\Gamma(\alpha_1)\cdots\Gamma(\alpha_{k+1})}\left( y_1^{\alpha_1-1} \cdots y_k^{\alpha_k-1}\left(1-y_1-\cdots-y_k\right)^{\alpha_{k+1}-1}\right).$
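The result can also be sanity-checked by simulation. The following sketch (not part of the derivation) draws independent Gamma variates with unit scale using Python's standard `random.gammavariate`, normalizes them by their sum, and compares the sample moments of the $Y_i$ with the standard Dirichlet mean $\alpha_i/\alpha_0$ and variance $\alpha_i(\alpha_0-\alpha_i)/(\alpha_0^2(\alpha_0+1))$, where $\alpha_0=\alpha_1+\cdots+\alpha_{k+1}$. The parameter values, seed, sample size, and the helper name `sample_y` are arbitrary choices for illustration.

```python
import random

def sample_y(alphas, rng):
    """Draw X_i ~ Gamma(alpha_i, scale 1) and return (Y_1..Y_k, Z)."""
    xs = [rng.gammavariate(a, 1.0) for a in alphas]
    z = sum(xs)
    return [x / z for x in xs[:-1]], z

rng = random.Random(12345)   # fixed seed so the run is repeatable
alphas = [2.0, 3.0, 5.0]     # hypothetical parameters, k = 2
a0 = sum(alphas)
n = 50_000
k = len(alphas) - 1

sum_y = [0.0] * k
sum_y2 = [0.0] * k
sum_z = 0.0
for _ in range(n):
    ys, z = sample_y(alphas, rng)
    sum_z += z
    for i, yi in enumerate(ys):
        sum_y[i] += yi
        sum_y2[i] += yi * yi

mean_y = [s / n for s in sum_y]
var_y = [s2 / n - m * m for s2, m in zip(sum_y2, mean_y)]
mean_z = sum_z / n

# Theoretical Dirichlet moments for comparison
exp_mean = [a / a0 for a in alphas[:k]]
exp_var = [a * (a0 - a) / (a0 ** 2 * (a0 + 1)) for a in alphas[:k]]

print("sample means:", mean_y, "expected:", exp_mean)
print("sample vars: ", var_y, "expected:", exp_var)
print("mean of Z:   ", mean_z, "expected:", a0)
```

Agreement of the first two moments does not prove the distributional claim, of course; it only guards against gross errors such as a wrong exponent or a misplaced parameter.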

— whuber