If $X_1,\cdots,X_n \sim \mathcal{N}(\mu, 1)$ are IID, then compute $\mathbb{E}\left( X_1 \mid T \right)$, where $T = \sum_i X_i$

14

Question

If $X_1,\cdots,X_n \sim \mathcal{N}(\mu, 1)$ are IID, then compute $\mathbb{E}\left( X_1 \mid T \right)$, where $T = \sum_i X_i$.

Attempt: Please check whether the following is correct.

Let us say we take the sum of those conditional expectations, so that

$\sum_i \mathbb{E}\left( X_i \mid T \right) = \mathbb{E}\left( \sum_i X_i \mid T \right) = T .$

It means that each $\mathbb{E}\left( X_i \mid T \right) = \frac{T}{n}$ since $X_1,\ldots,X_n$ are IID.

Thus, $\mathbb{E}\left( X_1 \mid T \right) = \frac{T}{n}$. Is it correct?

— Lernen

2

The $X_i$'s are not iid conditional on $T$, but they do have an exchangeable joint distribution. This implies that their conditional expectations are all equal (to $T/n$).

— Jarle Tufto

@JarleTufto: What do you mean by "exchangeable joint distribution"? The joint distribution of $X_i$ and $T$?

— Lernen

2

It means that the joint distribution of $X_1,X_2,X_3$ is the same as that of $X_2,X_3,X_1$ (and all other permutations). See en.wikipedia.org/wiki/Exchangeable_random_variables . Or see @whuber's answer!

— Jarle Tufto

2

In particular, the answer does not depend on the distribution of $X_1,\ldots,X_n$.

— StubbornAtom

11

The idea is right, but there is a question of expressing it a little more rigorously. I will therefore focus on notation and on exposing the essence of the idea.

Let's begin with the idea of exchangeability:

A random variable $\mathbf X=(X_1, X_2, \ldots, X_n)$ is exchangeable when the distributions of the permuted variables $\mathbf{X}^\sigma=(X_{\sigma(1)}, X_{\sigma(2)}, \ldots, X_{\sigma(n)})$ are all the same for every possible permutation $\sigma$.

Clearly iid implies exchangeable.

As a matter of notation, write $X^\sigma_i = X_{\sigma(i)}$ for the $i^\text{th}$ component of $\mathbf{X}^\sigma$ and let

$T^\sigma = \sum_{i=1}^n X^\sigma_i = \sum_{i=1}^n X_i = T.$

Let $j$ be any index and let $\sigma$ be any permutation of the indices that sends $1$ to $j = \sigma(1).$ (Such a $\sigma$ exists because one can always just swap $1$ and $j.$) Exchangeability of $\mathbf X$ implies

$E[X_1\mid T] = E[X^\sigma_1\mid T^\sigma] = E[X_j\mid T],$

because (in the first equality) we have merely replaced $\mathbf X$ by the identically distributed vector $\mathbf X^\sigma.$ This is the crux of the matter.

Consequently

$T = E[T \mid T] = E[\sum_{i=1}^n X_i\mid T] = \sum_{i=1}^n E[X_i\mid T] = \sum_{i=1}^n E[X_1\mid T] = n E[X_1 \mid T],$

whence

$E[X_1\mid T] = \frac{1}{n} T.$
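As a sanity check on this result, here is a minimal simulation sketch. Since $E[X_j \mid T] = T/n$ is linear in $T$, a least-squares fit of each $X_j$ on $T$ should give a slope near $1/n$ and an intercept near $0$. The values of `n`, `mu`, the number of draws, and the function name `regress_on_total` are assumptions chosen for illustration only.

```python
import numpy as np

def regress_on_total(n=5, mu=2.0, n_draws=200_000, seed=0):
    """Least-squares fit of each X_j on T; returns (slopes, intercepts)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=mu, scale=1.0, size=(n_draws, n))  # iid N(mu, 1) draws
    t = x.sum(axis=1)                                      # T = sum_i X_i
    design = np.column_stack([t, np.ones_like(t)])         # columns: T, constant
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)      # coef has shape (2, n)
    return coef[0], coef[1]

slopes, intercepts = regress_on_total()
print(slopes)      # each entry should be close to 1/n = 0.2
print(intercepts)  # each entry should be close to 0
```

The slopes come out (approximately) equal across all $j$, which is the exchangeability step, and they sum to $1$ because the $X_j$ sum to $T$.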

— whuber

4

$\newcommand{\one}{\mathbf 1}$ This isn't a proof (and +1 to @whuber's answer), but it's a geometric way to build some intuition as to why $E(X_1 | T) = T/n$ is a sensible answer.

Let $X = (X_1,\dots,X_n)^T$ and $\one = (1,\dots,1)^T$ so $T = \one^TX$. We're then conditioning on the event that $\one^TX = t$ for some $t \in \mathbb R$, so this is like drawing multivariate Gaussians supported on $\mathbb R^n$ but only looking at the ones that end up in the affine space $\{x \in \mathbb R^n : \one^Tx = t\}$. Then we want to know the average of the $x_1$ coordinates of the points that land in this affine space (never mind that it's a measure-zero subset).

We know $X \sim \mathcal N(\mu \one, I)$, so we've got a spherical Gaussian with a constant mean vector, and the mean vector $\mu\one$ is on the same line as the normal vector of the hyperplane $x^T\one = 0$.

This gives us a situation like the following picture:

The key idea: first imagine the density over the affine subspace $H_t := \{x : x^T\one = t\}$. The density of $X$ is symmetric about $x_1 = x_2$ since $E(X) \in \text{span } \one$. The density will also be symmetric on $H_t$, as $H_t$ is also symmetric over the same line, and the point around which it is symmetric is the intersection of the lines $x_1 + x_2 = t$ and $x_1 = x_2$. This happens for $x = (t/2, t/2)$.

To picture $E(X_1 | T)$ we can imagine sampling over and over, and then whenever we get a point in $H_t$ we take just the $x_1$ coordinate and save that. From the symmetry of the density on $H_t$ the distribution of the $x_1$ coordinates will also be symmetric, and it'll have the same center point of $t/2$. The mean of a symmetric distribution is the central point of symmetry so this means $E(X_1 | T) = T/2$, and that $E(X_1| T) = E(X_2 | T)$ since $X_1$ and $X_2$ can be exchanged without affecting anything.

In higher dimensions this gets hard (or impossible) to exactly visualize, but the same idea applies: we've got a spherical Gaussian with a mean in the span of $\one$ , and we're looking at an affine subspace that's perpendicular to that. The balance point of the distribution on the subspace will still be the intersection of $\text{span }\one$ and $\{x : x^T\one = t\}$ which is at $x=(t/n, \dots, t/n)$ , and the density is still symmetric so this balance point is again the mean.
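For the Gaussian case, the balance point can be cross-checked against the standard formula for conditioning a multivariate normal on a linear function, $E[X \mid a^TX = t] = m + \Sigma a\,(a^T\Sigma a)^{-1}(t - a^Tm)$. The sketch below evaluates it with $a = \mathbf 1$; the helper name `conditional_mean_given_total` and the numeric values are assumptions for illustration.

```python
import numpy as np

def conditional_mean_given_total(mean, cov, t):
    """E[X | 1^T X = t] for X ~ N(mean, cov), via the Gaussian conditioning formula."""
    ones = np.ones(len(mean))
    cov_x_t = cov @ ones          # Cov(X, T), where T = 1^T X
    var_t = ones @ cov @ ones     # Var(T)
    return mean + cov_x_t * (t - ones @ mean) / var_t

n, mu, t = 4, 1.5, 10.0           # illustrative values, not from the original post
cond_mean = conditional_mean_given_total(np.full(n, mu), np.eye(n), t)
print(cond_mean)                  # every coordinate equals t / n = 2.5
```

With a spherical covariance and a mean on $\text{span }\mathbf 1$, the $\mu$ terms cancel and every coordinate reduces to $t/n$, matching the intersection point described above.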

Again, that's not a proof, but I think it gives a decent idea of why you'd expect this behavior in the first place.

Beyond this, as some such as @StubbornAtom have noted, this doesn't actually require $X$ to be Gaussian. In 2-D, note that if $X$ is exchangeable then $f(x_1, x_2) = f(x_2, x_1)$ (more generally, $f(x) = f(x^\sigma)$ ) so $f$ must be symmetric over the line $x_1 = x_2$ . We also have $E(X) \in \text{span }\one$ so everything I said regarding the "key idea" in the first picture still exactly holds. Here's an example where the $X_i$ are iid from a Gaussian mixture model. All the lines have the same meaning as before.
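A rough way to see the non-Gaussian claim numerically is to bin draws by $T$ and compare the average of $X_1$ in each bin with the average of $T/n$ in the same bin. The sketch below does this for an iid two-component Gaussian mixture. The mixture weights, means, and scales are assumptions made up for this illustration (they are not the ones behind the original figure), as is the function name.

```python
import numpy as np

def binned_conditional_mean(n=3, n_draws=400_000, n_bins=8, seed=1):
    """Per-bin averages of X_1 and of T/n, with bins taken as quantiles of T."""
    rng = np.random.default_rng(seed)
    # Assumed mixture: N(-2, 1) with probability 0.3, otherwise N(1, 0.5^2).
    pick = rng.random((n_draws, n)) < 0.3
    x = np.where(pick,
                 rng.normal(-2.0, 1.0, (n_draws, n)),
                 rng.normal(1.0, 0.5, (n_draws, n)))
    t = x.sum(axis=1)
    edges = np.quantile(t, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, n_bins - 1)
    mean_x1 = np.array([x[idx == b, 0].mean() for b in range(n_bins)])
    mean_t_over_n = np.array([t[idx == b].mean() / n for b in range(n_bins)])
    return mean_x1, mean_t_over_n

mean_x1, mean_t_over_n = binned_conditional_mean()
for a, b in zip(mean_x1, mean_t_over_n):
    print(f"{a: .3f}  {b: .3f}")   # the two columns should nearly agree
```

Binning is only a stand-in for conditioning on the measure-zero event $T = t$, so agreement is up to Monte Carlo noise.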

— jld

1

I think your answer is right, although I'm not entirely sure about the killer line in your proof, about it being true "because they are i.i.d". A more wordy way to the same solution is as follows:

Think about what $\mathbb{E}(x_{i}|T)$ actually means. You know that you have a sample with N readings and that their sum is T. What this actually means is that the underlying distribution they were sampled from no longer matters (you'll notice you at no point used the fact that it was sampled from a Gaussian in your proof).

$\mathbb{E}(x_{i}|T)$ is the answer to the question: if you sampled from your sample, with replacement, many times, what would be the average you obtained? This is the sum over all the possible values, multiplied by their probability, or $\sum_{i=1}^{N}\frac{1}{N}x_{i}$, which equals $T/N$.
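The resampling description can be tried out directly. This is only a sketch of the arithmetic, not a proof: it fixes one sample (the size, seed, and distribution are arbitrary assumptions), draws from it with replacement many times, and compares the average of those draws with the sum of the readings divided by N.

```python
import numpy as np

rng = np.random.default_rng(2)
sample = rng.normal(loc=3.0, scale=1.0, size=10)   # a fixed sample of N = 10 readings
total = sample.sum()                                # T, the sum of the readings

# Draw from the fixed sample with replacement, many times, and average the draws.
draws = rng.choice(sample, size=500_000, replace=True)
print(draws.mean(), total / sample.size)            # the two numbers should be close
```

Each draw takes each reading with probability $1/N$, so the long-run average is the sample mean whatever distribution produced the readings.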

— gazza89

1

Note that the $x_i|T$ can't be i.i.d., as they are constrained to sum to $T$. If you know $n-1$ of them, you know the $n^{th}$ one too.
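The dependence can be made visible in a small simulation sketch. It looks at the deviations $X_i - T/n$, which always sum to zero, and estimates the correlation between two of them; for iid Gaussians these deviations are independent of $T$, and the correlation works out to $-1/(n-1)$. The choice of `n`, the seed, and the number of draws are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
x = rng.normal(loc=0.0, scale=1.0, size=(200_000, n))
resid = x - x.mean(axis=1, keepdims=True)      # deviations X_i - T/n

print(np.abs(resid.sum(axis=1)).max())          # essentially 0: deviations sum to zero
corr = np.corrcoef(resid[:, 0], resid[:, 1])[0, 1]
print(corr, -1 / (n - 1))                       # the two numbers should be close
```

A nonzero correlation is enough to rule out independence given $T$, while the equal conditional means are unaffected.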

— jbowman

yes, but I did something more subtle, I said if you sampled multiple times with replacement, each sample would be an i.i.d sample from a discrete distribution.

— gazza89

Sorry! Misplaced the comment, it should have been to the OP. It was meant in reference to the statement "It means that each $\mathbb{E}\left( X_i \mid T \right) = \frac{T}{n}$ since $X_1,\ldots,X_n$ are IID."

— jbowman