Yet another central limit theorem question



Let $\{X_n : n \ge 1\}$ be a sequence of independent Bernoulli random variables with

$$P\{X_k = 1\} = 1 - P\{X_k = 0\} = \frac{1}{k}.$$

Set

$$S_n = \sum_{k=1}^n \left(X_k - \frac{1}{k}\right), \qquad B_n^2 = \sum_{k=1}^n \frac{k-1}{k^2}.$$

Show that $S_n/B_n$ converges in distribution to the standard normal variable $Z$ as $n$ tends to infinity.

My attempt is to use the Lyapunov CLT, so we need to show that there is a $\delta > 0$ such that

$$\lim_{n\to\infty} \frac{1}{B_n^{2+\delta}} \sum_{k=1}^n E\left[\left|X_k - \tfrac{1}{k}\right|^{2+\delta}\right] = 0.$$

So take $\delta = 1$ and compute

$$\sum_{k=1}^n E\left|X_k - k^{-1}\right|^3 = \sum_{k=1}^n \left(\frac{1}{k} - \frac{3}{k^2} + \frac{4}{k^3} - \frac{2}{k^4}\right)$$

and

$$B_n^3 = \left(\sum_{k=1}^n \left(\frac{1}{k} - \frac{1}{k^2}\right)\right)^{3/2}.$$

Evaluating these numerically for large $n$ shows that both $\sum_{k=1}^n E|X_k - k^{-1}|^3$ and $B_n^3$ diverge as $n \to \infty$, but $B_n^3$ grows faster, so that $\frac{1}{B_n^3}\sum_{k=1}^n E|X_k - k^{-1}|^3 \to 0$. Can anyone help me prove this convergence?
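
A short R sketch of that numerical check (illustrative only; it evaluates the two sums displayed above):

for (n in c(1e2, 1e4, 1e6)) {
  k <- 1:n
  third.moments <- sum(1/k - 3/k^2 + 4/k^3 - 2/k^4)  # sum of E|X_k - 1/k|^3
  B.n3 <- sum((k - 1)/k^2)^(3/2)                     # B_n^3 = (B_n^2)^(3/2)
  cat("n =", n, " Lyapunov ratio =", third.moments / B.n3, "\n")
}

The ratio decreases, but only slowly, consistent with the logarithmic rates made precise in the answers below.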


This is Example 27.3 of Probability and Measure by Patrick Billingsley.
Zhanxiong

Answers:



It may be instructive to demonstrate this result from first principles and basic results, exploiting the properties of cumulant generating functions (exactly as in standard proofs of the Central Limit Theorem). It requires us to understand the growth rate of the generalized harmonic numbers

$$H(n,s) = \sum_{k=1}^n k^{-s}$$

for $s = 1, 2, \ldots$. These growth rates are well known and easily obtained by comparison with the integrals $\int_1^n x^{-s}\,dx$: the sums converge for $s > 1$ and diverge logarithmically for $s = 1$.
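
For concreteness, a quick R check of these growth rates (illustrative only):

H <- function(n, s) sum((1:n)^(-s))                   # generalized harmonic number H(n, s)
n <- 1e6
c(H1 = H(n, 1), log.asymptote = log(n) + 0.5772157,   # H(n,1) ~ log(n) (plus Euler's constant)
  H2 = H(n, 2), limit = pi^2/6)                       # H(n,2) converges to pi^2/6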

Let $n \ge 2$ and $1 \le k \le n$. By definition, the cumulant generating function (cgf) of $(X_k - 1/k)/B_n$ is

$$\psi_{k,n}(t) = \log E\left(\exp\left(\frac{X_k - 1/k}{B_n}\,t\right)\right) = -\frac{t}{kB_n} + \log\left(1 + \frac{-1 + \exp(t/B_n)}{k}\right).$$
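
To spell out the second equality (a one-line check): since $X_k$ equals $1$ with probability $1/k$ and $0$ otherwise,

$$E\left(\exp\left(\frac{X_k}{B_n}\,t\right)\right) = \left(1 - \frac{1}{k}\right) + \frac{1}{k}\exp(t/B_n) = 1 + \frac{-1 + \exp(t/B_n)}{k},$$

while the nonrandom centering by $1/k$ contributes the factor $\exp(-t/(kB_n))$, i.e. the summand $-t/(kB_n)$ after taking logarithms.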

The series expansion of the right-hand side, obtained from the expansion of $\log(1+z)$ about $z = 0$, has the form

$$\psi_{k,n}(t) = \frac{k-1}{2k^2 B_n^2}\,t^2 + \frac{k^2 - 3k + 2}{6k^3 B_n^3}\,t^3 + \cdots + \frac{k^{j-1} - \cdots \pm (j-1)!}{j!\,k^j B_n^j}\,t^j + \cdots.$$
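
As a check on the quadratic term: writing $z = \frac{-1+\exp(t/B_n)}{k} = \frac{t}{kB_n} + \frac{t^2}{2kB_n^2} + O(t^3)$ and using $\log(1+z) = z - z^2/2 + O(z^3)$, the linear terms cancel against $-t/(kB_n)$ and the $t^2$ terms combine to

$$\frac{t^2}{2kB_n^2} - \frac{t^2}{2k^2 B_n^2} = \frac{k-1}{2k^2 B_n^2}\,t^2,$$

in agreement with the displayed expansion.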

The numerators of the coefficients of $t^j$ are polynomials in $k$ of degree $j-1$ with leading term $k^{j-1}$. The expansion converges absolutely provided $\left|\frac{-1 + \exp(t/B_n)}{k}\right| < 1$; that is, when

$$\left|\exp(t/B_n) - 1\right| < k.$$

(In case $k = 1$ it converges everywhere.) For fixed $k$ and increasing values of $n$, the (obvious) divergence of $B_n$ implies the domain of absolute convergence grows arbitrarily large. Thus, for any fixed $t$ and sufficiently large $n$, this expansion converges absolutely.

For sufficiently large $n$, then, we may sum the individual $\psi_{k,n}$ over $k$ term by term in powers of $t$ to obtain the cgf of $S_n/B_n$,

$$\psi_n(t) = \sum_{k=1}^n \psi_{k,n}(t) = \frac{1}{2}\,t^2 + \cdots + \frac{1}{B_n^j}\left(\sum_{k=1}^n \frac{k^{j-1} - \cdots \pm (j-1)!}{k^j}\right)\frac{t^j}{j!} + \cdots.$$
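
The leading term is worth verifying explicitly: by the definition of $B_n^2$,

$$\sum_{k=1}^n \frac{k-1}{2k^2 B_n^2}\,t^2 = \frac{t^2}{2B_n^2}\sum_{k=1}^n \frac{k-1}{k^2} = \frac{t^2}{2},$$

so the quadratic term carries no dependence on $n$.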

Taking the terms in the sums over $k$ one at a time requires us to evaluate expressions proportional to

$$b(s,j) = \frac{1}{B_n^j}\sum_{k=1}^n k^{-s}$$

for $j \ge 3$ and $s = 1, 2, \ldots, j$. Using the asymptotics of the generalized harmonic numbers mentioned in the introduction, it follows easily from

$$B_n^2 = H(n,1) - H(n,2) \sim \log(n)$$

that

$$b(1,j) \sim (\log(n))^{1-j/2} \to 0$$

and (for $s > 1$)

$$b(s,j) \sim (\log(n))^{-j/2} \to 0$$

as $n$ grows large. Consequently all terms in the expansion of $\psi_n(t)$ beyond $t^2$ converge to zero, whence $\psi_n(t)$ converges to $t^2/2$ for every value of $t$. Since convergence of the cgf implies convergence of the characteristic function, we conclude from the Lévy Continuity Theorem that $S_n/B_n$ converges in distribution to a random variable whose cgf is $t^2/2$: that is, the standard Normal variable, QED.
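
For concreteness, consider the slowest-decaying case $j = 3$: the dominant contribution is $b(1,3) = H(n,1)/B_n^3 \sim \log(n)/(\log n)^{3/2} = (\log n)^{-1/2} \to 0$, while $b(2,3)$ and $b(3,3)$ are $O((\log n)^{-3/2})$ because $H(n,2)$ and $H(n,3)$ converge.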


This analysis uncovers just how delicate the convergence is: whereas in many versions of the Central Limit Theorem the coefficient of $t^j$ is $O(n^{1-j/2})$ (for $j \ge 3$), here the coefficient is only $O((\log(n))^{1-j/2})$: the convergence is much slower. In this sense the sequence of standardized variables "just barely" becomes Normal.

We can see this slow convergence in a series of simulations. The histograms display $10^5$ independent iterations for four values of $n$. The red curves are graphs of the standard normal density function for visual reference. Although there is evidently a gradual tendency towards normality, even at $n = 1000$ (where $(\log(n))^{-1/2} \approx 0.38$ is still sizable) there remains appreciable non-normality, as evidenced by the skewness (equal to $0.35$ in this sample). (It is no surprise that the skewness of this histogram is close to $(\log(n))^{-1/2}$, because that is precisely the size of the $t^3$ term in the cgf.)

Figure: histograms for n=30, 100, 300, 1000

Here is the R code for those who would like to experiment further.

set.seed(17)
par(mfrow=c(1,4))
n.iter <- 1e5
for(n in c(30, 100, 300, 1000)) {
  # B_n = sqrt(sum_{k=1}^n (k-1)/k^2); rev() merely sums the small terms first
  B.n <- sqrt(sum(rev((((1:n)-1) / (1:n)^2))))
  # Each column of x is one realization of (X_1, ..., X_n) with X_k ~ Bernoulli(1/k)
  x <- matrix(rbinom(n*n.iter, 1, 1/(1:n)), nrow=n, byrow=FALSE)
  # Standardized sums S_n / B_n
  z <- colSums(x - 1/(1:n)) / B.n
  hist(z, main=paste("n =", n), freq=FALSE, ylim=c(0, 1/2))
  curve(dnorm(x), add=TRUE, col="Red", lwd=2)
}
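
As a quick follow-up (it reuses the z left over from the last loop iteration, $n = 1000$), one can compare the sample skewness with the $(\log n)^{-1/2}$ rate suggested by the $t^3$ term of the cgf:

skewness <- function(z) mean((z - mean(z))^3) / sd(z)^3  # simple moment estimator
c(sample.skewness = skewness(z), rate = 1/sqrt(log(1000)))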


You have a great answer already. If you want to complete your own proof, too, you can argue as follows:

Since $\sum_{k=1}^n 1/k^i$ converges for all $i > 1$ and diverges for $i = 1$ (here), we may write

$$S(n) := \sum_{k=1}^n \left(\frac{1}{k} - \frac{3}{k^2} + \frac{4}{k^3} - \frac{2}{k^4}\right) = \sum_{k=1}^n \frac{1}{k} + O(1).$$

By the same argument,

$$B_n^2 = \sum_{k=1}^n \frac{1}{k} + O(1).$$

Consequently, $S(n)/B_n^2 = O(1)$ and, thus,

$$S(n)/B_n^3 = O(1)\cdot\left(B_n^2\right)^{-1/2} \to 0,$$

which is what we wanted to show.
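
A small numerical illustration of these two ratios (in the spirit of the simulation above; illustrative only):

sapply(10^(2:6), function(m) {
  k <- 1:m
  S  <- sum(1/k - 3/k^2 + 4/k^3 - 2/k^4)  # S(m), the sum of third absolute moments
  B2 <- sum((k - 1)/k^2)                  # B_m^2
  c(n = m, S.over.B2 = S/B2, S.over.B3 = S/B2^(3/2))
})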



First, your random variables are not identically distributed, since the distributions depend on $k$ ;)

Also, I wouldn't use your $B_n$ notation, because:

  • capital letters are usually reserved for random variables;
  • it's just the sum of the variances, so I would use a notation involving a $\sigma$ symbol to make this obvious.

Then, regarding the question, I don't know whether this is an exercise or research, or what tools you're allowed to use. If you're not trying to re-prove known theorems, I'd just say it's a central limit theorem for independent, non-identically distributed but uniformly bounded RVs and call it a day. I don't have a good source at hand, but it shouldn't be too hard to find one; for example, look at /mathpro/29508/is-there-a-central-limit-theorem-for-bounded-non-identically-distributed-random.

Edit: My bad, of course the uniformly bounded condition is not enough; you also need

$$\sum_{k=1}^n \sigma_k^2 \to \infty.$$

(In the present problem this holds, since $\sum_{k=1}^n \sigma_k^2 = B_n^2 \to \infty$.)