
What is the difference between convergence in distribution, convergence in probability and almost sure convergence?

References

  • Convergence of random variables (Wikipedia)

  • 2
    You're linking a page that directly answers your question. So what is your question, actually? (2010-08-08)
  • 2
    I've looked at the Wikipedia article, which gives *definitions*. I am trying to find a clear and succinct explanation of how they are different. Regardless, one of the biggest benefits of this site is how it brings together different perspectives on the same question. (2010-08-08)
  • 5
    After reading some definition(s) one needs to think. And that's not a kind of work that can be outsourced. (2010-08-08)

4 Answers

5

Convergence in distribution means pointwise convergence of the CDFs at every point where the limiting CDF is continuous. It's not a form of convergence of the random variables per se, only of their distributions.

Convergence in probability says that for every $\epsilon > 0$, the probability that the $n$-th random variable is more than $\epsilon$ away from the limiting random variable goes to $0$ as $n \to \infty$. Convergence in probability is also called convergence in measure.

Convergence almost everywhere = almost sure convergence = pointwise convergence of the random variables except possibly on a set of measure zero.

As for which modes imply which: almost sure convergence implies convergence in probability, and convergence in probability implies convergence in distribution. The converse implications fail in general, though convergence in distribution to a constant does imply convergence in probability to that constant.
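To see the gap between these modes concretely, here is a small simulation sketch (my addition, not part of the original answer) of the classic "typewriter" sequence, which converges to $0$ in probability but not almost surely; the function name and the sample sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def typewriter(n, u):
    """X_n(u): indicator of a window of width 2**-k sweeping across [0, 1),
    where n = 2**k + j with 0 <= j < 2**k."""
    k = int(np.log2(n))
    j = n - 2**k
    return ((j / 2**k <= u) & (u < (j + 1) / 2**k)).astype(float)

u = rng.uniform(size=100_000)            # one point of [0, 1) per simulated omega
for n in [4, 16, 64, 256, 1024]:
    # P(X_n >= eps) = 2**-k -> 0, so X_n -> 0 in probability:
    print(n, typewriter(n, u).mean())

# Yet every fixed omega sees X_n = 1 once per dyadic block of indices,
# so X_n(omega) converges for no omega: no almost sure convergence.
hits = [n for n in range(2, 2048) if typewriter(n, u[0])]
print("X_n = 1 at n =", hits[-5:])
```

Each sweep of the window is rarer than the last, so the individual probabilities shrink, but every sample point is hit once per sweep, so no single path ever settles down.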

4

For $X_1, X_2, \ldots$ i.i.d. with mean $\mu$ and finite variance $\sigma^2 > 0$, the Central Limit Theorem (convergence in distribution) says that if we wait long enough, that is, take a large enough sample, then $$ \frac{\sqrt{n}(\overline{X}_n-\mu)}{\sigma}$$ will have a probability distribution arbitrarily close to the $N(0,1)$ distribution. Notice the CLT doesn't say anything about the actual behavior of any particular $\overline{X}_n$ for large $n$, only that if we observed a whole bunch of them (all at that same large $n$) and made a histogram, it would be approximately bell-shaped.
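As a quick illustration (my addition, not part of the original answer), here is a minimal simulation of this statement, assuming Uniform(0, 1) samples, for which $\mu = 1/2$ and $\sigma^2 = 1/12$; the sample size and replication count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 20_000
mu, sigma = 0.5, (1 / 12) ** 0.5        # mean and sd of a Uniform(0, 1) draw

xbar = rng.uniform(size=(reps, n)).mean(axis=1)   # many sample means at the same n
z = np.sqrt(n) * (xbar - mu) / sigma              # the CLT statistic

# If z is approximately N(0, 1), these should be near 0.975 and 0.683:
print((z <= 1.96).mean())
print((np.abs(z) <= 1.0).mean())
```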

The Weak Law of Large Numbers (convergence in probability) says that if we wait long enough (i.e., take a large enough sample), then the chance that $$ \frac{\overline{X}_n-\mu}{\sigma}$$ is near zero can be made arbitrarily high. Note a couple of things (a simulation sketch follows the notes):

  1. the $\sqrt{n}$ is gone, and
  2. again, nothing is said about any particular observed $\overline{X}_n$, only that it would be a pretty safe bet that the above is close to zero.
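Here is a matching sketch for the WLLN (my addition), again with Uniform(0, 1) samples; $\epsilon = 0.05$ is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, eps, reps = 0.5, 0.05, 2_000

for n in [10, 100, 1000]:
    xbar = rng.uniform(size=(reps, n)).mean(axis=1)
    # P(|Xbar_n - mu| >= eps) shrinks toward 0 as n grows:
    print(n, (np.abs(xbar - mu) >= eps).mean())
```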

The Strong Law of Large Numbers (almost sure convergence) says that $$ \frac{\overline{X}_n-\mu}{\sigma}$$ is 100% sure to be arbitrarily close to $0$, provided we wait long enough (take a large enough sample). That is strong, indeed. Again, please note a couple of things:

  1. the $\sqrt{n}$ is still gone. In fact, this is the same fellow that was in the WLLN, and
  2. this time, something is said about the particular sequence of $\overline{X}_n$'s that we are watching go by. Any one sequence could still stray far from $0$ (by more than $\epsilon$) at some point, but the set of sequences for which $\overline{X}_n$ fails to converge to $\mu$ has probability zero.

Of course, nothing is ever guaranteed in probability; we can do no better than 100% sure. Also, the SLLN doesn't say anything about how long we would have to wait to be within $\epsilon$; we would need the Law of the Iterated Logarithm for something like that. Finally, this discussion is about convergence (in distribution/probability/a.s.) to a constant, while the general definitions are about convergence to another random variable (or even convergence of two sequences). But we can regain the intuition by thinking about differences.
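A companion sketch for the SLLN (my addition): instead of many replications at a fixed $n$, we follow a single realized sequence of sample means and watch that one path settle down.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = 0.5
x = rng.uniform(size=100_000)                        # ONE realized sequence
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

for n in [10, 100, 1_000, 10_000, 100_000]:
    print(n, abs(running_mean[n - 1] - mu))          # this particular path settles down
```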

I don't see anything wrong with the other two answers given for this question, I just thought I'd offer another perspective.

1

Consider a sequence of random variables $X_1, X_2, \ldots$ and a random variable $X$. Let $F$ be the cumulative distribution function of $X$ and $F_n$ be the cumulative distribution function of $X_n$. Convergence in distribution occurs when $F$ is the pointwise limit of the $F_n$, i.e. $\lim_{n \to \infty} F_n(x) = F(x)$ whenever $F$ is continuous at $x$.
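To illustrate the definition (my addition, not part of the original answer), a classic example is $X_n = n(1 - \max(U_1, \ldots, U_n))$ for i.i.d. Uniform(0, 1) draws, which converges in distribution to an Exp(1) variable; the sketch compares the empirical $F_n(x)$ with the limit $F(x) = 1 - e^{-x}$ at a few points.

```python
import numpy as np

rng = np.random.default_rng(4)
reps = 20_000

for n in [5, 50, 500]:
    # X_n = n * (1 - max of n Uniform(0,1) draws); its CDF is 1 - (1 - x/n)**n,
    # which converges pointwise to the Exp(1) CDF, F(x) = 1 - exp(-x).
    x_n = n * (1 - rng.uniform(size=(reps, n)).max(axis=1))
    for x in [0.5, 1.0, 2.0]:
        print(n, x, round((x_n <= x).mean(), 3), round(1 - np.exp(-x), 3))
```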

Convergence in probability is defined as follows: for every $\epsilon > 0$, $\lim_{n \to \infty} \Pr(|X_n - X| \ge \epsilon) = 0$. It is stronger than convergence in distribution. I won't prove this here, but I can give the following intuition: for convergence in distribution we only looked at the cumulative distribution functions, so it didn't matter how (or even whether) the variables depended on one another. For convergence in probability, the joint behavior matters: either $X$ is a constant, or $X_n$ must become coupled to $X$, since $X_n - X$ has to live on a common probability space.
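A standard counterexample makes this concrete (my addition): take $X \sim N(0, 1)$ and $X_n = -X$ for every $n$. Each $X_n$ has exactly the same distribution as $X$, so convergence in distribution holds trivially, yet $|X_n - X| = 2|X|$ never shrinks.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(100_000)   # X ~ N(0, 1)
x_n = -x                           # take X_n = -X for every n

# Same marginal CDF, so F_n(t) = F(t) everywhere: convergence in
# distribution is immediate. But |X_n - X| = 2|X| does not shrink:
for eps in [0.1, 1.0]:
    print(eps, (np.abs(x_n - x) >= eps).mean())   # stays well away from 0
```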

Almost sure convergence is defined as follows: $\Pr(\lim_{n \to \infty} X_n = X) = 1$. On this MathOverflow thread I asked whether there was a way to reduce limits to some kind of canonical form, and obtained the following, writing $d_n = |X_n - X|$:

  • Convergence in probability: $\forall \epsilon, \delta \;\exists N_1(\epsilon, \delta) \;\forall n > N_1: \Pr(d_n \ge \epsilon) < \delta$
  • Almost sure convergence: $\Pr\big(\forall \epsilon \;\exists N_2(\epsilon) \;\forall n > N_2: d_n < \epsilon\big) = 1$
  • Alternative form: $\Pr\big(\exists \epsilon \;\forall N \;\exists n > N: d_n \ge \epsilon\big) = 0$

Notice that for convergence in probability we can push each probability $\Pr(d_n \ge \epsilon)$ below any positive $\delta$ by taking $n$ large enough, while in the second formula the probability is exactly $1$ (equivalently, exactly $0$ in the alternative form). Moreover, the first statement bounds a separate probability for each individual $n$, while the second is a single probability about the whole sequence, quantifying over all $\epsilon$ and $n$ inside the event. From these forms it is clear that almost sure convergence implies convergence in probability. And if convergence in probability fails, we can find some $\epsilon, \delta$ with $\Pr(d_n \ge \epsilon) \ge \delta$ for infinitely many $n$; that alone forces the probability in the alternative form to be positive rather than $0$.
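The contrast between "one small probability for each $n$" and "one probability over the whole tail" can be checked numerically (my addition). This sketch reuses the sweeping-window sequence from the sketch in the first answer, with $X = 0$ so that $d_n = X_n$, any $\epsilon < 1$, and an arbitrary cutoff $N = 64$.

```python
import numpy as np

rng = np.random.default_rng(6)
u = rng.uniform(size=50_000)                      # sample the omegas once

def window(n, u):
    # same sweeping-window ("typewriter") indicator as in the earlier sketch
    k = int(np.log2(n)); j = n - 2**k
    return (j / 2**k <= u) & (u < (j + 1) / 2**k)

N = 64
print("P(d_N >= eps):", window(N, u).mean())      # one individual n: ~ 1/64
tail = np.zeros(u.size, dtype=bool)
for n in range(N, 4 * N):                         # event over the whole tail n > N
    tail |= window(n, u)
print("P(some n > N has d_n >= eps):", tail.mean())   # = 1.0: cannot be pushed to 0
```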

-1

Let $X_1$ be the sequence $1, 1, 2, 1, 1, 2, \ldots$ and let $X_2$ be the sequence $2, 1, 1, 2, 1, 1, \ldots$.

Assume a classical setup for the measure.

$X_1$ and $X_2$ converge in distribution, since both output the value $1$ with probability $2/3$ and the value $2$ with probability $1/3$.

But $X_1$ will never converge in probability to $X_2$, because no matter how far out we look, two thirds of the coordinates still differ from each other. The sequences are not getting close to each other, as they would have to under convergence in probability.
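Taking the sequences literally as deterministic paths, a few lines of code (my addition) confirm both claims:

```python
import numpy as np

n = 300
x1 = np.tile([1, 1, 2], n // 3)    # the answer's first sequence
x2 = np.tile([2, 1, 1], n // 3)    # the second, shifted by one position

print((x1 == 1).mean(), (x2 == 1).mean())   # both 2/3: identical distributions
print((x1 != x2).mean())                    # 2/3 of coordinates differ, not -> 0
```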

  • 0
    Welcome to Math.SE! You may want to read our [guide to notation](http://math.stackexchange.com/help/notation). (Also, it'd be nice to mention that the classical setup is the product measure on $\{0,1\}^{\mathbb N}$.) (2014-07-17)
  • 0
    Even in the classical setup, this is so far from being acceptable... (2014-07-17)