Every first-year graduate student in statistics or economics has to learn the basics of probability and statistical theory. Most of this is fairly understandable – once you grasp the basic concepts behind probability and distribution theory, everything else is mostly applications and details. There is at least one notable exception – the relationship between the various notions of convergence used in probability theory.

Everyone learns that convergence almost surely implies convergence in probability, which in turn implies convergence in distribution. Everyone also learns that none of the converse implications holds in general, but I don’t think anyone comes away really grasping the difference between these concepts – they learn the “that” but not the “why.” I’m guilty of this too. We memorize which is stronger, then go on to use them to talk about central limit theorems, laws of large numbers, and properties of estimators. This is probably fine, since first-year courses typically focus on getting comfortable using the main concepts of probability theory rather than on deep understanding – that’s for later courses. But some people can’t wait for those courses, so I’ll try to explain the difference both intuitively and with some math.
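
For reference, here are the usual textbook statements of the three modes of convergence. The notation is my assumption rather than anything fixed above: X_n is a sequence of random variables, X is the candidate limit, and F_n and F are their distribution functions.

```latex
% Almost sure convergence: the sequence converges on a set of outcomes with probability one
X_n \xrightarrow{\text{a.s.}} X
  \iff P\Bigl(\lim_{n \to \infty} X_n = X\Bigr) = 1

% Convergence in probability: large deviations from X become unlikely as n grows
X_n \xrightarrow{p} X
  \iff \lim_{n \to \infty} P\bigl(|X_n - X| > \varepsilon\bigr) = 0
  \quad \text{for every } \varepsilon > 0

% Convergence in distribution: the CDFs converge wherever the limit CDF is continuous
X_n \xrightarrow{d} X
  \iff \lim_{n \to \infty} F_n(x) = F(x)
  \quad \text{at every } x \text{ where } F \text{ is continuous}
```

Note where the limit sits in the first two: inside the probability for almost sure convergence, outside it for convergence in probability.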
