Why must I explain “kurtosis”?
The annual rainfall at Manilla, NSW has changed dramatically decade by decade since the record began in 1883. One way that it has changed is in the amount of rain each year, as shown in this graph that I posted earlier.
Another way, unrelated to the amount of rain, is in its kurtosis. Higher kurtosis brings more rainfall values that are extreme; lower kurtosis brings fewer. We would do well to learn more about rainfall kurtosis.
[A comment on the meaning of kurtosis by Peter Westfall is posted below.]
Describing Frequency Distributions
The Normal Distribution
Many things vary in a way that seems random: pure chance causes values to spread above and below the average.
If the values are counted into “bins” of equal width, the pattern is called a frequency-distribution. When many small, independent chance effects add together, this pattern tends toward the bell-shaped curve of the Normal Distribution.
The values of annual total rainfall measured each year at Manilla have a frequency-distribution that is rather like that. This graph compares the actual distribution with a curve of Normal Distribution.
Moments of a Normal Distribution: (i) Mean, and (ii) Variance
The shape of any frequency-distribution is described in a simple way by a set of four numbers called moments. A Normal Distribution is described by just the first two of them.
The first moment is the Mean (or average), which says where the middle line of the values is. For Manilla annual rainfall, the Mean is 652 mm.
The second moment is the Variance, which is also the square of the Standard Deviation. This second moment says how widely spread or scattered the values are. For Manilla annual rainfall, the Standard Deviation is 156 mm.
Moments of other (non-normal) distributions: (iii) Skewness, and (iv) Kurtosis
The third moment, Skewness, describes how a frequency-distribution may have one tail longer than the other. When the tail on the right is longer, that is called right-skewness, and the skewness value is positive in that case. For the actual frequency-distribution of Manilla annual rainfall, the Skewness is slightly positive: +0.268. (That is mainly due to one extremely high rainfall value: 1192 mm in 1890.)
Kurtosis is the fourth moment of the distribution. It describes how the distribution differs from Normal by being higher or lower in its peak (but see the comments below) or its tails, as compared to its shoulders.
As first defined, kurtosis has the value 3 for a Normal Distribution, but I (and Excel) use the convention “excess kurtosis”, from which 3 has been subtracted. The excess kurtosis of a Normal Distribution is then zero, while that of other, non-normal distributions may be positive or negative.
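As a rough illustration of the four moments described above, here is a minimal Python sketch (Python is my substitute here for a spreadsheet). The rainfall values are invented for the example and are not the Manilla record. The first function uses the simple “population” formulas. The second follows the sample-size-corrected formula that Excel documents for its KURT function, so the two results differ somewhat for short series. Which of these conventions lies behind the figures quoted in this post is not stated, so treat the choice as an assumption.

```python
import math

def four_moments(values):
    """Mean, standard deviation, skewness and excess kurtosis,
    using the simple population formulas (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    sd = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # subtract 3 so that Normal gives 0
    return mean, sd, skewness, excess_kurtosis

def excel_style_kurt(values):
    """Sample excess kurtosis with the small-sample correction
    documented for Excel's KURT function (needs at least 4 values)."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance
    z4 = sum(((x - mean) ** 2 / s2) ** 2 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical annual totals in mm (invented for illustration only).
rain_mm = [480, 520, 610, 655, 700, 730, 790, 905]

mean, sd, skew, kurt = four_moments(rain_mm)
print(f"mean={mean:.0f} mm  sd={sd:.0f} mm  skew={skew:+.3f}  kurt={kurt:+.3f}")
print(f"Excel-style excess kurtosis: {excel_style_kurt(rain_mm):+.3f}")
```

A negative result from either function means fewer extreme values than a Normal Distribution of the same mean and Standard Deviation would give; a positive result means more.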
Manilla’s total frequency distribution of annual rainfall has a Kurtosis of -0.427. As shown here (copied from an earlier post), I fitted a curve with suitably negative kurtosis to Manilla’s (smoothed) annual rainfall distribution.
Platykurtic, Mesokurtic, and Leptokurtic distributions
Karl Pearson invented the terms: platykurtic for (excess) kurtosis well below zero, mesokurtic for kurtosis near zero, and leptokurtic for kurtosis well above zero.
The sketch by W S Gosset at the top of this page shows the typical shapes of platykurtic and leptokurtic curves.
(See the Note below: ‘The sketch by “Student”‘.)
In the two graphs that follow, I show how a curve of Normal Distribution can be modified to be leptokurtic (+ve) or platykurtic (-ve) while remaining near-normal in shape. (See the note “Constructing the kurtosis adjuster”.)
In both of these graphs, I have drawn the curve of Normal Distribution in grey, with call-outs to locate the mean point and the two “shoulder” points that are one Standard Deviation each side of the mean.
A leptokurtic (+ve) curve
By adding the “adjuster curve” (red) to the Normal curve, I get the classical leptokurtic shape (green) as was sketched by Gosset. It has a higher peak, lowered shoulders, and fat tails. The shape is like that of a volcanic cone: the peak is narrow, and the upper slopes steep. The slopes get gentler as they get lower, but not as gentle as on the Normal Curve.
A platykurtic (-ve) curve
For this construction, the same “adjuster curve” is turned over before adding it to the Normal Curve. The result is like Gosset’s platypus: short tails and a peak flattened to form a broad back.
In arranging to lower the peak and raise the shoulders of the Normal Curve I could not avoid giving the adjusted curve two peaks. Platykurtic curves may grade into bimodal curves. (I modeled Manilla’s historic distribution of annual rainfall as either platykurtic or bimodal (preferred) in this earlier post.)
More discussion of kurtosis
The leptokurtic and platykurtic curves in my graphs are close to curves of Normal Distribution.
The extreme case of a platykurtic distribution is quite different. It is represented by tossing a coin many times and counting +1 for a head and -1 for a tail. The mean is zero, but there is no “peak” there. There are also no “tail” values beyond +1 or -1. The entire weight in the distribution is at +1 and -1, which are the shoulders, spaced at one Standard Deviation from the mean. The excess kurtosis of this distribution is minus 2, the most platykurtic of all.
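The coin-toss figure can be checked directly from the definition. The sketch below (the function name is my own) computes the excess kurtosis of any distribution given as a list of values and their probabilities, then applies it to a fair coin scored +1 and -1.

```python
def excess_kurtosis_discrete(values, probs):
    """Excess kurtosis of a discrete distribution given its
    possible values and the probability of each."""
    mean = sum(p * v for v, p in zip(values, probs))
    variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    fourth = sum(p * (v - mean) ** 4 for v, p in zip(values, probs))
    return fourth / variance ** 2 - 3.0

# Fair coin: +1 for a head, -1 for a tail.
fair_coin = excess_kurtosis_discrete([+1, -1], [0.5, 0.5])
print(fair_coin)  # -2.0
```

Every value sits exactly one Standard Deviation from the mean, so the fourth moment equals the square of the variance, and the ratio of 1 less 3 gives minus 2.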
Among distributions that are leptokurtic, an extreme case is the Student’s t curve with four degrees of freedom. Its kurtosis value is infinite. Despite that, it looks much the same as the “Leptokurtic Curve” in my graph. Leptokurtic distributions are marked mainly by the greater weight of their extreme tails. That is hard to show in a drawing, since the weight of the tails, whether in a mesokurtic Normal Distribution or in a highly leptokurtic distribution, is very small indeed.
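To put rough numbers on how small the tail weights are, the sketch below compares the chance of falling more than 2, 3, or 4 Standard Deviations from the mean under a Normal Distribution and under Student’s t with four degrees of freedom. Two assumptions are mine: I compare each curve at multiples of its own Standard Deviation (for t with four degrees of freedom that is the square root of 2), and I use the closed-form cumulative distribution that exists for this particular case, so only the standard math module is needed.

```python
import math

def normal_two_tail(k):
    """P(|Z| > k) for a standard Normal variable."""
    return math.erfc(k / math.sqrt(2))

def t4_cdf(t):
    """Cumulative distribution of Student's t with 4 degrees of freedom
    (closed form; its derivative is (3/8) * (1 + t^2/4) ** -2.5)."""
    u = t / math.sqrt(1 + t * t / 4)
    return 0.5 + 0.375 * (u - u ** 3 / 12)

def t4_two_tail(k_sd):
    """P(|T| > k_sd standard deviations) for Student's t with 4 d.f."""
    t = k_sd * math.sqrt(2)  # SD of t(4) is sqrt(4 / (4 - 2))
    return 2 * (1 - t4_cdf(t))

for k in (2, 3, 4):
    print(f"beyond {k} SD:  Normal {normal_two_tail(k):.5f}   "
          f"t(4) {t4_two_tail(k):.5f}")
```

Beyond 3 Standard Deviations the t curve carries several times the weight of the Normal curve, yet both weights are only around one percent or less of the total, which is why the difference barely shows in a drawing.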
For mathematics, statistics, and opinions on Kurtosis, I recommend the Wikipedia article “Kurtosis” and the tutorial by Stan Brown.
History of four moments of annual rainfall
A following post shows the history of all four “Moments of Manilla’s 12-monthly Rainfall”.
See also “Relations Among Rainfall Moments”.
Note: The sketch by “Student”
The sketch (with text) was published by William Sealy Gosset, the “Student” who invented the Student’s t-test in statistics. It was a footnote on page 160 in this paper:
Student, 1927. “Errors of Routine Analysis”, Biometrika 19 (1/2), 151-164.
‘Gosset was a friend of both [Karl] Pearson and [R.A.] Fisher, a noteworthy achievement, for each had a massive ego and a loathing for the other. He was a modest man who once cut short an admirer with the comment that “Fisher would have discovered it all anyway.”‘
Note: Constructing the kurtosis adjuster
I have modified a curve of Normal Distribution by adding an arbitrary “Adjuster” curve. The “Adjuster” is sinusoidal, with a peak (or trough) at the mean, symmetrical about the mean, tapered away from the mean, and with wavelength increasing away from the mean.
The zero points where the adjuster changes from positive to negative have been fixed at:
z = (Mean) +/- 0.4*(StdDev), confining 15% (approx.) each side of the peak value;
z = (Mean) +/- 1.65*(StdDev), confining 5% in each tail.
Thus, for the Normal Distribution, the peak (30%) and the two tails (10%) total 40% and the remainder, the shoulders, total 60%.
Using this model, the peak and tails together would be more than 40% for a leptokurtic curve, and less than 40% for a platykurtic curve.
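The rounded percentages for the Normal Distribution can be checked with a few lines of Python. This sketch only verifies the areas between the chosen zero points; it does not reproduce the adjuster curve itself, whose exact formula is not given here.

```python
import math

def normal_cdf(z):
    """Cumulative probability of a standard Normal variable up to z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Zero points of the adjuster, in Standard Deviations from the mean.
peak_limit = 0.4
tail_limit = 1.65

peak = normal_cdf(peak_limit) - normal_cdf(-peak_limit)  # central area
tails = 2 * (1 - normal_cdf(tail_limit))                 # both tails
shoulders = 1 - peak - tails                             # the remainder

print(f"peak {peak:.1%}, tails {tails:.1%}, shoulders {shoulders:.1%}")
```

The results come out close to 31%, 10%, and 59%, which round to the 30%, 10%, and 60% quoted above.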
5 thoughts on “Kurtosis, Fat Tails, and Extremes”
The discussion of “peak” and “shoulders” is a misdirection that is not particularly helpful or even correct. In general, kurtosis communicates information about outlier (rare, extreme observation) potential only. Consider a very discrete distribution, like the two-point distribution: peak and shoulders make no sense in this context but kurtosis still tells you about the rare outcome (the outlier). Please point your readers here https://en.wikipedia.org/wiki/Talk:Kurtosis#Why_kurtosis_should_not_be_interpreted_as_.22peakedness.22 for a clear explanation of what is kurtosis, and what information it communicates.
Many thanks for your comment, Peter. You are an acknowledged authority.
Indeed, I intend to use kurtosis as an indicator of extremes rather than of shape as so wittily sketched by Gosset 90 years ago.
good sir…although sheep may be fat-tailed (or not), ‘platy’ is Greek for ‘flat’, not ‘fat’. As I am sure you are aware… DD and D(Mrs)D in Vermont
It is unfortunate that the wrong Greek term has been used to describe the concept. Platykurtic (excess kurtosis <0) does not imply flatness at all. It simply implies that there is a paucity of outliers (rare, extreme observations) as compared to what the normal distribution predicts. The peak can have any shape whatsoever – infinite, U-shaped, bimodal, trimodal, sinusoidal, triangular, anything, including flat.
Blame it on Pearson. He had it wrong in 1905, but everyone just followed along (like sheep!)
In the world of frequency-distributions, “flat” and “fat” are poles apart – one in the peak and the other in the tails.
Pearson, inventing the term platykurtic, combined words meaning flat and curvature, referring to a peak that is broad.
Others, from Gosset on, see that kurtosis is determined by the tails rather than by the peak. Leptokurtic means with tails that are heavy: fat tails, or long tails, or long fat tails.