Rainfall kurtosis vs. HadCRUT4, revised

Patterns of rainfall kurtosis and global temperature.

The kurtosis of annual rainfall at Manilla NSW forms a time-series that matches the time-series of global surface temperature when detrended.

[REVISED:
Earlier posts were based on rainfall data sets that were too small. Estimates of kurtosis and skewness were unstable. For details please read “Rainfall kurtosis matches HadCRUT4” and “Rainfall kurtosis vs. HadCRUT4: scatterplots”.]

The variables

These two climate variables have little in common. Manilla, NSW, is a single station that has a 134-year record of daily rainfall only. That yields estimates of rainfall kurtosis, an indicator of the relative frequency of extreme values.
HadCRUT4 is one of several century-long estimates of near-surface temperature for the whole world. [See Note below: “Data Sources”.]

The visual match of the patterns

The first graph (a dual-axis line chart) shows that these two variables have similar patterns of variation over time.

I found the best visual match by:
* scaling 0.5 units of Manilla rainfall kurtosis to 0.1° of detrended HadCRUT4 temperature;
* aligning the kurtosis value of -0.3 units with the zero of detrended temperature;
* lagging the rainfall by two years.
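
The three adjustments listed above amount to a simple change of axis plus a shift in time. Here is a minimal sketch in Python, assuming the scale (0.5 kurtosis units to 0.1°), the alignment (-0.3 units at zero) and the two-year lag given in the list. The function names are hypothetical and are not part of the original workflow, which was done on a chart.

```python
def kurtosis_to_temperature_axis(kurtosis, kurtosis_zero=-0.3,
                                 degrees_per_unit=0.1 / 0.5):
    """Place a kurtosis value on the detrended-temperature axis.

    kurtosis_zero is the kurtosis value aligned with 0 degrees;
    degrees_per_unit is the scale: 0.1 degrees for every 0.5 kurtosis units.
    """
    return (kurtosis - kurtosis_zero) * degrees_per_unit


def lag_series(years, values, lag_years=2):
    """Plot each rainfall value lag_years later than the year it belongs to."""
    return [(year + lag_years, value) for year, value in zip(years, values)]
```

With these settings, a kurtosis of -0.3 plots at 0° and a kurtosis of +0.2 plots at about 0.1°.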

Features that the two patterns have in common are:
* matching main peaks at 1897, 1942 and 2005, each higher than the one before;
* persistent low values in the 1910’s, 1920’s, 1950’s, 1960’s, 1970’s and early 1980’s;
* some matching minor peaks and troughs.

Regression rainfall kurtosis on HadCRUT4.

The correlation chart

The second graph is a correlation chart. The linear regression of kurtosis on detrended temperature has the reasonable R-squared value of 0.67.
As I have made it a connected scatterplot, you can see how the relation has changed through time. From the first data point in 1898 (in red) both variables decreased together to the lowest temperature in 1910. Both peaked in 1942, having risen since 1920, later falling until 1955-56. The final rise to the highest peak (2005) was continuous from 1984 for temperature, but the rise in kurtosis was not. It fell slightly in 1990, then remained static until 1998.
All rainfall figures actually came two years earlier. [See note below: “Manilla’s 2-year lead”.] The assigned two-year lag not only makes peaks match on the first graph. It sharpens the reversals on the second graph. On a trial connected scatterplot without lag, these reversals had been smooth clockwise curves.
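
For readers who want to check a regression like this without a spreadsheet, here is a minimal sketch of ordinary least-squares fitting with its R-squared value. It assumes plain Python lists of paired values; the function name is hypothetical, and the chart above was made with a spreadsheet trend line, not with this code.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: the share of the variance in y that the line accounts for.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

Here x would hold the detrended temperature values and y the lagged kurtosis values for the same plotted years.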

What it means

As evidence of extreme behaviour in climate

It is said that more extremes in climate will occur as the world becomes warmer. The evidence is not strong. Most data sets are overwhelmed by noise, and “extreme” is seldom defined with rigor.
In the present case, I believe that the definition of “extreme” that I use is sound: that is, the kurtosis of a frequency-distribution. The instability of kurtosis when based on my small samples had been an issue. In this revision I have increased the sample population size from 21 to 125.

My rainfall data set that displays more and less extreme behaviour is not general but local. It can merely suggest that data elsewhere may reveal functional relationships.

De-trended global temperature


Rainfall kurtosis vs. HadCRUT4 Scatterplots

These scatterplots and Connected Scatterplots support a relationship between the kurtosis of annual rainfall at Manilla NSW and the de-trended smoothed HadCRUT4 series of global temperatures.

Scatterplot rainfall kurtosis vs. HadCRUT all data

[SUPERSEDED
This post had inadequate data. It is now superseded by a section in the post “Rainfall kurtosis vs. HadCRUT4, revised” of 20 May 2018.]

The raw data, as observed

The first scatterplot compares (y-axis) all the calculated unsmoothed values of kurtosis of annual rainfall at Manilla, NSW with (x-axis) the unsmoothed values of the HadCRUT4 series of global near-surface temperature at those dates.
[I have plotted rainfall values lagged by five years on all of the scatterplots shown. This lagging makes little difference to the first two scatterplots.]

On this first graph, the fitted linear trend barely supports a positive relation of kurtosis to temperature. The slope is low (1.05) and the R-squared only 0.16. There is an aberrant cloud of points in the top left corner.

Scatterplot rainfall kurtosis vs. HadCRUT detrended (all data)

The raw data, HadCRUT4 de-trended

This graph takes a first step towards a better model for the relationship: the secular trend of the temperature series (that is, the global warming) is removed. For comparison, I have not re-scaled the x-axis.
Although still very weak, the relation is much enhanced. The slope (2.35) is more than twice as steep and the R-squared (0.24) is increased by 50%.

Connected Scatterplot rainfall kurtosis vs. HadCRUT all data

Smoothed data, HadCRUT4 de-trended

This third graph uses smoothed data. The HadCRUT4 series is “decadally-smoothed” (as published) with a 21-point binomial filter to remove high frequency noise. The rainfall data, already damped by its 21-year sampling window, has been further smoothed with a 9-point Gaussian filter.
This graph is a Connected Scatterplot, that shows the trajectory of the rainfall-temperature relation with the passing of time.

Line chart rainfall kurtosis vs. HadCRUT (detrended)

Smoothing both data sets has given a much closer relation. The R-squared value is almost doubled again, to 0.43, and the slope is increased to 3.70. The date labels show that the relation before 1910 was different from that at later dates. (This had also been clear in the Dual axis line chart, copied here, from the post “Rainfall Kurtosis Matches HadCRUT4”.)

Connected Scatterplot rainfall kurtosis vs. HadCRUT from 1908

Smoothed data, HadCRUT4 de-trended, from 1908 to 2002

In this final graph, I have discarded the first eleven years. The linear regression based on smoothed values from 1908 to 2002 has a steep slope of 5.21 and a respectable R-squared value of 0.84.

I had prepared similar graphs for lag values of rainfall kurtosis from zero up to nine. The lag value of five years tends to maximise the slope and the R-squared values.
Choice of a five-year lag tends to form hair-pin loops in the trace, while shorter lags give wider clockwise loops and longer lags give wider anti-clockwise loops.
The lag value of five years implies that the Manilla annual rainfall kurtosis value for a given year matches the de-trended HadCRUT value that occurs five years later.
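
The search over lag values can be sketched as follows. This is only an illustration, assuming each series is held as a dictionary keyed by year; the function names are hypothetical. A lag of n years pairs the kurtosis for a given year with the de-trended temperature n years later, as described above. The sketch scores each lag by R-squared alone, not by slope.

```python
def r_squared(x, y):
    """Squared Pearson correlation, equal to R-squared for a straight-line fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy * sxy / (sxx * syy)


def best_lag(rain_by_year, temp_by_year, lags=range(0, 10), min_pairs=3):
    """Score each lag by R-squared; return the best lag and all the scores."""
    scores = {}
    for lag in lags:
        # Pair kurtosis for year y with temperature for year y + lag.
        pairs = [(temp_by_year[year + lag], kurt)
                 for year, kurt in rain_by_year.items()
                 if year + lag in temp_by_year]
        if len(pairs) < min_pairs:
            continue  # too few overlapping years to score this lag
        temps, kurts = zip(*pairs)
        scores[lag] = r_squared(temps, kurts)
    return max(scores, key=scores.get), scores
```

A search like this shows only which lag fits best. It says nothing about the shape of the loops in the connected scatterplot, which still have to be judged by eye.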

[Back to the main post on this topic: “Rainfall kurtosis matches HadCRUT4”.]

Rainfall kurtosis matches HadCRUT4

Line chart rainfall kurtosis vs. HadCRUT (detrended)

The kurtosis of annual rainfall at Manilla NSW forms a time-series that matches the time-series of global surface temperature when de-trended.

[SUPERSEDED
This post had inadequate data. It is now superseded by the post “Rainfall kurtosis vs. HadCRUT4, revised” of 20 May 2018.]

Features of the data

Data sources, noted on the graph, are specified below. The best match is achieved by decadal smoothing, by scaling 1.0 units of kurtosis to 0.16 degrees of temperature, and by lagging the rainfall data five years.

Closeness of the match

Although both variables have irregular traces, their patterns are almost the same. They begin and end very high, have a broad peak near 1943, and are low in the 1910’s, 1920’s, 1950’s, 1960’s and 1970’s.
The match is very close for ninety years from 1915 to 2005, except for one decade (at 1972). In all this time, both the values and the slopes (as scaled) agree. [See the Note below “1991-1992”.]

Before 1915, the patterns do not match well, but they remain similar. Both traces descend rapidly together from 1903 to 1910. The initial peak in the rainfall trace at 1903 (actually 1898) is similar in height (as scaled) to a peak of the de-trended temperature trace just off the graph at 1879.

Discovering the pattern match

I was seeking a robust measure of the occurrence of extreme values in annual rainfall at Manilla, NSW. As kurtosis is just such a measure, I calculated it. I then plotted out the time-series, as shown here. It reminded me of the well-established time-series of smoothed HadCRUT4 global near-surface temperature. In particular, I recalled a locally-dominant peak near 1940.

Line chart rainfall kurtosis vs. HadCRUT
Simply reconciling the vertical scales of the two time-series gave me the second graph.
While not matching in details, the two curves remain very close from 1940 to 1995. Matching over the whole rainfall record is prevented by a difference in trend. While the rainfall kurtosis has no trend, the HadCRUT4 curve has a secular trend rising at half a degree per century (known as “global warming”).
To extend and improve the match, I subtracted the linear trend from the global temperature curve, and lagged the rainfall points by five years. The first graph is the closely-matching result.

What it means

As evidence of extreme behaviour in climate

It is said that more extremes in climate will occur as the world becomes warmer. The evidence is not strong. Most data sets are overwhelmed by noise, and “extreme” is seldom defined with rigor.
In the present case, I believe that the definition of “extreme” that I use is sound: that is, the kurtosis of a frequency-distribution. Only the instability of kurtosis when based on small samples is an issue.

My rainfall data set that displays more and less extreme behaviour is not general but local. It can merely suggest that data elsewhere may reveal functional relationships.

Connected Scatterplot rainfall kurtosis vs. HadCRUT from 1908

A very strong and persistent empirical relationship is shown by the graphical logs above. In another post, “Rainfall Kurtosis vs. HadCRUT4 Scatterplots”, I show scatterplots like this in support of it.

De-trended global temperature

This strong link between local annual rainfall kurtosis and global climate has a surprising feature. Although this extreme behaviour seems to relate to global temperature, it does not relate to global warming! It relates to a temperature trace from which the global warming trend has been removed. Times of high kurtosis, denoting enhanced extremes, correspond to times when the global temperature was highest above trend. Such times occurred not only in the twenty-first century, but equally in the nineteenth century. There was another (widely-known), lower peak in de-trended global temperature near 1940: at that time also kurtosis was above normal.

Should global temperature remain static for a time, it would be falling rapidly below its rising trend. According to this data set, that should bring reduced extreme behaviour in annual rainfall at Manilla.


Data Sources

(i) Global temperature time-series

From the three available century-long time series of global near-surface temperature I have chosen to use HadCRUT4, published by the British Met Office Hadley Centre. The link is here.

I selected from the section: “HadCRUT4 time series: ensemble medians and uncertainties”.
From this, I downloaded two files:
(i) “Global (NH+SH)/2, annual”;
(ii) “Global (NH+SH)/2, decadally smoothed”.
[The “Decadally smoothed” data supplied is annual data smoothed with a 21-point binomial filter.]
From each data file, I used only the first column: the year date, and the second column: the median value.

I established the secular trend of global warming using the linear trend function in Charts for “Excel”. I found the linear trend of the whole HadCRUT4 annual series data (1850 to 2016) to be:

y = 0.005x – 0.52.

I then subtracted the annual value at the trend line from the decadally smoothed HadCRUT4 value to get the de-trended smoothed value shown on the first graph.
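
The de-trending step can be sketched in Python as follows. This is an illustration only: the trend above came from the “Excel” chart function, and the post does not say what x stands for in the equation (calendar year, or years counted from the start of the series), so the sketch refits the line to the data it is given and does not reuse those coefficients. The function names are hypothetical.

```python
def fit_linear_trend(years, values):
    """Least-squares straight line through the annual series."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def detrend(years, smoothed_values, slope, intercept):
    """Subtract the trend-line value for each year from the smoothed value."""
    return [value - (intercept + slope * year)
            for year, value in zip(years, smoothed_values)]
```

The trend is fitted to the unsmoothed annual series and then subtracted from the decadally smoothed series, in the same order as described above.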

(ii) Kurtosis of Manilla annual rainfall

The rainfall data is that for Manilla Post Office, Station 055031 of the Australian Bureau of Meteorology. Station 055031 functioned without gaps from 1883 to March 2015. Since then, the official record is fragmentary.
I found kurtosis values for annual rainfall by using the (excess) kurtosis function in “Excel”. I used sub-populations of 21 successive years, referred to the median year. I found values for the years 1893 to 2006. I smoothed these values with a 9-point Gaussian filter (yielding similar smoothing to that of HadCRUT4). Smoothing reduced the plottable years to those from 1897 to 2002.
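
The 21-year windowed kurtosis can be reproduced outside a spreadsheet. The sketch below assumes the sample-corrected excess kurtosis formula that Excel documents for its KURT function, and assigns each window to its middle year. The function names are hypothetical, and the later 9-point Gaussian smoothing is left out because its weights are not given here.

```python
import math


def excel_kurt(values):
    """Sample excess kurtosis, following the formula documented for Excel's KURT."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least four values")
    mean = sum(values) / n
    # Sample standard deviation (divisor n - 1), as Excel uses.
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    fourth = sum(((v - mean) / s) ** 4 for v in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * fourth \
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def rolling_kurtosis(years, rainfall, window=21):
    """Kurtosis of each run of `window` successive years, keyed by middle year."""
    half = window // 2
    result = {}
    for i in range(half, len(rainfall) - half):
        result[years[i]] = excel_kurt(rainfall[i - half:i + half + 1])
    return result
```

For a record that begins in 1883, the first 21-year window is centred on 1893, which agrees with the first year quoted above.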

Manilla yearly rainfall history: four moments

I posted a line graph of this kurtosis data earlier, in “Moments of Manilla’s Yearly Rainfall History”.


Note: 1991-1992

The most striking match in the graph is that both traces pause at 1991-1992 within a two-decade-long steady rapid rise. That pause in the global temperature series has been attributed with little doubt to the injection into the atmosphere of seventeen million tonnes of sulphur dioxide by the eruption of Mount Pinatubo in the Philippines. That eruption cannot have affected the rainfall kurtosis five years earlier.

Kurtosis, Fat Tails, and Extremes

sketch demonstrating kurtosis

PLATYKURTIC left; LEPTOKURTIC right

Why must I explain “kurtosis”?

Manilla 21-year rainfall medians

The annual rainfall at Manilla, NSW has changed dramatically decade by decade since the record began in 1883. One way that it has changed is in the amount of rain each year, as shown in this graph that I posted earlier.

Another way, unrelated to the amount of rain, is in its kurtosis. Higher kurtosis brings more rainfall values that are extreme; lower kurtosis brings fewer. We would do well to learn more about rainfall kurtosis.

[A comment on the meaning of kurtosis by Peter Westfall is posted below.]

Describing Frequency Distributions

The Normal Distribution

Many things vary in a way that seems random: pure chance causes values to spread above and below the average.
If the values are counted into “bins” of equal width, the pattern is called a frequency-distribution. Randomness of this kind often makes the pattern approach the bell-shaped curve of the Normal Distribution.

Histogram of annual rainfall frequency at Manilla NSW

The values of annual total rainfall measured each year at Manilla have a frequency-distribution that is rather like that. This graph compares the actual distribution with a curve of Normal Distribution.

Moments of a Normal Distribution: (i) Mean, and (ii) Variance

The shape of any frequency-distribution is described in a simple way by a set of four numbers called moments. A Normal Distribution is described by just the first two of them.
The first moment is the Mean (or average), which says where the middle line of the values is. For Manilla annual rainfall, the Mean is 652 mm.
The second moment is the Variance, which is also the square of the Standard Deviation. This second moment says how widely spread or scattered the values are. For Manilla annual rainfall, the Standard Deviation is 156 mm.

Moments of other (non-normal) distributions: (iii) Skewness, and (iv) Kurtosis

The third moment, Skewness, describes how a frequency-distribution may have one tail longer than the other. When the tail on the right is longer, that is called right-skewness, and the skewness value is positive in that case. For the actual frequency-distribution of Manilla annual rainfall, the Skewness is slightly positive: +0.268. (That is mainly due to one extremely high rainfall value: 1192 mm in 1890.)
Kurtosis is the fourth moment of the distribution. It describes how the distribution differs from Normal by being higher or lower in its peak or its tails, as compared to its shoulders.
As it was defined at first, a Normal Distribution had the kurtosis value of 3, but I (and Excel) use the convention “excess kurtosis” from which 3 has been subtracted. Then the excess kurtosis value for a Normal Distribution is zero, while the kurtosis of other, non-normal distributions is either positive or negative.
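
To make the four moments concrete, here is a minimal sketch in Python. It uses the plain “population” definitions (dividing by the number of values), which is an assumption on my part: Excel applies small-sample corrections, so its skewness and kurtosis figures differ slightly for short series. The function name is hypothetical.

```python
def four_moments(values):
    """Mean, variance, skewness and excess kurtosis (population definitions)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    m2 = sum(d ** 2 for d in deviations) / n  # variance
    m3 = sum(d ** 3 for d in deviations) / n
    m4 = sum(d ** 4 for d in deviations) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a Normal Distribution
    return mean, m2, skewness, kurtosis - 3.0  # last item is excess kurtosis
```

For the flat set of values 1, 2, 3, 4, 5 this gives zero skewness and an excess kurtosis of -1.3, which is platykurtic, as expected for a distribution with no tails.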

Smoothed rainfall frequency and a platykurtic curve

Manilla’s total frequency distribution of annual rainfall has a Kurtosis of -0.427. As shown here (copied from an earlier post), I fitted a curve with suitably negative kurtosis to Manilla’s (smoothed) annual rainfall distribution.

Platykurtic, Mesokurtic, and Leptokurtic distributions

Karl Pearson invented the terms: platykurtic for (excess) kurtosis well below zero, mesokurtic for kurtosis near zero, and leptokurtic for kurtosis well above zero.
The sketch by W S Gosset at the top of this page shows the typical shapes of platykurtic and leptokurtic curves.
(See the Note below: ‘The sketch by “Student”‘.)

In the two graphs that follow, I show how a curve of Normal Distribution can be modified to be leptokurtic or platykurtic while remaining near-normal in shape. (See the note “Constructing the kurtosis adjuster”)
In both of these graphs, I have drawn the curve of Normal Distribution in grey, with call-outs to locate the mean point and the two “shoulder” points that are one Standard Deviation each side of the mean.

A leptokurtic curve

A leptokurtic curve

By adding the “adjuster curve” (red) to the Normal curve, I get the classical leptokurtic shape (green) as was sketched by Gosset. It has a higher peak, lowered shoulders, and fat tails. The shape is like that of a volcanic cone: the peak is narrow, and the upper slopes steep. The slopes get gentler as they get lower, but not as gentle as on the Normal Curve.

A platykurtic curve

A platykurtic curve


Global Warming Bent-Line Regression

HadCRUT global near-surface temperatures

HadCRUtemp2line

This graph, posted with permission, shows a bent line fitted to the HadCRUT annual data series for global near-surface temperature. Professor Thayer Watkins of San Jose State University Department of Economics posted it on his blog about 2009.

HadCRUTsmooth

Without knowing of this work, I constructed the second graph. I used data from the same HadCRUT source, but a data set smoothed by the authors.

In April 2013 I posted it to a forum thread in “weatherzone”.

Next, I added to that graph a logarithmic plot of global carbon emissions, similarly fitted with a series of straight trend lines.

Log from 1850 of world surface air temperature and carbon emissions

This I included in posts to several forums: in a post to “weatherzone”, in a post to the Alternative Technology Association forum, and finally in a post to this blog.

Both Professor Watkins and I have fitted bent lines to the data. I fitted the lines by eye (for which I was accused of “cherry-picking”). Professor Watkins used an explicit process of Bent-Line Regression, minimising the deviations by the method of least-squares. Like me, he initially chose by eye the dates of the change points where the straight lines meet. But he then adjusted them so as to minimise the least squares deviations.
[See notes below on the method of Bent-Line Regression.]
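
To show the idea of a least-squares bent line, here is a minimal sketch in Python. It is not Professor Watkins' procedure: it fits only one change point, and it tries every candidate date in a list where he adjusted dates first chosen by eye. The model is a continuous line y = a + b·x + c·max(0, x − t), where t is the change point. All names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_bent_line(x, y, candidates):
    """Try each candidate change point; keep the one with least squared error."""
    centre = sum(x) / len(x)  # centring x keeps the arithmetic well behaved
    best = None
    for t in candidates:
        below = sum(1 for xi in x if xi <= t)
        if below < 2 or len(x) - below < 2:
            continue  # need data on both sides of the bend
        design = [[1.0, xi - centre, max(0.0, xi - t)] for xi in x]
        ata = [[sum(r[i] * r[j] for r in design) for j in range(3)]
               for i in range(3)]
        aty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(3)]
        coeffs = solve_linear_system(ata, aty)
        sse = sum((yi - sum(c * v for c, v in zip(coeffs, r))) ** 2
                  for r, yi in zip(design, y))
        if best is None or sse < best[0]:
            best = (sse, t, coeffs)
    if best is None:
        raise ValueError("no candidate change point lies inside the data")
    sse, t, (_, slope, change) = best
    return {"change_point": t, "slope_before": slope,
            "slope_after": slope + change, "sse": sse}
```

A series with several bends, like the global temperature record, would need one extra max(0, x − t) term for each additional change point.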

The trend lines and change points are practically the same in the Thayer Watkins and the “Surly Bond” graphs:
1. (Up to Down) TW: 1881; SB: 1879.
2. (Down to Up) TW: 1911; SB: 1909.
3. (Up to Down) TW: 1940; SB: 1943.
4. (Down to Up) TW: 1970; SB: 1975.
As I said at the time, once straight trend lines are chosen, the dates of change points to fit this data series closely do not allow of much variation.

Relation to the IPO (or PDO) of the Pacific

Not by coincidence, Watkins and I both went on to relate the multi-decadal oscillations of Pacific Ocean temperatures to the global near-surface average temperatures.

My approach

I merely plotted my chosen global temperature change points on to the Pacific graphs. (I chose to cite the IPO, the Inter-decadal Pacific Oscillation.) In two posts I noted (i) the way the change points in the HadCRUT global temperature series were close to those in the IPO, and (ii) the way the IPO seemed able to explain why the trend in global warming was “bent” in 1943 and 1975 but, in that case, could only sharpen the bends of 1910 and 1880.

Professor Watkins’ approach

AGT_PDO7

Professor Watkins did a separate Bent Line Regression Analysis on the Pacific graphs. (He chose to cite the earlier-developed PDO, the Pacific Decadal Oscillation.) His analysis “A Major Source of the Near-Sixty Year Cycle in Average Global Temperatures is the Pacific (Multi)Decadal Oscillation” is here.

He admits the match is poor, with various lags and a different period. He concludes:
“Thus while the Pacific (Multi)Decadal Oscillation appears to be involved in the cycles of the average global temperature there have to be other factors also involved.”

The significance of the IPO


HadCRUT Global Temperature Smoothing

Graph of recent HadCRUT4

As a long-term instrumental record of global temperature, the HadCRUT4 series may be the best we have. [See Ole Humlum’s blog in the notes below.]
I like to use the published smoothed annual series of HadCRUT4. I find that this smoothing gets rid of the “noise” that makes graphs about global warming needlessly hard to read. I used the smoothed HadCRUT series to point out the curious inverse relation between the rate of warming and the rate of growth of carbon emissions in this post from 2014. I will refer again to that post in discussing the use of bent-line regression to describe global warming.

The Met Office Hadley Centre published the smoothing procedure that they used for the time series of smoothed annual average temperature in the HadCRUT3 data set. The smoothing function used is a 21-point binomial filter. The weights are specified in the link above.
The authors discuss the fudge that they use to plot smoothed values up to the current year, even though a validly smoothed value for that year would require ten years of data from future years. Their method is to continue the series by repeating the final value. They had added to the uncertainty by using a final value from just part of a year.
They relate how this procedure had caused consternation when the smoothed graph published in March 2008 showed a curve towards cooling, due to the final value used being very cool.
They show the effect by displaying the graph for that date.
They maintain that the unacceptable smoothed curve (because it shows cooling, not warming) is due mainly to using a final value from an incomplete year, saying:
“The way that we calculate the smoothed series has not changed except that we no longer use data for the current year in the calculation.”
That web-page is annotated:
“Last updated: 08/04/2008 Expires: 08/04/2009”
However, this appears to be the current procedure, used with the HadCRUT4 data set.
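
The smoothing procedure, as I understand it, can be sketched in Python. This assumes that the 21 weights are the binomial coefficients C(20, k) divided by 2^20, which is the usual meaning of a binomial filter. The series is extended at the end by repeating the final value, as the authors describe. Repeating the first value at the start of the series is my own assumption. The function names are hypothetical.

```python
from math import comb


def binomial_weights(points=21):
    """Binomial filter weights: C(points - 1, k) / 2 ** (points - 1)."""
    n = points - 1
    total = 2 ** n
    return [comb(n, k) / total for k in range(points)]


def smooth_with_end_padding(values, points=21):
    """Smooth a series, padding each end by repeating the end value."""
    half = points // 2
    weights = binomial_weights(points)
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [sum(w * padded[i + k] for k, w in enumerate(weights))
            for i in range(len(values))]
```

In this scheme the last ten smoothed points each depend on some repeated copies of the final value, and the very last point depends on ten of them.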

For my own interest, I plotted the values from 1990 to 2016 of the annual series of HadCRUT4, averaged over northern and southern hemispheres. [Data sources below.]

On my graph (above), all points 1990 to 2016 are as sourced. I have plotted raw values 2017 to 2026 (uncoloured) as I believe they are used in the smoothing procedure. I have also left uncoloured the smoothed data points from 2007 to 2016, to indicate that their values are not fully supported by data.

I agree with Ole Humlum that it is very good of the Met Office to come clean on the logical shortcomings of their procedure for smoothing, but it would be even better if they ceased plotting smoothed points when the smoothing depends on data points for future years.
In my monthly series of parametric plots of smoothed monthly values of climate anomaly variables, I have faced the same problem. I smooth using a 13-point Gaussian curve. My solution is to plot “fully-smoothed” data points (in colour) up to six months ago. That gives a consistent mapping up to that date. The fifth month before now (plotted uncoloured) is smoothed with an 11-point Gaussian and so on, up to the latest month with a necessarily unsmoothed value. A recent example of my parametric plots is “Cycling into drought”.
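
The shrinking-window scheme can be sketched as follows. The Gaussian weights here are an assumption: the width (sigma set to a quarter of the window span) is chosen only for illustration and is not the value used for my plots. Shrinking the window at the start of the series as well as at the end is also an assumption. The function names are hypothetical.

```python
import math


def gaussian_weights(points, sigma=None):
    """Normalised Gaussian weights for an odd number of points."""
    half = points // 2
    if half == 0:
        return [1.0]  # a 1-point window leaves the value unsmoothed
    if sigma is None:
        sigma = half / 2.0  # assumed width, for illustration only
    raw = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def smooth_shrinking_ends(values, max_points=13):
    """Use the full window where data allow; shrink it symmetrically near the ends."""
    max_half = max_points // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        # Widest symmetric window that needs no data beyond either end.
        half = min(max_half, i, n - 1 - i)
        weights = gaussian_weights(2 * half + 1)
        window = values[i - half:i + half + 1]
        smoothed.append(sum(w * v for w, v in zip(weights, window)))
    return smoothed
```

With a 13-point maximum, the point six months before the latest gets the full window, the point five months before gets 11 points, and the latest point is returned unsmoothed.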


Notes

1.
Ole Humlum’s blog “Climate4you”

[See: Index\Global Temperature\Recent global air temperature change, an overview\]

2.
HadCRUT4 data
Source of raw annual values:

Source of smoothed annual values:

El Niño and My Climate

ENSO and Manilla NSW temperature anomalies over sixteen years

Temperature

The first graph shows that the temperature at Manilla NSW agreed very closely with El Niño and La Niña temperatures for a good part of the last sixteen years.
The El Niño – Southern Oscillation (ENSO) is shown by NINO3.4 monthly anomaly values, and temperature at Manilla, NSW is smoothed monthly mean daily maximum temperature anomalies. (See the Note below.)
Values of Manilla temperatures agree with those of ENSO through the major temperature peaks and troughs in the spring seasons of 2002, 2006, 2007, 2009, and 2010. In the two highest peaks of 2002 and 2009 and the deep trough of 2010, Manilla temperature extremes were more than a month ahead of ENSO temperature extremes.
Since mid-2011, the two curves do not agree well:
* A La Niña in summer 2011-12 that was very weak produced the deepest of all troughs in Manilla temperature.
* An El Niño in winter 2012 resulted in heat at Manilla, but not until four months later.
* In spring 2013, when there was no El Niño at all, Manilla had a heat wave just like those that came with the El Niños of 2002 and 2009.
The record for ENSO since January 2013 is unlike that earlier this century: it flutters rather than cycles.
To show slower changes, I have drawn cubic trend lines for both of the variables. These also agree closely, with ENSO going from a maximum (2004) to a minimum (2011) seven years later. Manilla temperature trends remained ahead of ENSO temperature trends by one or two years.

Rainfall

ENSO and Manilla NSW rainfall anomalies over sixteen years.
