About GHCN Temperature Data

A post below this one presents a comparison between unadjusted and adjusted GHCN temperature data. This article provides some background information on these data and how they are treated in the Temperature Trend Analysis methodology.

GHCN acts as a global repository for surface weather station records submitted to it by National Weather Services (NWS). Each NWS reviews local records according to its own procedures and certifies that the data accurately represent the weather experienced in its jurisdiction. The GHCN version 3 qcu file is composed of these data (qcu stands for quality controlled, unadjusted). Various bloggers, ranging from E.M. Smith (chiefio) to Nick Stokes (moyhu), are satisfied that this file is close to the data submitted by NWS agencies.

The quality control consists of attaching flags to values appearing in the file. Because my home computer has limited power, I worked with the Taverage monthly datasets. There, a monthly value is flagged with an “a” if 1 daily value is missing from the average, “b” if 2 dailies are missing, and so on up to 9 omissions. With 10 or more dailies missing, the month is assigned “-9999”, indicating a blank for the month. An additional column beside each month identifies outlier values.
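As a small illustration of the flag scheme just described, a flag letter can be mapped back to the number of missing daily values. This is only a sketch: the function name is hypothetical, and it assumes that a blank flag means no dailies were missing.

```python
def missing_days(flag):
    """Return the number of missing daily values implied by a monthly flag."""
    flag = flag.strip()
    if not flag:
        return 0  # assumption: a blank flag means a complete month
    if len(flag) == 1 and "a" <= flag <= "i":
        # 'a' -> 1 missing daily, 'b' -> 2, ... 'i' -> 9
        return ord(flag) - ord("a") + 1
    raise ValueError(f"unexpected flag: {flag!r}")
```

A month with 10 or more missing dailies carries no such flag, since its value is replaced by -9999.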

My principle is to include all data unless there is good reason to exclude. The data preparation procedure involves unzipping the downloaded file and opening it as a Word document. The station records of interest are copied into a new Word document, which my notebook can handle without processing delays. The text data is then put into an Excel workbook and spread into cells; the -9999s are converted to blanks, and the flags and additional columns are removed.
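The same preparation can be sketched in a few lines of Python instead of Word and Excel. The layout used here is an assumption to be checked against the readme that accompanies the files: an 11-character station ID, a 4-digit year, a 4-character element code, then twelve 8-character groups (a 5-character value followed by three 1-character flags), with values in hundredths of a degree. The function name is hypothetical.

```python
MISSING = -9999  # sentinel for a blank month

def parse_record(line):
    """Split one fixed-width monthly record into (station_id, year, values).

    Assumed layout (verify against the dataset readme):
    ID in columns 1-11, year in 12-15, element in 16-19, then twelve
    groups of 8 characters: a 5-character value plus 3 flag characters.
    """
    station_id = line[0:11]
    year = int(line[11:15])
    values = []
    for month in range(12):
        start = 19 + 8 * month
        raw = int(line[start:start + 5])
        # Convert -9999 to a blank (None); drop the flags entirely.
        values.append(None if raw == MISSING else raw / 100.0)
    return station_id, year, values
```

Filtering the lines by station ID before parsing plays the role of copying the stations of interest into a separate document.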

My data quality assurance practices include scrutinizing each value more than 2 standard deviations away from the mean. I use CUSUM and first differences to test for step changes in the record, which would suggest a non-climatic change in the data (e.g. a change of equipment, procedure or location). In the US CRN#1 dataset I found no step changes, and the outlier values were few. I tested excluding some high or low values, but found no discernible effect on the slopes.
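These checks can be sketched as follows. The function names are hypothetical, the series is assumed to have blanks already removed, and the sample standard deviation is an assumption (the workbook may use a different convention).

```python
from statistics import mean, stdev

def outlier_indices(values, k=2.0):
    """Indices of values more than k sample standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > k * s]

def cusum(values):
    """Cumulative sum of deviations from the series mean.

    A sustained step change shows up as a pronounced peak or trough
    near the point where the level shifts.
    """
    m = mean(values)
    total = 0.0
    result = []
    for v in values:
        total += v - m
        result.append(total)
    return result

def first_differences(values):
    """Change from each value to the next; a step appears as a single spike."""
    return [b - a for a, b in zip(values, values[1:])]
```

For a series that steps from 0 to 1 halfway through, the CUSUM reaches its minimum at the last point before the step, and the first differences are zero everywhere except at the step itself.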

The same procedure was followed for the qca file (quality controlled adjusted). This meant adding two additional sheets into each station’s data workbook, examples of which are provided through links below.

In the US CRN#1 unadjusted workbook, there is a sheet for each station with the data pasted into a template that calculates several measures. The basic analysis is to compute the slopes for each month (Jan, Feb, etc.) over the lifetime of that station. The 12 slopes are then averaged for the station trend. In addition, trends are calculated for several shorter periods of interest, again by combining 12 monthly slopes for that station period.  A summary page brings together results from all the stations and generates averages of trends for the set of stations, by months, and by periods of years.
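The basic calculation can be sketched outside the workbook as below. The names are hypothetical, and an ordinary least-squares fit is assumed for the slope; the template's exact formula is not stated here.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def station_trend(records, start=None, end=None):
    """Average of the 12 monthly slopes, in degrees C per century.

    records maps year -> list of 12 monthly values (None for a blank).
    start and end optionally restrict the fit to a shorter period.
    """
    slopes = []
    for month in range(12):
        xs, ys = [], []
        for year in sorted(records):
            if start is not None and year < start:
                continue
            if end is not None and year > end:
                continue
            value = records[year][month]
            if value is not None:  # blanks are simply skipped
                xs.append(year)
                ys.append(value)
        slopes.append(ols_slope(xs, ys))
    # Slopes are per year; multiply by 100 for a per-century rate.
    return 100.0 * sum(slopes) / 12
```

Calling `station_trend(records, 1976, 1998)` gives the trend for one of the shorter periods. A month with fewer than two reported values in the chosen period would need separate handling, which this sketch omits.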

Data workbooks for two stations are provided here:
350412 Baker City, Oregon
417945 San Antonio, Texas

Adjustments Multiply Warming at US CRN1 Stations

A study of US CRN1 stations, top-rated for their siting quality, shows that GHCN adjusted data produces warming trends several times larger than unadjusted data.

The unadjusted files from ghcn.v3.qcu have been scrutinized for outlier values, and for step changes indicative of non-climatic biases. In no case was the normal variability pattern interrupted by step changes. Coverage was strong: the typical history exceeds 95%, and some reach 100% (measured as the percentage of months with a reported Tavg value out of the total months in the station’s lifetime).
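The coverage measure can be written out as a short sketch. The function name is hypothetical, and the station's lifetime is assumed to run from the first to the last year present in the record.

```python
def coverage(records):
    """Percentage of months with a reported value over the station lifetime.

    records maps year -> list of 12 monthly values (None for a blank).
    Years with no row at all count as 12 missing months.
    """
    first, last = min(records), max(records)
    total_months = (last - first + 1) * 12
    reported = sum(1 for values in records.values()
                   for v in values if v is not None)
    return 100.0 * reported / total_months
```

Because the denominator is based on the span of years rather than the rows present, a year whose row is absent lowers the coverage just as twelve blank months would.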

The adjusted files are another story. Typically, years of data are deleted, often several years in a row. Entire rows are erased including the year identifier, so finding the missing years is a tedious manual process looking for gaps in the sequence of years. All stations except one lost years of data through adjustments, often in recent years. At one station, four years of data from 2007 to 2010 were deleted; in another case, 5 years of data from 2002 to 2006 went missing. Strikingly, 9 stations that show no 2014 data in the adjusted file have fully reported 2014 in the unadjusted file.
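Finding the missing years need not be manual. A minimal sketch, with a hypothetical function name, that lists the gaps in a sequence of year identifiers:

```python
def missing_years(years):
    """Return the years absent between the first and last year present."""
    present = set(years)
    return [y for y in range(min(present), max(present) + 1)
            if y not in present]

# Illustrative input only: a record that jumps from 2006 to 2011.
print(missing_years([2005, 2006, 2011, 2012]))  # [2007, 2008, 2009, 2010]
```

Years missing at the very end of a record, such as an absent 2014, are not gaps in the sequence and have to be found by comparing the last year against the unadjusted file.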

It is instructive to see the effect of adjustments upon individual stations. A prime example is 350412 Baker City, Oregon.

Over 125 years, GHCN v3 unadjusted data show a trend of -0.0051 °C/century, while the adjusted data show +1.48 °C/century. How does the difference arise? The coverage is about the same, though 7 years of data are dropped in the adjusted file. However, the values are systematically lowered in the adjusted version: the average annual temperature is +6 °C +/- 2 °C for the adjusted file versus +9.4 °C +/- 1.7 °C unadjusted.

How then is a warming trend produced? In the distant past, prior to 1911, adjusted temperatures are cooler, decade by decade, by more than 2 °C in each month. The adjustment changes to -1.8 °C for 1912-1935, then to -2.2 °C for 1936-1943. It ranges from -1.2 to -1.5 °C over 1944-1988, then changes to -1 °C. From 2002 onward, adjusted and unadjusted values are the same.

Some apologists for the adjustments have stated that cooling is done as much as warming. Here it is demonstrated that by cooling selectively in the past, a warming trend can be created, even though the adjusted record ends up cooler on average over the 20th Century.
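The mechanism can be illustrated with a synthetic example. The series below is perfectly flat and is not station data; the adjustment schedule only approximates the Baker City periods described above (the period boundaries, the midpoint used for 1944-1988, and the 1890 start year are all assumptions made for illustration).

```python
def adjustment(year):
    """Approximate adjustment in degrees C by period (illustrative only)."""
    if year < 1912:
        return -2.0
    if year <= 1935:
        return -1.8
    if year <= 1943:
        return -2.2
    if year <= 1988:
        return -1.35  # assumed midpoint of the -1.2 to -1.5 range
    if year <= 2001:
        return -1.0
    return 0.0

def slope_per_century(years, values):
    """Ordinary least-squares slope, scaled to degrees C per century."""
    n = len(years)
    mx = sum(years) / n
    my = sum(values) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(years, values))
    sxx = sum((x - mx) ** 2 for x in years)
    return 100.0 * sxy / sxx

years = list(range(1890, 2015))   # 125 years, hypothetical span
flat = [9.4] * len(years)         # synthetic flat series, zero trend
adjusted = [t + adjustment(y) for t, y in zip(flat, years)]

print(slope_per_century(years, flat))      # effectively zero
print(slope_per_century(years, adjusted))  # positive: a warming trend
print(sum(adjusted) / len(adjusted) < sum(flat) / len(flat))  # cooler on average
```

Every adjustment here is zero or negative, so the adjusted series is cooler on average, yet its fitted slope is positive because the cooling is concentrated in the early years.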

A different kind of example is provided by 417945 San Antonio, Texas. Here the unadjusted record had complete (100%) coverage, and the adjustments deleted 262 months of data, reducing the coverage to 83%. In addition, the past was cooled, with adjustments ranging from -1.2 °C per month in 1885 and gradually shrinking to -0.2 °C by 1970. These cooling adjustments were minor, reducing the average annual temperature by only 0.16 °C. Due to deleted years of data, San Antonio went from an unadjusted trend of +0.30 °C/century to an adjusted trend of +0.92 °C/century, tripling the warming at that location.

The overall comparison for the set of CRN1 stations:

Area              FIRST CLASS US STATIONS
History           1874 to 2014
Stations          23
Ave. Length       119 Years

Dataset           Unadjusted   Adjusted
Average Trend        0.18        0.76     °C/Century
Std. Deviation       0.66        0.54     °C/Century
Max Trend            1.18        1.91     °C/Century
Min Trend           -2.00       -0.48     °C/Century

These stations are sited away from urban heat sources, and the unadjusted records reveal a diversity of local climates, as shown by the standard deviation and the contrasting Max and Min results. Six stations showed clearly negative trends over their lifetimes, and a seventh (Baker City, at -0.01) was marginally negative.

Adjusted data reduces the diversity and shifts the results toward warming. The average trend is about 4 times larger, and only 2 stations show any cooling, both at smaller rates. Many stations had warming rates increased by multiples of the unadjusted rates. Whereas 4 months had negative trends in the unadjusted dataset, no months show cooling after adjustments.

Periodic Rates from US CRN1 Stations

Start   End    Unadjusted   Adjusted
               °C/Century   °C/Century
1915    1944      1.22        1.51
1944    1976     -1.48       -0.92
1976    1998      3.12        4.35
1998    2014     -1.67       -1.84
1915    2014      0.005       0.68

Looking at periodic trends within the series, it is clear that adjustments at these stations increased the trend over the last 100 years from flat to +0.68 C/Century. This was achieved by reducing the cooling mid-century and accelerating the warming prior to 1998.

Surfacestations.org provides a list of 23 stations that have the CRN#1 rating for the quality of their sites. I obtained the records from the latest GHCNv3 monthly qcu report, did my own data quality review and built a Temperature Trend Analysis workbook. I made a companion workbook using the GHCNv3 qca report. Both datasets are available here: ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/

As it happens, the stations are spread out across the continental US (CONUS): NW: Oregon, North Dakota, Montana; SW: California, Nevada, Colorado, Texas; MW: Indiana, Missouri, Arkansas, Louisiana; NE: New York, Rhode Island, Pennsylvania; SE: Georgia, Alabama, Mississippi, Florida.

In conclusion, it is a matter of concern not only that individual station histories are altered by adjustments, but also that the adjusted dataset is the one used as input into programs computing global anomalies and averages. This much diminished dataset does not inspire confidence in the temperature reconstruction products built upon it.

Update

In response to a comment, this update shows the effect of GHCN adjustments on each of the 23 stations. Comparing adjusted to unadjusted records, the average station was warmed by +0.58 °C/Century, from +0.18 to +0.76.

19 station records were warmed, 6 of them by more than +1 C/century. 4 stations were cooled, most of the total cooling coming at one station, Tallahassee.

So for this set of stations, adjustments produced warming at 19 of 23 stations, or 83%.

Station                   Years in  Unadjusted    Adjusted      Adjusted - Unadjusted
                          Record    Stn Trends    Stn Trends    Stn Trends
                                    °C/Century    °C/Century    °C/Century
351862 CORVALLIS             125        0.38          1.05          0.67
350412 BAKER CITY            125       -0.01          1.48          1.49
51564 CHEYENNE WELLS         118        0.84          1.18          0.34
83186 FT MYERS               121        1.18          1.05         -0.12
121873 CRAWFORDSVILLE        115       -2.00         -0.43          1.57
97847 SAVANNAH               141       -0.09          0.56          0.65
42941 FAIRMONT                93        0.90          1.91          1.01
48702 SUSANVILLE             119        0.04          0.84          0.81
80211 APALACHICOLA           111       -0.07          0.95          1.02
86997 PENSACOLA              135        0.27          0.10         -0.17
88758 TALLAHASSEE            123        0.07         -0.48         -0.54
160549 BATON ROUGE           122       -0.07          0.74          0.81
226177 NATCHEZ               121       -0.78          0.59          1.37
238466 TRUMAN DAM & RSVR     122       -0.56          0.62          1.19
245690 MILES CITY            123        0.28          0.31          0.03
269171 WINNEMUCCA            137        0.39          1.11          0.71
308383 SYRACUSE              112        0.78          0.73         -0.05
322188 DICKINSON             120        0.59          0.69          0.11
325479 MANDAN                102        0.43          0.68          0.25
369728 WILLIAMSPORT          120        0.10          1.01          0.92
376698 PROVIDENCE            130        0.68          1.25          0.56
417945 SAN ANTONIO           130        0.30          0.92          0.62
15749 MUSCLE SHOALS           74        0.54          0.58          0.04
Averages                                0.18          0.76          0.58
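
The counts quoted in this update can be re-derived from the last column of the table. The sketch below simply copies the 23 adjusted-minus-unadjusted values as printed (already rounded to two decimals); the variable names are illustrative.

```python
# Adjusted minus unadjusted trends (°C/Century), in table order.
diffs = [0.67, 1.49, 0.34, -0.12, 1.57, 0.65, 1.01, 0.81, 1.02, -0.17,
         -0.54, 0.81, 1.37, 1.19, 0.03, 0.71, -0.05, 0.11, 0.25, 0.92,
         0.56, 0.62, 0.04]

warmed = sum(1 for d in diffs if d > 0)
cooled = sum(1 for d in diffs if d < 0)
over_one = sum(1 for d in diffs if d > 1.0)
mean_diff = sum(diffs) / len(diffs)

print(warmed, cooled, over_one)           # 19 4 6
print(round(100 * warmed / len(diffs)))   # 83
print(round(mean_diff, 2))                # 0.58
```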

The Excel workbooks with data and analyses are provided for your interest and review.

US CRN1 Adjusted TTA 2014
US CRN1 Unadjusted TTA2 2014