Station files

Update (23.1.04): It has come to my attention that the station count files in CRU TS 2.0 are not as helpful as they could be. These station counts include all stations with a valid absolute datum within range of the grid-box. However, stations that lack a normal for 1961-90, or that overlap another station, may have their data discarded before gridding. A better version of this information (i.e. counting only stations that contribute to the final grids) will be provided for the next version of the data-set (CRU TS 2.1).


Introduction
The issue of using CRU TS 2.0 for time-series analysis has been addressed more explicitly and fully. We recognise that many users would appreciate information on the changing network of stations that contribute to the value at an individual grid-box. While we are not able to release the station data (we are subject to non-disclosure agreements), what we can do is provide information on the number of stations within range of a grid-box.

Therefore we have released additional files that record the number of stations within range of a grid-box. This page documents these station files.

Availability
These station files are available when users download the data files. See the application page for access to all the files.

Format
These station files are in the same format as the data files. They are large! They may be downloaded as one file per climate variable, or as ten decadal files for each climate variable.

Information
See the time-series analysis page for the purposes for which we envisage these station files being used, and the background to their construction.

It is important to recognise that the station files do not record the number of stations contributing information to a grid-box, but the number of stations within range of a grid-box. One may think of these files as showing the number of stations with information upon which the grid-box may draw, if necessary.

The 'range' referred to is the correlation decay distance assumed to apply to the climate variable. These correlation decay distances were published in New et al (2000):

variable           correlation decay distance
cloud cover        600 km
DTR                750 km
precipitation      450 km
temperature        1200 km
vapour pressure    1000 km

Explanation
We used the correlation decay distances to define the range for the following reason. In constructing a monthly grid, we base the interpolation on the set of available station anomalies. The interpolation is spatially complete. If we took no additional action, it would be possible for a grid-box (X) many thousands of km from the nearest station (Y) to take the value of that station. This would be unreasonable, because there is no reason to suppose that the anomaly at grid-box X is related to the anomaly at station Y. We assume instead that, if there is no adjacent station information available, the best estimate for grid-box X is the long-term average value; in other words, a zero anomaly.

To implement this assumption we add 'blank stations' with zero anomalies where there are large areas without genuine stations; thus, when we subsequently interpolate, areas without station information are 'relaxed to the climatology'. (This has important implications for time-series analysis.) We judge an area to be sufficiently large to require blank stations when the correlation decay distance is exceeded. The correlation decay distance provides an approximate measure of the area over which the station anomaly provides some reasonable indication of the likely anomaly.
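
The idea of blank stations can be illustrated with a minimal Python sketch. It is not the procedure used to build CRU TS 2.0: the function names, the list of candidate locations, the haversine distance and the Earth radius are all assumptions made for illustration. The sketch adds a zero-anomaly blank station at every candidate location that has no genuine station within the correlation decay distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_blank_stations(stations, candidates, cdd_km):
    """Append zero-anomaly blank stations where no genuine station is in range.

    stations   -- list of (lat, lon, anomaly) for genuine stations
    candidates -- list of (lat, lon) where a blank station may be placed
                  (hypothetical; the source does not say how these are chosen)
    cdd_km     -- correlation decay distance for the variable, in km
    """
    blanks = []
    for clat, clon in candidates:
        in_range = any(
            great_circle_km(clat, clon, slat, slon) <= cdd_km
            for slat, slon, _ in stations
        )
        if not in_range:
            # No genuine information nearby: relax to the climatology.
            blanks.append((clat, clon, 0.0))
    return list(stations) + blanks


# One genuine precipitation station and two candidate locations.
stations = [(52.0, 0.0, 1.5)]
candidates = [(53.0, 1.0), (0.0, 100.0)]
augmented = add_blank_stations(stations, candidates, cdd_km=450)
```

Interpolating over the augmented list rather than the original one is what pulls data-sparse regions towards a zero anomaly.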

Therefore the correlation decay distance may be used as an approximate measure of the network of stations that are within range of a grid-box, and which may contribute to the value at that grid-box. The station files were produced by interrogating the station databases for stations that contributed to the climate grids, and counting all the stations within range of each grid-box, for each of the 1200 months in 1901-2000.
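
The counting step for a single month might look something like the following Python sketch. It assumes that distance is measured as a great-circle distance from the grid-box centre to the station, which this page does not state; the function names, the Earth radius and the example coordinates are likewise hypothetical. The decay distances are those listed in the table above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

# Correlation decay distances in km, as listed on this page.
CDD_KM = {
    "cloud cover": 600,
    "DTR": 750,
    "precipitation": 450,
    "temperature": 1200,
    "vapour pressure": 1000,
}


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_stations_in_range(box_centres, station_coords, variable):
    """Count, for each grid-box centre, the stations within the decay distance.

    box_centres    -- list of (lat, lon) grid-box centres
    station_coords -- list of (lat, lon) for stations reporting in one month
    variable       -- key into CDD_KM
    """
    cdd_km = CDD_KM[variable]
    return [
        sum(
            1
            for slat, slon in station_coords
            if great_circle_km(blat, blon, slat, slon) <= cdd_km
        )
        for blat, blon in box_centres
    ]


# Two grid-box centres and three stations (made-up coordinates).
boxes = [(52.25, 0.25), (0.25, 100.25)]
stations = [(52.0, 0.0), (51.5, -0.1), (45.0, 5.0)]
precip_counts = count_stations_in_range(boxes, stations, "precipitation")
temp_counts = count_stations_in_range(boxes, stations, "temperature")
```

Because temperature has a much longer decay distance than precipitation, the same network yields a higher count for temperature at the first grid-box. Repeating the count for every month would give one field per month, matching the layout described above.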