Format of CRU TS Observation files

CRU TS observation files are monolithic, with one file per variable. They are plain text files and can be opened with any sufficiently capable text editor.

Each station record consists of one header line, one normals line, then one line of monthly data for each year of that station's timespan. Records are therefore of variable length, but the start and end years in the header line allow the record length to be determined. Station records are concatenated together, one after the other.

Observations are stored as integers; this is a legacy format designed to save storage space. Typically, actual values are obtained by dividing by 10; the exceptions are WET days and FRS days, which must be divided by 100.

Here is a typical station record for precipitation, showing the header and normals lines plus the first few years of observations:

0305900  5750   -420    4 INVERNESS            UK            1781 1994
6190-9999-9999-9999-9999-9999-9999-9999-9999-9999-9999-9999-9999
1781-9999-9999-9999-9999-9999-9999-9999  686  348  135-9999-9999
1782  559  439  665  765  930  559  315 1676  800  813  787   84
1783  914  432  686  216  203  787  584  889  762  635  241  279

The header line breaks down like this (format in square brackets):

0305900  5750   -420    4 INVERNESS            UK            1781 1994

Columns  Format  Field
 1-7     [i7]    WMO code, in this case 03 059, packed with least-significant zeros
 9-13    [i5]    Latitude, in degrees x100 (this is 57.5°N)
15-20    [i6]    Longitude, in degrees x100 (this is 4.2°W)
22-25    [i4]    Altitude, in m
27-46    [a20]   Station Name
48-60    [a13]   Country Name
62-65    [i4]    Start Year
67-70    [i4]    End Year

Each field in the header is separated by a single space, so in Fortran terms the format is '(i7,1x,i5,1x,i6,1x,i4,1x,a20,1x,a13,2(1x,i4))'.
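As an illustration, the fixed-width header layout can be unpacked by slicing at the column positions implied by the Fortran format. The Python sketch below is not part of the CRU TS distribution: the names StationHeader, parse_header and SAMPLE_HEADER are hypothetical, and keeping the WMO code as a string (to preserve leading zeros) is an assumption about what a reader of the file would want.

```python
from typing import NamedTuple


class StationHeader(NamedTuple):
    """Hypothetical container for the eight header fields."""
    wmo_code: str       # kept as text so leading zeros survive (assumption)
    latitude: float     # degrees, after dividing the stored value by 100
    longitude: float    # degrees, after dividing the stored value by 100
    altitude_m: int
    name: str
    country: str
    start_year: int
    end_year: int


def parse_header(line: str) -> StationHeader:
    """Slice a header laid out as (i7,1x,i5,1x,i6,1x,i4,1x,a20,1x,a13,2(1x,i4))."""
    line = line.rstrip("\n")
    return StationHeader(
        wmo_code=line[0:7].strip(),
        latitude=int(line[8:13]) / 100.0,
        longitude=int(line[14:20]) / 100.0,
        altitude_m=int(line[21:25]),
        name=line[26:46].strip(),
        country=line[47:60].strip(),
        start_year=int(line[61:65]),
        end_year=int(line[66:70]),
    )


# The INVERNESS header from the example record, padded to the a20 and a13 widths.
SAMPLE_HEADER = (
    "0305900  5750   -420    4 "
    + "INVERNESS".ljust(20) + " "
    + "UK".ljust(13)
    + " 1781 1994"
)

if __name__ == "__main__":
    print(parse_header(SAMPLE_HEADER))
```

Slicing by column, rather than splitting on whitespace, is used here because station and country names may themselves contain spaces.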
The normals line looks like this:

6190-9999-9999-9999-9999-9999-9999-9999-9999-9999-9999-9999-9999

Columns  Format  Field
 1-4     [i4]    Always 6190, the normals period for CRU TS
 5-9     [i5]    January normal, here the missing value (-9999) is substituted
10-14    [i5]    February normal, as above
(etc)

Fields in the normals line, and subsequent data lines, are not separated by spaces. Again, this is a legacy format to save storage space. So, in Fortran the format is '(i4,12i5)'. Note that the normals line is mainly unpopulated, as normals are generally constructed from the observations at run time.

Finally, the first data line:

1781-9999-9999-9999-9999-9999-9999-9999  686  348  135-9999-9999

Columns  Format  Field
 1-4     [i4]    The year of observation
 5-9     [i5]    January observation, here the missing value (-9999) is substituted
10-14    [i5]    February observation, as above
(etc)
40-44    [i5]    August observation, 68.6mm
45-49    [i5]    September observation, 34.8mm
(etc)

As before, the format is '(i4,12i5)'.

Reading the files:

Start of loop
  -> read header (finish at EOF)
  -> read start + end years from header, calculate 'n. years'
  -> read normals line
  -> read 'n. years' lines of observations
End of loop

As usual, please direct any questions or bug reports to:

Ian Harris
Climatic Research Unit
School of Environmental Sciences
University of East Anglia
UK
NR2 4HG
i.harris@uea.ac.uk
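The reading loop described above could be sketched in Python roughly as follows. This is an illustrative sketch rather than the CRU's own reader. It assumes that the number of data lines is end year minus start year plus one (both years inclusive), that a blank line or end of file terminates the loop, and that missing values (-9999) are best returned as None. The names parse_values, read_stations and SAMPLE are hypothetical, and the sample header's end year has been shortened to 1783 so that the three-year excerpt forms a complete record.

```python
import io
from typing import Dict, Iterator, List, Optional, TextIO, Tuple

MISSING = -9999  # missing-value code used in normals and data lines


def parse_values(line: str, scale: float) -> Tuple[int, List[Optional[float]]]:
    """Split a '(i4,12i5)' line into its leading integer and 12 scaled values."""
    label = int(line[0:4])  # the year, or 6190 on the normals line
    values: List[Optional[float]] = []
    for month in range(12):
        raw = int(line[4 + 5 * month: 9 + 5 * month])
        values.append(None if raw == MISSING else raw / scale)
    return label, values


def read_stations(
    stream: TextIO, scale: float = 10.0
) -> Iterator[Tuple[str, List[Optional[float]], Dict[int, List[Optional[float]]]]]:
    """Yield (header line, normals, {year: monthly values}) for each station.

    Pass scale=100.0 for WET days and FRS days files.
    """
    while True:
        header = stream.readline()
        if not header.strip():  # EOF, or a blank line (assumed to end the file)
            break
        start_year = int(header[61:65])
        end_year = int(header[66:70])
        n_years = end_year - start_year + 1  # assumption: both years inclusive
        _, normals = parse_values(stream.readline(), scale)
        data: Dict[int, List[Optional[float]]] = {}
        for _ in range(n_years):
            year, values = parse_values(stream.readline(), scale)
            data[year] = values
        yield header.rstrip("\n"), normals, data


# Example record from this document, with the end year shortened to 1783.
SAMPLE = (
    "0305900  5750   -420    4 " + "INVERNESS".ljust(20) + " "
    + "UK".ljust(13) + " 1781 1783\n"
    "6190" + "-9999" * 12 + "\n"
    "1781" + "-9999" * 7 + "  686  348  135-9999-9999\n"
    "1782  559  439  665  765  930  559  315 1676  800  813  787   84\n"
    "1783  914  432  686  216  203  787  584  889  762  635  241  279\n"
)

if __name__ == "__main__":
    for header, normals, data in read_stations(io.StringIO(SAMPLE)):
        print(header[26:46].strip(), sorted(data))
```

For a real file, the stream would be an open file handle, for example open(path) with a path of your choosing, in place of the io.StringIO wrapper.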