NetCDF variable attributes for climate indices CF - IS-ENES

Climate Information Platform for Copernicus
Data variable attributes for climate indices
or
Climate & Forecasting (CF) convention:
how far will it help us?
Lars Bärring, Rossby Centre SMHI
with input from Milka Radojevic, CERFACS
2nd Joint CLIPC/IS-ENES2 Workshop on Metadata/DRS for climate indices
Brussels 17 October 2016
Disclaimer
Examples in this presentation (and my experience) are mainly
drawn from thinking about the ETCCDI and ET-SCI indices,
but I am very well aware of the ECA indices and other initiatives.
The overall ambition is of course that the outcome of this workshop
should be seen as a generally applicable resource.
Lars Bärring, SMHI Rossby Centre. Presentation at the 2nd Joint CLIPC/IS-ENES2 Workshop on Metadata/DRS for climate indices, Brussels 17 October 2016
Issues
• Since the 1st workshop in February 2016, many issues have been solved,
  and have been – or are being – implemented in code
… but several remain:
• Index names (as widely known) vs. netCDF variable names
• One index ('one file') – several alternative thresholds (for users' exploratory analyses)
• CF standard names for climate indices based on variables (CF peculiarities?)
   -- canonical units
   -- standard name of the threshold variable itself
   -- climatological time axis
• More complex indices are difficult (impossible?) to handle without substantial work
  and revisions to the CF convention
Index names (as widely known) vs. netCDF variable names
• The CF convention does not standardize variable names, but …
• “Variable […] names should begin with a letter and be composed of letters,
  digits, and underscores.”
   -- don't use hyphens, and don't begin with a digit
   -- don't use underscores (because of DRS syntax)
• “Case is significant in netCDF names, but it is recommended that names
  should not be distinguished purely by case, i.e., if case is disregarded, no two
  names should be the same.”
• CMIP5/6 and CORDEX only use names in lowercase
   -- contrary to what is customary for many indices
• During the first workshop (and after) we came a long way towards extending
  and generalising existing index names (acronyms) and aligning them with CF
  convention requirements
• To discuss: always lowercase or partly uppercase,
  e.g. TX90p vs. tx90p, TNltm10 vs. tnltm10, etc.
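The naming rules above can be checked mechanically. The following is a minimal sketch, not part of any existing tool: the function names are hypothetical, and it assumes that a candidate name may contain only letters and digits (no hyphens, no underscores) and must start with a letter.

```python
import re

# Assumption: only letters and digits are allowed, starting with a letter.
# Hyphens are excluded by the CF recommendation, underscores by the DRS syntax.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_index_name(name: str) -> bool:
    """Return True if the name follows the rules listed on this slide."""
    return NAME_PATTERN.fullmatch(name) is not None


def case_collisions(names):
    """Return groups of names that would be identical if case is disregarded."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


print(is_valid_index_name("TX90p"))     # True
print(is_valid_index_name("tx_90p"))    # False (underscore)
print(case_collisions(["TX90p", "tx90p", "TNltm10"]))
```

Such a check does not settle the lowercase vs. partly uppercase question; it only flags names that differ purely by case.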
One index (one file) – variable thresholds (for exploration)
Currently the index names focus on a fixed threshold, e.g. SU (>25°C),
which has the generic name TXgt25, or even more generally TXgt#, where “#” is a
number (with “m” instead of “-” for negative numbers).
The CF convention allows indices with a varying threshold constant to be stored in
one file. This means that the “#” is in fact a range of numbers.
To discuss: how to handle this?
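To make the naming pattern concrete, here is a small sketch that builds generic names of the form TXgt# from a variable, an operator code and a threshold. The helper `index_name` is hypothetical and assumes integer thresholds; it only illustrates that one file with several thresholds corresponds to several such names.

```python
def index_name(variable: str, operator: str, threshold: int) -> str:
    """Build a generic index name such as TXgt25 or TNltm10.

    `operator` is a short code like "gt" or "lt". Negative thresholds are
    written with "m" instead of "-", as described on this slide.
    """
    sign = "m" if threshold < 0 else ""
    return f"{variable}{operator}{sign}{abs(threshold)}"


# One file holding several alternative thresholds for the same index
thresholds = [25, 30, 35]
names = [index_name("TX", "gt", t) for t in thresholds]
print(names)                         # ['TXgt25', 'TXgt30', 'TXgt35']
print(index_name("TN", "lt", -10))   # TNltm10
```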
CF core concepts with respect to the data variable
• The attribute standard_name should be attached to the data variable
   -- highly desirable (but not required): it provides a standardised and precise
      definition of the data
   -- standard names are controlled and come with a canonical unit and a detailed
      description
   -- standard names are often linked to cell methods that provide a brief standardised
      explanation of how the variable is calculated (over time and space), plus
      additional free text (guidelines apply)
• The standard name (and cell method) convey essential information in a
  standardised way.
• The standard name is not always easy and straightforward to understand in wider
  circles. Therefore the attribute long_name is intended to “contain a long
  descriptive name which may, for example, be used for labeling plots”. It is used by
  data and discovery services like ESGF, the CLIPC portal and the climate4impact.eu portal
To discuss: do we want to recommend harmonised long names?
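As an illustration of how these attributes fit together, the sketch below lists a possible attribute set for a "summer days" (SU) data variable and checks that none of the key attributes is missing. The long_name text and the helper `missing_attributes` are assumptions made for this example, not agreed recommendations.

```python
# Hypothetical attribute set for a "summer days" (SU) data variable.
# The long_name wording is illustrative only, not a harmonised long name.
su_attributes = {
    "standard_name": "number_of_days_with_air_temperature_above_threshold",
    "long_name": "Number of summer days (daily maximum temperature > 25 degC)",
    "units": "1",
    "cell_methods": "time: maximum within days time: sum over days",
}

KEY_ATTRIBUTES = ("standard_name", "long_name", "units", "cell_methods")


def missing_attributes(attributes):
    """Return the key attributes that are absent or empty."""
    return [key for key in KEY_ATTRIBUTES if not attributes.get(key)]


print(missing_attributes(su_attributes))                        # []
print(missing_attributes({"standard_name": "air_temperature"}))
```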
Relevant standard names involving thresholds

Standard name                                                                    Canonical units
air_temperature_threshold                                                        K
integral_of_air_temperature_deficit_wrt_time                                     K s
integral_of_air_temperature_excess_wrt_time                                      K s
number_of_days_with_air_temperature_above_threshold                              1
number_of_days_with_air_temperature_below_threshold                              1
number_of_days_with_lwe_thickness_of_precipitation_amount_above_threshold        1
number_of_days_with_surface_temperature_below_threshold                          1
number_of_days_with_wind_speed_above_threshold                                   1
spell_length_of_days_with_air_temperature_above_threshold                        day
spell_length_of_days_with_air_temperature_below_threshold                        day
spell_length_of_days_with_lwe_thickness_of_precipitation_amount_above_threshold  day
Units
• Any unit that is compatible with the canonical unit given in the definition
  of the standard name can be used, as long as it is recognised by the
  UDUNITS2 package
  http://imgs.xkcd.com/comics/degrees.png
• In the index (output) files the units attribute should always be “user-oriented”,
  e.g. mm not kg m-2 s-1 or m, °C not K, degree-days not K s
Units
• Some canonical units are less useful (and not entirely consistent):
   -- “number_of_days_..._above(below)_threshold” variables have unit “1”
      (dimensionless) as the unit “days” is part of the standard name …
   -- … but this 'rule' does not apply to “spell_length_of_days_..._threshold”
      variables, which have unit “day”
• How to solve?
   o Just go with existing CF rules
      -- inconsistent from user perspective
   o Try to change the CF convention
      -- will take time (if at all possible)
   o One solution is to recommend an extra attribute (“legend_unit”)
• Proposal to discuss:
  In the index (output) files the units attribute should if possible, and the legend_unit
  attribute should always, be “user-oriented”, e.g.
  mm not kg m-2 s-1 or m,
  °C not K,
  degree-days not K s
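A short sketch of what the proposal could look like in practice: the units attribute stays compatible with the canonical unit, while legend_unit carries the user-oriented label. Note that legend_unit is the extra attribute proposed here, not an existing CF attribute, and the helper `index_units` is hypothetical.

```python
def index_units(cf_units: str, legend_unit: str) -> dict:
    """Pair a CF-compatible units string with a user-oriented legend unit.

    `legend_unit` is the extra attribute proposed on this slide; it is not
    defined by the CF convention.
    """
    return {"units": cf_units, "legend_unit": legend_unit}


# number_of_days_... variables: canonical unit "1", but users expect "days"
summer_days = index_units("1", "days")

# spell_length_of_days_... variables: canonical unit is already "day"
warm_spell = index_units("day", "days")

print(summer_days)   # {'units': '1', 'legend_unit': 'days'}
print(warm_spell)    # {'units': 'day', 'legend_unit': 'days'}
```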
Units
• Precipitation input data may be associated with different standard names, e.g. related to
  a flux (kg m-2 s-1) or to a depth (m), which are dimensionally incompatible. Software needs to
  handle this.

Standard name                          Canonical units  Unit conversion in ICCLIM (assuming daily input data)
lwe_precipitation_rate                 m s-1            *1000 (→ mm s-1), *86400 (→ mm)
lwe_thickness_of_precipitation_amount  m                *1000 (→ mm)
precipitation_amount                   kg m-2           *1 (→ mm)
precipitation_flux                     kg m-2 s-1       *86400 (→ mm)

Standard name descriptions:
lwe_precipitation_rate
  "lwe" means liquid water equivalent.
lwe_thickness_of_precipitation_amount
  "lwe" means liquid water equivalent.
  "Amount" means mass per unit area. The construction
  lwe_thickness_of_X_amount or _content means the
  vertical extent of a layer of liquid water having the same
  mass per unit area.
precipitation_amount
  "Amount" means mass per unit area.
precipitation_flux
  In accordance with common usage in geophysical
  disciplines, "flux" implies per unit area, called "flux
  density" in physics.
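The conversions in the table can be sketched as a lookup keyed by standard name. This is an illustration only and not the actual ICCLIM code; it assumes daily input data, 86400 seconds per day, and a water density of 1000 kg m-3 (so that 1 kg m-2 corresponds to 1 mm). The names `TO_MM` and `to_mm` are hypothetical.

```python
SECONDS_PER_DAY = 86400  # 24 h * 3600 s

# Multiplication factors to daily amounts in mm, keyed by CF standard name.
# Assumes liquid water density of 1000 kg m-3, i.e. 1 kg m-2 equals 1 mm.
TO_MM = {
    "lwe_precipitation_rate": 1000 * SECONDS_PER_DAY,   # m s-1 -> mm
    "lwe_thickness_of_precipitation_amount": 1000,      # m -> mm
    "precipitation_amount": 1,                          # kg m-2 -> mm
    "precipitation_flux": SECONDS_PER_DAY,              # kg m-2 s-1 -> mm
}


def to_mm(value: float, standard_name: str) -> float:
    """Convert one daily precipitation value to mm, based on its standard name."""
    try:
        factor = TO_MM[standard_name]
    except KeyError:
        raise ValueError(f"unsupported standard name: {standard_name}") from None
    return value * factor


print(to_mm(0.0001, "precipitation_flux"))   # roughly 8.64 mm
print(to_mm(12.5, "precipitation_amount"))   # 12.5
```

Keying the conversion on the standard name rather than on the units string alone keeps the flux and depth variants apart even when a file uses a non-canonical but compatible unit.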
Thresholds are essential for many indices
• The 'simple threshold indices' (SU, FD, TR, HDDHEAT, …) have been resolved
   -- in line with the CF convention
   -- legend_unit is recommended as additional attribute (cf. previous slide)?
• The threshold variable must have the same standard name as the input data
  (after any unit transformation), e.g. air_temperature, lwe_thickness_of_precipitation_amount,
  except for degree-day indices, where it is air_temperature_threshold
• Indices involving less/greater than or equal to (≤ or ≥), like SU30, SU35, are not
  formally (but informally?) accepted (cf. http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2014/057605.html),
  but the discussion is leaning towards the construct “_at_or_above_threshold”. E.g. SU35 (Tmax ≥ 35°C)
  would be “number_of_days_with_air_temperature_at_or_above_threshold”
• Indices using thresholds based on a climatology of daily
  percentiles have not been resolved (so far) because of limitations
  imposed by the standard name descriptions
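The difference between “above_threshold” and “at_or_above_threshold” is only the comparison operator, as the following sketch shows. The function `number_of_days` and the condition labels are hypothetical; “at_or_below” is added here by analogy and is an assumption, and the temperature values are made up.

```python
import operator

# Condition labels follow the standard name constructs discussed above.
# "at_or_below" is included by analogy and is an assumption of this sketch.
COMPARISONS = {
    "above": operator.gt,
    "at_or_above": operator.ge,
    "below": operator.lt,
    "at_or_below": operator.le,
}


def number_of_days(daily_values, threshold, condition):
    """Count the days on which the value satisfies the threshold condition."""
    compare = COMPARISONS[condition]
    return sum(1 for value in daily_values if compare(value, threshold))


tmax = [24.0, 25.0, 26.5, 35.0, 36.2, 30.0]   # degC, made-up values
print(number_of_days(tmax, 25.0, "above"))        # 4  (SU, Tmax > 25)
print(number_of_days(tmax, 35.0, "at_or_above"))  # 2  (SU35, Tmax >= 35)
```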
Percentile based indices -- two issues
• The threshold is a per-gridcell climatology of the daily p-th percentile based
  on some (30 yr) reference period
• The threshold variable is a 3d array (time,y,x) that in turn is related to the
  percentile threshold constant.
• This is not covered by the CF convention, because of the standard name
  descriptions. E.g.:
    Air temperature is the bulk temperature of the air, not the surface
    (skin) temperature. A variable whose standard name has the form
    number_of_days_with_X_below|above_threshold is a count of the number
    of days on which the condition X_below|above_threshold is satisfied.
    It must have a coordinate variable or scalar coordinate variable with
    the standard name of X to supply the threshold(s). It must have a
    climatological time variable, and a cell_methods entry for within days
    which describes the processing of quantity X before the threshold is
    applied. A number_of_days is an extensive quantity in time, and the
    cell_methods entry for over days should be "sum".
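To show why the threshold is itself a time-dependent field, here is a deliberately simplified sketch for a single grid cell: one threshold per calendar day, computed as the p-th percentile over the reference years, followed by a count of days exceeding that day's own threshold. It is an assumption-laden illustration: it uses linear interpolation between ranks and no moving window around each calendar day, so it does not reproduce the ETCCDI calculation. All function names and values are hypothetical.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def daily_percentile_threshold(reference, p):
    """reference[year][day] -> one threshold per calendar day (one grid cell)."""
    n_days = len(reference[0])
    return [percentile([year[day] for year in reference], p)
            for day in range(n_days)]


def days_above(series, thresholds):
    """Count days on which the value exceeds that day's own threshold."""
    return sum(1 for value, limit in zip(series, thresholds) if value > limit)


# Three reference "years" of three days each (made-up values)
reference = [[10.0, 12.0, 14.0], [11.0, 13.0, 15.0], [12.0, 14.0, 16.0]]
thresholds = daily_percentile_threshold(reference, 50)
print(thresholds)                                   # [11.0, 13.0, 15.0]
print(days_above([12.0, 12.5, 15.5], thresholds))   # 2
```

Extending this to every grid cell gives the (time, y, x) threshold array discussed above, with the percentile value p as the second-layer constant.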
1: It must have a coordinate variable or scalar coordinate
variable with the standard name of X to supply the threshold(s)
• We have recently learned that a coordinate variable can be multi-dimensional,
  so a 3-dim threshold variable (time, y, x) could serve as coordinate variable. So OK?
• Should it be a 'normal' coordinate variable or an ancillary coordinate variable?
• Now, the threshold variable has a pre-defined standard name and other key attributes,
  but it has to have a coordinate variable that specifies the percentile value.
• There is no standard name for quantiles
Alternatively
• One could argue that two-layer threshold variables are not needed
• But then the standard name description needs to be changed to allow threshold
  variables that have a different standard name than the input variable, e.g. “quantile”
• and the 3-dim percentile variable should be stored anyway, as an ancillary variable
2: It must have a climatological time variable
• A climatological time axis is a special kind of time axis intended to take care of
  climatologies of e.g. diurnal or seasonal cycles, where the average (etc.) is taken
  over several disjoint time periods
• This is not relevant/helpful because most indices are calculated over a sequence
  of consecutive periods, in the same way as an ordinary average. Thus a
  standard time axis is more appropriate
To discuss: how to handle these two issues?
   -- is there a way around?
   -- can we influence the CF governors to change this?
   -- do we need/want to break CF compliance?
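The contrast between the two kinds of time axis can be sketched via their bounds. The functions below are hypothetical helpers: the first generates ordinary time bounds for an annual index (consecutive periods), the second a single climatological interval for one calendar month taken over many disjoint years, following the way CF climatological bounds span the first to the last contributing period.

```python
from datetime import date


def annual_time_bounds(first_year, last_year):
    """One (start, end) pair per year: consecutive, non-overlapping periods."""
    return [(date(year, 1, 1), date(year + 1, 1, 1))
            for year in range(first_year, last_year + 1)]


def monthly_climatology_bounds(month, first_year, last_year):
    """Single (start, end) pair covering the same month in many disjoint years."""
    if month == 12:
        end = date(last_year + 1, 1, 1)
    else:
        end = date(last_year, month + 1, 1)
    return (date(first_year, month, 1), end)


# Annual index over 1961-1963: three consecutive one-year cells
for start, end in annual_time_bounds(1961, 1963):
    print(start, end)

# January climatology over 1961-1990: one cell built from 30 disjoint Januaries
print(monthly_climatology_bounds(1, 1961, 1990))
```

An annual count of summer days fits the first pattern, which is the argument above for using a standard time axis with ordinary bounds.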
If these issues are solved, then only the following remain
• GSL
   -- complex start/end condition
   -- NH/SH climatological year
• TX#TN#, TXTNgt#p, TXB#TNB#, TXTNlt#p
   -- involve multiple input variables
• WSDI, WSDI#, CSDI, CSDI#
   -- total number of days in all spells longer than a certain threshold
   -- i.e. neither a number_of_days… nor a spell_length_of_days… index
• SPI, SPEI
  HWN, HWF, HWM
   -- complex calculation and/or involving multiple variables
• ECA indices that go beyond ETCCDI and ET-SCI
   -- indices that record the beginning/end of a season, or first/last occurrence of
      some event (e.g. frost, snow): “time elapsed since [beginning of the year]”
To discuss (if time allows): The CF convention cannot cover
everything. How do we proceed?
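The WSDI/CSDI case can be illustrated with a small sketch that shows why such an index is neither a plain day count nor a spell length. The functions are hypothetical, the flags are made up, and the sketch assumes that a spell qualifies when it lasts at least `min_length` days.

```python
def total_days_in_spells(flags, min_length):
    """Sum the lengths of all runs of True that last at least min_length days."""
    total = 0
    run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_length:
                total += run
            run = 0
    if run >= min_length:   # a spell may extend to the end of the series
        total += run
    return total


def longest_spell(flags):
    """Length of the longest run of True values."""
    longest = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        longest = max(longest, run)
    return longest


# Made-up exceedance flags: spells of 6, 2 and 7 days
flags = [True] * 6 + [False] + [True] * 2 + [False] + [True] * 7
print(sum(flags))                        # 15  (number_of_days-type count)
print(longest_spell(flags))              # 7   (spell_length-type value)
print(total_days_in_spells(flags, 6))    # 13  (WSDI-type value)
```

The three numbers differ, which is the point made above: the WSDI-type quantity is not covered by either existing family of standard names.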
To sum up – issues to work on today
• Index names (as widely known) vs. netCDF variable names
   -- names always lowercase or partly uppercase?
• One index ('one file') – several alternative thresholds (for users' exploratory analyses)
   -- index naming when there are multiple thresholds in a file
• CF standard names for climate indices based on variables (CF peculiarities?)
   -- harmonised long names?
   -- canonical units
      user-friendly units, legend_unit
   -- standard name of the threshold variable itself
      discuss how to solve the mechanics of 3-dim percentile thresholds
   -- climatological time axis
      discuss …
• What to do when the indices go beyond what the CF convention can handle
   -- discuss …