Selecting an optimum subset of climate models for climate impact studies Renate A. I. Wilcke1 1 Rossby Centre, Swedish Meteorological and Hydrological Institute, Sweden 2 Lars Bärring1,2 Thomas Mendlik3 Centre for Environment and Climate Research, Lund University, Sweden EMS2014-365 @ EMS, Prague, 10.10.2014 3 Wegener Center for Climate and Global Change, University of Graz, Austria Introduction Method Experiments Summary Motivation Motivation Climate model output data are applied within climate impact studies 2 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Motivation Motivation Climate model output data are applied within climate impact studies 2 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Motivation Motivation Climate model output data are applied within climate impact studies Common issue: Climate model ensemble is bigger than what can/want/will be used by impact modellers 2 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Motivation Idea Question: Output from how many climate models should be used? Question: Which climate models should be used? We know that different climate models perform differently in different seasons and regions for different variables. Answers should be applicable for different specific impact studies Selection method should be flexible and objective 3 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch singular vector decomposition new variables PCs final selection: cluster means 4 / 16 [email protected] Selecting an optimum subset silhouette values hierarchical clustering Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch 4 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch singular vector decomposition new variables PCs 4 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection SVD To gain uncorrelated variables and reduce their number without loosing information. Transformation to an orthogonal vector space. Standardized data matrix X consists of m models and n variables Where variables are all combinations of climate change signals (CCS) of climate variables, indices, for different regions and seasons. var1(sea1, reg1) var2(sea1, reg2) Mod1 a1,1 a1,2 Mod2 a ··· 2 , 1 X= ··· a3,1 ··· Modm am , 1 ··· 5 / 16 [email protected] Selecting an optimum subset ··· ··· ··· ··· ··· varn a1,n a2,n a3,n am , n Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch singular vector decomposition new variables PCs 6 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch singular vector decomposition new variables PCs 6 / 16 [email protected] Selecting an optimum subset silhouette values Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Silhouette values Purpose: Automated suggestion for optimum size of subset to select, i. e. gaining k value for tree-cutting in hierarchical clustering 7 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Silhouette values Purpose: Automated suggestion for optimum size of subset to select, i. e. gaining k value for tree-cutting in hierarchical clustering Silhouettes (Rousseeuw, 1987): a distance measure between a point and all other points in different clusters. testing for different number of clusters choosing the highest average silhouette value Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math., 20(0):53–65. 7 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Silhouette values choosing the highest average silhouette value 0.8 0.6 0.4 4 PCs (83%) 0.2 testing for different number of clusters 2 PCs (63%) 3 PCs (75%) 0.0 Silhouettes (Rousseeuw, 1987): a distance measure between a point and all other points in different clusters. average silhouette value 1.0 Purpose: Automated suggestion for optimum size of subset to select, i. e. gaining k value for tree-cutting in hierarchical clustering 2 4 6 8 clusters Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math., 20(0):53–65. 7 / 16 [email protected] Selecting an optimum subset 10 Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch singular vector decomposition new variables PCs 8 / 16 [email protected] Selecting an optimum subset silhouette values Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch silhouette values singular vector decomposition new variables PCs 8 / 16 [email protected] hierarchical clustering Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection distance representing similarity 9 / 16 [email protected] Selecting an optimum subset EC−EARTH_RCA4 NorESM1−M_RCA4 M−MPI−ESM−LR_RCA4 EC−EARTH_RACMO22E EC−EARTH_HIRHAM5 GFDL−GFDL−ESM2M_RCA4 CanESM2_RCA4 CERFACS−CNRM−CM5_RCA4 clusters build by minimizing distances HadGEM2−ES_RCA4 bottom-up, starting with m different clusters MIROC5_RCA4 Hierarchical clustering is applied to the euclidean distances between the PCs IPSL−CM5A−MR_RCA4 Hierarchical clustering Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch silhouette values singular vector decomposition new variables PCs 10 / 16 [email protected] hierarchical clustering Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection process – sketch silhouette values singular vector decomposition new variables PCs hierarchical clustering final selection: cluster means 10 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary SVD Silhouettes Clustering Selection Selection from clusters clusters with only 2 members which equally representing the mean: arbitrary choise 11 / 16 [email protected] 5 0 NorESM1−M_RCA4 EC−EARTH_RCA4 M−MPI−ESM−LR_RCA4 PC2 22.42% models closest to those means are selected EC−EARTH_RACMO22E IPSL−CM5A−MR_RCA4 EC−EARTH_HIRHAM5 MIROC5_RCA4 −5 multidimensional means can represent clusters CanESM2_RCA4 HadGEM2−ES_RCA4 GFDL−GFDL−ESM2M_RCA4 −10 each cluster represents a group which is distinct from the others −10 −5 CERFACS−CNRM−CM5_RCA4 0 PC1 40.4% Selecting an optimum subset 5 10 Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up Variables: Climate change signals (CCS) of climate variables and indices for different seasons and regions Models: 11 RCM-GCM runs from EURO-CORDEX, rcp8.5 12 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up Variables: Climate change signals (CCS) of climate variables and indices for different seasons and regions Models: 11 RCM-GCM runs from EURO-CORDEX, rcp8.5 To simulate different impact applications, we tested different . . . combinations of climate variables (tas, pr, hurs, . . . ) and indices (degree days, frost days, . . . ) seasons (DJF, MAM, . . . ) CCS periods (near, mid, far future) 12 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up CCS: T2m, pr, wet-day freq., 2 degree days 2069-2098 vs. 1971-2000 all seasons 6 regions in northern Europe 13 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up 0.6 2 PCs (62%) 3 PCs (74%) 4 PCs (81%) 0.0 suggestion: 4 clusters 0.4 6 regions in northern Europe 0.2 all seasons average silhouette value 2069-2098 vs. 1971-2000 0.8 1.0 CCS: T2m, pr, wet-day freq., 2 degree days 2 4 6 clusters 13 / 16 [email protected] Selecting an optimum subset 8 10 Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up CCS: T2m, pr, wet-day freq., 2 degree days tas, pr, ind 2069-2098 vs. 1971-2000 13 / 16 [email protected] EC−EARTH_RACMO22E EC−EARTH_RCA4 EC−EARTH_HIRHAM5 NorESM1−M_RCA4 M−MPI−ESM−LR_RCA4 GFDL−GFDL−ESM2M_RCA4 MIROC5_RCA4 Selecting an optimum subset CERFACS−CNRM−CM5_RCA4 hierarchical clustering IPSL−CM5A−MR_RCA4 suggestion: 4 clusters CanESM2_RCA4 6 regions in northern Europe HadGEM2−ES_RCA4 all seasons Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up CCS: T2m, pr, wet-day freq., 2 degree days tas, pr, ind CanESM2_RCA4 EC−EARTH_RACMO22E NorESM1−M_RCA4 4 all seasons 6 2069-2098 vs. 1971-2000 HadGEM2−ES_RCA4 final selection/suggestion: RCA4 with CanESM2, IPSL, CNRM, and HIRHAM5 with EC-EARTH 0 −2 PC2 23.03% −4 IPSL−CM5A−MR_RCA4 −6 hierarchical clustering EC−EARTH_RCA4 GFDL−GFDL−ESM2M_RCA4 MIROC5_RCA4 −8 suggestion: 4 clusters EC−EARTH_HIRHAM5 M−MPI−ESM−LR_RCA4 2 6 regions in northern Europe −10 −5 CERFACS−CNRM−CM5_RCA4 0 PC1 38.88% 13 / 16 [email protected] Selecting an optimum subset 5 Introduction Method Experiments Summary Set-up Set-up Results Experiments set-up CCS: T2m, pr, wet-day freq., 2 degree days tas, pr, ind CanESM2_RCA4 EC−EARTH_RACMO22E NorESM1−M_RCA4 4 all seasons 6 2069-2098 vs. 1971-2000 HadGEM2−ES_RCA4 final selection/suggestion: RCA4 with CanESM2, IPSL, CNRM, and HIRHAM5 with EC-EARTH 0 −2 PC2 23.03% −4 IPSL−CM5A−MR_RCA4 −6 hierarchical clustering EC−EARTH_RCA4 GFDL−GFDL−ESM2M_RCA4 MIROC5_RCA4 −8 suggestion: 4 clusters EC−EARTH_HIRHAM5 M−MPI−ESM−LR_RCA4 2 6 regions in northern Europe −10 −5 [email protected] 0 PC1 38.88% !OBS! Arbitrary selection if only 2 cluster members 13 / 16 CERFACS−CNRM−CM5_RCA4 Selecting an optimum subset 5 Introduction Method Experiments Summary Set-up Set-up Results Results CCS of temperature is dominating the clustering in all combinations always the same two basic clusters tas domination is not visible in the finally selected models Influence of CCS of clim. indices is bigger at end of century Choice of period influences the final selection 14 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Summary Outlook Summary Model “quality” control should be done before hand Selection is period dependent When tas included, it dominates the clustering Selection is application dependent (there is no ONE answer) Selection result is just a suggestion → do not use it blindly Assuming the models work equally well in projecting possible future climates 15 / 16 [email protected] Selecting an optimum subset Introduction Method Experiments Summary Summary Outlook Outlook Selection needs to be compared to when used on bias corrected data More different regions needs to be tested Include information about single periods (non CCS) Direct test with impact models Study will result in an automated web application to support climate and impact researchers 16 / 16 [email protected] Selecting an optimum subset
© Copyright 2026 Paperzz