CASE FOR SUPPORT: e-Science Round 2
Grid ENabled Integrated Earth system model (GENIE): A Grid-based, modular, distributed and scaleable Earth System Model for long-term and paleo-climate studies

Summary
Whole Earth System modelling requires the integration of a number of specialised components. Current computing technologies are not well suited to constructing, executing and effectively utilising such a model. However, the Grid and associated component-based application construction techniques should provide a natural solution. To achieve this, a structured, multi-disciplinary and multi-institutional collaboration is needed for model development and use, and to share the large volumes of output data from integrated simulation runs. We propose a challenging use of the Grid to unify widely distributed UK expertise and to generate a new kind of Earth System Model (ESM). Our scientific focus is on long-term and paleo-climate change, especially through the last glacial maximum (~20 kyr BP) to the present interglacial, and the future long-term response of the Earth system to human activities. A realistic ESM for this purpose must include models of the atmosphere, ocean, sea-ice, marine sediments, land surface, vegetation and soil, and ice sheets, together with the energy, biogeochemical and hydrological cycling within and between components. We propose to develop, integrate and deploy a Grid-based system which will allow us: (i) to couple state-of-the-art components flexibly to form a unified ESM, (ii) to execute the resulting ESM on the Grid, (iii) to share the distributed data produced by simulation runs, and (iv) to provide high-level open access to the system, creating and supporting virtual organisations of Earth System modellers. The project will deliver both a flexible Grid-based architecture, which will provide substantial long-term benefits to the Earth system modelling community (and to others who need to combine disparate models into a coupled whole), and new scientific understanding from versions of the ESM generated and applied in the project. The components will be supplied by recognised centres of excellence at Reading, SOC, UEA, CEH and Bristol (all university departments being graded 5 or 5* in the 2001 RAE). The Grid-based architecture will leverage significant ongoing activity and experience in the e-Science centres at Southampton and Imperial College (both 5*). The project will fill important gaps in an emerging spectrum of Earth System Models, and represents a rare example of using the Grid for a truly multidisciplinary modelling activity.

Technological Challenge (Grid Stretch)
Earth system science is by its nature interdisciplinary. No conventional disciplinary institute can deliver all the expertise necessary to develop a complete Earth system model. There are two approaches to solving this problem. The ‘conventional’ approach is to form a new interdisciplinary research institute and transfer expertise into it (e.g. the Potsdam Institute for Climate Impact Research, PIK). Alternatively, the Grid enables the creation of a virtual organisation (Foster et al. 2001) that links the necessary resources and expertise and facilitates the sharing of the results obtained. This new approach has the advantages of lower cost and greater flexibility. Participants continue to work within centres of excellence in their disciplines, thus benefiting from access to a wide base of specialist knowledge. Furthermore, new ideas and disciplines can be engaged and integrated with comparative ease.
To realise this new approach we will significantly stretch and extend existing Grid technologies to:
• Encapsulate existing state-of-the-art, computationally efficient models as components, enabling them to be coupled together effectively to produce a unified Earth system model.
• Provide efficient execution strategies for such a system, ranging from a single run at a single location to executions automatically distributed and coordinated across physically distributed resources.
• Develop user-level access to such a system that will enable Earth system scientists to explore varying scenarios and perform modelling experiments without needing to be concerned with the low-level details of the models employed or of their implementation.
• Provide a framework to collaboratively share and post-process all the distributed data produced by such simulations.

GENIE Scenario
A simplified example of how the GENIE system we envisage will be used is as follows. The system will be accessed via a portal, which will allow a user to compose, execute and analyse the results from an Earth system simulation. After authenticating themselves with the portal, a user will have access to a library of components that can each model different aspects of the Earth system (for example, ocean, atmosphere) at different resolutions. Intelligent selection from the library is made possible by reference to metadata supplied by the component author. The selected components are then composed, along with suitable mesh conversion tools to allow data exchange at model boundaries, other data necessary to initialise the model, and an event queue to sequence the data exchange between the components and specify how often to archive data (a minimal sketch of such a composition is given at the end of this section). From this an intelligent meta-scheduler determines the resource requirements and maps the processing required onto a distributed Grid of compute resources using middleware such as Globus and Condor. At runtime each component produces distributed data, which can be monitored during execution and is also archived automatically as specified by the user. From the portal it is possible to browse this archive of results using post-processing visualisation tools and to re-use results from the archive to seed new calculations.

How does the Grid activity enable the science?
The Grid-enabled, component-based, open modular framework that we propose will provide unique capabilities to Earth system scientists, enabling, for the first time, realistic whole Earth system simulations at a range of spatial and temporal resolutions to be constructed, executed and analysed. The Grid activity will enable the science in the following ways:
• It is the best way to construct a holistic model able to incorporate all sub-systems thought to be capable of influencing long-term and paleo-climatic change.
• It allows the system to be used in a variety of scenarios without recoding or internal modification.
• It supports an open community of Earth system scientists, enabling new models to be incorporated into the framework and existing models to be combined in a variety of ways to test alternative hypotheses.
• It is the most cost-effective way of achieving the computing power required to perform long-term (multi-millennial) simulations of the complete Earth system at moderate resolution, or for parameter-space studies on shorter timescales.
• It will facilitate the collaborative sharing of the data produced by distributed simulations for the benefit of the whole community.
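To make the scenario above concrete, the following is a minimal, purely illustrative Java sketch of how a set of composed components might be advanced under an event queue that sequences their data exchanges and archiving. All type and method names here are hypothetical; they are not part of GENIE, Globus or Condor, and in the real system the meta-scheduler would place each component on distributed Grid resources rather than running them in a single local loop.

```java
import java.util.List;
import java.util.Queue;

// Hypothetical component interface: each wrapped Earth system model would expose
// something of this shape via its meta-data (names are illustrative only).
interface EarthSystemComponent {
    String name();
    void step(double hours);                        // advance the component's internal state
    double[] exportField(String field);             // e.g. "sea_surface_temperature"
    void importField(String field, double[] data);  // receive a boundary field from another component
}

// One entry in the event queue that sequences data exchange between components.
record ExchangeEvent(double atHour, String field, String from, String to) {}

final class Coupler {
    /** Advance all components, honouring the exchange events and an archive interval. */
    static void run(List<EarthSystemComponent> components, Queue<ExchangeEvent> events,
                    double lengthHours, double couplingHours, double archiveHours) {
        for (double t = 0.0; t < lengthHours; t += couplingHours) {
            for (EarthSystemComponent c : components) {
                c.step(couplingHours);
            }
            // Deliver any exchanges that fall due within this coupling interval.
            while (!events.isEmpty() && events.peek().atHour() <= t + couplingHours) {
                ExchangeEvent e = events.poll();
                double[] data = find(components, e.from()).exportField(e.field());
                // A mesh-conversion tool would regrid 'data' between the two model
                // grids here; this sketch simply passes the field through unchanged.
                find(components, e.to()).importField(e.field(), data);
            }
            if (t % archiveHours == 0.0) {
                System.out.println("archive state at t = " + t + " h"); // stand-in for the data archive
            }
        }
    }

    private static EarthSystemComponent find(List<EarthSystemComponent> cs, String name) {
        return cs.stream().filter(c -> c.name().equals(name)).findFirst().orElseThrow();
    }
}
```

Concrete components wrapped under the WRAP work packages would implement this kind of interface, with the meta-scheduler deciding where each step actually executes.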
Scientific Research Challenge
The scientific driver for this project is to understand the astonishing and, as yet, unexplained natural variability of past climate in terms of the dynamic behaviour of the Earth as a whole system. Such an understanding is an essential pre-requisite for increasing confidence in predictions of long-term future climate change. The figure on the left shows the changes in carbon dioxide, temperature and methane over the last four glacial cycles recorded in the Vostok ice core (Petit et al. 1999). The causes of these major glacial-interglacial cycles, which have dominated the past few million years of Earth history, remain highly uncertain. However, it is clear that changes in many components of the Earth system appear to have amplified rather weak orbital forcing. These include: land ice, sea ice and vegetation cover affecting Earth’s albedo (reflectivity); CO2, CH4 and water vapour affecting the ‘greenhouse effect’; and ocean circulation affecting heat transport. Previous modelling and data studies (e.g. Imbrie et al. 1993, Shackleton 2000, Berger et al. 1998) have revealed that non-linear feedbacks are important, and that these feedbacks extend beyond the physical subsystem to include biological and geochemical processes. For example, changes in the marine carbon cycle (Watson et al. 2000) and terrestrial vegetation cover (de Noblet et al. 1996, Claussen et al. 1999) are fundamental contributors to past climate change. Hence our working hypothesis is that realistic simulations of long-term climate change require a complete Earth system model that includes, as a minimum, components representing the atmosphere, ocean, sea-ice, marine sediments, land surface, vegetation, soil and ice sheets, and the energy, hydrological and biogeochemical cycling within and between components. The model must be capable of integration over multi-millennial time-scales. The design of the system will allow other components, such as atmospheric chemistry, to be added at a later stage.
At present, state-of-the-art models of the essential components of the Earth climate system exist mostly as separate entities. Where several components have been coupled, as in the more elaborate versions of the Hadley Centre model (Cox et al. 2000), they are computationally too demanding for long-term or ensemble simulations. Conversely, existing efficient models of the complete system (Petoukhov et al. 2000) employ highly idealised models of the individual components, with reduced dimensionality and low spatial resolution. Our objectives are to build a model of the complete Earth system which is capable of numerous long-term (multi-millennial) simulations, using components which are traceable to state-of-the-art models, are scaleable (so that high-resolution versions can be compared with the best available, and there is no barrier to progressive increases in spatial resolution as computer power permits), and modular (so that existing models can be replaced by alternatives in future). Data archiving, sharing and visualisation will be integral to the system. The model will be used to quantitatively test hypotheses for the causes of past climate change and to explore the future long-term response of the Earth system to human activities. All of the necessary component models have already been developed within the NERC community (representing a considerable investment of resources) and by our collaborators at the Hadley Centre.
Further work will be required to produce compatible, computationally efficient components for Grid coupling and to represent the hydrological and biogeochemical cycling within and between components. Our initial scientific focus will be on one fundamental transition of the Earth system: from the last glacial maximum to the present interglacial warm period (the Holocene). This interval has been chosen because it encapsulates both gradual and rapid climate changes, and high-resolution data records exist against which to test the model. The figure on the left is a high-resolution snow accumulation and temperature record from the Greenland ice core, showing the rich behaviour of the Earth system during the last deglaciation (Kapsner et al. 1995). Specifically, we will use the Earth System model to investigate:
• The timing of the Bølling-Allerød warm phase: General Circulation Model (GCM) based simulations using a simple ocean model suggest that this warming occurs earlier than would be expected from orbital theory alone.
• The magnitude and extent of the Younger Dryas cold phase, and the anti-phase climate variations recorded in Antarctic and Greenland ice cores (Blunier et al. 1998): the links between the hemispheres and the degree to which the Younger Dryas extended beyond the Atlantic remain uncertain.
• The changes in vegetation and carbon storage during the Holocene (Claussen et al. 1999).
• The minimum complexity (in terms of system components, processes within components, and resolution) required to simulate these changes in the system.
• The predictability (or otherwise, due to chaotic behaviour) of the fully coupled system: will small changes in initial conditions result in major changes to the glacial-interglacial transition?
• The robustness of predictions of carbon cycle feedback on global warming (Cox et al. 2000; Lenton 2000), and long-term (multi-millennial) projections of climate change and carbon cycling (Archer et al. 1998).

How the research underpins NERC’s broader vision of Earth System Science
The draft NERC Science and Innovation Strategy document explicitly discusses the need to understand the behaviour of the Earth system revealed by the Vostok ice core, and highlights the need for a new Earth System science approach. Our project will provide the innovative methodologies desired, and address many of the detailed research needs in the Key Science Themes of Climate Change and Biogeochemical Cycles. Our aim is to help realise the NERC vision for the development of coupled Earth System Models. GENIE will provide an ideal tool for investigating rapid changes in climate such as the Dansgaard-Oeschger and Heinrich events during the ice ages. This is one focus of the RAPID climate change thematic programme, and the model will be available in the latter stages of that programme. GENIE will provide a superior alternative to the climate models currently used in integrated assessment models. The development of an Integrated Assessment Model is a core programme of the Tyndall Centre; the manager of that programme, Dr. Jonathan Koehler, has agreed to collaborate. GENIE will contribute to the new programme on Quantifying the Earth System (QUEST), especially the hierarchy of coupled models and the proposed ‘virtual laboratory’. Grid training is an important component of this project and it will produce a group of Grid-aware environmental scientists.

Methodology and Detailed Plan of Research
In the following plan of research we identify specific work-packages, e.g.
“(EMBM1)”; the timetable for executing these is given in Appendix 1.

Steps toward a complete model
In GENIE, computational ‘components’ will correspond to models of ‘components’ of the Earth system. These components will be developed from existing code, with the addition of meta-data that enables their flexible interfacing. GENIE will provide a methodology for coupling components in order to test hypotheses concerning the processes and feedback mechanisms that are important for long-term changes to the environment. The system will be flexible and will allow users to add new components and evaluate their importance within the whole Earth System. However, to develop such a system we must start from a set of exemplar components. These components have been chosen because they (i) satisfy the scientific aims of the proposal for a model capable of simulating long-term change, (ii) have physical representations that can be directly traced to more complex components used within General Circulation Models, and (iii) already exist and are well tested, so that this proposal can focus on their coupling and Grid-enabling. The development programme is structured around three milestones:
Component Set 1 (“GENIE-Trainer”, 12 months): To enable rapid development of the key Grid techniques, a simple set of model components will be made available at the start of the project. These will be the appropriate basis for the development and implementation of the basic set of Grid technologies and will facilitate training of environmental scientists in the techniques required to Grid-enable other components in the system. The initial model will consist of just two components: (a) an energy-moisture balance atmosphere model coupled to (b) a 3-D ocean model at very low resolution. This 2-part model was developed as part of NESMI (the NERC Earth System Modelling Initiative), is available now, and will provide a simple core test-bed for using the Grid.
Component Set 2 (“GENIE-Mini”, 18 months): The next step will be to couple (b) the 3-D ocean component to (c) a 3-D planetary wave atmosphere, (d) the land surface, and (e) sea-ice. This will require scientific effort defining the coupling method between the atmosphere, land surface and sea-ice components. The 3-D ocean component will be the same as used in Component Set 1 but at higher resolution, and with the inclusion of (f) marine biogeochemistry. Component Set 2 will also help us learn about the necessary methods and implications of flexibility, modularity and scalability, because we will have two alternative atmosphere components (energy-moisture balance and 3-D planetary wave atmosphere) and two alternative resolutions of the ocean. Much of the initial assembly and interfacing of the atmosphere, ocean and sea-ice will have been undertaken as part of a recently funded NERC COAPEC project (NER/T/S/2001/00191).
Component Set 3 (“GENIE-Grid”, 30 months): The final step will be to couple the remaining components, (g) marine sediments and (h) ice-sheets, and to include more complete representations of biogeochemical and hydrological cycling. This will require scientific and technological effort in defining and achieving the asynchronous coupling of ice sheets and marine sediments to the rest of the model.

Details of Earth system modelling components
The following components will be integrated in various realisations of the GENIE model. All components are comparable to, or significant advances over, most existing intermediate complexity models.
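Before the individual components are detailed, the following hypothetical Java sketch illustrates the kind of modularity intended: two alternative atmosphere components (the energy-moisture balance model of Component Set 1 and the 3-D planetary wave model introduced in Component Set 2) sit behind one interface, so the rest of the composed ESM need not change when one is swapped for the other. The class and method names are invented for illustration and do not describe the existing model codes, which are written in Fortran.

```java
// Hypothetical interface shared by all atmosphere components in this sketch.
interface AtmosphereComponent {
    void step(double hours);                       // advance the atmospheric state
    double[] surfaceHeatFlux();                    // boundary field handed to ocean/land/sea-ice
    void setSeaSurfaceTemperature(double[] sst);   // boundary field received from the ocean
}

// Stand-in for the 2-D energy-moisture balance atmosphere (Component Set 1).
final class EnergyMoistureBalanceAtmosphere implements AtmosphereComponent {
    private final double[] flux;
    EnergyMoistureBalanceAtmosphere(int cells) { flux = new double[cells]; }
    public void step(double hours) { /* 2-D diffusive heat and moisture transport would go here */ }
    public double[] surfaceHeatFlux() { return flux; }
    public void setSeaSurfaceTemperature(double[] sst) { /* update lower boundary condition */ }
}

// Stand-in for the 3-D planetary wave atmosphere (Component Sets 2 and 3).
final class PlanetaryWaveAtmosphere implements AtmosphereComponent {
    private final double[] flux;
    PlanetaryWaveAtmosphere(int cells) { flux = new double[cells]; }
    public void step(double hours) { /* 3-D stationary wave dynamics with ~monthly time-step */ }
    public double[] surfaceHeatFlux() { return flux; }
    public void setSeaSurfaceTemperature(double[] sst) { /* update lower boundary condition */ }
}

class ComponentSwapDemo {
    public static void main(String[] args) {
        // Switching from a GENIE-Trainer-like to a GENIE-Mini-like configuration changes
        // only which implementation is instantiated (the grid size here is arbitrary).
        AtmosphereComponent atmos = (args.length > 0 && args[0].equals("mini"))
                ? new PlanetaryWaveAtmosphere(2048)
                : new EnergyMoistureBalanceAtmosphere(2048);
        atmos.step(24.0);
        System.out.println("flux cells: " + atmos.surfaceHeatFlux().length);
    }
}
```

The same pattern would apply to the two alternative ocean resolutions in Component Sets 1 and 2.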
(a) Energy-moisture balance atmosphere (EMBM): A standard 2-D diffusive model of atmospheric heat and moisture transport, incorporating radiation and bulk transfer formulae for air-sea and air-land surface fluxes of heat and moisture (Weaver et al. 2001), will be used in the “GENIE-Trainer”. SOC will isolate the source code for this component (currently tied to the ocean) and add meta-data (EMBM1).
(b) 3-D Ocean (OCEAN): A 3-D, non-eddy-resolving, frictional geostrophic model (Edwards et al. 1998, Edwards and Shepherd in press) will be used throughout. This allows much longer time-steps than a conventional ocean GCM by neglecting acceleration and momentum transport, to obtain the large-scale, long-term circulation only. More than 5 man-years of effort have been invested in developing this model. It was coupled to the energy-moisture balance atmosphere as part of NESMI (R. Marsh, SOC). SOC will isolate a coarse-resolution version (18x18 longitude-latitude grid points and 8 depth levels) and add meta-data (OCEAN1) for inclusion in the “GENIE-Trainer”. A variable-resolution version with more sophisticated coupling to the 3-D atmosphere will be produced for “GENIE-Mini” (OCEAN2).
(c) 3-D Atmosphere (ATMOS): A 3-D, non-transient-eddy-resolving, stationary wave model (Valdes & Hoskins 1989) will be used in “GENIE-Mini” and “GENIE-Grid”. This uses the same equation set as a conventional atmospheric GCM but allows a much longer time-step (~1 month rather than 30 minutes) by parameterising baroclinic instability. This model was developed during a 3-year NERC-funded project and has subsequently been used in several other NERC and EU projects. Moisture transport, clouds and precipitation are being included using conventional methods, in a project funded by COAPEC (NER/T/S/2001/00191). Reading will isolate the code, help define coupling to the ocean, land surface and sea-ice, and add appropriate meta-data for inclusion in “GENIE-Mini” (ATMOS1). Then a variable-resolution version will be enabled and scaleability issues addressed (ATMOS2).
(d) Land surface, hydrology and biogeochemistry (LAND): A simplified version of the MOSES land-surface scheme (Cox et al. 1999), already developed by P. M. Cox of the Hadley Centre, will be used in “GENIE-Mini”. This allows a longer time step than full MOSES (~12 hours rather than 30 minutes) by excluding fast processes (e.g. canopy interception). The ‘TRIFFID’ model from the Hadley GCM will be used to capture vegetation dynamics and their effect on land surface properties. CEH Wallingford will isolate the source codes, help define the coupling to the atmosphere and runoff to the ocean, and add meta-data (LAND1). Next, carbon and nitrogen cycling will be switched on and biogeochemical coupling to the atmosphere and ocean included (LAND2). Finally, a fully ‘traceable’ (to the GCM) land-surface scheme will be developed for “GENIE-Grid” (LAND3). This will retain all MOSES processes but explicitly time-average them to achieve a fast version with a long time-step.
(e) Sea-ice (ICE): A 2-D sea-ice model, incorporating standard thermodynamics (Hibler 1979) and elastic-viscous-plastic dynamics (Hunke and Dukowicz 1997), will be included in “GENIE-Mini” and “GENIE-Grid”. At the time of writing, as part of the aforementioned COAPEC project, the energy-moisture balance atmosphere (a) and ocean (b) models have been coupled to a thermodynamic free-drift version of this sea-ice model (R. Marsh, SOC).
SOC will isolate the code, help define coupling to the atmosphere and ocean, add meta-data, and include this component in “GENIE-Mini” (ICE1).
(f) Ocean biogeochemistry (BIO): Existing representations of marine carbon and nutrient (phosphate, silicic acid, and iron) cycling (Ridgwell 2001), which have been successfully applied to questions of past (Watson et al. 2000) and future carbon cycle behaviour, will be integrated into the 3-D ocean model by UEA (BIO1). This work has already begun as part of the Tyndall Centre Integrated Assessment Model programme. Next, the oceanic nitrogen cycle and its influence on carbon cycling, and the fractionation of a variety of stable (13C/12C, 15N/14N, 30Si/28Si, 87Sr/86Sr) and radiogenic (14C/12C) isotopes and trace elements (Ge/Si, Cd/Ca), will be included (BIO2). This will facilitate the simulation of paleoceanographic records and thus aid model testing.
(g) Marine sediments (SEDS): A model of the interaction of the deep-sea geochemical sedimentary reservoir with the overlying ocean will be included in “GENIE-Grid”. This will be derived from an existing representation of opal diagenesis developed at UEA (Ridgwell 2001) and from standard schemes for dissolution of calcium carbonate and remineralisation of organic matter (SEDS1). It will enable GENIE to capture the ‘slow’ (>1 thousand years) response of the ocean carbon cycle to perturbation, and will facilitate model testing by comparing predicted sediment core records with actual records. UEA will define the asynchronous coupling of the sediments to the ocean, and add meta-data to the component code (SEDS2).
(h) Ice sheets (SHEET): An existing ice sheet model (Payne 1999) will be applied to simulate glacial maximum ice sheets and deglaciation in “GENIE-Grid”. It will operate at finer spatial scales (currently scaleable over a range of 20-100+ km grid cells) but longer (decadal) time steps than other components. The model has been developed under two 2-year NERC grants totalling ~£170k. It now includes fast ice flow and can be used to study large-scale surging, important in phenomena such as Heinrich events. The code is currently being parallelised and this will be complete at the start of the project. Bristol will first develop a stand-alone simulation of full glaciation (SHEET1). Then the atmosphere (and ocean) will be coupled in an asynchronous fashion and the coupled model implemented at coarse resolution to aid mass-balance parameterisation (SHEET2). Spatial and temporal changes in ice-sheet extent and thickness, surface albedo, topographic blocking effects on atmospheric circulation, and the output of freshwater to the ocean will all be simulated. Finally, the model will be implemented at finer resolution to capture the flow physics more accurately (SHEET3).

Grid technologies required
The GENIE system will require the application and development of a number of Grid technologies.
(1) Component Wrapping (WRAP): A component repository will be set up to store the meta-data and source code relating to the ‘wrapped’ science components (see below). The science source code (e.g. FORTRAN 77/90, C and C++) will be packaged into individual software components. IC and Soton will develop an XML Schema to capture the meta-data for the science components (WRAP1). This will include the exposed methods and arguments, the inputs, outputs and behaviour of the component, as well as describing the scientific capability of the component.
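As a purely illustrative example of the kind of meta-data such a schema might capture, the sketch below builds a small XML description of the ocean component from plain Java objects. The element and attribute names, the entry-point name and the grid label are all invented for this sketch; the actual schema is a WRAP1 deliverable and is not being specified here.

```java
import java.util.List;

// Hypothetical description of one boundary field exchanged by a component.
record FieldSpec(String name, String units, String grid) {}

// Hypothetical component meta-data record; field names are illustrative only.
record ComponentMetadata(String componentName,
                         String scientificCapability,
                         String entryPoint,            // e.g. a Fortran subroutine name
                         double timeStepHours,
                         List<FieldSpec> inputs,
                         List<FieldSpec> outputs) {

    String toXml() {
        StringBuilder xml = new StringBuilder();
        xml.append("<component name=\"").append(componentName).append("\">\n");
        xml.append("  <capability>").append(scientificCapability).append("</capability>\n");
        xml.append("  <entryPoint>").append(entryPoint).append("</entryPoint>\n");
        xml.append("  <timeStepHours>").append(timeStepHours).append("</timeStepHours>\n");
        for (FieldSpec f : inputs)
            xml.append("  <input name=\"").append(f.name()).append("\" units=\"")
               .append(f.units()).append("\" grid=\"").append(f.grid()).append("\"/>\n");
        for (FieldSpec f : outputs)
            xml.append("  <output name=\"").append(f.name()).append("\" units=\"")
               .append(f.units()).append("\" grid=\"").append(f.grid()).append("\"/>\n");
        return xml.append("</component>").toString();
    }
}

class MetadataDemo {
    public static void main(String[] args) {
        ComponentMetadata ocean = new ComponentMetadata(
            "ocean", "3-D frictional geostrophic ocean circulation", "ocean_step",
            6.0,
            List.of(new FieldSpec("surface_heat_flux", "W m-2", "lonlat_18x18")),
            List.of(new FieldSpec("sea_surface_temperature", "K", "lonlat_18x18")));
        System.out.println(ocean.toXml());
    }
}
```

In the real system such descriptions would be validated against the WRAP1 XML Schema, stored in the component repository, and used both by the portal for component selection and by the database layer to generate table structures automatically.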
IC will develop a simple GUI to define the XML and allow a scientist to integrate a science module within a component through a few lines of manually written code. Soton will use the XML Schema to define the database structure to allow automatic insertion and extraction of the generated datasets (see DB below). Once the initial XML Schemas have been defined, further work packages will wrap the “Component Set 1” science modules: Atmosphere at IC (WRAP2) and Ocean at Soton (WRAP3). They will also write the data interchange modules (WRAP4: IC and Soton), which will allow components to exchange information at domain boundaries, taking into account coordinate system transforms.
(2) Computation (COMP): Computational resources within the collaboration will be formed into a virtual organisation using middleware such as Globus. These resources will comprise traditional supercomputers, Beowulf clusters and Condor pools. The wrapped “Component Set 1” modules will be tested on computational resources at Soton and IC to verify their functionality and the basic integrity of the framework for GENIE-Trainer (COMP1 and COMP2, at IC and Soton respectively). Work will continue to wrap the “Component Set 2” science modules using the technology developed in WRAP1. This will ensure the prototype wrapping technology is viable for use by the environmental science team. IC will integrate its science modules (described using the XML Schema) into the application framework using wrapping code generated in Java. Soton will use the same XML Schema to wrap science modules using web services technology. We envisage using web services as the communication mechanism between resources, while using Java-wrapped components within a computational resource. Such distinctions will be transparent to the end user, but they mirror closely recent developments in Grid computing research (announced by the Globus team, Edinburgh, Jan 2002).
(3) Meta-Scheduler (SCHED): The meta-scheduler collects information relating to the currently available computational resources, the science components, and the application definition (provided by the user), in order to minimise the overall execution time by instantiating components on the most appropriate execution platforms. The user generates the application definition by browsing the existing component meta-data stored within the distributed component repositories. The performance of a component on a particular platform is obtained by interrogating its performance database, which is generated and enhanced whenever a component is executed. By understanding the application structure and exposing the data flows between components we are able to map components optimally to potentially distributed resources, and to re-distribute the components should circumstances dictate, e.g. on the availability of better resources. This work is in an advanced state of development under an EPSRC ‘High Performance Scientific Software Components’ grant (GR N/13371) and will be developed further through funded work within the Reality Grid (EPSRC Pilot Project).
(4) Automated data archiving, querying and post-processing (DB): This will facilitate collaborative sharing of simulation results between partners in the project. Sharing, re-use and exploitation of these data sets require the ability to locate, assimilate, retrieve and analyse large volumes of data produced at distributed locations.
We will use open standards to provide transparent access to the data, along with open source and commercial database systems to provide a robust, secure and distributed back-end for the data handling. The key requirements for the database system are (i) setting up the databases and developing standards, and (ii) writing high-level database post-processing tools, which apply functions to the database and deliver the processed data to the user and back to the database. An early prototype of such a system was developed at Southampton as part of the UK Turbulence Consortium activities in 1998 and has most recently been applied in other engineering domains for automated data archiving (e.g. Cox 2001). This work package will be developed at Southampton and integrated into the GENIE portal, leveraging expertise from two recently funded Grid projects: GEODISE (Grid-based optimisation: EPSRC) and a BBSRC project to deliver a Grid-based bio-molecular database.
Database system (DB1): The underlying database system will use the XML and XML Schema (developed in WRAP1) to specify the portable database infrastructure that underlies our system, and binary formats for the bulk data. This will allow for automated generation and population of the underlying open source/commercial database system (e.g. Storage Resource Broker, DB2, SQL Server, Oracle, Tamino), whilst retaining the flexibility to add new metadata dynamically as part of the post-processing analysis. The post-processing facility will be integrated into the GENIE Portal and will allow user queries to be made to the distributed databases, and re-use of simulation results to seed new calculations.
Data post-processing and database integration (DB2): Analysis and post-processing tools for the distributed data resulting from Grid-based simulation runs will allow new information and knowledge to be deduced from simulation data. For visualisation we will use the tools developed under the “Grid for environmental systems diagnostics and visualisation” project. These are ideal and K. Haines has agreed to collaborate fully. We will also liaise with the NERC DataGrid proposal, if it is successful. Much of the paleo-climate data is held at other data centres (e.g. http://www.ngdc.noaa.gov/paleo and http://www.pangaea.de), but the DataGrid proposal will also be considering model output.
(5) GENIE Portal (PORTAL): The portal will be the web-based mechanism for authenticating users, browsing the component and data repositories, composing simulations, executing them and analysing the results. It will leverage significant ongoing activity at Soton in Problem Solving Environment development (funded by EPSRC) and at IC in the EPIC (“e-Science Portal at Imperial College”) project (a LeSC project funded by the DTI). It will also enable users to monitor an ongoing simulation. The portal will be developed by IC and Soton, with IC focussing on component integration (PORTAL1) and Soton focussing on database integration (PORTAL2). Integration with Globus and Condor, and security issues, will be undertaken by Southampton and Imperial (PORTAL3).

Deliverables
12-month: GENIE-Trainer: Grid-based 3-D ocean and energy-moisture balance atmosphere model. Proof of concept for Grid coupling of components. Used to train environmental science RAs in Grid methodology.
18-month: GENIE-Mini: Grid-based coupled 3-D ocean, 3-D atmosphere, land and sea-ice model. Comparison of multi-decadal simulations with results of the conventional (COAPEC) modelling approach.
Past ~30 kyr simulation with imposed ice sheets and greenhouse gas forcing. Collaborators will be able to access and make simple queries of distributed simulation results over the Grid.
24-month: GENIE-Mini with interactive carbon cycle. Assessment of the robustness of carbon cycle-climate feedback predictions over the 21st century. Extension of this assessment to coming centuries. Assessment of the predictability of paleo-climate events (e.g. the Younger Dryas).
30-month: GENIE-Grid: Grid-based complete model with interactive ice sheets and marine sediments. Fully coupled simulations of the last deglaciation. Simulations of the long-term (next ~20 kyr) response of the Earth system to the addition of fossil fuel carbon. Collaborators will be able to make sophisticated queries of the distributed simulation results and visualise running simulations.
36-month: Presentation of results derived from Grid-based simulations: (i) inter-comparisons with alternative models, (ii) the last glacial termination, and (iii) the long-term response of the Earth system to human activities. The full model and infrastructure will be made available to the community, and training sessions will be held (in collaboration with the National Institute for Environmental eScience, NIEeS).

Nature of the Research Team
Principal Investigator: PJ Valdes (Reading). Co-Investigators: MGR Cannell (CEH Edinburgh), SJ Cox (Southampton), J Darlington (Imperial College), RJ Harding (CEH Wallingford), AJ Payne (Bristol), JG Shepherd (SOC) and AJ Watson (UEA). Recognised Researchers: TM Lenton (CEH Edinburgh), AJ Ridgwell (UEA). Collaborators: PM Cox (Hadley Centre), RM Marsh (SOC). Allied Researchers: NR Edwards (Bern), K Haines (Reading), J Koehler (Tyndall Centre). In addition to the 10 posts requested below, the Southampton e-Science Centre will contribute a funded PhD student who will assist with the Grid technologies for this project.

Maturity of partnership, experience in delivering large, complex projects
The investigators are all recognised leaders in their fields, and have a track record of collaboration. The project will leverage expertise at the London Regional e-Science Centre at Imperial College (directed by Darlington) and Southampton’s Regional e-Science Centre, for which SJ Cox is technical director. Cox is PI for the ‘Grid Enabled Optimisation and Design Search (GEODISE)’ EPSRC-funded e-Science testbed project, which will share key Grid-based technologies with GENIE (e.g. in the Database work-package). All scientific partners in the project have been meeting at workshops over the last 3 years to plan various Earth system modelling activities. Harding co-ordinated the NERC Earth System Modelling Initiative, in which Lenton, Cannell and Marsh participated. Shepherd, Ridgwell and Lenton collaborate on a Tyndall Centre project (IT1.31). Watson supervised the PhDs of Lenton and Ridgwell. Harding has worked extensively with PM Cox and colleagues at the Hadley Centre. Lenton co-ordinates a Research Network in Systems Theory (http://www.cogs.susx.ac.uk/daisyworld) that includes PM Cox and Watson. Shepherd and Valdes have a COAPEC project together, on which Marsh is a recognised researcher, and which also includes Bristol. Shepherd co-ordinates the Earth system modelling initiative (ESMI) at SOC. SJ Cox and Payne have collaborated on a variety of applications of high-performance computing to environmental problems.
In particular, they have developed one of the first ice-sheet models to use a parallel-processing architecture (Takeda et al., in press); they also have a joint NERC-funded project. The e-Science Centres at Southampton and Imperial College are working together on a number of projects and meet regularly at national and international Grid meetings.

Justification for Resources
We request 10 full-time posts (some staggered) to undertake the science, e-Science and co-ordination, plus 4 part-time posts to enable efficient management and operation of the project. The work-packages for which we need each member of the Science and Grid teams are detailed above and in Appendix 1.
Science Co-ordinator and Project Manager. (1) T. Lenton (SSO, CEH Edinburgh) for 3 years. TL will report to the PI, will directly manage achieving the project milestones, and will coordinate the activities of the research team. He will be an integral part of the professional management of this large project and, in consultation with our industrial partners, we will require 50% of his time to fulfil this demanding role. In the other 50% of his time, TL will synthesise the science in the project, including helping in the modelling of global biogeochemical cycles and designing and implementing the simulations. Grid training will be a series of visits to Imperial and Southampton.
Science Team. These posts will provide expertise in each of the key science components.
(2) PDRA at Reading funded for 2 years (starting at month 12), managed by Valdes. The PDRA will be responsible for the execution of the long simulations using GENIE-Mini and GENIE-Grid, with a particular focus on the predictability of the glacial-interglacial transition. Grid training will involve extended visits to Imperial during the first 6 months.
(3) PDRA at SOC funded for 2 years (starting from month 0), managed by Shepherd. Grid training will involve an extended placement (~6 months during the 1st year) at the Southampton e-Science Centre.
(4) A. Ridgwell (PDRA, UEA) for 2.5 years (starting from month 0), managed by Watson. AR will make a vital contribution with the ocean biogeochemical and sediment schemes he developed during his PhD and his skills in Earth system modelling. Grid training will consist of visits to Imperial.
(5) HSO at CEH Wallingford for 2 years (starting at month 6), managed by Harding. They will work closely with Peter Cox of the Hadley Centre, and the Joint Centre for Hydro-Meteorological Research provides an ideal setting for this collaboration. Grid training will involve visits to Southampton.
(6) PDRA at Bristol for 2 years (starting at month 12), managed by Payne. Grid training will consist of a ~3-month placement at the Southampton e-Science Centre.
The environmental science PDRAs are requested at salary point 6, and we believe that we will be able to recruit suitable staff to this exciting proposal at this pay scale.
Grid Team. These posts will provide expertise in each of the key technologies.
(7, 8) 2 Grid PDRAs at the Southampton Regional e-Science Centre for 3 years each, managed by Cox. They will liaise closely with the team at Imperial and the environmental scientists at SOC, Bristol, and CEH Edinburgh & Wallingford.
(9, 10) 2 Grid PDRAs at Imperial for 3 and 2.5 years each, managed by Darlington. They will liaise closely with the team at Southampton and the environmental scientists at Reading, UEA and CEH Edinburgh. We have requested salary points between 10 and 15 for the 4 e-Science RAs.
This high level of salary relates directly to the pay scales of the IT/application professionals who can deliver, to fixed deadlines, the high-quality software engineering we need to deliver GENIE, and who have skills in e.g. XML (and W3C protocols), databases, C/C++, Java, PSE development, UDDI, CORBA, and applied Grid technologies. Our survey of sites advertising IT jobs with similar skills indicates that salaries at this level should allow us to recruit in this extremely competitive market.
Support Staff. We have requested 20% of a full-time system programmer at point 15 on the ADC2 scale, again targeted at the level required to recruit a highly competent member of professional IT staff in this competitive market. This job will involve administering our distributed Grid-based system, including a variety of parallel and distributed cluster computing resources; installing, maintaining and patching our Grid middleware (e.g. Globus/Condor); and maintaining the operating systems (e.g. Linux/Windows). Of particular importance and relevance to this project will be keeping up-to-date security patches applied to web services to ensure the integrity of our systems. This represents a significant specialist load for a member of staff that considerably exceeds the base-level provision of computing infrastructure provided by the relevant service providers at each site.
Secretarial and Administrative Staff. This project envisages significant and effective interaction with the various academic and industrial partners. The degree of reporting, correspondence and administrative arrangements will be higher than for a simple stand-alone research project: we are therefore requesting to purchase, at several sites, 10-20% of the time of our existing skilled secretarial and administrative staff. Due to the distributed nature of the project, this support is naturally distributed amongst the sites.
Travel Costs. To ensure full benefit is obtained from this project, it is essential that we exchange ideas with other workers in the field by presenting our work and being represented at appropriate national and international conferences and Grid/e-Science forums and meetings. We have requested funds to allow each of the requested staff to attend 1-2 meetings each year. This is in line with the number of such meetings that the investigators have attended over the last several years. It is also considered vital that international travel be supported, since much e-Science activity will be happening in the USA and mainland Europe. We have requested funds to allow the science PDRAs to spend significant periods at the e-Science sites and the e-Science PDRAs to visit the scientists, and for travel to our quarterly project review meetings and biannual review meetings with our industrial partners. This is an integral part of the management of the project and will speed up delivery of the various components of GENIE, along with dissemination to our academic and the wider community. This level of support is based directly on the amounts spent over recent years by the investigators in other comparable projects for on- and off-campus meetings at a variety of sites.
Computing Facilities. The staff employed on the grant will need dedicated, high-quality computers and associated equipment to enable them to function in-office and to provide facilities when travelling to other academic partners.
The modest additional infrastructure costs will provide for essential machines to develop and host the repositories and data archives (particularly important for testing Grid middleware and distributed web services before deployment), disk file servers, and tape backup facilities related directly to the project and its staff. These are in addition to the facilities offered at each of the partner sites, which are detailed in Appendix 2.
Office consumables. This includes specialist books (e.g. on Web technologies, which tend to be expensive due to their target IT market and limited life expectancy), specialist journal purchases, printer consumables (for binding copies of documentation and information from the web, where appropriate), paper, photocopying, storage media (e.g. CDs), telephone, and fax services. The sum requested is in line with the level of spending that the investigators have required in delivering the Grid/e-Science/Web technology projects they have worked on to date.

Management
In a project with this complexity of interlocking parts it is important both that progress is planned realistically and that each task is monitored from the start. Each PI has responsibility for a specific software development item and/or model component. Overall management of the project will be the responsibility of the management committee, which will be chaired by the PI and comprise all co-PIs and the scientific co-ordinator/project manager (T. Lenton). It will meet quarterly and set project goals. An Earth System science steering group, chaired by Valdes, will ensure that the scientific goals are met and that the timescale for the building and adaptation of the component models is maintained. An architecture group, chaired by Cox, will ensure that the Grid infrastructure for the project is delivered to the correct time schedule. Both PIs have considerable experience in managing large research groups. T. Lenton's duties will include active coordination of the Earth System science tasks, as well as liaising with the architecture group on the Grid tasks. Quarterly project meetings will be held, in which all participants and interested parties will meet to exchange information on scientific and Grid-framework issues and to ensure that the project deliverables are on schedule. Intervening meetings will use the Access Grid facilities at the e-Science Centres in Edinburgh, London and Southampton. The part-time project administrator will co-ordinate meetings and paperwork. Financial management will be the responsibility of the PI's institution (Reading). The PI will report to NERC at the times of deliverables.

Connectivity
We will hold a workshop at ca. 24 months into the project, after GENIE-Mini is delivered. This will provide an opportunity to involve industry and interested scientists. We have had informal discussions with the director of the new National Institute for Environmental eScience (NIEeS) and have proposed that a summer school and/or workshop on Earth system modelling should be held. This will help promote the subject area and train and build a community of e-Science-educated Earth system modellers.
Industry involvement: Industrial support in kind is provided by a joint team of Intel and Compusys staff (see attached letters of support), who view this project as an important exemplar of Grid-based technologies.
Intel is the world's largest computer hardware company, and will assist in the provision of state-of-the-art systems technology throughout the lifetime of the project to integrate into our Grid testbed. Intel is also supporting the teams developing some of the Grid middleware that we will use (e.g. Condor, with whom Cox is working on a variety of projects). Compusys are one of Europe's leading high-performance cluster integrators, and recently supplied a 324-node commodity system to the University of Southampton. Their collaboration on this project will further assist the Southampton University Computing Service in delivering a Grid-enabled service on this new facility.
International links: We will link to the ‘Bern group’ led by Thomas Stocker and to other groups in Europe and the USA. In particular, our collaborator Neil Edwards, the developer of the 3-D ocean model, has a 4-year fellowship in Bern on ‘Efficient Earth System Models and the role of surface feedbacks in decadal to centennial variability’. We will engage fully in the international Earth system modelling community, including contributing to the IGBP Global Analysis, Integration and Modelling (GAIM) programme (Lenton is on the task force). The closest Grid activity to the proposed work is the Earth System Grid (“Turning Climate Model Datasets Into Community Resources”) in the USA, which is much more focussed on sharing large volumes of data. In principle GENIE will complement this work, with its focus on a rich set of components (beyond ocean and atmosphere coupling), whose development has been supported by NERC over many years, and the resulting studies of long-term and paleo-climate change.

Growth, outreach and exploitation
Our approach is specifically designed to build capacity in this area, engage newcomers to Earth system modelling, and thus encourage growth in this multi-disciplinary subject area. There already exists a large community that has an interest in using and expanding Earth system models. The work proposed in this project will be disseminated via the availability of working systems (GENIE-Trainer, GENIE-Mini, GENIE-Grid) to the academic partners. This will be backed up by publications in academic journals and at conferences in the normal way. The academic partners will share experiences with other national and international e-Science activities at conferences and by active collaboration with other UK e-Science consortia. Once in place, the GENIE modelling framework will be made available to the wider NERC community. A major route for exploitation will be via community use and extension of the framework. Scientists who are not necessarily experts in modelling or computational techniques will be able to create and execute sophisticated whole Earth system model simulations. The distributed data from such simulations will be accessible for visualisation by the whole community. Modellers will be able to wrap new science components and contribute them to the repository. In this way, the usability, flexibility and extensibility of the GENIE system will enable a dynamic virtual organisation of Earth system scientists, and will greatly ease the construction of future generations of Earth system models.

References
Hunke, E.C. & Dukowicz, J.K. Journal of Physical Oceanography 27, 1849-1867 (1997).
Imbrie, J., et al. Paleoceanography 8, 699-735 (1993).
Kapsner, W.R., et al. Nature 373, 52-54 (1995).
Lenton, T.M. Tellus 52B, 1159-1188 (2000).
Noblet, N.I., de, et al. Geophysical Research Letters 23, 3191-3194 (1996).
Payne, A.J. Climate Dynamics 15, 115-125 (1999).
Petit, J.R., et al. Nature 399, 429-436 (1999).
Petoukhov, V., et al. Climate Dynamics 16, 1-17 (2000).
Ridgwell, A.J. PhD thesis, UEA, Norwich, UK (2001).
Shackleton, N.J. Science 289, 1897-1902 (2000).
Takeda, A.L., et al. Computers and Geosciences (in press).
Valdes, P.J. & Hoskins, B.J. Journal of the Atmospheric Sciences 46, 2509-2527 (1989).
Watson, A.J., et al. Nature 407, 730-733 (2000).
Weaver, A.J., et al. Atmosphere-Ocean 39, 361-428 (2001).
Archer, D., et al. Global Biogeochemical Cycles 12, 259-276 (1998).
Berger, A., et al. Climate Dynamics 14, 615-629 (1998).
Blunier, T., et al. Nature 394, 739-743 (1998).
Claussen, M., et al. Geophysical Research Letters 26, 2037-2040 (1999).
Cox, P.M., et al. Climate Dynamics 15, 183-203 (1999).
Cox, P.M., et al. Nature 408, 184-187 (2000).
Cox, S.J., et al. Proc. ICPP 2001, IEEE Computer Society Press (2001).
Edwards, N.R., et al. Journal of Physical Oceanography 28, 756-778 (1998).
Edwards, N.R. & Shepherd, J.G. Climate Dynamics (in press).
Foster, I., et al. Intl. J. High Performance Computing Applications 15, 200-222 (2001).
Hibler, W.D. Journal of Physical Oceanography 9, 815-846 (1979).

Appendix 1: Timetable of work packages
Columns (6-month periods): 0-6, 6-12, 12-18, 18-24, 24-30, 30-36. Rows (partner posts): IC 1, IC 2, Soton 1, Soton 2, Edinburgh, Reading, SOC, UEA, Wallingford, Bristol. Scheduled work packages and activities across these periods: WRAP1, WRAP1 (for DB1), WRAP2, WRAP3, WRAP4, COMP1, COMP2, COMP3, computation, SCHED, PORTAL1, PORTAL2, PORTAL3, DB1, DB2, EMBM1, OCEAN1, OCEAN2, ATMOS1, ATMOS2, ICE1, BIO1, BIO2, LAND1, LAND2, LAND3, SEDS1, SEDS2, SHEET1, SHEET2, SHEET3, Planning, Co-ordinate, Simulations, Integration, Deployment, Dissemination, Documentation, Presentation.

Appendix 2: Existing Computing Resources from Partners
Southampton Regional e-Science Centre and SOC: access to University SGI Origin 2000 (24-node); 324-processor Intel Linux Beowulf cluster; SGI Origin 3000 planned.
London Regional e-Science Centre at Imperial: 24-processor Sun E6800; 32-processor Compaq Alpha cluster; 22-processor Linux cluster; resources being expanded over the next three years through £3M SRIF.
Reading Meteorology: 6-processor SGI Origin 2000; access to University SGI machine; Condor-based system of >50 Sun workstations being established.
UEA: 3 dual-processor Compaq Alpha DS20 machines.
Bristol: access to a 160-processor Beowulf cluster with fast connections.
CEH Edinburgh: ~100-processor Beowulf cluster (Intel 1.8 GHz).
CEH Wallingford: ~50-processor cluster (Sun 700 MHz).
Local Area Network (LAN) connections: 100 Mbps at all sites, with maximum capacity 1 Gbps.
Wide Area Network (WAN) connections: SOC and Reading have 34 Mbps connections, with plans to increase to 1 Gbps. Bristol is on the SuperJanet4 backbone with 2.5 Gbps links to Reading and Edinburgh. SuperJanet3 connections at Bristol and UEA are 155 Mbps. CEH sites have 2 Mbps connections, with increases under negotiation. Imperial College is a point of presence on the London MAN with a direct 1 Gbps connection. Southampton also has a 1 Gbps connection. We will feed back any additional or ongoing networking requirements that arise as a result of the project to the Grid Network Team, which has close links to Cox and Darlington.