FY2014 Q2 Informal Status

ESPC-RRTMGP Project FY14Q2 Update
Award Number: N000141310858
Project Title: RRTMGP: A High-Performance Broadband Radiation Code for the Next
Decade
PIs and co-PIs:
Eli Mlawer (AER)
Robert Pincus (Colorado)
David Berthiaume (AER)
Brian Eaton (NCAR)
Ming Liu (NRL)
1) A kick-off meeting was held for this project on February 12-13, 2014, hosted by
John Dennis at NCAR. Attendees were Mlawer, Pincus, Berthiaume, Eaton, John
Dennis (NCAR), Jim Edwards (NCAR), Tim Whitcomb (NRL), Jed Brown (ANL), and
Tom Henderson (NOAA). This meeting provided an opportunity for the entire team
to get an overview of the project, gain a better understanding of the
computational environment in which RRTMGP would operate as part of the NCAR
and Navy global models, discuss particular aspects of the planned development
in detail, and decide on next steps.
The meeting opened with a series of talks to ensure that all team members were
familiar with the current structure of RRTMG, previous derivatives of the code, and
the NCAR computational environment. (All of these presentations have been
uploaded to the ESPC-RRTMGP wiki.) First, PI Mlawer presented the motivation for
the project, key information about radiation calculations in GCMs, and details about
RRTMG and its stored tables and interpolation algorithms. This led into co-I
Pincus’s talk about his refactoring of RRTMG into PSRad, a much better structured
code, and the work done at AER to port RRTMG to a GPU (RRTMGPU). Finally,
Dennis spoke about issues related to the NCAR codes that may impact the direction
of the RRTMGP development.
The in-depth discussion that followed these presentations was very constructive.
Some of the issues discussed included:
a) Pincus presented a draft of the potential modular structure for the future code
(briefly summarized in the previous quarterly status report), which is heavily based on
the structure used in PSRad. After some discussion, the NCAR participants stated
that this would be a favorable starting point from their perspective. They requested
that the existing PSRad code be provided to them for profiling in some of the
computational environments in use at NCAR. It was decided that NCAR will run
PORT on both RRTMG and PSRad.
b) Vectorization of the code was discussed, especially key issues in the current code
that inhibit vectorization. A great deal of discussion centered on the current gas
optics code (taumol), which is structured into an individual subroutine for each spectral band.
Due to the different physical mechanisms that impact each band, as well as specific
adjustments needed in certain bands to attain desired accuracy, these subroutines
are all different, an impediment to effective vectorization. Furthermore, an
interpolation scheme used in many bands employs a branching technique due to
non-linear behavior (see Figure 1), which limits efficient vectorization on certain
hardware. Brown suggested that an interpolation scheme based on Chebyshev
polynomials might be an effective alternative approach that would be amenable to
vectorization, so the team decided to further pursue this idea after the meeting.
c) There was much discussion about needed code properties and the code
development process, with NCAR emphasizing the need for the code to be
structured in a way that facilitates unit testing, enables effective profiling, and is
modular enough to be easily extendable. Berthiaume volunteered to design a
framework for RRTMGP that he would circulate after the meeting.
d) There was considerable discussion about implementing a framework that could
run efficiently on MIC processors, GPUs, and conventional CPUs, as well as
possible combinations of these hardware platforms in cluster environments. This
generality would take precedence over some more aggressive platform-specific
optimizations. A combination of OpenMP and OpenACC will be used to facilitate this
generality. Since the problem space has enough fully independent calculations, it is
thought that this unification of OpenMP and OpenACC into a generalized framework
will be an effective and future-proof approach.
The participants all felt that this was a very useful and productive meeting that set
us up well for the next stage of the project.
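The vectorization issue raised in item b) can be illustrated with a small sketch.
The Python below is not RRTMG code: the table values, the function names, and the
use of a quadratic through three stored points as the "3-point" method are all
assumptions made for illustration. It contrasts a per-value branch on η with a
mask-based form that evaluates every formula for all values and selects afterwards.

```python
import numpy as np

# Hypothetical table: 9 stored values on an evenly spaced eta grid (0, 0.125, ..., 1).
ETA_GRID = np.linspace(0.0, 1.0, 9)
K_TABLE = np.exp(-3.0 * ETA_GRID) + 0.1 * ETA_GRID  # synthetic, not RRTMG data


def linear(eta, i):
    """Linear interpolation between grid points i and i+1."""
    t = (eta - ETA_GRID[i]) / (ETA_GRID[i + 1] - ETA_GRID[i])
    return (1.0 - t) * K_TABLE[i] + t * K_TABLE[i + 1]


def three_point(eta, j):
    """Quadratic (Lagrange) interpolation through grid points j, j+1, j+2."""
    x0, x1, x2 = ETA_GRID[j], ETA_GRID[j + 1], ETA_GRID[j + 2]
    y0, y1, y2 = K_TABLE[j], K_TABLE[j + 1], K_TABLE[j + 2]
    return (y0 * (eta - x1) * (eta - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (eta - x0) * (eta - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (eta - x0) * (eta - x1) / ((x2 - x0) * (x2 - x1)))


def interp_branching(eta):
    """Scalar version: the branch on eta is what hinders vectorization."""
    if eta < 0.125:
        return three_point(eta, 0)
    if eta > 0.875:
        return three_point(eta, 6)
    i = min(int(eta * 8), 7)
    return linear(eta, i)


def interp_masked(eta):
    """Array version: compute every formula, then select with masks."""
    eta = np.asarray(eta, dtype=float)
    i = np.clip((eta * 8).astype(int), 0, 7)
    lin = linear(eta, i)
    low = three_point(eta, 0)
    high = three_point(eta, 6)
    return np.where(eta < 0.125, low, np.where(eta > 0.875, high, lin))
```

The masked form does redundant arithmetic, which is the usual trade-off when a
branch is removed; a single scheme valid over the whole η range would avoid both
the branch and the extra work.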
2) A few weeks after the meeting, Berthiaume sent his proposed framework for
RRTMGP to the team. Figure 2 schematically depicts how the framework would
work. There are two base classes: dataset and component. A dataset contains a
group of similar data that could represent similar physical properties. Example
datasets could be cloud properties, molecular amounts, and geometry information.
Classes that derive from the dataset base class conform to a specific interface and
implement functionality that saves and loads the data to/from files, checks the data
to make sure the values are reasonable, computes the memory footprint, etc. A
component represents a calculation, such as the gas optics, cloud properties, and
flux computation. A particular calculation derives from the base component class
and inherits all of its functionality. Datasets are attached to the inputs and outputs
of components, and each component can have multiple inputs and outputs.
This proposed framework would allow the code to be generalized to different
hardware platforms utilizing the required optimizations, while allowing effective
unit testing and making the code highly extensible without requiring any of the
lower level components to be changed.
Use of this framework in the new code is still under discussion, although there is
some thought that it is too general and complex for the project and potential users.
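To make the dataset/component idea concrete, a minimal Python sketch follows. It
is only an illustration of the structure described above, not the proposed
RRTMGP implementation: the dictionary data layout, the JSON file format, the
non-negativity check, and the toy TotalColumn component are all assumptions.

```python
import json
from abc import ABC, abstractmethod


class DataSet(ABC):
    """A group of related data; derived classes supply the validity check."""

    def __init__(self, data=None):
        self.data = dict(data or {})  # name -> list of floats (assumed layout)

    @abstractmethod
    def check(self):
        """Return True if every value is reasonable."""

    def save(self, fname):
        with open(fname, "w") as f:
            json.dump(self.data, f)

    def load(self, fname):
        with open(fname) as f:
            self.data = json.load(f)

    def size(self):
        """Approximate footprint in bytes, assuming 8-byte floats."""
        return 8 * sum(len(values) for values in self.data.values())


class MolecularAmounts(DataSet):
    """Column amounts for each molecule."""

    def check(self):
        # "Physically realistic" here just means non-negative (an assumption).
        return all(v >= 0.0 for values in self.data.values() for v in values)


class Component(ABC):
    """A calculation with datasets attached to its inputs and outputs."""

    def __init__(self, inputs, outputs):
        self.inputs = list(inputs)
        self.outputs = list(outputs)

    def check_inputs(self):
        return all(d.check() for d in self.inputs)

    def check_outputs(self):
        return all(d.check() for d in self.outputs)

    @abstractmethod
    def run(self):
        """Fill the output datasets from the input datasets."""


class TotalColumn(Component):
    """Toy component (hypothetical): sums the column amounts over molecules."""

    def run(self):
        amounts = self.inputs[0].data
        totals = [sum(layer) for layer in zip(*amounts.values())]
        self.outputs[0].data = {"total": totals}
```

Because every component exposes the same check and run interface, a unit test
can attach saved input datasets, call run(), and compare against saved outputs
without knowing what the component computes.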
3) Following up on the discussion about taumol in Boulder, AER generated a dataset
of errors in interpolated absorption coefficients for a sample spectral band in
RRTMG. For this calculation, code was written to compute and output the
absorption coefficient in the current code for values of the binary species parameter
(η) of 0, 0.01, 0.02, ..., 0.99, 1.0. In addition, paired runs of LBLRTM and our
k-distribution code were performed for these η values to obtain reference values. As
shown in Figure 3, the mean error due to the current interpolation scheme (linear
for 0.125<η<0.875, 3-point otherwise) is less than 0.2% in the linear regime, and
can get as high as 0.6% for η<0.125. The standard deviation values are typically
less than 1%. (Only errors for η values less than 0.75 are shown since the large
majority of cases fall in this region.) These error statistics will be used to
determine the order of the
Chebyshev polynomials needed in the to-be-designed polynomial approach to
ensure that the results of the new method are at least as accurate as the existing
approach.
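One way the polynomial order could be chosen from an error target is sketched
below in Python with NumPy. This is a sketch under stated assumptions, not the
method the team has adopted: synthetic_k is a made-up smooth function standing in
for the LBLRTM-based reference values, and the error measure (maximum relative
error on the η = 0, 0.01, ..., 1.0 grid) is one possible choice.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def synthetic_k(eta):
    """Stand-in for reference absorption coefficients (not LBLRTM output)."""
    return np.exp(-3.0 * eta) + 0.1 * eta


def min_chebyshev_degree(func, tol_percent, max_degree=20):
    """Smallest degree whose maximum relative error (in %) is below tol_percent."""
    eta_check = np.linspace(0.0, 1.0, 101)  # eta = 0, 0.01, ..., 1.0
    ref = func(eta_check)
    for degree in range(1, max_degree + 1):
        # Chebyshev points of the first kind, mapped from [-1, 1] to [0, 1].
        nodes = 0.5 * (C.chebpts1(degree + 1) + 1.0)
        # degree + 1 samples and a degree-`degree` fit: the fit interpolates them.
        fit = C.Chebyshev.fit(nodes, func(nodes), degree, domain=[0.0, 1.0])
        err = 100.0 * np.max(np.abs(fit(eta_check) - ref) / np.abs(ref))
        if err < tol_percent:
            return degree, err
    raise ValueError("tolerance not reached within max_degree")
```

Calling min_chebyshev_degree(synthetic_k, 0.2) returns the lowest degree that
keeps the synthetic curve within 0.2%, mirroring how the error statistics in
Figure 3 would set the target for the real tables.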
4) A draft version of the taumol code was written that unifies all of the band
calculations by generalizing the interpolation, the input data, and the absorption
coefficient layout. This unification will allow for cleaner code, better maintainability
(a modification to the algorithm only needs to be done once rather than once for
each band), and possibly more efficient vectorization by allowing greater developer
focus on the core interpolation algorithm.
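The idea of a single routine serving every band can be sketched as follows. The
Python below is illustrative only and is not the draft taumol code: the array
layout (band, η grid point, sub-interval), the function name, and the restriction
to linear interpolation in η alone (no pressure or temperature dependence) are
assumptions chosen to keep the example short.

```python
import numpy as np


def gas_optics_unified(k_tables, eta, band):
    """Linearly interpolate absorption coefficients in eta for any band.

    k_tables : array (n_band, n_eta, n_g), hypothetical unified layout
    eta      : array (n_col,), binary species parameter in [0, 1]
    band     : int, band index
    returns  : array (n_col, n_g)
    """
    n_eta = k_tables.shape[1]
    x = np.clip(eta, 0.0, 1.0) * (n_eta - 1)   # position on the eta grid
    i = np.minimum(x.astype(int), n_eta - 2)   # lower grid index
    w = (x - i)[:, None]                       # weight of the upper grid point
    table = k_tables[band]
    return (1.0 - w) * table[i] + w * table[i + 1]
```

With the band reduced to an index into shared arrays, a change to the
interpolation is made in one place, and the inner arithmetic operates on whole
arrays of columns at once.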
Figure 1. For a single pressure, temperature, and sub-interval in LW band 5, the
absorption coefficients (blue) stored in RRTMG as a function of the code’s binary
species parameter η. Linearly interpolated values for all η values are shown in red
(although only used in the code for 0.125<η<0.875). For 0<η<0.125 and 0.875<η<1,
the values obtained from the 3-point interpolation method used by RRTMG are
shown in green. The different interpolation techniques used for different η values are
not conducive to vectorization.
Figure 2. Proposed framework for RRTMGP. Diagram contents:

DataSet (abstract base class)
check() – checks the data for reasonable values
save() – saves the dataset to a file
load() – loads the dataset from a file
size() – computes the total size (in bytes, KB, or MB) of the data

Molecular Amounts DataSet (derived dataset)
Column amounts for each molecule
check() – override: makes sure that the column amounts all have reasonable,
physically realistic values

Component (abstract base class)
DataSet inputs(:) – array of (derived) datasets that this component uses for
input (could include optional members)
DataSet outputs(:) – array of (derived) datasets that this component uses for
output
checkInputs() – calls check() for each of the inputs in the dynamic array of inputs
checkOutputs() – calls check() for each of the outputs in the dynamic array of outputs
run() – runs the component
saveInputs(fname) – saves the inputs to a file
saveOutputs(fname) – saves the outputs to a file
loadInputs(fname) – loads the inputs from a file
loadOutputs(fname) – loads the outputs from a file
test() – uses the save and load routines to perform a test of this component
computeMemory() – computes the amount of memory this component uses

Other box labels in the diagram: Calculate Fluxes No Scattering; MCICA
Scattering; Random Exponential Overlap
Figure 3. Mean percentage error (standard deviation also shown) in absorption
coefficients due to interpolation scheme for Band 5 (700-820 cm⁻¹) lower
atmosphere in RRTMG as a function of binary species parameter (η) values less than
0.75.