
Centre IRD de Bretagne
The ROMS model and its use on NYMPHEA
Patrick Marchesiello
Brest, 13 January 2005
ROMS History
• Descendant of SPEM & SCRUM (relative of POM)
(Song & Haidvogel, 1994; Barnier et al., 1998)
• UCLA: more of a developer's code (Shchepetkin et al., 1998, 2003, 2004; Marchesiello et al., 2001, 2003, ...)
http://www.atmos.ucla.edu/cesr/ROMS_page.html
• Rutgers: larger user community & support
http://marine.rutgers.edu/po/index.php?model=roms
• IRD Brest & UCLA & INRIA
http://www.brest.ird.fr/Roms_tools
- AGRIF: Adaptive Grid Refinement In Fortran (Debreu, 1999)
- Pre-processing tools (Penven, Marchesiello)
Collaborators and Users
FRANCE
• IRD Brest: Penven, Marchesiello et al.
• LMC Grenoble: Debreu et al.
• LPO Brest: Le Gentil et al.
USA
• UCLA: McWilliams, Shchepetkin, et al.
• JPL: Chao et al.
• Rutgers U.: Arango et al.
USERS
• France: Brest, Paris, Toulouse, Noumea
• Europe: Germany (U. Bremerhaven), Italy (JRC), Portugal (IPIMAR), Spain (AZTI)
• Africa: Morocco (INRH), Senegal (LPA), South Africa (U. Cape Town)
• America: California, Peru (IMARPE), Chile (U. Concepción), Brazil
ROMS Main features
• Hydrostatic, Boussinesq primitive equations
• Free surface
• Generalized vertical s-coordinate
• Horizontal curvilinear coordinates
• High-order, low-dispersion numerics
• Embedded domains: AGRIF
• Open boundary conditions
• Boundary layer parameterizations
• Parallelization: OpenMP, MPI
• Domain partitioning
• Optimized for vector computers
• Fortran 95
• UNIX/Linux
• C preprocessor
• NetCDF library, used for all I/O (see the sketch after this list)
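Since all I/O goes through NetCDF, any output file can be read with the standard netcdf-fortran interface. A minimal sketch, assuming a history file roms_his.nc holding a 2-D free-surface field zeta (the file and variable names are illustrative, not prescribed by the slides):

  program read_zeta
    use netcdf
    implicit none
    integer :: ncid, varid, status, nx, ny
    integer :: dimids(2)                 ! assumes a 2-D variable
    real, allocatable :: zeta(:,:)
    status = nf90_open("roms_his.nc", NF90_NOWRITE, ncid)
    if (status /= nf90_noerr) stop "cannot open file"
    status = nf90_inq_varid(ncid, "zeta", varid)          ! locate the variable
    status = nf90_inquire_variable(ncid, varid, dimids=dimids)
    status = nf90_inquire_dimension(ncid, dimids(1), len=nx)
    status = nf90_inquire_dimension(ncid, dimids(2), len=ny)
    allocate(zeta(nx, ny))
    status = nf90_get_var(ncid, varid, zeta)              ! read the field
    status = nf90_close(ncid)
    print *, "zeta range:", minval(zeta), maxval(zeta)
  end program read_zeta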
Numerics: Motivation
Figure: comparison of numerical schemes (from Kantha and Clayson, 2000, after Durran, 1991).
Numerics: Strategy
High-order accurate methods:
Sanderson (1998): the optimal choice (lowest cost for a given accuracy) for general ocean circulation models is 3rd- or 4th-order accurate methods (see the sketch after this list).
With special care given to:
• Numerical dispersion
• Pressure gradient
• Mode splitting
• Combination of methods
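A back-of-envelope version of the cost-accuracy argument (a sketch, not Sanderson's actual derivation): if the cost per grid point per step grows roughly linearly with the order $p$ (wider stencils), the total work of a 3-D run over a fixed interval scales as $W \propto p\,N^4$ ($N^3$ points times $N$ time steps under a CFL constraint), while the truncation error scales as $\varepsilon \propto N^{-p}$. Eliminating $N$ gives $W \propto p\,\varepsilon^{-4/p}$, and minimizing $\ln W = \ln p + (4/p)\ln(1/\varepsilon)$ over $p$ yields $p^* = 4\ln(1/\varepsilon)$; for the modest relative accuracies affordable in basin-scale modelling ($\varepsilon$ of a few tens of percent), this lands at $p^* \approx 3$ to $4$.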
Numerics in ROMS
(Shchepetkin & McWilliams, 1998, 2003, 2004)
• Horizontal ("C") and vertical staggered grids
• Time stepping
– Split-explicit barotropic and baroclinic modes with two-way time filter
– Predictor-corrector Leapfrog/Adams-Moulton 3rd-order scheme with feedback between momentum & tracer equations
– Non-uniform density in the barotropic mode
– Conservative and constancy-preserving advection for tracers
• Advection
– 3rd-order upstream-biased (QUICK) scheme (see the stencil below)
• Vertical terms
– Parabolic spline reconstruction for the horizontal pressure gradient and advection terms (equivalent to 8th order)
– Implicit Crank-Nicolson scheme for vertical mixing terms (see the sketch below)
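For reference, the QUICK interpolation behind the advection scheme, written in one dimension: for $u_{i+1/2} > 0$,
$$\tilde q_{i+1/2} = \tfrac{1}{8}\left(6\,q_i + 3\,q_{i+1} - q_{i-1}\right),$$
with the mirror stencil $\tfrac{1}{8}(6\,q_{i+1} + 3\,q_i - q_{i+2})$ for $u_{i+1/2} < 0$; the upstream bias trades the pure dispersion of centered schemes for a small, scale-selective damping. (The exact ROMS stencil may differ in detail; see Shchepetkin & McWilliams.)

And a minimal sketch of the Crank-Nicolson treatment of vertical mixing on one water column: a hypothetical stand-alone routine, not the actual ROMS code, assuming uniform layer thickness and no-flux boundaries.

  ! Crank-Nicolson vertical diffusion: (I - dt/2 D) q(n+1) = (I + dt/2 D) q(n),
  ! where D is the tridiagonal diffusion operator; solved with the Thomas
  ! algorithm. Illustrative only.
  subroutine cn_vdiff(n, dt, dz, kv, q)
    implicit none
    integer, intent(in)    :: n        ! number of vertical levels (n >= 2)
    real,    intent(in)    :: dt, dz   ! time step, uniform layer thickness
    real,    intent(in)    :: kv(n-1)  ! mixing coefficient at interfaces
    real,    intent(inout) :: q(n)     ! tracer column, updated in place
    real    :: a(n), b(n), c(n), r(n), alpha, m
    integer :: k
    a = 0.; b = 1.; c = 0.; r = q
    do k = 1, n-1                      ! interface k couples levels k and k+1
       alpha  = 0.5 * dt * kv(k) / dz**2
       b(k)   = b(k)   + alpha;  c(k)   = c(k)   - alpha
       b(k+1) = b(k+1) + alpha;  a(k+1) = a(k+1) - alpha
       r(k)   = r(k)   + alpha * (q(k+1) - q(k))
       r(k+1) = r(k+1) + alpha * (q(k)   - q(k+1))
    end do
    do k = 2, n                        ! forward elimination
       m    = a(k) / b(k-1)
       b(k) = b(k) - m * c(k-1)
       r(k) = r(k) - m * r(k-1)
    end do
    q(n) = r(n) / b(n)                 ! back substitution
    do k = n-1, 1, -1
       q(k) = (r(k) - c(k) * q(k+1)) / b(k)
    end do
  end subroutine cn_vdiff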
Numerics: Performance
Figure: C. Blanc region at 0.25° resolution, POG versus ROMS.
ROMS_AGRIF
The same model (executable) runs on grids with different space/time resolutions.
• Each domain has its own input/output files
• Grid locations are specified in AGRIF_FixedGrids.in
• Works with OpenMP/MPI
• Forcing and initial conditions are generated with an interactive Matlab tool: "nesting gui"
Example AGRIF_FixedGrids.in from the slide:
  2
  20 45 34 59 3 3 3
  30 55 70 89 3 3 2
  0
  1
  10 30 20 40 5 3 5
  0
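Reading the listing under the usual AGRIF convention (an interpretation; the slide itself does not label the columns): the first line gives the number of child grids; each grid line then gives the child's position in its parent, imin imax jmin jmax, followed by the refinement factors in x, y, and time; and the subsequent integers give, recursively, how many children each grid has in turn. The example above would thus declare two children of the root grid and one grandchild.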
Nymphea Implementation
• Compilation
– Software required: Fortran 95, UNIX, C preprocessor, NetCDF library
– A compilation interface in ROMS defines the machine-dependent options (Tru64 UNIX); see the sketch after this list
• Parallelization
– OpenMP: 1 node of 4 processors
– MPI: for process studies (S. Le Gentil); needs work for realistic applications
• Applications
– Realistic: coastal regions of West Africa (Morocco and Senegal), Iroise Sea, Bay of Brest
– Process studies at high resolution
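The C preprocessor is what turns a single source tree into many configurations: options are defined in a header and the Fortran sources are passed through cpp before compilation. A sketch of the pattern only; the option and routine names below are made up, not the actual ROMS CPP keys:

  #define OPENMP
  #undef  MPI
        subroutine init_parallel
  #ifdef OPENMP
        call setup_openmp_tiles()        ! threads over subdomain tiles (hypothetical)
  #elif defined MPI
        call setup_mpi_decomposition()   ! one subdomain per process (hypothetical)
  #else
        call setup_serial()              ! hypothetical serial fallback
  #endif
        end subroutine init_parallel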
ROMS_AGRIF for West Africa
Figure: nested configuration with a 25 km W. Africa parent grid and a 5 km Sahara child grid spanning C. Blanc and C. Vert; Mercator, Levitus, and Clipper appear as data-source labels; child grid of 242*252*32 points with dt=720 s.
PERFORMANCE: COST
Configuration:
• 2 embedded grids with refinement coefficient = 5
• Size (child grid): 242*252*32 points with dt=720 s
• Duration of simulation: 10 model years
• Processors: 1 node of 4 Alpha EV68 processors (1 GHz)
• Parallelization with OpenMP
• Partitioning: 4*8
Cost: c = 6×10⁻⁶ CPU seconds / grid point / time step
(Total run time = 15 days)
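As a rough consistency check (counting only the finest grid, and assuming c sums CPU seconds over the 4 processors, which is an interpretation, not stated on the slide): 10 years at dt = 720 s is about 4.4×10⁵ steps, and 242*252*32 ≈ 2.0×10⁶ points, so 2.0×10⁶ × 4.4×10⁵ × 6×10⁻⁶ ≈ 5×10⁶ CPU s, i.e. about 15 days of wall time spread over 4 processors.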
Comparisons:
• PC Xeon 2.8 GHz: c = 1×10⁻⁵
• SGI/CRAY Origin2000: c = 8×10⁻⁵
• Earth Simulator (NEC SX): c = 5×10⁻⁷
PERFORMANCE: SCALABILITY
• Nymphea: 95% for 1-4 processors
• SGI/CRAY Origin2000 (OpenMP, optimal partitioning): 95%, with saturation above 128 processors
• Earth Simulator: 95-60% for 1-512 processors
Figure: scalability curves for MPI (1 subdomain/process) and OpenMP (1 subdomain/process) as a function of partitioning (NSUB_E).
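The percentages presumably denote parallel efficiency,
$$E(P) = \frac{T(1)}{P\,T(P)},$$
with $T(P)$ the wall-clock time on $P$ processors; $E = 95\%$ on 4 processors therefore corresponds to a speedup of about 3.8.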
Senegal ideal case on Nymphea (P. Estrade)
• Domain: 150*500*40 with dt=480 s
• Partitioning 1*1: cost = 7.5×10⁻⁶
• Partitioning 1*64: cost = 6×10⁻⁶
(units: CPU s / grid point / time step)
25% gain due to optimal cache use (see the tiling sketch below)
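The cache gain comes from cutting the domain into NSUB_X * NSUB_E tiles small enough to stay cache-resident, then looping (or threading) over the tiles. A minimal sketch of the pattern, with made-up routine names rather than the actual ROMS code:

  ! Process the (1:LM, 1:MM) domain as NSUB_X*NSUB_E independent tiles;
  ! OpenMP can thread over tiles, and each tile's working set fits in cache.
  subroutine step_all_tiles(LM, MM, NSUB_X, NSUB_E)
    implicit none
    integer, intent(in) :: LM, MM, NSUB_X, NSUB_E
    integer :: tile, ti, tj, i0, i1, j0, j1
  !$OMP PARALLEL DO PRIVATE(tile, ti, tj, i0, i1, j0, j1)
    do tile = 0, NSUB_X*NSUB_E - 1
       ti = mod(tile, NSUB_X)           ! tile column
       tj = tile / NSUB_X               ! tile row
       i0 = ti*LM/NSUB_X + 1            ! tile bounds in i
       i1 = (ti+1)*LM/NSUB_X
       j0 = tj*MM/NSUB_E + 1            ! tile bounds in j
       j1 = (tj+1)*MM/NSUB_E
       call step_tile(i0, i1, j0, j1)
    end do
  !$OMP END PARALLEL DO
  contains
    subroutine step_tile(i0, i1, j0, j1)  ! placeholder per-tile kernel
      integer, intent(in) :: i0, i1, j0, j1
    end subroutine step_tile
  end subroutine step_all_tiles

Smaller tiles keep each tile's working set in cache, which is why a 1*64 partition ran 25% faster than 1*1 even on a single processor.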
New Caledonia region on PC (J. Lefèvre)
• Domain: 159*171*20 with dt=480 s
• Machine: "tiki", a dual-processor Pentium III
• Up to 100% gain due to optimal cache use
Figure: effect of partitioning (seconds per iteration versus NSUB_X), for partitions of the domain in latitude and longitude.
CONCLUSION
• ROMS is well optimized (code and methods) and well suited to Nymphea, which allows large runs to be performed in a reasonable time without excessive queuing
• The model is ready for faster, more numerous processors (provided AGRIF is fully tested with MPI)
• More storage would be welcome