Data, models and tools: Any complex hydraulic engineering problem
invariably involves data, models and tools.
What is the problem?
• Quality, quick availability and accessibility of data for
analysis purposes are currently not satisfactory
• Models and tools used and developed by engineers are
neither sufficiently documented nor version controlled
We can do much better!
Data: not under version control, multitude of file formats, metadata not available
within the data files. Models and tools: different tool versions on users' PCs, confusion
about which version of a tool was used to perform calculations. Result: inefficiency!
[Figure: OpenEarth infrastructure diagram. Labels: Supplier, Raw Data, SubVersion Server, OPeNDAP Server, Models, Tools, Detailed, Simplified, User.]
OpenEarth (BwN) provides the infrastructure to deal with this
problem. Basic elements: SubVersion server & OPeNDAP
server. Paradigm: Fixed structure – flexible access.
What is NetCDF?
• An array based data structure for storing
multidimensional data
• N-dimensional coordinate systems
– X coordinate (e.g. longitude)
– Y coordinate (e.g. latitude)
– Z coordinate (e.g. altitude)
– Time dimension
– … other dimensions
• Variables – support for multiple variables
– Temperature, humidity, pressure, salinity, etc
• Geometry – implicit or explicit
– Regular grid (implicit)
– Irregular grid
– Points
NetCDF: NASA's Earth Science Data Systems Standards Process Group recommends
NetCDF as a data storage standard. Pros: data exchangeability, platform independence,
robustness in use and ease of understanding.
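The difference between implicit and explicit geometry can be sketched in a few lines of Python. This is only one reading of the bullets above: on a regular grid, the position of every node follows from short coordinate vectors, while scattered points each carry their own coordinates. All numbers and names below are illustrative and are not taken from any OpenEarth dataset.

```python
# Regular grid: two 1-D coordinate vectors describe every node implicitly.
x = [0.0, 10.0, 20.0]                    # X coordinate (e.g. longitude), illustrative
y = [50.0, 55.0]                         # Y coordinate (e.g. latitude), illustrative
temperature = [[11.0, 12.0, 13.0],       # row j = 0 belongs to y[0]
               [14.0, 15.0, 16.0]]       # row j = 1 belongs to y[1]


def node_position(i, j):
    """Position of grid node (j, i), derived from the coordinate vectors."""
    return (x[i], y[j])


# Points: every observation stores its own explicit coordinates.
points = [
    {"x": 3.2, "y": 51.7, "temperature": 11.4},
    {"x": 17.9, "y": 54.1, "temperature": 15.2},
]

print(node_position(2, 1), temperature[1][2])  # (20.0, 55.0) 16.0
```

The grid needs 3 + 2 coordinate values for 6 nodes, whereas the point list needs 2 coordinate values per observation.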
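Tabular storage (32 numbers):
X   Y   Z   Q
1   1   1   0.5
1   1   2   0.3
1   2   1   0.6
1   2   2   0.1
2   1   1   0.4
2   1   2   0.2
2   2   1   0.9
2   2   2   0.3
Array storage (14 numbers):
X = [1 2]
Y = [1 2]
Z = [1 2]
Q = [0.5 0.4 0.6 0.9 0.3 0.2 0.1 0.3] (2 x 2 x 2 array, X varying fastest, then Y, then Z)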
Efficient data storage: Binary NetCDF format enables complete variable definition with
a minimal set of numbers (see example) and minimal metadata repetition.
Result: efficiency in disk space, easy database querying.
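The count of 32 versus 14 numbers can be checked with a short Python sketch. It uses the eight Q values of the example on a 2 x 2 x 2 grid. The flat ordering of Q (X varying fastest, then Y, then Z) is an assumption about the layout, and the helper name q_at is hypothetical.

```python
from itertools import product

# Coordinate vectors: stored once in the array form.
x, y, z = [1, 2], [1, 2], [1, 2]

# Q as a flat list, assumed to be ordered with X fastest, then Y, then Z.
q = [0.5, 0.4, 0.6, 0.9, 0.3, 0.2, 0.1, 0.3]


def q_at(ix, iy, iz):
    """Look up Q at zero-based grid indices (ix, iy, iz)."""
    return q[ix + len(x) * (iy + len(y) * iz)]


# Tabular form: one (X, Y, Z, Q) row per grid node, coordinates repeated.
table = [
    (x[ix], y[iy], z[iz], q_at(ix, iy, iz))
    for ix, iy, iz in product(range(len(x)), range(len(y)), range(len(z)))
]

tabular_count = 4 * len(table)                    # every row stores 4 numbers
array_count = len(x) + len(y) + len(z) + len(q)   # 3 vectors plus the values

print(tabular_count, array_count)  # 32 14
```

For an n x n x n grid the same reasoning gives 4n³ numbers in tabular form against 3n + n³ in array form, so the saving approaches a factor of four as the grid grows.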
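Example:
% Read the coordinate vectors and the elevation matrix from transect.nc
x = nc_varget('transect.nc', 'crossshore_distance'); % 198 cross-shore positions [m]
y = nc_varget('transect.nc', 'year');                % 3 survey years
z = nc_varget('transect.nc', 'height');              % 3 x 198 elevations [m]
surface(x, y, z);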
transect.nc
netcdf transect.nc {
dimensions:
   crossshore = 198 ;
   time = 3 ;
variables:
   float crossshore_distance(crossshore), shape = [198]
      crossshore_distance:unit = "meter"
   float year(time), shape = [3]
      year:unit = "year"
   float height(time,crossshore), shape = [3 198]
      height:unit = "meter"
data:
   crossshore_distance = (-65:5:920);
   year = (2006:2008);
   height = [ 7.62 7.49 8.26 7.91 7.72 6.03 5.41 … -7.62 -7.705 -7.79 -7.845 -7.9 -7.99 -8.08
              7.64 7.49 7.95 8.54 8.34 7.54 6.62 … -7.54 -7.635 -7.73 -7.8 -7.87 -7.945 -8.02
              7.56 7.43 7.95 8.84 8.42 7.7 6.77 … -7.46 -7.535 -7.61 -7.695 -7.78 -7.865 -7.95];
}
Example NetCDF file: 198 cross-shore points, 3 timestamps, 3 x 198 surface elevations.
Metadata in one file together with the data. NB: transect.nc is a binary file. Easy-to-use
Matlab routines are available: nc_varput, nc_addvar, nc_varget (see upper right).
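The structure of transect.nc can be mimicked in plain Python to show how dimensions, variables and units fit together. This is a sketch of the file layout only: it does not read or write NetCDF, and the dictionary layout and the helper shape_of are assumptions made for illustration.

```python
# Coordinate vectors as given in the file dump (Matlab colon notation).
crossshore_distance = list(range(-65, 921, 5))   # (-65:5:920)
year = list(range(2006, 2009))                   # (2006:2008)

# Self-describing layout: named dimensions, variables that refer to them,
# and units stored alongside the variables.
dataset = {
    "dimensions": {"crossshore": 198, "time": 3},
    "variables": {
        "crossshore_distance": {"dims": ("crossshore",), "unit": "meter"},
        "year": {"dims": ("time",), "unit": "year"},
        "height": {"dims": ("time", "crossshore"), "unit": "meter"},
    },
}


def shape_of(name):
    """Shape of a variable, resolved through its named dimensions."""
    dims = dataset["variables"][name]["dims"]
    return tuple(dataset["dimensions"][d] for d in dims)


print(len(crossshore_distance), len(year))  # 198 3
print(shape_of("height"))                   # (3, 198)
```

Because every variable refers to dimensions by name, a reader can verify that the coordinate vectors match the elevation matrix before plotting.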
SubVersion: open-source version control system. Users 'commit' their files to one
central database (and update their local copy regularly). Every commit receives a unique
revision number. Comments indicate per commit which changes were made.
Blame functionality: For each line of code, Subversion records who changed it, when, and
in which revision. Colors indicate the age of the code (bluer = older). Any
change can be rolled back at any time.
Merge tool: Changes made between any two versions of a tool are easily revealed
using the Merge tool. The Merge tool also helps to resolve coding conflicts in case
multiple users modified the same code.
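What a comparison between two versions reveals can be illustrated with Python's standard difflib module. This is not the Subversion merge tool itself, only a sketch of the same idea. The file name, revision labels and function contents are hypothetical.

```python
import difflib

# Two hypothetical versions of the same small routine.
old_version = [
    "def depth_average(values):",
    "    return sum(values) / len(values)",
]
new_version = [
    "def depth_average(values):",
    "    if not values:",
    "        raise ValueError('no values given')",
    "    return sum(values) / len(values)",
]

# A unified diff marks removed lines with '-' and added lines with '+'.
diff = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="tool.py (r100)",
        tofile="tool.py (r101)",
        lineterm="",
    )
)

print("\n".join(diff))
```

Lines prefixed with `+` exist only in the newer version, which is the information a reviewer needs when deciding how to resolve a conflict.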
Version control: any routine/datafile can automatically be given a comment block with
information on: last change date, author, revision number etc. Recording revision info of
tools and data used in a project enhances reproducibility of results.
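Subversion can expand keywords such as $Revision$, $Author$ and $Date$ inside a file when the svn:keywords property is set. A comment block filled in this way can be parsed to record which revision of a tool produced a result. The sketch below assumes such an expanded header. The header contents (author, date, revision number) are hypothetical.

```python
import re

# Matches expanded Subversion keywords of the form "$Name: value $".
KEYWORD = re.compile(r"\$(\w+):\s*(.*?)\s*\$")

# Hypothetical comment block at the top of a Matlab routine.
header = """\
% $Date: 2009-06-15 14:00:00 +0200 (Mon, 15 Jun 2009) $
% $Author: jdoe $
% $Revision: 1234 $
"""


def revision_info(text):
    """Collect expanded keywords into a dictionary, e.g. for a project log."""
    return {name: value for name, value in KEYWORD.findall(text)}


info = revision_info(header)
print(info["Revision"], info["Author"])  # 1234 jdoe
```

Storing this dictionary next to the computed results makes it possible to retrieve exactly the tool version that was used.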
Statistics: Per project/tool a separate repository can be made. Combining reusable
tools in one central repository provides large advantages (sharing, cooperation,
learning). OpenEarth Tools is open source and freeware.
[Figure: OpenEarth data workflow in four stages: Extract, Transform, Load, Provide.
Extract: raw data, stored in Subversion to keep track of history (OpenEarth RawData).
Transform: scripts that convert raw data into NetCDF and add meta information (OpenEarth Tools).
Load: database of stored NetCDF files, accessible through the web (OpenEarth OPeNDAP).
Provide: charts & maps, through tools and websites.]
Data workflow: OpenEarth prescribes the following steps to make data available:
1. put raw data in a SubVersion repository, 2. use scripts to transform data to NetCDF
including metadata, 3. upload *.nc files to the OPeNDAP server, and 4. provide easy access.
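Step 2 of the workflow, transforming raw data into a gridded structure with metadata, can be sketched as follows. The raw CSV layout and the function name transform are assumptions made for illustration. Actual OpenEarth scripts write NetCDF files, whereas this sketch stops at an in-memory dictionary. The sample values are the first two elevations of 2006 and 2007 from the transect example.

```python
import csv
import io

# Hypothetical raw data: one measurement per row.
RAW = """\
year,crossshore_distance,height
2006,-65,7.62
2006,-60,7.49
2007,-65,7.64
2007,-60,7.49
"""


def transform(raw_text):
    """Turn row-wise raw data into coordinate vectors plus a 2-D array."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    years = sorted({int(r["year"]) for r in rows})
    distances = sorted({float(r["crossshore_distance"]) for r in rows})

    # height[t][i] holds the elevation for years[t] at distances[i].
    height = [[None] * len(distances) for _ in years]
    for r in rows:
        t = years.index(int(r["year"]))
        i = distances.index(float(r["crossshore_distance"]))
        height[t][i] = float(r["height"])

    return {
        "dimensions": {"time": len(years), "crossshore": len(distances)},
        "variables": {
            "year": {"data": years, "unit": "year"},
            "crossshore_distance": {"data": distances, "unit": "meter"},
            "height": {"data": height, "unit": "meter"},
        },
    }


dataset = transform(RAW)
print(dataset["dimensions"])  # {'time': 2, 'crossshore': 2}
```

Keeping the raw file under version control and regenerating the structured file by script means the transformation can be repeated whenever the raw data or the script changes.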
Community of practice: OpenEarth has a wide community of users (Building with
Nature, EU FP7 MICORE, Delft Cluster etc.). A wide range of training sessions is
available (SubVersion use, programming standards, etc.).