Introducing the WaterML R Package for retrieving data from CUAHSI WaterOneFlow Web Services
Jiří Kadlec, Bryn StClair, Daniel P. Ames, Richard A. Gill
Brigham Young University
2015 AWRA Annual Water Resources Conference
16th November 2015

Presentation Outline
• Background
• What is R?
• What is WaterML?
• WaterML R Package Functions
• WaterML R Package Design
• Use Case

Importance of Hydrological Data
• Agriculture: water management
• Data collection and modeling
• Flood event classification and forecasting
• Map and Time Series
• Snow and Ski Track condition analysis
• Big Data
• Need for statistical analysis

R Statistical Software (www.r-project.org)
• Built-in data structures and functions for manipulating arrays, matrices, tables
• Analysis can be saved as a script for future re-use
Suitable environment to analyze big spatial and time series data

R Packages
• A package is a collection of reusable R functions, data and code. They are available for download and installation from the R website.
• > 6,000 user-contributed packages for graphics, data mining, spatial analysis, modeling…

How to get the data into R?
One option: CUAHSI Water Data Center
• http://data.cuahsi.org

CUAHSI WaterOneFlow Web Services
• 104 officially registered data sources
• All of the data is open-access (free to redistribute)
[Diagram: HIS Central Catalog (Find), Data Discovery Web Service; HydroServer 1 and HydroServer 2 (Publish), Data Retrieval Web Services; Client (Bind)]
[Figure: examples of available data]

WaterML Data Exchange Format
• Based on XML (Extensible Markup Language)
• Has both the data and metadata about a time series
• Site, Variable, Observation Method, Data Source, Quality Control Level, Time Zone
• WaterML 1.1
• WaterML 2.0
• International Standard of Open Geospatial Consortium
• Used by all data providers registered at CUAHSI

Previous approaches for connecting CUAHSI data and R
• RObsDat (direct connection to ODM database)
• HydroDesktop + HydroR (Windows only)
• DataRetrieval (USGS NWIS and EPA only)
Need: an easy-to-use method to connect R with any HydroServer, any WaterOneFlow service, or any WaterML file
WaterML + R

WaterML R Package
• Free R package for discovery, download and parsing of WaterML data
• Retrieves data from CUAHSI HIS Central catalog, WaterOneFlow web services, and custom WaterML files
• Supports WaterML 1.0, 1.1, and 2.0
• Published on CRAN (http://cran.r-project.org/web/packages/WaterML)

WaterML R Package Functions
(1) Data Search (HIS Central Catalog)
• GetServices
• HISCentral_GetSites
• HISCentral_GetSeriesCatalog
(2) Data Download (HydroServer Download API, WaterML)
• GetSites
• GetVariables
• GetSiteInfo
• GetValues
(3) Data Upload (HydroServer Upload API, JSON)
• AddSites
• AddVariables
• AddMethods
• AddSources
• AddValues

Software Development Challenges (1)
• Communicating with a SOAP web service from R
• SOAP = Simple Object Access Protocol
• To call a SOAP web service method, we must use an HTTP POST web request with 2 parts:
  – SOAP:Envelope
  – SOAPAction header

SOAP Envelope and SOAP Action
HTTP REQUEST
    POST http://hydrodata.info/chmi-h/cuahsi_1_1.asmx
POST DATA
    <?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                   xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <GetSiteInfoObject xmlns="http://www.cuahsi.org/his/1.1/ws/">
          <site>HYDRODATACZ-HR:187</site><authToken></authToken>
        </GetSiteInfoObject>
      </soap:Body>
    </soap:Envelope>
Web method name: GetSiteInfoObject
Parameter value: HYDRODATACZ-HR:187
HTTP REQUEST HEADERS
    Content-Type: text/xml
    SOAPAction: http://www.cuahsi.org/his/1.1/ws/GetSiteInfoObject

Software Development Challenges (2)
• Parsing a very big XML data file (100,000 or more lines)
• Initially I used xmlTreeParse: very slow
• Better solution: XPath
The xpathSApply function finds all elements with the same name and hierarchy level, and stores them in an array:
    dataValues = xpathSApply(doc, "//sr:value", xmlValue, namespaces = ns)
    dateTimesUTC = xpathSApply(doc, "//sr:value", xmlGetAttr, name = "dateTimeUTC", namespaces = ns)

Using the WaterML Package
Website: worldwater.byu.edu/app/index.php/rushvalley
Automated uploading of data from sensors to ODM and HydroServer
[Diagram components: data logger and sensors; DECAGON data server (api.echo2data.com), DECAGON website, DECAGON API; lookup table (site, variable, logger, sensor); R data conversion script; Upload API; HydroServer (worldwater.byu.edu/interactive/rushvalley) with ODM database, website and WaterML Services API; HIS Central Catalog; R statistical analysis tool]

Example
Test: There is no difference between NDVI at plots with and without mammals
Other uses: Exploratory analysis (error bar plot: daily mean with 1 standard error bars)

WaterML R Package Usage Statistics
• 1733 downloads (since May 2015)
• 250 downloads in last month
• Officially recognized by CUAHSI (2015 CUAHSI president’s award for community contribution)

Thank you for your attention
R Website: www.r-project.org
WaterML R Package Website: https://cran.r-project.org/web/packages/WaterML
Source Code on GitHub: http://github.com/jirikadlec2/waterml
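
Sketch: SOAP envelope and XPath parsing in R
The two development challenges described in this presentation (building the SOAP request and extracting values with XPath) can be illustrated with a short, network-free R sketch. This is a minimal illustration under stated assumptions, not the package's actual source code. The helper build_envelope and the inline sample_xml document are hypothetical and exist only for this example. The "sr" prefix is assumed to map to the WaterML 1.1 namespace, and the XML package from CRAN is assumed to be installed.

```r
library(XML)  # provides xmlParse, xpathSApply, xmlValue, xmlGetAttr

# Hypothetical helper (not part of the WaterML package): builds the
# SOAP envelope for the GetSiteInfoObject web method.
build_envelope <- function(site_code) {
  sprintf(paste0(
    '<?xml version="1.0" encoding="utf-8"?>',
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">',
    '<soap:Body>',
    '<GetSiteInfoObject xmlns="http://www.cuahsi.org/his/1.1/ws/">',
    '<site>%s</site><authToken></authToken>',
    '</GetSiteInfoObject>',
    '</soap:Body>',
    '</soap:Envelope>'
  ), site_code)
}

# The two parts of the HTTP POST request: the envelope (request body)
# and the SOAPAction header value.
envelope <- build_envelope("HYDRODATACZ-HR:187")
soap_action <- "http://www.cuahsi.org/his/1.1/ws/GetSiteInfoObject"

# Made-up, heavily shortened WaterML 1.1 response used instead of a
# real server reply so that the sketch runs offline.
sample_xml <- paste0(
  '<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">',
  '<timeSeries><values>',
  '<value dateTimeUTC="2015-11-01T00:00:00">1.5</value>',
  '<value dateTimeUTC="2015-11-01T01:00:00">1.7</value>',
  '</values></timeSeries>',
  '</timeSeriesResponse>'
)

doc <- xmlParse(sample_xml, asText = TRUE)

# Assumption: "sr" is bound to the WaterML 1.1 namespace.
ns <- c(sr = "http://www.cuahsi.org/waterML/1.1/")

# One XPath query per column: element text, then the time attribute.
data_values <- as.numeric(
  xpathSApply(doc, "//sr:value", xmlValue, namespaces = ns))
date_times_utc <- xpathSApply(doc, "//sr:value", xmlGetAttr,
                              name = "dateTimeUTC", namespaces = ns)

# Combine both vectors into a table of observations.
values <- data.frame(
  time = as.POSIXct(date_times_utc, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC"),
  value = data_values
)
print(values)
```

Sending the request would additionally need an HTTP client that posts the envelope with the Content-Type and SOAPAction headers. That step is left out here so the sketch does not depend on a live server.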