Matt Fry from UKCEH gave a presentation on making catchment data more available to end users and researchers across the UK, and on making research data more accessible generally. The aim is to deliver data in ways that make it more directly useful; Matt focused on hydrology data, and on making water quality and other freshwater data available alongside each other.
Application Programming Interfaces (APIs) are a way of automating the joining-up of data between different platforms and website sources. Matt gave an overview of what APIs can do and how they help by making data available rapidly and by better integrating diverse datasets. Structured appropriately, they can improve both access and collaboration.
Matt has worked with the National River Flow Archive (NRFA), which is increasingly accessible and useful for research and wider communities. The NRFA holds data from 1,500 flow gauging stations around the UK, with records from the 1880s onwards. It is a really useful resource for a range of data, including sites with particular soils or land cover, as well as metadata about whole catchments, and it offers pages with details of each gauging station and how flows are measured. The data is available to download as a CSV file, but an API is also available to query particular lists of stations, particular time series, and particular catchment characteristics (land cover, geology, elevation), and to request data in different formats. This is forming the basis for how UKCEH deliver data on their website, and the NRFA saw more than 500,000 API queries in 2019.
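As a rough sketch of what querying such a service looks like, the snippet below builds REST-style query URLs for stations and time series. The base URL, endpoint names and query parameters here are illustrative assumptions, not a verified description of the NRFA web service — check the NRFA documentation for the real interface.

```python
from urllib.parse import urlencode

# Assumption: a REST-style service that takes query-string parameters.
# The base URL and parameter names below are illustrative only.
NRFA_BASE = "https://nrfaapps.ceh.ac.uk/nrfa/ws"

def nrfa_query_url(endpoint, **params):
    """Build a query URL for a REST-style web service endpoint."""
    return f"{NRFA_BASE}/{endpoint}?{urlencode(params)}"

# Hypothetical examples: metadata for one station, and its daily flow
# series requested as JSON.
print(nrfa_query_url("station-info", station=39001))
print(nrfa_query_url("time-series", station=39001,
                     **{"data-type": "gdf", "format": "json-object"}))
```

In a script you would pass each URL to an HTTP client (e.g. `urllib.request.urlopen` or `requests.get`) and parse the JSON response, which is exactly the kind of automated access the API enables over manual CSV downloads.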
Matt also mentioned the UK Water Resources Portal. This delivers near-real-time data on a range of aspects of water resources from stations across the UK. It is integrated with daily flow data from the Environment Agency (EA) and provides historical context, for example whether a particular flow level is typical for that site. Other data can also be plotted against flow data, including COSMOS soil moisture and groundwater levels. This data can provide context at a national scale, and offers the opportunity to interpret flow data as a national drought indicator.
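The idea of placing a current flow in its historical context can be sketched as a simple percentile calculation. This is a minimal illustration of the concept, not the portal's actual method; the flow values are made up.

```python
def flow_percentile(current_flow, historical_flows):
    """Percentage of historical daily flows at or below the current flow.
    A low percentile suggests unusually dry conditions at this site."""
    if not historical_flows:
        raise ValueError("need at least one historical observation")
    below = sum(1 for q in historical_flows if q <= current_flow)
    return 100.0 * below / len(historical_flows)

# Hypothetical daily mean flows (m3/s) for one gauging station.
history = [5.2, 8.1, 12.4, 3.9, 6.7, 15.0, 4.4, 9.8]
print(flow_percentile(4.0, history))  # 12.5 -> drier than usual
```

Computing this per station and mapping the percentiles nationally is one way flow data can serve as a drought indicator.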
Matt also talked about UK catchment rainfall data from Met Office sources. They have created gridded daily 1 km datasets using all rain gauges across the UK. Many users want data for individual locations or bespoke catchments; while the gridded data enables this, it is a huge dataset requiring complex tools, so they are developing APIs to make the data more readily accessible.
Next, Matt mentioned the Drought Data Hub, an interactive tool for rainfall data extraction. One feature lets you upload a shapefile of your catchment to calculate the average rainfall across it! You can then view the rainfall series, with data up to 2015. There are also tools for long-term modelled flows, and future climate scenarios for flows and soil moisture, with both gridded and point datasets available.
Next, Matt talked about the UKSCAPE hydrological data integration portal. UKSCAPE is a UKCEH science programme with a significant element focused on improving how research data and informatics are delivered, and APIs for research datasets have been identified as a key element. They have built a demonstration portal for hydrological data and are looking at how to integrate real-time data from a range of public APIs. This will improve understanding of how best to integrate datasets for consistent access and querying, allow more comprehensive 'data discovery' queries to find relevant sampling sites, and help establish user requirements for environmental data APIs.
Researchers have different priorities. Some are interested in site-by-site interactions or local data, some want to find sites with enough data for particular objectives, and some may want to visualise summary information across catchments. APIs allow you to query the data for specifically what you are looking for.
The UKSCAPE integration tool allows you to locate the points along your river of interest that have the most data layers, which might be indicative of conditions at that site. You can filter by the parameters you are looking for, for example nitrate or phosphorus, or provide a longer list of determinands, and show the sites that include those parameters. You can also specify the time period you are interested in. This is a good demonstration of how data can be made more useful.
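A 'data discovery' query of this kind can be sketched as filtering site records by the determinands they monitor and the period their records cover. The site records and field names below are hypothetical, purely to illustrate the shape of the query, and are not the portal's actual schema.

```python
from datetime import date

# Hypothetical site records; field names are illustrative only.
sites = [
    {"name": "Site A", "parameters": {"nitrate", "phosphorus"},
     "start": date(1995, 1, 1), "end": date(2020, 12, 31)},
    {"name": "Site B", "parameters": {"nitrate"},
     "start": date(2010, 1, 1), "end": date(2015, 6, 30)},
    {"name": "Site C", "parameters": {"phosphorus", "ammonia"},
     "start": date(1980, 1, 1), "end": date(2021, 3, 1)},
]

def find_sites(sites, wanted, period_start, period_end):
    """Return sites that monitor every wanted determinand and whose
    record fully covers the requested time period."""
    return [s["name"] for s in sites
            if wanted <= s["parameters"]
            and s["start"] <= period_start and s["end"] >= period_end]

print(find_sites(sites, {"nitrate", "phosphorus"},
                 date(2000, 1, 1), date(2015, 1, 1)))  # ['Site A']
```

Exposing this kind of filter through an API is what lets each researcher pull out exactly the sites relevant to their objectives.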
Matt finished by outlining some upcoming work, including DataLabs: collaborative cloud computing spaces for running analyses against large datasets. They are also looking into providing Python/R notebooks for collaboration on research projects, and at ways to speed up large analyses.
Finally, he mentioned how we should exploit the huge river datasets that are available, for example by linking monitoring data with contextual data such as soils, land cover and topography, as well as with drivers of river health such as Sewage Treatment Works, Combined Sewer Overflows, and field-scale crop cover.
Thanks to Matt and everyone involved in organising this webinar. It was great to hear all about these useful tools that are available to help us efficiently use river data.