Claire Trenham¹, Dr Blake Seers², Dr Paul Branson³,⁴, Dr Ron Hoeke²
¹CSIRO O&A, Black Mountain, Australia; ²CSIRO O&A, Aspendale, Australia; ³CSIRO O&A, Crawley, Australia; ⁴UWA Oceans Graduate School, Crawley, Australia
Abstract:
Increases in both computational capacity and the resolution of numerical models have led to challenging workflows. Recent developments in workflow tools allow users to construct modular scripts of limited scope and complexity to manage otherwise complex data and job execution workflows. It has become clear to coastal scientists at CSIRO that an appropriate workflow system is required to improve efficiency and reproducibility. Such a system needs to support both near-real-time and climate-timescale simulations, yet remain flexible enough for research and development. More specifically, various ocean, coastal and nearshore models need to be configured; input data pre-processed; model execution managed; and results post-processed.
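As an illustration only, the four stages listed above can be sketched as a small dependency graph executed in order. This is not Pegasus code and does not use the Pegasus API: it relies solely on the Python standard library (`graphlib`, Python 3.9+), and the stage names, the linear ordering between them, and the `run_stage` placeholder are all assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names. Each stage maps to the set of stages it depends on.
# A strictly linear chain is assumed here; a real workflow may branch.
dependencies = {
    "configure_model": set(),
    "preprocess_inputs": {"configure_model"},
    "run_model": {"preprocess_inputs"},
    "postprocess_results": {"run_model"},
}


def run_stage(name, log):
    """Placeholder for a modular script; here it only records that it ran."""
    log.append(name)


def run_workflow(deps):
    """Run every stage once, after all of its dependencies have run."""
    log = []
    for stage in TopologicalSorter(deps).static_order():
        run_stage(stage, log)
    return log


order = run_workflow(dependencies)
print(order)
```

A workflow management system adds to this basic idea the parts that matter in practice, such as job submission, data staging, retries and provenance records.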
We are prototyping the Pegasus Workflow Management System (https://pegasus.isi.edu/). We chose Pegasus over more commonly used systems because of its demonstrated use in HPC and data-intensive sciences. Combined with containers, Pegasus enables platform-agnostic model execution. The architecture comprises a virtual machine for task orchestration and running local jobs, and HPC for larger-scale modelling tasks. This creates a suite which can be deployed on other systems for greater portability and reusability.
Use of Pegasus enables “best practice” research by providing reproducible and easily modified model runs; reusable structures and modular scripts; inbuilt provenance tracking; and scalability and performance. Because Pegasus is open source, we can use it in an Open Science context. Pegasus presents the possibility of making our science both more robust and more efficient, in addition to simplifying model setup and execution.
We present our initial findings from implementing a standard workflow in Pegasus, and our views on its future use.
Biography:
Claire joined CSIRO Marine & Atmospheric Research in 2011 working on ocean wave climate modelling. After a period working as Senior Research Data Services Specialist for the National Computational Infrastructure (NCI) in Canberra between 2014 and 2017, she returned to CSIRO’s Oceans & Atmosphere division in 2017. She currently works in the Sea level rise, waves and coastal extremes team as part of the Climate extremes and projections group. Claire is heavily involved in climate data and preparation for CMIP6, alongside regional climate modelling, data processing, and making improvements to data and software to enhance science capabilities. In a past life she was a radio astronomer, and she is also a qualified high school maths/science teacher.