Reproducible Analyses of Coastal Extremes: A Pythonic Approach to Reproducible Environments, Packaged Code, and Dynamic Documents

Dr Blake Seers¹, Ms Claire Trenham², Dr Ron Hoeke¹, Mr Paul Branson³,⁴

¹CSIRO Oceans and Atmosphere, Aspendale, Melbourne, Australia; ²CSIRO Oceans and Atmosphere, Canberra, Australia; ³CSIRO Oceans and Atmosphere, Crawley, Australia; ⁴UWA Oceans Graduate School, Crawley, Australia

Abstract:

Reproducibility is fundamental to scientific research. Once the research question has been articulated, a typical analysis starts with obtaining multiple data sets and ends, after many interim steps, with a detailed scientific analysis. CSIRO Oceans and Atmosphere’s Sea Level, Waves and Coastal Extremes (SLWCE) team are dedicated to working towards a streamlined, transparent, and reproducible workflow from the very start of a project to the final deliverable. In this presentation we show how we are working towards such a workflow using Python virtual environments, Jupyter Notebooks, and some bash scripts.

We first create and structure the project directory, initialize user-defined environment variables, and set up the Python environment that contains our actively developed Python package, cmextremes. The team’s cmextremes package is kept in a Git repository and is continually developed and tested, leading to a sustainable codebase. The Python environment includes dask and other dependencies for parallel processing, so we can focus on optimizing our code for large spatio-temporal datasets.
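
As a rough illustration of why dask suits large spatio-temporal data, the sketch below reduces a chunked (time, lat, lon) array in parallel. It is not taken from cmextremes: the synthetic field, the array shape, and the chunk sizes are all assumptions chosen only to keep the example small.

```python
import numpy as np
import dask.array as da

# Synthetic stand-in for a (time, lat, lon) field; the values are random
# and the shape is an assumption made for illustration only.
rng = np.random.default_rng(seed=0)
field = rng.normal(size=(365, 40, 60))

# Chunk along latitude and longitude so each chunk keeps its full time series.
lazy_field = da.from_array(field, chunks=(365, 20, 20))

# This only builds a task graph; nothing has been computed yet.
lazy_max = lazy_field.max(axis=0)

# compute() runs the chunk-wise reductions and returns a NumPy array.
per_cell_max = lazy_max.compute()
print(per_cell_max.shape)
```

With real data the array would typically be read lazily from disk rather than created in memory, so only the chunks being processed need to be loaded at any one time.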

We automate the setup of the project architecture with a bash script. This lets us start the analysis quickly and easily in a Jupyter Notebook instance that uses the project’s Python environment and has access to the user-specific environment variables. The analysis can then be scripted, documented, visualized, and interpreted inside a dynamic document using Jupyter Notebooks. This entire process is transparently scripted and therefore reproducible.
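
The team's setup script is written in bash and is not reproduced here. As a minimal sketch of the same idea in Python, the function below creates a project skeleton, records a set of environment variables, and can optionally create a virtual environment. The directory names, the variable names (PROJECT_ROOT, PROJECT_DATA), and the project_env.sh file name are hypothetical.

```python
import os
import tempfile
import venv
from pathlib import Path


def scaffold_project(root, subdirs=("data", "notebooks", "scripts", "output"),
                     create_env=False):
    """Create a project skeleton and return the environment variables it defines."""
    root = Path(root).resolve()
    for name in subdirs:
        (root / name).mkdir(parents=True, exist_ok=True)

    # Hypothetical variable names; a real project would choose its own.
    env_vars = {
        "PROJECT_ROOT": str(root),
        "PROJECT_DATA": str(root / "data"),
    }

    # Write the variables to a file that a shell session can source later.
    lines = [f'export {key}="{value}"' for key, value in env_vars.items()]
    (root / "project_env.sh").write_text("\n".join(lines) + "\n")

    # Optionally create an isolated Python environment inside the project.
    if create_env:
        venv.create(root / ".venv", with_pip=True)

    return env_vars


# Usage: scaffold a throwaway project and expose its variables to this process.
with tempfile.TemporaryDirectory() as tmp:
    env_vars = scaffold_project(Path(tmp) / "demo_project")
    os.environ.update(env_vars)
    created = sorted(p.name for p in Path(env_vars["PROJECT_ROOT"]).iterdir())
    print(created)
```

Keeping the directory layout and variables in one scripted step means a new project starts from the same structure every time, which is the property the workflow relies on.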


Biography:

Blake joined CSIRO Oceans and Atmosphere in 2019, working within the Sea Level, Waves and Coastal Extremes team. Before joining CSIRO, Blake completed a PhD in Marine Science and Statistics, and worked as a statistical consultant within the Department of Statistics at the University of Auckland. Blake works across various projects within the team, where he contributes to the team’s growing codebase and scientific computing requirements.
