Pangeo, Processing and Provenance: Key Python technologies that you should try.

Mr Nick Mortimer1

1CSIRO Oceans and Atmosphere, Crawley, Australia

Abstract:

Since making the move to Python and discovering the Pangeo community, Nick has been on a journey of collaboration, working with the National Center for Atmospheric Research in Boulder, Colorado, and the Met Office’s Informatics Lab in Exeter, UK.

In this workshop, Nick hopes to present ways of working in Python using open-source community tools that encourage collaboration.

Tired of working with old closed-source tools? Ready to embrace the Python ecosystem? This workshop introduces some key Python technologies that will help you deliver analysis-ready datasets, combined with scalable processing that can tackle datasets of just about any size.

This workshop is designed to give you what you need to implement workflows in Jupyter notebooks with a focus on scalability and provenance.

First, we will start with an introduction to the Pangeo environment (https://pangeo.io/):

JupyterLab: Web-based Python processing in the cloud

Dask: Write scalable analysis code in Python

Xarray: N-Dimensional labelled arrays and datasets in Python, with a focus on Zarr and cloud storage

Intake: A lightweight set of tools for loading and sharing data in data science projects

Papermill: Parameterize and run your Jupyter notebooks

Followed by examples covering some use cases from this list:

  1. Can you please process this 50 GB CSV file for me?
  2. I have 18,000 Argo float netCDF files that I want to aggregate
  3. I have some legacy FORTRAN and I’d like to be able to use it with scalable Python tools
  4. I’d like to create and share an analysis-ready dataset using Intake
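
Use case 1 rests on a simple idea: read the file in pieces, reduce each piece, and combine the partial results, so that memory use stays bounded however large the file is. The sketch below illustrates that pattern using only the Python standard library. It is not Dask code and is not necessarily what the workshop presents; the function names, column names and sample data are all hypothetical stand-ins.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size items from an iterable."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def chunked_mean(csv_file, column, chunk_size=100_000):
    """Mean of one numeric column, holding only one chunk in memory at a time."""
    reader = csv.DictReader(csv_file)
    total = 0.0
    count = 0
    for chunk in iter_chunks(reader, chunk_size):
        # Partial result for this chunk; empty cells are skipped.
        values = [float(row[column]) for row in chunk if row[column] != ""]
        # Combine the partial result into the running totals.
        total += sum(values)
        count += len(values)
    return total / count if count else float("nan")


# Hypothetical sample data standing in for a large file; a real file
# would be opened with open(path, newline="") instead.
sample = io.StringIO("float_id,temperature\nA,10.0\nA,12.0\nB,\nB,14.0\n")
mean_temperature = chunked_mean(sample, "temperature", chunk_size=2)
print(mean_temperature)
```

Scalable tools generally apply the same split, reduce and combine idea, but can spread the chunks across many cores or machines rather than looping over them one at a time.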

Biography:

Nick Mortimer stopped using MATLAB and moved to Python nearly five years ago after a conscious decision to change his career path from endlessly cleaning CSV files to capturing and preparing analysis-ready datasets in near real-time.

ABOUT AeRO

AeRO is the industry association focused on eResearch in Australasia. We play a critical coordination role for our members, who are actively transforming research via Information Technology. Organisations join AeRO to advance their own capabilities and services, to collaborate and to network with peers. AeRO believes researchers and the sector significantly benefit from greater communication, coordination and sharing among the increasingly different and evolving service providers.