Mr Nick Mortimer1
1CSIRO Oceans and Atmosphere, Crawley, Australia
Since making the move to Python and discovering the Pangeo community, Nick has been on a journey of collaboration, working with the National Center for Atmospheric Research in Boulder, Colorado and the Met Office's Informatics Lab in Exeter, UK.
In this workshop, Nick hopes to present ways of working in Python using open-source community tools that encourage collaboration.
Tired of working with old closed-source tools? Ready to embrace the Python ecosystem? This workshop introduces key Python technologies that will help you deliver analysis-ready datasets, combined with scalable processing ready to tackle just about any size of dataset.
It is designed to give you what you need to implement workflows in Jupyter notebooks, with a focus on scalability and provenance.
First, we will start with an introduction to the Pangeo environment (https://pangeo.io/):
JupyterLab: web-based Python processing, on your laptop or in the cloud
Dask: write scalable analysis code in Python
Xarray: N-Dimensional labelled arrays and datasets in Python, with a focus on Zarr and cloud storage
Intake: a lightweight set of tools for loading and sharing data in data science projects
Papermill: Parameterize and run your Jupyter notebooks
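To give a flavour of how Xarray and Dask fit together, here is a minimal sketch using synthetic data in place of a real ocean dataset (the variable and dimension names are illustrative, not from any CSIRO product):

```python
import numpy as np
import xarray as xr

# A labelled 3-D array; chunking hands the work to Dask, so operations
# build a lazy task graph instead of running immediately.
temp = xr.DataArray(
    np.random.rand(10, 4, 5),
    dims=("time", "lat", "lon"),
    name="sea_surface_temp",
).chunk({"time": 5})

# Reductions over named dimensions stay lazy until .compute() is called.
time_mean = temp.mean(dim="time")
result = time_mean.compute()
print(result.shape)  # (4, 5)
```

The same pattern scales from this toy array to terabyte-scale Zarr stores: the code does not change, only the chunk sizes and the Dask cluster behind it.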
This will be followed by examples covering use cases from this list:
- Can you please process this 50 GB CSV file for me?
- I have 18,000 Argo float netCDF files that I want to aggregate into a single dataset
- I have some legacy FORTRAN and I’d like to be able to use it with scalable Python tools
- I’d like to create and share an analysis-ready dataset using Intake
Nick Mortimer stopped using Matlab and moved to Python nearly five years ago, after a conscious decision to change his career path from endlessly cleaning CSV files to capturing and preparing analysis-ready datasets in near real-time.