Implementing Data-Driven Earth and Space Science

Jakob Lüttgau1

1German Climate Computing Centre

 

This workshop will explore the commonalities in data-driven research between different earth and space science domains and ways forward in their technical implementation.

The first session will look at the science drivers underlying the requirements for data-driven research in the earth and space sciences. These questions will inform the discussion about common data challenges, the shift to in silico experiments, and the development of a common machine learning toolbox.

The second session will discuss how to transform conceptual scientific models into ‘analysis-ready’ solutions. These solutions will require not only HPC and cloud infrastructures, but also the provision of compute and data services, and the embedding of machine learning in ‘software as a service’ solutions.

Time | Moderator | Title
90 minutes | TBC | Commonalities between earth and space science domains in data-driven science

  • Challenges
  • Experiments in silico
  • ML methods library and knowledge base
  • ML Intercomparison Project
90 minutes | Jakob Lüttgau | Transformation of conceptual scientific models into ‘analysis-ready’ solutions (utilities)

  • HPC, HPD, Cloud Services
  • Service Provision / Environments
  • Software as a Service

 

Hands on with Tableau (self-service analytics for you)

Timothy Louie1

1Tableau Software, Canberra, ACT

 

Tableau is focused on one thing – helping people see and understand data. Organizations everywhere, from non-profits to global enterprises, and across all industries and departments, are empowering their people with data. With Tableau they are finding opportunities in their business that they have never seen before. Learn more about how organizations like yours are using our platform to drive their business forward.

Tableau is all about making data analytics fast, easy, beautiful and most importantly, useful.

Join us for our Tableau Hands On session and we’ll show you how to connect to your data and visualise your queries without writing a single line of code. You’ll learn how to create a dashboard from scratch and see how you can quickly analyse, visualise and share information and publish your results. Whether you measure your data in petabytes or in billions of rows, Tableau is built to work as fast as you do.

It’s self-service analytics, for everyone.

Who should attend?  Anyone interested in transforming data into meaningful visualisations! Although the content is geared towards beginners, people of all Tableau skill levels are welcome.

What is it?  We believe there’s a better way to analyse your data. Come to this hands-on session to learn how to tell a story with your data… and have some fun along the way! After attending a Tableau Hands On session, you can go off and explore your own data, or upskill further with both online and in-person training.

Please note this is a hands-on class that requires you to bring your laptop with Tableau Desktop installed. If you don’t have the software already, please download a copy here.

Agenda

  • What is Tableau?
  • Working with Data
  • Visual Analytics
  • Sharing and Collaboration
  • Q&A

Biography:

Having worked in Business Intelligence and Data Management for over 25 years, Tim has seen and used his fair share of BI analytics and Data Warehousing tools in that time. But when he first downloaded Tableau Desktop, it was love at first sight. He knew he had to join Tableau. He now spends his days helping customers ‘to see and understand their data’ as a member of the Tableau ANZ team. You can reach him at tlouie@tableau.com.

 

Pangeo: Scalable Geoscience Tools in Python — Xarray, Dask, and Jupyter

Dr James Munroe1

1Memorial University of Newfoundland

 

Earth scientists face serious challenges when working with large datasets. Pangeo is a rapidly growing community and software ecosystem for scalable geoscience based on open source scientific Python. Pangeo’s three core packages are:

  1. Jupyter, a web-based tool for interactive computing
  2. xarray, a data-model and toolkit for working with N-dimensional labeled arrays
  3. Dask, a flexible parallel computing library

When combined with distributed computing, these tools can help geoscientists perform interactive analysis on datasets up to petabytes in size. In this interactive tutorial, we will demonstrate how to employ this platform using real science examples from physical oceanography and hydrology. Participants will follow along using Jupyter notebooks to interact with xarray and Dask running on a public cloud.

Agenda (Tentative):

  • Introduction to Pangeo Project and Software Ecosystem
  • Hands-on interactive tutorial of xarray
  • Break / Discussion
  • Hands-on interactive tutorial of dask
  • Lunch
  • Getting started with cloud-native data analysis
  • Break / Discussion
  • Deploying your own Pangeo platform on cloud or HPC computing resources

Learning Objectives: Participants will learn how to:

  • Recognize the software packages that comprise the Pangeo platform and explain how they work together
  • Load datasets using xarray from netCDF files, OPeNDAP endpoints, and Zarr stores
  • Analyze data using xarray’s label-based operations and groupby feature
  • Work with very large xarray datasets using Dask
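
As a rough illustration of the label-based operations and groupby feature listed above, the following sketch builds a small synthetic monthly series and computes a monthly climatology. The variable name `sst` and all values are made up for illustration and are not taken from the tutorial material.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic monthly values (illustrative data only).
time = pd.date_range("2000-01-01", periods=24, freq="MS")
sst = xr.DataArray(np.arange(24.0), coords={"time": time}, dims="time", name="sst")

# Label-based selection: pick a date range by coordinate value, not by index.
first_year = sst.sel(time=slice("2000-01-01", "2000-12-31"))

# groupby: average all Januaries together, all Februaries together, and so on.
climatology = sst.groupby("time.month").mean()

print(first_year.size)
print(float(climatology.sel(month=1)))
```

With Dask installed, calling `.chunk()` on the array would make the same operations lazy, so they could be applied to data that does not fit in memory.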

Visualisation Matters 2019

A/Prof. Tomasz Bednarz1,2

1Director of Visualisation, Expanded Perception & Interaction Centre (EPICentre) UNSW Art & Design
2Team Leader at CSIRO Data61 

 

Visualisation Matters is a curated event that was founded in 2016 by Tomasz Bednarz and June Kim to promote Computer Graphics and Interactive Techniques Down Under.

Since then, it has been used not only to promote SIGGRAPH Asia (http://sa2019.siggraph.org) but also to inspire the local community on how creativity can be enhanced by collaboration, collision of cultures, art and science, etc.

This year, we are co-locating Viz Matters with C3DIS. This collaboration has been enabled by Tomasz Bednarz and Sam Moskwa, bringing you not only inspirational speakers but also the SIGGRAPH Asia 2018 Travelling Computer Animation Festival (CAF). CAF is a unique event that runs during SIGGRAPH, and the festival allows winning animations to be automatically nominated for the Oscars. The animation package can normally only be experienced at SIGGRAPH conferences, but this year Canberra is the very first city in Australia to experience the travelling version of the show.

This event is free to attend. To register visit: https://cdesign.eventsair.com/2019-c3dis/vis-matters. Registration automatically includes a ticket to attend SIGGRAPH Asia CAF!

Be sure to check the Visualisation Matters website and this page regularly for the latest information.

Invite your family and friends, and share the joy with us.

Tomasz and Sam

 

Note this event is not catered. Attendees will need to make their own arrangements for morning tea, lunch and afternoon tea. The following link provides helpful information on nearby cafes and food outlets: https://www.nccc.com.au/restaurants-and-bars-canberra

 


 

Biography:

Tomasz Bednarz is Associate Professor and Director of Visualisation at the Expanded Perception & Interaction Centre (EPICentre) UNSW Art & Design and Team Leader at CSIRO Data61 (a leading Visual Analytics Team, in Software & Computational Systems program).

His current role at UNSW Art & Design reflects his commitment to a holistic approach to the wicked problems facing the collation, analytics and display of big data. His approach is expansive and encompasses the use of novel technologies (AR, VR, CAVE, Dome, AVIE), often in combination. Over the last couple of years, he has been involved in a wide range of projects in the areas of immersive visualisation, human-computer interaction, computational imaging, visualisation and simulation, computer graphics, and games.

He also holds Adjunct positions at:

  • Queensland University of Technology (Science and Engineering Faculty, School of Mathematical Sciences),
  • University of Sydney (Faculty of Architecture, Design and Planning – Design Lab),
  • University of South Australia (School of IT and Math Sciences).

He is Chair of ACM SIGGRAPH Asia 2019. SIGGRAPH conferences are the world’s largest, most influential annual meetings and exhibitions in computer graphics and interactive techniques. The conference will be held for the very first time in Australia (Brisbane).

Introduction to Python Programming

Daniel Collins1

1CSIRO, Canberra, Australia

 

This half-day workshop is an introduction to the fundamentals of programming in Python, with a view towards scientific applications of Python. If you have never programmed in Python before, then this workshop will give you the tools you need to get started and will provide some future direction as well. Topics include:

  • Working with Jupyter Notebooks
  • The syntax of Python
  • How to use the Python documentation: online and inside Python itself
  • Fundamental types and operations: working with integers, floating point values, strings and Boolean values.
  • Programming logic 1: making decisions with if statements
  • Programming logic 2: repetition with for loops
  • Working with collections of data: lists and dictionaries.
  • Organising your code: functions in Python
  • Dealing with errors
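
To give a flavour of how several of these topics fit together, here is a minimal sketch that combines a function, a for loop, an if statement, a dictionary of lists, and basic error handling. The station names and readings are hypothetical and are not part of the workshop material.

```python
def mean_temperature(readings):
    """Return the mean of the numeric readings, skipping invalid entries."""
    valid = []
    for value in readings:              # repetition with a for loop
        try:
            valid.append(float(value))  # may raise ValueError or TypeError
        except (ValueError, TypeError):
            continue                    # dealing with errors: skip bad entries
    if not valid:                       # making a decision with an if statement
        return None
    return sum(valid) / len(valid)


# A dictionary mapping (hypothetical) station names to lists of readings.
stations = {
    "alpha": [20.5, "21.5", "n/a"],
    "beta": [None, "missing"],
}

for name, readings in stations.items():
    print(name, mean_temperature(readings))
```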

All training material will be available for participants to keep after the workshop.


Biography:

To be advised

Topics in Scientific Computing with Python

Daniel Collins1

1CSIRO, Canberra, Australia

 

This half-day workshop will explore several topics in the use of Python for scientific computing. Participants will work through hands-on modules covering a range of topics and scientific domains. While some modules will assume little to no Python experience, others require previous experience such as the Python for Scientists 3-day course or equivalent. Participants can choose topics of interest to them, and the available material should appeal to research scientists as well as software engineers.

Module topics include:

  • Unit testing and test-driven development for scientists
  • A programmer’s guide to cleaning messy sensor data with Pandas
  • Symbolic maths
  • Object-oriented Python: writing your own classes
  • An exploration of options for increasing the performance of your code
  • Using PyTables with HDF5 data
  • Exploring Python decorators
  • Reproducibility with Python
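
Two of the topics above, decorators and unit testing, can be sketched together in a few lines using only the standard library. The function and class names below are illustrative assumptions and do not come from the workshop modules.

```python
import functools
import unittest


def count_calls(func):
    """Decorator that records how many times the wrapped function is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    if celsius < -273.15:
        raise ValueError("temperature below absolute zero")
    return celsius + 273.15


class TestCelsiusToKelvin(unittest.TestCase):
    def test_freezing_point(self):
        self.assertAlmostEqual(celsius_to_kelvin(0.0), 273.15)

    def test_below_absolute_zero(self):
        with self.assertRaises(ValueError):
            celsius_to_kelvin(-300.0)


# Run the tests programmatically so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCelsiusToKelvin)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("calls recorded by the decorator:", celsius_to_kelvin.calls)
```

In a test-driven workflow the two test methods would be written first, and `celsius_to_kelvin` would then be implemented until both pass.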

All training material will be available for participants to keep after the workshop.


Biography:

To be advised

Making Earth and environmental science data accessible via machine-to-machine services: where are they at and where are they going.

Dr. Adrian Burton2, Mr James Gallagher1, Mr. Joseph Abhayaratna6, Dr. Ben Evans3, Dr. Lesley Wyborn3, Dr. Justin Freeman5, Mr. Aaron Sedgman7, Dr. Gareth Williams4, Dr. Kelsey Druken3, Ms Melanie Barlow2, Dr. Mingfang Wu2

1OPeNDAP, U.S.A
2Australian Research Data Commons, Australia
3National Computational Infrastructure, Canberra, Australia
4CSIRO, Australia
5Bureau of Meteorology, Melbourne, Australia
6PSMA, Canberra, Australia
7Geoscience Australia, Canberra, Australia

 

Machine-to-machine data services have become an integral part of the research, government and industry sectors. They provide automated functions for the creation, access, processing and analysis of data. The development of data-focused services is steadily increasing in Australia across the NCRIS capabilities, CSIRO, and government agencies, all of whom are moving to more formal data publishing through services and making their data findable, accessible and interoperable to increase reusability for wider communities.

This one-day workshop will cover:

  • Current usages of a suite of web data service technologies, including DAP (i.e., THREDDS, OPeNDAP, Hyrax, ERDDAP or PyDAP), CHORDS, ThingSpeak, etc.;
  • Protocols and standards for service discovery and use (e.g., OGC standards, FDSN Web Service Specifications, OpenAPI);
  • Metadata that describes them (e.g., ISO 19115, ISO 19119); and
  • Horizon scans (e.g., the redesign of OGC web service standards to be more resource-oriented and use OpenAPI/Swagger toolsets).
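
As one concrete picture of what machine-to-machine access looks like for the DAP services listed above, the sketch below builds a DAP2-style request URL that asks a server for an index-based subset of a single variable. The endpoint, file name and variable name are hypothetical, and the helper `dap_subset_url` is an illustration rather than part of any library.

```python
from urllib.parse import quote


def dap_subset_url(base_url, variable, slices, response="ascii"):
    """Build a DAP2 request URL for an index-based subset of one variable.

    Each entry in `slices` is a (start, stride, stop) tuple of integer
    indices, one per dimension, with `stop` inclusive as in DAP2.
    """
    hyperslab = "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in slices)
    # Keep brackets and colons readable instead of percent-encoding them.
    constraint = quote(variable + hyperslab, safe="[]:")
    return f"{base_url}.{response}?{constraint}"


# Hypothetical endpoint and variable name, for illustration only.
url = dap_subset_url(
    "https://example.org/opendap/sst.nc",
    "sst",
    [(0, 1, 0), (10, 1, 20), (30, 2, 40)],
)
print(url)
```

Because the subset is expressed in the URL itself, any client that can issue an HTTP GET can retrieve just the slice it needs without downloading the whole file.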

The aim is to then ask how to more efficiently use data services to meet both the challenges of today and those of the future.

This workshop will be of interest to:

  • Data service providers, for exchanging the latest practices and technologies for providing data services;
  • Data service consumers, for raising data usage requirements and exploring how to make best use of data services; and
  • Technology and standards communities, for communicating and getting feedback from both providers and communities.

Note: The technologies and standards discussed above also apply to data beyond the earth and environment domain, so other communities are more than welcome.


Biography:

To be advised

CSIRO Datasets Project – Building student data literacy with CSIRO research data

Graeme Buckie1, William Flynn

1CSIRO – Education and Outreach, Adelaide, Australia

 

In order to become empowered citizens in the modern world, students need key data literacy skills, including statistical analysis, visualisation comprehension, and data-driven critical thinking. Having access to these skills enables students to better understand what they’re being told by others, while also giving them the tools to generate their own answers using publicly accessible data. The CSIRO Datasets Project features an intuitive access platform, enabling access to CSIRO data sets and creative resources to guide teachers in introducing data analysis and computer science principles to their students, with a focus on real-world scientific research.

Participants in this workshop will explore CSIRO’s Datasets Project and develop the skills needed to bring these resources into the classroom. Over the course of the session, participants will work with educational datasets and lesson plans developed from publicly accessible research data from the CSIRO Data Access Portal. Using these resources, they will examine and develop their data analysis skills, examine the associated lesson plans in detail, discuss the importance of data visualisation, discuss cross-curricular links to data science in the Australian Curriculum and examine pedagogical practice to support their students.


Biography:

To be confirmed

Introduction to Machine Learning

Mr Chris Watkins1

1CSIRO, Clayton South, Australia

 

As CSIRO embraces the transition into the technological age, it has spawned a variety of digital initiatives designed to accelerate researchers’ application of modern digital advances to their technical domains. One such initiative is the CSIRO Data School program, which has been designed to equip scientists with the tools necessary to apply defensible, reproducible data analytics to unique scientific datasets. This workshop will be built around a small part of the Data School program, designed to introduce participants to the opportunities and challenges offered by the application of modern Machine Learning (ML) techniques.

Our C3DIS offering will first introduce ML and demystify the associated hype, provide a light overview of some useful ML approaches and, most importantly, equip attendees with the ability to verify and validate the results produced by their ML pipeline. We will highlight some common difficulties with real world datasets, how to identify these problems and how to rectify them.
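
One simple way to verify and validate a model's results is to hold out part of the data and compare the error on seen and unseen points. The sketch below does this for a least-squares line fit using only the standard library. The synthetic data, the 80/20 split and the helper names are assumptions made for illustration and are not taken from the workshop material.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, xs, ys):
    """Mean squared difference between the line's predictions and ys."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data: y = 2x + 1 plus Gaussian noise (illustrative only).
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

# Shuffle the indices, then hold out 20% of the points for validation.
indices = list(range(len(xs)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
train, held_out = indices[:split], indices[split:]

model = fit_line([xs[i] for i in train], [ys[i] for i in train])
train_mse = mean_squared_error(model, [xs[i] for i in train], [ys[i] for i in train])
held_out_mse = mean_squared_error(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
print(f"slope={model[0]:.2f} intercept={model[1]:.2f}")
print(f"train MSE={train_mse:.3f} held-out MSE={held_out_mse:.3f}")
```

A held-out error far above the training error would suggest overfitting or a mismatch between the two subsets, which is the kind of problem a validation step is meant to surface.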

The focus of the workshop will be on applications to scientific datasets, with examples including image data, time series data and regression problems. The workshop uses Python as its delivery vehicle, so some familiarity with the language will be assumed. We will be using Google Colaboratory as a compute environment, so attendees are only required to bring a laptop with an internet connection. There will be limited support on offer should attendees wish to set up their own local environments.


Biography:

Research software engineer with the Scientific Computing team at CSIRO. Chris works mostly with machine learning applications to scientific problems.

Relevant and Re-usable: Collaboration in Education

Mrs Ann Backhaus1

1Pawsey Supercomputing Centre, Kensington, Australia

 

The Australian Population clock puts us at 25,273,121. This roughly equates to one person arriving to live in Australia every 57 seconds. When Australia hit its 25 million mark, the newest resident was likely to be “young, female and Chinese”.

Are we as formal and informal educators effectively and purposefully mentoring and teaching our increasingly diverse user base? Is the learning we provide meaningful, authentic, and usable to support learning generally and individually? Is the learning we create scalable and sustainable? Are there ways we can collaborate to provide “more for less”?

This workshop explores two key themes:

  • How to make education relevant to a diverse population of learners
  • How to make educational materials that are re-usable, scalable, and sustainable

This interactive BoF opens with short presentations by seasoned educators. Attendees participate in round table discussions on key aspects of relevancy, effective teaching and learning, and re-usable content in the context of HPC education. Working in groups, participants share current practices and think laterally. Good practices act as a springboard to new and/or modified educational frameworks and practices.

The outcomes of the session include:

  • Sharing of current and good practices in education
  • Discussion on the tensions between effective teaching and relevancy (for learners) and re-usability, sustainability, and scalability (for educators and mentors)
  • Practical additions to participants’ educational toolboxes
  • Increased collegial networks
  • Next steps – establishment of long-term collaborative connections with the goal of enabling participants to “do more with less”

Biography:

Ann is the Education and Training Manager at Pawsey Supercomputing Centre in Kensington, Western Australia. Ann has significant experience in adult teaching and learning. She has led distributed, global teaching & learning teams strategically and operationally, in the creation of such assets as e-learning materials, digital stories, microlearning units, webinars, demonstrations, web sites and wiki sites, digital classroom courses and workshops, and a variety of enablement materials. Her experience spans numerous industries and sectors.

