Introduction to the Galaxy workflow tool

Dr Gareth Price1, Simon Gladman4, Mr Derek Benson2, Dr Tim Ho3

1Queensland Facility for Advanced Bioinformatics (QFAB), St Lucia, Australia; 2CSIRO, Pullenvale, Australia; 3CSIRO, Clayton, Australia; 4Melbourne Bioinformatics, Australia

 

This workshop will introduce the Galaxy platform for performing genetic and genomic analysis. The platform provides a workflow engine supported by a large number of tools covering common tasks such as DNA/RNA manipulation, mapping, filtering, ranking and annotation, as well as phylogenetics and metagenomics.

This workshop will be of interest to anyone seeking practical strategies and tools that they can use to investigate their own genomic data. The workshop aims to demonstrate the power of linking tools into an analysis pipeline, or “workflow” in Galaxy terminology. Through a small number of hands-on tutorials, workshop attendees will construct, edit and chain workflows to demonstrate the reproducibility and reduced hands-on time that come with managing analysis through Galaxy workflows.


Biography:

Gareth Price has been a Genomics Scientist for over 15 years. He has been involved in experimental design, assay performance, data QC, data analysis and data interpretation, from early printed microarrays and cartridge-based GeneChips through to multiple Next Gen platforms. This work has involved a variety of model organisms, from microorganisms, fruit flies and mice to humans.

Gareth’s view is that research, clinical research, and healthcare are at their best when coupled with the most accurate, highest-throughput and most innovative technology and analysis. He uses this view to motivate the use of innovation to reduce the time between data generation and data summarisation, ready for the important phase of data interpretation and the resulting discovery.

Narrative Visualisation: Telling Stories with Data

Mr Marcin Nowina-Krowicki1, Dr Steven Wark1, Dr Andrew Cunningham2

1Defence Science & Technology Group, Edinburgh, Australia; 2University of South Australia, Adelaide, Australia

 

The challenge for an analyst working with Big Data is to gain actionable insights into the significant factors and relationships that impact on business outcomes. This requires not only visualisation but also telling the story behind the data. Narrative visualisation is an approach that allows an analyst to understand these complex factors and relationships in an intuitive and engaging way. It can be used to discover knowledge within the data, or to explain how insights were obtained and their impact. This session invites presentations and demonstrations of interesting techniques across diverse domains that could contribute to this approach.

Visualising Data with R Shiny

Dr Louise Ord1

1CSIRO, South Eveleigh, Australia

 

Shiny makes it easy to build interactive, web-deployed applications using R. Dr Louise Ord will be running a hands-on R Shiny workshop to take you through all you need to begin developing your own Shiny dashboards and visualisations. In this workshop, you will be given example code that can be used as a template for your own application development. You will be taught the key aspects of reactive programming, how to create interactivity with built-in control widgets and external R and JavaScript tools, and methods for fully customising the design and appearance of your Shiny dashboards and visualisations.

Prerequisites: Basic programming skills. A basic understanding of R would be useful but is not necessary.


Biography:

Louise Ord enjoys bringing data to life through analysis, exploration and visualisation. After gaining her doctorate at the University of Oxford, she spent six years as a cosmologist, analysing cosmic microwave background anisotropies. She then moved into the field of data analytics and predictive modelling, where she developed a passion for data visualisation. Now at CSIRO, Louise designs and creates analytics and visualisation tools to bring data insights to scientists.

Organisational transformation to support digital research and science

Mr Angus Macoustra, Mr Brendan Dalton

 

Advanced scientific and research instruments continue to generate massive amounts of rich and complex data. This data is subsequently captured and processed using increasingly capable computing infrastructure and processing pipelines. As a result, scientific and research methodologies have had to change and adapt. The changed methodologies and supporting technologies were even given special terms, “eScience” and “eResearch”, to differentiate them from previous practices.

However, adaptation to meet these changes has been somewhat piecemeal and variable in its pace and approach in the past, with successful adoption largely driven by individual research project needs.

Digitally enabled science and research requires not just support through methodological and technological changes, but also organisational change in the way that institutions carry out their “business”. To address that organisational change, CSIRO is in the process of implementing several transformational initiatives that will support its digital research and science into the future:

  1. Challenges and Missions
  2. Managed Data Ecosystem
  3. Digital Academy
  4. Future Science and Technology

The workshop will examine each of these initiatives and how they address the Challenges and Digital Transformation program within CSIRO through presentations and active discussion between CSIRO leaders and workshop participants.


Biography:

Angus Macoustra is the Chief Technology Officer and Head of Scientific Computing, Information Management Technology, CSIRO.

Brendan Dalton is the Chief Information and Data Officer for CSIRO.

Jupyter Jam Session

Dr Sara King1, Dr Carina Kemp2

1AARNet, Adelaide, Australia; 2AARNet, Canberra, Australia

 

It’s time all sorts of us got into a room to do some Jupyter jamming! We hope you like jamming too!

This workshop is for conference participants at all levels of knowledge and ability.  There is an incredible array of people in the C3DIS community with areas of expertise to share and novel approaches and solutions to complex problems to offer. We’d like to invite you to share what you know!

The ‘Jupyter Jam Session’ workshop is designed as an interactive session using Jupyter Notebooks. It brings together everyone from those interested in trying Jupyter Notebooks for the first time, through those who have been using them for a while and would like a new challenge (or a solution to a problem), right up to those with Wizard Level skills, to share their awesome powers.

We’ll facilitate a self-selection process to create groups so that each participant can get a feel for what level they are at and make the most of the workshop. Bring your curiosity and creativity, your special skills and clever hacks!

The workshop is targeted at researchers, research support, technologists, and eResearch professionals with an interest in testing out a mix of techniques, programming languages and libraries, using a Jupyter notebook, and sharing their own techniques. Participants need not be proficient in any programming language, just comfortable with an interactive session, being walked through a series of practical exercises, and learning some useful commands and processing techniques.


Biography:

Dr Sara King is an eResearch Analyst with Australia’s academic and research network provider, AARNet. She has extensive experience in researcher engagement and training, with expertise in research data and technologies in the Humanities and Social Sciences (HASS) research areas.

Dr Carina Kemp is the Director of eResearch for AARNet, responsible for making the network work the best it can for research in Australia. She works with the Australian and international research community to find and implement tools that sit above the network to make technology and data research-ready. Prior to joining AARNet, Dr Kemp was the Chief Information Officer at Geoscience Australia.

Biocuration: How to make your data F.A.I.R to amplify innovation and promote collaboration?

Ms Priyanka Pillai1, Mr Rowland Mosbergen1

1The University of Melbourne, Parkville, Australia

 

The volume of data generated from health and biological sciences is growing exponentially. Currently there are limited ways to enable data discovery, retrieval and interoperability within and across domains in health and biological sciences. With the increase in data availability, there is also a growing demand for high-throughput data analytics, visualisation, text mining and machine learning methods. Managing health and biological data requires a range of different activities, starting with understanding the provenance and context of the data. The biocuration process aims to maximise the value of information and knowledge assets generated by researchers. In this workshop, we will start off with some presentations on existing biocuration practices both nationally and internationally, and then break out into group discussions on challenges, case studies and next steps. This workshop aims to build a proactive community of practice around biocuration to improve knowledge discovery, accessibility, aggregation and integration.

Category and activity: Collaboration and Engagement; Outreach, Training & Capability Development

Keywords: Data Curation, Biocuration, Metadata, Ontologies

Target audience: Anyone currently working with health and biosciences datasets or interested in learning about biocuration

Learning objectives:

  1. Improved understanding of existing biocuration practices both nationally and internationally
  2. Understanding of commonalities in data practices across a range of biological and health domains
  3. Understanding of existing pain-points in data discovery, interoperability and integration across different biological and health domains
  4. Improved understanding of biocuration practices through case studies

Workshop style: Interactive workshop with tabletop mapping of challenges, case studies to explore biocuration practices, group exercises and discussions.


Biography:

Priyanka Pillai is a bioinformatician and a software programmer by training and works as a Research Data Steward in the Melbourne Data Analytics Platform (MDAP). Priyanka also works as a Health Informatics Specialist for the Australian Partnership for Preparedness Research on Infectious Disease Emergencies (APPRISE) Centre of Excellence based at the Doherty Institute. Priyanka’s role with MDAP supports the uplift of data management capabilities at the University and also involves collaborating with academics on data-intensive research like bioinformatics and machine learning. Her role as a health informatician for APPRISE CRE supports a geographically distributed network of data holders and researchers and provides strategic advice to facilitate national and international information sharing. Priyanka has also been involved with science mentoring programs at the University and is an advocate of inclusivity of women in science.

DevOps workshop

Mr Sven Dowideit1

1CSIRO, Brisbane, Australia

 

The DevOps workshop is a bi-annual opportunity to discuss modern cloud-native approaches to designing and implementing systems – and how to apply them to eResearch science delivery.

We all have different areas of expertise, and so have lots we can help each other with. In the past we’ve had interesting discussions ranging from technology-specific issues, through different management concerns, to the organisational changes needed for cloud-based implementations.

This workshop is generally run as a participant-driven un-conference, with speakers giving short presentations on selected examples of DevOps practice to trigger a discussion between participants – what issues they have, how they might apply these practices, and how they’ve encountered or solved similar issues.

Topics that we’re likely to cover this time:

* Host and cluster configuration management – such as Terraform

* Continuous integration and deployment – GitLab, Jenkins

* Workflow, Pipelines and HPC

* More Orchestration – Kubernetes, Docker Swarm

* Monitoring, Logging and Observability – Prometheus, Loki, Jaeger

* Hybrid cloud, on-premises?

* https://landscape.cncf.io/


Biography:

Sven Dowideit has a wealth of experience in software development processes and in applying large-scale DevOps tools at the small scale.

Developing scalable and portable computational workflows with Nextflow

Dr Rad Suchecki1

1CSIRO, Urrbrae, Australia

 

Large analysis workflows are fragile ecosystems of software tools, scripts and dependencies. This complexity commonly makes them hard to maintain and extend, and all but impossible to use outside their original development environment. Nextflow is a workflow framework and a domain-specific programming language which follows the dataflow paradigm and offers an alternative, and arguably superior, approach to developing, executing and sharing pipelines. Nextflow offers seamless integration with code and container image hosting services such as GitHub and Docker Hub, and out-of-the-box support for various HPC cluster schedulers and cloud compute systems.

In this workshop you will learn:

  • about processes, channels and operators – the building blocks of Nextflow
  • how to run, port and customise existing Nextflow workflows
  • how to develop a simple Nextflow workflow from scratch
  • how to separate the pipeline logic from compute and software environment configuration

By the end of this workshop you will be ready to start developing shareable, version-controlled, container-backed workflows, which can be seamlessly executed across different environments, from a laptop to a cluster to the cloud.
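
As a rough illustration of the dataflow idea behind processes, channels and operators, the sketch below models the three concepts in plain Python. It is not Nextflow syntax, and every name in it (`channel`, `process`, `filter_op`, `run_pipeline`, the file names) is hypothetical and chosen only for this example: values flow lazily through a chain of steps, and each step only sees what arrives on its input.

```python
# Illustrative only: plain Python, NOT Nextflow syntax. All names are hypothetical.
from typing import Callable, Iterable, Iterator, List


def channel(items: Iterable[str]) -> Iterator[str]:
    """A 'channel' is modelled here as a lazy stream of values."""
    yield from items


def process(task: Callable[[str], str], inputs: Iterator[str]) -> Iterator[str]:
    """A 'process' applies one task to each item arriving on its input channel."""
    for item in inputs:
        yield task(item)


def filter_op(keep: Callable[[str], bool], inputs: Iterator[str]) -> Iterator[str]:
    """An 'operator' reshapes a channel; this one drops items that fail a test."""
    return (item for item in inputs if keep(item))


def run_pipeline(samples: List[str]) -> List[str]:
    """Chain channel -> operator -> process -> process, then collect the results."""
    reads = channel(samples)
    fastq_only = filter_op(lambda name: name.endswith(".fastq"), reads)
    # Stand-ins for real tools: each 'task' just renames its input.
    trimmed = process(lambda name: name.replace(".fastq", ".trimmed.fastq"), fastq_only)
    aligned = process(lambda name: name.replace(".trimmed.fastq", ".bam"), trimmed)
    return list(aligned)


if __name__ == "__main__":
    print(run_pipeline(["a.fastq", "notes.txt", "b.fastq"]))
```

In this toy version the pipeline logic (`run_pipeline`) knows nothing about where or how each task runs, which is the separation of concerns the workshop describes; a real workflow framework adds scheduling, containers and resumability on top.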


Biography:

Rad Suchecki obtained his BSc and PhD from the School of Computing Sciences, University of East Anglia, Norwich, UK. During his postdoc at The University of Adelaide, he developed high-performance computational pipelines and web applications for integration and visualisation of biological data. He continues this work in CSIRO’s Aginformatics group where he applies and develops frameworks and software to drive reproducibility in crop informatics and data science.

Galaxy for Scientists – Advanced Hands-on Tutorials

Dr Gareth Price1, Simon Gladman4, Mr Derek Benson2, Dr Tim Ho3

1Queensland Facility for Advanced Bioinformatics (QFAB), St Lucia, Australia; 2CSIRO, Pullenvale, Australia; 3CSIRO, Clayton, Australia; 4Melbourne Bioinformatics, Australia

 

Galaxy is a popular web-based scientific analysis platform used by tens of thousands of scientists across the world to analyse large datasets from areas such as genomics and metagenomics.

This workshop will be of interest to Galaxy users who want to find out more about the use of Galaxy in areas such as:

– Genome annotation

– Metagenomics

– Statistics and machine learning

We expect workshop attendees to have some knowledge of Galaxy and already have access to usegalaxy.org.au.


Biography:

Gareth Price has been a Genomics Scientist for over 15 years. He has been involved in experimental design, assay performance, data QC, data analysis and data interpretation, from early printed microarrays and cartridge-based GeneChips through to multiple Next Gen platforms. This work has involved a variety of model organisms, from microorganisms, fruit flies and mice to humans.

Gareth’s view is that research, clinical research, and healthcare are at their best when coupled with the most accurate, highest-throughput and most innovative technology and analysis. He uses this view to motivate the use of innovation to reduce the time between data generation and data summarisation, ready for the important phase of data interpretation and the resulting discovery.

EnviroCHI: Advancing Computer Human Interaction for Environmental Science and Education

Dr Ulrich Engelke1, Dr Dirk Slawinski2, Dr Anais Pages3

1CSIRO Data61, Kensington, Australia; 2CSIRO Oceans and Atmosphere, Crawley, Australia; 3Department of Water and Environmental Regulation, Joondalup, Australia

 

Our natural environment is under significant pressure: climate change, plastic pollution, and biodiversity loss are among the major threats that our planet is currently facing. Scientists study environmental phenomena to understand these changes and develop models of future impact. Technology plays an ever more important role in these endeavours, and dealing effectively with the overwhelming number of data sources is essential to making informed decisions. The human element plays a particularly important role in leading positive change towards a more sustainable society and resilient environment. It is therefore imperative to enable scientists to do their research, educate the general public, and reach out to key decision makers. The aim of this workshop is to bring the computer human interaction, environmental science, and education communities together to identify major challenges and opportunities at the intersection of these fields. We are particularly interested in identifying how the environmental science and education communities can benefit from recent advances in human computer interaction, scientific visualisation, and interactive data analytics. One focus area will be recent developments in immersive technologies, such as virtual and augmented reality, and how these could be deployed for effective exploration of vast environmental data sets. The initial output of the workshop may be a publication (e.g. a white paper) summarising its findings, while longer-term outcomes are intended to include a national collaborative research agenda for computer human interaction in environmental science and education.


Biography:

Ulrich Engelke is a Senior Research Scientist at CSIRO Data61 in Kensington, WA, where he leads the Immersive Analytics research initiative. His current research focuses on human factors and user experience in visual and immersive analytics systems.

Dirk Slawinski is a Senior Experimental Scientist at CSIRO Oceans and Atmosphere in Crawley, WA. His research focuses on coastal ocean and estuarine modelling, particle tracking of pelagic and benthic fauna, and benthic habitat modelling and estimation.

Anais Pages is a Senior Scientist at the Department of Water and Environmental Regulation in Joondalup, WA. She has extensive research experience in the fields of oceanography, geology and organic geochemistry as well as in environmental monitoring and site remediation.

