CloudStor Service – Re-engineering CloudStor for Infinite Scalability

Gavin Kennedy1

1AARNet, Milton, Australia

 

Since its launch as a simple file sharing service nearly 10 years ago, AARNet’s CloudStor has become a significant resource for the Australian research community to store and share research data. Its ubiquity and flexibility, together with a complimentary 1 TB allocation, have pushed the number of individual user accounts up to 70,000 this year and driven demand for additional functionality. Universities and research institutions are increasing their uptake of CloudStor’s collaborative group drive feature, with 3 PB allocated, and several universities are signing on for enterprise storage allocations, making CloudStor a foundation of their research data storage strategy.

Meeting this rapid increase in utilisation and ensuring a high quality service has required the AARNet Cloud Services team to redesign CloudStor to ensure it remains reliable, responsive and scalable into the future. One of the key strategies is to shard the application stack, creating instances of CloudStor that can run independently but interoperate in a shared namespace. This presentation will discuss the architectural approach taken by AARNet to achieve this, highlighting the performance and scalability improvements. It will then discuss AARNet’s roadmap for CloudStor and the benefits and opportunities this approach delivers, including dedicated S3 storage targets, sensitive data stores and tailored institutional instances.
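As a purely illustrative sketch of the sharding idea (not AARNet’s implementation), independent instances behind a shared namespace can be pictured as a deterministic lookup from the owner segment of a path to the instance that holds that owner’s data. The shard names, the hash-based placement rule and the function names below are all assumptions made for the example.

```python
import hashlib
from typing import List, Tuple

# Hypothetical shard instances; these names are illustrative only.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(owner: str, shards: List[str] = SHARDS) -> str:
    """Pick a shard deterministically from the owner name (assumed placement rule)."""
    digest = hashlib.sha256(owner.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def resolve(path: str, shards: List[str] = SHARDS) -> Tuple[str, str]:
    """Map a shared-namespace path such as '/alice/data.csv' to (shard, path)."""
    parts = path.strip("/").split("/")
    if not parts[0]:
        raise ValueError("path must start with an owner segment")
    # Every path under the same owner resolves to the same shard.
    return shard_for(parts[0], shards), path
```

In this sketch a user always sees one namespace, while each request is served by a single instance. A production design would more likely use an explicit placement table than a hash, so that accounts can be moved between instances.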


Biography:

Gavin Kennedy is AARNet’s Cloud Services Product Manager. Gavin comes from a background of research data management and is a passionate advocate for open source platforms to support open research and the FAIR data principles. Gavin has over 30 years’ experience as an ICT professional including 18 years in research, eResearch & research support. His roles have included operations manager, software engineer, bioinformatics researcher, project manager, outreach manager, team leader and innovation lead.

Location Index: Capability, Methodology and Approach

Ms Irina Bastrakova1, Mr Shane Crossman1, Mr Paul Box2

1Geoscience Australia, Canberra, Australia, 2CSIRO, Australia

 

Location can be described and used to spatially integrate and enable data in a number of different ways. The Location Index (LOC-I) is a framework that provides a consistent way to seamlessly integrate data on people, business, and the environment.

This “GIS without GIS” framework brings together modern technological approaches of Linked Data and Discrete Global Grid Systems (DGGS), as well as important aspects of Social Architecture, to ensure the relevance, transparency, openness and accessibility of multidisciplinary data for Australian Government and non-government users.

LOC-I aims to provide the standards-based flexible spatial data infrastructures to support cross-sector decision making, planning and service delivery.

LOC-I also aims to support cross-domain foundation data linkages and analysis tools to open up substantial opportunity for providing a richer set of information to develop, analyse and evaluate policy, programs and services.

By developing new capabilities across Commonwealth agencies, the Location Index aims to provide users with:

  • improved and standardised governance of data and information
  • improved location referencing by data custodians for future use
  • improved ways of doing business, reducing cost and time in collection, management and delivery
  • a greater number of users and re-uses of the data
  • governed and managed federated supply chains
  • communities of subject matter experts to enable information sharing

Biography:

Irina Bastrakova is a Director of Spatial Data Architecture at Geoscience Australia.

Irina has been actively involved with international and national geoinformatics communities for more than 18 years. She is interested in practical applications of geoscientific and geospatial standards, leveraging common information models, data patterns and vocabularies across multidisciplinary data assets to support effective ingestion of data within High Performance environments. Her particular focus is on the application and adoption of modern technologies such as Linked Data, and on improving online accessibility to geoscientific data through the application of common metadata and data standards.

Irina has an M.Sc. in Structural Geology from the Moscow State University. Irina is the Chair of the Australian and New Zealand Metadata Working Group. She is a member of a number of international (ISGN Technical and Governance Committees) and national committees (Standards Australia, ARDC Metadata Advisory Board, Australian Government Linked Data Working Group).

Towards a National Platform for Australian Health Research Data

Mrs Lien Le1, Dr Hugo Leroux2, Mrs Kate LeMay1, Dr Adrian Burton1, Prof Lynne Cobiac2, Dr Jurgen Fripp2, Dr Liming Zhu2

1Australian Research Data Commons, Brisbane, Australia, 2Commonwealth Scientific and Industrial Research Organisation, Australia

 

Australia spends hundreds of millions of dollars on clinical or health research annually. Once a study is completed, however, its data remains dormant and inaccessible to other researchers.

The CSIRO and the ARDC have committed resources to rethinking how clinical research is conducted and managed in Australia. Through a number of workshops and engagements with key stakeholders, we seek to build a research-community-led alliance to develop a roadmap to investigate the feasibility of a trusted infrastructure that provides:

  • Increased integrity of research through transparency of data underpinning health research conclusions
  • Increased efficiency of research through quicker access to existing studies and avoidance of duplicative research
  • Increased findability, accessibility, interoperability and reusability (FAIR) of data
  • Increased innovation through new data-driven research built on a long term national data asset designed for re-use, re-purposing, linkage, meta-analysis and machine learning.

An inaugural workshop with key stakeholders in early March has emphasised the following key points:

  • There is value in, and an appetite for, a national framework enabling data sharing; however, there are currently many challenges in successfully achieving this goal.
  • There is a need for systems, data and metadata to be interoperable
  • Reproducibility of published results is a live issue internationally
  • There is a need to provide best practice infrastructure thereby obtaining best practice in research protocols
  • The availability of metadata and processes from previous studies should allow comparison across many studies


Biography:

To be confirmed

High Performance Computing at DST: An update

Dr John Taylor1

1DST Group, Canberra, Australia

 

The presentation will provide an overview of the new HPC and Computational Science program at DST and its future development. Significant progress in the development of the DST HPC program has occurred over the past year. The talk will cover the major research challenges where DST is applying HPC and computational science. Potential opportunities for collaboration will also be identified.


Biography:

Professor Taylor is currently Program leader HPC and Computational Science at DST, Research Group Leader Computational Platforms at CSIRO and Honorary Professor at the Research School of Computer Science at ANU. He has written more than 140 articles and books on computational and simulation science, climate change, global biogeochemical cycles, air quality and environmental policy, from the local to the global scale, spanning science, impacts and environmental policy.

John Taylor’s research has been widely cited and attracted significant media attention. Professor Taylor has worked as a Computational Scientist and group leader both at the Mathematics and Computer Science Division, Argonne National Laboratory and at the Atmospheric Science Division at Lawrence Livermore National Laboratory. John was Senior Fellow in the Computation Institute at the University of Chicago and has served on Advisory Panels for the US National Center for Atmospheric Research (NCAR) and the US National Energy Research Scientific Computing Center. John is a Fellow of the Clean Air Society of Australia and New Zealand.

 

Earth and Environment Science Information Partners: ESIP & E2SIP connecting networks on opposite sides of the globe

Ms Erin Robinson1, Dr. Lesley Wyborn2, Dr. Simon Cox4, Dr. Jens Klump3, Dr. Adrian Burton6, Dr Natasha Simons6, Dr Ben Evans2, Dr. Tim Rawlings5

1Earth Science Information Partners, Boulder, United States, 2Australian National University, Canberra, Australia, 3CSIRO, Perth, Australia, 4CSIRO, Melbourne, Australia, 5University of Melbourne/AuScope, Melbourne, Australia, 6ARDC, Canberra, Australia

 

Significant public investments in Australia, USA and Europe are building Earth and environmental science eResearch infrastructures for transdisciplinary research. Each is developing best practices for both infrastructure and data management to enable the FAIR principles (Findable, Accessible, Interoperable and Reusable), and all are based on a culture of sharing datasets, software, tools, data services, vocabularies, etc., across institutional, community, national and continental boundaries.

Over the last 20 years, the US-based Earth Science Information Partners (ESIP) has become a brain trust and professional home for the Earth science data and informatics community. The Australian Earth and Environment Science Information Partners (E2SIP) was established last year through liaison with ESIP at the 2018 C3DIS Conference. ESIP is providing the governance structure to incubate E2SIP as a cluster, an E2SIP representative has been added to the ESIP Board, and several Australian organizations have joined ESIP as organizational partners.

Moving forward, E2SIP intends to support similar functions in Australia by bridging across organizations including CSIRO, Bureau of Meteorology, Department of the Environment, Geoscience Australia, AuScope, NCI, ARDC, IMOS, TERN, and ALA. E2SIP is working with the National Earth and Environmental Facilities Forum which provides a common voice to government on behalf of long term science research infrastructure.

This abstract highlights the efforts of the US Earth Science Information Partners (ESIP) and the newly established Australian Earth and Environment Science Information Partners (E2SIP) over the last year and aims to introduce E2SIP to those parts of the C3DIS community who are not already involved.


Biography:

Ms. Robinson works at the intersection of community informatics, Earth science and non-profit management. Erin is currently the Executive Director for the Earth Science Information Partners (ESIP). In this position, she facilitates collaboration among over 1000 Earth science technology practitioners across 130+ organizations to expedite progress toward data interoperability and making data matter.

 

The open geospatial community in Oceania

Dr Adam Steer1

1Spatialised, Weston, Australia

 

The Open Source Geospatial Foundation (OSGeo) and the OpenStreetMap Foundation (OSMF) have been mainstays in support of, and advocacy for, open geospatial software and data for many years.

OSGeo supports foundational geospatial tooling used across the eResearch community – from invisible infrastructure (GDAL, Proj4, pyWPS, Zoo-WPS) to prominent user-focussed, user-facing applications (QGIS, geonode, geoserver, geonetwork, leafletJS, openlayers) – to name just a few.

In 2009, an international conference of the OSGeo foundation was held in Sydney; and after a long hiatus, the community was revived in 2018. The result was a sold-out joint conference of the OSGeo and OSMF communities for the Oceania region in Melbourne. This was both an incredible show of community support, and an incredible showcase of open source innovation in the region.

…and the momentum continues. By the time C3DIS happens, there will be a fully-fledged local OSGeo Oceania organisation, aimed at supporting a regional community of open source geo-developers, geo-users, and geo-enablers – and 2019 conference organisation will be in full swing.

This talk will chart the journey of OSGeo Oceania so far, and show how the eResearch community in Australia and Oceania can engage with, support, and benefit from this local and global community.

Come and join the party!


Biography:

Dr Adam Steer operates an independent consultancy using open source tools to solve wicked geospatial problems. He has previously worked in the defence startup space, at infrastructure-scale computing facilities developing new data services, and on Antarctic sea ice, merging traditional observation techniques with cutting edge geospatial technology. He’s a charter member of the Open Source Geospatial Foundation, sits on the board of OSGeo Oceania and is an Earth Observation Australia steering committee member. He has a cat, likes to play music and ride bikes and climb rocks, and makes more mistakes than the average human most days.

Preparation for CMIP6: how to deal with a multi-petabyte climate data collection

Ms Claire Trenham1, Mr Tim Erwin2, Dr Aurel Moise3, Dr Paola Petrelli4, Dr Kate Snow5, Dr Louise Wilson3, Dr Vanessa Hernaman2, Ms Clare Richards5

1CSIRO, Black Mountain, Australia, 2CSIRO, Aspendale, Australia, 3Bureau of Meteorology, Melbourne, Australia, 4Centre of Excellence for Climate Extremes (UTas), Hobart, Australia, 5National Computational Infrastructure, Acton, Australia

 

The Coupled Model Intercomparison Project phase 6 (CMIP6) represents the largest collection of climate and weather data to date, with an expected total volume of around 30 PB. To work effectively with this data in Australia, the community needs a local replica of commonly used datasets, as well as a means to find the data of interest for each researcher’s needs, and tools to effectively work with very large spatiotemporal datasets.

The National Computational Infrastructure (NCI) has established a mechanism to automatically download data from the Earth System Grid Federation (ESGF) for requested variables, and a database indexing this data. NCI supports collaborators from the Centre of Excellence for Climate Extremes (CleX), CSIRO and the Bureau of Meteorology to build and assess tools to enable effective community use of this data as it becomes available over the coming months.

The “CleF” tool has been developed by CleX to search for data stored locally at NCI and to check for additional data available on the ESGF. CSIRO and the BoM are working together to review the processing pipeline tool that was developed using CMIP5, observational and reanalysis data. We will identify what is needed to update “the pipeline” for Python 3 and to make it compatible with the CleF search tool.
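The local-first lookup described above can be sketched roughly as follows. This is not the CleF code or its interface: the record structure, the two in-memory catalogues and the function name are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical catalogue record; the field names are illustrative."""
    model: str
    variable: str
    version: str


def find_datasets(
    variable: str,
    local_index: Iterable[Dataset],
    remote_index: Iterable[Dataset],
) -> Tuple[List[Dataset], List[Dataset]]:
    """Return (held locally, available only remotely) for one variable."""
    local = {d for d in local_index if d.variable == variable}
    remote = {d for d in remote_index if d.variable == variable}

    def order(d: Dataset) -> Tuple[str, str]:
        return (d.model, d.version)

    # Anything already replicated locally is excluded from the remote list.
    return sorted(local, key=order), sorted(remote - local, key=order)


# Example with made-up model names; "tas" stands in for a requested variable.
local_catalogue = [Dataset("ModelA", "tas", "v1")]
remote_catalogue = [
    Dataset("ModelA", "tas", "v1"),
    Dataset("ModelB", "tas", "v1"),
    Dataset("ModelB", "pr", "v1"),
]
here, elsewhere = find_datasets("tas", local_catalogue, remote_catalogue)
```

The second list is what a researcher might then request for download into the local replica.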

We report on the collaboration between key organisations to prepare for the deluge of CMIP6 data. We believe we are much more ready for this dataset than we were for the ~1 PB CMIP5 dataset in 2012.


Biography:

Claire works in the sea level, waves and coastal extremes team within CSIRO’s Climate Science Centre. Claire’s background spans mathematics, astrophysics, ocean wave modelling, high performance data services, data management and climate modelling. Claire is heavily involved in climate data and preparation for CMIP6, as well as regional and coastal climate modelling, data processing, making improvements to data and software to enhance science capabilities, and participating in STEM engagement activities with school students.

 

CSIRO Datasets Project – Building student data literacy with CSIRO research data

Graeme Buckie1, William Flynn

1CSIRO – Education and Outreach, Adelaide, Australia

 

In order to become empowered citizens in the modern world, students need key data literacy skills, including statistical analysis, visualisation comprehension, and data-driven critical thinking. Having access to these skills enables students to better understand what they’re being told by others, while also giving them the tools to generate their own answers using publicly accessible data. The CSIRO Datasets Project features an intuitive access platform, enabling access to CSIRO data sets and creative resources to guide teachers in introducing data analysis and computer science principles to their students, with a focus on real-world scientific research.

Participants in this workshop will explore CSIRO’s Datasets Project and develop the skills needed to bring these resources into the classroom. Over the course of the session, participants will work with educational datasets and lesson plans developed from publicly accessible research data from the CSIRO Data Access Portal. Using these resources, they will examine and develop their data analysis skills, examine the associated lesson plans in detail, discuss the importance of data visualisation, discuss cross-curricular links to data science in the Australian Curriculum and examine pedagogical practice to support their students.


Biography:

To be confirmed

Relevant and Re-usable: Collaboration in Education

Mrs Ann Backhaus1

1Pawsey Supercomputing Centre, Kensington, Australia

 

The Australian Population Clock puts us at 25,273,121, with one person arriving to live in Australia roughly every 57 seconds. When Australia hit its 25 million mark, the newest resident was likely to be “young, female and Chinese”.

Are we as formal and informal educators effectively and purposefully mentoring and teaching our increasingly diverse user base? Is the learning we provide meaningful, authentic, and usable to support learning generally and individually? Is the learning we create scalable and sustainable? Are there ways we can collaborate to provide “more for less”?

This workshop explores two key themes:

  • How to make education relevant to a diverse population of learners
  • How to make educational materials that are re-usable, scalable, and sustainable

This interactive BoF opens with short presentations by seasoned educators. Attendees participate in round table discussions on key aspects of relevancy, effective teaching and learning, and re-usable content in the context of HPC education. Working in groups, participants share current practices and think laterally. Good practices act as a springboard to new and/or modified educational frameworks and practices.

The outcomes of the session include:

  • Sharing of current and good practices in education
  • Discussion on the tensions between effective teaching and relevancy (for learners) and re-usability, sustainability, and scalability (for educators and mentors)
  • Practical additions to participants’ educational toolboxes
  • Increased collegial networks
  • Next steps – establishment of long-term collaborative connections with the goal of enabling participants to “do more with less”

Biography:

Ann is the Education and Training Manager at Pawsey Supercomputing Centre in Kensington, Western Australia. Ann has significant experience in adult teaching and learning. She has led distributed, global teaching & learning teams strategically and operationally, in the creation of such assets as e-learning materials, digital stories, microlearning units, webinars, demonstrations, web sites and wiki sites, digital classroom courses and workshops, and a variety of enablement materials. Her experience spans numerous industries and sectors.

Recognition and Career Development for Researchers who Code: a Workshop for RSEs

Mr Rowland Mosbergen1

1University Of Melbourne, Carlton, Australia

 

Software is a critical component of modern research. However, the people creating research software are frequently unrecognised for their contribution towards research output.

This workshop aims to build a community of academics who create and maintain research software but lack the recognition and metrics needed to progress their academic careers. Also welcome are professional software engineers working in the research space, research support team members who work closely with researchers, system administrators who maintain research systems, academics who rely on such expertise, and eResearch leaders and policy makers.

Because most C3DIS delegates are researchers, we have decided to hold a Hacky Hour earlier in the week. The idea is to promote Hacky Hours generally, connect volunteer RSEs or research support people who want to help, and to get some cross domain problems written down.

We then want to feed those cross domain problems into this workshop to identify some solutions that we could convert into proposals to send to funders like the ARDC, NCI, Pawsey, and BPA. This is a great opportunity as these funders have just been provided funding and are already thinking of how they might allocate these funds.

We also hope to highlight the impact and effectiveness of RSEs by collating personas, profiles, and stories of their impact. Personas and profiles are crucial in helping RSEs self-identify and become interested in joining this budding community. The impact stories will help demonstrate the considerable value that RSEs bring to organisations.

A secondary goal of the workshop is to collect these stories and publish them on a dedicated website planned for October 2019, to be unveiled during eResearch Australasia. During the workshop, we will also work on creating engagement strategies to build awareness of the community.


Biography:

Rowland has 17 years’ experience in IT, having worked in research, corporate financial software and small business. He graduated from QUT in 1997 with a Bachelor of Engineering in Aerospace Avionics, then worked for GBST, a software company servicing the financial industry, where he worked with National Australia Bank and Merrill Lynch on their Margin Lending products for over 4 years. Rowland owned and ran a computer support business for over 5 years, then worked as a web developer for 2 years before joining the Wells laboratory as part of the Stemformatics team in 2010.


ABOUT AeRO

AeRO is the industry association focused on eResearch in Australasia. We play a critical coordination role for our members, who are actively transforming research via Information Technology. Organisations join AeRO to advance their own capabilities and services, to collaborate and to network with peers. AeRO believes researchers and the sector significantly benefit from greater communication, coordination and sharing among the increasingly different and evolving service providers.

© 2019 Conference Design Pty Ltd