Reticulate workflows for complex decision making – an example from an emergency animal disease decision support system

Kerryne Graham1, Duan Beckett2, Justin Freeman2, Chris Cowled3, Marcus Thatcher4, Peter Hurley5, Michael Newton6, William Scobell6, Peter Durr1

1CSIRO Australian Animal Health Laboratory, Geelong, Australia, 2Bureau of Meteorology, Melbourne, Australia, 3CSIRO Health & Biosecurity, Geelong, Australia, 4CSIRO Oceans and Atmosphere, Aspendale, Australia, 5CSIRO Oceans and Atmosphere, Black Mountain, Australia, 6NewtonGreen Technologies Pty Ltd, Newcastle, Australia

Abstract:

Introduction: The workflow concept is fundamental to decision support systems (DSS). This applies particularly to DSS used in national emergencies, such as bush-fires, floods, public health and animal health epidemics. Traditionally, emergency management DSS have used a “linear” workflow, whereby users are assumed to have a defined starting point (with respect to knowledge and data) and a clear end point which leads to an optimum decision.

Methods: We are developing a high-level framework of multi-stage workflows where end-user access is via a web interface and requests for data analyses and modelling are undertaken via an application programming interface (API) and completed on a high-performance computing infrastructure.

Results: Given the objective of providing a DSS for emergency animal diseases (EAD), we have developed SPREAD, a system for determining how an epidemic disease is spreading either within a farm or between farms. SPREAD provides visualisation of the epidemic in space and time, assessment of the role of wind dispersion, next-generation sequencing (NGS) assembly and annotation, and finally construction of various transmission networks.

Conclusion: The concept of integrating whole-genome sequencing data with wind dispersion modelling was developed from the experience of the 2007 UK Foot-and-Mouth Disease outbreak. However, this pioneering work was undertaken retrospectively, and to date no system has been developed that enables transmission pathways to be determined in near real-time. SPREAD enables animal health managers and veterinary officers to efficiently use meteorology and NGS data in reticulated workflows for effective EAD management.


Biography:

Kerryne works within the Veterinary Investigation and Epidemiology team located at the Australian Animal Health Laboratory. She is involved in several collaborative projects where she applies her expertise in data management, spatial analysis and implementation of surveillance information systems. Kerryne has most recently been involved in modelling the habitat suitability for the effective release of Cyprinid herpesvirus 3 for the control of European Carp and is also a team member of SPREAD, a web application integrating epidemic data, wind-dispersion and molecular data for surveillance and response.

Python-based visual exploration of enriched motifs from panning experiments

Dr Yi Jin Liew1, Dr Jason Ross1, Dr Kathy Surinya2, Dr Maxime Francois2, Dr Simon Puttick3, Dr Stephen Rose3

1CSIRO, North Ryde, Australia, 2CSIRO, Adelaide, Australia, 3CSIRO, Herston, Australia

Abstract:

Intro

Some tumours have proteins in their cell membranes that are absent in healthy cells. In developing potential treatments, we want to identify novel peptide binders to these proteins. To do so, phage display libraries are panned against these proteins over several rounds, so that the retained sequences are enriched for real binders.

Our team consists of biologists with extensive panning experience, chemists with protein know-how, and bioinformaticians with experience in analysing next-generation sequencing data. To ease data exploration by all parties, the bioinformaticians built an interactive dashboard in Bokeh to visualise changes in motif frequencies across any two phage display experiments. Ultimately, these plots help separate biased motifs (which are enriched due to technical factors) from true binders (enriched due to binding to the protein of interest).

Methods

We coupled a k-mer counting strategy with a custom distance matrix to cluster similar peptide sequences in the main plot. The point size of each peptide was proportional to its frequency, so that abundant peptides stood out in the plot. Based on iterative feedback from the team, we devised four colour schemes that emphasise different aspects of the data. One key feature of Bokeh, its lasso-selection tool, was enhanced so that selecting a region displays tabulated detailed information and a sequence logo for spotting general motifs. The extensible nature of the dashboard allows for the further inclusion of other informative subplots.
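As a rough illustration of the k-mer counting and distance matrix idea described above, the following Python sketch may help. It is not the authors' method: the abstract does not specify the custom distance, so the Jaccard-style multiset distance in `kmer_distance`, the choice of k = 3, and the example peptides are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(peptide: str, k: int = 3) -> Counter:
    """Count the overlapping k-mers in one peptide sequence."""
    return Counter(peptide[i:i + k] for i in range(len(peptide) - k + 1))


def kmer_distance(a: str, b: str, k: int = 3) -> float:
    """Assumed distance: 1 - shared k-mers / all k-mers (multiset Jaccard)."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    shared = sum((counts_a & counts_b).values())  # multiset intersection
    total = sum((counts_a | counts_b).values())   # multiset union
    return 1.0 - shared / total if total else 0.0


def distance_matrix(peptides, k: int = 3):
    """Build a symmetric pairwise distance matrix as a list of lists."""
    n = len(peptides)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = kmer_distance(peptides[i], peptides[j], k)
        matrix[i][j] = matrix[j][i] = d
    return matrix


# Made-up peptides: the first two differ by one residue, the third is unrelated.
peptides = ["ACDEFGH", "ACDEFGK", "WYWYWYW"]
matrix = distance_matrix(peptides)
for peptide, row in zip(peptides, matrix):
    print(peptide, [round(d, 3) for d in row])
```

A matrix of this kind could then be passed to any clustering or 2D embedding routine, so that peptides sharing many k-mers land close together in a scatter plot.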

Results/Conclusions

Based on the observations made on the plot and further calculations, we chose a few promising candidates for further testing.


Biography:

Yi Jin is currently a Research Scientist in the Molecular Diagnostics Solutions group in CSIRO, attempting to squeeze public datasets for promising cancer biomarkers. He thinks that well-visualised data speaks ten thousand words.

Prior to that, he was a postdoc at the King Abdullah University of Science and Technology in Saudi Arabia, where he studied DNA methylation in corals (and regrets never mastering diving despite working on corals AND living by the Red Sea). He graduated with a PhD in Genetics from the University of Cambridge, but has, over the years, swapped the pipette for a keyboard.

Use of neural network loss functions for highly unbalanced image segmentations

Mr Robert Langtry1

1Research Services Division at Defence Science Technology Group

Abstract:

Detection of pores in materials fabricated using additive manufacturing is key to understanding the structural integrity of those materials. Images containing very small pores yield highly unbalanced datasets, which complicates the use of machine learning methods.

U-net neural networks are well suited to the task of binary segmentation. However, traditional approaches to unbalanced data, such as under/oversampling, rely on input manipulation, which cannot be employed in this case. To resolve this issue, I have tested different loss functions from the literature on both synthetic problems and the aforementioned problem. Using these loss functions, the network is able to adapt to imbalanced datasets and gives higher quality predictions of pores than traditional techniques.

Loss functions can generate models that suit various statistical measures and allow the problem of imbalance to be solved without altering the dataset. This neural network solution is problem-agnostic and is useful outside of image segmentation tasks as a method for handling unbalanced datasets during training.
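To make the link between a loss function and a statistical measure concrete, here is a minimal Python sketch. The abstract does not name the losses that were tested, so the soft Dice loss is an assumed example; it is a differentiable relaxation of the Dice coefficient, which equals the F1 score for hard binary predictions. The 1000-pixel "image" with five pore pixels is made up for illustration.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel cross-entropy; the majority class dominates the mean."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(target)


def soft_dice_loss(pred, target, smooth=1e-6):
    """1 - soft Dice coefficient; depends on overlap with the positive class."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


# Made-up flattened image: 5 pore pixels out of 1000 (0.5% positives).
target = [1.0] * 5 + [0.0] * 995
all_background = [0.001] * 1000               # model that ignores the pores
finds_pores = [0.9] * 5 + [0.001] * 995       # model that detects the pores

bce_bg = binary_cross_entropy(all_background, target)
dice_bg = soft_dice_loss(all_background, target)
dice_found = soft_dice_loss(finds_pores, target)

print(f"BCE, all background:  {bce_bg:.4f}")
print(f"Dice, all background: {dice_bg:.4f}")
print(f"Dice, pores found:    {dice_found:.4f}")
```

In this toy case the model that ignores every pore still scores a small cross-entropy, whereas its Dice loss is close to 1, which is one way a loss can penalise missed minority pixels without resampling the data.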

In this talk I will go through a variety of loss functions, their links to statistical measures and how that is useful to a data science practitioner in both synthetic examples and the aforementioned pore detection problem.


Biography:

Robert has a master's degree in Computer Science from the University of Melbourne, where he wrote his thesis on asymmetric numeral systems compression. He works as an eResearch Specialist at the Defence Science Technology Group.

Phase Transitions Induced by Adversarial Examples in Simple Statistical Models

Dr Thomas Lyall Keevers1

1DSTG, Sydney, Australia

Abstract:

State-of-the-art machine learning models excel in a range of difficult tasks, often attaining or surpassing human-level performance. Curiously, these same models are often vulnerable to small, adversarial perturbations. The origin of these vulnerabilities and solutions to them remain elusive. A number of papers have explored these effects empirically using complex state-of-the-art models, using simple toy models that provide greater transparency, or by finding analytic bounds on classifier robustness to adversarial perturbation. In this talk we examine a range of simple, but carefully crafted, models to untangle the influence of adversarial examples on classifier performance. We find that in several instances the adversarial examples are able to induce phase transitions, such that the model properties abruptly change when the adversarial perturbations exceed situation-specific thresholds. We relate these transitions to the tension between achieving model accuracy and local stability.
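For readers unfamiliar with adversarial perturbations, the following Python sketch shows the idea on a linear classifier. It is not one of the models from the talk: the weights, the four data points and the choice of an L-infinity perturbation budget `eps` are all assumptions. For a linear score w·x + b, the worst-case perturbation within that budget moves each coordinate by eps against the label, which lowers the margin by eps times the L1 norm of w.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def margin(w, b, x, y):
    """Signed margin y * (w.x + b); positive means correctly classified."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def worst_case_perturbation(w, x, y, eps):
    """Shift every coordinate by eps in the direction that hurts label y."""
    return [xi - y * eps * sign(wi) for wi, xi in zip(w, x)]


def robust_accuracy(w, b, data, eps):
    """Fraction of points still classified correctly after the attack."""
    correct = 0
    for x, y in data:
        x_adv = worst_case_perturbation(w, x, y, eps)
        if margin(w, b, x_adv, y) > 0:
            correct += 1
    return correct / len(data)


# Assumed toy classifier and data: labels are +1 or -1.
w, b = [1.0, 1.0], 0.0
data = [
    ([1.0, 1.0], 1),
    ([2.0, 0.5], 1),
    ([-1.0, -1.0], -1),
    ([-0.5, -2.0], -1),
]

for eps in (0.0, 0.5, 1.1, 1.5):
    print(f"eps={eps}: accuracy={robust_accuracy(w, b, data, eps)}")
```

Here each point stays correctly classified until eps exceeds its clean margin divided by the L1 norm of w, so accuracy falls in steps as the budget grows. This only illustrates the attack itself and makes no claim about the phase transitions reported in the talk.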


Biography:

Thomas Keevers completed a Bachelor of Science (Advanced) with First Class Honours at the University of Sydney in 2011 and earned a Ph.D. in physics at the University of New South Wales in 2016. Since joining Joint and Operations Analysis Division in 2016, Thomas has provided analytic support to numerous defence projects.

 

