Dr Thomas Lyall Keevers
DSTG, Sydney, Australia
State-of-the-art machine learning models excel at a range of difficult tasks, often attaining or surpassing human-level performance. Curiously, these same models are often vulnerable to small adversarial perturbations. The origin of these vulnerabilities, and solutions to them, remain elusive. A number of papers have explored these effects empirically, using either complex state-of-the-art models or simple toy models that provide greater transparency, or analytically, by deriving bounds on classifier robustness to adversarial perturbation. In this talk we examine a range of simple, but carefully crafted, models to untangle the influence of adversarial examples on classifier performance. We find that in several instances adversarial examples can induce phase transitions, such that model properties change abruptly when the adversarial perturbations exceed situation-specific thresholds. We relate these transitions to the tension between achieving model accuracy and maintaining local stability.
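As background for the threshold behaviour described above, a common way to construct adversarial perturbations is the fast gradient sign method (FGSM). The following is a minimal, illustrative sketch on a hand-picked linear classifier; it is not the talk's models, and the weights and epsilon values are assumptions chosen only to show a prediction flipping once the perturbation size crosses a threshold.

```python
import numpy as np

# A fixed "trained" linear binary classifier: score = w . x + b,
# predict the sign of the score. Weights are illustrative assumptions.
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def predict(x):
    return 1 if w @ x + b > 0 else -1

# FGSM on a linear model: the gradient of the score w.r.t. x is just w,
# so we step against sign(w), scaled by epsilon, to reduce the score
# for a +1-labelled input.
def fgsm(x, label, eps):
    return x - eps * label * np.sign(w)

# A correctly classified input with label +1.
x = np.array([2.0, 0.5, 1.0])

# Small perturbations leave the prediction unchanged; past a
# situation-specific threshold, the predicted label flips abruptly.
for eps in [0.1, 0.5, 1.0]:
    x_adv = fgsm(x, 1, eps)
    print(f"eps={eps}: prediction={predict(x_adv)}")
```

For this classifier the score of the perturbed input is `1.6 - 3.5 * eps`, so the prediction holds at `eps = 0.1` and flips by `eps = 0.5`: a toy analogue of the abrupt, threshold-dependent change in model properties the talk examines.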
Thomas Keevers completed a Bachelor of Science (Advanced) with First Class Honours at the University of Sydney in 2011 and earned a Ph.D. in physics at the University of New South Wales in 2016. Since joining Joint and Operations Analysis Division in 2016, Thomas has provided analytic support to numerous defence projects.