Dr Abeer Mazher1, Dr Luk Peeters2
1CSIRO, Perth, Australia, 2CSIRO, Adelaide, Australia
Numerical modelling increasingly generates massive, high-dimensional spatio-temporal datasets, and exploring such datasets relies on effective visualization. This study presents a generic workflow to (i) project high-dimensional spatio-temporal data onto a two-dimensional (2D) plane in a computationally efficient manner, such that distances between data points in the high-dimensional space are preserved accurately in 2D, and (ii) represent the 2D projection spatially using a two-dimensional perceptually uniform background color map.
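Step (ii) can be illustrated with a minimal sketch. The abstract does not say which 2D color map is used, so the code below assumes one common construction: the 2D coordinates are scaled onto the a*-b* plane of CIELAB at a fixed lightness and then converted to sRGB. The function names `lab_to_srgb` and `embedding_to_colors` and the parameters `lightness` and `radius` are hypothetical, chosen only for this illustration.

```python
def lab_to_srgb(L, a, b):
    """Convert a CIELAB colour (D65 white) to 8-bit sRGB, clipping out-of-gamut values."""
    # CIELAB -> XYZ
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0

    def f_inv(t):
        delta = 6.0 / 29.0
        return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)

    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    x, y, z = xn * f_inv(fx), yn * f_inv(fy), zn * f_inv(fz)

    # XYZ -> linear sRGB
    r = 3.2406 * x - 1.5372 * y - 0.4986 * z
    g = -0.9689 * x + 1.8758 * y + 0.0415 * z
    bl = 0.0557 * x - 0.2040 * y + 1.0570 * z

    def gamma(c):
        c = min(max(c, 0.0), 1.0)  # clip to the displayable range
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    return tuple(round(255 * gamma(c)) for c in (r, g, bl))


def embedding_to_colors(points, lightness=70.0, radius=40.0):
    """Map 2D embedding coordinates to sRGB via the a*-b* plane at fixed lightness.

    Both axes share one scale factor, so distances in the embedding stay
    proportional to distances in the (approximately uniform) colour plane.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 or 1.0
    return [
        lab_to_srgb(lightness, radius * (x - cx) / half, radius * (y - cy) / half)
        for x, y in points
    ]


# Hypothetical 2D projection of four grid cells
colors = embedding_to_colors([(-1, -1), (0, 0), (1, 1), (1, -1)])
```

Each grid cell is then drawn on the map in its assigned color, so cells that are close in the 2D projection appear in similar colors. The point at the centre of the embedding maps to a neutral grey; larger `radius` values give more saturated colors at the cost of more gamut clipping.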
Machine learning (ML) based dimensionality reduction techniques (DRTs) for data visualization, namely t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), are compared with traditional Principal Component Analysis (PCA) in terms of accuracy, resolution and computational efficiency, with each projection rendered using the perceptually uniform color scheme. Accuracy is evaluated using a DRT-independent quality metric based on the co-ranking framework.
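The abstract does not name the specific quality metric, so the sketch below assumes one widely used measure from the co-ranking framework, Q_NX(K): the average fraction of each point's K nearest neighbours in the high-dimensional space that are also among its K nearest neighbours in the 2D projection. The function names `knn_sets` and `q_nx` and the toy data are hypothetical, and the brute-force neighbour search is suitable only for small examples.

```python
import math


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbours (Euclidean)."""
    n = len(points)
    result = []
    for i in range(n):
        # Ties in distance are broken by index so the ranking is deterministic.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        result.append(set(order[:k]))
    return result


def q_nx(high, low, k):
    """Average K-neighbourhood overlap between the original data and its projection.

    Returns a value in [0, 1]; 1 means every K-neighbourhood is preserved.
    """
    n = len(high)
    overlap = sum(
        len(h & l) for h, l in zip(knn_sets(high, k), knn_sets(low, k))
    )
    return overlap / (k * n)


# Toy data: five points in 3D and two candidate 2D projections
high = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (10, 0, 0)]
faithful = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
scrambled = [(0, 0), (10, 0), (1, 0), (3, 0), (2, 0)]

score_faithful = q_nx(high, faithful, k=2)
score_scrambled = q_nx(high, scrambled, k=1)
```

Because the score depends only on neighbour ranks in the two spaces, it can be computed in the same way for PCA, t-SNE and UMAP projections. Small K probes local structure and large K probes global structure.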
The workflow is applied to an output dataset of an Australian Water Resource Assessment (AWRA) Model for Tasmania, Australia. The dataset consists of daily time series of nine components of the water balance at a 5 km grid cell resolution for the year 2017. The case study shows that PCA provides rapid visualization of the global data structure, while the more computationally demanding t-SNE provides a more accurate representation of local trends and variations. However, UMAP preserves more global structure than t-SNE and has superior run-time performance. The spatial visualization workflow, which couples low-dimensional projection with perceptually uniform color maps, allows visual expert interpretation of high-dimensional datasets and is expected to perform well for earth science applications.
As an applied statistician, I have extensive experience in developing statistical algorithms for the fields of Earth Sciences, Econometrics and Remote Sensing, with the aim of providing solutions to real-world problems. I work with a diverse team of researchers to explore pattern recognition, machine/deep learning and visualization techniques for Remote Sensing and Earth Science applications.