Mr Morgan Williams¹,², Dr Jens Klump¹
¹CSIRO Mineral Resources, Kensington, Australia
²Australian National University, Canberra, Australia
Much of our present understanding of the Earth’s 4.5-billion-year history is derived from investigations of the chemistry and mineralogy of rocks, which can preserve records of time, environmental conditions and geological processes. While tens to hundreds of samples or analyses are commonly used to investigate specific geological problems, existing public geochemical databases allow thousands to millions of rock and mineral analyses to be investigated on a global scale.
Geochemical data are compositional in nature (i.e. the components sum to 100%), and multivariate statistical analysis therefore requires appropriate log-transformations. Global geochemical databases include analyses of varying quality and provenance conducted over several decades, and themselves pose specific technical challenges. Because of the nature of compositional data, null and below-detection values raise additional issues, which are compounded by high dimensionality and low overall data density.
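As an illustration of the kind of log-transformation referred to above, the following is a minimal sketch of the centred log-ratio (CLR) transform, one commonly used option for compositional data. It is not necessarily the transform used in this work; the function names (`close`, `clr`, `inverse_clr`) and the example values are hypothetical and chosen only for illustration.

```python
import numpy as np


def close(X, total=100.0):
    """Rescale each row so that its components sum to `total` (closure)."""
    X = np.asarray(X, dtype=float)
    return total * X / X.sum(axis=-1, keepdims=True)


def clr(X):
    """Centred log-ratio transform: log of each part minus the row mean log.

    Zeros, negative and missing values are rejected here; in practice
    below-detection and null values need to be handled before transforming.
    """
    X = np.asarray(X, dtype=float)
    if np.any(~np.isfinite(X)) or np.any(X <= 0):
        raise ValueError("clr requires strictly positive, finite values")
    log_x = np.log(X)
    return log_x - log_x.mean(axis=-1, keepdims=True)


def inverse_clr(Y, total=100.0):
    """Map CLR coordinates back to a closed composition."""
    return close(np.exp(np.asarray(Y, dtype=float)), total=total)


# Made-up four-component compositions (wt%), for illustration only.
analyses = np.array([
    [50.0, 15.0, 10.0, 25.0],
    [70.0, 14.0, 2.0, 14.0],
])

transformed = clr(analyses)
recovered = inverse_clr(transformed)
```

Each row of CLR coordinates sums to zero and is unchanged if the composition is rescaled (for example, from wt% to fractions), which is why such transforms are preferred to applying standard multivariate methods directly to closed data.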
Deriving accurate interpretations from geochemical data typically requires balancing dimensional reduction, visualisation and representation of some geological reference frame. Here we investigate the potential value of aggregating tens of thousands to millions of analyses, and highlight the advantages of using higher-dimensional multivariate compositional analysis. Well-established domain knowledge is used as a foundation for further analysis (e.g. petrological and geochemical classification, geochemical proxies). We present an accessible example platform for interactive exploration of geochemical trends and relationships using Jupyter Notebooks and web-based dashboard applications.
Geochemical databases highlight the importance of ‘small but complex’ data, and multivariate analysis of geochemistry may provide useful constraints on geological processes in the otherwise inaccessible reaches of deep time.
Morgan Williams is a Postdoctoral Fellow within the CSIRO Mineral Resources unit. With a principal background in isotope geochemistry, he is currently applying data science principles to global geochemical datasets to provide insight into large-scale geological processes.