Targeting the Cross-domain Bullseye of Metadata for Physical Samples: What is the Minimum Kernel Required for Interoperability and Reuse?

Dr Jens Klump3, Dr Lesley Wyborn1, Dr Kerstin Lehnert2, Dr Sarah Ramdeen2

1National Computational Infrastructure, ANU, Canberra, Australia, 2Lamont Doherty Earth Observatory, Columbia University, Palisades, United States of America, 3CSIRO, Perth, Australia


The 2019 CODATA Beijing Declaration on Research Data extends the term ‘data’ to include ‘physical samples and analogue artefacts (and the digital representations and metadata relating to these things)’, thus reinforcing samples as first-class citizens of the modern research data ecosystem. To realise their full potential, physical samples need to be uniquely identified, well described, and findable in online catalogues, so that they can be linked to related observational and analytical data, publications, people and other digital information.

The IGSN has been widely implemented as a persistent, globally unique identifier for physical samples. Initially adopted within the geoscience community, it has in recent years spread virally into many other domains, making it impossible to develop a single common vocabulary that defines metadata for all samples collected. One size does not fit all: each individual domain (e.g. soil scientists, biologists, cosmochemists, paleoclimate scientists, archaeologists) has its own suite of vocabularies to describe its samples.

The proposed solution is global community agreement on a minimum set of attributes that are common to all samples: the ‘Bullseye’, i.e. the common core kernel that is sample-specific but discipline-agnostic. Individual communities can then add their domain-specific metadata requirements around this agreed common kernel. As IGSN continues to grow, protocols and best practices for describing sample metadata also need to be developed and implemented. A clearinghouse should be established for individual domains to lodge their preferred sample metadata profiles, vocabularies and ontologies.


Jens Klump is a geochemist by training and leads the Geoscience Analytics Team in CSIRO Mineral Resources based in Perth, Western Australia. In his work on data infrastructures, Jens covers the entire chain of digital value creation from data acquisition to data analysis with a focus on data in minerals exploration. This includes automated data and metadata capture, sensor data integration both in the field and in the laboratory, data processing workflows and data provenance, as well as data analysis by statistical methods, machine learning and numerical modelling.


