Dr Jonathan Yu¹, Dr Simon Cox¹
¹CSIRO, Clayton, Australia
Web technologies are changing the way scientific data is shared. For data to be shared efficiently, the content of datasets must be described using terms that both humans and machines can share and interpret at scale. Linked Data supports this by leveraging web principles to allow links between and within datasets. Definitions of individual fields within datasets may be assembled into vocabularies and published at standard web locations for use in multiple datasets. For example, a soil profile dataset should use a standard soil classification in which each soil type is denoted by a web identifier (HTTP URI) that can be dereferenced to obtain a description of the type, formatted using web standards such as OWL or SKOS.
Technologies and tools already exist to publish vocabularies as Linked Data, such as SISSVoc and the LDR service. These have been used to manage and publish reference data, including codelists, units of measure, substances, organisations and general ‘vocabulary’ elements. However, preparing vocabulary content for these services has required specialist RDF-based tools such as TopBraid and PoolParty, whereas domain scientists are more familiar with standard desktop productivity tools such as Excel.
We have developed Excel2LDR, a tool that enables users to define vocabulary content in an Excel template and publish it directly to a Linked Data Registry (LDR) without leaving the Excel application. We present examples of using Excel2LDR to publish vocabulary content to a CSIRO LDR instance. We also compare the tool with other Excel-based implementations.
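To illustrate the general idea of turning tabular vocabulary content into Linked Data, the following is a minimal Python sketch that converts spreadsheet-like rows into SKOS concepts serialised as Turtle. It is not the Excel2LDR implementation: the column names, the example.org namespace, the sample rows and the helper functions are all hypothetical, and the step of submitting the result to a registry is omitted.

```python
# Minimal sketch: spreadsheet-like rows -> SKOS concepts in Turtle.
# Everything here is illustrative; the column names, namespace and rows
# are assumptions, not the actual Excel2LDR template.

BASE = "http://example.org/def/soil/"  # placeholder namespace (hypothetical)

# Rows as they might be read from a template sheet (hypothetical columns).
ROWS = [
    {"id": "soil-type", "label": "Soil type",
     "definition": "Top concept for soil classes.", "broader": ""},
    {"id": "vertosol", "label": "Vertosol",
     "definition": "Clay soil that shrinks and swells.", "broader": "soil-type"},
]


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def row_to_turtle(row, base=BASE):
    """Serialise one row as a skos:Concept with label, definition and
    an optional skos:broader link to a parent concept."""
    subject = "<" + base + row["id"] + ">"
    label = escape(row["label"])
    definition = escape(row["definition"])
    lines = [
        subject + " a skos:Concept ;",
        '    skos:prefLabel "' + label + '"@en ;',
        '    skos:definition "' + definition + '"@en',
    ]
    if row["broader"]:
        lines[-1] += " ;"
        lines.append("    skos:broader <" + base + row["broader"] + ">")
    lines[-1] += " ."  # terminate the statement
    return "\n".join(lines)


def rows_to_turtle(rows, base=BASE):
    """Serialise all rows, preceded by the SKOS prefix declaration."""
    header = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
    blocks = [header] + [row_to_turtle(r, base) for r in rows]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(rows_to_turtle(ROWS))
```

Each row becomes a concept whose identifier is an HTTP URI under the chosen namespace, so a registry that hosts the vocabulary could return this description when the URI is dereferenced.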
Dr Jonathan Yu is a data scientist researching information and web architectures, data integration, Linked Data, data analytics and visualisation, and he applies this work in the environmental and earth sciences domains. He is part of the Environmental Informatics group in CSIRO Land and Water. He currently leads a number of initiatives to develop new approaches, architectures, methods and tools for transforming and connecting information flows across the environmental domain and the broader digital economy, both within Australia and internationally.