Dr Simon Cox1, Dr Siddeswara Guru2, Edmond Chuc2, Tina Schroeder2, Mosheh Eliyahu2, Yi Sun2, Jenny Mahuika2
1CSIRO, Melbourne, Australia, 2TERN, University of Queensland, Brisbane, Australia
Plot-based ecology data are collected by different agencies in multiple jurisdictions. The survey methods and procedures vary, even though the natural systems and observed properties are similar and the underlying methods all derive from a few common survey protocols. Data representations and formats also vary. As a consequence, use of the data in analysis is usually confined to the jurisdiction in which the data were collected.
Combining datasets would enable their use at different scales for analysis and synthesis. In this paper, we describe an approach to representing plot-based ecology data using standard semantic models, which allows observation data to be integrated into a common data structure. The structure uses the W3C/OGC Semantic Sensor Network vocabulary, supplemented by a small number of domain-specific classes. It is compatible with a variety of other observation data systems used internationally, potentially allowing integration with relevant non-ecology data.
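As a rough illustration of what a common observation structure can look like, the sketch below turns one flat provider record into subject-predicate-object triples using terms from the SOSA core of the Semantic Sensor Network vocabulary. The `https://example.org/plot-data/` namespace, the record field names and the sample values are hypothetical and are not taken from the system described here; the domain-specific classes mentioned above are not shown.

```python
# SOSA and RDF IRIs are real; everything under EX is a hypothetical namespace.
SOSA = "http://www.w3.org/ns/sosa/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/plot-data/"


def record_to_triples(record):
    """Map one flat provider record (assumed field names) to SOSA-style triples."""
    obs = EX + "observation/" + record["id"]
    return [
        (obs, RDF_TYPE, SOSA + "Observation"),
        # The plot is treated as the feature of interest of the observation.
        (obs, SOSA + "hasFeatureOfInterest", EX + "plot/" + record["plot"]),
        (obs, SOSA + "observedProperty", EX + "property/" + record["property"]),
        # The survey method is recorded as the procedure that was used.
        (obs, SOSA + "usedProcedure", EX + "method/" + record["method"]),
        (obs, SOSA + "hasSimpleResult", record["value"]),
        (obs, SOSA + "resultTime", record["date"]),
    ]


# Made-up sample record, for illustration only.
record = {
    "id": "obs-1",
    "plot": "plot-A",
    "property": "canopy-cover",
    "method": "point-intercept",
    "value": 42.5,
    "date": "2019-05-01",
}
triples = record_to_triples(record)
for triple in triples:
    print(triple)
```

Records from different providers that pass through a mapping of this kind end up with the same predicates, so they can be queried together regardless of their original format.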
The plot model is completed by controlled vocabularies that provide the value spaces for key slots in the model. Currently, the controlled vocabularies are mostly governed by the individual providers, though they are published using a common linked-data platform provided by ARDC. Full semantic integration requires mappings and harmonization between the controlled vocabularies, which raises some background science questions.
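One way to picture the mapping step is a small table of assertions linking provider terms to a shared term, here using the SKOS `exactMatch` property. This is only a sketch: the vocabulary IRIs under `https://example.org/` are hypothetical, and the choice of `exactMatch` is an assumption, since deciding whether two providers' terms really denote the same concept is exactly the kind of science question noted above.

```python
# skos:exactMatch is a real SKOS mapping property; all example.org IRIs are hypothetical.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

mappings = [
    ("https://example.org/vocab/provider-a/crown-cover",
     SKOS_EXACT_MATCH,
     "https://example.org/vocab/common/canopy-cover"),
    ("https://example.org/vocab/provider-b/canopy-cover-pct",
     SKOS_EXACT_MATCH,
     "https://example.org/vocab/common/canopy-cover"),
]


def harmonize(term, mappings):
    """Return the shared term a provider term is mapped to, or the term itself."""
    for source, predicate, target in mappings:
        if source == term and predicate == SKOS_EXACT_MATCH:
            return target
    # Unmapped terms are left unchanged rather than guessed at.
    return term


a = harmonize("https://example.org/vocab/provider-a/crown-cover", mappings)
b = harmonize("https://example.org/vocab/provider-b/canopy-cover-pct", mappings)
unmapped = harmonize("https://example.org/vocab/provider-a/soil-ph", mappings)
print(a == b, unmapped)
```

Keeping the mappings as data, separate from the provider vocabularies, lets each provider continue to govern its own terms while the mappings are reviewed and revised independently.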
We will discuss initial progress in implementing a system that loads data from the different providers into a common datastore underpinned by a graph database.
Simon Cox has been researching standards for publication and transfer of earth and environmental science data since the emergence of the World Wide Web. He is principal- or co-author of a number of international standards that have been broadly adopted in Australia and internationally. The value of these standards lies in enabling data from multiple origins and disciplines to be combined more effectively, which is essential in tackling most contemporary problems in science and society. His current work focuses on aligning science information with semantic web technologies and linked open data principles, and on the formalization, publication and maintenance of controlled vocabularies and similar reference data.
Simon was awarded the 2006 Gardels Medal by the Open Geospatial Consortium, and presented the 2013 Leptoukh Lecture for the American Geophysical Union. Simon is currently on the Executive Committee of CODATA, and is a member of the National Committee for Data in Science.