Online time series sensor data cleaning system: A case study in water quality

Dr Yifan Zhang1, Dr Peter Thorburn1, Mr Peter Fitch2

1CSIRO, Brisbane, Australia
2CSIRO, Canberra, Australia


High-frequency water quality monitoring offers improved and more comprehensive insight into the temporal and spatial variability of the target ecosystem. However, most monitoring systems give little consideration to sensor data quality control. Missing sensor data, background noise and signal interference have long been major obstacles for users trying to understand and analyse sensor data, and they make the use of that data inefficient.

We therefore present an online data cleaning system for water quality sensor data. After collecting the raw sensor data, the system applies different data filters to the corresponding water quality sensor streams. In this approach, the specific environmental effects can be considered separately. Cleaned data streams are then sent to the web-based frontend interfaces for end users.
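The idea of applying a different filter to each sensor stream can be illustrated with a minimal Python sketch. It is not the system's actual code: the stream names ("ph", "temperature_c"), the limits, and the functions range_filter and clean_streams are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional

Reading = Optional[float]  # None marks a missing or rejected sample
Filter = Callable[[List[Reading]], List[Reading]]


def range_filter(low: float, high: float) -> Filter:
    """Build a filter that rejects values outside [low, high]."""
    def apply(values: List[Reading]) -> List[Reading]:
        return [v if v is not None and low <= v <= high else None
                for v in values]
    return apply


# Hypothetical per-variable limits (assumptions, not values from the system).
FILTERS: Dict[str, Filter] = {
    "ph": range_filter(0.0, 14.0),
    "temperature_c": range_filter(-5.0, 50.0),
}


def clean_streams(raw: Dict[str, List[Reading]]) -> Dict[str, List[Reading]]:
    """Apply the filter registered for each stream.

    Streams without a registered filter are passed through unchanged.
    """
    cleaned = {}
    for name, values in raw.items():
        stream_filter = FILTERS.get(name)
        cleaned[name] = stream_filter(values) if stream_filter else list(values)
    return cleaned
```

Keeping one filter per variable in a registry like this means a new sensor type can be supported by adding an entry, without touching the filters for existing streams.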

The system has two main tasks: detecting and removing water quality outliers, and recovering missing sensor data. For the first task, the water quality filters are built on variable-specific thresholds, rates of change and statistical distributions. For the second, machine learning algorithms such as k-nearest neighbours (KNN) are applied to fill the gaps in the monitoring streams.
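The two tasks can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the system's implementation: the function names, the use of a z-score as the statistical check, and the choice of time index as the only KNN feature are all assumptions made here for brevity.

```python
from statistics import mean, pstdev
from typing import List, Optional

Reading = Optional[float]  # None marks a missing or rejected sample


def threshold_filter(values: List[Reading], low: float, high: float) -> List[Reading]:
    """Reject samples outside the variable-specific range [low, high]."""
    return [v if v is not None and low <= v <= high else None for v in values]


def rate_filter(values: List[Reading], max_step: float) -> List[Reading]:
    """Reject samples that jump more than max_step from the last accepted one.

    Simplification: the first valid sample is always trusted.
    """
    cleaned: List[Reading] = []
    last = None
    for v in values:
        if v is not None and last is not None and abs(v - last) > max_step:
            cleaned.append(None)
            continue
        cleaned.append(v)
        if v is not None:
            last = v
    return cleaned


def zscore_filter(values: List[Reading], max_z: float) -> List[Reading]:
    """Reject samples more than max_z standard deviations from the mean."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return list(values)
    mu, sigma = mean(present), pstdev(present)
    if sigma == 0:
        return list(values)
    return [v if v is not None and abs(v - mu) / sigma <= max_z else None
            for v in values]


def knn_fill(values: List[Reading], k: int = 2) -> List[Reading]:
    """Fill each gap with the mean of the k valid samples nearest in time."""
    known = [(i, v) for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            nearest = sorted(known, key=lambda p: abs(p[0] - i))[:k]
            filled[i] = mean(val for _, val in nearest)
    return filled


# Usage: remove outliers first, then fill the resulting gaps.
raw = [7.0, 7.1, 42.0, 7.2, None, 7.3]  # made-up pH-like readings
cleaned = rate_filter(threshold_filter(raw, 0.0, 14.0), max_step=0.5)
recovered = knn_fill(cleaned, k=2)
```

Running the outlier filters before the gap filling means rejected outliers are treated the same way as sensor dropouts, so a single recovery step handles both.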

The prototype system relieves end users of routine data cleaning work and shows a significant improvement in the readability of the water quality sensor data. In the next stage, more neural network-based algorithms will be tested and integrated to provide more reliable and accurate data cleaning results.


Yi-Fan Zhang is a Postdoctoral Fellow in Agriculture & Food, CSIRO. He received a PhD in data science from Queensland University of Technology in 2016. His work focuses on deep learning for agricultural decision making and management, with an emphasis on time series modelling and forecasting.


