Miss Lauren Stevens1, Chris Watkins1, Tim Weaver1, Alice Curkpatrick2, James Quinn2, Chris Teague2, Kavina Dayal1
1CSIRO, Australia, 2Cotton Seed Distributors Ltd. (CSD), Australia
The research and breeding of future cotton varieties is a joint venture between CSIRO and Cotton Seed Distributors Ltd. (CSD). For the past five years, CSD have been collecting data on cotton crops across the major cotton-growing regions of Australia. This dataset includes phenological and agronomic features from their key varieties in both irrigated and dryland (rainfed) fields, and is an important resource for understanding the features that affect cotton yield during the season. In a joint project between CSIRO and CSD, we analysed this dataset using machine learning at three key growth stages, referred to as snapshots: First Flower, Cut-out and End of Season.
We used the R programming platform to train an XGBoost model on data from all five seasons, using key features identified by domain experts, and compared its predictive performance for season yield at each snapshot. Additionally, we have begun to investigate the predictive importance of all measured features and whether this information could be used to improve crop productivity and management.
Our analysis shows that machine learning outperforms a simple linear regression by at least 1 bale per hectare, with R² values on the order of 0.8.
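As a rough illustration of how such a comparison can be scored, the sketch below computes the root-mean-square error (RMSE) and the coefficient of determination (R²) for two sets of yield predictions. It is written in Python for brevity, whereas the project itself used R. The abstract does not state which error metric the "1 bale per hectare" margin refers to, so RMSE is an assumption here. The yield values and the names `linear_pred` and `boosted_pred` are invented for the example and are not project data.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the yields (bales/ha)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented yields in bales per hectare (illustrative only, not CSD data).
observed = [8.0, 10.0, 12.0, 14.0, 16.0]
linear_pred = [11.0, 9.0, 15.0, 11.0, 17.0]    # hypothetical linear-regression output
boosted_pred = [9.0, 10.0, 11.0, 14.0, 16.0]   # hypothetical boosted-tree output

for name, pred in [("linear", linear_pred), ("boosted", boosted_pred)]:
    print(f"{name}: RMSE = {rmse(observed, pred):.2f} bales/ha, "
          f"R^2 = {r_squared(observed, pred):.2f}")
```

In practice both metrics would be computed on held-out fields or seasons, separately for each snapshot, so that the scores reflect predictive rather than in-sample performance.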
The machine learning tool will form the basis of a web-based R Shiny app that will allow crop managers to make informed decisions about crop performance and to investigate in-season management while gaining insights from the data.
Lauren works for CSIRO in IM&T, supporting the organisation's data visualisation and analytical needs through eResearch projects. Lauren has previous experience in the Climate Science Centre, contributing key project support and technical expertise in climate modelling research.