Computational and Structural Biotechnology Journal (Jan 2023)
Machine learning classification of plant genotypes grown under different light conditions through the integration of multi-scale time-series data
Abstract
In order to mitigate the effects of a changing climate, agriculture requires more effective evaluation, selection, and production of crop cultivars in order to accelerate genotype-to-phenotype connections and the selection of beneficial traits. Critically, plant growth and development are highly dependent on sunlight, with light energy providing plants with the energy required to photosynthesize as well as a means to directly intersect with the environment in order to develop. In plant analyses, machine learning and deep learning techniques have a proven ability to learn plant growth patterns, including detection of disease, plant stress, and growth using a variety of image data. To date, however, studies have not assessed machine learning and deep learning algorithms for their ability to differentiate a large cohort of genotypes grown under several growth conditions using time-series data automatically acquired across multiple scales (daily and developmentally). Here, we extensively evaluate a wide range of machine learning and deep learning algorithms for their ability to differentiate 17 well-characterized photoreceptor deficient genotypes differing in their light detection capabilities grown under several different light conditions. Using algorithm performance measurements of precision, recall, F1-Score, and accuracy, we find that Suport Vector Machine (SVM) maintains the greatest classification accuracy, while a combined ConvLSTM2D deep learning model produces the best genotype classification results across the different growth conditions. Our successful integration of time-series growth data across multiple scales, genotypes and growth conditions sets a new foundational baseline from which more complex plant science traits can be assessed for genotype-to-phenotype connections.