
DEPARTMENT for ENVIRONMENT, FOOD and RURAL AFFAIRS
Research and Development
CSG 15
Final Project Report
(Not to be used for LINK projects)
Two hard copies of this form should be returned to:
Research Policy and International Division, Final Reports Unit
DEFRA, Area 301
Cromwell House, Dean Stanley Street, London, SW1P 3JH.
An electronic version should be e-mailed to [email protected]
Project title
Chlorophyll fluorescence spectral discrimination by artificial neural network
methods
DEFRA project code
HH1530SPC
Contractor organisation
and location
Horticulture Research International, Wellesbourne, CV35 9EF
Total DEFRA project costs
£ 9,851
Project start date
01/12/01
Project end date
01/04/02
Executive summary (maximum 2 sides A4)
Background
Prediction of the future home-life quality of ornamental plants at the point of sale is of major interest to the
horticulture industry. Plants that appear healthy during shelf-life may show rapid decline during home-life due
to marketing stress or damage and techniques for early detection of latent plant damage are relevant to
ROAME HH16 on Uniformity in Crop Produce and to the grower and retailer. LINK project HL0134LPC
on Robust product design and prediction for post-harvest pot-plant quality and longevity used the rapid
induction kinetics of chlorophyll fluorescence (CF) as a predictor of the post-harvest quality of ornamental pot-plants. However, that project used only a simple linear regression model based on a single summary measure
of the CF data. The aim of this project, HH1530SPC, was to test whether artificial neural networks (ANNs)
could produce better predictions of plant quality than those obtained in HL0134LPC.
Chlorophyll fluorescence quality prediction models
The chlorophyll fluorescence curves used in HL0134LPC involved nine parameters, five values F1, F2, F3, F4
and F5 at fixed time points, minimum and maximum values F0 and Fm, the time to maximum fluorescence TFm
and the area above the curve. The HL0134LPC project consultant (Professor Strasser, Geneva) supplied a
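summary CF performance index intended to capture all the useful CF information in the formula:
\[
PI = c \cdot \left[\frac{F_m - F_1}{F_3 - F_1}\right] \cdot \left[1 - \frac{F_1}{F_m}\right] \cdot \left[\frac{F_m - F_1}{F_1}\right] \cdot \left[1 - \frac{F_4 - F_1}{F_m - F_1}\right]
\]
where c stands for the leading numerical constant of the original formula, which is not legible in this copy.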
PI was then used to predict future plant quality by using a simple linear regression model based on PI.
In project HH1530SPC, the linear regression used in HL0134LPC was generalised in two ways. First, the
assumption of a single regression parameter PI was generalised by replacing the single PI predictor variable by
a multivariate set of CF parameters. Second, the assumption of linearity was generalised by replacing the linear
regression model by a non-linear ANN model. These generalisations led to three alternative CF models
i) Model I: Simple linear regression fitted to PI
ii) Model II: Multivariate linear regression fitted to CF parameters
iii) Model III: ANN model fitted to CF parameters.
The three models were fitted to the HL0134LPC data and were then compared using statistical methodology.
Predictive CF models for begonia and Poinsettia quality variates (Objectives 1 and 2)
Predictive models were fitted for begonia flower count, flower drop and damaged leaf count and for Poinsettia
leaf drop and bract drop using the three alternative CF models outlined above. For each model, a predicted
index was generated for each observation and the observed plant quality variables were then plotted against the
predicted index. The plots were used to provide a graphical comparison of the power of the three alternative
models for predicting plant quality variates using CF data.
i) CF predictions for individual begonia plants recorded at de-sleeving showing the relationship between the
observed square root of flower count two weeks after de-sleeving for individual begonia plants and the
predicted model index for each plant for the three possible models.
[Figure: square root of flower count at week 2 plotted against the predicted model index at de-sleeving, in three panels: [a] Model I (IndexMI), [b] Model II (IndexMII), [c] Model III (IndexMIII).]
The vertical scatter about the fitted line shows the goodness-of-fit of each individual model and it is apparent
that Model III, the ANN model, gave a small improvement in fit over both Models I and II. However, the
residual scatter for all three models remained substantial and the predictions for the individual plants remained
unreliable.
ii) CF predictions for individual Poinsettia plants recorded at de-sleeving showing the relationship between the
observed square root of leaf drop two weeks after de-sleeving for individual Poinsettia plants and the
predicted model index for each plant for the three possible models.
[Figure: square root of leaf drop at week 2 plotted against the predicted model index at de-sleeving, in three panels: [a] Model I (IndexMI), [b] Model II (IndexMII), [c] Model III (IndexMIII).]
The vertical scatter about the fitted line shows the goodness-of-fit of each individual model and it is apparent
that Model III, the ANN model, gave a very substantial improvement in fit over both Models I and II. The
scatter for Model III was substantially reduced but prediction for the individual plants still remained
problematical, even for Model III.
The predictive power of ANN models relative to linear regression (Objective 3)
The power of ANNs for predicting plant home-life quality from CF data was assessed by calculating the percentage
variance explained by the three models for the measured begonia and Poinsettia quality variates after two weeks
in home-life conditions. The power of the ANN model (Model III) was compared with the power of the simple
linear regression on PI (Model I) and the power of the multiple linear regression on CF parameters (Model II)
in the following tabulation:
Percentage variance explained by each model for each variate.

Variate                        Model I   Model II   Model III
Begonia flower count            14.6%     22.3%      30.4%
Begonia flower drop             31.1%     31.9%      39.9%
Begonia damaged leaf count      16.3%     25.9%      33.6%
Poinsettia leaf drop            27.3%     40.4%      50.2%
Poinsettia bract drop            2.5%      3.8%      11.1%
The ANN model gave better predictions and increased power for all the recorded variates for both begonia and
Poinsettia. However, for the begonia variates, the increase in power relative to the increased complexity of the
models was relatively modest and did not represent any major improvement in the fitted models. The Poinsettia
bract drop predictions were too weak to be interesting but the ANN model achieved a very substantial
improvement in leaf drop prediction. This showed that for Poinsettia, there was a very real improvement in
predictive power for the ANN model relative to the linear regression methods.
Delivery against objectives
This project has delivered a new ANN methodology for improved prediction of plant quality variables from
observed chlorophyll fluorescence data. Overall, the power of chlorophyll fluorescence for predicting the
home-life performance of individual pot-plants was modest and even the most effective model, the ANN model
for poinsettia leaf drop, explained only about 50% of the observed variability. However, there appeared to be
real potential for batch screening using batch sampling and the models developed in this report will be used in
HL0134LPC to test the effectiveness of CF for batch screening. The goodness-of-fit information from the
various models will be used to estimate the power of CF screening to discriminate between batches of plants
subjected to different levels of stress. The models will also provide additional insight into the relationship
between a CF response curve and the subsequent quality of a pot-plant and could have important applications in
future research on plant quality.
Scientific report (maximum 20 sides A4)
1) Introduction
The quality of ornamental plants can be determined by visual inspection at the point of sale but damage caused
by poor handling or lack of temperature control during transport may not be apparent at that time. However, as
damaged plants can deteriorate rapidly after purchase, some method of detecting plant damage before the visual
symptoms of damage become apparent is highly desirable. Chlorophyll fluorescence (CF) measures the
photosynthetic activity of plants and has the potential to detect latent damage (DeEll, van Kooten et al., 1999).
Since CF monitoring is rapid and non-destructive, the technique has the potential to provide a useful method for
routine quality control of pot-plants at the point of sale.
The Horticulture LINK project HL0134LPC on Robust product design and prediction for post-harvest pot-plant quality and longevity has used the rapid induction kinetics of chlorophyll fluorescence as a predictor of
post-harvest quality of ornamental pot-plants. Professor Strasser, the project consultant, has developed a theory-based measure of photosynthetic activity called the performance index (PI) using the principles of the JIP test
(Parsons, Edmondson et al., 2001). PI is a non-linear measure of plant photosynthesis based on a combination of
the characteristics of the CF spectral response curve and is intended to reduce the high dimensional information
of a CF curve to a single dimensional variable. In HL0134LPC, it was assumed that PI captured all the useful
CF information relating to plant damage and that PI was the most appropriate CF measure for the prediction of
plant quality. Hence the future home-life performance of a plant was predicted using PI as the sole CF predictor
in an ordinary linear regression model.
There are two main difficulties with this approach. First, although PI may be a useful theoretical measure of
photosynthetic activity, it does not necessarily follow that PI is the most useful statistical predictor of plant
damage. Second, although linear statistical regression is powerful for prediction when there is a simple linear
relationship between the predictor variables and the outcome variable, it is unclear whether this assumption is
valid for the prediction of plant quality from chlorophyll fluorescence measurements.
The methodology used in HL0134LPC for relating home-life quality to the initial CF response can be
generalised in two ways. First, the assumption that PI captures all the useful information in a CF spectrum can
be relaxed by fitting a multiple linear regression model using a range of parameters from a CF response curve.
Second, the assumption of linearity can be relaxed by fitting a non-linear model using artificial neural network
(ANN) methods. ANN methods have been shown to be useful for classification of plant species using a range
of CF parameters as inputs (Tyystjärvi, Koski et al., 1999) and should have similar utility for plant quality
prediction.
In this report, the linear regression model used in HL0134LPC will be generalised by replacing the single PI
predictor variable by a multivariate set of chlorophyll fluorescence parameters with coefficients chosen to
maximise the power of the linear regression model. The model will then be further generalised by fitting a non-linear ANN model to relax the assumption of linearity. The utility of the three models, the linear PI model, the
linear multivariate model and the fully general ANN model will be compared and the power of the three
methods will be assessed. Finally, the appropriateness of the single PI measure as a predictor of home-life
quality will be assessed by comparison with the generalised predictors constructed using multiple linear
regression and ANN methods.
2) Objectives of the project
i) To use ANN methods to construct a predictive model for begonia flower drop, bud drop and flower
count after week 1 of home-life using CF parameter data collected at the start of home-life in 1999,
2000 and 2001
ii) To use ANN methods to construct a predictive model for poinsettia bract drop and leaf drop after week
1 of home-life using CF parameter data collected at the start of home-life in 1999/00, 2000/01 and
2001/02
iii) To quantify the predictive power of ANN models for begonia and poinsettia quality using cross-validation methods to compare the power of ANN models with the power of simple linear regression
models
3) Chlorophyll fluorescence
i) Chlorophyll fluorescence measurement
The chlorophyll fluorescence measurements for project HL0134LPC were made using a Plant Efficiency
Analyser (PEA) supplied by Hansatech Instruments Ltd., King's Lynn. The PEA is a portable instrument
designed to measure chlorophyll fluorescence induction by high time resolution continuous excitation. The
instrument measures the time-dependent changes in fluorescence emission, which occur when a dark-adapted
leaf is exposed to light. Typically, illumination of a healthy leaf after 10 - 30 minutes of dark adaptation will result
in an immediate rise to an initial level (F0) followed by a rapid polyphasic rise to a maximum fluorescence level (Fm) as
exemplified in Figure 1.
Figure 1: A typical chlorophyll fluorescence response curve showing the position of the measured
parameters used to characterise the shape of the curve
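[Plot: normalised fluorescence (vertical axis, 0.0 to 1.0) against time in milliseconds (logarithmic horizontal axis, 0.01 to 1000). The fluorescence parameters F0, F1 (0.05 ms), F2 (0.1 ms), F3 (0.3 ms), F4 (2 ms), F5 (30 ms), Fm and TFm are marked on the curve. Area = area above the curve from 0 to TFm.]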
The PEA was set to record fluorescence values at five time points, 0.05 ms, 0.1 ms, 0.3 ms, 2 ms and 30 ms,
within the polyphasic kinetic part of the fluorescence induction curve (also called the Kautsky curve). The
corresponding fluorescence values F1, F2, F3, F4 and F5, together with F0, Fm, the time to maximum fluorescence
(TFm) and the area above the fluorescence curve at TFm, were recorded. The relationship between these nine
measures and the chlorophyll fluorescence curve is summarised in Fig 1. (F0 is estimated by backward
extrapolation from the main curve and is indicated by the lozenge symbol on the vertical axis.)
ii) Performance Index (PI)
The most commonly used measure of plant stress is the ratio Fv/Fm, where Fv = Fm - F0; this ratio is a measure of the
quantum yield of dark-adapted photosystem II (PSII). Project HL0134LPC investigated the use of the rapid
induction kinetics of chlorophyll fluorescence for prediction of post-harvest quality of pot-plants using a
performance index (PI) based on the principles of the JIP test (Strasser, Eggenberg and Strasser, 1996). The
performance index supplied by the project consultant Professor Strasser at the University of Geneva was
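defined by the equation:
\[
PI = c \cdot \left[\frac{F_m - F_1}{F_3 - F_1}\right] \cdot \left[1 - \frac{F_1}{F_m}\right] \cdot \left[\frac{F_m - F_1}{F_1}\right] \cdot \left[1 - \frac{F_4 - F_1}{F_m - F_1}\right] \qquad (1)
\]
where c stands for the leading numerical constant of the original formula, which is not legible in this copy.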
In HL0134LPC, the chlorophyll fluorescence spectral response curve for each individual plant was summarised
by PI, which was then used as a predictor for future home-life performance using ordinary linear regression
methods (Parsons, Edmondson, et al 2001).
4) Plant quality attributes
i) Measured attributes of plant quality for HL0134LPC
Project HL0134LPC assessed the quality of begonia and poinsettia plants in shelf-life by scoring the quality of
the plants using expert and consumer scores and by measuring a range of physiological plant attributes of
quality. In addition to the quality assessment score on each plant, the following physiological characteristics
were simultaneously recorded on each plant during each week of shelf-life:
Begonia
a) Flower drop
b) Bud drop
c) Flower count
d) Damaged flower count
e) Damaged leaf count
Poinsettia
a) Bract drop
b) Leaf drop
ii) Prediction of individual attributes of plant quality
The overall quality of a plant is dependent on the observable attributes of quality and in this report the utility of
chlorophyll fluorescence for predicting quality during home-life has been examined by modelling each attribute
separately. One advantage of this approach is that if marketing stress affects only certain individual attributes of
quality, fitting individual models will predict the effects of marketing stress with maximum sensitivity. A
further advantage is that if different characteristics of the chlorophyll fluorescence curve are predictive for
different aspects of stress damage, modelling plant attributes individually may give additional information
about the relationship between chlorophyll fluorescence and the effects of stress damage. The ultimate aim of
the work will be to develop predictors for overall quality by integrating the individual plant attributes of quality
into a single measure of overall plant quality: that work will be developed in LINK project HL0134LPC and in
DEFRA project HH1529SPC and will not be discussed here.
5) Chlorophyll fluorescence prediction of quality
i) Simple linear prediction method based on PI
HL0134LPC tested the effects of a range of different marketing regimes on pot-plant quality by subjecting
batches of plants to simulated marketing and then monitoring the subsequent performance of the plants during a
period in simulated home-life environments. The chlorophyll fluorescence spectrum of the individual plants
was recorded at de-sleeving immediately following simulated marketing and the quality attributes of the
individual plants were then recorded weekly during the simulated home-life period.
Let PI represent the performance index at de-sleeving and let yt represent a plant quality attribute measured on
the tth recording occasion after de-sleeving. Let Ht represent a set of home-life environmental factors assumed
not to interact with PI and let et represent a random error term. Then a predictive linear regression model for yt
based on PI is:
\[
y_t = \alpha + \beta\,PI + H_t h + e_t \qquad (2)
\]
Equation (2) is the simple linear regression equation for yt that was used to model the attributes of plant quality
in HL0134LPC. Unfortunately, the correlation between PI and the subsequent plant quality in home-life for
individual plants was found to be weak and there appears little prospect that a simple regression equation based
on PI will be useful for predicting individual plant quality (see Year 3 Annual Report for HL0134LPC).
ii) Generalisation of the simple linear prediction method
Potentially there are at least two ways of achieving more powerful predictions of future plant quality using
chlorophyll fluorescence data.
a) Multivariate linear prediction based on the individual chlorophyll fluorescence parameters where each
parameter is given a proper empirical weighting for predicting plant quality attributes.
b) Multivariate non-linear prediction based on non-linear models of the chlorophyll fluorescence parameters
using trained artificial neural networks (ANNs).
Multivariate linear regression analysis is a statistical technique that can be applied using standard statistical
methodology and is a natural generalisation of simple linear regression. ANN models are potentially more
powerful than linear regression models but the methodology is less rigorous and there is a risk that models may
be over-fitted with consequent spurious estimates of the power of the model. The inputs for ANN models are
the individual chlorophyll fluorescence parameters of a CF curve; the natural reference models for
comparing the power of ANN models are therefore the corresponding multivariate regressions on the individual
chlorophyll fluorescence parameters. It is essential, therefore, to compare ANN models both with the simple
linear regressions on PI and with the multivariate linear regressions on the individual chlorophyll fluorescence
parameters.
In the remainder of this report, the three basic models, the linear regression model based on PI, the multivariate
regression model based on the individual parameters of the chlorophyll fluorescence curve and the full ANN
model will be designated Model I, Model II and Model III, respectively. These three models will form a
structured reference set and will be discussed and developed in a structured way to emphasise the thematic
nature of the project.
6) Models for quality prediction
Model I. The simple linear regression model y_t = \alpha + \beta\,PI + H_t h predicts the quality attribute output (yt) using
the performance index (PI) at desleeving and a set of home-life factor conditions Ht. Project HL0134LPC used
two home-life treatments, temperature and lighting; replacing Ht by these explicit home-life factors gives
the model
\[
y_t = \alpha + \mathrm{Temp} + \mathrm{Light} + \beta\,PI \qquad \text{(Model I)}
\]
The dependencies shown in Model I can be represented graphically by a series of connected nodes where the
various nodes represent inputs or outputs and the lines connecting pairs of nodes represent the relationships
between those nodes. The graphical representation of Model I is shown in Fig 2, where the relationship between
PI and the output node is represented by the regression coefficient β.
Fig 2: Model I represented as a network from inputs,
PI, home-life factors and constant, to output (yt).
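[Network diagram: the CF performance index input (PI) is connected to the output node with coefficient β; the home-life factor inputs (Light, Temp.) and the constant (α) are connected directly to the output node.]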
Model II. The linear generalisation of Model I allows each individual chlorophyll fluorescence parameter to
have a linear empirical regression coefficient. The single regression on PI is replaced by a sum of regression
terms \sum_i \beta_i x_i of the n individual chlorophyll fluorescence parameters to give
\[
y_t = \alpha + \mathrm{Temp} + \mathrm{Light} + \sum_{i=1}^{n} \beta_i x_i \qquad \text{(Model II)}
\]
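The step from Model I to Model II can be sketched in code as a change of design matrix. The following is a minimal illustration only, assuming ordinary least squares, synthetic data and a made-up summary index; the names `fit_linear`, `cf` and `summary_index` are hypothetical and none of the numbers come from the project.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficient vector (intercept first) and the residual
    sum of squares.
    """
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, float(resid @ resid)

# Synthetic stand-ins (assumptions, not project data).
rng = np.random.default_rng(0)
n = 200
temp = rng.integers(0, 2, n).astype(float)    # home-life temperature factor
light = rng.integers(0, 2, n).astype(float)   # home-life lighting factor
cf = rng.random((n, 5))                       # five scaled CF parameters
summary_index = cf[:, 0] * cf[:, 1]           # hypothetical index, NOT the report's PI
y = (1.0 - 0.2 * temp - 0.1 * light
     + cf @ np.array([0.5, -1.0, 0.8, 0.0, 0.3])
     + rng.normal(0.0, 0.1, n))

# Model I style: one summary predictor plus the home-life factors.
coef_i, rss_i = fit_linear(np.column_stack([temp, light, summary_index]), y)
# Model II style: every CF parameter gets its own coefficient.
coef_ii, rss_ii = fit_linear(np.column_stack([temp, light, cf]), y)
```

In this synthetic case the response depends on the CF parameters individually, so the multi-parameter fit leaves a smaller residual sum of squares than the single-index fit.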
The dependencies in Model II can be shown graphically, as in Fig 3:
Fig 3: Model II represented as a network from inputs, CF
parameters, home-life factors and constant, to output (yt).
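[Network diagram: the CF parameter inputs (x1, x2, ..., xn) are connected to the output node with coefficients βi; the home-life factor inputs (Light, Temp.) and the constant (α) are connected directly to the output node.]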
Model III. The full generalisation of Model II is an artificial neural network (ANN) model. In an ANN model,
the CF parameters at desleeving are the inputs to an artificial neural network with j hidden layer nodes, a
logistic activation function (\phi_h) and a linear output function (\phi_o). The hidden layer provides the non-linear
generalisation of Model II by acting as a series of non-linear switches and weights that allow the input variables
to be combined in various complex ways. This allows a general non-linear function of the CF parameters at
desleeving to be used to predict subsequent home-life quality. By increasing the size of the hidden layer, neural
networks can approximate any continuous function; Model III is therefore a true generalisation of Model II. The
non-linear model can be expressed algebraically by the formula:
\[
y_t = \alpha + \mathrm{Temp} + \mathrm{Light} + \sum_{j} w_{jo}\,\phi_h\!\left(\sum_{i} w_{ij} x_i\right) \qquad \text{(Model III)}
\]
where \phi_h(z) = \exp(z) / (1 + \exp(z)).
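The Model III formula can be read as a single forward pass through the network. The sketch below assumes the structure described above (logistic hidden nodes without bias terms, a linear output, and home-life effects added directly at the output); the function and argument names are hypothetical, and a pruned connection is represented simply by a zero weight.

```python
import numpy as np

def logistic(z):
    """Logistic activation, exp(z) / (1 + exp(z)), written in a stable form."""
    return 1.0 / (1.0 + np.exp(-z))

def model_iii_output(x, w_in, w_out, alpha, temp_effect=0.0, light_effect=0.0):
    """Forward pass for one plant.

    x      : scaled CF parameters, shape (n_inputs,)
    w_in   : input-to-hidden weights w_ij, shape (n_inputs, n_hidden)
    w_out  : hidden-to-output weights w_jo, shape (n_hidden,)
    alpha  : constant term
    temp_effect, light_effect : home-life contributions added at the output
    """
    hidden = logistic(x @ w_in)          # one value per hidden node
    return alpha + temp_effect + light_effect + hidden @ w_out

# Illustration with made-up numbers: all input weights zero, so every
# hidden node outputs logistic(0) = 0.5.
x_demo = np.array([0.2, 0.5, 0.9])
y_demo = model_iii_output(x_demo, np.zeros((3, 2)), np.array([2.0, 4.0]), alpha=1.0)
```

With zero input weights the prediction reduces to the constant plus half the sum of the output weights, which is a convenient sanity check on the wiring.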
A representation of the nodes and interconnections of the various layers can be shown graphically, as in Fig 4:
Fig 4: Model III represented as a network from inputs, CF
parameters, home-life factors and constant, to output (yt).
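[Network diagram: the CF parameter inputs (x1, x2, ..., xn) are connected by weights wij to the hidden layer nodes (j), which are connected by weights wjo to the output node; the home-life factor inputs (Light, Temp.) and the constant (α) are connected directly to the output node.]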
The number of parameters in Model III can be changed by increasing or decreasing the number of hidden layer nodes
in the network. An important aspect of fitting or training a neural network model for a particular task is the
specification of the number of nodes in the hidden layer and the interconnections between the various nodes in
the model.
7) Cross-validation and model fitting
i) Cross-validation
In the original work plan, it was intended that cross-validation methods would be used to fit and validate the
different models. Cross-validation divides data into two subsets and uses one subset to fit the model and the
other subset to test the performance of the fitted model. The procedure is repeated many times using different
divisions of the data to generate a sampling distribution for the assumed model. Cross validation has been
reported extensively in the literature for modelling large data sets and it was anticipated that the methodology
would be appropriate for this project. However, the relatively modest amount of data available for this project
meant that cross-validation was less useful than expected and the method has not been used in this project.
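For reference, the repeated split-and-test procedure described above can be sketched as follows. This is a generic illustration using ordinary least squares on synthetic data, not a procedure that was run in this project; the function name, the 25% test fraction and the number of repeats are all assumptions.

```python
import numpy as np

def repeated_holdout_mse(X, y, n_repeats=200, test_fraction=0.25, seed=0):
    """Repeatedly split the data, fit OLS on one subset, score on the other.

    Returns the test-set mean squared error from each random division.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    design = np.column_stack([np.ones(n), X])   # intercept plus predictors
    scores = np.empty(n_repeats)
    for r in range(n_repeats):
        order = rng.permutation(n)
        test, train = order[:n_test], order[n_test:]
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        scores[r] = np.mean((y[test] - design[test] @ coef) ** 2)
    return scores

# Synthetic illustration only (not project data).
rng = np.random.default_rng(1)
X_demo = rng.random((80, 3))
y_demo = 2.0 + X_demo @ np.array([1.0, -0.5, 0.0]) + rng.normal(0.0, 0.2, 80)
cv_scores = repeated_holdout_mse(X_demo, y_demo)
```

The spread of `cv_scores` across divisions gives the sampling distribution referred to in the text; with few observations each test subset is small and that distribution becomes wide.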
ii) Fitting and validation for Models I and II
For Models I and II, model fit has been assessed by conventional R2 methods based on the percentage of the
total variance explained by a fitted model. The significance of the individual terms in a model
has been assessed by fitting a maximal model including all potential explanatory variables and then sequentially
testing the amount of variability explained by each term in the model. Terms that explained a non-significant
amount of variation were omitted from the model and the process was repeated until no further terms could be
omitted. The importance of each remaining term in the final fitted model was then assessed by
omitting that term alone from the final fitted model and measuring the variability it explained.
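The final step, omitting each term in turn from the fitted model, can be sketched as below. This is a minimal illustration assuming ordinary least squares and synthetic data; `drop_one_ss` and the term names are hypothetical, and the significance testing used in the report is not reproduced.

```python
import numpy as np

def residual_ss(design, y):
    """Residual sum of squares from an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def drop_one_ss(X, y, names):
    """Increase in residual s.s. when each term is omitted from the full model."""
    n = len(y)
    full_rss = residual_ss(np.column_stack([np.ones(n), X]), y)
    increases = {}
    for k, name in enumerate(names):
        reduced = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        increases[name] = residual_ss(reduced, y) - full_rss
    return increases

# Synthetic illustration: x0 matters, x1 does not.
rng = np.random.default_rng(2)
X_demo = rng.random((100, 2))
y_demo = 1.0 + 3.0 * X_demo[:, 0] + rng.normal(0.0, 0.1, 100)
importance = drop_one_ss(X_demo, y_demo, ["x0", "x1"])
```

Each value is the sum of squares attributable to that term given all the others, which is the quantity tabulated against the omitted terms in the analysis of variance tables.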
iii) Parameter normalisation for Model III
For an artificial neural network model, the inputs (fluorescence parameters) must be scaled to lie in the interval
[0,1]. In this project the ANN parameters F0, F1, F2, F3, F4 and F5 have been scaled by division by Fm, TFm
has been scaled by division by 1000 and the area above the fluorescence curve (Area) at TFm has been scaled by
division by Fm x TFm.
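The scaling rules above translate directly into a small helper. The sketch assumes TFm is recorded in milliseconds and Area in fluorescence units times milliseconds; the function name and the example readings are hypothetical.

```python
def normalise_cf(F0, F1, F2, F3, F4, F5, Fm, TFm, area):
    """Scale raw CF parameters for use as ANN inputs.

    F0..F5 are divided by Fm, TFm by 1000 and Area by Fm x TFm,
    following the scaling described in the text.
    """
    raw = {"F0": F0, "F1": F1, "F2": F2, "F3": F3, "F4": F4, "F5": F5}
    scaled = {name: value / Fm for name, value in raw.items()}
    scaled["TFm"] = TFm / 1000.0
    scaled["Area"] = area / (Fm * TFm)
    return scaled

# Made-up readings for illustration only.
scaled_demo = normalise_cf(F0=500.0, F1=520.0, F2=600.0, F3=800.0, F4=1500.0,
                           F5=2300.0, Fm=2500.0, TFm=300.0, area=250000.0)
```

Because every fluorescence value is at most Fm and the area above the curve cannot exceed the Fm x TFm rectangle, these ratios fall in [0,1] whenever TFm is below 1000 ms.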
iv) Fitting and validation for Model III
The fully connected ANN model shown in Fig 4 has every chlorophyll fluorescence input node connected to
every hidden layer node, with the hidden layer nodes and the home-life input factors connected directly to the
output node. The fitting and validation procedure for this model requires the pruning of unnecessary CF input
nodes and connections until a minimum network with explanatory power similar to the fully connected model is
obtained. Unnecessary connections have very little explanatory power and are unlikely to explain real
characteristics of the data. The pruning procedure used here was to test the effect of omitting a connection from
the network shown in Fig 4 by examining the amount of variation explained by each connection using an
approximate R2 procedure. Connections and nodes that had little explanatory value were omitted until a
minimum network was achieved.
8) Predictive models for Begonia attributes of plant quality (Objective 1)
i) Preliminary analysis of quality characteristics
Preliminary examination indicated that none of the correlations between the begonia plant quality attributes
during home life and the chlorophyll fluorescence at marketing were strong. However, the three characteristics
with the best correlations were the week 2 flower drop, week 2 flower count and week 2 damaged leaf count
data, not the week 1 bud drop, week 1 flower count and week 1 flower drop characteristics as suggested in the
original work plan. Therefore it was decided that the week 2 data set would provide the most useful data for
this project. Due to changes in the initial settings of the PEA by Hansatech at the end of Year 1 of HL0134L,
only data from Years 2 and 3 of HL0134L has been used in this project.
ii) Models I and II for flower counts
Table 1 shows an analysis of variance for Models I and II relating flower counts in week 2 to the mean leaf CF
parameter data at desleeving. For Model I, the regression coefficient for PI at desleeving was positive showing
that a higher PI value at desleeving gave a higher flower count in week 2 of home-life.
For Model II, the important CF parameters for predicting flower counts were, in order of importance, TFm, F4,
Area, F3 and F5 and the estimated regression parameters show that a plant with a high value of TFm, Area, F4 or
F5 at desleeving had a lower flower count in week 2 of home-life whereas a plant with a high value of F3 had a
higher flower count at week 2 of home-life.
Table 1: Analysis of variance and chlorophyll fluorescence parameters for begonia flower count data

                     Model I                                  Model II
Term           df   estimate      s.s.     m.s.         df   estimate      s.s.     m.s.
Total         270               163.22     0.60        270               163.22     0.60
+ Temp          1    -0.081       0.27     0.27 ns       1    -0.228       0.27     0.27 ns
+ Light         1    -0.197       2.01     2.01 *        1    -0.224       2.01     2.01 *
+ PI/CF         1     0.036      23.17    23.17 ***      5                37.47     7.50 ***
Residual      267               137.76     0.52        263               123.46     0.47
CF parameters
- TFm                                                   -1    -3.54      -10.91    10.91 ***
- Area                                                  -1   -43.6        -5.99     5.99 ***
- F3                                                    -1    10.96       -5.43     5.43 ***
- F4                                                    -1   -17.50       -8.16     8.16 ***
- F5                                                    -1   -10.94       -3.20     3.20 **
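As a worked check, the percentage variance explained can be computed from the Table 1 entries. The sketch below assumes the mean-square (adjusted) form, 1 minus residual m.s. over total m.s.; that choice is an inference from the numbers, since it gives values close to the 14.6% and 22.3% quoted for begonia flower count in the executive summary, and is not stated in the report. The function name is hypothetical.

```python
def pct_variance_explained(total_ss, total_df, resid_ss, resid_df):
    """Percentage variance explained, using mean squares (adjusted form)."""
    total_ms = total_ss / total_df
    resid_ms = resid_ss / resid_df
    return 100.0 * (1.0 - resid_ms / total_ms)

# Sums of squares and degrees of freedom as printed in Table 1.
model_i_pct = pct_variance_explained(163.22, 270, 137.76, 267)
model_ii_pct = pct_variance_explained(163.22, 270, 123.46, 263)
```

Small differences in the last decimal place are to be expected because the tabulated sums of squares are themselves rounded.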
iii) Model III for flower counts
Fig 5 shows the minimum ANN model consistent with the observed flower count and chlorophyll fluorescence
data and Table 2 shows the corresponding estimated weights for the individual network connections.
Fig 5: Fitted ANN model for the flower count in
week 2 of home-life.
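[Network diagram: six CF parameter inputs (xi; i = 1…6: F0, TFm, Area, F2, F4, F5) are connected by weights wij to three hidden layer nodes (j = 1…3), which are connected by weights wjo to the output (y), flower count at week 2; the home-life factors (Light, Temp.) and the constant (α) are connected directly to the output.]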
Table 2: Estimated weights in ANN model for flower count

No.   Weight   Connection       Value
 1    w11      F0 → h1          10.45
 2    w13      F0 → h3          20.26
 3    w21      TFm → h1          0.77
 4    w22      TFm → h2          3.72
 5    w23      TFm → h3         -5.77
 6    w33      Area → h3       -13.35
 7    w43      F2 → h3           5.49
 8    w52      F4 → h2          -2.09
 9    w62      F5 → h2          -0.54
10    w63      F5 → h3          -1.55
11    w1o      H1 → o          -93.5
12    w2o      H2 → o           30.20
13    w3o      H3 → o           18.15
14    α        Constant → o     63.32
15    Temp     Temp → o         -0.23
16    Light    Light → o        -0.21
The estimated network weights in Table 2 characterise the relationship between the CF data and the flower
count data. The importance of individual input nodes in the final fitted ANN model was calculated by the
change in the sum of squared residuals for the removal of each input node in turn from the model and this gave
the following ranking, in order of importance, of the input nodes: F5, F2, Area, TFm, F4, and F0.
iv) Models I and II for flower drop
Table 3 shows an analysis of variance for the regression model relating flower drop in week 2 to the
chlorophyll fluorescence data at desleeving. The regression coefficient for PI in Model I was negative showing
that a higher PI value at desleeving gave a lower flower drop in week 2 of home-life. The only important CF
parameter for predicting flower drop was F5 with the estimated regression parameter for this term showing that
a plant with a high value of F5 at desleeving had a higher flower drop in week 2 of home-life.
Table 3: Analysis of variance and chlorophyll fluorescence parameters for begonia flower drop data

               Model I                                 Model II
Term           df    Estimate   s.s.     m.s.          df    Estimate   s.s.     m.s.
Total          270              206.95   0.77          270              206.95   0.77
+ Temp         1     -0.720     36.92    36.92 ***     1     -0.812     36.92    36.92 ***
+ Light        1     -0.181     2.91     2.91 *        1     -0.170     2.91     2.91 *
+ PI/CF        1     -0.038     26.05    26.05 ***     1                27.82    27.82 ***
Residual       267              141.06   0.53          267              139.30   0.52
Parameters
- F5                                                   -1    12.27      -27.82   27.82 ***
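As a reading aid, and assuming the significance markers come from the usual variance-ratio test of each term's mean square against the residual mean square (the test is not spelled out in this section), the PI term in Model I of Table 3 gives approximately

$$F_{1,267} = \frac{\text{m.s.}(\text{PI})}{\text{m.s.}(\text{Residual})} = \frac{26.05}{0.53} \approx 49$$

which is a large ratio on 1 and 267 degrees of freedom. The same calculation applies to the other rows of the analysis of variance tables.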
v) Model III for flower drop
Fig 6 shows the minimum ANN model consistent with the observed flower drop and chlorophyll fluorescence
data and Table 4 shows the corresponding estimated weights for the individual network connections.
Fig 6: Fitted ANN model for the flower drop at week 2 of home-life.
[Network diagram. CF parameter inputs (xi; i = 1…4): TFm, F1, F4, F5. Home-life factors: Light, Temp. Constant. Weights wij lead to the hidden nodes (j = 1…4); weights wjo lead to the output (y), flower drop at week 2.]
Table 4: Estimated weights in ANN model for flower drop

No.   Weight   Connection      Value
1     w13      TFm → h3        -9.77
2     w14      TFm → h4        46.38
3     w22      F1 → h2         103.93
4     w23      F1 → h3         -49.26
5     w31      F4 → h1         -0.93
6     w32      F4 → h2         -25.91
7     w33      F4 → h3         16.88
8     w34      F4 → h4         -60.59
9     w41      F5 → h1         -0.79
10    w44      F5 → h4         16.81
11    w1o      h1 → o          -47.57
12    w2o      h2 → o          25.47
13    w3o      h3 → o          26.2
14    w4o      h4 → o          1.53
15             Constant → o    -14.41
16    Temp     Temp → o        -0.84
17    Light    Light → o       -0.17
The estimated network weights in Table 4 characterise the relationship between the CF data and the flower drop
data. The importance of each input node in the final fitted ANN model was assessed from the change in the sum
of squared residuals when that node was removed from the model, each node being removed in turn. This gave
the following ranking of the input nodes, in order of importance: F1, F4, TFm and F5.
vi) Models I and II for damaged leaf counts
Table 5 shows an analysis of variance for Models I and II relating damaged leaf counts at week 2 to the mean
leaf CF parameter data at desleeving. For Model I, the regression coefficient for PI at desleeving was negative
showing that a higher PI value at desleeving gave a lower damaged leaf count in week 2 of home-life. For
Model II, the important CF parameters for predicting damaged leaf count were, in order of importance, TFm, F5,
F2 and Area and the estimated regression parameters show that a plant with higher values of TFm, F5, F2 or Area
at desleeving had a higher damaged leaf count in week 2 of home-life.
Table 5: Analysis of variance and chlorophyll fluorescence parameters for begonia damaged leaf count data

               Model I                                 Model II
Term           df    Estimate   s.s.     m.s.          df    Estimate   s.s.     m.s.
Total          270              124.39   0.46          270              124.39   0.46
+ Temp         1     -0.106     1.02     1.02 ns       1     -0.148     1.02     1.02 ns
+ Light        1     0.016      0.00     0.00 ns       1     0.038      0.00     0.00 ns
+ PI/CF        1     -0.034     20.40    20.40 ***     4                42.64    10.66 ***
Residual       267              102.96   0.39          264              80.73    0.31
Parameters
- TFm                                                  -1    3.19       -8.61    8.61 ***
- Area                                                 -1    27.91      -2.50    2.50 **
- F2                                                   -1    6.08       -4.67    4.67 ***
- F5                                                   -1    14.63      -7.71    7.71 ***
vii) Model III for damaged leaf counts
Fig 7 shows the minimum ANN model consistent with the observed damaged leaf count and chlorophyll
fluorescence data and Table 6 shows the corresponding estimated weights for the individual network
connections.
Fig 7: Fitted ANN model for the damaged leaf count in week 2 of home-life.
[Network diagram. CF parameter inputs (xi; i = 1…5): TFm, Area, F1, F3, F5. Home-life factors: Light, Temp. Constant. Weights wij lead to the hidden nodes (j = 1…3); weights wjo lead to the output (y), damaged leaf count at week 2.]
Table 6: Estimated weights in ANN model for damaged leaf count

No.   Weight   Connection      Value
1     w11      TFm → h1        23.87
2     w12      TFm → h2        4.43
3     w13      TFm → h3        6.73
4     w22      Area → h2       32.85
5     w23      Area → h3       51.4
6     w32      F1 → h2         2.78
7     w33      F1 → h3         -0.35
8     w42      F3 → h2         10.66
9     w43      F3 → h3         18.71
10    w51      F5 → h1         -1.86
11    w52      F5 → h2         -5.77
12    w53      F5 → h3         -11.94
13    w1o      h1 → o          -34.53
14    w2o      h2 → o          104.17
15    w3o      h3 → o          -25.49
16             Constant → o    -41.86
17    Temp     Temp → o        -0.13
18    Light    Light → o       0.06
The estimated weights in Table 6 characterise the relationship between the CF data and the damaged leaf count.
The importance of individual input nodes in the final fitted ANN model was calculated by the change in the
sum of squared residuals for the removal of each input node in turn from the model. This gave the following
ranking, in order of importance, of the input nodes: F5, TFm, F3, F1 and Area.
9) Predictive models for Poinsettia attributes of plant quality (Objective 2)
i) Preliminary analysis of quality characteristics
Preliminary examination of the relationship between the poinsettia quality characteristics and the chlorophyll
fluorescence data indicated that, although the correlation with chlorophyll fluorescence was not strong for any
characteristic, there was some evidence that chlorophyll fluorescence had predictive power for leaf drop
in week 2. The correlations with bract drop were very weak but, in view of the importance of bract drop for
poinsettia quality, it was thought worthwhile to examine both leaf drop and bract drop in week 2.
ii) Models I and II for leaf drop
Table 7 shows an analysis of variance for the regression model relating leaf drop in week 2 to the mean leaf CF
parameter data at desleeving. For Model I, the regression coefficient for PI at desleeving was positive showing
that a higher PI value gave higher leaf drop in week 2.
For Model II, the important CF parameters for predicting leaf drop were, in order of importance, F3, TFm and
Area, and the estimated regression parameters for these terms showed that a plant with a higher value of F3,
TFm or Area at desleeving had a lower leaf drop at week 2 of home-life.
Table 7: Analysis of variance and chlorophyll fluorescence parameters for poinsettia leaf drop data

               Model I                                 Model II
Term           df    Estimate   s.s.     m.s.          df    Estimate   s.s.     m.s.
Total          268              836.76   3.12          268              836.76   3.12
+ Temp         1     0.096      4.57     4.57 ns       1     0.100      4.57     4.57 ns
+ Light        1     0.029      0.00     0.00 ns       1     0.016      0.00     0.00 ns
+ PI/CF        1     0.106      219.05   219.05 ***    3                235.30   78.42 ***
Residual       265              613.13   2.31          263              596.91   2.27
Parameters
- TFm                                                        -4.02      -23.80   23.80 **
- Area                                                       -43.60     -16.92   16.92 **
- F3                                                         -40.91     -197.65  197.65 ***
ii) Model III for leaf drop
Fig 8 shows the minimum ANN model consistent with the observed leaf drop and chlorophyll fluorescence data
and Table 8 shows the corresponding estimated weights for the individual network connections.
Fig 8: Fitted ANN model for leaf drop in week 2 of home-life.
[Network diagram. CF parameter inputs (xi; i = 1…5): F0, Area, F3, F4, F5. Home-life factors: Light, Temp. Constant. Weights wij lead to the hidden nodes (j = 1…4); weights wjo lead to the output (y), leaf drop at week 2.]
Table 8: Estimated weights in ANN model for leaf drop

No.   Weight   Connection      Value
1     w11      F0 → h1         -86.80
2     w12      F0 → h2         -114.48
3     w13      F0 → h3         64.97
4     w14      F0 → h4         -60.93
5     w22      Area → h2       68.70
6     w23      Area → h3       -25.85
7     w24      Area → h4       30.87
8     w31      F3 → h1         -86.61
9     w32      F3 → h2         49.03
10    w33      F3 → h3         -39.62
11    w34      F3 → h4         -59.30
12    w41      F4 → h1         -142.28
13    w42      F4 → h2         118.61
14    w43      F4 → h3         -75.62
15    w44      F4 → h4         -10.98
16    w51      F5 → h1         135.45
17    w52      F5 → h2         -72.20
18    w53      F5 → h3         47.73
19    w54      F5 → h4         34.98
20    w1o      h1 → o          5.48
21    w2o      h2 → o          30.08
22    w3o      h3 → o          53.97
23    w4o      h4 → o          -8.57
24             Constant → o    -27.55
25    Temp     Temp → o        -0.24
26    Light    Light → o       0.17
The estimated weights in Table 8 characterise the relationship between the CF data and the leaf drop data. The
importance of individual input nodes in the final fitted ANN model was calculated by the change in the sum of
squared residuals for the removal of each input node in turn from the model. This gave the following ranking,
in order of importance, of the input nodes: F5, F4, F0, F3, and Area.
iii) Models I and II for bract drop
Table 9 shows an analysis of variance for the regression model relating bract drop in week 2 to the mean leaf
CF parameter data at desleeving. For Model I, the regression coefficient for PI at desleeving was negative
showing that a higher PI value gave lower bract drop in week 2.
For Model II, the only significant CF parameter for predicting bract drop was F0, with an estimated regression
parameter that showed that a plant with a higher value of F0 at desleeving had a higher bract drop in week 2 of
home-life.
Table 9: Analysis of variance and chlorophyll fluorescence parameters for poinsettia bract drop data

               Model I                                 Model II
Term           df    Estimate   s.s.     m.s.          df    Estimate   s.s.     m.s.
Total          270              358.47   1.34          270              358.47   1.34
+ Temp         1     0.132      0.62     0.62 ns       1     0.094      0.62     0.62 ns
+ Light        1     -0.124     0.96     0.96 ns       1     -0.103     0.96     0.96 ns
+ PI/CF        1     -0.024     11.28    11.28 **      1                15.76    15.76 ***
Residual       265              345.62   1.30          265              341.13   1.29
Parameters
- F0                                                   -1    23.64      -15.76   15.76 ***
iv) Model III for bract drop
Fig 9 shows the minimum ANN model consistent with the observed bract drop and chlorophyll fluorescence
data and Table 10 shows the corresponding estimated weights for the individual network connections.
Fig 9: Fitted ANN model for bract drop in week 2 of home-life.
[Network diagram. CF parameter inputs (xi; i = 1…6): F0, Area, F1, F2, F3, F5. Home-life factors: Light, Temp. Constant. Weights wij lead to the hidden nodes (j = 1…4); weights wjo lead to the output (y), bract drop at week 2.]
Table 10: Estimated weights in ANN model for bract drop

No.   Weight   Connection      Value
1     w11      F0 → h1         -63.23
2     w12      F0 → h2         122.61
3     w13      F0 → h3         37.24
4     w14      F0 → h4         35.35
5     w21      Area → h1       40.27
6     w22      Area → h2       -40.30
7     w23      Area → h3       -28.78
8     w33      F1 → h3         28.98
9     w34      F1 → h4         98.42
10    w41      F2 → h1         -34.75
11    w42      F2 → h2         -63.64
12    w43      F2 → h3         3.74
13    w52      F3 → h2         33.13
14    w54      F3 → h4         -55.48
15    w61      F5 → h1         15.49
16    w62      F5 → h2         -16.96
17    w63      F5 → h3         -11.49
18    w1o      h1 → o          -54.88
19    w2o      h2 → o          50.93
20    w3o      h3 → o          -114.53
21    w4o      h4 → o          57.50
22             Constant → o    5.10
23    Temp     Temp → o        0.18
24    Light    Light → o       -0.10
The estimated weights in Table 10 characterise the relationship between the CF data and the bract drop data.
The importance of individual input nodes in the final fitted ANN model was calculated by the change in the
sum of squared residuals for the removal of each input node in turn from the model. This gave the following
ranking, in order of importance, of the input nodes: F5, F0, F3, F2, F1 and Area.
10) Quantify the power of ANNs compared with linear models (Objective 3)
The power of the models fitted in Objectives 1 and 2 can be compared by examining the proportion of the total
variance explained by each model. Tables 11 and 12 for begonia and poinsettia, respectively, summarise the
residual sums of squares and degrees of freedom for each fitted model and also show the percentage variance
explained by each model.
Table 11 for begonia shows that the PI predictor in Model I for flower count and damaged leaf count had very
little power and that the multivariate linear predictor in Model II gave a definite increase in power. The full
ANN model gave a further increase in power for both variates although the increase relative to Model II was
modest compared with the increased complexity of the model. For the flower drop data, there was little
difference between the power of Model I and Model II and again the increased power of the full ANN model
was modest compared with the increased complexity of the model. None of the fitted models explained more
than 40% of the total variance.
Table 11: Comparison of the power of Models I, II and III for CF prediction for begonia using the percentage variance
explained by each model relative to the null model to compare models

                     Null model        Model I                      Model II                     Model III
                     Residual          Residual         Variance    Residual         Variance    Residual         Variance
Source               s.s.      df      s.s.      df     explained   s.s.      df     explained   s.s.      df     explained
Flower Count         163.22    270     137.76    267    14.6%       123.46    263    22.3%       107.32    255    30.4%
Flower Drop          206.95    270     141.06    267    31.1%       139.30    267    31.9%       116.93    254    39.9%
Damaged Leaf Count   124.39    270     102.96    267    16.3%       80.73     264    33.6%       69.41     253    40.4%
Table 12 for poinsettia shows that for leaf drop the PI in Model I and the linear multivariate predictor in Model
II both had similar power and explained about 25% of the variance. However, the ANN predictor in Model III
gave a very substantial increase in power and explained about 50% of the total variation. The predictive power
of chlorophyll fluorescence for the bract drop data was virtually negligible for all three models.
Table 12: Comparison of the power of Models I, II and III for CF prediction for poinsettia using the percentage variance
explained by each model relative to the null model to compare models

                     Null model        Model I                      Model II                     Model III
                     Residual          Residual         Variance    Residual         Variance    Residual         Variance
Source               s.s.      df      s.s.      df     explained   s.s.      df     explained   s.s.      df     explained
Leaf Drop            836.76    268     613.13    265    25.9%       596.91    263    27.3%       377.99    243    50.2%
Bract Drop           358.47    268     345.62    265    2.5%        341.13    265    3.8%        291.52    245    11.1%
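The percentages in Tables 11 and 12 appear to be consistent with a comparison of residual mean squares rather than raw sums of squares, that is 100 × (1 − (residual s.s./df) / (null s.s./df)). This reading is inferred from the tabulated numbers and is not stated in the text, and the function name below is hypothetical. A short Python sketch using the poinsettia leaf drop row of Table 12:

```python
def pct_variance_explained(resid_ss, resid_df, null_ss, null_df):
    """Percentage variance explained relative to the null model.

    ASSUMPTION: variance is measured by the residual mean square (s.s./df),
    which is what the tabulated percentages appear to follow.
    """
    return 100.0 * (1.0 - (resid_ss / resid_df) / (null_ss / null_df))


# Poinsettia leaf drop, residual s.s. and df as given in Table 12.
NULL_SS, NULL_DF = 836.76, 268
leaf_drop_models = {
    "Model I": (613.13, 265),
    "Model II": (596.91, 263),
    "Model III": (377.99, 243),
}

leaf_drop_pct = {
    name: pct_variance_explained(ss, df, NULL_SS, NULL_DF)
    for name, (ss, df) in leaf_drop_models.items()
}

for name, pct in leaf_drop_pct.items():
    print(f"{name}: {pct:.1f}% of variance explained")
```

Because the published sums of squares are rounded to two decimals, a recomputed percentage can differ from the tabulated one in the last digit for some rows.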
The improvement in the fit of the models shown in Tables 11 and 12 can be illustrated by calculating a
generalised performance index for each of the three models, $\mathrm{Index}_{MI} = PI$,
$\mathrm{Index}_{MII} = \sum_i \beta_i x_i$ and
$\mathrm{Index}_{MIII} = \sum_{j \to o} w_{jo}\,\phi_h\left(\sum_{i \to j} w_{ij} x_i\right)$,
based on the chlorophyll fluorescence predictive terms in Models I, II and III
respectively. Figures 10[a-c] show plots of the begonia flower count data in week 2 against the three indexes for
begonia while Figures 11[a-c] show plots of the poinsettia leaf drop data in week 2 against the three indexes for
poinsettia. For begonia flower count, the plots show a progressive reduction in scatter about the predicted
values from Model I to Model III whereas for poinsettia leaf drop, the scatter about the predicted values for
Model I and Model II appears very similar. However, for Model III fitted by ANN, the scatter about the
predicted values for poinsettia leaf drop is very substantially reduced, a marked improvement over the linear
model predictions.
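As an illustration of how the Model III index might be evaluated, the following Python sketch sums, over the hidden nodes, the hidden-to-output weight times the activation of the weighted CF inputs, using the begonia flower count weights from Table 2. The logistic activation, the function and variable names, and the example input values are assumptions made for illustration. This section does not state the activation function or how the CF inputs were scaled, so the result for any particular input should not be read as reproducing the plotted index values.

```python
import math


def logistic(z):
    """Numerically stable logistic function (ASSUMED hidden-node activation)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Input-to-hidden weights w_ij from Table 2 (begonia flower count).
W_IN = {
    "h1": {"F0": 10.45, "TFm": 0.77},
    "h2": {"TFm": 3.72, "F4": -2.09, "F5": -0.54},
    "h3": {"F0": 20.26, "TFm": -5.77, "Area": -13.35, "F2": 5.49, "F5": -1.55},
}
# Hidden-to-output weights w_jo from Table 2.
W_OUT = {"h1": -93.5, "h2": 30.20, "h3": 18.15}

INPUT_NAMES = ("F0", "TFm", "Area", "F2", "F4", "F5")


def index_miii(x, w_in=W_IN, w_out=W_OUT, activation=logistic):
    """CF part of the ANN predictor: sum_j w_jo * activation(sum_i w_ij * x_i).

    The constant and the Temp/Light connections are left out, since the index
    is based on the chlorophyll fluorescence terms only.
    """
    total = 0.0
    for hidden, weights in w_in.items():
        z = sum(w * x[name] for name, w in weights.items())
        total += w_out[hidden] * activation(z)
    return total


# Hypothetical input: all CF parameters set to zero (not a real measurement).
example_x = dict.fromkeys(INPUT_NAMES, 0.0)
example_index = index_miii(example_x)
```

With every input at zero each hidden node outputs 0.5 under the logistic assumption, so the example index is half the sum of the three hidden-to-output weights.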
Fig 10[a-c]: Relationship between flower count at week 2 of home-life and CF index at desleeving from Model
I [a], Model II [b] and Model III [c] for begonia.
[Three scatter plots. Vertical axis: Square Root of Flower Count at Week 2. Horizontal axes: IndexMI at desleeving [a], IndexMII at desleeving [b] and IndexMIII at desleeving [c].]
Fig 11[a-c]: Relationship between leaf drop at week 2 of home-life and CF index at desleeving from Model I
[a], Model II [b] and Model III [c] for Poinsettia.
[Three scatter plots. Vertical axis: Square Root of Leaf Drop at Week 2. Horizontal axes: IndexMI at desleeving [a], IndexMII at desleeving [b] and IndexMIII at desleeving [c].]
11) Conclusions
Tables 1 and 5 show that the TFm parameter was the most important term in the Model II regression equations
for begonia flower counts and damaged leaf counts. However, TFm does not occur in the PI calculation
(equation 1) and as PI is scale invariant, it is likely that PI will also be time independent. Therefore PI will be
uninformative about TFm and this is the probable reason why Model I was less informative than Model II about
the begonia flower count and damaged leaf count data.
In Table 3, the only important regression coefficient in Model II is F5 and although this parameter does not
occur in PI (equation 1), it seems probable that F5 will be highly correlated with other chlorophyll fluorescence
parameters that do occur in PI. For that reason, Model I and Model II have similar power for the begonia flower
drop data. The interpretation of the Model III terms is difficult but it is worthwhile to note that F5 was the most
important input node for both flower count and damaged leaf count, although not for flower drop. The TFm
node was included for all three measured variates, which emphasises that the time course of the response curve
needs to be included if CF is used as a predictor for begonia plant quality.
Table 11 shows that for flower count and damaged leaf count the simple linear regression on PI (Model I) had
very little predictive power, the multivariate linear model (Model II) had a significant improvement in power
and the ANN model (Model III) had the best power overall. For flower drop, there was very little to choose
between the predictive power of any of the three models. For all three models, the predictive power of
chlorophyll fluorescence for the quality attributes of individual begonia plants was relatively modest.
Tables 7 and 9 show the regression coefficients for the poinsettia leaf and bract drop data respectively but since
the CF predictive power for bract drop was very low, only Table 7 contains useful information. The dominant
term in Model II for leaf drop was F3 but TFm and Area were also significant. The regression coefficient for F3 in
Table 7 is negative, showing that as F3 increased, leaf drop decreased. However, the regression coefficient for PI
in Table 7 is positive, showing that as PI increased leaf drop also increased. The reason for this apparent
anomaly is that PI (equation 1) contains F3 only through the difference (F3-F1) in the denominator of the
equation. This means that when F3 increases, PI decreases, which gives rise to the apparently anomalous
situation where a decrease in PI is associated with a decrease in leaf drop. Hence, PI appears to be negatively
related to quality. This difficulty in the interpretation of the PI variate illustrates one of the advantages of using
the individual CF parameters as explanatory variables rather than a single combined index.
Although the multivariate linear regression model for poinsettia leaf drop gave little increase in power over the
simple regression model, the ANN model had almost double the explanatory power. The fitted ANN model for
poinsettia leaf drop described in Fig 8 and Table 8 is complex and difficult to understand but the ranking of the
input nodes does indicate that F5, F4, F0 and F3 were the most important nodes taken in that order. The complexity
of the ANN model for poinsettia leaf drop suggests that there is some risk that the model may be over-fitted but
the increase in explanatory power compared with the linear models is so large that it seems clear that the ANN
model has produced a real improvement over the linear models.
12) Exploitation
Overall, the power of chlorophyll fluorescence for predicting the home-life performance of individual pot-plants
appears modest. Even the most effective model, Model III for poinsettia leaf drop, explained only about
50% of the observed variability, which is probably inadequate for screening individual plants. However, there
does appear to be the potential for detecting damage to whole batches of plants using batch-screening
techniques. As part of project HL0134LPC, three transport treatments were tested including minimum stress,
normal marketing stress and cold stress and there is evidence from HL0134LPC (see Parsons, Edmondson et al
2001) that there were detectable batch effects on whole batches of plants. The models developed in this project
will be used to test the effectiveness of CF screening for discriminating between batches of plants subjected to
different levels of stress. Chlorophyll fluorescence indexes will be calculated for each plant using the indexes
discussed in Section 10 and the power of chlorophyll fluorescence to discriminate between complete batches of
plants subjected to different levels of stress will be assessed. The work will be reported in the final project
report for HL0134LPC.
References:
DeEll, J., van Kooten, O., Prange, R. and Murr, D. (1999). Applications of Chlorophyll Fluorescence
Techniques in Postharvest Physiology. Horticultural Reviews 23, 69-107.
Parsons, N. R., Edmondson, R. N., Clark, I. and Langton, F. A. (2001). Robust product design and prediction
for post-harvest pot plant quality and longevity. Third year confidential technical
Strasser, R.J., Eggenberg, P. and Strasser, B.J. (1996). How to work without stress but with fluorescence.
Bulletin de la Société Royale des Sciences de Liège 65 (4-5), 330-349.
Tyystjärvi, E., Koski, A., Keränen, M. and Nevalainen, O. (1999). The Kautsky Curve is a Built-in Barcode.
Biophysical Journal 77, 1159-1167.