Introduction:
Estimating produce yield ahead of harvest time is a valuable ability for any sort of agricultural
producer. An early, accurate yield estimate allows a producer to set up purchase agreement deals,
properly plan a budget, and plan ahead for the next growing season. With increased incidence of
severe weather patterns and disease, yield estimates are more critical than ever.
Yield prediction is still in its infancy for specialty fruit production such as citrus groves.
Currently, orange producers predict their grove yield using hand-collected tree counts in addition
to expert knowledge on factors ranging from the pervasiveness of disease to the number of
young buds early in the season. This process is time-consuming and gives only a rough
approximation of the final yield. Yield prediction has been attempted using airborne
hyperspectral imaging (Ye et al. 2007), with mixed results from year to year. Vision-based
prediction models, as well as multi-modal models incorporating vision, have been shown to
predict yield with high accuracy for apples and citrus (Wang et al. 2012; Swanson et al. 2010), but
they cannot make predictions until late in the growing season when the fruit has already budded.
Zaman et al. (2006) showed that citrus yield correlates with ultrasonic tree canopy volume
measurements. Our research aims to find a similar correlation using high-resolution ladar scans,
which have been shown to be more effective than ultrasonic scans at producing accurate canopy
volume measurements (Tumbo et al. 2002). We use the greater accuracy of laser scans to predict
yield on a block-by-block basis.
Orange trees experience notoriously extreme year-to-year yield variation, and neither
current yield estimate methods nor our method claims to accurately predict the yield for an
individual block in a single year. We need only match the accuracy of the current yield
estimation methods to provide a valuable alternative for citrus growers. Currently, obtaining tree
counts requires an expert operator to drive down a row of trees and hand count and mark the
GPS coordinates of each tree. Each row must be traversed twice because the operator can focus
on only one side of the row at a time. Our research shows that, by attaching scanners to existing
hardware (such as pruners or sprayers), growers can obtain yield estimates without the
time-consuming process of hand-counting trees.
Apparatus ( I don’t know the specifics of the laser-truck set up ):
A laser scanner was mounted on a pickup truck. Each scan takes some number of readings
spanning some range of angles to obtain a vertical snapshot of a row of trees.
Procedure:
Our yield data is limited in accuracy by the scale on which it was collected and by the
edge effects introduced through the harvesting procedure. A harvesting truck continues
collecting oranges until it is full, regardless of whether it is crossing between tree blocks.
Therefore some of the yield from a previous block may be incorrectly attributed to its neighbor.
This random error should even out across the entire grove as it is equally likely to occur between
any two blocks (Is this true?).
Our research was conducted using laser scan data from [#-acres] acres in a Florida
orange grove from 2011 to 2012, with associated ground truth data on the age of each tree as
well as the yield for each half-acre block of trees. Tree ages were classified as one to
two years, three to five years, six to nine years, and over ten years. We obtained the yield data
at half-acre resolution and calculated the volume and density of each block.
Feature Extraction:
Each raw laser scan was cleaned of points not associated with a tree’s canopy by ignoring
points below a height threshold and points farther from the scanner than the tree’s trunk (a
distance given by the known width of the row).
Figure 1. Laser Scan Data before pre-processing to remove points that are unlikely to be associated with the tree
canopy due to their height and distance from the scanner.
Figure 2. Processed laser scan omitting outlying points.
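The cleaning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name, the (distance, height) point layout, and the threshold values are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical point layout: (horizontal distance from scanner, height), in metres.
Point = Tuple[float, float]

def filter_canopy_points(points: List[Point],
                         min_height: float,
                         trunk_distance: float) -> List[Point]:
    """Keep points at or above min_height and no farther than the trunk."""
    return [(d, h) for d, h in points
            if h >= min_height and 0.0 <= d <= trunk_distance]

# Illustrative values only; the real thresholds are not given in the text.
scan = [(0.5, 0.1), (1.2, 1.5), (2.0, 2.4), (3.5, 2.0), (1.8, 0.2)]
canopy = filter_canopy_points(scan, min_height=0.5, trunk_distance=2.5)
print(canopy)  # [(1.2, 1.5), (2.0, 2.4)]
```

Points near the ground and points beyond the trunk are dropped, leaving only readings that plausibly belong to the near side of the canopy.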
The remaining points were then reflected across the axis of the trunk. The area of the convex hull
formed by the inlying points and their reflections was taken to be the cross-sectional area of the
canopy. MATLAB’s convhull function was used to generate the convex hull and its area.
Figure 3. Inliers reflected across the axis of the trunk to form the frame of a convex hull.
Figure 4. The convex hull generated from the reflected canopy points in figure 3.
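The reflection and convex hull steps can be illustrated as follows. The study used MATLAB's convhull; the sketch below substitutes a plain-Python monotone chain hull and the shoelace formula so that it runs without extra libraries. The function names and the sample points are assumptions for illustration only.

```python
def reflect_across_trunk(points, trunk_distance):
    """Mirror (distance, height) points about the vertical line at the trunk."""
    return [(2.0 * trunk_distance - d, h) for d, h in points]

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def canopy_cross_section_area(points, trunk_distance):
    """Area of the hull around the inliers and their mirror images."""
    mirrored = reflect_across_trunk(points, trunk_distance)
    return polygon_area(convex_hull(points + mirrored))

# Three inliers on the near side of a trunk assumed to stand 2 m from the scanner.
inliers = [(1.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
area = canopy_cross_section_area(inliers, trunk_distance=2.0)
print(area)  # 4.0
```

In this example the mirrored points complete a 2 m by 2 m square, so the cross-sectional area is 4 square metres.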
The area of the resulting convex hull was then integrated over the distance to the next
scan to generate a volume ‘slice’. The sum of all the slices for a given block of trees constitutes
the volume feature of that block.
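One way to carry out this integration is sketched below. The text does not say which integration rule was used, so the trapezoidal rule (averaging the areas of the two scans that bound a slice) is an assumption, as are the function names and the sample numbers.

```python
def slice_volumes(areas, positions):
    """Volume of each slice between consecutive scans (trapezoidal rule, assumed).

    areas[i] is the canopy cross-section area at scan i, and positions[i] is
    the along-row position of that scan, in the same length unit.
    """
    if len(areas) != len(positions):
        raise ValueError("areas and positions must have the same length")
    return [0.5 * (areas[i] + areas[i + 1]) * (positions[i + 1] - positions[i])
            for i in range(len(areas) - 1)]

def block_volume(areas, positions):
    """Volume feature of a block: the sum of its slice volumes."""
    return sum(slice_volumes(areas, positions))

# Three scans taken 0.5 m apart (illustrative values).
areas = [4.0, 6.0, 5.0]
positions = [0.0, 0.5, 1.0]
print(slice_volumes(areas, positions))  # [2.5, 2.75]
print(block_volume(areas, positions))   # 5.25
```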
A barren tree will produce fewer oranges than a lush tree, so we calculated a density
approximation of each slice to better represent the fullness of scanned trees. The density for a
volume slice was calculated as the ratio
\[
\text{density} = \frac{\text{number of valid laser readings between the apparatus and the tree}}{\text{total number of valid laser readings}}
\]
where a valid laser reading is defined as one above the known minimum height at which the tree
canopy begins. Visually this ratio is shown in figure 5. Put simply, the more laser readings
penetrating a canopy the less dense that canopy is. The density of a volume slice is the average
density of the two scans that constitute that slice.
Figure 5. Density approximation of a tree. Points between the laser and the tree trunk are shown in green. Points
beyond the tree trunk are shown in red.
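The density ratio and the per-slice average can be sketched as follows. The names, the (distance, height) point layout, and the choice to return 0.0 for a scan with no valid readings are assumptions made for this illustration.

```python
def scan_density(points, min_height, trunk_distance):
    """Fraction of valid readings that fall between the scanner and the trunk.

    A reading is treated as valid when it lies at or above min_height.
    Returns 0.0 when a scan has no valid readings (an assumed convention).
    """
    valid = [(d, h) for d, h in points if h >= min_height]
    if not valid:
        return 0.0
    in_front = sum(1 for d, _ in valid if d <= trunk_distance)
    return in_front / len(valid)

def slice_density(scan_a, scan_b, min_height, trunk_distance):
    """Density of a slice: the mean density of the two scans that bound it."""
    return 0.5 * (scan_density(scan_a, min_height, trunk_distance)
                  + scan_density(scan_b, min_height, trunk_distance))

# Illustrative scans with the trunk assumed 2 m from the scanner.
scan_a = [(1.0, 1.0), (1.5, 2.0), (1.8, 2.5), (3.0, 1.5)]  # 3 of 4 stop before the trunk
scan_b = [(1.2, 1.0), (2.8, 2.0), (3.1, 2.5), (3.5, 1.5)]  # 1 of 4 stops before the trunk
print(scan_density(scan_a, 0.5, 2.0))            # 0.75
print(slice_density(scan_a, scan_b, 0.5, 2.0))   # 0.5
```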
We used the laser scans to generate gridded height maps of the blocks. We then plotted
our calculated volume slices alongside the height map to confirm that our estimation of volume
was reasonable. Figures 6 and 7 show the volume approximation rising and falling in tandem
with the height of the trees.
Figure 6. Height map with volume plotted alongside (20cm resolution). The right-side volume is plotted in red and
the left-side volume in blue. (Hard to see anything with axis equal on)
Figure 7. Zoomed in view of figure 6.
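A gridded height map of this kind could be built as in the sketch below. The 20 cm cell size comes from the caption of Figure 6; storing the maximum height seen in each cell, the (x, y, z) point layout, and the function name are assumptions for illustration.

```python
import math

def grid_height_map(points, cell_size=0.2):
    """Bin (x, y, z) points into square cells and keep the tallest z per cell.

    Returns a dict mapping (column, row) cell indices to the maximum height.
    """
    grid = {}
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        if key not in grid or z > grid[key]:
            grid[key] = z
    return grid

# Illustrative points in metres.
cloud = [(0.05, 0.05, 1.0), (0.15, 0.10, 2.5), (0.30, 0.05, 1.8), (0.90, 0.50, 3.1)]
print(grid_height_map(cloud))  # {(0, 0): 2.5, (1, 0): 1.8, (4, 2): 3.1}
```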
Our height/volume plots correlate with the ground truth tree types as well. Figure 8
shows the GPS location of the expert operator hand-collected tree data overlaid on a section of
figure 6. The size of the circles is relative to the age of the tree.
Figure 8. Expert Operator collected Tree-Age locations overlaid on height/volume plot.
Volume and yield can be shown to correlate on a macro scale as well. When we color and plot
each block according to its volume and yield, the connection, though imperfect, becomes
clear. The blue and red regions tend to match up between the two plots.
Figure 9. Left: Total volume by plot. Right: Total yield by plot from 2009 – 2012. Colors range from blue (low
value) to red (high value).
We generated a linear regression model between the volumes and tree counts, both taken
at a block level, using MATLAB’s LinearModel.fit function to confirm the relationship between
volume and tree count. We plotted the generated model to create an added variable plot of the
correlation. Volume and tree counts correlate linearly with R2 = .913.
When plot(model) is called on a model with multiple predictor variables, it produces an added
variable plot, which illustrates the incremental effect on the response of specified terms by
removing the effects of all other terms. The slope of the fitted line is the coefficient of the linear
combination of the specified terms projected onto the best-fitting direction. The adjusted
response includes the constant (intercept) terms and averages out all other terms.
Figure 10. Added Variable plot of tree counts versus volume. R2 = .913
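For readers without MATLAB, the fit and its R2 value can be reproduced with ordinary least squares. The sketch below is a plain-Python stand-in for LinearModel.fit with a single predictor, not the study's code, and the data are made up purely to show the arithmetic.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up block-level values, for illustration only.
volumes = [1.0, 2.0, 3.0, 4.0]
tree_counts = [1.0, 3.0, 2.0, 4.0]
slope, intercept = fit_line(volumes, tree_counts)
r2 = r_squared(volumes, tree_counts, slope, intercept)
```

For this toy data the fitted line has slope 0.8 and intercept 0.5, and R2 works out to 0.64 (residual sum of squares 1.8 against a total of 5).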
We performed the same analysis between the block level volume and yield, and the block
level tree counts and yield to produce correlation estimates.
Figure 11. Scatter plot of block-level volumes and yield. R2 = .444
Figure 12. Added Variable plot of tree count versus yield. R2 = .441
Tree counts and volume correlate closely with each other and about equally well with yield
(R2 = .441 and .444, respectively). Incorporating our density estimate improves the linear fit
slightly, increasing the R2 value from .444 to .456.
Figure 13. Added variable plot of density and volume vs yield. R2 = .456.
Analysis:
We split the data into a training set and a test set and compared our laser scan based yield
estimate to the yield estimate obtained from tree counts. We generated a linear regression model
based on the training set and then used that model with the test set data to obtain estimates
of block yield. Trees one to two years old were ignored because they do not yet
produce oranges.
We performed four-fold cross-validation to confirm our results. We split the data set into four
quarters and performed four experiments, where each experiment used three data quarters as the
training set and the remaining quarter as the test set. This way we could be sure we weren’t
training with an unusually predictive set of blocks and introducing bias into our analysis.
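The four experiments can be sketched as below. This is a simplified illustration with one predictor and made-up data. The contiguous, unshuffled quarters, the reading of "average" error as mean absolute error, and all names are assumptions rather than details taken from the study.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_errors(x, y, k=4):
    """Hold out each contiguous fold in turn; return (mean abs error, RMS error) per fold."""
    n = len(x)
    results = []
    for fold in range(k):
        start, stop = fold * n // k, (fold + 1) * n // k
        # Train on everything outside the held-out fold.
        slope, intercept = fit_line(x[:start] + x[stop:], y[:start] + y[stop:])
        residuals = [y[i] - (slope * x[i] + intercept) for i in range(start, stop)]
        mean_abs = sum(abs(r) for r in residuals) / len(residuals)
        rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        results.append((mean_abs, rms))
    return results

# Made-up feature and yield values for eight blocks.
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
block_yield = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]
for number, (mean_abs, rms) in enumerate(k_fold_errors(feature, block_yield), start=1):
    print(f"Exp. {number}: average error {mean_abs:.3f}, RMS error {rms:.3f}")
```

Running the same routine once with the volume/density features and once with tree counts gives the kind of side-by-side comparison reported in the Results.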
Results:
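We found that our yield predictions were generally more accurate when we trained with the
volume/density scan data than with the ground truth tree counts. The volume/density features
gave lower RMS error in all four experiments and lower average error in two of the four
(Experiments 1 and 4); in the other two, the average errors differed by less than 1%.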
Error * 10^3    Average           Average        RMS               RMS
Features        Volume/Density    Tree Counts    Volume/Density    Tree Counts
Exp. 1          0.8947            1.0494         1.1251            1.3588
Exp. 2          1.1313            1.1263         1.3784            1.3807
Exp. 3          0.7822            0.7763         1.0975            1.1517
Exp. 4          0.5642            0.6084         0.7403            0.7759
Conclusion:
Our ladar-based yield prediction performs as well as tree-count-based yield prediction.
Citrus growers can therefore use our method to save themselves the time and effort required to
generate a hand-tabulated count of trees by location and age.
References
Swanson, Matthew, Cristian Dima, and Anthony Stentz. “A Multi-Modal System for Yield
Prediction in Citrus Trees.” ASABE Meeting Presentation. David L. Lawrence Convention
Center, Pittsburgh, PA. June 20th 2010. Paper 10.
Tumbo, S.D., M. Salyani, J.D. Whitney, T.A. Wheaton, and W.M. Miller. “Investigation of
Laser and Ultrasonic Ranging Sensors for Measurements of Citrus Canopy Volume.” Applied
Engineering in Agriculture. 18(3). 2002: 367 – 372. Print.
Viau, Alain, Jae-Dong Jang, Véronique Payan, and Alain Devost. “The Use of Airborne LIDAR
and Multispectral Sensors for Orchard Trees Inventory and Characterization”. Precision
Agriculture. 5. 2005:
Wang, Qi, Stephen Nuske, Marcel Bergerman, and Sanjiv Singh. “Automated Crop Yield
Estimation for Apple Orchards”. 2012. Robotics Institute. Paper 931.
Ye, Xujun, Kenshi Sakai, Masafumi Manago, Shin-ichi Asada, and Akira Sasao. “Prediction of
citrus yield from airborne hyperspectral imagery”. Precision Agriculture. 8(1). 2007: 111 –
125.
Zaman, Q.U., A.W. Schumann, and H.K. Hostler. “Estimation of Citrus Fruit Yield Using
Ultrasonically-Sensed Tree Size.” Applied Engineering in Agriculture. 22(1). 2006: 39-44.
Print.