Broiler Growth Modeling - Biostatistical Consulting Center

Modeling Biological Variables
using Neural Networks
H. Anwar Ahmad, PhD, MBA, MCIS
Associate Professor
Luma Akil, PhD-C
Dept. Biology, Jackson State University
COMSATS May 16, 2012
My Journey
Two Research Projects:
• NIH Biostatistics Support Unit
• DOS Biostatistics Consulting Center
Broiler Growth Modeling
• Determining optimal nutrient requirements
in broilers requires an adequate description of
bird growth and body composition using growth
curves.
• One way to calculate nutrient requirements and
predict the feed intake of growing broilers is
to start with an understanding of their genetic
potential.
Broiler Growth Modeling
• Typically, nutritionists take one nutrient at
a time and, keeping everything else
constant, monitor body growth over a
certain period of time on graded levels of
that particular nutrient. This ignores the
overall system approach and misses the
bigger picture.
Gompertz Nonlinear Regression Equation
• W = A exp[−log(A/B) exp(−Kt)]
• W is the weight at age t, with 3 parameters:
• A is the asymptotic or maximum growth
response,
• B is the intercept, or weight when age t = 0,
• K is the rate constant.
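As a quick check of the Gompertz form above, the curve can be evaluated directly. This is only an illustrative sketch: the values chosen for A, B, and K are hypothetical placeholders, not estimates fitted in this work.

```python
import math

def gompertz_weight(t, A, B, K):
    """Gompertz growth curve: W = A * exp(-log(A/B) * exp(-K*t))."""
    return A * math.exp(-math.log(A / B) * math.exp(-K * t))

# Hypothetical parameter values, chosen only for illustration.
A, B, K = 5000.0, 43.0, 0.05

for day in (0, 10, 35, 70):
    print(day, round(gompertz_weight(day, A, B, K), 1))
```

At t = 0 the expression reduces to W = B, and as t grows W approaches A, which matches the roles of B (intercept) and A (asymptote) given above.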
Broiler Growth Modeling
• The current research project is a step
towards a holistic system approach.
• Using all available information from the
published literature, together with simulated
and experimental data, artificial intelligence
(AI) models of body growth were developed as a
basis for future energy requirement work in broilers.
Neural Networks
• A neural network is a mathematical model of
an information processing structure that is
loosely based on the workings of the human brain.
• An artificial neural network consists of a large
number of simple processing elements
connected to each other in a network.
Neural Networks-Neurons
[Figure: Schematic view of an artificial neural network, showing input, hidden, and output layers.]
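To make the layered structure concrete, here is a minimal forward pass through a network with four inputs, one hidden layer, and one output. Everything specific in it is an assumption made for illustration: the slides do not give the number of hidden units, the activation function, or any weights, so the sigmoid activation and all numeric values below are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic activation, squashing any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: inputs -> hidden layer (sigmoid) -> single linear output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Hypothetical weights for a 4-input, 2-hidden-unit, 1-output network.
hidden_weights = [[0.1, 0.2, 0.3, 0.4], [-0.4, -0.3, -0.2, -0.1]]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, -1.0]
output_bias = 0.5

print(forward([0.2, 0.4, 0.6, 0.8], hidden_weights, hidden_biases,
              output_weights, output_bias))
```

Each hidden unit is one of the "simple processing elements": it forms a weighted sum of its inputs and passes it through an activation function. Training adjusts the weights; this sketch only shows how a prediction is computed once they are set.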
Data Collection
• Average daily body weights of 25 male broiler
chicks (Ross x Ross 308) from day 1 to day 70
were taken from a recently published report
(Roush et al., 2006).
• These averages were converted into 14 interval
classes of 5 days each, with means and
standard deviations.
• These five-day interval classes of broiler growth
accurately reflect growth patterns and curves in
broilers.
5-Day Average Body Weight Intervals

Interval   Mean      Standard    95% CI   Lower     Upper     RiskNormal
                     Deviation   (±)      95% CI    95% CI    Distribution
1            55.20     14.46      6.34      48.86     61.54      55.20
2           138.20     38.80     17.01     121.19    155.21     138.20
3           312.20     70.95     31.09     281.11    343.29     312.20
4           588.80    104.71     45.89     542.91    634.69     588.80
5           945.40    120.20     52.68     892.72    998.08     945.40
6          1354.00    135.07     59.20    1294.80   1413.20    1354.00
7          1826.40    162.98     71.43    1754.97   1897.83    1826.40
8          2334.60    166.98     73.18    2261.42   2407.78    2334.60
9          2828.60    147.32     64.57    2764.03   2893.17    2828.60
10         3292.40    152.40     66.79    3225.61   3359.19    3292.40
11         3765.80    137.95     60.46    3705.34   3826.26    3765.80
12         4159.80    110.91     48.61    4111.19   4208.41    4159.80
13         4488.20     94.15     41.26    4446.94   4529.46    4488.20
14         4708.80     32.87     14.40    4694.40   4723.20    4708.80
Data Simulation
• The 14 means and standard deviations
were used to simulate broiler growth data
from day 1 to day 70, using the Normal
distribution function of @Risk software
(Palisade Corporation).
• Six simulations of one thousand
iterations each were performed.
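The simulation step can be sketched outside @Risk as repeated draws from a Normal distribution with each interval's mean and standard deviation. The sketch below uses Python's standard `random` module as a stand-in for the @Risk add-in, so it illustrates the idea rather than reproducing the software's output. The three (mean, SD) pairs are those of the first three interval classes; the seed and function name are arbitrary choices.

```python
import random
import statistics

# (mean, SD) in grams for the first three 5-day interval classes.
interval_stats = [(55.20, 14.46), (138.20, 38.80), (312.20, 70.95)]

def simulate_interval(mean, sd, n_draws, rng):
    """Draw n_draws body weights (g) from Normal(mean, sd)."""
    return [rng.gauss(mean, sd) for _ in range(n_draws)]

rng = random.Random(2012)  # fixed seed so the sketch is repeatable
simulated = [simulate_interval(m, s, 1000, rng) for m, s in interval_stats]

# Compare the target parameters with the simulated sample statistics.
for (m, s), draws in zip(interval_stats, simulated):
    print(m, s, round(statistics.mean(draws), 2), round(statistics.stdev(draws), 2))
```

With 1000 draws per interval, the sample mean and standard deviation should land close to the values fed in, which is a simple sanity check on any simulated data set.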
Data Manipulation and Simulation
• From each of the 14 five-day interval classes,
100 data points were randomly drawn, for a total
of 1400 observations, representing 20 observations
for each day up to day 70.
• Only the first 50 days of data were used for this
research; they were arranged in one row of a
spreadsheet to generate the examples for neural
network training.
Neural Networks Training
• Out of 1400 data points, 750 training examples
(epochs) for NN training were generated.
• Starting from day one, the first four body
weight observations were used as inputs and
the fifth body weight observation was used
as the output, constituting one training example
(epoch).
Neural Networks Training
• The second training example consisted of the
second, third, fourth, and fifth observations of
day-one body weight as inputs, with the sixth
observation of day one as the output.
• A total of 750 such examples (epochs) were
generated from the simulated data.
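The four-inputs-then-one-output scheme described above is a sliding window over a series. Here is a minimal sketch of that windowing, applied for illustration to the first seven values of the original daily body weight series in this deck; the function name and its `n_inputs` parameter are hypothetical, not taken from the software used.

```python
def make_examples(series, n_inputs=4):
    """Slide a window over `series`: n_inputs consecutive values in, the next value out."""
    examples = []
    for start in range(len(series) - n_inputs):
        inputs = series[start:start + n_inputs]
        target = series[start + n_inputs]
        examples.append((inputs, target))
    return examples

# First seven daily body weights (g) from the original data.
body_weights = [43, 43, 51, 62, 77, 93, 113]

for inputs, target in make_examples(body_weights):
    print(inputs, "->", target)
```

Each step drops the oldest input and shifts the previous output into the input window, so a series of n values yields n − 4 examples.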
Neural Networks Testing
• For NN testing, the same procedure of
epoch generation was applied.
• However, instead of simulated data, the
original body weight data were used, for a
total of 50 epochs.
Test examples from the original body weight data (I1-I4 = inputs, O = output)

Age (d)  BW (g)    I1     I2     I3     I4      O   Epoch
1           43     43     43     51     62     77     1
2           43     43     51     62     77     93     2
3           51     51     62     77     93    113     3
4           62     62     77     93    113    134     4
5           77     77     93    113    134    159     5
6           93     93    113    134    159    192     6
7          113    113    134    159    192    227     7
8          134    134    159    192    227    265     8
9          159    159    192    227    265    308     9
10         192    192    227    265    308    355    10
11         227    227    265    308    355    406    11
12         265    265    308    355    406    460    12
13         308    308    355    406    460    520    13
14         355    355    406    460    520    586    14
15         406    406    460    520    586    654    15
16         460    460    520    586    654    724    16
17         520    520    586    654    724    796    17
18         586    586    654    724    796    868    18
19         654    654    724    796    868    944    19
20         724    724    796    868    944   1018    20
21         796    796    868    944   1018   1101    21
22         868    868    944   1018   1101   1185    22
23         944    944   1018   1101   1185   1270    23
24        1018   1018   1101   1185   1270   1349    24
25        1101   1101   1185   1270   1349   1438    25
26        1185   1185   1270   1349   1438   1528    26
27        1270   1270   1349   1438   1528   1632    27
28        1349   1349   1438   1528   1632   1704    28
29        1438   1438   1528   1632   1704   1830    29
30        1528   1528   1632   1704   1830   1936    30
31        1632   1632   1704   1830   1936   2030    31
32        1704   1704   1830   1936   2030   2122    32
33        1830   1830   1936   2030   2122   2230    33
34        1936   1936   2030   2122   2230   2335    34
35        2030   2030   2122   2230   2335   2442    35
36        2122   2122   2230   2335   2442   2544    36
37        2230   2230   2335   2442   2544   2650    37
38        2335   2335   2442   2544   2650   2739    38
39        2442   2442   2544   2650   2739   2801    39
40        2544   2544   2650   2739   2801   2942    40
41        2650   2650   2801   2942   3011   3089    41
42        2739   2739   2801   2942   3011   3089    42
43        2801   2801   2942   3011   3089   3207    43
44        2942   2942   3011   3089   3207   3295    44
45        3011   3011   3089   3207   3295   3395    45
46        3089   3089   3207   3295   3395   3476    46
47        3207   3207   3295   3395   3476   3568    47
48        3295   3295   3395   3476   3568   3706    48
49        3395   3395   3476   3568   3706   3770    49
50        3476   3476   3568   3706   3770   3867    50
Results- Neural Networks Comparisons

Parameter                      BP3        BP5        Ward
R squared:                     0.998      0.967      0.973
r squared:                     0.998      0.994      0.9986
Mean squared error:            3514       47376      38949
Mean absolute error:           51.198     190.369    182.532
Min. absolute error:           0.453      0.058      35.495
Max. absolute error:           118.192    516.575    415.5
Correlation coefficient r:     0.999      0.997      0.9993
Percent within 5%:             64.000     4.000      0
Percent within 5% to 10%:      20.000     34.000     28
Percent within 10% to 20%:     2.000      24.000     38
Percent within 20% to 30%:     2.000      32.000     12
Percent over 30%:              12.000     6.000      22
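The comparison statistics above can be computed from paired actual and predicted values. The slides list "R squared" and "r squared" separately without defining them; the sketch below assumes R squared is the coefficient of determination, 1 − SSE/SST, and r squared is the square of the Pearson correlation coefficient. That reading is an assumption, not something stated in the source. The actual values are five body weights from the test series, and the predicted values are hypothetical.

```python
import math

def fit_statistics(actual, predicted):
    """Return R squared, r squared, MSE, MAE, and percent of predictions within 5% of actual."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sse = sum(e * e for e in errors)                     # sum of squared errors
    sst = sum((a - mean_a) ** 2 for a in actual)         # total sum of squares
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(sst * var_p)                     # Pearson correlation
    within_5 = sum(1 for a, e in zip(actual, errors) if abs(e) <= 0.05 * abs(a))
    return {
        "R_squared": 1.0 - sse / sst,   # assumed definition (see lead-in)
        "r_squared": r * r,             # assumed definition (see lead-in)
        "mse": sse / n,
        "mae": sum(abs(e) for e in errors) / n,
        "pct_within_5": 100.0 * within_5 / n,
    }

# Actual weights (g) are from the test series; the predictions are hypothetical.
actual = [77, 93, 113, 134, 159]
predicted = [80, 90, 115, 130, 165]
print(fit_statistics(actual, predicted))
```

Under these definitions r squared can stay high while R squared drops, because correlation ignores a consistent offset or scale error in the predictions whereas 1 − SSE/SST penalizes it. That would be consistent with BP5 and Ward showing a larger gap between the two measures than BP3.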
[Figure: BP3 NN body weight prediction with Roush test data. Series: Actual, BP3 Network, and their difference (Act-BP3Net). x-axis: Age (d), days 1-49; y-axis: BWt (g).]
[Figure: BP5 NN body weight prediction with Roush test data. Series: Actual, BP5 Network, and their difference (Act-BP5Net). x-axis: Age (d), days 1-49; y-axis: BWt (g).]
[Figure: Ward NN body weight prediction with Roush test data. Series: Actual, Ward Network, and their difference (Act-WardNet). x-axis: Age (d), days 1-49; y-axis: BWt (g).]
Conclusion
• A holistic system approach encompasses
all aspects of the problem instead of offering
a fragmented solution.
• A simulation approach is efficient and
feasible once the underlying variables and
parameters of the problem are clearly
understood and defined.
References
• Ahmad, H. A. 2009. Poultry Growth
Modeling using Neural Networks and
Simulated Data. J. Appl. Poultry Res.
18:440-446.
• Ahmad, H. A., and M. Mariano. 2006.
Comparison of Forecasting Methodologies
using Egg Price as a Test Case. Poultry
Science 85:798-807.
Conclusion
• Of all the NN architectures used, the BP3 NN
gave the best prediction lines, with a near-perfect
R² value (0.998).
• The NN modeling approach is an efficient
alternative to its traditional statistical
counterparts.
• Further research on energy and amino acid
modeling will enhance our understanding of
these modeling approaches.
Egg Production Modeling
Mathematical and Stochastic Egg Production Models
• Compartmentalization model (Grossman and
Koops, 2001)
• Very complex
• Stochastic model (Alvarez and Hocking, 2007)
• Requires too many variables to determine egg
production.
• Impractical under most commercial situations.
Data Collection
• Average weekly egg production data of 240
layers from 22 US commercial strains were
graciously provided by David A. Roland of
Auburn University.
• The data were collected in three phases:
wk 21-36, wk 37-52, and wk 53-66.
• Daily egg production data of the 22 strains were
summarized as 7-day hen production means and
standard deviations.
Egg Production Curve
• The egg production curve in commercial layers
follows a unique pattern:
• Production begins around 20 weeks of age,
starting slowly at around 5%, and increases weekly
for the next 7-8 weeks until it attains peak
production of 95-97% around week 28.
• During the next 8-10 weeks, egg production
maintains a plateau around its peak.
• During weeks 38-52 (phase II), egg production
slowly starts declining.
• During phase III (wk 53-72), production declines
further until it reaches the point of non-profitability,
around 60%.
Data Collection
• The original data was used to map egg
production curves in three different
phases, compare curves among 22
commercial strains and compare strains
laying white eggs with those laying brown
eggs.
[Figure: Egg Production Curve. x-axis: Age from wk 21 (wk 21 to wk 65); y-axis: Egg Production %, 0-120.]
[Figure: Egg Production Curve, phase I. x-axis: Age from wk 21 (wk 21 to wk 36); y-axis: Egg Production %, 0-120.]
[Figure: Egg Production Curve, phase II. x-axis: Age from wk 37 (wk 37 to wk 52); y-axis: Egg Production %, 91-99.]
[Figure: Egg Production Curve, phase III. x-axis: Age from wk 53 (wk 53 to wk 66); y-axis: Egg Production %, 80-96.]
[Figure: Comparative Egg Production, Brown vs. White Layers. Series: Brown Av, White Av. x-axis: Age from wk 22 (weeks 1-47); y-axis: EP%, 0-120.]
[Figure: Comparative Egg Production Curves of US Commercial Strains, phase I. x-axis: Age from wk 22 (weeks 1-16); y-axis: Egg Production %, 0-120. Annotation: Only strain 1 was different from 13.]
Egg Production Modeling
• For the current project, data from one of the
brown-egg-laying strains were chosen for
simulation and training of neural networks.
• Mean weekly egg weights and standard
deviations from wk 22-36 were computed.
• These parameters were fed into the Normal
distribution function of @Risk 4.0 software
(Palisade Corporation).
• Six simulations of one thousand iterations
each were performed.
Data Simulation and Manipulation
• From each of the 15 sets of mean and standard
deviation, one for each week of production
from week 22 to week 36, 20 data points
were randomly drawn, for a total of 300
observations for training the neural
networks.
• Similarly, for testing the neural networks, ten
random data points were drawn per week.
Data Simulation and Manipulation
• The 20 training and 10 test observations
for each week of egg production were each
arranged in one row of a spreadsheet to
generate 105 neural network examples for
training and for testing, respectively.
Neural Networks Training
• Starting from wk 22, the first four egg
production observations were used as
inputs and the fifth observation was
used as the output, constituting one training
example (epoch).
Neural Networks Training
• The second training example consisted of the
second, third, fourth, and fifth observations of
egg production as inputs, with the sixth
observation as the output.
• A total of 105 such examples (epochs) were
generated from the simulated data.
• For NN testing, the same procedure of
epoch generation was applied.
NN Examples (Epoch)

Age    Eg.            I1      I2      I3      I4      O
wk22   d1, epoch1      6.67   12.62   21.11   54.44   10.86
wk22   d2, epoch2     12.62   21.11   54.44   10.86   24.06
wk22   d3, epoch3     21.11   54.44   10.86   24.06   14.16
wk22   d4             54.44   10.86   24.06   14.16   19.67
wk22   d5             10.86   24.06   14.16   19.67   40.96
wk22   d6             24.06   14.16   19.67   40.96   54.21
wk22   d7             14.16   19.67   40.96   54.21   46.47
wk23   d8, epoch8     32.09   24.40   33.30   48.34   78.37
wk23   d9             24.40   33.30   48.34   78.37   47.75
wk23   d10            33.30   48.34   78.37   47.75   52.19
wk23   d11            48.34   78.37   47.75   52.19   73.86
wk23   d12            78.37   47.75   52.19   73.86   50.92
wk23   d13            47.75   52.19   73.86   50.92   49.90
wk23   d14            52.19   73.86   50.92   49.90   44.69
BP-3 Model
[Figure: BP-3 NN, Strain 1, phase I. Series: Actual, BP-3 NN, and their difference (Act-Net). R² = 0.68. x-axis: Age (d), 1-101; y-axis: EP%.]
GRNN Model
[Figure: GRNN, Strain 1, phase I. Series: Actual, GRNN, and their difference (Act-Net). R² = 0.72. x-axis: Age (d), 1-99; y-axis: EP%.]
WARD-5 Model
[Figure: Ward-5 NN, Strain 1, phase I. Series: Actual, Ward-5 NN, and their difference (Act-Net). x-axis: Age (d), 1-103; y-axis: EP%.]
Results- Neural Networks Comparisons

Parameter                      BP-3      GRNN      Ward-5
R squared:                     0.681     0.715     0.697
r squared:                     0.78      0.75      0.76
Mean squared error:            136.09    121.51    129.38
Mean absolute error:           7.81      7.28      7.40
Min. absolute error:           0.04      0.12      0.09
Max. absolute error:           46.54     43.54     48.40
Correlation coefficient r:     0.88      0.86      0.87
Percent within 5%:             45.71     49.52     50.48
Percent within 5% to 10%:      25.71     24.76     20.95
Percent within 10% to 20%:     16.19     15.24     18.10
Percent within 20% to 30%:     5.71      2.86      4.76
Percent over 30%:              6.67      7.62      5.71
GRNN comparison with brown strains, phase I
[Figure: GRNN vs. brown strains, phase I. x-axis: Age from wk 22 (weeks 1-15); y-axis: EP%, 0-120.]
GRNN comparison with white strains, phase I
[Figure: GRNN vs. white strains, phase I. Series: GRNN, White strains average. x-axis: Age from wk 22 (weeks 1-15); y-axis: EP%, 0-120.]
GRNN comparison with commercial strains, phase I
[Figure: Strain comparisons with GRNN, phase I, EP%. Series: GRNN and individual strains (legend entries Series1 to Series25). x-axis: weeks 1-15; y-axis: EP%, 0-120.]
Conclusion
• Data simulation may offer a feasible
alternative to expensive original data
collection once the parameters of the
underlying variables are defined and understood.
• Compared with other mathematical and
statistical models, neural networks
predicted the egg production curve
accurately and efficiently.
Conclusion
• All three NN architectures produced comparable
results in terms of R², which varied from 0.681 to
0.715.
• All the networks over-predicted during the initial
period, when egg production was rising to its peak.
Conclusion
• Once the initial over-prediction anomaly is
corrected, the NN modeling approach will be an
efficient and superior alternative to its traditional
mathematical and statistical counterparts for
egg production prediction.
• Further research on energy and amino acid
modeling will enhance our understanding of
these modeling approaches.
References
• Ahmad, H. A. 2011. Egg Production
Forecasting: Determining efficient
modeling approaches. J. Appl. Poultry
Res. 20(4):463-473. NIHMSID # 365792
CD4+ Pattern Recognition using
GRNN in HIV Patients
[Figure: GRNN_CD4 Prediction. Series: Actual CD4, GRNN. R² = 0.92. x-axis: Pattern, 1-35; y-axis: CD4, 0-1000.]
CD4+ prediction with Viral Load using BP-3NN
[Figure: CD4 Prediction with BP-3. Series: Actual CD4, BP3 CD4. x-axis: Viral Load; y-axis: CD4 Values, 0-1400. R squared:]
Immunology 101
• CD4 is a protein that is present on
T helper lymphocytes (blood cells that respond to
foreign substances).
• T helper cells do not kill the cells that carry the
antigen but interact with B lymphocytes or T killer
lymphocytes to respond to antigens.
• Simple tests are available for CD4 proteins that
in turn identify and count T helper lymphocytes.
[Figure: BPNN for CD4 Prediction with VL. Series: Actual CD4, NN predicted CD4. R² = 0.1392. x-axis: Test data, 1-169; y-axis: CD4, 0-200.]
[Figure: GRNN for CD4 Prediction with VL. Series: Actual CD4, GRNN pred. CD4. R² = 0.194. x-axis: Test data; y-axis: CD4, 0-1400.]
[Figure: GRNN Prediction for Viral Load with CD4. Series: Actual viral load, GRNN pred VL. x-axis: Test data, 1-177; y-axis: Viral load, 0-1200000.]
[Figure: BP-3NN CD4 Prediction using Pattern. Series: Actual CD4, NN CD4. R² = 0.84. x-axis: Test data, 1-35; y-axis: CD4, 0-1200.]
High Blood Pressure vs. Obesity
Ozone Modeling
Salmonella Outbreaks