Modeling Biological Variables using Neural Networks H. Anwar Ahmad, PhD, MBA, MCIS Associate Professor Luma Akil. PhD-C Dept. Biology, Jackson State University COMSATS May 16, 2012 My Journey Two Research Projects: • NIH Biostatistics Support Unit • DOS Biostatistics Consulting Center Broiler Growth Modeling Broiler Growth Modeling • Determination of optimal nutrient requirements in broiler requires an adequate description of bird growth and body composition using growth curves. • One-way to calculate nutrient requirements and predict the feed intake of growing broilers needs to start with the understanding of the genetic potential. Broiler Growth Modeling • Typically nutritionists takes one nutrient at a time and keeping everything else constant monitor the body growth over certain period of time on a graded level of that particular nutrient, ignoring overall system approach and thus missing the bigger picture. Gompertz non linear Regression Equation • W = A exp[−log(A/B)exp(−Kt)]. • W is the weight to age (t) with 3 parameters: • A is asymptotic or maximum growth response, • B is intercept or weight when age (t) = 0, • K is rate constant.] Broiler Growth Modeling • The current research project is a step towards a holistic system approach. • Using all the available information from published literature, simulated and experimental data artificial intelligence (AI) models were developed using body growth for future energy requirements in broilers. Neural Networks • A neural network is a mathematical model of an information processing structure that is loosely based on the working of human brain. • An artificial neural network consists of large number of simple processing elements connected to each other in a network. Neural Networks-Neurons Schematic View of Artificial Neural Network Output Layers Hidden Layers Input Layers Data Collection • Average daily body weights of 25 male broiler chicks (Ross x Ross 308) from day 1- day 70, were selected from a recently published report (Roush et al., 2006). • These averages were converted into 14 interval classes of 5 days each, with means and standard deviations. • These five day-interval classes of broiler growth reflect accurate growth patterns and curves in broilers. 5-D Average Bwt Interval 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Mean 55.20 138.20 312.20 588.80 945.40 1354.00 1826.40 2334.60 2828.60 3292.40 3765.80 4159.80 4488.20 4708.80 Standard 95% confidence Deviation interval 14.46 6.34 38.80 17.01 70.95 31.09 104.71 45.89 120.20 52.68 135.07 59.20 162.98 71.43 166.98 73.18 147.32 64.57 152.40 66.79 137.95 60.46 110.91 48.61 94.15 41.26 32.87 14.40 Lower 95% CI 48.86 121.19 281.11 542.91 892.72 1294.80 1754.97 2261.42 2764.03 3225.61 3705.34 4111.19 4446.94 4694.40 Upper RiskNormal 95% CI Distribution 61.54 55.20 155.21 138.20 343.29 312.20 634.69 588.80 998.08 945.40 1413.20 1354.00 1897.83 1826.40 2407.78 2334.60 2893.17 2828.60 3359.19 3292.40 3826.26 3765.80 4208.41 4159.80 4529.46 4488.20 4723.20 4708.80 Data Simulation • The 14 means and standard deviations were used to simulate broiler growth data from day 1 to day 70, with Normal distribution of @Risk software (Palisade Corporation). • Six simulations with one thousand iterations each were performed. Data Manipulation and Simulation • On each 14 interval classes of 5days, 100 data points were randomly drawn for a total of 1400 observations, representing 20 observations for each day up to 70 days. • Only 50 days data were used for this research and were arranged in one row of spreadsheet to determine training examples for neural network training. Neural Networks Training • Out of 1400 data points, 750 training examples (epoch) for NN training were generated Neural Networks Training • Out of 1400 data points, 750 training examples (epoch) for NN training were generated. • Starting from the day one, the first four body weight observations were used as inputs while the fifth day body weight observation were used as output, to constitute one training example (epoch). Neural Networks Training • The second training example consisted of second, third, fourth, and fifth observations of day one body weight, as inputs, while the sixth observation of day one as output. • There were a total of 750 such examples (epoch) generated with the simulated data. Neural Networks Testing • For the NN testing the same procedure of epoch generation was applied. • However, instead of simulated data, the original body weight data was used for a total of 50 epoch. Age (d) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 BW (g) 43 43 51 62 77 93 113 134 159 192 227 265 308 355 406 460 520 586 654 724 796 868 944 1018 1101 1185 1270 1349 1438 1528 1632 1704 1830 1936 2030 2122 2230 2335 2442 2544 2650 2739 2801 2942 3011 3089 3207 3295 3395 3476 I1 43 43 51 62 77 93 113 134 159 192 227 265 308 355 406 460 520 586 654 724 796 868 944 1018 1101 1185 1270 1349 1438 1528 1632 1704 1830 1936 2030 2122 2230 2335 2442 2544 2650 2739 2801 2942 3011 3089 3207 3295 3395 3476 I2I 43 51 62 77 93 113 134 159 192 227 265 308 355 406 460 520 586 654 724 796 868 944 1018 1101 1185 1270 1349 1438 1528 1632 1704 1830 1936 2030 2122 2230 2335 2442 2544 2650 2801 2801 2942 3011 3089 3207 3295 3395 3476 3568 I3 51 62 77 93 113 134 159 192 227 265 308 355 406 460 520 586 654 724 796 868 944 1018 1101 1185 1270 1349 1438 1528 1632 1704 1830 1936 2030 2122 2230 2335 2442 2544 2650 2739 2942 2942 3011 3089 3207 3295 3395 3476 3568 3706 I4 62 77 93 113 134 159 192 227 265 308 355 406 460 520 586 654 724 796 868 944 1018 1101 1185 1270 1349 1438 1528 1632 1704 1830 1936 2030 2122 2230 2335 2442 2544 2650 2739 2801 3011 3011 3089 3207 3295 3395 3476 3568 3706 3770 O 77 93 113 134 159 192 227 265 308 355 406 460 520 586 654 724 796 868 944 1018 1101 1185 1270 1349 1438 1528 1632 1704 1830 1936 2030 2122 2230 2335 2442 2544 2650 2739 2801 2942 3089 3089 3207 3295 3395 3476 3568 3706 3770 3867 Epoch 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Results- Neural Networks Comparisons Output: BP3 Output: BP5 Output: Ward R squared: 0.998 R squared: 0.967 R squared: 0.973 r squared: 0.998 r squared: 0.994 r squared: 0.9986 Mean squared error: 3514 Mean squared error: 47376 Mean squared error: 38949 Mean absolute 51.198 error: Mean absolute190.369 error: Mean absolute 182.532 error: Min. absolute error: 0.453 Min. absolute error: 0.058 Min. absolute error: 35.495 Max. absolute118.192 error: Max. absolute 516.575 error: Max. absolute error: 415.5 Correlation coefficient 0.999 r:Correlation coefficient 0.997r:Correlation coefficient 0.9993 r: Percent within 5%: 64.000 Percent within 5%: 4.000 Percent within 5%: 0 Percent within 5% 20.000 to 10%: Percent within 5% 34.000 to 10%: Percent within 5% 28 to 10%: Percent within 10% 2.000 to 20%: Percent within 10% 24.000 to 20%: Percent within 10%38to 20% Percent within 20% 2.000 to 30%: Percent within 20% 32.000 to 30%: Percent within 20%12to 30% Percent over 30%: 12.000 Percent over 30%: 6.000 Percent over 30%: 22 BP3- NN Body Weight Prediction with Roush Test Data 4500 2500 Actual BP3Network 1500 Act-BP3Net Age (d) 49 46 43 40 37 34 31 28 25 22 19 16 13 10 7 -500 4 500 1 BWt (g) 3500 BP5- NN Body Weight Prediction with Roush Test Data 3500 Actual 2500 BP5Network Act-BP5Net 1500 Age (d) 49 46 43 40 37 34 31 28 25 22 19 16 13 10 7 -500 4 500 1 BWt (g) 4500 Ward NN Body Weight Prediction with RoushTest Data 4500 Actual 2500 WardNetwork 1500 Age (d) 49 46 43 40 37 34 31 28 25 22 19 16 13 10 -500 7 500 4 Act-WardNet 1 BWt (g) 3500 Conclusion • A holistic system approach encompasses all aspects of the problem instead fragmented solution. • Simulated approach is efficient and feasible once the underlying variables and parameters of the problem are clearly understood and defined. Conclusion • Ahmad, H. A, 2009. Poultry Growth Modeling using Neural Networks and Simulated Data. J. Appl. Poultry Res. 18:440-446 • Ahmad, H.A., M. Mariano, 2006. Comparison of Forecasting Methodologies using Egg Price as a Test Case. Poultry Science 85:798-807 Conclusion • BP3 NN gives the best prediction lines with near perfect R² (0.998) value out of all the NN architecture utilized. • NN modeling approach is an efficient and better alternative to its traditional statistical counterparts. • Further research on energy and amino acid modeling will enhance our understanding of these modeling approaches. Egg Production Modeling Egg Production Modeling Mathematical and Stochastic Egg Production Models • Compartmentalization model (Grossman and Koops, 2001) • Very complex • Stochastic model (Alvarez and Hocking, 2007) • Require too many variables to determine egg production. • Impractical under most commercial situations. Data Collection • Average weekly egg production data of 240 layers from 22 US commercial strains were graciously provided by David A. Roland of Auburn University. • The data was collected in three phases: wk 21-36; wk 37- 52; wk 53-66. • Daily egg production data of 22 strains were computed into 7 day-hen production means and standard deviations. Egg Production Curve • • • • • Egg production curve in commercial layers follows a unique pattern: Production begins around 20 weeks of age starting slowly around 5%, increase weekly for the next 7-8 weeks until attain peak production of 95-97% around week 28. During next 8-10 weeks, egg production maintains a plateau around its peak production. Egg production during week 38-52 (phase II), slowly start declining. During phase III (wk 53-72), production declines further until it reaches a point of non-profitability, around 60% Data Collection • The original data was used to map egg production curves in three different phases, compare curves among 22 commercial strains and compare strains laying white eggs with those laying brown eggs. wk 21 wk 23 wk 25 wk 27 wk 29 wk 31 wk 33 wk 35 wk 37 wk 39 wk 41 wk 43 wk 45 wk 47 wk 49 wk 51 wk 53 wk 55 wk 57 wk 59 wk 61 wk 63 wk 65 Egg Production % Egg Production Curve 120 100 80 60 40 20 0 Age from wk 21 k2 1 w k2 2 w k2 3 w k2 4 w k2 5 w k2 6 w k2 7 w k2 8 w k2 9 w k3 0 w k3 1 w k3 2 w k3 3 w k3 4 w k3 5 w k3 6 w Egg Production % Egg Production Curve-phase I Phase I 120 100 80 60 40 20 0 Age from wk 21 Age from wk 37 w w w w w w w w w w w w w w w w k5 2 k5 1 k5 0 k4 9 k4 8 k4 7 k4 6 k4 5 k4 4 k4 3 k4 2 k4 1 k4 0 k3 9 k3 8 k3 7 Egg Production % Egg Production Curve-phase II Phase II 99 98 97 96 95 94 93 92 91 Egg Production Curve-phase III Egg Production % Phase III 96 94 92 90 88 86 84 82 80 w 3 k5 w 4 k5 w 5 k5 w 6 k5 w 7 k5 w 8 k5 w 9 k5 w 0 k6 w 1 k6 w Age from wk 53 2 k6 w 3 k6 w 4 k6 w 5 k6 w 6 k6 Comparative Egg Production Brown vs. White Layers 120 EP% 100 80 60 40 Brown Av 20 White Av 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 Age from wk 22 Comparative Egg Production Curves of US Commercial Strains-phase I 120 Egg Production % 100 80 Only strain 1 was different from 13 60 40 20 0 1 2 3 4 5 6 7 8 9 10 Age from wk 22 11 12 13 14 15 16 Egg Production Modeling • For the current project, data from one of the brown egg laying strain was chosen for simulation and training of neural networks. • Mean weekly egg weights and standard deviations from wk 22-36 were computed. • These parameters were fed into Normal Distribution of @Risk 4.0 software (Palisade Corporation). • Six simulations with one thousand iterations each were performed. Data Simulation and Manipulation • On each 15 set of mean and standard deviation representing each week of production from week 22 to week 36, 20 data points were randomly drawn for a total of 300 observations, for training the neural networks. • Similarly for testing the neural networks, ten random data points were drawn. Data Simulation and Manipulation • Each of 20 training and 10 test observations, for each week egg production, were arranged in one row of a spreadsheet to determine 105 neural network examples, respectively. Neural Networks Training • Starting from the wk 22, the first four egg production observations were used as inputs while the fifth observation were used as output, to constitute one training example (epoch). Neural Networks Training • The second training example consisted of second, third, fourth, and fifth observations of egg production, as inputs, while the sixth observation as output. • There were a total of 105 such examples (epoch) generated with the simulated data. • For the NN testing the same procedure of epoch generation was applied. NN Examples (Epoch) Age Eg. I1 I2 I3 I4 O wk22 d1, epoch1 6.67 12.62 21.11 54.44 10.86 wk22 d2, epoch2 12.62 21.11 54.4 10.86 24.06 wk22 d3, epoch3 21.11 54.44 10.86 24.06 14.16 wk22 d4 54.44 10.86 24.06 14.16 19.67 wk22 d5 10.86 24.06 14.16 19.67 40.96 wk22 d6 24.06 14.16 19.67 40.96 54.21 wk22 d7 14.16 19.67 40.96 54.21 46.47 wk23 d8, epoch8 32.09 24.4 33.3 48.34 78.37 wk23 d9 24.4 33.3 48.34 78.37 47.75 wk23 d10 33.3 48.34 78.37 47.75 52.19 wk23 d11 48.34 78.37 47.75 52.19 73.86 wk23 d12 78.37 47.75 52.19 73.86 50.92 wk23 d13 47.75 52.19 73.86 50.92 49.9 wk23 d14 52.19 73.86 50.92 49.9 44.69 BP-3 Model BP-3 NN Strain 1, phase 1 Actual BP-3NN Act-Net 150 EP% 100 R² = 0.68 50 0 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101 -50 -100 Age(d) GRNN Model Actual GRNN Act-Net GRNN Strain 1, phaseI 150 R² = 0.72 EP% 100 50 0 1 8 15 22 29 36 43 50 57 64 -50 -100 Age(d) 71 78 85 92 99 WARD-5 Model Ward-5 NN Strain 1, phaseI Actual Ward-5NN Act-Net 150 EP% 100 50 0 -50 1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91 97 103 -100 Age(d) Results- Neural Networks Comparisons Parameter BP-3 GRNN Ward-5 R squared: 0.681 0.715 0.697 r squared: 0.78 0.75 0.76 Mean squared error: 136.09 121.51 129.38 Mean absolute error: 7.81 7.28 7.40 Min. absolute error: 0.04 0.12 0.09 Max. absolute error: 46.54 43.54 48.40 Correlation coefficient r: 0.88 0.86 0.87 Percent within 5%: 45.71 49.52 50.48 Percent within 5% to 10%: 25.71 24.76 20.95 Percent within 10% to 20%: 16.19 15.24 18.10 Percent within 20% to 30%: 5.71 2.86 4.76 Percent over 30%: 6.67 7.62 5.71 GRNN comparison with brown strains, phase I GRNN Brown strains, phase I 120 100 EP% 80 60 40 20 0 1 2 3 4 5 6 7 8 9 Age from w k 22 10 11 12 13 14 15 GRNN comparison with white strains, phase I GRNN White strains, phase I 120 EP% 100 GRNN 80 60 40 White strains average 20 0 1 2 3 4 5 6 7 8 Age from w k 22 9 10 11 12 13 14 15 GRNN comparison with commercial strains, phase I Strains comparisons with GRNN, phase I-EP% 120 100 EP% 80 GRNN 60 40 20 0 1 2 Series1 Series8 Series15 Series22 3 4 Series2 Series9 Series16 Series23 5 6 Series3 Series10 Series17 Series24 7 Series4 Series11 Series18 Series25 8 9 Strains Series5 Series12 Series19 10 11 Series6 Series13 Series20 12 13 Series7 Series14 Series21 14 15 Conclusion • Data simulation may offer a feasible alternative to expensive original data collection once the underlying variable’s parameters are defined and understood. • Compared to other mathematical and statistical models neural networks predicted egg production curve accurately and efficiently. Conclusion • All three architect of NN produced comparable results in terms of R² that varied from 0.681 to 0.715. • All the networks over predicted during the initial period when egg production was peaking. Conclusion • Once the initial over-prediction anomaly is corrected, NN modeling approach will be an efficient and superior to its traditional mathematical and statistical counterparts for egg production prediction. • Further research on energy and amino acid modeling will enhance our understanding of these modeling approaches. Conclusion • Ahmad, H. A, 2011. Egg Production Forecasting: Determining efficient modeling approaches. J. Appl. Poultry Res. 20(#4):463-473. NIHMSID # 365792 + CD4 Pattern Recognition using GRNN in HIV Patients GRNN_CD4 Prediction CD4 1000 900 800 700 R2 = 0.92 Actual CD4 600 GRNN 500 400 300 200 100 0 1 3 5 7 9 11 13 15 17 19 Pattern 21 23 25 27 29 31 33 35 CD4+ prediction with Viral Load using BP-3NN 1400 1200 CD4 Prediction with BP-3 Actual CD4 1000 CD 4 Values 800 BP3 CD4 600 400 200 0 R squared: Viral Load Immunology 101 • CD4 is a protein that is present on Thelper lymphocytes (blood cells that respond to foreign substances). • Thelper do not kill the cells that carries the antigen but interact with B-lymphocytes or Tkiller lymphocytes to respond to antigens. • Simple tests are available for CD4 proteins that in turn identify and count Thelper lymphocytes. BPNN for CD4 Prediction with VL Actual CD4 CD 4 200 180 160 140 120 100 80 60 40 20 0 R2 = .1392 NN Prediction 1 15 29 43 57 71 85 99 113 127 141 155 169 NN predicted CD4 Test data GRNN for CD4 Prediction with VL 1400 R2 = 0.194 CD4 1200 Actual CD4 1000 800 GRNN pred. CD4 600 400 200 0 1 7 13 19 25 31 37 43 49 55 61 67 73 79 Test Data 85 91 97 103 109 115 121 127 133 139 145 151 157 16 GRNN Prediction for Viral Load with CD4 Viral load GRNN pred VL 1200000 Actual viral load 1000000 800000 600000 400000 200000 0 1 17 33 49 65 81 97 113 129 145 161 177 Test data BP-3NN CD4 Prediction using Pattern CD4 R2 = .84 Actual CD4 1200 1000 NN CD4 800 600 400 200 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 Test data High Blood Pressure vs. Obesity Ozone Modeling Salmonella Outbreaks
© Copyright 2026 Paperzz