International Conference on Management and Information Systems, September 18-20, 2015

Fitting of Logistic Regression Model for Prediction of Likelihood of India Winning or Losing in Cricket Match

Monalisha Pattnaik, Utkal University, Bhubaneswar
Anima Bag, C.V. Raman Group of Institutions, Janla

The present study focuses on predicting the likelihood of India winning or losing a One Day International (ODI) cricket match against Australia by fitting a logistic regression model. According to the ICC ODI Championship rating dated 7th August 2015, India holds 2nd position with 5875 points and a rating of 115 from 51 matches. Data from actual recent matches, with five independent variables and one binary dependent variable, are used throughout to illustrate the application of mathematical and statistical principles to a practical problem in one-day international cricket.

Keywords: Prediction, ODI, Logistic regression model, Cricket, Binary logistic

1. Introduction

The International Cricket Council (ICC) ODI Championship is an international one-day cricket competition run by the ICC. A One Day International (ODI) is a form of limited overs cricket in which each team faces a fixed number of overs, usually fifty. The first ODI was played on 5 January 1971 between Australia and England at the Melbourne Cricket Ground; at that time ODIs were played in white kits with a red ball. One Day International matches are also called Limited Overs Internationals (LOI) and are a late twentieth-century development. According to the ICC ODI Championship rating dated 7th August 2015, India holds 2nd position with 5875 points and a rating of 115 from 51 matches. Cricket is a hugely popular sport around the world. An estimated three billion people are cricket fans, a figure exceeded only by soccer, which has an estimated 3.5 billion fans.
In recent years cricket's governing body, the International Cricket Council (ICC), has sought to make cricket even more popular. One strategy the ICC has adopted is to introduce Twenty20 (T20), a shorter format of the game, with the intention of making cricket a faster, more exciting spectacle that might attract a new audience. One-day cricket, a shortened version of the normal game of cricket, is played between two teams of 11 players each on an approximately oval field with semi-major and semi-minor axes of approximately 70 m and 55 m, respectively. The field has a central "pitch", approximately 20 m long, along the major axis.

To predict the likelihood of India winning or losing an ODI match against Australia, a logistic regression model is framed. Data are collected from 15 recent matches with 5 independent variables and 1 dependent variable recording whether India won or lost.

2. Logistic Regression Analysis

Logistic regression is generally preferred when the dependent variable has only two categories. Logistic regression fits an S-shaped curve to the data. This curve ensures two things: first, that the predicted values always lie between 0 and 1; and second, that the predicted values correspond to the probability of Y being 1 (win) rather than 0 (lose) in the present study. To achieve this, a regression is first performed on a transformed value of Y, called the logit function. The equation is:

$$\operatorname{logit}(Y) = \ln(\text{odds}) = \ln\left(\frac{p}{1-p}\right) = a + b_1 x_1 + b_2 x_2$$

where odds refers to the odds of Y being equal to 1. Odds and probabilities are related by odds $= p/(1-p)$, where $p$ is the probability that Y equals 1. In the above equation $a$, $b_1$ and $b_2$ are constants and $x_1$ and $x_2$ are independent variables.

Case Analysis-1

We wish to predict the likelihood of India winning a One Day International match against Australia. Data were collected from 15 recent matches on the following variables.
Viratscore is the score of Virat Kohli in the match. Since Kohli's batting is seen as instrumental to India's chances, we wish to see whether his score in fact has an impact on India's victories. Does batting first help or hinder? The variable Batfirst is coded as 1 if India bats first, otherwise 0. Taking early wickets helps, so say the experts. Wicket10 shows the number of Australian wickets to fall in the first 10 overs of their batting. The scores of opener-1 and opener-2 are taken as indicators of whether India wins or loses the match; these variables are Scoreopener-1 and Scoreopener-2. Finally, Indiawin is the dependent variable (1 = victory, 0 = loss). Table 1 shows the input data of 15 recent ODI matches against Australia.

Table 1 Input Data of India vs Australia

| Sl. No. | Date | Place | Virat Kohli Score | Batting First | Wickets Taken in First 10 Overs | Score of Opener-1 | Score of Opener-2 | India Win |
|---|---|---|---|---|---|---|---|---|
| 1 | 26.05.2015 | Sydney | 1 | 0 | 1 | 34 | 45 | 0 |
| 2 | 18.01.2015 | Melbourne | 9 | 1 | 2 | 138 | 2 | 0 |
| 3 | 02.11.2013 | Bangalore | 3 | 1 | 1 | 209 | 60 | 1 |
| 4 | 30.10.2013 | Nagpur | 115 | 0 | 1 | 79 | 100 | 1 |
| 5 | 19.10.2013 | Mohali | 68 | 1 | 0 | 11 | 8 | 0 |
| 6 | 16.10.2013 | Jaipur | 100 | 0 | 0 | 141 | 95 | 1 |
| 7 | 13.10.2013 | Pune | 61 | 0 | 0 | 7 | 42 | 0 |
| 8 | 26.02.2012 | Sydney | 21 | 0 | 2 | 5 | 14 | 0 |
| 9 | 19.02.2012 | Brisbane | 12 | 0 | 0 | 5 | 3 | 0 |
| 10 | 12.02.2012 | Adelaide Oval | 18 | 0 | 2 | 92 | 20 | 1 |
| 11 | 05.02.2012 | Melbourne | 31 | 0 | 2 | 5 | 2 | 0 |
| 12 | 24.05.2011 | Ahmadabad | 24 | 0 | 1 | 15 | 53 | 1 |
| 13 | 20.10.2010 | Visakhapatnam | 118 | 0 | 1 | 0 | 15 | 1 |
| 14 | 02.11.2009 | Mohali | 10 | 0 | 2 | 30 | 40 | 0 |
| 15 | 25.10.2009 | Vadodara | 30 | 0 | 1 | 13 | 14 | 0 |

Objective

To predict the likelihood of India winning a One Day International match against Australia.

Hypothesis

H0: There is no significant difference between the observed value and the model prediction.
H1: There is a significant difference between the observed value and the model prediction.

Interpretation

Table 2 shows the classification of India winning or losing an ODI match against Australia. The overall correct classification rate of the model in Table 2 is 100%, which indicates that the logistic regression model fits the data well.
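The 100% classification rate can be checked directly from the paper's own numbers. The sketch below is a hedged illustration, not the authors' SPSS procedure: it takes the match data from Table 1 and the rounded coefficients B reported in Table 4, computes the linear predictor for each match, and classifies with the .500 cut value. The names `MATCHES`, `COEF`, `predict_class` and `classification_table` are ours. The SPSS indicator coding of the BATFIRST(1) term is not stated in the paper, so the code accepts either coding through the hypothetical `flip_batfirst` flag.

```python
import math

# (kohli, batfirst, wickets_10, opener1, opener2, india_win), copied from Table 1
MATCHES = [
    (1, 0, 1, 34, 45, 0),
    (9, 1, 2, 138, 2, 0),
    (3, 1, 1, 209, 60, 1),
    (115, 0, 1, 79, 100, 1),
    (68, 1, 0, 11, 8, 0),
    (100, 0, 0, 141, 95, 1),
    (61, 0, 0, 7, 42, 0),
    (21, 0, 2, 5, 14, 0),
    (12, 0, 0, 5, 3, 0),
    (18, 0, 2, 92, 20, 1),
    (31, 0, 2, 5, 2, 0),
    (24, 0, 1, 15, 53, 1),
    (118, 0, 1, 0, 15, 1),
    (10, 0, 2, 30, 40, 0),
    (30, 0, 1, 13, 14, 0),
]

# Coefficients B as printed in Table 4 (rounded to three decimals)
CONSTANT = -439.502
COEF = (2.651, 7.659, 50.146, 2.027, 5.719)


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_class(row, flip_batfirst=False, cut=0.5):
    """Classify one match as win (1) or loss (0) at the given cut value."""
    kohli, batfirst, wickets, op1, op2, _ = row
    # ASSUMPTION: the coding of BATFIRST(1) is unknown, so both are allowed.
    x2 = 1 - batfirst if flip_batfirst else batfirst
    xs = (kohli, x2, wickets, op1, op2)
    z = CONSTANT + sum(b * x for b, x in zip(COEF, xs))
    return 1 if sigmoid(z) >= cut else 0


def classification_table(rows, **kwargs):
    """Count matches by (observed, predicted) class."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for row in rows:
        table[(row[-1], predict_class(row, **kwargs))] += 1
    return table


table = classification_table(MATCHES)
accuracy = 100.0 * (table[(0, 0)] + table[(1, 1)]) / len(MATCHES)
print(table, accuracy)
```

With these inputs the counts agree with Table 2 under either coding of the batting-first term. The very large standard errors in Table 4 are what one would typically expect when a model separates the two outcomes perfectly on a small sample, so the individual coefficients are probably best read with caution.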
The model classifies wins and losses equally well: all 9 losses and all 6 wins are predicted correctly. The Hosmer and Lemeshow test, a chi-square goodness-of-fit test, examines how well the model fits. Table 3 shows that the P-value is 1.0, which is greater than 0.05, so we fail to reject the null hypothesis at the 5% level of significance. This indicates that there is no significant difference between the observed values and the values predicted by the model. In other words, the model fits well.

Table 4 shows the logistic regression model with one categorical dependent variable (Y, India win) and 5 independent variables: $x_1$, Virat Kohli's score; $x_2$, batting first (categorical variable); $x_3$, number of wickets taken in the first 10 overs; $x_4$, score of opener-1; and $x_5$, score of opener-2. From the model it is observed that India winning is directly related to all five independent variables, since every coefficient B in Table 4 is positive. The multivariate logistic regression model is:

$$\ln\left(\frac{p}{1-p}\right) = -439.502 + 2.651 x_1 + 7.659 x_2 + 50.146 x_3 + 2.027 x_4 + 5.719 x_5$$

where $p$ is the probability that India wins and $x_2$ enters through the BATFIRST(1) term of Table 4.

Table 5 shows the predicted probabilities and classification of India winning and losing in ODI matches against Australia. Figure 1 shows the observed groups and predicted probabilities classification of the given data. Figure 2 shows the logistic regression S-curve of the predicted value against one independent variable, Virat Kohli's score.

Table 2 Classification Table

| Observed (Step 1) | Predicted INDIAWIN = .00 | Predicted INDIAWIN = 1.00 | Percentage Correct |
|---|---|---|---|
| INDIAWIN = .00 | 9 | 0 | 100.0 |
| INDIAWIN = 1.00 | 0 | 6 | 100.0 |
| Overall Percentage | | | 100.0 |

a. The cut value is .500

Table 3 Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | .000 | 5 | 1.000 |

Table 4 Variables in Logistic Regression Model

| Step 1a | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| VIRATSCOR | 2.651 | 217.169 | .000 | 1 | .990 | 14.171 |
| BATFIRST(1) | 7.659 | 183830.908 | .000 | 1 | 1.000 | 2119.718 |
| WKTBTENOV | 50.146 | 4775.057 | .000 | 1 | .992 | 5999877122161316000000.000 |
| SCOOPNR | 2.027 | 175.255 | .000 | 1 | .991 | 7.588 |
| SCOROPEN | 5.719 | 475.773 | .000 | 1 | .990 | 304.734 |
| Constant | -439.502 | 186743.350 | .000 | 1 | .998 | .000 |

a. Variable(s) entered on step 1: VIRATSCOR, BATFIRST, WKTBTENOV, SCOOPNR, SCOROPEN.

Table 5 True Value and Predicted Value of Likelihood of India Winning an ODI Match against Australia

| Sl. No. | Kohli Score | Batfirst | Wickets Taken in First 10 Overs | Score of Opener-1 | Score of Opener-2 | India Win | Predicted Probability | Predicted Value |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 34 | 45 | 0 | .00000 | 0.00 |
| 2 | 9 | 1 | 2 | 138 | 2 | 0 | .00000 | 0.00 |
| 3 | 3 | 1 | 1 | 209 | 60 | 1 | 1.00000 | 1.00 |
| 4 | 115 | 0 | 1 | 79 | 100 | 1 | 1.00000 | 1.00 |
| 5 | 68 | 1 | 0 | 11 | 8 | 0 | .00000 | 0.00 |
| 6 | 100 | 0 | 0 | 141 | 95 | 1 | 1.00000 | 1.00 |
| 7 | 61 | 0 | 0 | 7 | 42 | 0 | .00000 | 0.00 |
| 8 | 21 | 0 | 2 | 5 | 14 | 0 | .00000 | 0.00 |
| 9 | 12 | 0 | 0 | 5 | 3 | 0 | .00000 | 0.00 |
| 10 | 18 | 0 | 2 | 92 | 20 | 1 | 1.00000 | 1.00 |
| 11 | 31 | 0 | 2 | 5 | 2 | 0 | .00000 | 0.00 |
| 12 | 24 | 0 | 1 | 15 | 53 | 1 | 1.00000 | 1.00 |
| 13 | 118 | 0 | 1 | 0 | 15 | 1 | 1.00000 | 1.00 |
| 14 | 10 | 0 | 2 | 30 | 40 | 0 | .00000 | 0.00 |
| 15 | 30 | 0 | 1 | 13 | 14 | 0 | .00000 | 0.00 |

[Figure 1: SPSS plot for step 1 of observed groups and predicted probabilities. Horizontal axis: predicted probability of membership for 1.00, from 0 to 1; vertical axis: frequency. The cut value is .50; symbols 0 and 1 denote INDIAWIN = .00 and 1.00; each symbol represents 1 case.]

Figure 1 Observed Groups and Predicted Probabilities Classification

[Figure 2: chart titled "India Winning vs Kohli Score". Horizontal axis: Kohli Scoring, 0 to 150; vertical axis: India Winning, -0.2 to 1.2; series: predicted value with a polynomial trend line.]

Figure 2 Logistic Regression Curve of Predicted Value and Virat Kohli Scoring

Case Analysis-2

We wish to predict the likelihood of India winning a One Day International match against Australia. Data were collected from 15 recent matches on the following variables. Viratscore is the score of Virat Kohli in the match. Since Kohli's batting is seen as instrumental to India's chances, we wish to see whether his score in fact has an impact on India's victories. Does batting first help or hinder? The variable Batfirst is coded as 1 if India bats first, otherwise 0. Taking early wickets helps, so say the experts. Wicket10 shows the number of Australian wickets to fall in the first 10 overs of their batting. Finally, Indiawin is the dependent variable (1 = victory, 0 = loss). Table 6 shows the input data of 15 recent ODI matches against Australia.

Objective

To predict the likelihood of India winning a One Day International match against Australia.

Hypothesis

H0: There is no significant difference between the observed value and the model prediction.
H1: There is a significant difference between the observed value and the model prediction.

Interpretation

Table 7 shows the classification of India winning or losing an ODI match against Australia. The overall correct classification rate of the model in Table 7 is 73.3%, which indicates that the logistic regression model fits the data reasonably well. The model predicts India's losses (88.9% correct) better than it predicts India's wins (50.0% correct) in ODI matches against Australia. The Hosmer and Lemeshow test, a chi-square goodness-of-fit test, examines how well the model fits. Table 8 shows that the P-value is 0.058, which is greater than 0.05, so we fail to reject the null hypothesis at the 5% level of significance.
This indicates that there is no significant difference between the observed values and the values predicted by the model. In other words, the model fits well.

Table 9 shows the logistic regression model with one categorical dependent variable (Y, India win) and 3 independent variables: $x_1$, Virat Kohli's score; $x_2$, batting first; and $x_3$, number of wickets taken in the first 10 overs. From the model it is observed that India winning is directly related to two of the independent variables, Kohli's score and the number of early wickets, while the coefficient of the batting-first term BATFIRST(1) is negative. The multivariate logistic regression model is:

$$\ln\left(\frac{p}{1-p}\right) = -2.163 + 0.032 x_1 - 0.234 x_2 + 0.555 x_3$$

where $p$ is the probability that India wins and $x_2$ enters through the BATFIRST(1) term of Table 9.

Table 10 shows the predicted probabilities and classification of India winning and losing in ODI matches against Australia. Figure 3 shows the observed groups and predicted probabilities classification of the given data.

Table 6 Input Data of India vs Australia

| Sl. No. | Date | Place | Virat Kohli Score | Batting First | Wickets Taken in First 10 Overs | India Win |
|---|---|---|---|---|---|---|
| 1 | 26.05.2015 | Sydney | 1 | 0 | 1 | 0 |
| 2 | 18.01.2015 | Melbourne | 9 | 1 | 2 | 0 |
| 3 | 02.11.2013 | Bangalore | 3 | 1 | 1 | 1 |
| 4 | 30.10.2013 | Nagpur | 115 | 0 | 1 | 1 |
| 5 | 19.10.2013 | Mohali | 68 | 1 | 0 | 0 |
| 6 | 16.10.2013 | Jaipur | 100 | 0 | 0 | 1 |
| 7 | 13.10.2013 | Pune | 61 | 0 | 0 | 0 |
| 8 | 26.02.2012 | Sydney | 21 | 0 | 2 | 0 |
| 9 | 19.02.2012 | Brisbane | 12 | 0 | 0 | 0 |
| 10 | 12.02.2012 | Adelaide Oval | 18 | 0 | 2 | 1 |
| 11 | 05.02.2012 | Melbourne | 31 | 0 | 2 | 0 |
| 12 | 24.05.2011 | Ahmadabad | 24 | 0 | 1 | 1 |
| 13 | 20.10.2010 | Visakhapatnam | 118 | 0 | 1 | 1 |
| 14 | 02.11.2009 | Mohali | 10 | 0 | 2 | 0 |
| 15 | 25.10.2009 | Vadodara | 30 | 0 | 1 | 0 |

Table 7 Classification Table

| Observed (Step 1) | Predicted INDIAWIN = .00 | Predicted INDIAWIN = 1.00 | Percentage Correct |
|---|---|---|---|
| INDIAWIN = .00 | 8 | 1 | 88.9 |
| INDIAWIN = 1.00 | 3 | 3 | 50.0 |
| Overall Percentage | | | 73.3 |

a. The cut value is .500

Table 8 Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 12.203 | 6 | .058 |

Table 9 Variables in Logistic Regression Model (Variables in the Equation)

| Step 1a | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| KOHLISCOR | .032 | .021 | 2.346 | 1 | .126 | 1.032 |
| BATFIRST(1) | -.234 | 1.489 | .025 | 1 | .875 | .792 |
| WKTTEN | .555 | .903 | .377 | 1 | .539 | 1.741 |
| Constant | -2.163 | 1.856 | 1.358 | 1 | .244 | .115 |

a. Variable(s) entered on step 1: KOHLISCOR, BATFIRST, WKTTEN.
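The fitted equation for Case Analysis-2 can be checked against the data. The following sketch is an illustration under stated assumptions, not the authors' SPSS procedure: it applies the rounded Table 9 coefficients to the Table 6 data, compares the resulting probabilities with those printed in Table 10, and builds the classification table at the .500 cut value. One assumption matters here: the BATFIRST(1) term is taken to equal 1 for matches in which Batfirst = 0, because under that coding the printed coefficients reproduce the Table 10 probabilities to within rounding. The names `MATCHES`, `REPORTED`, `predicted_probability` and `classification_table` are ours.

```python
import math

# (kohli, batfirst, wickets_10, india_win), copied from Table 6
MATCHES = [
    (1, 0, 1, 0), (9, 1, 2, 0), (3, 1, 1, 1), (115, 0, 1, 1), (68, 1, 0, 0),
    (100, 0, 0, 1), (61, 0, 0, 0), (21, 0, 2, 0), (12, 0, 0, 0), (18, 0, 2, 1),
    (31, 0, 2, 0), (24, 0, 1, 1), (118, 0, 1, 1), (10, 0, 2, 0), (30, 0, 1, 0),
]

# Predicted probabilities as printed in Table 10, in the same match order
REPORTED = [
    0.14068, 0.31735, 0.18060, 0.86176, 0.50205, 0.68919, 0.38964, 0.35060,
    0.11782, 0.32912, 0.42626, 0.25439, 0.87278, 0.27536, 0.29239,
]

# Coefficients B as printed in Table 9 (rounded to three decimals)
B_CONSTANT = -2.163
B_KOHLI = 0.032
B_BATFIRST_1 = -0.234
B_WICKETS = 0.555


def predicted_probability(kohli, batfirst, wickets):
    """Probability of an India win from the logit model of Table 9."""
    # ASSUMPTION: BATFIRST(1) is 1 when India did NOT bat first (Batfirst = 0).
    indicator = 1 - batfirst
    z = (B_CONSTANT + B_KOHLI * kohli
         + B_BATFIRST_1 * indicator + B_WICKETS * wickets)
    return 1.0 / (1.0 + math.exp(-z))


def classification_table(rows, cut=0.5):
    """Count matches by (observed, predicted) class at the given cut value."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for kohli, batfirst, wickets, observed in rows:
        p = predicted_probability(kohli, batfirst, wickets)
        table[(observed, 1 if p >= cut else 0)] += 1
    return table


probs = [predicted_probability(k, b, w) for k, b, w, _ in MATCHES]
max_gap = max(abs(p - r) for p, r in zip(probs, REPORTED))
table = classification_table(MATCHES)
accuracy = 100.0 * (table[(0, 0)] + table[(1, 1)]) / len(MATCHES)
print(round(max_gap, 4), table, round(accuracy, 1))
```

If this coding assumption holds, batting first raises the log-odds of an India win by 0.234 relative to not batting first, so the negative sign of BATFIRST(1) should be read against its reference category rather than against the raw Batfirst column. With a significance value of .875 the effect is in any case far from significant.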
Table 10 True Value and Predicted Value of Likelihood of India Winning an ODI Match against Australia

| Sl. No. | Kohli Score | Batfirst | Wickets Taken in First 10 Overs | India Win | Predicted Probability | Predicted Value |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 1 | 0 | .14068 | 0.00 |
| 2 | 9 | 1 | 2 | 0 | .31735 | 0.00 |
| 3 | 3 | 1 | 1 | 1 | .18060 | 0.00 |
| 4 | 115 | 0 | 1 | 1 | .86176 | 1.00 |
| 5 | 68 | 1 | 0 | 0 | .50205 | 1.00 |
| 6 | 100 | 0 | 0 | 1 | .68919 | 1.00 |
| 7 | 61 | 0 | 0 | 0 | .38964 | 0.00 |
| 8 | 21 | 0 | 2 | 0 | .35060 | 0.00 |
| 9 | 12 | 0 | 0 | 0 | .11782 | 0.00 |
| 10 | 18 | 0 | 2 | 1 | .32912 | 0.00 |
| 11 | 31 | 0 | 2 | 0 | .42626 | 0.00 |
| 12 | 24 | 0 | 1 | 1 | .25439 | 0.00 |
| 13 | 118 | 0 | 1 | 1 | .87278 | 1.00 |
| 14 | 10 | 0 | 2 | 0 | .27536 | 0.00 |
| 15 | 30 | 0 | 1 | 0 | .29239 | 0.00 |

[Figure 3: SPSS plot for step 1 of observed groups and predicted probabilities. Horizontal axis: predicted probability of membership for 1.00, from 0 to 1; vertical axis: frequency. The cut value is .50; symbols 0 and 1 denote INDIAWIN = .00 and 1.00; each symbol represents .25 cases.]

Figure 3 Observed Groups and Predicted Probabilities Classification

3. Conclusion

The present study focused on predicting the likelihood of India winning or losing a One Day International (ODI) cricket match against Australia by fitting a logistic regression model. Two case analyses were carried out for 15 recent ODI matches: one with five independent variables and one binary dependent variable, and the other with three independent variables and one binary dependent variable.
The results, which predict whether India wins or loses an ODI match against Australia, illustrate the successful use of mathematical and statistical principles in solving a practical problem in one-day international cricket.

ISBN 978-1-943295-00-5