An integer programming approach to classification

Gurkan Ozturk · Refail Kasimbeyli
Abstract In this study we propose a novel multi-objective integer programming
approach for solving classification problems. Using a previously developed
classification algorithm based on polyhedral conic functions, we construct a
finite number of separating functions. The optimal classifier is then found
with respect to two criteria: maximizing the number of correctly classified
points and minimizing the number of separating functions used. The performance
of the developed method is demonstrated by testing it on several real-world
datasets.
1 Introduction
In this paper a new approach for solving the classification problem is
developed. The problem is solved by constructing a two-objective integer
programming model whose only decision variables are binary variables that
indicate whether the corresponding polyhedral conic function is selected.
The functions selected by solving this model are then used to form the final
classification function as their pointwise minimum.
Dr. Refail Kasimbeyli and Dr. Gurkan Ozturk are recipients of a Scientific and Technological Research Council of Turkey (TUBITAK) Research Project (Project number: 107M472).
G. Ozturk
Department of Industrial Engineering, University of Anadolu, Eskisehir, 26480, Turkey
E-mail: [email protected]
R. Kasimbeyli
Department of Industrial Systems Engineering, Izmir University of Economics, Izmir, 35330,
Turkey
E-mail: [email protected]
2 Polyhedral conic functions
Polyhedral conic functions (PCFs), a class of functions whose graph is a cone
and whose level sets are convex polyhedra, have recently been defined and used
to construct a separating function for two arbitrary disjoint finite point
sets [1]. Several mathematical programming approaches based on PCFs have been
developed and successfully used to solve classification problems.
These functions are formed as an augmented $l_1$-norm, that is, an $l_1$-norm
with a linear part added. The graph of such a function is a polyhedral cone,
and each of its sublevel sets is an intersection of at most $2^n$ half-spaces.
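A polyhedral conic function $g_{(w,\xi,\gamma,a)} : \mathbb{R}^n \to \mathbb{R}$ is defined as follows:
\[ g_{(w,\xi,\gamma,a)}(x) = \langle w, x - a \rangle + \xi \|x - a\|_1 - \gamma, \tag{1} \]
where $w, a \in \mathbb{R}^n$, $\xi, \gamma \in \mathbb{R}$, and $\|x\|_1 = |x_1| + \cdots + |x_n|$ is the $l_1$-norm of the vector $x \in \mathbb{R}^n$.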
When a PCF is defined as in Equation 1, the vertex of its graph is the point
$(a, -\gamma)$. The projection of this vertex onto $\mathbb{R}^n$, the point
$a$, can be regarded as the center of the PCF, and the separation performance
of the function depends directly on this point. Figure 1 shows how a PCF can
separate two sets A and B in $\mathbb{R}^2$, with three different situations
illustrated in panels (a), (b) and (c). In (a), although A and B are not
linearly separable, the obtained PCF separates the two sets completely.
Similarly, in (b) and (c), when new points are added to the set B, the
obtained functions again separate the two sets completely [1].
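Subject to the constraints
\[ \langle w, a_i - a_l \rangle + \xi \|a_i - a_l\|_1 - \gamma + 1 \le y_i, \quad \forall i \in I_l, \tag{2} \]
\[ -\langle w, b_j - a_l \rangle - \xi \|b_j - a_l\|_1 + \gamma + 1 \le 0, \quad \forall j \in J, \tag{3} \]
\[ y = (y_1, \ldots, y_m) \in \mathbb{R}^m_+, \quad w \in \mathbb{R}^n, \quad \xi \in \mathbb{R}, \quad \gamma \ge 1, \tag{4} \]
the linear programming subproblem $(P_l)$ is
\[ (P_l) \qquad \min \; \frac{e_m y}{m}. \tag{5} \]
Here $e_m$ is the vector of ones in $\mathbb{R}^m$, so the objective is the average of the $y_i$.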
An iterative algorithm that generates a nonlinear separating function from
polyhedral conic functions (PCFs), and is therefore called the PCF algorithm,
has been developed. The algorithm is based on the solution of linear
programming subproblems. At each iteration, the solution of the subproblem
yields a polyhedral conic function that separates a certain part of the set A
from the whole set B.
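To make the structure of one such linear programming subproblem concrete, the following is a minimal sketch that assembles its data in the generic form "minimize cost·z subject to rows·z ≤ rhs" with simple bounds, a layout that general-purpose LP solvers (for instance SciPy's `linprog`) accept. The function names `build_pcf_lp` and `is_feasible`, the variable ordering z = (w, ξ, γ, y), and the choice to include all points of A (that is, I_l = I) are illustrative assumptions, not part of the original method description. The example data are the sets of Fig. 1(b), and the center is taken at the origin because the function printed there has a = 0.

```python
def build_pcf_lp(A, B, center):
    """Assemble LP data for z = (w_1..w_n, xi, gamma, y_1..y_m).

    Assumption: every point of A gets a constraint (I_l = I).
    """
    n, m = len(center), len(A)

    def shifted(p):
        # Difference p - center and its l1-norm.
        d = [float(pk - ck) for pk, ck in zip(p, center)]
        return d, sum(abs(dk) for dk in d)

    # Objective: (1/m) * sum of y_i; w, xi, gamma carry no cost.
    cost = [0.0] * (n + 2) + [1.0 / m] * m
    rows, rhs = [], []
    for i, a in enumerate(A):
        d, l1 = shifted(a)
        slack = [0.0] * m
        slack[i] = -1.0
        # <w, d> + xi*l1 - gamma - y_i <= -1
        rows.append(d + [l1, -1.0] + slack)
        rhs.append(-1.0)
    for b in B:
        d, l1 = shifted(b)
        # -<w, d> - xi*l1 + gamma <= -1
        rows.append([-dk for dk in d] + [-l1, 1.0] + [0.0] * m)
        rhs.append(-1.0)
    # w and xi are free, gamma >= 1, y >= 0.
    bounds = [(None, None)] * (n + 1) + [(1.0, None)] + [(0.0, None)] * m
    return cost, rows, rhs, bounds


def is_feasible(z, rows, rhs, bounds, tol=1e-9):
    """Check a candidate z against the inequality rows and the bounds."""
    rows_ok = all(sum(r * v for r, v in zip(row, z)) <= b + tol
                  for row, b in zip(rows, rhs))
    bounds_ok = all((lo is None or v >= lo - tol) and (hi is None or v <= hi + tol)
                    for v, (lo, hi) in zip(z, bounds))
    return rows_ok and bounds_ok


A = [(2, -1), (2, -4), (3, -1), (4, -2)]
B = [(-2, 2), (-2, -2), (-2, -6), (2, 2), (8, 2), (1, -6)]
cost, rows, rhs, bounds = build_pcf_lp(A, B, center=(0.0, 0.0))

# Candidate taken from Fig. 1(b): w = (-2, 1), xi = 2, gamma = 5, all y_i = 0.
z = [-2.0, 1.0, 2.0, 5.0, 0.0, 0.0, 0.0, 0.0]
feasible = is_feasible(z, rows, rhs, bounds)
objective = sum(c * v for c, v in zip(cost, z))
```

Under these assumptions the function of Fig. 1(b) is feasible with all y_i = 0, so it attains the smallest possible objective value for this instance.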
Fig. 1 PCF. A = {(2, −1), (2, −4), (3, −1), (4, −2)} in all panels.
(a) B = {(−2, 2), (−2, −2), (−2, −6), (2, 2), (8, 2), (1, −6)}; g(x1, x2) = −0.5x1 + 0.5x2 + 0.5(|x1| + |x2|) − 1
(b) B = {(−2, 2), (−2, −2), (−2, −6), (2, 2), (8, 2), (1, −6)}; g(x1, x2) = −2x1 + x2 + 2(|x1| + |x2|) − 5
(c) B = {(−2, 2), (−2, −2), (−2, −6), (2, 2), (8, 2), (1, −6), (7, −4)}; g(x1, x2) = −1.9x1 + 1.1x2 + 2.3(|x1| + |x2|) − 6.6
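As a quick numerical check of the separation described for Fig. 1, the sketch below evaluates the functions of panels (b) and (c) on the listed points. The helper names `pcf` and `separates` are hypothetical, and the center a = (0, 0) is read off from the printed formulas, which contain |x1| + |x2| with no shift. A point is counted as assigned to A when g is negative and to B when g is positive.

```python
def pcf(w, xi, gamma, a):
    """Return g(x) = <w, x - a> + xi * ||x - a||_1 - gamma as a callable."""
    def g(x):
        d = [xk - ak for xk, ak in zip(x, a)]
        linear = sum(wk * dk for wk, dk in zip(w, d))
        l1 = sum(abs(dk) for dk in d)
        return linear + xi * l1 - gamma
    return g


def separates(g, A, B):
    """True if g is negative on every point of A and positive on every point of B."""
    return all(g(p) < 0 for p in A) and all(g(p) > 0 for p in B)


A = [(2, -1), (2, -4), (3, -1), (4, -2)]
B_b = [(-2, 2), (-2, -2), (-2, -6), (2, 2), (8, 2), (1, -6)]
B_c = B_b + [(7, -4)]  # panel (c) adds one point to B

g_b = pcf(w=(-2.0, 1.0), xi=2.0, gamma=5.0, a=(0.0, 0.0))
g_c = pcf(w=(-1.9, 1.1), xi=2.3, gamma=6.6, a=(0.0, 0.0))

sep_b = separates(g_b, A, B_b)         # panel (b) function on panel (b) data
sep_c = separates(g_c, A, B_c)         # panel (c) function on panel (c) data
old_on_new = separates(g_b, A, B_c)    # panel (b) function on panel (c) data
```

With these values both panel functions separate their own data, while the panel (b) function no longer does once the point (7, −4) is added to B, which is consistent with a new function being computed for panel (c).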
Fig. 2 Ten-fold cross-validation test and training results for the Liver dataset
with respect to the number of PCFs
3 Two-objective integer programming approach to classification problems
We seek a classification function g that assigns any unlabeled data point to
A or B, where
\[ A = \{a_i \in \mathbb{R}^n : i \in I\}, \quad I = \{1, \ldots, m\}, \]
\[ B = \{b_j \in \mathbb{R}^n : j \in J\}, \quad J = \{1, \ldots, k\}. \]
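The set of center points of the polyhedral conic functions used to construct
the classification function [1] is
\[ C = \{c_l \in \mathbb{R}^n : l \in L\}, \quad L = \{1, \ldots, q\}, \]
and the matrix $P = \{P_{il}\}_{i \in I,\, l \in L}$ is defined by
\[ P_{il} = \begin{cases} 1, & g_l(a_i) \le 0, \\ 0, & \text{otherwise.} \end{cases} \]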
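The definition of P can be illustrated with a short sketch: each entry is obtained by evaluating one function at one point of A and testing the sign. The function name `separation_matrix` and the two example functions are hypothetical. They are PCFs with w = 0, ξ = 1 and γ = 1.5, centered at two points of the set A from Fig. 1, chosen only to produce a small matrix and not obtained from any optimization.

```python
def l1_ball_pcf(center, gamma):
    """PCF with w = 0 and xi = 1: g(x) = ||x - center||_1 - gamma."""
    def g(x):
        return sum(abs(xk - ck) for xk, ck in zip(x, center)) - gamma
    return g


def separation_matrix(A, functions):
    """P[i][l] = 1 if functions[l](A[i]) <= 0, else 0."""
    return [[1 if g(a) <= 0 else 0 for g in functions] for a in A]


A = [(2, -1), (2, -4), (3, -1), (4, -2)]

# Two illustrative functions centered at a_1 = (2, -1) and a_2 = (2, -4).
functions = [l1_ball_pcf((2, -1), 1.5), l1_ball_pcf((2, -4), 1.5)]

P = separation_matrix(A, functions)
# Column l of P marks the points of A that function l places on the A side.
column_sums = [sum(row[l] for row in P) for l in range(len(functions))]
```

In this toy instance the first function covers two points of A, the second covers one, and the point (4, −2) is covered by neither, so its row of P is zero.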
3.1 Training Algorithm
– Construct the matrix P, which records the points $a_i$ separated by each
polyhedral conic function $g_l$ with center $c_l \in C$. The centers $c_l$ may
or may not coincide with the points $a_i$.
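– To find $g_l(x)$, solve the following problem:
\[ \min \; \frac{1}{m} \sum_{i \in I} y_i \]
subject to
\[ \langle w, a_i - c_l \rangle + \xi \|a_i - c_l\|_1 - \gamma + 1 \le y_i, \quad \forall i \in I, \]
\[ -\langle w, b_j - c_l \rangle - \xi \|b_j - c_l\|_1 + \gamma + 1 \le 0, \quad \forall j \in J, \]
\[ w \in \mathbb{R}^n, \quad \xi, \gamma \in \mathbb{R}. \]
– Calculate the $l$-th column of the matrix P.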
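– Solve the following two-objective integer programming problem to obtain the
classification function, which is the pointwise minimum of the selected
polyhedral conic functions. The decision variables are
\[ x_i = \begin{cases} 1, & \text{if the point } a_i \text{ is separated by at least one of the selected functions,} \\ 0, & \text{otherwise,} \end{cases} \]
\[ y_l = \begin{cases} 1, & \text{if the function } g_l \text{ is selected to construct the classification function,} \\ 0, & \text{otherwise,} \end{cases} \]
and the model is
\[ \min \sum_{l \in L} y_l, \qquad \max \sum_{i \in I} x_i \]
subject to
\[ x_i \le \sum_{l \in L} y_l P_{il}, \quad \forall i \in I. \]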
Solving this model yields the minimum number of polyhedral conic functions
that separate the maximum number of points of the set A from the set B.
Let $\tilde{L} = \{l \in L : y_l = 1\}$ be the index set of the selected
polyhedral conic functions. The resulting separating function g is then
defined as follows:
\[ g(x) = \min_{l \in \tilde{L}} g_l(x). \]
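The selection step above can be sketched for very small instances by brute-force enumeration instead of an integer programming solver. The code below lists, for each number of selected functions, the largest number of covered points, and keeps only the nondominated trade-offs between the two objectives. The names `pareto_selections` and `pointwise_min`, the toy points, and the three example functions (PCFs with w = 0 and ξ = 1) are illustrative assumptions and do not come from the experiments. Enumeration grows exponentially with the number of functions and is shown only to make the model concrete.

```python
from itertools import combinations


def pareto_selections(P):
    """Return nondominated (num_selected, num_covered, subset) triples.

    A point i counts as covered (x_i = 1) when some selected column l has
    P[i][l] = 1, which matches the constraint x_i <= sum_l y_l * P[i][l].
    """
    m, q = len(P), len(P[0])
    front = []
    for k in range(q + 1):
        top_cover, top_subset = -1, ()
        for subset in combinations(range(q), k):
            covered = sum(1 for i in range(m) if any(P[i][l] for l in subset))
            if covered > top_cover:
                top_cover, top_subset = covered, subset
        # Keep size k only if it covers strictly more than any smaller size.
        if not front or top_cover > front[-1][1]:
            front.append((k, top_cover, top_subset))
    return front


def pointwise_min(functions, selected):
    """Classifier g(x) = min over the selected functions g_l(x)."""
    return lambda x: min(functions[l](x) for l in selected)


def l1_ball(center, gamma):
    # Illustrative PCF with w = 0, xi = 1.
    return lambda x: sum(abs(xk - ck) for xk, ck in zip(x, center)) - gamma


A = [(0, 0), (1, 0), (4, 0), (4, 1)]
functions = [l1_ball((0, 0), 1.5), l1_ball((4, 0), 1.5), l1_ball((2.5, 0), 2.0)]
P = [[1 if g(a) <= 0 else 0 for g in functions] for a in A]

front = pareto_selections(P)
k, covered, selected = front[-1]      # trade-off with the largest coverage
g = pointwise_min(functions, selected)
labels = ["A" if g(a) <= 0 else "B" for a in A]
```

In this toy instance two functions suffice to cover all four points, and adding the third does not improve coverage, so it does not appear on the list of trade-offs.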
Advantages of the developed method:
1. The new integer programming model allows different classification methods
to be used both separately and in combination.
2. When PCF-type algorithms are used, a separating PCF is generated at each
iteration. In the mathematical model that defines the algorithm, the
constraint that excludes the elements of the set B can be relaxed so that
anywhere from zero to a given number of elements of B are admitted, which
makes the approach flexible. This helps to reduce overfitting of the algorithm.
3. The matrix P used in the integer programming model can be extended using
different ideas, so that the success rate of the algorithm can be increased.
Table 1 Ten-fold cross-validation results for test problems
Problem       v1      v2      v3
Liver         12.3    45.4    45.34
WBCP          23.6    5.5     89.45
Ionosphere    99.4    45.4    45.54
Diabetes      44.6    55.4    77.44
Heart         44.4    45.45   76.77
To do for the paper:
1. With tolerance 0 and the non-extended matrix, results will be obtained for
the multi-objective model and for the case where the sum of the y_j is less
than 5, etc., and the model will be solved for alpha = 0. Tables will be
constructed and compared.
2. For various tolerances (0.2, 0.4, etc.) and for extended P matrices,
solutions will be found as in item (1). It will be emphasized that the success
rate increases for the extended matrix.
4 Computational results
In this section, ten-fold cross-validation results are given for well-known
test problems from the literature. All results are organized with respect to
different parameters. The results obtained using the extended P matrix are
given in Table 4.
5 Conclusions and Future Works
Acknowledgements The authors would like to thank Mumin Sonmez for help in
solving the test problems.
References
1. Gasimov, R.N., Ozturk, G.: Separation via polyhedral conic functions. Optimization
Methods and Software 21(4), 527–540 (2006)