
A Appendix: Proofs
A.1 Theorem 1 [Base case]
When the tree is unexplored, the greedy solution finds $\arg\max_{\hat{S}} P_{\text{opt}}(\hat{S})$ when exploring at most $d$ nodes, where $d$ is the tree depth.
Proof. When exploring at most $d$ nodes, only one candidate solution can be built. We have $P_{\text{best}}(A_k \mid S_k) \geq P_{\text{best}}(A_j \neq A_k \mid S_k)$ and $P_{\text{best}}(A_k \mid S_k) \geq P_{\text{uni}}$ (from As. 1). If a candidate solution contains $P_{\text{best}}(A_k \mid S_k)$, its expected probability is always higher than that of any other unexplored candidate solution, whose expected probability is uniform (from Eq. (2) and Lemma 1).
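As an illustration, below is a minimal sketch of this greedy base case in Python. The interface is assumed for the example only: `children(state)` yields `(action, next_state, prob)` triples, with `prob` the probability of taking that action at `state`; none of these names come from the paper.

```python
def greedy_candidate(root, d, children):
    """Build the single candidate solution reachable when exploring at
    most d nodes by always taking the most likely action (Theorem 1)."""
    solution, state = [], root
    for _ in range(d):
        options = list(children(state))
        if not options:  # reached a leaf before depth d
            break
        # P_best(A_k | S_k) dominates every sibling action and the uniform
        # probability of unexplored branches (Assumption 1), so the greedy
        # pick maximizes the expected solution probability.
        action, state, prob = max(options, key=lambda o: o[2])
        solution.append(action)
    return solution
```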
A.2 Theorem 2 [Inductive step]
Given that the set of the $n$ most likely candidate solutions has been explored, the $(n+1)$-th most likely candidate solution can be inferred by exploring at most $d$ more nodes.
Proof sketch. The $n$-th solution has a probability of yielding the optimal solution, $P_{\text{opt}}(\hat{S}_n)$, larger than that of any unexplored candidate solution $P_{\text{opt}}(\hat{S})$, $\hat{S} \in \hat{\mathcal{S}}_{\text{unexp}}$. Hence, finding $P_{\text{opt}}(\hat{S}_{n+1})$ is equivalent to finding:
$$\arg\min_{\alpha} \left( L_{\text{opt}}(\hat{S}_n) - L_{\alpha} \right) \qquad (8)$$
where $L(x) = \log(P(x))$ is the log probability and $\alpha$ indicates the dilemmatic state $S_{\alpha}$ to change. Thus, following Eq. (1), the current candidate solution is associated with the log probability:
$$L_{\text{opt}}(\hat{S}_n) = \sum_{k=1}^{d} L_{\text{best}}(S_k) \qquad (9)$$
where $L_{\text{best}}(S_k)$ is the highest log probability at state $S_k$. Further, following Eqs. (2) and (8), we have:
$$L_{\alpha} = \sum_{k=1}^{\operatorname{depth}(\alpha)-1} L_{\text{best}}(S_k) + L'_{\text{best}}(S_{\alpha}) + \sum_{k=\operatorname{depth}(\alpha)+1}^{d} L_{\text{uni}} \qquad (10)$$
where $L'_{\text{best}}(S_{\alpha})$ is the highest log probability associated with an unexplored child state at $S_{\alpha}$. Hence, substituting Eqs. (9) and (10) into Eq. (8) gives:
$$L_{\text{opt}}(\hat{S}_n) - L_{\alpha} = \left( L_{\text{best}}(S_{\alpha}) - L'_{\text{best}}(S_{\alpha}) \right) + \sum_{k=\operatorname{depth}(\alpha)+1}^{d} \left( L_{\text{best}}(S_k) - L_{\text{uni}} \right) \qquad (11)$$
where the first term corresponds to the difference of log probabilities at the dilemmatic state $S_{\alpha}$, while the second term is the expected gain in log probability over a uniform log probability distribution in the remaining tree. Once the state $S_{\alpha}$ to be modified has been found, the candidate solution $\hat{S}_{n+1}$ is built by a greedy search in the subtree whose root is $S_{\alpha}$ (see Theorem 1).
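For completeness, Eq. (11) is a restatement of the subtraction of Eq. (10) from Eq. (9): the first $\operatorname{depth}(\alpha)-1$ summands cancel, leaving

$$L_{\text{opt}}(\hat{S}_n) - L_{\alpha} = \sum_{k=1}^{d} L_{\text{best}}(S_k) - \sum_{k=1}^{\operatorname{depth}(\alpha)-1} L_{\text{best}}(S_k) - L'_{\text{best}}(S_{\alpha}) - \sum_{k=\operatorname{depth}(\alpha)+1}^{d} L_{\text{uni}},$$

in which the level-$\operatorname{depth}(\alpha)$ term yields $L_{\text{best}}(S_{\alpha}) - L'_{\text{best}}(S_{\alpha})$ and each deeper level contributes one $L_{\text{best}}(S_k) - L_{\text{uni}}$ summand.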
To simplify the computation, we can approximate $L_{\text{best}}(S_k) - L_{\text{uni}}$ by a positive constant: $L_{\text{uni}}$ is constant and $L_{\text{best}}(S_k)$ can be assumed to be constant on average. As a result, Eq. (11) can be approximated by
$$L_{\text{opt}}(\hat{S}_n) - L_{\alpha} \approx L_{\text{best}}(S_{\alpha}) - L'_{\text{best}}(S_{\alpha}) + (d - \operatorname{depth}(\alpha)) \cdot \text{const.} \qquad (12)$$
Hence, since the offset $d \cdot \text{const.}$ does not depend on $\alpha$, minimizing Eq. (12) is equivalent to finding:
$$\arg\min_{\alpha} \left( L_{\text{best}}(S_{\alpha}) - L'_{\text{best}}(S_{\alpha}) - \operatorname{depth}(\alpha) \cdot \text{const.} \right) \qquad (13)$$
where $S_{\alpha}$ indicates the state in the tree from which the next most likely candidate solution should be greedily grown. The first term favours exploring states where the difference between the best and the second-best option is small, while the second term gives preference to candidate solutions that re-use a large part of the initial solution.
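To make the selection rule concrete, below is a minimal sketch in Python. The bookkeeping is assumed for the example: each frontier entry stores $L_{\text{best}}(S_{\alpha})$, $L'_{\text{best}}(S_{\alpha})$, and $\operatorname{depth}(\alpha)$ for one explored state with at least one unexplored child; the dictionary layout and the value of `const` are illustrative, not the paper's implementation.

```python
def next_dilemmatic_state(frontier, const=0.0):
    """Pick S_alpha minimizing Eq. (13):
    L_best(S_a) - L'_best(S_a) - depth(a) * const."""
    return min(
        frontier,
        key=lambda s: s["l_best"] - s["l_best_prime"] - s["depth"] * const,
    )

# Example: log probabilities are negative; the first state has a small
# dilemma gap (0.02) and is preferred over the second one (gap 1.15).
frontier = [
    {"l_best": -0.11, "l_best_prime": -0.13, "depth": 3},
    {"l_best": -0.05, "l_best_prime": -1.20, "depth": 1},
]
s_alpha = next_dilemmatic_state(frontier)
```

Once $S_{\alpha}$ is selected, the $(n+1)$-th candidate is grown greedily from it as in Theorem 1; with `const = 0` the rule reduces to pure dilemma ordering, which the supplementary experiments below suggest is a reasonable simplification.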
Supplementary material (paper ID 526: Dilemma First Search for Effortless Optimization of NP-hard Problems)
In this document we study the relation between the energy gain and the depth, as well as
between the energy gain and the dilemma estimator for decision tree optimization on the ECP2011
and MNIST datasets.
Figure 1: Depth w.r.t. energy gain for the MNIST (left) and ECP2011 (right) datasets
In Fig. 1, we notice that the energy gains obtained in a single iteration show no evidence of correlation with the depth, especially for the ECP2011 dataset. This supports the idea that the constant term const. in Eq. (8) can be neglected.
Figure 2: $f(A_1 \mid S) - f(A_2 \mid S)$ w.r.t. energy gain for the MNIST (left) and ECP2011 (right) datasets
In Fig. 2, we can see that the highest energy gains are obtained for the smallest values of $f(A_1 \mid S) - f(A_2 \mid S)$ (i.e., larger dilemma estimators). This shows how exploring nodes with higher dilemma estimators first helps to obtain higher energy gains earlier.