A Appendix: Proofs

A.1 Theorem 1 [Base case]

When the tree is unexplored, the greedy solution finds \arg\max_{\hat{S}} P_{opt}(\hat{S}) while exploring at most d nodes.

Proof. When exploring at most d nodes, only one candidate solution can be built. We have P_{best}(A_k | S_k) \geq P_{best}(A_j \neq A_k | S_k) and P_{best}(A_k | S_k) \geq P_{uni} (from Assumption 1). If a candidate solution contains P_{best}(A_k | S_k), its expected probability is always higher than that of any other unexplored candidate solution, whose expected probability is uniform (from Eq. (2) and Lemma 1).

A.2 Theorem 2 [Inductive step]

Given that the n most likely candidate solutions have been explored, the (n+1)-th most likely candidate solution can be inferred by exploring at most d more nodes.

Proof sketch. The n-th solution has a probability of yielding the optimal solution, P_{opt}(\hat{S}_n), larger than that of any unexplored candidate solution P_{opt}(\hat{S}), \hat{S} \in \hat{S}_{unexp}. Hence, finding P_{opt}(\hat{S}_{n+1}) is equivalent to finding:

    \arg\min_{\alpha} \left( L_{opt}(\hat{S}_n) - L_{\alpha} \right)    (8)

where L(x) = \log P(x) is the log probability and \alpha indicates the dilemmatic state S_{\alpha} to change. Thus, following Eq. (1), the current candidate solution is associated with the log probability:

    L_{opt}(\hat{S}_n) = \sum_{k=1}^{d} L_{best}(S_k)    (9)

where L_{best}(S_k) is the highest log probability at state S_k. Further, following Eqs. (2) and (8), we have:

    L_{\alpha} = \sum_{k=1}^{depth(\alpha)-1} L_{best}(S_k) + L'_{best}(S_{\alpha}) + \sum_{k=depth(\alpha)+1}^{d} L_{uni}    (10)

where L'_{best}(S_{\alpha}) is the highest log probability associated with an unexplored child state at S_{\alpha}. Hence, substituting Eqs. (9) and (10) into Eq. (8) gives:

    L_{opt}(\hat{S}_n) - L_{\alpha} = \left( L_{best}(S_{\alpha}) - L'_{best}(S_{\alpha}) \right) + \sum_{k=depth(\alpha)+1}^{d} \left( L_{best}(S_k) - L_{uni} \right)    (11)

where the first term corresponds to the difference of log probabilities at the dilemmatic state S_{\alpha}, while the second term is the expected gain in log probability over a uniform log probability distribution in the remaining tree. Once the state S_{\alpha} to be modified has been found, a candidate solution \hat{S}_{n+1} is built by a greedy search in the subtree rooted at S_{\alpha} (see Theorem 1).

To simplify the computation, we can approximate L_{best}(S_k) - L_{uni} by a positive constant: L_{uni} is constant, and L_{best}(S_k) can be assumed to be constant on average. As a result, Eq. (11) can be approximated by:

    L_{opt}(\hat{S}_n) - L_{\alpha} \approx L_{best}(S_{\alpha}) - L'_{best}(S_{\alpha}) + (d - depth(\alpha)) \cdot const.    (12)

Hence, minimizing Eq. (12) is equivalent to finding:

    \arg\min_{\alpha} \left( L_{best}(S_{\alpha}) - L'_{best}(S_{\alpha}) - depth(\alpha) \cdot const. \right)    (13)

where S_{\alpha} indicates the state in the tree from which the next most likely candidate solution should be greedily grown. The first term favours exploring states where the difference between the best and the second-best option is small, while the second term gives preference to candidate solutions that re-use a large part of the initial solution.

Supplementary material (paper ID 526: Dilemma First Search for Effortless Optimization of NP-hard Problems)

In this document we study the relation between the energy gain and the depth, as well as between the energy gain and the dilemma estimator, for decision tree optimization on the ECP2011 and MNIST datasets.

[Figure 1: Depth w.r.t. energy gain for the MNIST (left) and ECP2011 (right) datasets]

In Fig. 1, we notice that the energy gains obtained in a single iteration show no evidence of correlation with the depth, especially for the ECP2011 dataset. This supports the idea that const. in Eq. (8) can be neglected.

[Figure 2: f(A1|S) - f(A2|S) w.r.t. energy gain for the MNIST (left) and ECP2011 (right) datasets]

In Fig. 2, we can see that the highest energy gains are obtained for the smallest values of f(A1|S) - f(A2|S) (i.e. the largest dilemma estimators). This shows how exploring nodes with higher dilemma estimators first helps to obtain higher energy gains earlier.
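To make the selection rule of Eq. (13) concrete, the following is a minimal sketch in Python; the state representation (tuples of name, best and second-best log probability, and depth) and all numeric values are hypothetical illustrations, not the paper's implementation.

```python
import math

def dilemma_priority(l_best, l_second_best, depth, const=0.0):
    # Priority of a dilemmatic state following Eq. (13): smaller values
    # are explored first. `const` is the assumed average per-level gain
    # over a uniform distribution; the experiments above suggest it can
    # often be neglected (const = 0).
    return (l_best - l_second_best) - depth * const

def next_dilemma_state(states, const=0.0):
    # Pick the state from which the next candidate solution should be
    # greedily grown, i.e. the arg min of Eq. (13) over explored states.
    return min(states, key=lambda s: dilemma_priority(s[1], s[2], s[3], const))

# Hypothetical explored states: (name, L_best, L'_best, depth).
states = [
    ("S1", math.log(0.9), math.log(0.05), 1),  # clear-cut choice
    ("S2", math.log(0.5), math.log(0.45), 2),  # near-tie: a dilemma
    ("S3", math.log(0.7), math.log(0.20), 3),
]
print(next_dilemma_state(states)[0])  # S2: smallest best/second-best gap
```

With const. > 0, the depth term starts to dominate and deeper states are preferred, matching the observation that the second term of Eq. (13) favours re-using a large part of the current solution.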