Traveling Salesman Problems Motivated by Robot Navigation

Maria Minkoff (MIT)
With Avrim Blum, Shuchi Chawla, David Karger, Terran Lane, Adam Meyerson

A Robot Navigation Problem
• Robot delivering packages in a building
• Goal: deliver them as quickly as possible
• Classic model: Traveling Salesman Problem
  • find a tour of minimum length
• Additional constraints:
  • some packages have higher priority
  • uncertainty in the robot's behavior
    • battery failure
    • sensor error, motor control error

Markov Decision Process Model
• State space S
• Choice of actions a ∈ A at each state s
• Transition function T(s'|s,a)
  • the chosen action determines a probability distribution on the next state
  • a sequence of actions produces a random path through the graph
• Rewards R(s) on states
  • arriving in state s at time t yields discounted reward γ^t R(s), for a discount factor γ ∈ (0,1)
• MDP goal: a policy for picking an action at each state that maximizes total discounted reward

Exponential Discounting
• Motivates getting to desired states quickly
• Inflation: reward collected in the distant future decreases in value due to uncertainty
  • at each time step the robot loses power with some fixed probability
  • so the probability of still being alive at time t decays exponentially
  • discounting reflects the expected value of a reward

Solving MDP
• Fixing an action at each state produces a Markov chain with transition probabilities p_vw
• Can compute the expected discounted reward r_v when starting at state v:
    r_v = R(v) + Σ_w p_vw γ^{t(v,w)} r_w
• Choosing actions to optimize this recurrence is polynomial-time solvable
  • linear programming
  • dynamic programming (as for shortest paths); a small sketch follows below
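The recurrence above can be evaluated and optimized by standard value iteration. Below is a minimal Python sketch, not code from the talk; it assumes unit transition times (so each step contributes a single factor of γ) and represents each action by a row-stochastic transition matrix.

```python
import numpy as np

def value_iteration(R, P, gamma=0.5, tol=1e-9):
    """Solve r_v = R(v) + gamma * max_a sum_w P[a][v, w] * r_w.

    R : length-n array of state rewards (collected on arrival).
    P : list of n-by-n row-stochastic matrices, one per action.
    Assumes unit-time transitions, so each step discounts by one factor gamma.
    """
    n = len(R)
    r = np.zeros(n)
    while True:
        # Bellman backup: value of taking each action at each state
        q = np.stack([R + gamma * (P_a @ r) for P_a in P])
        r_new = q.max(axis=0)
        if np.max(np.abs(r_new - r)) < tol:
            return r_new, q.argmax(axis=0)   # optimal values and a greedy policy
        r = r_new

# Tiny example: two states, two actions ("stay" vs "switch").
R = np.array([0.0, 1.0])
P = [np.eye(2),                              # action 0: stay put
     np.array([[0.0, 1.0], [1.0, 0.0]])]     # action 1: switch state
values, policy = value_iteration(R, P)
```

Since γ < 1, the backup is a contraction, which is why the iteration converges; the linear-programming formulation mentioned on the slide gives the same values.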
Solving the wrong problem
• A package can only be delivered once
  • so the robot should not get the reward each time it reaches a target
• One solution: expand the state space
  • new state = current location + set of past locations (packages already delivered)
  • reward nonzero only in states where the current location is not in the list of previously visited ones
  • now apply the MDP algorithm
• Problem: the new state space has exponential size

Tackle an easier problem
• The problem has two novel elements for "theory":
  • discounting of the reward based on arrival time
  • a probability distribution on the outcome of each action
• We set the second issue aside for now
  • in practice, the robot can control its errors
• Even the first issue by itself is hard and interesting
  • a first step towards solving the whole problem

Discounted-Reward TSP
Given:
• undirected graph G = (V,E)
• edge weights (travel times) d_e ≥ 0
• weights on nodes (rewards) r_v ≥ 0
• discount factor γ ∈ (0,1)
• root node s
Goal: find a path P starting at s that maximizes the total discounted reward
    r(P) = Σ_{v ∈ P} r_v γ^{d_P(v)},
where d_P(v) is the time at which P reaches v.

Approximation Algorithms
• Discounted-Reward TSP is NP-hard (and so is the more general MDP-type problem)
  • reduction from minimum-latency TSP
• So it is intractable to solve exactly
• Goal: an approximation algorithm guaranteed to collect at least some constant fraction of the best possible discounted reward

Related Problems
The goal of Discounted-Reward TSP is, roughly, to find a "short" path that collects "lots" of reward.
• Prize-Collecting TSP
  • given a root vertex v, find a tour containing v that minimizes total length + foregone (undiscounted) reward
  • primal-dual 2-approximation algorithm [GW 95]
• k-TSP
  • find a tour of minimum length that visits at least k vertices
  • 2-approximation known for undirected graphs, based on the PC-TSP algorithm [Garg 99]
  • can be extended to handle the node-weighted version

Mismatch
A constant-factor approximation on length doesn't exponentiate well:
• suppose the optimum solution reaches some vertex v at time t, for reward γ^t r
• a constant-factor approximation would reach v within time 2t, for reward γ^{2t} r
• result: we get only a γ^t fraction of the optimum discounted reward, not a constant fraction

Orienteering Problem
Find a path of length at most D that maximizes the reward collected.
• Complement of k-TSP
  • approximates the reward collected instead of the length
  • avoids changing the length, so exponentiation doesn't hurt
  • the unrooted case can be solved via k-TSP
• Drawback: no constant-factor approximation for the rooted non-geometric version was previously known
• Our techniques also give a constant-factor approximation for the rooted Orienteering problem

Our Results
Using an α-approximation for k-TSP as a subroutine:
• a (3/2 α + 2)-approximation for Orienteering
• an e(3/2 α + 2)-approximation for Discounted-Reward Collection
• constant-factor approximations for tree and multiple-path versions of the problems

Our Results
Substituting α = 2, announced by Garg in 1999:
• a (3/2 · 2 + 2) = 5-approximation for Orienteering
• an e(3/2 · 2 + 2) ≈ 13.6-approximation for Discounted-Reward Collection
• constant-factor approximations for tree and multiple-path versions of the problems

Eliminating Exponentiation
• Let d_v = shortest-path distance (time) from s to v
• Define the prize at v as p_v = γ^{d_v} r_v
  • the maximum discounted reward possibly collectable at v
• If a given path reaches v at time t_v, define the excess e_v = t_v − d_v
  • the extra time the chosen path takes beyond the shortest path
• Then the discounted reward collected at v is γ^{e_v} p_v
• Idea: if the excess is small, prize ≈ discounted reward
• Fact: the excess only increases as we traverse the path
  • excess reflects lost time; we can't make it up

Optimum path
• Assume γ = 1/2 (we can scale edge lengths)
Claim: at least 1/2 of the optimum path's discounted reward R is collected before the path's excess reaches 1.
Proof by contradiction:
• let u be the first vertex with e_u ≥ 1
• suppose more than R/2 of the reward follows u
• shortcut directly to u, then traverse the rest of the optimum path
  • this reduces all excesses after u by at least 1
  • so it "undiscounts" those rewards by a factor γ^{−1} = 2
  • so it doubles the discounted reward collected after u
  • but that reward was more than R/2: contradiction
[figure: example path from s through u, with distances and excesses marked at each node]

New problem: Approximate Min-Excess Path
• Suppose there exists an s-t path P* with prize value Π and length ℓ(P*) = d_t + ε
• Optimization version: find an s-t path P with prize value ≥ Π that minimizes the excess ℓ(P) − d_t over the shortest path to t
  • since d_t is fixed, this is equivalent to minimizing total length, as in k-TSP
• Approximation version: find an s-t path P with prize value ≥ Π that approximates the optimum excess, i.e. has length ℓ(P) = d_t + cε
  • better than approximating the entire path length

Using Min-Excess Path
• Recall: the discounted reward collected at v is γ^{e_v} p_v
• Prefix of the optimum discounted-reward path:
  • collects discounted reward Σ γ^{e_v} p_v ≥ R/2, hence spans prize Σ p_v ≥ R/2
  • and has no vertex with excess over 1
• Guess t = the last node on the optimum path with excess e_t ≤ 1
• Find a path to t of approximately (4 times) minimum excess that spans R/2 prize (we can guess R/2)
• Excesses are then at most 4, so γ^{e_v} p_v ≥ p_v/16
  • discounted reward on the found path ≥ R/32

Solving Min-Excess Path problem
Exactly solvable case: monotonic paths
• Suppose the optimum path goes through vertices in strictly increasing distance from the root
• Then we can find the optimum by dynamic programming
  • just as we can solve longest path in an acyclic graph
• Build a table: for each vertex v, is there a monotonic path ending at v with length ℓ and prize p? (a sketch follows below)
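The table on the slide above can be filled in by processing vertices in increasing distance from the root. The following is an illustrative Python sketch, not code from the talk; it assumes integer edge lengths (so lengths can index the table), and the names dist, edges, and prize are mine.

```python
import math

def monotonic_path_table(dist, edges, prize, s, max_len):
    """best[v][l] = maximum prize of a path from s to v of length exactly l
    that visits vertices in strictly increasing shortest-path distance from s
    (the "monotonic" paths of the talk).

    dist  : dict, shortest-path distance from s to each vertex (precomputed)
    edges : dict mapping ordered pair (u, v) to the length of edge {u, v}
    prize : dict of vertex prizes
    """
    order = sorted(dist, key=dist.get)                 # increasing distance from s
    best = {v: [-math.inf] * (max_len + 1) for v in dist}
    best[s][0] = prize.get(s, 0)
    for v in order:                                    # predecessors are finished first
        for u in order:
            if dist[u] >= dist[v] or (u, v) not in edges:
                continue                               # keep only distance-increasing hops
            w = edges[(u, v)]
            for l in range(max_len - w + 1):
                cand = best[u][l] + prize.get(v, 0)
                if best[u][l] > -math.inf and cand > best[v][l + w]:
                    best[v][l + w] = cand
    return best
```

Scanning best[v] for the smallest l with prize at least the target then yields the shortest (hence min-excess) monotonic segment to v, exactly the subroutine the decomposition below needs.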
Solving Min-Excess Path problem
Approximable case: wiggly paths
• The length of the path to v is ℓ_v = d_v + e_v
• If e_v > d_v then ℓ_v > e_v > ℓ_v/2
  • i.e., the path takes more than twice as long as necessary to reach v
• So approximating ℓ_v to a constant factor also approximates e_v, to twice that constant factor

Approximating path length
• Can use the k-TSP algorithm to find an approximately shortest s-t path with a specified prize:
  • merge s and t into a single vertex r
  • the optimum path becomes a tour
  • solve k-TSP with root r
  • "unmerge": this can yield one or more cycles
  • connect s and t by a shortest path
[figure: s and t merged into a single root vertex r]

Decompose optimum path
• Split the optimum path into alternating monotone and wiggly segments
• This divides the problem into independent subproblems
• More than 2/3 of each wiggly segment is excess
[figure: path drawn as monotone, monotone, wiggly, monotone, wiggly segments]

Decomposition Analysis
• At least 2/3 of each wiggly segment's length is excess
• That excess accumulates into the excess of the whole path
  • total excess of the wiggly segments ≤ excess of the whole path
  • so total length of the wiggly segments ≤ 3/2 of the path excess
• Use the dynamic program to find the shortest (min-excess) monotonic segments collecting a target prize
• Use k-TSP to find approximately shortest wiggly segments collecting a target prize
  • this approximates their length, so it approximates their excess
• Over all monotonic and wiggly segments, this approximates the total excess

Dynamic program for Min-Excess Path
• For each pair of vertices and each (discretized) prize value, find
  • the shortest monotonic path collecting the desired prize
  • an approximately shortest wiggly path collecting the desired prize
• Note: only polynomially many subproblems
• Use dynamic programming to find the optimum way of pasting the segments together

Solving Orienteering Problem: special case
• Given a path from s that
  • collects prize Π
  • has length ≤ D
  • ends at t, the farthest point from s
• For any constant integer r ≥ 1, there exists a path from s to some v with
  • prize ≥ Π/r
  • excess ≤ (D − d_v)/r
[figure: path from s to t with candidate endpoints v and distances marked]

Solving Orienteering Problem
General case: the path ends at an arbitrary t
• Let u be the farthest point from s
• Connect t to s via a shortest path
• One of the two path segments ending at u
  • has prize ≥ Π/2
  • has length ≤ D
• This reduces to the special case
• Using a 4-approximation for Min-Excess Path, we get an 8-approximation for Orienteering
[figure: path from s to t closed up through u]

Budget Prize-Collecting Steiner Tree problem
Find a rooted tree of edge cost at most D that spans the maximum amount of prize.
• Complement of k-MST
• Create an Euler tour of the optimum tree T*, of cost ≤ 2D
• Divide this tour into two paths starting at the root, each of length ≤ D
• One of them contains at least 1/2 of the total prize
• A path is a special case of a tree
• So a c-approximation algorithm for Orienteering yields a 2c-approximation for Budget PCST (a sketch of the halving step follows below)
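The halving step is easy to make concrete: cut the Euler tour after the last vertex reachable within budget D; the remainder, walked in reverse, also starts at the root and has length under D because the skipped edge already pushed the walk past D. A minimal Python sketch under an assumed data layout (tour as a vertex list, length and prize as dicts); this is not code from the talk.

```python
def better_half_of_tour(tour, length, prize, D):
    """Split a closed Euler tour of a tree (total cost <= 2D, starting and
    ending at the root) into two walks from the root, each of length <= D,
    and return the one spanning more prize.

    tour   : list of vertices with tour[0] == tour[-1] == root
    length : dict mapping (u, v) to the edge length (keys in both orders)
    prize  : dict of vertex prizes
    """
    cum = [0.0]                                   # cumulative length along the tour
    for u, v in zip(tour, tour[1:]):
        cum.append(cum[-1] + length[(u, v)])
    assert cum[-1] <= 2 * D
    j = max(i for i, c in enumerate(cum) if c <= D)
    first = tour[: j + 1]                         # root walk of length cum[j] <= D
    second = tour[j + 1:][::-1]                   # rest, reversed to start at the root;
                                                  # shorter than D since cum[j+1] > D

    def span(walk):                               # prize counts once per vertex
        return sum(prize.get(v, 0) for v in set(walk))

    return max(first, second, key=span)
```

Every tour vertex lands in one of the two walks, so the better walk spans at least half the tree's prize, which is where the factor 2 in the 2c guarantee comes from.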
Summary
• Showed that the maximum discounted reward can be approximated using min-excess path
• Showed how to approximate min-excess path using k-TSP
• Min-excess path can also be used to solve the rooted Orienteering problem (previously an open question)
• Also solves "tree" and "cycle" versions of Orienteering

Open Questions
• Non-uniform discount factors
  • each vertex v has its own γ_v
• Non-uniform deadlines
  • each vertex specifies its own deadline by which it must be visited for its reward to count
• Directed graphs
  • we used k-TSP, which is only solved for undirected graphs
  • for directed graphs, even standard TSP has no known constant-factor approximation
  • we only use k-TSP / undirectedness in the wiggly parts

Future directions
• Stochastic actions
  • stochasticity seems to imply directedness
  • special case: forget rewards; given a choice of actions, choose so as to minimize the cover time of the graph
• Applying the discounting framework to other problems:
  • scheduling
  • exponential penalty in place of hard deadlines