Olympiads in Informatics, 2015, Vol. 9, 45–55 DOI: http://dx.doi.org/10.15388/ioi.2015.05 45 Towards a Better Way to Teach Dynamic Programming Michal FORIŠEK Comenius University, Bratislava, Slovakia e-mail: [email protected] Abstract. We give an overview of how various popular algorithm textbooks deal with the topic of dynamic programming, and we identify various issues with their expositions. In the second part of the paper we then give what we believe to be a better way of presenting the topic. While textbooks only contain the actual exposition, our paper also provides the rationale behind our choices. In particular, we managed to divide the topic into a sequence of simpler conceptual steps that are easier to learn. Keywords: algorithms, dynamic programming, memoization, task analysis. 1. Overview Dynamic programming is a standard paradigm used in the design of efficient algorithms. This approach is usable for problems that exhibit an optimal substructure: the optimal solution to a given instance can be recursively expressed in terms of optimal solutions for some sub-instances of the given instance. Dynamic programming comes in two basic flavors. The top-down approach, usually called memoization, is based on implementing the computation of the discovered recursive relation as a recursive function, and then adding a cache so that each sub-instance only gets evaluated once. The bottom-up approach, usually called dynamic programming, essentially evaluates the same recurrence but in an iterative way: the algorithm designer specifies an order in which the sub-instances are processed, and this order is chosen in such a way that whenever we process a particular instance, all its needed subinstances have already been processed and their optimal solutions are already known by the algorithm. Below, we use the term dynamic programming (DP) to cover both flavors. When talking specifically about the iterative approach we will use the term iterative DP or bottom-up DP. Despite being conceptually easy, dynamic programming is notorious for being hard to learn. Quoting Skiena (2008): “[Until] you understand dynamic programming, it seems like magic.” 46 M. Forišek Different textbooks use very different approaches to present dynamic programming. The canonical way of presenting dynamic programming in algorithm textbooks is by showing a sequence of tasks and solving them using dynamic programming techniques. What is usually missing is: ●● Rationale for choosing these specific tasks and their order. ●● Notes on potential pitfalls when presenting the tasks and their solutions. In this paper we aim to fill in those missing gaps. More precisely, the paper consists of the following parts: ●● We present the way dynamic programming is exposed in multiple standard algorithm textbooks. ●● We analyse those expositions and identify a set of possible pitfalls that often confuse and mislead students. ●● We present our suggested version of a better order in which to teach the individual concepts related to dynamic programming, and we argue about the benefits of our approach. 2. Algorithm Textbooks Throughout this paper we are going to refer to the way dynamic programming is treated in some of the canonical algorithm textbooks. In particular, we examined the following ones: Cormen et al. (2001), Dasgupta et al. (2006), Kleinberg and Tardos (2006), Sedgewick (1998), Skiena (2008). When refering to these textbooks below, for better readability we will use the following shorthand instead citations: Cormen, Dasgupta, Kleinberg, Sedgewick, and Skiena. Below we give a brief summary how each of these textbooks introduces dynamic programming. Cormen prefers and almost exclusively uses a bottom-up approach. Dynamic programming is introduced using the following sequence of tasks and texts: 1. Assembly-line scheduling. 2. Matrix chain multiplication. 3. A general overview of iterative dynamic programming and memoization. 4. Longest common subsequence. 5. Optimal binary search tree. Dasgupta also prefers and almost exclusively uses a bottom-up approach, on a rather large set of solved tasks: 1. Shortest paths in a DAG. 2. Longest increasing subsequence. 3. A warning against exponential-time recursion. 4. Edit distance. 5. Knapsack. 6. A note about memoization. 7. Matrix chain multiplication. Towards a Better Way to Teach Dynamic Programming 47 8. Shortest paths. 9. Independent sets in trees. Kleinberg first introduces introduces dynamic programm programm Kleinberg first introduces dynamic programmingKleinberg using a top-down approach, but first dynamic Kleinberg first first introduces introduces dynamic dynamic programming programming using using aa top-down top-down approach, approach, Kleinberg first introduces dynamic programming using a top-down approach, Kleinberg first introduces dynamic programming using a top-down approach, Kleinberg first introduces dynamic programming using a top-down approach, but then uses a bottom-up iterative approach tKleinberg introduces dynamic programming using a top-down approach, but problems. then usesThe a bottom-up iterative approach then uses a bottom-up iterative approach in all following entire exposibut then uses uses aabut bottom-up bottom-up iterative iterative approach approach in in all all following following problems. problems. The The then uses a bottom-up iterative approach in all following problems. The but then uses a bottom-up iterative approach in all following problems. The butthen then uses a bottom-up iterative approach in all following problems. The entire exposition looks as follows: a but bottom-up iterative approach in all following problems. The entire exposition looks as follows: tion looks as follows: entire exposition looks looks as aslooks follows: follows: entire exposition as follows: entire exposition aslooks follows: entireexposition looks as follows: nentire looks asexposition follows: 1. Weighted interval scheduling. 1. Weighted Weighted interval interval scheduling scheduling 1. 1. 1.1.Weighted Weighted interval interval scheduling scheduling 1. AWeighted interval scheduling 1. Weighted interval scheduling 2. general overview of iterative dynamic programming and memoization. Weighted interval scheduling 2. aa general general overview of of iterative iterative dynamic dynamic prog pro terval scheduling 2. overview 2. 2.2.aa ageneral general overview of of iterative iterative dynamic dynamic programming programming and and memoization memoization 2.dynamic a general overview of iterative dynamic programming and memoization 2. a overview general overview of iterative dynamic programming and memoization overview of iterative dynamic programming and memoization 3. Segmented Segmented least squares squares erview ofgeneral iterative programming and memoization 3. Segmented least squares. 3. least 3. 3.3.Segmented Segmented least least squares squares 3. Subset Segmented least squares 3. Segmented least squares Segmented least squares 4. Subset Subset sums sums and and Knapsack Knapsack least squares 4. 4. sums and Knapsack. 4. 4. Subset Subset sums sums and and Knapsack Knapsack 4. Subset sums and Knapsack 4. Subset sums and Knapsack 4. Knapsack Subset sums Knapsack 5. RNA RNA secondary secondary structure structure s and 5. 5.and RNA secondary structure. 5. 5.5.RNA RNA secondary secondary structure structure 5.secondary RNA secondary structure RNA6. structure RNA5. secondary structure 6. Sequence alignment (and (and optimizations optimizations to to rr dary structure 6. Sequence Sequence alignment (and optimizations to reduce memory use).alignment 6. 6. Sequence Sequence alignment alignment (and (and optimizations optimizations to to reduce reduce memory memory use) use) 6. Sequence alignment (and to optimizations to reduce memory use)paths 6. Sequence alignment (and optimizations to reduce memory use) 6. Sequence alignment (and optimizations reduce memory use) 7. Shortest paths ignment (and optimizations to reduce memory use) 7. Shortest 7. Shortest paths. 7.7.Shortest Shortest paths paths 7. Shortest 7. Shortest paths paths Shortest paths ths7. Sedgewick prefers a top-down approach. Only presents two problems: Sedgewick prefers aa top-down top-down approach. approach. Only Only Sedgewick prefers Sedgewick Sedgewick prefers prefers aprefers aa top-down top-down approach. approach. Only Only presents presents two problems: problems: Sedgewick prefers a top-down approach. Only presents two problems: 1. Fibonacci Sedgewick anumbers. top-down approach. Only presents two problems: Sedgewick prefers top-down approach. Only presents twotwo problems: efers a top-down approach. Only presents two problems: 1. Fibonacci Fibonacci numbers numbers 1. 2. Knapsack. 1. 1.1.Fibonacci Fibonacci numbers numbers 1. Fibonacci 1. Fibonacci numbersnumbers Fibonacci numbers 2. Knapsack Knapsack umbers 2. 2. 2.2.Knapsack Knapsack Note that the new 4th edition (Sedgewick and Wayne, 2011) no longer contains a 2. Knapsack 2. Knapsack Knapsack Note that that the the new new 4th 4th edition edition (Sedgewick, (Sedgewick, Way Wa Note section on dynamic programming. Note Note that the the new new 4th 4th edition edition (Sedgewick, (Sedgewick, Wayne: Wayne: Algorithms) Algorithms) no no longer longer conconNote that the new 4thAlgorithms) edition (Sedgewick, Wayne: Algorithms) no longer con- programming. Note that the new 4th edition (Sedgewick, Wayne: Algorithms) longer conNote that the new 4th edition (Sedgewick, Wayne: Algorithms) notains longer contains ano section on dynamic programming. ew 4ththat edition (Sedgewick, Wayne: no longer cona section on dynamic Skiena is also indynamic favor of programming. starting with the top-down approach. His exposition protains section on dynamic programming. programming. tains a section on tains a on section on dynamic programming. tainsaa asection section on dynamic dynamic programming. ontains dynamic programming. Skiena isis also also in in favor favor of of starting starting with with the the top top Skiena ceeds in the following order: Skiena is isis also also in favor offavor starting starting with with the the top-down top-down approach. approach. His His exposition exposition Skiena isof also inoffavor ofapproach. starting with the top-down approach. His exposition is favor also in starting with the top-down approach. Hisin exposition Skiena alsoin in favor of starting with the top-down approach. His exposition proceeds in the following order: inSkiena favor ofSkiena starting with the top-down His exposition proceeds the following order: 1. Fibonacci numbers: recursively, with memoization, iteratively. proceeds proceeds in ininthe the following following order: inorder: the following proceeds in the following order: order: proceeds theproceeds following order: following order: 2. Binomial coefficients. 1. Fibonacci Fibonacci numbers: numbers: recursively, recursively, with with memoi memo 1. 1. 1.1.Fibonacci Fibonacci numbers: numbers: recursively, recursively, with with memoization, memoization, iteratively iteratively 1. Edit Fibonacci numbers: recursively, with iteratively memoization, 3. distance. 1. Fibonacci numbers: recursively, with memoization, iteratively Fibonacci numbers: recursively, with memoization, 2.iteratively Binomial coefficients coefficients umbers: recursively, with memoization, iteratively 2. Binomial 2. 2.2.Binomial Binomial coefficients coefficients 2. Longest Binomial coefficients 2. Binomial coefficients Binomial coefficients 3. Edit Edit distance distance 4. increasing subsequence. efficients 3. 3. 3. Edit Edit distance distance 3. Edit distance Edit5. distance 4. Longest Longest increasing increasing subsequence subsequence ce 3. Edit3.distance 4. Linear partition. 4. 4.4.Longest Longest increasing increasing subsequence subsequence 4. Context-free Longest increasing subsequence 4. Longest increasing subsequence Longest increasing subsequence 5. Linear Linear partition partition reasing subsequence 5. 6. grammar parsing. 5. 5.5.Linear Linear partition 5. partition Linear partition 5. partition Linear Linear partition 6. Context-free Context-free grammar grammar parsing parsing tion 6. 6. 6. Context-free Context-free grammar grammar parsing parsing 6. grammar 6. Context-free grammar parsingparsing 6. Context-free grammar parsing e grammar parsing Context-free 3. Critical Analysis 33 3 Critical Critical analysis 3analysis Critical analysis 3 Critical analysis Critical analysis analysis Critical analysis analysis 33 Critical In this section used we show show the results of of our our ana ana section we results In this section weresults show the results of our analysis ofIn thethis expositions in thethe textIn InIn this this section we we show show the the results of ofour our ourresults analysis analysis of of the the expositions used used in in used In wethe show the analysis ofexpositions the expositions In this section we show results of our ofanalysis ofexpositions the in in above. thissection section wethis show the results of analysis ofour theused expositions used in used the textbooks mentioned above. We We mostly mostly focu focu we show the results of section our analysis of the expositions inthe textbooks mentioned books mentioned above. We mostly focus on tasks used multiple textbooks. the the textbooks textbooks mentioned mentioned above. above. We We mostly mostly focus focus on on tasks tasks used used in ininmultiple multiple textthe mentioned above. We focus onmultiple tasks intextmultiple the textbooks mentioned above. Weused mostly focus on tasks used inused multiple text- textthe textbooks mentioned above. We focus onmostly tasks used in textbooks. mentioned above. We textbooks mostly focus on mostly tasks in multiple textbooks. books. books. books. books. books. 3.1. Matrix Chain Multiplication 3.1 Matrix Matrix chain chain multiplication multiplication 3.1 3.1 3.1 Matrix Matrix chain chain multiplication multiplication 3.1 chain multiplication 3.1 Matrix chain multiplication 3.1 multiplication Matrix chainMatrix multiplication chain Mnn of of Statement: Given Given isis aa sequence sequence M M11,,......,,M Statement: ,.,sequence ......,.M .M ,,M Mnof of rectangular rectangular matrices matrices such such that that Statement: Statement: Given Given is sequence M M .,M rectangular matrices such Statement: is M Statement: isrectangular a1a1,1sequence of rectangular matrices . .of ,rectangular M rectangular such Statement: 1 , n. . of n of matrices such that Statement: Given isGiven sequence M ×such ·× ×that Mnnthe can be be computed. computed. thematrices product M11that , .aaa. .sequence ,Given MisGiven matrices such that en is a sequence M1is 1n, .n × ·····that M can the product M n aofsequence × × ·M ×M M·nn·M can can be computed. Clearly, Clearly, it it can can be computed computed as as the the product M M ×Clearly, ·be · · can ×computed. Mbe be computed. Clearly, itbe beof as product can be computed. Clearly, itacomputed can computed sequence × Mbe computed. Clearly, it can becan computed as11astandard then product 1 n can ×··product ··computed. ···· × M computed. Clearly, it can be as the product M11the abe sequence ofcomputed n− −as standard matrix multiplic multiplic × · ·product ·× M can it can be computed as 1×× n sequence n matrix 1 be n ·can a a sequence sequence of of n n − − 1 1 standard standard matrix matrix multiplications, multiplications, and and the the result result does does not not a sequence of n − 1 standard matrix multiplications, and the result does not a sequence n − 1 standard matrix multiplications, and the result does order. not ofmatrix n − of 1 standard matrix multiplications, multiplications, and theresult result does not of standard matrix and the does not depend on their order.the depend on their order. Given the assumption assumption t n − a1 sequence standard multiplications, and the result does not depend on their Given depend on on their order. order. Given Given the the assumption assumption that that multiplying an an × × bab and aa× bΘ(abc) depend on theirthe order. Given the assumption thatan antakes aand depend on their order. Given the assumption that multiplying an × btakes aand a time, depend ontheir their order. Given assumption that and aand ba× ××ccaab matrix matrix Θ(abc) time, what what isis the the mos mos r depend order. Given the assumption that multiplying an amultiplying × b and Given the assumption that multiplying an multiplying and a abmultiplying matrix time, takes b b × × c c matrix matrix takes takes Θ(abc) Θ(abc) time, time, what what is is the the most most efficient efficient way way of of computing computing the the b × c matrix takes Θ(abc) time, what is the most efficient way of computing b time, × cwhat matrix takes time, is of the mostthe efficient way of product? computing the the × c matrix takes time, what is what theof most efficient way of computing the entire product? kes bΘ(abc) what is most theΘ(abc) most efficient way computing the entire isΘ(abc) the efficient way computing entire product? entire entire product? entire product? entire product? entireproduct? product? The problem problem isis solved solved by by dynamic dynamic program program The The problem isdynamic solved by dynamic programming over all intervals. I.e.,the the states can of indices i, j su The The problem problem is is solved solved by by dynamic dynamic programming programming over over all intervals. intervals. I.e., I.e., the the problem is by solved byall dynamic programming over all intervals. I.e., the problem is by solved dynamic programming over all intervals. The problem isThe solved programming over all all intervals. I.e., the states can beI.e., described by pairs pairs m is solved byThe dynamic programming over intervals. I.e., the states can be described by of indices i, j su states states can be be described by pairs pairs of ofpairs indices indices i,ji,j.jsuch jof such such that ijithat ≤ ≤For j. j. For For each each state state we we bedescribed described by pairs of indices For such that .iFor state compute thefor bestthe states canby be byofipairs indices such that iFor ≤ j. For each state we states can described by i,that j that such ≤ j.each each state we statescan can be described by of indices i,indices i i,≤ j. each state we compute thewe best solution for the matrices matrices in in th t scribed by pairs of be indices i,pairs jdescribed such that ≤ each state we compute the best solution compute compute the the best best solution solution for for the the matrices matrices in in the the given given range. range. compute the best solution for the matrices in the given range. compute best solution for therange. matrices in the given range.Issues compute the the bestthe solution for the matrices in the given range. Issues with with this this problem: problem: st solution for matrices in the given Issues Issues with this this problem: problem: Issues with this problem: Issues with this problem: Issueswith with this problem: this problem: 48 M. Forišek solution for the matrices in the given range. Issues with this problem: – – – – ●● Incomprehensible students who lack background in in linear algebra. The problem – Incomprehensible totostudents who lack background linear algebra. The The – Incomprehensible to students who lack background in linear algebra. feels unnatural and the cost function seems arbitrary. problem feels unnatural and the cost function seems arbitrary. problem feels unnatural and the cost function seems arbitrary. Incomprehensible to students who lack background in linear algebra. The ●● Unnecessary clutter: the input of ordered of integers. There are – Unnecessary clutter: the isa asequence sequence ofarbitrary. ordered pairs of integers. – Unnecessary clutter: the isinput is aseems sequence ofpairs ordered pairs of integers. problem feels unnatural and theinput cost function other similar problems on integer sequences and/or strings. There are other similar problems on integer sequences and/or strings. There are other similar problems on integer sequences and/or strings. Unnecessary clutter: the input is a sequence of ordered pairs of integers. ●are ● Lack of similar practical motivation. Finding a clear cleara practical application forforthis algo– Lack of practical motivation. asequences practical application this –other Lack of practical motivation. Finding clear practical application for this There problems onFinding integer and/or strings. rithm is probably impossible. algorithm is probably impossible. algorithm is probably impossible. Lack of practical motivation. Finding a clear practical application for this 3 ) Θ(n DP algorithm showninshown intext- in – The of of a much solution. The ●● The existence a of much betterbetter solution. The Θ(n DP 3algorithm shown ) DP algorithm –existence existence a better much solution. The algorithm isThe probably impossible. 3 textbooks is an overkill, Hu and Shing [4] gave a different O(n log n) solution textbooks is an overkill, Hu and Shing [4] gave aa different different O(n log books is an overkill, Hu and Shing (1982, 1984) gave solu) DP algorithm shown in n) solution The existence of a much better solution. The Θ(n for tion this problem. this for thisproblem. problem. textbooks isfor an overkill, Hu and Shing [4] gave a different O(n log n) solution for this problem. 3.2 Shortest paths in DAGs 3.2 Shortest in DAGs 3.2. Shortest Paths in paths DAGs 3.2 Shortest paths in DAGs Statement: Given Given is a weighted directed acyclicacyclic graph graph (DAG). Find the shortest Statement: is a weighted directed (DAG). Find the shortest path from vertex 1 to vertex n. path from vertex 1 to vertex n. acyclic Statement: Given is a weighted directed (DAG). the shortest path Statement: Given is a weighted directed acyclic graph graph (DAG). Find Find the shortest This isThis a 1very to be used some the instruction is vertex agood very problem good problem to beatused at point some during point during the instruction path from vertex vertex n. from vertex 1 to to . on dynamic programming – mostly because it is the most general one. Essentially dynamic programming mostly because it isduring the most one. Essentially This is aon very good problem to be –used at some point the general instruction This is aprogramming very good problem tosolutions becan usedbeat some point during the instruction dyall dynamic solutions viewed as computations on directed all dynamic programming can be viewed asone. computations onon directed on dynamic programming – mostly because it is the most general Essentially namicacyclic programming – mostly because it is the most general one. Essentially all dynaacyclic graphs: the states of the computation (i.e., sub-instances we are solving) graphs: solutions the statescan of the computation (i.e., sub-instances we are solving) all dynamic programming be viewed as computations on directed mic programming solutions can viewed as computations onwe directed acyclic are the vertices of the DAG, thebe recursive relation determines the and graphs: the are the vertices thecomputation DAG, the recursive relation determines the edges, and the acyclic graphs: states ofof the (i.e., sub-instances areedges, solving) the states of the computation (i.e., sub-instances we are solving) are the vertices of order in which an iterative DP solution evaluates the states must correspond to the to order which iterative DPrelation solutiondetermines evaluates the edges, states must correspond are the vertices ofin the DAG,anthe recursive and the relation determines the edges, and the order in which to an iterative topological order of thisof graph. a the topological order this graph. order ainDAG, which anrecursive iterative DP solution evaluates the states must correspond DP solution evaluates the states must correspond to a topological order of this Issues: Dasgupta uses this problem as the first problem on which a dynamic Issues: a dynamic a topological order of Dasgupta this graph. uses this problem as the first problem on which graph. programming approach is problem presented. strongly advise against wepro- we Dasgupta uses this the first problem on which a While dynamic programming is problem presented. Weproblem strongly against that. While Issues: Issues: Dasgupta usesapproach this as We the as first onadvise which athat. dynamic agree that the concepts mentioned in the previous paragraph are important, wethat we agree that the concepts mentioned in the previous paragraph arewe important, gramming approach presented. Westrongly strongly advise against that. agree programming approach is is presented. We advise against that.While While we believe that the proper time and way to learn them is by abstraction after already believe that the proper time and way to learn them is byimportant, abstraction the concepts mentioned in the paragraph are important, we believe thatalready the agree that the concepts mentioned inprevious the previous paragraph are weafter being with many specific problems solved using programming. being familiar with many problems using dynamic programming. proper time and way to learn is bythem abstraction after dynamic already being familiar with believe thatfamiliar the proper time and waythem tospecific learn is by solved abstraction after already Additionally, thisspecific problem requires students to dynamic be able store a Additionally, this problem requires students to betoprogramming. able to and storeaccess and access a being familiar with many problems solved using many specific problems solved using dynamic programming. graph, and the data structures needed to do so efficiently are more involved than graph,this andproblem the structures needed doable soable efficiently are more involved than Additionally, requires students totobe to store and access a a graph, Additionally, this data problem requires students to be to store and access static A detailed the of time and space complexity is also simple static arrays. A detailed theare time andinvolved spacethan complexity is also graph,simple and the the dataarrays. structures needed to do soofefficiently more than and data structures needed toanalysis do soanalysis efficiently are more involved simple static non-trivial as there are two different parameters (the number of vertices and the as there are two different parameters (the number of vertices and simple arrays. staticnon-trivial arrays. A detailed analysis of the time and space complexity is also A detailed analysis of the time and space complexity is also non-trivial as therethe number of edges). Thisdifferent isThis especially true if (the one if tries to solve this using using number of edges). isparameters especially one to solveproblem this problem non-trivial as are two number of vertices and the are two there different parameters (the number oftrue vertices andtries the number of edges). This is recursion without memoization. recursion without memoization. numberespecially of edges). This is especially truethis if one triesusing to solve this problem using true if one tries to solve problem recursion without memoization. recursion without memoization. 3.3 Longest common subsequence 3.3 Longest common subsequence 3.3. Longest Common Subsequence 3.3 Longest common subsequence Statement: Given Given two sequences (or strings), find one sequence that that Statement: two sequences (or strings), findlongest one longest sequence occurs as a (not necessarily contiguous) subsequence in each of them. occurs as necessarily contiguous) subsequence in each of that them. Statement: Given Given twoa (not sequences (or (or strings), find sequence Statement: two sequences strings), findone one longest longest sequence that occurs as a problems: Edit distance (Levenshtein distance) between two strings, problems: Edit distance (Levenshtein distance) between two strings, occurs (not asRelated anecessarily (notRelated necessarily contiguous) subsequence in each of them. contiguous) subsequence in each of them. DNA sequence alignment. DNA sequence Related problems: Edit alignment. distance (Levenshtein distance) between two strings, This is,This and certainly should be, (Levenshtein thebe, gold among introductory probis, and certainly should thestandard gold standard among introductory probRelated problems: Edit distance distance) between two strings, DNA DNA sequence alignment. lems using should dynamic The among solution only requires basic arlemscertainly solvable using dynamic programming. The solution only requires basic arsequence alignment. This is,solvable and be,programming. the gold standard introductory probrays,This the subproblems and the recurrence are natural, and each ofeach the of subprobrays, subproblems and the the recurrence are only natural, and subprobis,the and certainly should be, gold standard among introductory problems lems solvable using dynamic programming. The solution requires basic ar-the can be evaluated inprogramming. constant time. lems can beand evaluated in constant time. rays, lems the subproblems the recurrence are of the subprobsolvable using dynamic Thenatural, solutionand onlyeach requires basic arrays, the subthis problem can betime. used the differences between the the Later, this problem can bewhen used discussing when discussing the differences between lems can Later, be evaluated in constant top-down and the bottom-up top-down and the Later, this problem can be bottom-up used approach. when approach. discussing the differences between the top-down and the bottom-up approach. Towards a Better Way to Teach Dynamic Programming 49 problems and the recurrence are natural, and each of the subproblems can be evaluated in constant time. Later, this problem can be used when discussing the differences between the topdown bottom-up approach. roblem also leads toand anthe advanced topic: Hirschberg’s implementation This problem also leads to also an advanced Hirschberg’s implementation This problem leads to an antopic: advanced topic: Hirschberg’s ttom-up DP solution that can reconstruct optimal solution in linear implementation (Hirsch3] of a bottom-up DP solution that can reconstruct an optimal solution linear solution in linear berg, 1975) of a bottom-up DP solution that can reconstruct aninoptimal memory. memory. of the brute force approach (recursive search The time complexity Issues: The time complexity of the brute force approach (recursive search Issues: time complexity approach memoization) is hard toThe analyse exactly andofitthe is brute oftenforce neglected in (recursive search without without memoization) is hard to analyse exactly and it is often neglected in memoization) analyse exactly andwithout it is often neglected in textbooks. Cormen . Cormen only mentions itistohard be to “exponential-time” any deextbooks. Cormen only mentions it to be “exponential-time” without any dee Dasgupta and completely avoid mentioning it. any Skiena onlyKleinberg mentions it to be “exponential-time” without details, while Dasgupta and ails, while Dasgupta and Kleinberg completely avoid mentioning it. Skiena one to address it, showing completely a (non-tight) 3n lower bound for his version avoids mentioning it. bound Skiena theversion only one to address it, s the only one Kleinberg to address it, showing a (non-tight) 3n lower forishis t distance problem. showing a (non-tight) 3 lower bound for his version of the Edit distance problem. f the Edit distance problem. ed topic: Hirschberg’s implementation construct an optimal solution in linear knapsack 3.4. 0–1 Knapsack .4 0-1 knapsack rute force approach search : Given is an integer (recursive weight limit W and a collection of n items, each tatement: Given is an integer weight limit W and a collection of n items, each exactly and witi and is often neglectedcost in ci . Find a subset of items that nteger weight an arbitrary with an integerStatement: weight wi Given and an cost ci .limit Find a subset of items of that is arbitrary an integer and a collection items, each with el weight “exponential-time” without any denot exceeding the given limit andweight the largest possible total as a total weight not exceeding the given limit and the largest possible total anmentioning integer weight and an arbitrary cost . Find a subset of items that has a total weight mpletely avoid it. Skiena ost. n not exceeding the given limit many and thecopies largestofpossible totalare cost. n-tight) 3 lower bound for his version d problems: Knapsack where arbitrarily each item Related problems: Knapsack where arbitrarily many copies of each item are Coin change problems. Related problems: Knapsack where arbitrarily many copies of each item are availvailable. Coin change problems. a reasonably able. natural class of problems. Again, their advantage is that Coin change problems. This is a reasonably natural class of problems. Again, their advantage is that mentation only requires basic tools and that the recurrence relation This is arequires reasonably class of problems. Again,isrelation their advantage is that the he implementation only basicnatural tools and that the recurrence is implementation only requires basic tools and that the recurrence relation is simple. imple. itThe W and a collection of n items, each time. At some point, the stuwhole notion of pseudopolynomial Issues: The whole notion of pseudopolynomial time.point, At some Issues: The whole notion of pseudopolynomial time. At some thepoint, stu- the students need . Find a subset items that dytocost be cexplained why an of algorithm that runs in O(nW ) is not considi ents need to be explained why ananalgorithm in O(nW ) is consid- a polynomialto be explained why algorithmthat that runs runs in is not not considered n limit and the largest possible total ynomial-time algorithm. While this issue is orthogonal to the concept red a polynomial-time algorithm. thisisissue is orthogonal to the of concept time algorithm. WhileWhile this issue orthogonal to the concept dynamic programming, c programming, it is an inherent part of this task and it should come f dynamic programming, it ispart anof inherent part of this task and itduring should come it is an inherent this task and it should come up its analysis. From experibitrarily manyFrom copies of each item its analysis. experience, thisare may be the most challenging part p during its analysis. From experience, this may be the most challenging part ence, this may be the most challenging part of the problem for the students. blem for the students. f the problem forSedgewick the students. avoids the topictime of pseudopolynomial oblems. Again, their advantage is that wick avoids the topic of pseudopolynomial completely. It is time just completely. It is just menSedgewick avoids the of pseudopolynomial time completely. It is just Kleinberg also tioned that thetopic algorithm only usefulare if the are not huge. and the recurrence relation isiscapacities dols that thethat algorithm is only useful if the not capacities huge. Kleinmentioned thatavoids the algorithm is only useful if the capacities are not huge. Kleinthis topic completely. avoids this topic completely. erg also avoids this topic completely. Dasgupta addresses topic with“[...] a single brief ynomial time. At some point, the stu- the upta addresses the topic with a single brief note: they cannote: both“[...] they can both be solved Dasgupta addresses the topic with a single brief note: “[...] they can both hm that runs in O(nW ) is considin time, which is reasonable isbut small, but is not polynomial since the in O(nW ) time, which isnot reasonable when W when is small, is not e solved in O(nW ) time, which is reasonable when W is small, but is not this issue orthogonal to the conceptto al since theis input proportional to log W rather than.” W .” inputsize sizeisis proportional rather than olynomial since the input size is proportional to log W rather than W .” t part of this task and it should come his may be the most challenging part onacci numbers 3.5.numbers Fibonacci Numbers .5 Fibonacci opolynomial time completely. is just number. : Given n, compute the n-th It Fibonacci tatement: Given n, compute the n-th Fibonacci number. l if the capacities huge. source Klein-of what is possibly the simplest acci numbers are are an not excellent Statement: , compute the of -th Fibonacci number. Fibonacci numbers areGiven an excellent source what is possibly the simplest l recurrence relation. They can easily be used to demonstrate the efon-trivial recurrence relation. They can easily be used to demonstrate the efsingle briefasnote: “[...] theynumbers canrecursive both moization, a straightforward that computes their Fibonacci are anfunction excellent source of what is possibly the simplest nonect of memoization, as a straightforward recursive function that computes their onable when trivial W time. is small, but relation. is not They can easily be used to demonstrate the effect of s in exponential recurrence alues runs in exponential time. ional logissue W rather W .” The to only with than this as very simple problemrecursive is that the numbers a this straightforward Issues: Thememoization, only issue with very simple problemfunction is that that the computes numbers their values runs s grow exponentially and their values quickly exceed the range of stanin exponentially exponential time. hemselves grow and their values quickly exceed the range of stanger variables in most languages. And even if you use a programming ard integer variables in most languages. And even if you use a programming with arbitrary precision integers (e.g., Python), the size of these numanguage with arbitrary precision integers (e.g., Python), the size of these numbonacci a role innumber. estimating the time complexity of efficient programs. ers plays a role in estimating the time complexity of efficient programs. ource of what is possibly the simplest easily be used to demonstrate the ef- 50 M. Forišek Issues: The only issue with this very simple problem is that the numbers themselves grow exponentially and their values quickly exceed the range of standard integer variables in most languages. And even if you use a programming language with arbitrary precision integers (e.g., Python), the size of these numbers plays a role in estimating the time complexity of efficient programs. A common way to address this issue is to modify the problem: instead of computing common wayto toaddress address this issueiswe istoaim tomodify modify theproblem: problem: instead the exact value ofway the -th Fibonacci number to compute its value modulo some AAcommon this issue the instead ofof 9 Fibonacci computing the exact value the n-th number weaim aimto tocompute computing exact value ofofthe number we small integer.the (E.g., if the modulus isn-th 10 , weFibonacci are in fact computing the last 9compute decimal 9 9 , we are in fact value modulosome some smalljust integer. (E.g., the modulus itsitsvalue modulo small integer. (E.g., ififthe isis1010 ,sequence we are inmodulo fact digits of we would like to remark thatmodulus the Fibonacci .) Here .) Here we would just like to remark that computing the last 9 decimal digits of F .) Here we would just like to remark that computing the last 9 decimal digits of F n n any is necessarily periodic and this observation leads to asymptotically more efthe Fibonacci sequence modulo any m is necessarily periodic and this observation the Fibonacci sequence modulo any m is necessarily periodic and this observation ficient algorithms. leadstotoasymptotically asymptoticallymore moreefficient efficientalgorithms. algorithms. leads 4. Our Approach to Teaching Dynamic Programming Ourapproach approachtototeaching teachingdynamic dynamicprogramming programming 44 Our this final section wegive give detailed presentation how wesuggest suggest teach InInthis this final section a adetailed presentation ofofwe how we toto teach In final section wewe give a detailed presentation of how suggest to teach dynamic dynamic programming. For each task used we clearly state and highlight the new dynamic programming. For each task used we clearly state and highlight the new programming. For each task used we clearly state and highlight the new concepts it inconcepts it introduces, and we argue why our way of introducing them works. concepts it introduces, and we argue why our way of introducing them works. troduces, and we argue why our way of introducing them works. Note that we intentionally start with thetop-down top-down version dynamic proNote that intentionally the ofofdynamic proNote that wewe intentionally startstart withwith the top-down versionversion of dynamic programming, gramming,i.e., i.e.,bybyadding addingmemoization memoizationtotorecursive recursivefunctions. functions.This Thisisisintentional intentional gramming, i.e., by adding memoization to recursive functions. This is intentional and very signifiandvery verysignificant. significant.The Themain mainpurpose purposeofofthis thischoice choiceisistotoshow showthe thestudents students and cant. The main purpose of this choice is to show the students how to break up the design howtotobreak breakup upthe thedesign designofofananefficient efficientsolution solutioninto intomultiple multiplesteps: steps: how of an efficient solution into multiple steps: Implementa arecursive recursivealgorithm algorithmthat thatexamines examinesallallpossible possiblesolutions. solutions. 1.1.1. Implement Implement a recursive algorithm that examines all possible solutions. Usethe thealgorithm algorithmtotodiscover discovera arecursive recursiverelation relationbetween betweenvarious varioussubprobsubprob2.2.2. Use Use the algorithm to discover a recursive relation between various subproblems lemsofofthe thegiven givenproblem. problem. lems of the given problem. Addmemoization memoizationtotoimprove improvethe thetime timecomplexity, complexity,often oftensubstantially. substantially. 3.3.Add 3. Add memoization to improve the time complexity, often substantially. Optionally,convert convertthe thesolution solutioninto intoananiterative iterativebottom-up bottom-upsolution. solution. 4.4.Optionally, 4. Optionally, convert the solution into an iterative bottom-up solution. Themore moretraditional traditionalapproach approachthat thatstarts startswith withiterative iterative DPrequires requiresstudents students The The more traditional approach that starts with iterative DPDP requires students to do steps2 2and and4 4atatthe thesame sametime, time,without withoutgiving givingthem themgood goodtools toolstotododo totododosteps steps 2 and 4 at the same time, without giving them good tools to do the analysis and theanalysis analysisand andtotodiscover discoverthe theoptimal optimalsubstructure. substructure.InInour ourapproach, approach,step step the to1discover the optimal substructure. our approach, step solution, 1 gives them such a tool: givesthem them sucha atool: tool: oncewe weIn have therecursive recursive thearguments arguments 1 gives such once have the solution, the once we recursive have the function recursive define solution, the arguments of thewe recursive function define the thesubproblems, subproblems, and canexamine examine whether ofofthe recursive function define the and we can whether the subproblems, and we can examine whether the function gets called multiple times thefunction functiongets getscalled calledmultiple multipletimes timeswith withthe thesame samearguments. arguments.IfIfititdoes, does, we the we with the same arguments. If it does, we know that the problem does exhibit the optimal knowthat thatthe theproblem problemdoes doesexhibit exhibitthe theoptimal optimalsubstructure, substructure,and andininstep step3 3we we know substructure, and in step 3inefficient we mechanically convert our inefficient mechanically convert our inefficient solution intoanan efficient one.solution into an mechanically convert our solution into efficient one. efficient one. Lesson1:1:Fibonacci Fibonaccinumbers numbers Lesson Lesson 1: Fibonacci numbers Goals:Observe Observe arecursive recursivefunction functionwith withananexponential exponentialtime timecomplexity. complexity.DisDisGoals: Goals: Observe a arecursive function with an exponential time complexity. Discover the coverthe thesource sourceofofinefficiency: inefficiency:the thefunction functionexecutes executesthe thesame samerecursive recursivecall call cover source of inefficiency: the function executes the same recursive call many times. many times. many times. and Fibonaccinumbers numbershave haveaaawell-known well-knownrecurrence: recurrence: Fibonacci have well-known recurrence: FF and Fibonacci numbers 0 0==0,0,FF 1 1==1,1,,and = F + F . In our presentation we use the following Python ∀n > 1 : F wewe useuse the the following Python imple.. In In our ourpresentation presentation following Python ∀n > 1 : Fn n= Fn−1 n−1+ Fn−2 n−2 implementation: implementation: mentation: defF(n): F(n): def if n==1ororn==2: n==2: if n==1 return 1 return 1 else: else: Towards a Better Way to Teach Dynamic Programming 51 def F(n): if n==1 or n==2: return 1 else: Note that this implementation Note that this neglects implem Note thatreturn this implementation neglects theneglects case n = 0ncase and uses n base = 1 uses and F(n-1) + F(n-2) = 2 as the case. This intentional, Note that this implementation the n = 0 and nas =the 1isand n = 2 base case. Th Note that this implementation neglects the case n = 0 and uses n = 1 and n = 2 as thenbase case. intentional, the purposethe is apurpose more elegant analysis later. =this 2isas theThis baseiscase. This is intentional, elegant n = 2 as thelater. base Note case.that This intentional, theneglects purpose is acase more elegant analysis implementation the = 0 and uses is=a1 more and later. = 2 asanalysis We can now run We this program and have later. ater. the This is intentional, the purpose a more elegant analysis later.of the can now run this p Webase cancase. nowWe runcan this program and have it is compute consecutive values Fibonacci sequence: now run this program and have it compute consecutive values of the We can now run andnprogram have compute values of the Wethis now runuses this and have consecutive it compute consecutive values of Fibonacci the Fibo- sequence: n neglects the case ncan =program 0 and = 1 it and Fibonacci sequence: Fibonacci sequence: Fibonacci sequence: entional, the purpose is a more elegant analysis nacci sequence: for n in range(1,100): print( n, F(n for n in range(1,100): for nneglects in range(1,100): print( n, print( F(n) )and implementation the case n = 0 and uses n = 1 for n in range(1,100): n, F(n) ) for n in range(1,100): n, F(n) ) print( n, F(n) ) The first few rows of input will appear for n inprint( range(1,100): mase. andThis haveisitintentional, compute consecutive of the purpose is a this more elegant analysis The rows of i Noteofvalues that implementation neglects the case n = 0n and uses first n =few 1 and The first the few rows input will appear instantly but already around =slow 35 thepurpose program will down towill a crawl. W The first few rows of input will appear instantly but already around n = 35 The first few rows of input will appear instantly but already around 35 the prothe program slow dow n = 2 as the base case. This is intentional, the is a more elegant analysis = The first few rows of input will appear instantly but already around n = 35 this implementation neglects thedown case nto=a 0crawl. and uses n = empirically 1 and the program will slow Wea can measure that each that next value takes about 1.6 times longer to the program will slow down to crawl. We can empirically measure each nhethis program have itlater. compute values of the gram will slow down to1.6 aconsecutive crawl. can empirically measure that each next value nexttakes value takes about 1.6 program will slow down to crawl. We can empirically measure that each base case. This isand intentional, thea purpose is aWe more elegant analysis about times longer to compute than the previous one. t( n, F(n)next ) value takes What going onWhat here? way next value takes about 1.6 times longer to compute thanisthe previous one. e: value takes isThe going on here? We can now run this program and have it compute consecutive values ofeasiest the about 16 times longer to compute than the previous one. next about 1.6 times longer to compute than the previous one. What is going on is here? The easiest way easiest to see itway is to log each recursive call: What going on here? The to see it is to log each recursive call: Fibonacci sequence: isinstantly going on but here? Theit easiest way see itway is values to call: ow run this program and have compute What is already going on here? The easiest to log see each itofisthe torecursive logdef eachF(n): recursive call: ill What appear around n consecutive =to 35 1,100): print( n, F(n) ) def F(n): def F(n): uence: a crawl. We can empirically measure that each print(’calling F(’ + str(n) +F(’ ’) def F(n): def F(n): print(’calling for n in range(1,100): print( n, F(n) ) def F(n): print(’calling F(’ + str(n) ++ ’)’) longer to compute than instantly theprint(’calling previous one. ows of input will appear but already around n = 35 ... F(’ str(n) + ’)’) print('calling F(' + str(n) + ')') ... print(’calling F(’ + str(n) + ’)’) nge(1,100): print( n, F(n) ) ... asiest way to to asee it is to each recursive call: of input low down crawl. Welog can empirically that each The first few measure rows will appear instantly but already around n = 35 ... ... ... Already for small valuesthat like = 6 value we q bout 1.6 times longer than like the one. Already for n small the program will slow to a n crawl. can empirically measure each ew rows of input will to appear instantly butnprevious already around = 35 We Already forcompute small values = 6 down we quickly discover that the same function call is made multiple times. Here it is Already for small values like n = 6 we quickly discover that the same function Already for small like measure 6 times we that quickly discover that the same function on way to see to each call: is made multiple instr time next value takes about longer to compute thanrecursion thecall previous one. =1.6 Already foreasiest small like nit =isvalues 6empirically welog quickly discover the function willhere? slowThe down a values crawl. We can that each call istomade multiple times. Here it isrecursive instructional to same show thefor entire r(n) + ’)’) tree n = 6, we omit the picture here to call is made multiple times. Here it is instructional to show the entire recursion call is made multiple times. Here it is instructional to show the entire recursion tree tree for n = 6, we omit th What going on here? The easiest way to see it is to log each recursive call: all is made1.6 multiple times. Here it is instructional to show the entire recursion kes about times longer to compute than the previous one. tree for n =tree 6, we omit the picture herepicture to conserve space. Here, a suitable homework is to leave t for nthe =see 6, we omit the here to conserve space. Here, a suitable homew 6, we omit picture here to conserve space. ree for = 6,for we omit the picture here to conserve space. oing on nhere? The easiest way to it is to log each recursive call: Here,=+a suitable homework ishomework to leave the students analyse how many times ingHere, F(’ a+ suitable str(n) ’)’) F (n − k) gets called during the computa def F(n): Here, a suitable is to leave the students analyse how many times (n − k) gets called dur ishomework to leave the students how many times Here, a suitable is tocomputation leave the analyse students how many times ofFthe = 6 we quickly discover that the same function F (n − k)homework gets called during the of Fanalyse (n). For our version implementation the answeroftothe this print(’calling F(’ + str(n) + ’)’) F (n − k) gets called during the computation of F (n). For our version implementation thequesti answ F it (nis−instructional k) gets called during the computation of F (n). For our version of the gets called during computation of . For our version of the the implementation the to show the the entire recursion implementation answer to this question arequestion again precisely Fibonacci numbers. ... implementation the answer to this are again precisely the Fibonacci calling F(’ + str(n) + ’)’) numbers. mplementation the answer to this question are again precisely the Fibonacci toquickly this question are that againthe precisely the Fibonacci numbers. re values here tolike conserve space. all nanswer = 6 we discover same function numbers. Alternately, we can just directly estim numbers. Alternately, we can ju numbers. to times. leave the students analyse how many times we can just directly estimate the complexity: when computple Here itAlternately, is instructional to show the entire recursion Alternately, we can just directly estimate the time complexity: when Already for small values like n whole =whole 6 wetime quickly discover that the same function computing F (n), each leaf of the recursion Alternately, we can just directly estimate the whole time complexity: when computing F (n), each lea Alternately, we can just directly estimate the whole time complexity: when computation of here Fn(n). For our version of the ing ,6each leaf of the recursion tree contributes to the(This final result. omit thevalues picture conserve space. computing F (n), each leaf of the recursion tree contributes 1 to is the result. call is made multiple times. Here it is1instructional tofinal show the recursion or small like =to we quickly discover that the same function the rationale for choice to use computing F (n), each leaf of the recursion tree contributes 1 to theentire final result. (This is our the rationale for on omputing F (n), each ofstudents the recursion tree contributes 1= to the final result. question again precisely the Fibonacci ehis homework isare to leave the analyse how many times (This is the rationale for our choice to use 1 and 2 as base cases.) Hence, (This is the rationale for our choice to use n = 1 and n = 2 as base cases.) Hence, tree for n = 6, we omit the picture here to conserve space. multiple times. Here it leaf is instructional to show the entire recursion = nare there precisely F (n) leaves and thus (This is the rationale for our choice to use n = 1 and = 2 as base cases.) Hence, are precisely F (n) pl This is thethe rationale ourto choice to n our =and 1homework and =of2isas cases.) Hence, ed the computation of (n). For version the there are precisely FF(n) and thus precisely F (n) −recursion 1 students inner nodes inthere the Here, aleaves suitable tobase leave the analyse how many times there arefor precisely use leaves thusnprecisely inner nodes inIn the recur, weduring omit picture here conserve space. tree. other words, the running there are precisely F (n) leaves and thus precisely F (n) − 1 inner nodes in the recursion tree.ofInthe other w here are precisely Fto (n) leaves and thus precisely FFibonacci (n)of −the 1computation inner in the ctly estimate the time complexity: when eitable answer to this question again precisely the recursion tree. other the running time of the nodes computation of Fclearly (n) istoproFInare (nthe − k)words, gets called during the ofthe Fproportional (n). our version homework is leave students analyse how many times sion whole tree. In other words, theother running time computation isFor clearly the (n), w recursion tree. In words, the running time ofof computation of Fvalue (n) is of F to clearly proportional the ecursion tree. In other words, the running time of the computation of F (n) is tree contributes 1 value to to theof result. clearly proportional the ofthe F (n), which known toisgrow exponentially. implementation answer toof this question are again precisely the Fibonacci srecursion called during the computation offinal Fvalue (n). For our version of the portional to the ,the which is known tois(n), grow exponentially. clearly proportional to value F which known to grow exponentially. learly proportional value of F again (n), which is known to grow exponentially. ce to use ndirectly = 1 and nto=the 2 as basewhole cases.) Hence, can just the time complexity: numbers. on the answer to estimate this question are precisely the when Fibonacci Lesson 2: Memoization nd thus precisely F (n) −tree 1 inner nodes in the ach leaf of theLesson recursion contributes 1we to the just finaldirectly result. estimate Alternately, can the whole timeLesson complexity: when 2: Memoization 2:Memoization Memoization Lesson 2: he running time of the computation of F (n) is Lesson 2: Memoization ale for our choice to use n = 1 and n = 2 as base cases.) Hence, computing F (n), each leaf of the recursion tree contributes 1 to the final result. y, we can just directly estimate the whole time complexity: when Lesson 2: Memoization Goals: Learn about memoization andmem con LearnHence, about of F each (n), which known to grow Goals: Learn about memoization and conditions when it can F (n) leaves and thus precisely Fexponentially. (n) − 1 inner nodes in the (This is the rationale our to use n can =be1 applied. andapplied. n = 2 asGoals: base cases.) (n), leaf ofisthe recursion tree contributes 1for to thechoice final result. Goals: Learn about memoization and conditions when it be One of the points that is woefully neg Goals: Learn about memoization and conditions when it can be applied. Onenodes of theinpoints Goals: Learn about and conditions when applied. other words, the running time the computation ofcases.) Fit (n)can isinbe there are precisely F (n) leaves and thus precisely F (n) − 1isdifference inner the tha ationale for our choice to use n =of 1that and nwoefully = 2 as base One ofmemoization the points is woefully neglected traditional textbooks the One of the points neglected inHence, traditional textbooks is the difference between functions in the functi math One of that the is points thatin is woefully neglected inistraditional textbooks is the difference between the points that is woefully neglected traditional textbooks the al toOne the value of F (n), which is known to grow exponentially. recursion tree. In other words, the running time of the computation of F (n) is cisely F of (n) leaves and thus precisely F (n) − 1 inner nodes in the difference functions infunctions the mathematical and in the sense. programbetween between functions inbetween the mathematical sense andmathematical in sense the programming The outming sense. The output of a mathematical difference in the sense and in the programming sense. The output o difference functions intime the mathematical programclearly proportional to thesense value of Fin (n), which is known grow exponentially. e. In otherbetween words, the running computation ofand F (n) isthe ming sense. The output ofof athe mathematical function only depends on its to inputs: cos(π/3) iscos(π/3) the same valueisasthe cos(π/3 putoutput of a mathematical function onlyofdepends on its inputs: today today is the ming sense. The output a mathematical function only depends onsame its inputs: today sam ming The of a mathematical function only depends on its inputs: n andsense. conditions when it can be applied. rtional to the value of F (n), which is known to grow exponentially. cos(π/3) today is the today same is value as acos(π/3) tomorrow. For a function in aconsecutive comization puter program, two consecutive calls with t cos(π/3) the same value as cos(π/3) tomorrow. For a function in a comvalue as tomorrow. For function in a computer program, two puter program, two consec os(π/3) today is the same value as cos(π/3) tomorrow. For a function in a comefully neglected in traditional textbooks is the puter program, two consecutive calls with the same arguments may often return different values. There arevalues. lots ofThere differen 2: with Memoization program, twothe consecutive calls with thevalues. same arguments may often return calls withputer theLesson same arguments may oftenarguments return different There are lots of diffedifferent ar puter program, two same may often return thememoization mathematical sense and incalls the programut andconsecutive conditions when it can be applied. different values. There are lots of different reasons why this may happen. Forhappen. instance, the output of the function may de different values. There are lots of different reasons why this may For Memoization rent reasons why this may happen. For instance, the output of the function may depend instance, the output of the different values. There are lots of different reasons why this may happen. For hematical function only depends on its inputs: nts that is instance, woefully neglected inof traditional textbooks is the the output the function may depend on global variables, onvariables, Goals: Learn about memoization and conditions when itenvironcan beas applied. ment variables (such the current locale s instance, the output of the function may depend on global on environon global variables, on environment variables (such as the current locale settings), on ment variablesis(such thement output ofand the function may depend on global variables, on environasabout cos(π/3) tomorrow. Forconditions a function incurrent apoints comfunctions in the mathematical and the programvariables (such locale settings), onsettings), pseudorandom numbers, Oneassense ofthe the that is woefully neglected in input traditional textbooks the as th nnstance, memoization when itasin can be applied. on the from a user, etc. (Listing th ment variables (such the current locale on pseudorandom numbers, pseudorandom numbers, on the input from a user, etc. (Listing these is actually a lovely onfor thethe input from a user ment variables (such as the current locale settings), on pseudorandom numbers, with the arguments may often return utput of athat mathematical only depends ontextbooks its inputs: onsame input from a user, etc. these actually a lovely exercise difference between functions in isthe mathematical sense and in programealls points isthe woefully neglected in traditional is the students!) on function the input from a(Listing user, etc. (Listing these is actually a lovely exercise for students!) exercise for students!) on the input from a user, etc. (Listing these is actually a lovely exercise for of different reasons why this may happen. For he same value as in cos(π/3) tomorrow. a output function amathematical comstudents!) ming sense.For The ofinathe depends on its inputs: tween functions thestudents!) mathematical sense and in program- function Givenonly the above observation, memoizat Given the above obser tudents!) may depend on with global variables, on environoon consecutive same arguments may often return thethe above observation, memoization is a very straightforward concept cos(π/3) today is the same value as cos(π/3) tomorrow. For a function in ato comThe output of calls aGiven mathematical function only depends on its inputs: for students: we simply want avoid com Given the above observation, memoization is a very straightforward concept for students: we simply w Given the above observation, memoization is a very straightforward concept nt locale settings), on pseudorandom numbers, here aresame lots different reasons why this may happen. For forof students: we simply want to avoid computing the same thing twice. And itthatoften puter program, two consecutive calls with the same arguments may return ay is the value as for cos(π/3) tomorrow. For awant function in a computing comshould now be clear any function in o students: we simply to avoid the same thing twice. And it should now be clear that or students: we simply want to avoid computing the same thing twice. And it (Listing these is actually a lovely exercise for ut of theconsecutive function onthat global variables, on environshouldmay now be clear any function in function our program that is also that awhy function in different values. There are lots of return different reasons this may happen.inFor m, two callsdepend with the same arguments may often should now be clear that any in our program is also a function hould now be lots clear any function in ourthis program that is also function ch asThere the current locale settings), onthe pseudorandom numbers, instance, output of the function may on in global variables, on environes. are ofthat different reasons why may happen. Foradepend memoization is a very straightforward concept a user, etc. (Listing these is actually a lovely exercise for the base case. This is intentional, the purpose is a more elegant analysis ntation neglects the case n = 0 and uses n = 1 and an now run thisthe program compute consecutive values of the is intentional, purposeand is ahave moreitelegant analysis ci sequence: M. Forišek 52 ogram and have it compute consecutive values of the n range(1,100): print( n,above F(n)observation, ) Given the memoization is a very straightforward concept for stuthe mathematical sense can be memoized. (Such are sometimes called the mathematical sense be memoized. (Such functions are som dents: we simply want to avoid computing the functions same twice. And it should now first few mathematical rows of input will appear instantly but(Such already around n thing = 35can thebe mathematical s the mathematical sense can be memoized. (Such functions are sometimes called the sense can be memoized. functions are sometimes called athematical sense can be memoized. (Such functions are sometimes called print( n, F(n) ) thematical sense can be memoized. (Such functions are sometimes called pure functions.) pure functions.) matical sense can be memoized. (Such functions are sometimes called clear that any function in our program that is also a function in the mathematical sense ram will slow down to a crawl. We can empirically measure that each pure functions.) pureAn functions.) pure functions.) unctions.) nctions.) An sometimes interesting historical note: memoization only useful ons.) interesting historical note: memoization is not onlypure useful when it comes is not can be memoized. (Such functions are called functions.) ue takes about 1.6 times longer to compute than the previous one. put will appear instantly but already around n = 35 An interesting An interesting historical memoization isonly not only useful when it comes For instance, An interesting historical note:isnote: memoization isuseful not useful when it early comes nresting interesting historical note: memoization isonly not only useful when itcomes comes interesting historical note: memoization is not only when it to improving the asymptotic time complexity. many e historical note: memoization not useful when it comes to improving the asymptotic time complexity. For instance, many programs An interesting historical note: memoization is not only useful when it comes to imtproving is on here? The easiest way to see it is to log each recursive call: n togoing a crawl. We can empirically measure that each to improving the as toasymptotic improving the asymptotic time complexity. Forearly instance, many early programs toasymptotic improving the time complexity. For instance, many early programs the asymptotic time complexity. Forinstance, instance, many early programs roving the time complexity. For many programs that produced computer graphics used precomputed tables of sin ng the timeasymptotic complexity. For instance, many early programs that produced computer graphics used precomputed tables of sines and cosines proving the asymptotic time complexity. For instance, many early programs that protimes longer to compute than the previous one. that produced com that produced computer graphics used precomputed tables of sines and cosines thatcomputer produced computer graphics used precomputed tables ofcosines sines and cosines produced computer graphics used precomputed tables of sines and cosines hematical sense can be computer memoized. (Such functions are sometimes called oduced graphics used precomputed tables of sines and because table lookup was faster than the actualtable evaluation of a ced computer graphics used precomputed tables of sines and cosines ): because table lookup was faster than the actual evaluation ofand a floating-pointduced graphics used precomputed tables of sines cosines because The easiest way to see it is to log each recursive call: because table look because table lookup was faster than the actual evaluation of a floating-pointbecause table lookup was than the actual evaluation of a floating-pointsetable table lookup was faster than theactual actual evaluation ofa afloating-pointfloating-pointnctions.) ent(’calling lookup was than the evaluation of valued ble lookup was faster than the actual evaluation of afunction. floating-pointF(’ +faster str(n) +faster ’)’) valued function. lookup was faster than the actual evaluation of a floating-point-valued function. valued function. valued function. valued function. d function. nteresting historical note: memoization is not only useful when it comes function. ction. oving the asymptotic time complexity. For instance, many early programs + str(n) + ’)’) Lesson 3: same Fibonacci numbers revisited Lesson Fibonacci numbers revisited ady for small values n3:=used 6 weprecomputed quickly discover thatofthe function Lesson 3:like Fibonacci numbers revisited oduced computer graphics tables sines and cosines Lesson 3: Fibona Lesson 3: Fibonacci numbers revisited 3: Fibonacci numbers revisited on Fibonacci numbers revisited nade 3:3:Lesson Fibonacci numbers revisited Fibonacci numbers revisited multiple Here than it is instructional to Goals: show the entire recursion table lookuptimes. was faster the actual evaluation of a floating-pointSee the stunning effect memoization can have. Goals: the stunning effect memoization canhave. have. Goals: See the stu See theSee stunning effect memoization can n =See 6, we6Goals: omit the picture here tocan conserve space.can Goals: See the stunning effect memoization can have.memoization (using a simple array) Goals: the stunning effect memoization have. like nstunning = weSee quickly discover that the same function s: the stunning effect memoization can have. unction. See the stunning effect memoization can have. By applying to the Fibon the effect memoization have. By applying m By applying memoization (using a simple array) to the Fibonacci function, By applying (using a simple array) to the Fibonacci function, each of aapplying suitable homework is(using to leave the students analyse how times applying memoization (using ato simple array) the Fibonacci function, applying memoization (using atorecursion simple array) to many thetofunction, Fibonacci function, Here itBy ismemoization instructional to the entire ylying memoization (using simple array) to the Fibonacci function, applying a amemoization simple array) the Fibonacci each of the values F (1) through F (n) is only computed once, using memoization (using ashow simple array) the Fibonacci function, each of the values F each of the values F (1) through (n) is only computed once, using a asingle addi)values gets called during the computation of Fis FF (n). For our version of the each the values F (1) (n) only computed once, using single addithe values isthrough through isisonce, only computed once, using aa single addition. Thereof the values F(n) (1) through F (n) only computed once, using a single addipicture here to conserve space. of theeach values F(1) (1) through F (n) only computed once, using asingle single addithe values F through F (n) is only computed using a addition. Therefore, we suddenly have program that only performs Θ F (1) through F is only computed once, using a single addi3: Fibonacci numbers revisited tion. Therefore, we ntation to this question are again precisely the Fibonacci tion. Therefore, we suddenly aprogram program that only performs Θ(n) tion. Therefore, we suddenly have aperforms that only performs additions Therefore, wewe suddenly have athat program that only performs additions fore, suddenly have ahave program that Θ(n) only performs additions to compute the ork istion. to the leave the students how many times Therefore, weanswer suddenly aprogram program that only performs Θ(n) additions herefore, we suddenly have aanalyse only performs Θ(n) additions to compute the value FΘ(n) (n): Θ(n) quite an additions improvement over the origin fore, we suddenly have ahave program that only additions to compute the val to compute the value Fimprovement (n): quite an improvement over theexponential original exponential to the value F (n): quite an improvement over the original exponential to compute value F (n): quite an improvement over the original exponential g.pute the computation of F (n). For our version of the mpute the value Fvalue (n): quite over the original exponential :an quite an can improvement over the original time. The See the stunning effect memoization have. the value Fthe (n): quite an improvement over the original exponential time. The above Python program can now above easily Pycompute F (1000 the value Fcompute (n): quite an improvement over the original exponential time. The above P time. The above program can now easily compute F (1000). nately, we can just directly estimate the whole complexity: when time. The above Python program can now compute F (1000). to above this question are again precisely the Fibonacci time. The above Python program can now easily compute F (1000).with arbitrarily The above Python program can now easily compute F (1000). thon program now easily compute . function, applying memoization (using acan simple array) toeasily the Fibonacci The Python program can now easily compute F (1000). (Note that Python operates large integers. above Python program canPython now easily compute Ftime (1000). (Note that Pyt that operates with arbitrarily large integers. The Fing (n), each leaf of the recursion tree contributes 1 using tonumber theaThe final result. (Note that Python operates with arbitrarily large integers. The n-th FiNote that Python operates with arbitrarily large integers. The n-th Fi-The (Note that Python operates with arbitrarily large integers. The n-th Fi- num(Note that Python operates with arbitrarily large integers. Fibonacci the values F (1) through FPython (n) is only computed once, single addiote Python operates with arbitrarily large integers. Fibonacci is n-th exponential in-th nn-th and therefore has Θ(n) digits hatFthat Python operates with arbitrarily large integers. The n-th Fibonacci number is bonacci number isais inin n therefore has Θ(n) digits. InRAM the RAM the rationale for our choice to use ntherefore = 1nand n = 2Θ(n) as base cases.) Hence, bonacci number ishave exponential in and therefore has Θ(n) digits. Indigits. the estimate the whole time complexity: when number exponential in nexponential and therefore has digits. the RAM bonacci number nand and therefore Θ(n) Inmodel, the RAM herefore, we program that only performs Θ(n) additions ber is exponential in and therefore has has digits. Incomplexity the RAM the actual icidirectly number isissuddenly exponential in nexponential and has Θ(n) digits. InIn the RAM model, the actual time of the above algorithm is O mber is exponential in n and therefore has Θ(n) digits. In the RAM 2 2 2 model, the actual 2 model, the actual time complexity ofof above algorithm is addition O(n ),each as), each el,ute precisely Fthe (n) leaves and precisely Fover (n) −the 1is inner nodes in the 2 algorithm model, actual time complexity of the above algorithm isaseach O(n ), of the recursion contributes to the final result. actual time complexity the above algorithm is2O(n O(n each model, the actual time complexity the above is as O(n as O(n) each -digit the value F tree (n): quite anthus improvement the original exponential the actual time complexity ofof1of the above algorithm is ),,), as each addition of two O(n)-digit numbers takes steps.) time complexity the above algorithm is as of two actual time complexity of the above algorithm O(n ), as each addition of two O( addition of two takes O(n) nrhe tree. Intoaddition other words, the running time of the computation of F (n) isto highlight the contrast: exponential addition ofn two numbers takes O(n) steps.) choice use = 1 O(n)-digit and nO(n)-digit = 2takes as base cases.) Hence, on of two O(n)-digit numbers takes O(n) steps.) above Python program can now easily compute F (1000). n of two O(n)-digit numbers O(n) steps.) Here it steps.) is steps.) important tim of two O(n)-digit numbers takes O(n) two O(n)-digit numbers takes O(n) steps.) numbers takes numbers steps.) Here it is impo Here it is important to highlight the exponential time without roportional to the value F to (n), isthe known tocontrast: grow exponentially. Here itto isoperates important highlight contrast: exponential time without vs. aves and thus precisely Fofit(n) 1 which inner nodes in the ere itis isimportant important toHere the contrast: exponential time without vs. e it that Python with arbitrarily large integers. The n-th Fito highlight the contrast: exponential time without vs. polynomial time with memoization. It isvs. also instructional to dra is important highlight the contrast: exponential time without vs. Here it ishighlight important to highlight the contrast: exponential time without vs. is − important to highlight the contrast: exponential time without vs. polynotime w polynomial time with memoization. It(n) is also to draw a new, col-for npolynomial polynomial time with memoization. It isFalso instructional tocoldraw arecursion new, colds, the running time ofinthe computation of is omial time withmemoization. memoization. It isalso alsoinstructional instructional draw athe new, colnumber is exponential n and therefore has digits. In mial time with Italso is totoinstructional draw new, colversion ofathe entire tree = 6. time with memoization. Itwith is instructional tois draw ainstructional new, polynomial time with memoization. Italso also to draw a new, col-lapsed mial time memoization. It lapsed isΘ(n) instructional toRAM draw a new, collapsed version of version of th 2 lapsed version ofentire theofto entire recursion treen for = 6. ), as each lapsed version of is the recursion = is 6.n O(n value ofthe Flapsed (n), known grow exponentially. d version of thewhich entire recursion tree for n=tree =algorithm the actual time complexity the above version of the entire recursion for n 6.6.=for ion of entire recursion tree for ntree = 6. version ofrecursion the tree entire recursion n = 6. 2: Memoization the entire for 6.tree for n of two O(n)-digit numbers takes O(n) steps.) Lesson 4: Maximum weighted independent set on a line Learn about memoization and conditions when iton cana be applied. it 4: is important to4: highlight the contrast: exponential time without vs. Lesson 4: Maxim Lesson Maximum weighted independent set on line 4: Maximum weighted independent set on aon line on Maximum weighted independent setaindependent line nMaximum 4: Lesson Maximum weighted independent set on a line Lesson 4: Maximum weighted setset aathe line weighted independent set on line Lesson 4: weighted independent on a line of the points is Maximum woefullyItneglected in traditional textbooks is mial time with that memoization. is also instructional to draw a new, colGoals: Encounter the first problem solvable using dynamic progra Goals: Encounter Goals: Encounter the first problem solvable using dynamic programming. Goals: Encounter the first problem solvable using dynamic programming. Learn s: Encounter the first problem solvable using dynamic programming. Learn ezation between functions inwhen the sense and in the programversion of the entire recursion tree for n =problem 6.dynamic and conditions itmathematical can be applied. Goals: Encounter the first solvable using dynamic programming. how to Encounter the first problem solvable using Learn how toprogramming. write a brute force solution inLearn a Learn good way, and how to use counter the first problem solvable using dynamic programming. Learn Goals: Encounter the first problem solvable using dynamic programming. Learn how to write a bru how to write a brute force solution in a good way, and how to use memoization how to write a brute force solution in a good way, and how to use memoization o write a brute force solution in a good way, and how to use memoization nse. The output of a mathematical function only depends on its inputs: is woefully neglected in traditional textbooks is the write a brute force solution inand ato good way, andmemoization how “magically” a brute force solution in a good way, how to memoization use “magically” turntoit use intomemoization an efficient to algorithm. tewrite a brute force solution in a good way, and how to use how to write a it brute force solution in a agood way, in and how to use memoizationto “magically” tur to “magically” turn it into an efficient algorithm. “magically” turn into an efficient algorithm. agically” into efficient algorithm. today isturn the same value as cos(π/3) tomorrow. For function a comns into the mathematical sense and in the programgically” ititinto an efficient algorithm. Statement: Given is a sequence of n bottles, their volumes a turn itan into an efficient algorithm. lly” turn itturn into an efficient algorithm. 4: Maximum weighted independent set on a line Statement: Giv to “magically” turn it into an efficient algorithm. Statement: Given isnn abottles, sequence of nvolumes their v0 through Statement: aof sequence n bottles, their volumes are v0arethrough through atement: Given isa aGiven sequence of bottles, their volumes are v volumes two calls the same arguments mayvare often return aogram, mathematical function only depends onofits inputs: tement: Given sequence their vmuch .bottles, Drink as you can, given that you cannot drink vn−1 ent: Given isconsecutive a is sequence ofis nwith bottles, their volumes are 0 0through 0asthrough v Statement: Given is a sequence of bottles, their volumes are through through Statement: Given is a sequence of n bottles, their volumes are v n−1 . Drink as mu Drink ascan, much as you can, given that youhappen. cannot drink any 0 two .vmuch Drink much assolvable you can, given you cannot drink from any two Drink much you can, given that you cannot drink from any two from n−1 values. There are lots of different reasons this may For value as cos(π/3) tomorrow. For a function inwhy athat comthe problem using dynamic programming. Learn Drink asas asas you given that you cannot drink from any two adjacent bottles. kEncounter asvn−1 much as . first you can, given that you cannot drink from any two adjacent bottles. Drink as much as you can, given that you cannot drink from any two adjacent bottles. . Drink as much as you can, given that you cannot drink from any two v adjacent adjacent n−1 ent bottles. the output ofbottles. the bottles. function may on global variables, on environtive calls with same arguments may often return write a brute force solution in a depend good way, and how to problem use memoization nt bottles. This has a very short and simple statement and o ottles. This problem adjacent bottles. problem has a settings), very short and simple statement and requires a input. But the main rh This problem has alocale very short and statement and requires a the his problem has avery very short and simple statement andonly only requires aonly riables as the current onsimple pseudorandom numbers, of (such different why this may happen. For ically” turn it into an efficient algorithm. slots problem has areasons short and simple statement and requires aonly simple one-dimensional array torequires store oblem has aThis very short and simple and only requires aonly This problem has astatement very short and simple statement and a simple one- one-dimens simple array store the input. But the main reason why we become problem has athese very and simple statement and onlywe requires asimple simple one-dimensional array to store the input. But the main reason why eement: one-dimensional array tostore store the input. But the main reason why we nput from aThis user, etc. (Listing istoshort actually aBut lovely exercise for unction may depend on global variables, on environGiven is one-dimensional aarray sequence of ninput. bottles, their volumes are vwhy one-dimensional to the But the main reason why we elected to use this as the first example apparent once -dimensional array to store the But the main reason we 0 through dimensional array to input. store the input. the main reason why we electedwill to use this as elected toyou use this as the first example will become apparent once wereason implement elected to use this as thewill first example become apparent once we implement simple one-dimensional array towill store the input. But the main weelected to use this drink use this the firstcan, example will become apparent once implement !) current locale settings), on pseudorandom numbers, asthis much as given that you cannot drink from any two totothis use asasfirst the first will become apparent once we implement and examine awe brute force solution for why this problem. use as the example become apparent once we implement the firstexample example will become apparent once we implement and examine a brute force and examine a bru and examine aactually brute force solution forstraightforward this a brute force solution this problem. xamine brute force solution for this problem. n the above observation, memoization isfor aexercise very concept elected to use this as the first example willproblem. become once implement etc. (Listing these is aproblem. lovely for tamine bottles. a aexamine brute force solution for this problem. Our goal is toapparent implement a we recursive solution that generates ne a and brute force solution for this problem. solution for this Our goal is to Our goal is implement athat recursive solution that generates all among valid goal is implement asolution recursive solution that generates all valid sets sets ur is implement ato recursive solution that generates all valid ents: weisOur simply to computing thethat same thing twice. And itsets and examine aaavoid brute force solution forbottles this problem. has awant very short and simple statement and only requires a best r problem goal toto implement recursive generates all valid sets of and chooses the them. Such a recursive so al isgoal to implement ato recursive solution generates all valid sets Our goal is to implement a recursive solution that generates all valid sets of bottles of bottles and cho of bottles and chooses the best them. acan recursive can ofand bottles and chooses the best among them. Such areason solution can ttles chooses the among them. athe recursive solution can ow clear that any function inthe our program that ison also arecursive function inbebesolution ation, memoization is abest very straightforward concept Our goal is to implement aSuch recursive solution that generates allbe valid sets the last bottle o one-dimensional array to store input. But main why we les be and chooses the best among them. Such abased recursive solution can aSuch simple observation: either webe choose and chooses the best among them. Such aamong recursive solution be and chooses the best among them. Such a recursive solution can be based on a simple based on a simple based aand simple observation: either we choose the last bottle ordon’t. we don’t. If from on a on simple observation: either we choose the last bottle or Ifwe If can onuse asimple simple observation: either we choose the last bottle or we don’t. If nt to avoid computing the same thing twice. And itthem. tosimple this as the first example will become apparent once we implement of bottles chooses the best among Such adon’t. recursive solution be among the first on abased observation: either we choose the last bottle or we we don’t, want to find the best solution observation: either we choose the last bottle or we we don’t. If observation: either we choose the last bottle or we don’t. If we don’t, we want to find the we don’t, we want wefind don’t, want to find the best solution from among the first nbottles. − bottles. we don’t, we want to find the best from among the n − 1to n’t, we want to find the best solution from among the first − bottles. ny function into our program that isfrom also asolution function in mine awant brute force solution for this problem. ’t,want we find the best solution from among the first 1 1first bottles. If do, not allowed take thedon’t. penultimate bottle and th based on awe simple observation: either we choose the last bottle or1we If we to the best solution among thewe first nwe −are 1nn− bottles. If we do, we are no best solution from among the first – 1 bottles. If wewe do, are not allowed to the take the penultimate bottle and therefore we are Ifnot we do, are not allowed to take penultimate bottle and therefore we are we are not allowed takepenultimate the penultimate bottle and therefore we are goal is to implement ato recursive solution that generates all among valid sets o, we are not allowed to take the penultimate bottle and therefore we are looking forfrom the best solution among the first n − 2 bottles. edo, are allowed towe take the bottle and therefore we are we don’t, we want to find the best solution the first n− 1 bottles. looking for the bes If we do, we are not allowed to take the penultimate bottle and therefore we are looklooking for the best solution first2nbottles. − 2 bottles. looking for the best solution among the first ng for the best solution among the nbottles. − bottles. and chooses the best among them. athe recursive solution can beand therefore we are ges for the best solution among the first nSuch − 2 2bottles. the best among the nfirst − 2among Ifsolution we do, we are notfirst allowed to take then − penultimate bottle ing for the best we solution among 2 we bottles. vthe= first [ 3, –1, 4, 1, 5, If9, 2, 6 ] n a simple observation: either choose the last bottle or don’t. looking for the best v = [ 3, 1, 4, 1 v 5, = [9, 3, 1, 4, 5,2,9,6 2, 6 ] the first n − 2 bottles. v =1, [ 3, 1, 4, 1, 5, 9, ]among [1, 3, 1, 4, 5, 9, 2, ]solution 1, 4, 1, 5, 9, 661, ]3, 4, 2, ][ t,3, we want to1,find the best solution among the 2, first 6n − v 62, 1, from 4, 1, 5, 9, ] 1 bottles. = , we are not to take the 5, penultimate and therefore we are vdef =allowed [ 3, def 1, 4,solve(k): 1, 9, 2, 6 bottle ]def solve(k): def solve(k): solve(k): def solve(k): solve(k): olve(k): (k): for the best solution among the first n − 2 bottles. def solve(k): 3, 1, 4, 1, 5, 9, 2, 6 ] lve(k): The first few rows of input will appear instantly but already around n = 35 the program will slow down to a crawl. We can empirically measure that each next value takes about 1.6 times longer to compute than the previous one. ’’’ returns best soluti ’’’the returns the b What is going onreturns here? The easiest way to see itfor is tothe log first each recursive call: ’’’ ’’’ best solution k bottles ’’’ returns the bestthe solution for the kfirst k bottles ’’’ if k == 1: return v[0] ’ returns the best solution for the first bottles ’’’ thematical sense can be memoized. (Such functions are sometimes called ’’’ returns the best if k == 1: return ’’’ returns the best solution for the first k bottles ’’’ Towards a Better Waythe to Teach Dynamic Programming ’’’ returns best solution for first k bottles ’’’ if k == 2: 53 kreturn ==the 1: return if k return == 1: v[0] return max( v[0] first kkif bottles ’’’ kthe ==F(n): 1: v[0] nctions.) if k if == v[0 def turns the best solution forv[0] thev[0] first k bottles ’’’ k1:==return 2: return if == 1: return if k == 1: return v[0] if k == 2: return max( v[0], v[1] ) if k == 2: return max( v[0], v[1] ) return max( solve(k-1), v[k k == 2: return max( v[0], v[1] ) interesting historical note: memoization is not only useful when it comes ''' returns the best solution for the first k bottles ''' if k return == 2: return max F(’ + str(n) + ’)’)v[1] ) = 1:print(’calling return v[0] max( solve if k == 2: return max( v[0], ifasymptotic k == 2:solve(k-1), return max(v[k-1] v[0], v[1] ))many return max( solve(k-1), v[k-1] + solve(k-2) ) return max( + solve(k-2) ) ) ’’’ returns the best solution for the first turn max( solve(k-1), v[k-1] + solve(k-2) oving the time complexity. For instance, early programs return max( solve(k-1 if ksolve(k-1), 1:) return v[0] = 2:... return max(max( v[0], v[1] == return v[k-1] + solve(k-2) return solve(k-1), v[k-1] +tables solve(k-2) ) )k cosines It should be obvious th solve(k-2) ) max( if == v[0] now oduced computer graphics used precomputed of sines v[1] and max( solve(k-1), +2: solve(k-2) ) ifv[k-1] k == return max( v[0], ) 1: return It should now that be obvi It should now bethat obvious that the time complexity of this program is expoAlready for small values like n =actual 6complexity we quickly discover that the same function Itnow should now be obvious the time complexity of this program is exponential – in fact, the number of rec be obvious that the time of this program is expoif k == 2: return max( v[0], v[1] ) ehould table lookup was faster than the evaluation of a floating-pointIt should now be obvious t return max( solve(k 1), v[k-of 1]of solve(k ) nential – in fact, the num should now be obvious that the time complexity program is expo+this -complexity It –It should now obvious that time this program is-2) exponential –program inbe fact, the number of recursive calls to max( evaluate solve(n) issame call made multiple times. Here itrecursive is the instructional to show the entire recursion in fact, the number of calls needed toneeded evaluate solve(n) is nential precisely the as the number of complexity of this is expo–nential inisfact, the number of recursive calls needed to evaluate solve(n) is return solve(k-1), v[k-1] + solve(k-2) function. – in fact, the number d now be obvious that the time complexity of this program is expothe’’’ same as the nential – in fact, the number of recursive calls needed to evaluate solve(n) is kprecisely nential –as fact, number recursive calls needed to solve(n) is ’’’ returns the best solution for first bottles the same as the number ofevaluate callsto needed to evaluate Fthe (n) our first It omit should now be obvious that the time complexity of this program isin exponential – insame tree for n precisely = 6,insame we the picture here to to conserve space. precisely the asthe the number of calls needed evaluate F (n) in our first lesson. alls needed to evaluate solve(n) isofneeded y the same the number of calls Fto (n) inevaluate our precisely the as the num fact, the number of recursive calls needed to evaluate solve(n) is first lesson. precisely the same as the number of calls needed evaluate F (n) in our first precisely the same asinthe calls needed evaluate F (n) inisbe our first if k of==calls 1: needed returntoto v[0] irst ktobottles ’’’ Here, alesson. suitable homework is to leave the students analyse how many times fact, the number ofnumber recursive evaluate () precisely the same astime lesson. A key observation to make here eeded evaluate F (n) our first It should now obvious that the complexity lesson. e same as the number of calls needed to evaluate F (n) in our first lesson. A key observation toisom nF3: Fibonacci numbers revisited lesson. ==here 2: return max( v[0], v[1] )tion. Ato key observation toif make isbe that solve can bein considered anumber pure − key k) gets called during the computation of F (n). For our version the A observation to make here isk that solve can be considered a pureof funcEvenfuncthough it does access the the number of calls needed to evaluate in our first lesson. nential – fact, the of recursive calls needed ey(n observation make here is that solve can considered a pure funcA key observation to make A key observation to make here is that solve can be considered a pure func- )tion. Even though it doe A be key observation tothe make here is that solve can be considered funcreturn max( solve(k-1), v[k-1] +a pure solve(k-2) tion. Even though itfuncdoes access the global variable v,the its contents remain the implementation the to this question are again precisely the Fibonacci tion. Even though itanswer does access the global variable v, its contents remain the same throughout the execution of th precisely same as the number of calls needed to ev olve can considered a pure ven though it does access global variable v, its contents remain the A key observation to make here is that solve can be considered a pure function. tion. Even though it does acc bservation to make here is that solve can be considered a pure funcSee the stunning effect memoization can have. Even though it does access the global variable v, its contents remain the same throughout the exe tion. Even though it remain does the global variable v,Hence, its contents remain the k-2) )tion. same throughout theaaccess execution of the program. we may apply memoizanumbers. same throughout the execution of the program. Hence, we may apply memoizationthe to solve in order to reduce the lesson. variable v, its contents the hroughout the execution of the program. Hence, we may apply memoizaEven though it does access the global variable v, its contents remain same throughsame throughout the execution hough it does access the global variable v, its contents remain the applying memoization (using simple array) to the Fibonacci function, tion to solve isinexpoorder to same throughout execution of the program. Hence, we apply memoizan Ittothe should be obvious the time complexity of here thissolve program same throughout thethe execution of the program. Hence, we may apply memoizanmay ntime tion to solve in order reduce the time complexity from Θ(φ ) toto Θ(n). Alternately, we can just directly estimate the whole complexity: when tion to in order to reduce timenow complexity from Θ(φ )observation to Θ(n). (Note can easily be tu am. Hence, may apply memoizaA key to make is that solve can be solve insolve order to reduce time complexity from Θ(φ )that to Θ(n). n tionthat toinsolve in order to redu out the execution of the program. Hence, we may apply memoization solve order ghout the execution of the program. Hence, we may apply memoizathe values Fwe (1) through Fthe (n) isreduce only computed once, using afrom single naddition to solve in order to the time complexity Θ(φ ) to Θ(n). (Note that solve nential –the inatime fact, the number of recursive calls needed to evaluate solve(n) iscan tion tofrom solve in order to reduce complexity Θ(φ )final towe Θ(n). ity of this program exponis (Note that solve can easily be turned into afrom true pure ifaweaccess pass athe n computing Fsuddenly (n), each leaf of the recursion tree contributes 1function to the result. (Note that solve can easily be turned into a true pure iffunction pass constant reference to v to solve as mplexity Θ(φ ) to Θ(n). tion. Even though it does global variable v, te that solve can easily be turned into true pure function if we pass a (Note that solve can easil ve in order to reduce the time complexity from Θ(φ ) to Θ(n). herefore, we have a program that only performs Θ(n) additions to reduce the time complexity from to . reference to v toa (Note that solve can easily be turned into a number true pure if we pass a constant precisely the same as offunction calls needed to F (n) in our Hence, first ded to evaluate solve(n) issolve that solve can easily be turned into athe true pure function if Hence, we pass aevaluate constant reference to v to solve as a second argument.) (This is(Note the rationale for our choice to use n = 1 and n = 2 as base cases.) reference to v to as a second argument.) same throughout the execution of the program. to asolve true pure function if we pass a tconstant reference to v to solve as a second argument.) constant reference to v to solw pute the value F (n): quite an improvement over the original exponential at can easily be turned into a true pure function if we pass a (Note that can be turned into a true pure function if we pass a constant constant reference to vlesson. to solve as a second argument.) othere evaluate F (n) in our first constant reference to v solve to solve aseasily a second argument.) are precisely F (n) and thus precisely F (n) − 1 inner nodes in the tion to solve in order to reduce the time complexity fro argument.) he above Python program can now easily compute F (1000). erence to v toreference solve astoavleaves second argument.) to solve as a second argument.) Lesson An iterative solution t A key observation make here isthat thatofsolve solve can be5:considered funcrecursion tree. Inoperates other words, the running time integers. oftothe computation F (n) is easily Lesson 5:a pure An iterative (Note can be turned into a true pu te that Python with arbitrarily large The n-th FiLesson 5: An iterative solution to the previous problem Lesson 5: An iterative solution to the previous problem 5: iterative solution to the problem tion. Even though itis previous does access thereference global variable v, its contents remain thesolu nclearly be An considered a pure funcLesson 5:aAn iterative proportional toiterative the value of Fprevious (n), known to grow Lesson 5: solution to the problem constant as second argument.) number isproblem exponential in iterative n and therefore has Θ(n) digits. In theexponentially. RAM to v to solve Lesson 5:Lesson AnAn iterative solution towhich the previous problem about thememoizaduality bet 5: An solution tothe the previous problem previous same throughout execution of2 the program.Goals: Hence,Learn weGoals: may apply eAn v, iterative its contents remain the Learn about the solution to the previous problem Goals: Learn about duality between the and the proach. bottom-up ap-Learn the actual time complexity ofthe the above algorithm is top-down O(n ), asbottom-up each Goals: Learn about the duality between the top-down and the apLearn about the duality between the top-down and the bottom-up apn Goals: about the duali tion to solve in order to reduceand thethe time complexity from Θ(φ ) to Θ(n). ce, we may Learn apply memoizaGoals: Learn about the duality between top-down bottom-up ap- proach. Goals: about the duality between thethe top-down bottom-up apGoals: Learn about theO(n) duality between the top-downand andthe the bottom-up approach. proach. n of two 2: O(n)-digit numbers steps.) proach. The subproblems solved by the m top-down and the bottom-up apn Memoization proach. rn about the duality betweentakes the top-down and the bottom-up ap-An into Lesson 5: iterative solution to subproblems the previous p yeLesson from Θ(φ ) to Θ(n). (Note that solve can easily be turned a true pure function if we pass asolv proach. The esubproblems it proach. is important to highlight the contrast: time without vs.naturally The subproblems solved by theexponential memoized recursive solution can be be naturally The subproblems solved by the memoized recursive solution can be ordered by size. The same recurren solved by the memoized recursive solution can be naturally The subproblems solved by The subproblems solved by the memoized recursive solution can naturally orconstant reference torecursive vrecursive to solve as a second argument.) e pure The function if we passsolved asolved The subproblems by the memoized solution can be naturally ordered by size. The sam subproblems the memoized can be naturally Goals: Learn about memoization and conditions when it solution can be applied. with memoization. It by issame also instructional toused draw a write new, colordered by size. The recurrence can now be used to write an duality iterative ordered by size. The same recurrence can now be to an iterative solution. A solution. side-by-side comparison Goals: Learn about the between top-down dmial recursive solution can be naturally by time size. The same recurrence can now be used to write an iterative ordered by size. The same re problems solved bysize. the memoized recursive solution can be naturally dered by size. The same recurrence can now be used to write an iterative solution. ent.) A the side-by-side co ordered by The same recurrence can now be used to write an iterative ordered by size. The same recurrence can now be used to write an iterative One of the points that is woefully neglected in traditional textbooks is the version of the entire recursion tree for n = 6. solution. A side-by-side comparison of both programs helps highlight parts thatsame solution. A side-by-side comparison of both programs helps highlight partsremained that solution. the / only changed s proach. be used to write an iterative n.now A side-by-side comparison of both programs helps highlight parts that A side-by-side compa size. The same recurrence can now be used to write an iterative A side-by-side side-by-side comparison of of both programs helps highlight parts that remained the the same / only remained solution. A side-by-side comparison both programs helps highlight parts that solution. comparison of both programs helps highlight parts that difference between functions thesyntactically. mathematical sense and remained the same /in only changed syntactically. Lesson 5: An iterative solution tothe theprogramprevious problem remained the same / only changed programs helps highlight parts that The in subproblems solved by the memoized s ed the same /Asame only changed syntactically. remained the same /recursive only chan side-by-side comparison ofchanged both programs helps highlight parts that remained the same / only changed syntactically. / only syntactically. remained the output same / of only changed syntactically. ming sense. The a mathematical function onlyordered depends onsize. its inputs: us problem ally. by The same recurrence can now be use e same / only changed syntactically. n 4: Maximum weighted independent onthe a line Optionaland lesson 6: Paths in gr Goals: Learn set about duality thea top-down theofbottom-up apcos(π/3) today is the same value as cos(π/3) tomorrow. For a between function in comOptional lesson 6:aPat solution. A side-by-side comparison both programs he Optional lesson 6: Paths in a grid Optional lesson 6: Paths in a grid nal lesson 6: Paths in a grid proach. own and thethe bottom-up ap-Paths Optional lesson 6: Paths in grid a grid lesson 6: Paths in puter program, two consecutive calls with the same arguments may often Encounter first problem solvable dynamic programming. Learn Optional lesson a remained the samereturn / onlyOptional changed syntactically. Optional lesson 6: 6: Paths in in a using grid Goals: Developing a bottom-up solu The subproblems solved by the memoized recursive solution canDeveloping be naturally Goals: a bot esson Paths in a grid different values. There are lots different reasons why this may happen. For write a6:brute force solution inaabottom-up good way,directly. and how to use memoization Goals: Developing solution directly. Developing aof bottom-up solution directly. Goals: Developing a bottom-up solution Statement: Given is aiterative grid. Count Developing aGoals: bottom-up directly. Goals: aGiven bottom-u ordered bysolution size. The same recurrence can now be used to Developing write an ve solution can be naturally Goals: Developing asolution bottom-up directly. Statement: is a instance, the output of the function may depend on global variables, on environgically” turn it into an efficient algorithm. Goals: Developing a bottom-up solution directly. Statement: isshortest a grid. Count all shortest paths along the6: grid from (0, 0) Given isCount aGiven grid. Count allside-by-side shortest paths along the grid from (0,Paths 0) to (a, b).helps ectly. ement: Given is an a grid. all paths along the grid from (0, 0) Statement: Given isthat a grid. eloping a Statement: bottom-up solution directly. Optional lesson in a grid solution. A comparison of both programs highlight parts usedStatement: to write iterative to (a, b). Given is a grid. Count all shortest paths along the grid from (0, 0) Statement: Given is a grid. Count all shortest paths along the grid from to . ment variables (such as the current locale settings), on pseudorandom numbers, through tement: Given is a sequence of n bottles, their volumes are v Statement: Given is a grid. Count all shortest paths along 0) answer is obviously the bino 0 the grid from (0, (a, b). to Given (a, b). The test paths along the gridthat from (0, 0) ). to (a, b). The answer is obviou nt: isto ab). grid. Count all etc. shortest paths along grid from (0, 0) two �syntactically. �changed a lovely remained the same only ms highlight parts �a+b drink (a, onhelps theto input from a user, (Listing these is/the actually forpath-based a+b exercise a+bfrom Drink as much asThe you can, given that you cannot any to (a, b). � The answer is obviously the binomial coefficient , but the answer is obviously the binomial coefficient , but the path-based point of a is The answer is obviously the binomial coefficient , but the path-based point of view allows natural � Goals: Developing a bottom-up solution directly. answer is obviously the binomial coefficient , but the path-based a+b answer obviously th a a a+b acoefficient �The answer point of view allowsformu a na is obviously the binomial , but the path-basedThe students!) �a+bcoefficient tefficient bottles. a+b The answer is obviously the binomial the path-based a, but point of view allows a natural formulation of a recurrence relation: Let P (x, y) a point of view allows a natural formulation of a recurrence relation: Let P (x, y) view allows a natural formulation of a recurrence relation: Let be the number be the number of ways to reach (x, ya Statement: Given is a grid. Count all shortest paths , but the path-based f view allows a natural formulation of a recurrence relation: Let P (x, y) point of view allows a natural wer is obviously the binomial coefficient , but the path-based a point of view allows a natural formulation of aand recurrence relation: Let P (x, y) be the number of ways to Given the above observation, memoization aEach very straightforward concept a is sfbe problem has a very short and simple statement only requires a point of view allows a natural formulation of a recurrence relation: Let P (x, y) be the number of ways to reach (x, y). path that reaches (a, b) goes either Optional lesson 6: Paths in a grid the number of ways to reach (x, y). Each path that reaches (a, b) goes either through (a − 1, b) or through (a, b − 1 to (a, b). a recurrence Let (x, y)a. recurrence ofrelation: ways to reach Each path that reaches goes either through or(aof− ways number ofthe to reach (x, P y). Each path that reaches (a, b)Preaches goes either be the number tothrou reac w a ways natural formulation of relation: Let (x, y) �a+ through 1, b) or be number of ways to reach (x, y). Each path that (a, b) goes forallows students: we array simply want to avoid computing the same thing twice. itPeither one-dimensional to store the input. But main reason why we the number of − ways to reach y). path reaches (a, b) goes either through (a 1, b)− or through (a, bEach − 1). Hence, PP (a, b) = (a1). −(a, 1,And (a, bthe − 1).binomial through (a −through 1, b) or through (a, b −(x, 1).that Hence, Pthe (a, b)−that = (a − 1, b)P + P bb)−+ 1). Statement 2: Now some grid poin path that reaches (a, b) goes either The answer is obviously coefficient hber (a be − 1, b) or (a, b 1). Hence, P (a, b) = P (a 1, b) + P (a, b − through . Hence, . through (a − 1, b) or through (a of ways to reach (x, y). Each path reaches (a, b) goes either a through (a −first 1, that b) or through (a, b1). − 1). Hence, Ponce (a, b) = Palso (a1, −ab) 1,function b)P+(a, P (a, b1). − 1). Statement 2: Now som should nowas beStatement clear any function in our program that is in to use this the example will become apparent we implement Goals: Developing a bottom-up solution directly. through (a − 1, b) or through (a, b − Hence, P (a, b) = P (a − + b − 2: Now some grid points are blocked by some obstacles. Count allfrom (0, Statement 2: Now some grid points are blocked by some obstacles. Count all paths thatStatement go 0) to (a, b) in e,ement P b) (a, b) = P (a − 1, b) + P (a, b − 1). point of view allows a natural formulation of a recurren 2:through Now some grid points are blocked by some obstacles. Count all 2: Now some gri − 1, or (a, b − 1). Hence, P (a, b) = P (a − 1, b) + P (a, b − 1). paths that go from (0, 0) Statement 2: Now some grid points are blocked by some obstacles. Count all amine aStatement brute force solution for this problem. Statement: Now some grid are blocked by obstacles. Count all paths that Statement: Given is asteps grid. Count all shortest paths along the grid from (0, that 0) 2: Now some grid points are blocked by obstacles. Count paths that go0) from 0) to b) inavoid aobstacles. + ball and avoid allofobstacles. paths that go (0, to (a, b) in a(a, +points bsome steps and avoid allsome obstacles. The recurrence remains besome the number ways toall reach (x, y). locked by some obstacles. Count all hat from (0, from 0) tofrom (a, b) in a(0, + b(a, steps and obstacles. paths that goEach from path (0,the 0) rema tosam (ar nt 2:goNow some grid points are blocked by Count all obstacles. paths that go (0, 0) to b) in a + b steps and avoid all The recurrence goal is to implement a recursive solution that generates all valid sets to (a, b). hs along the grid from (0, 0) go (0,same, steps and avoid all obstacles. paths that gofrom from 0) to to (a, b) in in aonly +the b now steps andnow avoid allgrid obstacles. The recurrence remains the same, only the blocked grid points have The(0, recurrence remains the same, the blocked points Phave (x, y)�= 0.PIt is instructional to solv through (a − 1, b) or through (a, b − 1). Hence, P (a, b) = eps and avoid all obstacles. recurrence remains the only now blocked grid points have The recurrence remains th go from 0) to (a, b) in a + b steps and avoid all obstacles. a+b y)the = 0. It is instruct The recurrence remains the same, now the blocked grid points have , (x, les and chooses the best among them. Such recursive solution can be The answer isaonly obviously the binomial coefficient but path-based The recurrence remains the same, only now the blocked grid points have P0.(x, = 0. It is instructional to solve small instances on this problem on paper a P (x, y) =instructional Ity) is instructional to solve small instances on this problem on paper �a+b the –have insome fact, many of our students are fa Statement 2: Now grid points are blocked by so now blocked grid points have = 0. It is to solve small instances on this problem on paper P (x, y) = 0. It is instructional The recurrence remains the same, only now the blocked grid points . It urrence remains the same, only now the blocked grid points have in fact, many our stu P, (x, y) = It 0. It instructional is instructional to solve small instances on this problem on paper –relation: on– ain either we choose the last bottle or we don’t. Ifof point of view allows athis natural formulation aclasses. recurrence Let Pof (x, y) avo but the path-based Psimple (x, y)–on =observation: 0. is to solve small instances on this problem on paper in fact, many of our students are familiar with this problem from earlier Math a fact, many ofproblem our students are familiar with problem from earlier Math paths that go from (0, 0) to (a, b) in a + b steps and this on paper t,instances many of our students are familiar with this problem from earlier Math – in fact, many of our students It is instructional to solve small instances on this problem on paper is instructional to solve small instances on this problem on paper – in fact, many of our classes. –want in fact, many our students are familiar with this from earlier Math ’t, we tomany find the best solution from among theto first nproblem − bottles. beMath theare number of ways reach (x,1y). Each path that reaches (a, b) goesnow either rrence relation: Let Pof (x, y) –this in fact, of our students familiar with this problem from earlier Math classes. classes. with from earlier The recurrence remains the same, only the bl classes. any ofare ourproblem students are familiar with this problem from earlier Math students are familiar with this problem earlier Math classes. classes. o, not allowed to take the penultimate bottle and therefore we are through (a − 1, b) or through (a, b − 1). Hence, P (a, b) = P (a − 1, b) + P (a, b − 1). o atwe reaches (a, b) goes either classes. P (x, y) = 0. It is instructional to solve small instances Lesson 7: Longest subs best first n − 22:bottles. )for = Pthe (a − 1, b)solution + P (a, bamong − 1). theStatement Now some –grid points are of blocked by some obstacles. Count all Lesson 7:common Longest com in fact, many our students are familiar with this pro Lesson 7: Longest common subsequence Lesson 7: Longest common subsequence Lesson 7: Longest common subsequence 7: Longest common subsequence paths that go from (0, 0) to (a, b) in a + b steps and avoid all obstacles. by some obstacles. Count allcommon subsequence Lesson 7: Longest common Lesson 7: Longest classes. Lesson 7: Longest common subsequence Goals: Investigating the differences 3, 1, all 4, obstacles. 1, 5, 9, 2, 6 ] The recurrence remains the same, only now the blocked ce avoid grid points have the Goals: Longest common subsequence Goals: Investigating differences between the top-down andthe thebottom-up bottom-up ap-Investigating Goals: Investigating thethe differences between the top-down and Goals: Investigating the differences the top-down the bottom-up approach. Investigating the differences between theIttop-down andtop-down the bottom-up Goals: Investigating the diffe Pdifferences (x, y) =between 0. is instructional to and solve small instances on this problem on paper e blocked grid points have approach. Goals: Investigating the differences between the and the bottom-up Goals: Investigating the between the top-down and the bottom-up proach. approach. approach. nes the top-down and the bottom-up ch. lve(k): approach. estigating the differences between the top-down and the bottom-up Lesson 7: Longest common subsequence – in fact, many of our students are familiar with this problem from earlier Math on this problem on paper approach. approach. For this problem, we first show the entire process. First, we show a recursive soluclasses. problem from earlier Math Goals: between the top-do tion that generates all common subsequences, startingInvestigating by comparingthe thedifferences last elements approach. of both sequences. Then, we show that adding memoization improves this solution from Lesson 7: Longest common subsequence Goals: Investigating the differences between the top-down and the bottom-up p-down and the bottom-up approach. Goals: See the stunning effect memoization can have. tion. Therefore, we suddenly have a program that By onlyapplying performsmemoization Θ(n) additions (using a simple array) to the to compute the value F (n): quite an improvement over the original each of the values F (1)exponential through F (n) is only computed once, time. The above Python program can now easily compute F (1000). Therefore, we suddenly have a program that only perfo this problem, we first show the entire process. First,tion. we show a recursive (Note that Python operates with arbitrarily large integers. The n-th FiForišek the 54 common subsequences, starting by to M. compute value n that generates all comparing the last F (n): quite an improvement over the o bonacci number is exponential in n and therefore has Θ(n) digits. In the RAM can now easily compute F time. Theshow above Python program ts of both sequences. Then, we show memoization improves Forthat thisadding problem, we first the entire process. First, we show a recursive 2 model, the aactual time exponential complexityone of into the aabove algorithm is O(n ), as each with arbitrarily large inte worst-case solution that runs in . 2 (Note that Python ution from a worst-case exponential one into solution that runs in O(n ). operates solution thatagenerates all common subsequences, starting by comparing the last addition of twoFinally, O(n)-digit numbers O(n) steps.) wean convert thetakes solution into an equivalent iterative one. in n and therefore has Θ(n) bonacci number is show exponential , we convert the solution into equivalent iterative one. elements of both sequences. Then, we that adding memoization improves Here it is important to we highlight the contrast: exponential time without vs. Afterwards, focus on the following points: model, theexponential actual time of the above algorithm erwards, we focus on the following thispoints: solution from a worst-case onecomplexity into a solution that runs in O(n2 ). polynomial time with memoization. It is also instructional to draw a new, coladdition of two O(n)-digit numbers takes O(n) steps.) ● ● Memory complexity. The iterative solution can easily be optimized to use Finally, we convert the solution into an equivalent iterative one. ntire process. First,The wethe show a recursive version of entire recursion treeeasily for n be = Here 6.optimized morylapsed complexity. iterative solution can to use to highlight the contrast: exponenti it isfollowing important memory only, the onefocus cannot. Afterwards, on the points: equences, starting by comparing thecannot. lastrecursivewe n) memory only, the●recursive one polynomial timebetter with as memoization. It is also ● Execution time. So far, iterative solutions were they didn’t perform theinstructional t how that adding memoization improves ecution time. 4: So Maximum far, iterative weighted solutions were better as they didn’t perform – Memory complexity. The iterative solution can easily be optimized lapsed version of the entire recursion tree for n = 6. to use Lesson independent set on a line 2 additional related to function calls. However, in this problem we can easily aladditional one into awork solution that runs inwork O(n calls. ). related to function However, inthe thisrecursive problemone we cannot. O(n) memory only, find inputs (e.g., two identical sequences) where the recursive solution outperGoals: Encounter thetwo first identical problem solvable using dynamic programming. Learn nequivalent easily finditerative inputs one. (e.g., sequences) where the recursive – Execution time. So far, iterative solutions were better as they didn’t perform forms the solution iterative one. lesson here is that theuse top-down approachindependent only evalu- set on a g points: Lesson 4: Maximum weighted to write athe brute force in a The good way, and how to memoization utionhow outperforms iterative one.the Theadditional lesson here is that the top-down work related to function calls. However, in this problem we ates the subproblems it actually needs, while the iterative approach doesn’t know toonly “magically” turn it into an efficient algorithm. proach the subproblems it actually needs, while the iterative can easily find inputs (e.g., two identical sequences) where the recursive solution canevaluates easily be optimized to use Goals: Encounter the first problem solvable which subproblems will be needed later and thus it must always evaluate allusing of dynamic p Statement: a sequence of outperforms nbe bottles, their volumes are vThe proach doesn’t knowGiven which issubproblems will needed later anda thus itforce 0 through solution thewrite iterative one. lesson here isgood thatway, the top-down e cannot. how to brute solution in a and how them. Drink as asall much as you can, givenonly thatevaluates you cannot drink any vwere st always of them. n−1 .evaluate approach the subproblems it actually needs, while the iterative tions better they didn’t perform to “magically” turnfrom it into antwo efficient algorithm. adjacent bottles.in this problemapproach doesn’t know which subproblems will be needed later and thus it tion calls. However, we Statement: Given is a sequence of n bottles, their volum This problem has a very short and simple statement and only requires a given that you cannot d must always evaluate all of them. Lesson 8: Longest increasing subsequence in quadratic time entical sequences) where the recursive . Drink as much as you can, v n−1 time n 8: Longest increasing subsequence inthe quadratic simple one-dimensional to store input. But the main reason why we ne. The lesson here is that thearray top-down adjacent Goals: Examining the first example where a bottles. subproblem evaluated in constant elected to use this as the first example will become apparent once weisn’t implement Examining theneeds, first example a subproblem isn’tThis evaluated in conems it actually while thewhere iterative problem has a veryinshort and simple Lesson 8: Longest increasing subsequence quadratic timestatement a time. Understanding howfor thisthis is reected in the time complexity estimates. Most imporandwill examine a how brute force solution problem. me. Understanding this is reflected the time complexity estimates. array to store the input. But the m oblems be needed later and thus it in simple one-dimensional that the don’t have togenerates be instances of the original problem. Ourseeing goaltantly, is to seeing implement a subproblems recursive solution that sets mportantly, that the subproblems don’t have to be instances ofwhere the Goals: Examining the first example a first subproblem in conelected to use this asall thevalid exampleisn’t will evaluated become apparent of bottles and chooses the best among them. Such a recursive solution can be l problem. Statement: Given a sequence of numbers, compute length of in itsthe longest stant time. Understanding how this is reflected time complexity estimates. and examine athe brute force solution forincreasing this problem. based on aa simple observation: we choose the last bottle we don’t. tement: Given sequence of numbers, compute the length of the its longest Most either importantly, seeing that don’t have to be instances of gene the Our goal issubproblems to or implement aIfrecursive solution that subsequence. quence in quadratic time we don’t, we want to find the best solution from among the first n − 1 bottles. ing subsequence. original problem. of bottles and chooses the best among them. Such a recurs the mathematical sense be memoized. functions are sometimes called Innot ourcan opinion, problem is one of the most important ones in are teaching dynamic If we do,this we problem are allowed tothis take the(Such penultimate bottle and therefore we our is one in of the most important ones teaching Statement: Given a sequence of numbers, compute thewe length of its based on ainsimple observation: either choose thelongest last bo ere aopinion, subproblem isn’t evaluated conpure functions.) programming properly. When compared to previous problems, this one is much harder looking for the best solution among the first n − 2 bottles. ic programming properly. When compared to previous problems, this one increasing subsequence. we don’t, we want to find the best solution from among the flected the time complexity estimates. Anininteresting note: memoization is notThe only useful when it comes forhistorical beginners. Here’s the main reason: subproblems weof need tomost solve aren’t actually hoblems harderdon’t for beginners. Here’s the main reason: The subproblems we need In our opinion, this problem is one the important ones in teaching If we do, we are not allowed to take the penultimate bottle a have to of the to improving the asymptotic complexity. For instance, many early programs v = [ 3, 1, 4,be1,instances 5, time 9,original 2, 6 ] instances of e aren’t actually instances of the thedynamic originalproblem. problem. looking programming properly. When compared to previous problems, this one for the best solution among the first n − 2 bottles. that produced computer graphicsisused precomputed tables of sines and cosinesDasgupta just reThis problem used by Dasgupta and Skiena. However, s problem is used Dasgupta and Skiena. However, Dasgupta just much harder for beginners. Here’s the main reason: The subproblems we need mbers, compute theby length itsislongest because table lookup was of faster than the actual evaluation of a floating-pointdef solve(k): v the =instances [conceptual 3, 1, of4,the 1,the 5, 9, problem. 2, 6step ] where we to pathsto insolve a DAG without explicitly mentioning conceptual s it to paths in a duces DAG itwithout explicitly mentioning step aren’t actually original valued function. change the problem. Skiena treats the problem properly: notably, asking the question we change the problem. Skiena treats the problem properly: notably, This problem is used by Dasgupta and Skiena. However, Dasgupta just of the most important ones in teaching def solve(k): the question “what information about the first n − 1 elements of [the “what information about the first elements of [the sequence] would help you reduces it to paths in a DAG without explicitly mentioning the conceptual step ompared to previous problems, this one Lesson 3: Fibonacci numbers revisited ce] would help you find the solution for the entire sequence?” (Still, note find the solution for the entire sequence?” (Still, note that this question is factually where we change the problem. Skiena treats the problem properly: notably, main reason: The subproblems we need is question is factually incorrect: you are, in fact, supposed to look for asking the question “what information about the first n − 1 elements of [the incorrect: you are, in fact, supposed to look for information about the first eleoriginal Goals: problem. See the stunning effect memoization can have. ation about the first n − 1 elements that would help you find the same sequence] would help you find the solution for the entire sequence?” (Still, note ments that would help you find the same information for all elements.) andBy Skiena. However, Dasgupta applying memoization (usingjust a simple array) to the Fibonacci function, ation for all n elements.) that this question is factually incorrect: you are, in fact, supposed to look for actually the redefinition the problem. xplicitly mentioning the conceptual step each of the values FWe (1) suggest through F (n) isemphasizing only computed once, usingof a single addi- First, we illustrate suggest actually emphasizing the redefinition of the problem. First, we information about the first n − 1 elements that would help you find the same on examplehave that length longest Θ(n) increasing subsequence in the first treats the problem properly: notably, tion. Therefore, we an suddenly aknowing programthe that onlyof performs additions te on an that knowinginformation the length of longest increasing subseall n the elements.) about theexample first n− 1 Felements [the elements isof useless forfor solving problem for the first elements. Only to compute the value (n): quite an improvement over same the original exponential in the firstentire n − 1 sequence?” elements isthe useless forsuggest solving the same problem forathe the We actually of useful. the problem. First, we on for The the (Still, note we ask question howeasily to modify theemphasizing problem in wayredefinition that would be And time. abovethen Python program can now compute F (1000). elements. Only then we ask the question how to modify the problem in illustrate on an example that knowing the length of longest increasing subse: you(Note are, in fact, to look for thesupposed question is easily answered by using approach: wen-th can easily that Python operates with arbitrarily largeour integers. The Fi- write a recursive that would be useful. And the question is easily answered by using our quence in the first n − 1 elements is useless for solving the same problem for the nts that number would help youthat findgenerates the solution all therefore increasinghas subsequences by In choosing where to end and then bonacci is exponential in same n and Θ(n) digits. the RAM ch: we can easily write a recursive that Only generates first solution n elements. then all we increasing ask the question how to modify the problem in model, the actual time complexity of the above algorithm is O(n2 ), one as each going backwards. Converting this program into an requires just the mechauences by choosing where to end and then going backwards. Converting a way that would be useful. And the question is easily answered by using our eaddition redefinition of O(n)-digit the2problem. First,memoization. we O(n) steps.) of two numbers takes nical adding )step oneofrequires just the mechanical step aofrecursive adding solution that generates all increasing ogram into of anlongest O(n approach: we can easily write the Here length increasing subseit is important to highlight the contrast: exponential time without vs. zation. subsequences by choosing where to end and then going backwards. Converting ess for solving the with samememoization. problem for the polynomial time It is also instructional 2 to draw a new, colOptional follow-up lessons ) one requires just the mechanical step of adding this program into an O(n question how toofmodify the recursion problem in lapsed version the entire tree for n = 6. memoization. uestion is easily answered by using our After the above sequence of problems and expositions the students should have a decent ve solution that generates all increasing grasp of weighted the basic techniques and they be ready to start applying them to new Lesson Maximum independent set should on a line and then4:going backwards. Converting res just Encounter the mechanical step of adding Goals: the first problem solvable using dynamic programming. Learn how to write a brute force solution in a good way, and how to use memoization to “magically” turn it into an efficient algorithm. Statement: Given is a sequence of n bottles, their volumes are v through Towards a Better Way to Teach Dynamic Programming 55 problems. Still, there are more faces to the area of dynamic programming. Here, we suggest some possibilities for additional content of lectures: ●● Problems where the state is a substring of the input: Edit distance to a palindrome. Optimal BST. ●● Problems solvable in pseudopolynomial time: Subset sum, knapsack, coin change. ●● Optimalizations of the way how the recurrence is evaluated: Hirschberg’s trick for LCS in linear memory; Knuth optimization for Optimal BST; improving Longest increasing subsequence to ( log ) by using a balanced tree to store solutions to subproblems. 5. Conclusion Above, we have presented one possible way in which dynamic programming can be introduced to students. Our opinion is that the main improvement we bring is the systematic decomposition of the learning process into smaller, clearly defined conceptual steps. References Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. (2001). Introduction to Algorithms. MIT Press, 2nd edition. Dasgupta, S., Papadimitriou, C.H., Vazirani, U.V. (2006). Algorithms. McGraw-Hill. Hirschberg, D. (1975). A linear space algorithm for computing maximal common subsequences. Communications of the ACM, 18(6), 341–343. Hu, T.C., Shing, M.T.(1982). Computation of Matrix Chain Products (part 1). SIAM Journal on Computing, 11(2), 362–373. Hu, T.C., Shing, M.T. (1984). Computation of Matrix Chain Products (part 2). SIAM Journal on Computing, 13(2), 228–251. Kleinberg, J., Tardos, É. (2006). Algorithm Design. Addison-Wesley. Sedgewick, R. (1998). Algorithms in C++. Addison-Wesley, 3rd edition. Sedgewick, R., Wayne, K. (2011). Algorithms. Addison-Wesley Professional, 4rd edition. Skiena, S.S. (2008). The Algorithm Design Manual. Springer-Verlag, 2nd edition. M. Forišek is an assistant professor at the Comenius University in Slovakia. Since 1999 he has been involved in organizing international programming competitions, including the IOI, CEOI, and ACM ICPC. He is also the head organizer of the Internet Problem Solving Contest (IPSC). His research interests include theoretical computer science (hard problems, computability, complexity) and computer science education.
© Copyright 2026 Paperzz