The First Phases of the Compiler: Automata and Grammars
Complexity
Henrik Björklund
Umeå University
August 19, 2014

Moore's law
The famous claim known as Moore's law states that the capacity of computer hardware doubles every 1.5 years.
If Moore's law holds true, in 20 years' time, computers will be about 3300 times faster than today.
Suppose that we have an algorithm that runs in time 2^n, where n is the size of its input. How much larger instances of this problem will we be able to run in 20 years?

Complexity theory
Complexity theory is the study of how hard computational problems are.
To be able to achieve this, we have to reason abstractly, mostly disregarding such things as the setup of individual computers, programming languages, etc.
What we are really interested in are the inherent features of computational problems.
We are particularly interested in the growth rate of functions and are mostly satisfied with getting the order of magnitude right.

Asymptotic notation
Let f : N → N and g : N → N be two functions.
- If there is a positive constant c and a number n_0 such that f(n) ≤ c · g(n) for all n > n_0, we say that f has order at most g and write f(n) = O(g(n)).
- If there is a positive constant c and a number n_0 such that f(n) ≥ c · g(n) for all n > n_0, we say that f has order at least g and write f(n) = Ω(g(n)).
- If there are positive constants c_1 and c_2 and a number n_0 such that c_1 · g(n) ≤ f(n) ≤ c_2 · g(n) for all n > n_0, we say that f has order g and write f(n) = Θ(g(n)).
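As a rough back-of-the-envelope answer to the question above (using the slide's figure of about 3300 and ordinary logarithm arithmetic; the numbers below are not part of the lecture), a speedup by a factor c lets a 2^n-time algorithm handle inputs that are only about log2(c) symbols larger:

    import math

    # Assumption taken from the slide: hardware becomes about 3300 times faster.
    speedup = 3300

    # For a 2**n-time algorithm, an instance of size n + delta on the faster
    # machine costs 2**(n + delta) / speedup, which matches today's cost 2**n
    # exactly when 2**delta == speedup, i.e. delta == log2(speedup).
    delta = math.log2(speedup)
    print(f"A 2**n algorithm can handle inputs about {delta:.1f} symbols larger.")

    # For comparison, an n**2 algorithm could instead handle instances that are
    # sqrt(speedup) times larger, i.e. roughly 57 times larger.
    print(f"An n**2 algorithm could handle {math.sqrt(speedup):.0f}x larger inputs.")

So even dramatic hardware gains barely move the manageable input size for an exponential-time algorithm, which is the point the slide is driving at.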
Efficiency of computations
- Efficiency is measured by resource requirements, such as time and space. We therefore talk about time complexity and space complexity.
- Time complexity is a rough measure of how long it takes to solve a particular problem on a computer.
- How long does it take to sort a list of integers?
- It depends on the number of items, the computer used, the algorithm used, how the algorithm is written, etc.

Efficiency of computations
In order to get anything resembling a general picture, we need to focus on the fundamentals:
- The size of the problem instance will be denoted by n.
- In analyzing an algorithm, we study how the resource requirements of the algorithm grow with n.
Goal (time complexity): Characterize the time required to solve instances of a problem as a function of their sizes.

Computational models and complexity
Unlike the decidability of a problem, its time complexity can be affected by the choice of computational model, in particular whether the model is deterministic or nondeterministic and how it accesses its storage.
Example: Consider the language {a^n b^n | n ∈ N}. A straightforward single-tape Turing machine for this problem needs Ω(n^2) moves. A modern computer, running a program written in some programming language, on the other hand, needs only O(n) moves (a sketch appears a few slides below).

Michael O. Rabin
Michael O. Rabin (1931–) is an Israeli computer scientist, born in Poland. He was a student of Alonzo Church at Princeton and has over the years made a great number of important contributions to computer science.

Dana Scott
Dana Scott (1932–) is an American computer scientist and mathematician. He studied first under Alfred Tarski at Berkeley and then under Alonzo Church at Princeton.

Introducing nondeterminism
Together, Rabin and Scott wrote the paper Finite Automata and Their Decision Problems, IBM Journal of Research and Development 3(2): 114–125, 1959.
The paper introduces the concept of nondeterministic finite automata.
In 1976, Rabin and Scott received the Turing Award for this work.

Two possible interpretations
We can interpret what a nondeterministic algorithm does when it performs a computation in two different ways:
- At each point in the computation, the algorithm "guesses" the best next action to take among those allowed.
- The algorithm explores all possible computations in parallel. At each point in the computation, the algorithm is in a set of different configurations.

What nondeterminism is not
- Nondeterminism is not probabilistic computation.
- Nondeterminism is not quantum computation.
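As a minimal sketch of the O(n) claim in the computational-models example above (written in Python purely for illustration; the lecture does not fix a programming language), a random-access program can decide {a^n b^n | n ∈ N} with a single left-to-right scan and two counters:

    def is_anbn(w: str) -> bool:
        # Count the leading a's, then the trailing b's, and accept exactly
        # when the word is a block of a's followed by an equally long block
        # of b's. Each character is inspected once, so the time is O(n).
        i = 0
        while i < len(w) and w[i] == "a":
            i += 1
        j = i
        while j < len(w) and w[j] == "b":
            j += 1
        return j == len(w) and i == len(w) - i

    assert is_anbn("") and is_anbn("aaabbb") and not is_anbn("aabbb")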
Nondeterminism
Example: Consider the CFG membership problem, i.e., we are given a context-free grammar G and a word w and are asked to decide whether w ∈ L(G).
A deterministic algorithm, such as CYK, needs Ω(n^3) moves.
A nondeterministic algorithm can guess which rules are used in the shortest derivation of w in G. Since the length of the shortest derivation is linear in |w|, such an algorithm needs O(n) steps.

Propositional formulas
We consider the following definition of propositional logic:
- A Boolean variable is one that can take exactly two values, true and false or, equivalently, 1 or 0.
- A Boolean constant is either 1 or 0.
- Boolean operators are used to construct Boolean expressions or propositional formulas from variables and constants. We consider only the Boolean operators
  - ∨ (or)
  - ∧ (and)
  - ¬ (negation)

Conjunctive normal form
A literal is a variable x or a negation of a variable ¬x.
A clause is a disjunction of a number of literals, e.g., (x_1 ∨ ¬x_2 ∨ x_3 ∨ ¬x_4).
A formula in conjunctive normal form (CNF) is a conjunction of a number of clauses. If C_1, C_2, ..., C_m are clauses, then C_1 ∧ C_2 ∧ ··· ∧ C_m is a CNF formula.

The SAT problem
The SAT problem is the satisfiability problem for CNF formulas, i.e., we are given a CNF formula φ and are asked whether it is satisfiable, i.e., whether there exists an assignment of Boolean values to all variables in φ that makes φ true.

SAT and computational models
The following deterministic algorithm for SAT is trivial:
- Let φ be the input formula and x_1, ..., x_n the variables that appear in φ.
- For every possible assignment of Boolean values to x_1, ..., x_n, check if it makes φ true.
The algorithm has exponential time complexity, since there are 2^n possible variable assignments.
There is an even simpler nondeterministic algorithm:
- If φ is satisfiable, guess an assignment and verify that it satisfies φ.
This algorithm is "essentially" O(n).

Complexity classes
Definition. A language L is a member of the class DTIME(T(n)) if there is a deterministic multitape Turing machine that decides L in time O(T(n)).
Definition. A language L is a member of the class NTIME(T(n)) if there is a nondeterministic multitape Turing machine that decides L in time O(T(n)).
Example relation: For every integer k ≥ 1, DTIME(n^k) ⊂ DTIME(n^(k+1)).
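To make the two SAT algorithms from the slides above concrete, here is a minimal Python sketch; the list-of-integer-lists clause encoding is an assumption made for this example, not something from the lecture. The brute-force search is the trivial deterministic 2^n algorithm, and the helper satisfies() is exactly the polynomial-time verification step that a nondeterministic algorithm would run on a guessed assignment:

    from itertools import product

    # A CNF formula as a list of clauses; each clause is a list of literals,
    # where literal i > 0 means x_i and literal -i means ¬x_i (DIMACS style).
    # Hypothetical example: (x1 ∨ ¬x2) ∧ (x2 ∨ x3) ∧ (¬x1 ∨ ¬x3)
    phi = [[1, -2], [2, 3], [-1, -3]]

    def satisfies(assignment: dict, clauses) -> bool:
        # Polynomial-time verification: every clause must contain a true literal.
        return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
                   for clause in clauses)

    def brute_force_sat(clauses):
        # The trivial deterministic algorithm: try all 2**n assignments.
        variables = sorted({abs(lit) for clause in clauses for lit in clause})
        for values in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, values))
            if satisfies(assignment, clauses):
                return assignment      # satisfying assignment found
        return None                    # unsatisfiable

    print(brute_force_sat(phi))        # e.g. {1: False, 2: False, 3: True}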
The class P
The difference between DTIME(n^k) and DTIME(n^(k+1)) at least partly depends on the details of the computational model.
This leads us to disregard this difference and introduce the "rock star" of all complexity classes:
P = ⋃_{i≥1} DTIME(n^i)
This class includes all languages that have a deterministic Turing machine that decides them in polynomial time, no matter what the degree of the polynomial is.

The class NP
Symmetrically, we can define the class of all languages that have a nondeterministic Turing machine that decides them in polynomial time, no matter what the degree of the polynomial is:
NP = ⋃_{i≥1} NTIME(n^i)

P and tractability
The interest in the class P stems largely from the following facts:
- It is robust in the sense that whatever reasonable deterministic model of computation we use to define P, we end up with the same class.
- P is often viewed as more or less synonymous with tractability.
- This is because most natural problems that have been proved to belong to P actually belong to the "lower levels" of P, i.e., DTIME(n), DTIME(n^2), DTIME(n^3).

The million dollar question
It is obvious that P ⊆ NP, but it is not known whether the inclusion is proper or, in other words, whether P = NP.
This is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute.

P vs NP
To prove that P = NP, one has to be able to argue that every problem in NP can be solved by some deterministic program in polynomial time.
To prove that P ≠ NP, one has to be able to argue that there is some problem in NP for which there is no deterministic program that solves it in polynomial time.

A closer look at non-determinism
A nondeterministic algorithm decides a language L in time T(n) if for every word w, there is some computational path that halts and accepts if and only if w ∈ L and, furthermore, the shortest such path is at most T(|w|) steps long.
A polynomial time nondeterministic algorithm can be seen as operating in two steps:
1. Nondeterministically "guess" a solution (certificate).
2. Check that the solution really is a solution (verification).

Clique

Polynomial time reductions
Definition. A language L_1 over Σ is polynomial time reducible to a language L_2 over Γ if there exists a deterministic algorithm M that computes a function f : Σ^* → Γ^* such that
- M runs in polynomial time and
- w ∈ L_1 ⇔ f(w) ∈ L_2.
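The slides mention the Clique problem by name; as a hedged illustration using its standard formulation (given a graph G and a number k, does G contain k pairwise adjacent vertices?), here is a minimal Python sketch of the verification half of the guess-and-verify view of NP. The adjacency-set graph encoding and the function name verify_clique are assumptions made for this example:

    from itertools import combinations

    def verify_clique(adj: dict, k: int, certificate: set) -> bool:
        # Deterministic polynomial-time verification: given a guessed vertex
        # set (the certificate), check that it has size k and that every pair
        # of its vertices is joined by an edge.
        return (len(certificate) == k and
                all(v in adj[u] for u, v in combinations(certificate, 2)))

    # Hypothetical example graph as a symmetric adjacency-set dictionary.
    graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    print(verify_clique(graph, 3, {1, 2, 3}))   # True: a triangle
    print(verify_clique(graph, 3, {2, 3, 4}))   # False: 2 and 4 are not adjacent

The guessing step has no deterministic counterpart here; a deterministic algorithm would have to search over candidate sets, while the verifier alone runs in polynomial time.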
Conclusions from reductions
Just as with general reductions, we can draw two types of conclusions from a polynomial time reduction, depending on what we already know about the languages involved.
Assume that L_1 is polynomial time reducible to L_2.
1. If L_2 belongs to P, then so does L_1.
2. If L_1 does not belong to P, then neither does L_2.

NP-completeness
Definition. A language L is NP-complete if
1. L ∈ NP (membership) and
2. every language L' in NP is polynomial time reducible to L (hardness).

Cook's theorem
Theorem. SAT is NP-complete.
Corollary. 3-SAT is NP-complete.
In general, if we can reduce a known NP-complete problem to a problem L in polynomial time, then L is NP-hard. If L also belongs to NP, then L is NP-complete.

Clique
Vertex Cover
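Since the deck closes with the Clique and Vertex Cover problems, here is a minimal sketch of the classical polynomial-time reduction between them, under their standard formulations (an assumption here, as the slides only name the problems): G has a clique of size k if and only if the complement graph of G has a vertex cover of size |V| − k.

    def clique_to_vertex_cover(vertices: set, edges: set, k: int):
        # Classical polynomial-time reduction: (G, k) has a clique of size k
        # iff the complement graph of G has a vertex cover of size |V| - k.
        complement_edges = {(u, v) for u in vertices for v in vertices
                            if u < v and (u, v) not in edges and (v, u) not in edges}
        return vertices, complement_edges, len(vertices) - k

    # Hypothetical instance: a triangle on {1, 2, 3} plus an isolated vertex 4.
    V = {1, 2, 3, 4}
    E = {(1, 2), (1, 3), (2, 3)}
    print(clique_to_vertex_cover(V, E, 3))
    # Complement edges: {(1, 4), (2, 4), (3, 4)}; the set {4} is a vertex cover
    # of size 1 = |V| - 3, matching the clique {1, 2, 3} of size 3.

Combined with Cook's theorem and a reduction from 3-SAT to Clique, this is the standard route for showing that Vertex Cover is NP-complete.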