Fall 2015-2016 Compiler Principles
Exercise Set 1: Lexical Analysis
Roman Manevich
Ben-Gurion University

Exercise 1
• Assume the following lexical specification:
    R = (1-9)(0-9)*.(0-9)*(1-9) | 0.(0-9)*(1-9)
    D = (0-9)(0-9).(0-9)(0-9).(0-9)(0-9)(0-9)(0-9)
• Construct the scanner automaton and label the accepting states by the appropriate tokens
• Run it on the following inputs using the maximal munch algorithm:
    1.23.2
    1.230.2
    01.11.2015

Exercise 2 (2015 midterm)

Exercise 3
a) Give an example of a lexical specification R1,…, Rk and a word in the language (R1 | … | Rk)* for which the maximal munch policy does not yield “Success”
b) [harder] Can you think of an algorithm that satisfies both properties (successfully splits any word in the language of (R1 | … | Rk)* and produces the longest tokens)?

Exercise 4
• Which of the following conditions yield worst-case linear running time for the naïve maximal munch algorithm?
    1. All tokens have constant length
    2. The languages of different tokens do not intersect
    3. The languages of different tokens do not intersect
    4. Each token starts with a unique letter and ends with a unique letter
    5. The language of each token is prefix-closed (i.e., if w is accepted then each prefix of w is also accepted)
• Find conditions as weak as you can for which the worst-case running time of the naïve maximal munch algorithm is linear

Exercise 5 – research challenge
• The worst-case running time of Tom Reps’s algorithm is O(nk), where n is the length of the input and k is the number of automaton states
• Can you find an algorithm that reduces the factor k to a constant?

Exercise 6 – research challenge
• Develop an asymptotically efficient parallel algorithm for maximal munch-based scanning
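
For reference, the naïve maximal munch algorithm that Exercises 1, 3 and 4 refer to can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: the DFA encoding (`delta`, `accepting`), the function name `maximal_munch`, and the toy ID/NUM specification are all hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Optional, Tuple

# Assumed DFA encoding: (state, character) -> next state.
# A missing entry means the automaton is stuck.
Transitions = Dict[Tuple[int, str], int]


def maximal_munch(text: str,
                  delta: Transitions,
                  start: int,
                  accepting: Dict[int, str]) -> Optional[List[Tuple[str, str]]]:
    """Split text into (token, lexeme) pairs; return None if scanning fails."""
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(text):
        state = start
        last_end = -1        # end index of the longest accepted prefix so far
        last_token = None    # token label of that prefix
        i = pos
        # Run the automaton from pos until it gets stuck or the input ends.
        while i < len(text) and (state, text[i]) in delta:
            state = delta[(state, text[i])]
            i += 1
            if state in accepting:
                last_end, last_token = i, accepting[state]
        if last_token is None:
            return None      # no non-empty prefix at pos is a token
        tokens.append((last_token, text[pos:last_end]))
        pos = last_end       # back up to the end of the longest match
    return tokens


# Hypothetical toy specification: ID = (a|b)+, NUM = (0|1)+.
# State 0 is the start state, state 1 accepts ID, state 2 accepts NUM.
TOY_DELTA: Transitions = {}
for ch in "ab":
    TOY_DELTA[(0, ch)] = 1
    TOY_DELTA[(1, ch)] = 1
for ch in "01":
    TOY_DELTA[(0, ch)] = 2
    TOY_DELTA[(2, ch)] = 2
TOY_ACCEPTING = {1: "ID", 2: "NUM"}

print(maximal_munch("ab01b", TOY_DELTA, 0, TOY_ACCEPTING))
# [('ID', 'ab'), ('NUM', '01'), ('ID', 'b')]
print(maximal_munch("ab?", TOY_DELTA, 0, TOY_ACCEPTING))
# None
```

Note that the inner loop may read characters beyond `last_end` before the automaton gets stuck; those characters are read again in the next iteration of the outer loop. That rescanning is what the running-time questions in Exercises 4 and 5 are concerned with.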