Overview
Compiler
Baojian Hua
[email protected]

Compiler
- Compilers are the most fundamental developer tool
- Long history of study in CS
- Almost all of the key ideas in compiler design are also important in other problem domains
- You will end up using compiler design principles in almost every software development project

Compilers are Fundamental
- Compilers are fundamental elements of computer systems
- Every new machine architecture defines standard calling conventions and comes with a compiler
- Chip performance is measured by using a compiler
- Even embedded processors are now so complex that compilers are necessary

Still learning about compilers
- Compiler design is still an active area of research
- In recent years, there has been a tremendous amount of activity in various forms of security and safety-related compiling
- We will do some reading of recent research papers, in addition to working on our own compiler projects

What is a Compiler?
- A compiler translates source programs into target programs
- In a high-level source language, programs specify both static (compile-time) and dynamic (run-time) computations
- Usually, the target language specifies only dynamic computations

Static and dynamic
[Diagram: source program → compiler (static computations) → target program → machine (dynamic computations) → results]

Compiler Structure
- Compilers are structured in a highly modularized fashion
  - promotes better correctness, maintenance, etc.
  - permits easier retargeting for new target architectures
  - naturally follows the static/dynamic staging of high-level source programs

Compiler structure
[Diagram: string of characters → lexical analyzer → sequence of tokens → parser → abstract syntax tree → semantic analyzer → intermediate code → code generator → relocatable object code; a symbol table is also shown]

The UNCOL argument
[Diagram: each source language (SML, Java, C, C++, C#) compiled directly to each target (x86, Sparc, MIPS, PPC, ARM)]
- n×m compilers!
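To make the n×m count concrete, here is a minimal sketch that counts the translators needed with and without a shared intermediate representation. It uses the five source languages and five targets listed on the slide; the function names and the "IR" label are illustrative assumptions, not part of the course material.

```python
from itertools import product

# Source languages and target machines as listed on the slide.
SOURCES = ["SML", "Java", "C", "C++", "C#"]
TARGETS = ["x86", "Sparc", "MIPS", "PPC", "ARM"]


def direct_compilers(sources, targets):
    """One dedicated compiler for every (source, target) pair."""
    return [f"{s}->{t}" for s, t in product(sources, targets)]


def ir_compilers(sources, targets, ir="IR"):
    """One front end per source language plus one back end per target.

    The name "IR" is a hypothetical label for the shared intermediate form.
    """
    front_ends = [f"{s}->{ir}" for s in sources]
    back_ends = [f"{ir}->{t}" for t in targets]
    return front_ends + back_ends


if __name__ == "__main__":
    # n * m translators when every pair is handled directly
    print(len(direct_compilers(SOURCES, TARGETS)))
    # n + m translators when everything goes through one IR
    print(len(ir_compilers(SOURCES, TARGETS)))
```

With n = m = 5 the direct approach needs 25 translators, while a shared intermediate form needs 10; the gap widens as either list grows.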
The UNCOL argument
[Diagram: source languages (SML, Java, C, OCaml, C#) compiled to a shared IR, which is compiled to each target (x86, Sparc, MIPS, PPC, ARM)]
- vs. n+m compilers

Compiler phases, more detailed…
[Diagram: Source program → lex → Tokens → parse → Reductions → parsing actions → Abstract syntax → semantic analysis → Translate → translate → IR trees → canonicalize → IR trees → instruction selection → Assem → control flow analysis → Flow Graph → register allocation → Register assignment → code emission → Assembly code → assembler → Relocatable code → linker → Machine code]

Optimizations
- The most important function of a compiler is code optimization
- For modern architectures, register allocation is the most important optimization
- But there are many, many other optimizations as well
- A serious design challenge is how to order the optimizations

Declarative Specifications
- Another interesting aspect of compilers is the level of automation
- Some phases are automatically generated from declarative specifications
- Most of this is derived directly from CS theory, such as automata theory and type theory

Every phase is different
- All in all, every phase presents unique challenges, and makes use of different (math) concepts, data structures, and algorithms
- A major aspect of compiler design, therefore, is how to synthesize all of this into a coherent, reliable, and robust system

How This Course Works

Structure of this course
- Lectures
- Readings
  - Textbook & research papers
- Exercises & quizzes
  - Monday, 2:00-4:00
  - paper+pencil exercises + 2 quizzes
- Projects
  - development of a compiler from scratch

Online Resources
- Web site: http://staff.ustc.edu.cn/~bjhua/courses/spring10
- Critical course information:
  - course policies and schedule of lectures
  - readings and exercises
  - project information
  - development resources
  - discussion boards
  - late-breaking announcements
  - some lecture notes
- Read the web site every day!

Course staff resources
- TAs: Zhong Zhuang ([email protected])
- Me: [email protected]
- See the webpage for office hours, etc.
- The best way to get quick answers is to use email

Textbooks & Reference
- Modern Compiler Implementation in ML (tiger book)
- Compilers: Principles, Techniques, and Tools (dragon book)
- Advanced Compiler Design and Implementation (whale book)
- Engineering a Compiler (ark book)

Projects
- 7 projects planned
  - a trivial warm-up
- Worth a combined total of 70/100 points
- Each project involves developing:
  - a complete working compiler component
  - plus test programs
- Each project will build on the previous one
- You should work by yourself

Project mechanics
- Projects due at midnight on specified date
  - Test programs due one week prior
  - handed in automatically
- Test programs will be published for all to use
- Your project score is the percentage of all test programs that your compiler passes
  - some TA discretion is allowed

Grading your work
- Homework exercises to be turned in at lecture, to be graded by TAs
- For projects:
  - It must be possible to build and execute your compilers automatically
  - This means following the directions given on the webpage precisely
    - Make files, test file formats, etc.
  - We officially support SML
  - All compilers must target the x86 by generating assembly code that can be assembled by gcc running under Linux

Coding style
- For grading purposes, we will not read your code
- But you will be living with your code all term, so attention paid to commenting, good structure, and (especially) good modularity will be critical to staying sane!
- Also, you should understand your compiler completely
  - Think: Extreme Programming
  - Or at least do detailed design and interface development

Summary of Grading
- 70% for projects
  - determined by successful tests of your compiler on test programs
- 10% for homework exercises
  - given roughly every other week, to be handed in to TAs during lecture
- 20% for midterm and final quizzes

Late policy
- Firm due dates, so you should be able to plan and manage this
  - hence, late submissions generally not accepted
- See me if something serious comes up that causes you to need more time

About cheating
- Each team is required to do its own project work
  - read the cheating and collaboration policy on the Blackboard, if you want a course grade
  - We may use code-similarity checkers
- Collaboration is OK!
  - Sharing ideas, approaches, limited debugging help, etc. are all good
  - But you must write your own code

Choice of Programming Language
- Since we will not read your code, any language can be used
  - but we must be able to build and test your compiler automatically
- SML is officially supported
  - see webpage for more details

Choice of Language
- SML provides major advantages for compiler construction
- If you are unfamiliar with SML, then probably best to stick with Java
  - But pay extra-extra-extra-special attention to good modularity
  - Will say more about this during the term

Why SML?
- Each project will extend/enhance the previous project
- In SML, changes to an interface or module will cause the SML compiler to check all other modules that depend on it
- In Java, changes to a class often do not cause the Java compiler to complain, even if the changes affect other classes

Why SML? (cont'd)
- As a result, in Java, you will often be forced to test your changes by executing/testing your project compiler
- This is slow, painful, and error-prone, and you get almost no help in locating the source of bugs
- If you use Java, it is super-important to be very, very disciplined
  - See the basic principles in Ch.1!
  - It is probably important to make good use of the visitor pattern

Other Development Resources
- All projects will develop compilers that target the Intel x86 architecture
- x86 development tools will be GNU-based, running under Linux
- For some development, can use
  - MinGW system for Windows (but note incompatibilities with Linux gcc)
  - Mac OS X (with development tools installed)
- See the webpage for details

Summary
- This is intended to be a fun and engaging project-oriented class
- By the end of the term, you will have implemented a serious compiler for a nontrivial programming language!
- Stay engaged, and pay attention to coding style and modularity, and it will be fun and profitable

Last Thing
- Prepare the textbook
- Read the online SML book
- Bring your laptop to class next time