The labs use increasingly complete subsets of the C0 programming language, which was designed for the introductory course 15-122 Principles of Imperative Computation at Carnegie Mellon University. C0 is a safe subset of the C programming language. All students are strongly encouraged to learn C0, the language for which they will write a compiler in this course.
x86-64 Machine-Level Programming
The following documents will help you fathom the depths of machine-level programming on x86-64, a 64-bit extension of the Intel instruction set.
- x86-64 Machine-Level Programming. This document supplements Chapter 3 of the textbook for 15-213 Computer Systems: A Programmer's Perspective by Randal E. Bryant and David R. O'Hallaron.
- GNU Assembler User Guide. This is the version (2.15) available on the lab machines. It contains i386-specific features with notes on the differences between x86 and x86-64.
- x86-64 Application Binary Interface (ABI). Specifies the rules for compilers and linkers.
- Official Intel Processor Manuals (not for the faint of heart) including the Instruction Set Reference in two volumes.
IA32 and Assembler Reference Material
The following are for the older Intel x86 architectures. See the newer references above for the x86-64 (also known as IA32-EM64T).
LLVM
- LLVM Home Page
- On-line demo (for producing LLVM source from C)
- LLVM Notes for this class, on the lab machines
Java Virtual Machine (JVM)
Programming Languages for Compiler Implementation
You may write your compiler in one of a selected set of programming languages, or even in a different programming language (caveats apply). This course requires you to be familiar with the language you choose. You should learn it before the course starts so that you do not struggle with too many difficulties at once.
Learning New Programming Languages for the Labs
Students are always encouraged to learn new things and new programming languages. Haskell, for instance, is also a particularly good language for the labs. Because ML, Haskell, and Scala have built-in pattern matching, several transformations are easier to implement in them than in Java. If you want to learn a new programming language for your lab, consider the following:
- Unfortunately, course staff cannot provide much assistance in learning new languages. You have to learn your programming language of choice from a language reference or tutorials.
- Since each lab builds on work done in the previous labs, you have to be completely committed to working with your programming language of choice. You cannot switch languages midway without redoing the work from all previous labs.
- If you are already familiar with one functional and/or type-safe programming language, it is a lot easier to learn Haskell or ML in the given time frame.
- There is a tradeoff between the time invested in learning a new programming language at the beginning of the course and the potential extra effort of implementing more advanced compiler features in a more verbose language at the end of the course.
- Compilers have been written successfully in a lot of different programming languages.
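To illustrate the remark above that pattern matching makes several transformations easier to implement, here is a minimal sketch of constant folding in Haskell. The `Expr` type and the `simplify` function are hypothetical names chosen for this example; they are not part of the course's starter code, and the sketch ignores details such as the integer width of the target language.

```haskell
module Main where

-- Hypothetical expression AST for illustration only;
-- not the intermediate representation used in the labs.
data Expr
  = Num Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- Constant folding: simplify the children first, then pattern match
-- on the simplified results to decide how to rebuild the node.
-- Assumption: plain Haskell Int arithmetic stands in for the
-- target language's integer semantics.
simplify :: Expr -> Expr
simplify (Add a b) =
  case (simplify a, simplify b) of
    (Num x, Num y) -> Num (x + y)   -- both operands known: fold
    (Num 0, e)     -> e             -- 0 + e  ==>  e
    (e, Num 0)     -> e             -- e + 0  ==>  e
    (a', b')       -> Add a' b'     -- nothing to fold
simplify (Mul a b) =
  case (simplify a, simplify b) of
    (Num x, Num y) -> Num (x * y)   -- both operands known: fold
    (Num 1, e)     -> e             -- 1 * e  ==>  e
    (e, Num 1)     -> e             -- e * 1  ==>  e
    (a', b')       -> Mul a' b'     -- nothing to fold
simplify e = e                      -- literals and variables are unchanged

main :: IO ()
main =
  -- (2 * 3) + (x + 0) simplifies to 6 + x
  print (simplify (Add (Mul (Num 2) (Num 3)) (Add (Var "x") (Num 0))))
```

Each rewrite rule is a single case alternative, and nested patterns such as `(Num 0, e)` inspect both children at once. An equivalent Java version would typically need a visitor or a chain of `instanceof` checks with casts for the same rules.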
Standard ML
Standard ML Implementations:
Standard ML of New Jersey (SML/NJ)
Default (v110.59) on the lab machines; invoke with sml
Recent versions are likely to be compatible with SML/NJ v110.59
A highly optimizing, whole program compiler mostly compatible with SML/NJ
Another high quality compiler
- The Standard ML Basis Library
- SML/NJ Libraries
- ML-Lex Manual
- ML-Yacc Manual
- ML-ANTLR in ml-lpt
- Programming in Standard ML by Robert Harper
Java
- Java API
- JLex: lexer generator
- JFlex: lexer generator
- CUP: LALR parser generator
- ANTLR: lexer and LL parser generator
- JavaCC: lexer and LL parser generator
Haskell
- Haskell books and tutorials
- Parsec: monadic parser combinator library for Haskell
- Happy: parser generator for Haskell
Scala
- Scala reference manuals and tutorials
- Scala API
- scala.util.parsing.combinator package
- ScalaBison: recent LR parser generator