The Practice of Type Theory - Carnegie Mellon University

The Practice of Type Theory
in Programming Languages
Robert Harper
Carnegie Mellon University
August, 2000
Acknowledgements
• Thanks to Reinhard Wilhelm for inviting
me to speak!
• Thanks to my colleagues and to my former and
current students at Carnegie Mellon.
An Old Story
• Once upon a time (es war einmal), there were
those who thought that typed high-level
programming languages would save the
world.
– Ensure safety of executed code.
– Support reasoning and verification.
– Run efficiently (enough) on stock hardware.
• “If we all programmed in Pascal (or Algol or
Simula or …), all of our problems would be
solved.”
What Happened Instead
• Things didn’t work out quite as
expected or predicted.
– COTS software is mostly written in low-level, unsafe languages (eg, C, C++).
– Some ideas have been adopted (eg,
objects and classes), but most haven’t.
– Developers have learned to work with less-than-perfect languages, achieving
astonishing results.
Languages Ride Again
• But the world has changed: strong safety
assurances are more important than ever.
– Mobile code on the internet.
– Increasing reliance on software in “real life”.
• Schneider made a strong case for language-based security mechanisms.
– “Languages aren’t just languages any more.”
– Rich body of work on logics, semantics, type
systems, verification, compilation.
Language-Based Security
• Key idea: program analysis is more
powerful than execution monitoring.
• This talk is about one approach to
taking this view seriously, typed
certifying compilation.
Type Theory and Languages
• Type theory has emerged as the central
organizing principle for language …
– Design: genericity, abstraction, and
modularity mechanisms.
– Implementation: type inference, flow
analysis.
– Semantics: domain theory, logical relations.
What is a Type System?
• A type system is a syntactic discipline
for enforcing levels of abstraction.
– Ensures that bad things do not happen.
• A type system rules out programs.
– Adding a function to a string
– Interpreting an integer as a pointer
– Violating interfaces
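A minimal sketch of how a checker "rules out programs", assuming a hypothetical mini-language invented for illustration (integer, string, and function constants plus addition). The names IntLit, StrLit, FunLit, Add, and type_of are not from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntLit:
    value: int


@dataclass(frozen=True)
class StrLit:
    value: str


@dataclass(frozen=True)
class FunLit:
    name: str  # stands for some constant of type int -> int (an assumption)


@dataclass(frozen=True)
class Add:
    left: object
    right: object


class IllTyped(Exception):
    """Raised when an expression is rejected by the type discipline."""


def type_of(e):
    """Return the type of e as a string, or raise IllTyped."""
    if isinstance(e, IntLit):
        return "int"
    if isinstance(e, StrLit):
        return "string"
    if isinstance(e, FunLit):
        return "int -> int"
    if isinstance(e, Add):
        lt, rt = type_of(e.left), type_of(e.right)
        if lt == "int" and rt == "int":
            return "int"
        # This is where "adding a function to a string" is ruled out.
        raise IllTyped(f"cannot add {lt} and {rt}")
    raise IllTyped(f"unknown expression: {e!r}")


def well_typed(e):
    """True if e has a type; the expression is never executed."""
    try:
        type_of(e)
        return True
    except IllTyped:
        return False
```

The check is purely syntactic: well_typed(Add(FunLit("f"), StrLit("s"))) is False without running anything.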
What is a Type System?
• How can this be a good thing?
– Expressiveness arises from strictures:
restrictions entail stronger invariants
– Flexibility arises from controlled relaxation
of strictures, not from their absence.
• A type system is fundamentally a
verification tool that suffices to ensure
invariants on execution behavior.
Types Induce Invariants
• Types induce invariants on programs.
– If e : int, then its value must be an integer.
– If e : int → int, then it must be a function
taking and yielding integers.
– If e : filedesc, then it must have been
obtained by a call to open.
– If e : int{H}, then no “low clearance”
expression can read its value.
Types Induce Invariants
• These invariants provide
– Safety properties: well-typed programs do
not “go wrong”.
– Equational properties: when are two
expressions interchangeable in all
contexts.
– Representation independence
(parametricity).
Types as Safety Certificates
• Typing is a sufficient condition for these
invariants to hold.
– Well-typed implies well-behaved.
– Not (necessarily) checkable at run-time!
• Types form a certificate of safety.
– Type checking = safety checking.
– A practical sufficient condition for safety.
The HLL Assumption
• This is well and good, but …
– Programs are compiled to unsafe, low-level
machine code.
– We want to know that the object code is safe.
• HLL assumption: trust the correctness of the
compiler and run-time system.
– A huge assumption.
– Spurred much research in compiler correctness.
Certifying Compilers
• Idea: propagate types from the source
to the object code.
– Can be checked by a code recipient.
– Avoids reliance on compiler correctness.
• Based on a new approach to
compilation.
– Typed intermediate languages.
– Type-directed translation.
Typed Intermediate Languages
• Generalize syntax-directed translation
to type-directed translation.
– intermediate languages come equipped
with a type system.
– compiler transformations translate both a
program and its type.
– translation preserves typing: if e:T then
e*:T* after translation
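A small sketch of the property "if e:T then e*:T*", assuming a hypothetical source language with int and bool and a hypothetical target language with int only. The tuple encoding and every function name below are assumptions made for illustration, not the intermediate languages of any real compiler.

```python
# Source terms: ("int", n) | ("bool", b) | ("add", e1, e2) | ("if", c, e1, e2)
# Target terms: ("int", n) | ("add", e1, e2) | ("ifnz", c, e1, e2)


def source_type(e):
    """Type checker for the source language."""
    tag = e[0]
    if tag == "int":
        return "int"
    if tag == "bool":
        return "bool"
    if tag == "add":
        if source_type(e[1]) == "int" and source_type(e[2]) == "int":
            return "int"
        raise TypeError("add expects two ints")
    if tag == "if":
        if source_type(e[1]) != "bool":
            raise TypeError("if expects a bool condition")
        t1, t2 = source_type(e[2]), source_type(e[3])
        if t1 != t2:
            raise TypeError("branches of if must have the same type")
        return t1
    raise TypeError(f"unknown source term: {tag!r}")


def target_type(e):
    """Type checker for the target language, which has only int."""
    tag = e[0]
    if tag == "int":
        return "int"
    if tag in ("add", "ifnz"):
        if all(target_type(sub) == "int" for sub in e[1:]):
            return "int"
        raise TypeError(f"{tag} expects int operands")
    raise TypeError(f"unknown target term: {tag!r}")


def translate_type(t):
    """T*: booleans are represented as integers in the target."""
    return {"int": "int", "bool": "int"}[t]


def translate(e):
    """e*: translate a source term to a target term."""
    tag = e[0]
    if tag == "int":
        return e
    if tag == "bool":
        return ("int", 1 if e[1] else 0)
    if tag == "add":
        return ("add", translate(e[1]), translate(e[2]))
    if tag == "if":
        return ("ifnz", translate(e[1]), translate(e[2]), translate(e[3]))
    raise TypeError(f"unknown source term: {tag!r}")


def preserves_typing(e):
    """Check e : T implies e* : T* for one particular term e."""
    return target_type(translate(e)) == translate_type(source_type(e))
```

Checking the translated term with target_type does not rely on translate being correct, which is the point of keeping types through compilation.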
Typed Intermediate Languages
• Classical syntax-directed translation:
Source = L1 → L2 → … → Ln = Target
         :
         T1
• Type system applies to the source
language only.
– Type check, then throw away types.
Typed Intermediate Languages
• Type-directed translation:
Source = L1 → L2 → … → Ln = Target
         :    :        :
         T1 → T2 → … → Tn
• Maintain types during compilation.
– Translate a program and its type.
– Types guide translation process.
Typed Object Code
• Typed Assembly Language (TAL)
– type information ensures safety
– generated by compiler
– very close to standard x86 assembly
• Type information captures
– types of registers and stack
– type assumptions at branch targets (including join
points)
• Relies heavily on polymorphism!
– eg, callee-saves registers, enforcing abstraction
Typed Assembly Language
fact:
  ALL rho.{r1:int, sp:{r1:int, sp:rho}::rho}
  jgz r1, positive
  mov r1,1
  ret
positive:
  push r1 ; sp : int::{r1:int,sp:rho}::rho
  sub r1,r1,1
  call fact[int::{r1:int,sp:rho}::rho]
  pop r2 ; sp : {r1:int,sp:rho}::rho
  imul r1,r1,r2
  ret
Tracking Stronger Properties
• Familiar type systems go a long way.
– Ensures minimal sanity of code.
– Ensures compliance with interfaces.
– Especially if you have polymorphism.
• Refinement types take a step further.
– Track value range invariants.
– Array bounds checks, null pointer checks,
red-black invariants, etc.
Refinement Types
• First idea: subset types.
e : { x : T | P(x) } iff e:T and |= P(e)
• Examples:
– Pascal-like sub-ranges
0..n = { n : int | 0 ≤ n < length(A) }
– Non-null objects
– Red-black condition on RBT’s
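A rough sketch of the subset-type rule, assuming we only test membership of already-computed values at run time (the talk is about establishing this statically). The class Subset and the helper index_type are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Subset:
    """The subset type { x : base | pred(x) }."""
    base: type
    pred: Callable[[object], bool]

    def contains(self, value):
        # e : { x : T | P(x) }  iff  e : T  and  P(e) holds.
        # The exact type test comes first, so pred only sees values of type base.
        return type(value) is self.base and self.pred(value)


def index_type(array):
    """Valid indices of array: { n : int | 0 <= n < len(array) }."""
    return Subset(int, lambda n: 0 <= n < len(array))
```

For a three-element array, index_type accepts 0, 1, and 2 and rejects everything else, including non-integers.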
Refinement Types
• Checking value range properties is
undecidable!
– eg, cannot decide if 0 ≤ e < 10 for general
expressions e
• Checker must include a theorem prover
to validate object code.
– either complex and error prone, or
– too weak to be useful
Refinement Types
• Second idea: proof carrying code.
(e, π) : { x:T | P(x) } iff e:T and π |- P(e)
• Provide a proof of the range property.
– How to obtain it?
– How to represent it?
• Verifier checks the types and the proof.
– using a proof checker, not a proof finder
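To illustrate "a proof checker, not a proof finder", here is a minimal sketch for a toy logic of facts a ≤ b over natural numbers. The three rules (refl, step, trans) and the tuple encoding of proofs are assumptions chosen for brevity, not the logic used in any actual proof-carrying code system.

```python
class BadProof(Exception):
    """Raised when a proof term is malformed or misapplies a rule."""


def conclusion(proof):
    """Return (a, b) such that proof establishes a <= b, or raise BadProof.

    Rules of the toy logic:
      ("refl", n)      proves n <= n
      ("step", p)      proves a <= b + 1, given that p proves a <= b
      ("trans", p, q)  proves a <= c, given p proves a <= b and q proves b <= c
    """
    if not isinstance(proof, tuple) or not proof:
        raise BadProof("a proof must be a non-empty tuple")
    rule, *parts = proof
    if rule == "refl" and len(parts) == 1:
        (n,) = parts
        if not isinstance(n, int) or n < 0:
            raise BadProof("refl expects a natural number")
        return (n, n)
    if rule == "step" and len(parts) == 1:
        a, b = conclusion(parts[0])
        return (a, b + 1)
    if rule == "trans" and len(parts) == 2:
        a, b = conclusion(parts[0])
        b2, c = conclusion(parts[1])
        if b != b2:
            raise BadProof("trans: middle terms do not match")
        return (a, c)
    raise BadProof(f"unknown or misapplied rule: {rule!r}")


def check(claim, proof):
    """Accept only if proof really establishes claim; no search is performed."""
    try:
        return conclusion(proof) == claim
    except BadProof:
        return False
```

The checker walks the proof once and never has to guess: check((3, 5), ("step", ("step", ("refl", 3)))) succeeds, while the same proof offered for the claim (5, 3) is rejected.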
Finding Proofs
• To use A[n] safely, we must prove that
0 ≤ n < size(A).
• If we insert a run-time check, it’s easy!
– if 0 ≤ n < size(A) then *(A+4n) else fail
• In general we must find proofs.
– Instrumented analysis methods.
– Programmer declarations.
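The run-time check case can be sketched as follows, assuming a Python list stands in for the array A and an exception stands in for "fail". The name checked_read is hypothetical, and the 4-byte address arithmetic of *(A+4n) is left to the list implementation.

```python
def checked_read(array, n):
    """Read array[n] only after the dynamic test 0 <= n < len(array) succeeds."""
    if 0 <= n < len(array):
        # Inside this branch the test itself is the evidence for the range
        # property, so the access below cannot be out of bounds.
        return array[n]
    # Negative indices are rejected too, although plain Python lists
    # would accept them and count from the end.
    raise IndexError("fail: index out of range")
```

The proof is "easy" here because the branch condition is exactly the property needed; the cost is a test on every access.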
Representing Proofs
• How do we represent the proofs?
– Need a formal logic for reasoning about
value range properties (for example).
– Need a proof checker for each such
formalism.
• But which logic should we use?
– How do we accommodate change?
– Which properties are of interest?
Logical Frameworks
• The LF logical framework is a universal
language for defining logical systems.
– Captures uniformities of a large class of logical
systems.
– Provides a formal definition language for logical
systems.
• Proof checking is reduced to a very simple
form of type checking.
– One type checker yields many proof checkers!
General Certified Code
• The logic is part of the safety certificate!
– Logic of type safety.
– Logic of value ranges.
– Logic of space requirements.
• Proofs are LF terms for that logic.
– Checker is parameterized on specification of the
logic (an LF “signature”).
– LF type checker checks proofs in any logic
(provided it is formalized in LF).
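A loose analogy for "one checker, many logics", assuming a signature is just a table from rule names to Python functions. This is not LF, which encodes judgments as dependent types and reduces proof checking to type checking. All names below are hypothetical.

```python
def conclude(signature, proof):
    """Return the judgment that proof establishes under signature.

    A proof is a triple (rule_name, args, subproofs). The checker knows
    nothing about any particular logic; the rules live in the signature.
    """
    rule, args, subproofs = proof
    if rule not in signature:
        raise ValueError(f"rule {rule!r} is not in the signature")
    premises = [conclude(signature, p) for p in subproofs]
    return signature[rule](args, premises)


def check(signature, claim, proof):
    """Accept only if proof establishes claim in the given logic."""
    try:
        return conclude(signature, proof) == claim
    except (ValueError, TypeError):
        return False


# Logic 1: judgments ("le", a, b), read as a <= b.
def le_refl(args, premises):
    (n,) = args
    if premises:
        raise ValueError("refl takes no premises")
    return ("le", n, n)


def le_step(args, premises):
    ((tag, a, b),) = premises  # exactly one premise of the form ("le", a, b)
    if args or tag != "le":
        raise ValueError("step expects one le premise and no arguments")
    return ("le", a, b + 1)


LE_SIGNATURE = {"refl": le_refl, "step": le_step}


# Logic 2: judgments ("even", n).
def even_zero(args, premises):
    if args or premises:
        raise ValueError("zero takes no arguments or premises")
    return ("even", 0)


def even_plus2(args, premises):
    ((tag, n),) = premises  # exactly one premise of the form ("even", n)
    if args or tag != "even":
        raise ValueError("plus2 expects one even premise and no arguments")
    return ("even", n + 2)


EVEN_SIGNATURE = {"zero": even_zero, "plus2": even_plus2}
```

The same check function validates proofs in both logics; a proof written for one signature is rejected under the other because its rules are not in the table.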
Some Challenges
• Can certified compilation really be made
practical?
– TALC [Morrisett] for “safe C”.
– TILT [CMU] for Standard ML [in progress].
– SML/NJ [Yale] for Standard ML [in
progress].
– Touchstone [Necula, Lee] for “safe C”.
Some Challenges
• Can refinements be made useful and
practical?
– Dependent ML [Pfenning, Xi]
– Dependently-Typed Assembly [Harper, Xi]
• Experience with ESC is highly relevant.
– A difference is that refinements are built in
to the language.
Some Predictions
• Certifying compilation will be standard
technology.
– Code will come equipped with checkable safety
certificates.
• Type systems will become the framework for
building practical development tools.
– Part of the program text.
– Mechanically checkable.
Further Information
http://www.typetheory.com
