OODL Runtime Optimizations II
Jonathan Bachrach
MIT AI Lab
Mini Assignment
• Distinguished runtime panel!
• Come up with three questions based on
today’s and other runtime lectures for our
runtime panel.
• Due next Thursday, March 15th.
Schedule Has Changed
• Check out website
• Kostas is getting more time on proof-based
compilation
Project Looming
• After spring break we will go into project
mode.
• Make note of interesting subjects
Outline
• Boxing
• Type checks
• Slot access
• Calling conventions
Issues
• Redefinition
• Multiple threads
• Predictable performance
Tagging Basics
• Objects all the way down
• Integers are objects and must be self-identifying
• Integer ops must first untag their arguments, then run the
machine-level integer op, and finally tag the result
(dm + ((x <int>) (y <int>) => <int>)
  (%ib (%i+ (%iu x) (%iu y))))
Tag Operations
• Tag insertion -- tag data
• Tag extraction -- return tag
• Tag removal -- return data
• Tag checking -- type-check
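The four operations can be sketched on Python integers standing in for machine words. This is only an illustration: the 2-bit low tag, the tag values, and all function names are assumptions, not part of the runtime described here.

```python
TAG_BITS = 2                       # assumed tag width
TAG_MASK = (1 << TAG_BITS) - 1
TAG_INT = 0b00                     # assumed tag values
TAG_CHR = 0b01

def tag_insert(data, tag):
    """Tag insertion: pack data and tag into one word."""
    return (data << TAG_BITS) | tag

def tag_extract(word):
    """Tag extraction: return the tag."""
    return word & TAG_MASK

def tag_remove(word):
    """Tag removal: return the data."""
    return word >> TAG_BITS

def tag_check(word, tag):
    """Tag checking: does the word carry the expected tag?"""
    return tag_extract(word) == tag

def tagged_add(x, y):
    """Untag both arguments, add, and tag the result."""
    if not (tag_check(x, TAG_INT) and tag_check(y, TAG_INT)):
        raise TypeError("both arguments must be tagged integers")
    return tag_insert(tag_remove(x) + tag_remove(y), TAG_INT)
```

`tagged_add` plays the role of the `+` method from Tagging Basics, with `tag_remove` and `tag_insert` standing in for `%iu` and `%ib`.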
Boxing (Indirect Tagging)
[Figure: boxed integer 119 with a traits field pointing to its traits object (owner, ...)]
• Pros
– Uniform representation -- object-traits is simple
• Cons
– Conses (every boxed value is heap-allocated)
– Memory indirections
Immediate Tagging
payload   tag   type
data      00    <int>
data      01    <chr>
address   10    <loc>
address   10    <any>
• Immediate tag:
– Tag bits encode type
– Remaining bits encode data/pointer
Low Tagging
• Integers are tagged with 0’s in low bits
• Pointers are tagged with 1’s in low bits
– Tag removed using an addressing mode
• Pros
– Hardware overflow
– Can be added/subtracted without untagging
• Cons
– Must untag to multiply
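A minimal sketch of why low-tagged integers add directly but must be untagged to multiply, and of how a pointer tag can be folded into a load displacement. The 2-bit tag width, the `0b11` pointer tag, the dictionary used as memory, and all names are assumptions for illustration. Hardware overflow detection is not modeled.

```python
TAG_BITS = 2
PTR_TAG = 0b11                     # assumed pointer tag: low bits all ones

def box_int(n):
    """An integer n is stored as n * 4, so its low tag bits are 00."""
    return n << TAG_BITS

def unbox_int(word):
    return word >> TAG_BITS

def add_tagged(a, b):
    """4a + 4b == 4(a + b): the sum is already correctly tagged."""
    return a + b

def mul_tagged(a, b):
    """4a * 4b would be 16ab, so one operand is untagged first."""
    return unbox_int(a) * b

def load_field(memory, tagged_ptr, offset):
    """The displacement (offset - PTR_TAG) removes the tag for free,
    like a [reg + offset - tag] addressing mode."""
    return memory[tagged_ptr + (offset - PTR_TAG)]
```

For example, with `memory = {16: "a", 20: "b"}`, the tagged pointer `16 | PTR_TAG` and offset 4 reach address 20 without a separate untagging step.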
High Tagging
• Integers are tagged as follows:
– 0…0 for positive integers
– 1…1 for negative integers
• Pros
– All operations work without untagging
• Cons
– Must manually check for overflow
– Reduces address space
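A hedged sketch of the high-tagging idea, assuming a 32-bit word with a 2-bit high tag (both sizes and all names are assumptions). A small integer is simply its own two's-complement representation, so addition runs on the raw words and the manual overflow check is a test that the high bits are still all zeros or all ones.

```python
WORD_BITS = 32                     # assumed word size
TAG_BITS = 2                       # assumed high-tag width
WORD_MASK = (1 << WORD_BITS) - 1
INT_MIN = -(1 << (WORD_BITS - TAG_BITS))
INT_MAX = (1 << (WORD_BITS - TAG_BITS)) - 1

def box_int(n):
    """Two's-complement encoding: high bits are 0...0 or 1...1."""
    if not INT_MIN <= n <= INT_MAX:
        raise OverflowError("integer does not fit under the tag")
    return n & WORD_MASK

def is_int(word):
    top = word >> (WORD_BITS - TAG_BITS)
    return top == 0 or top == (1 << TAG_BITS) - 1

def unbox_int(word):
    """Reinterpret the word as a signed value (used only to inspect results)."""
    return word - (1 << WORD_BITS) if word >> (WORD_BITS - 1) else word

def add_tagged(a, b):
    """Add the raw words, then check manually for overflow."""
    result = (a + b) & WORD_MASK
    if not is_int(result):
        raise OverflowError("result no longer carries an integer tag")
    return result
```

This tag check is enough for addition because the sum of two tagged integers always fits in the word. Multiplication can wrap back into a valid-looking pattern, so it would need a stricter check.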
Fat Tags
• Tags and data are both a whole word
– An extra word is passed with each argument and stored in
objects when the exact type can’t be proven
• Depends on type-inference and parameterized
objects to remove need for tags
• Pros
– Uniform representation
– Lightweight arithmetic
• Cons
– Slot write atomicity
– Requires type-inference
Type Checks
• Type checks are very frequent at both
compile time and run time
• Fast type checks can greatly speed up both the
compiler and the runtime
Type Checking Example
[Figure: example type hierarchy with A at the root, B and C below A, and D below B]
Search Parents
(dv <traits> (<any>))
(slot <traits> (traits-parents <lst>))
(df isa? (x y => <log>)
  (mem? (object-parents y) (object-owner x)))
• Pros:
– simple
– space efficient
• Cons:
– slow!
– Not constant time
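A minimal sketch of a parent-searching type check over the A, B, C, D example. The `Traits` class and its field names are hypothetical. Unlike a single membership test on a parents list, this version walks the parent links transitively, which makes the non-constant cost visible.

```python
class Traits:
    """Hypothetical per-type record holding only direct parents."""
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)

def isa(x_traits, y_traits):
    """True if y_traits is x_traits itself or one of its ancestors.
    Cost grows with the depth and width of the hierarchy."""
    if x_traits is y_traits:
        return True
    return any(isa(parent, y_traits) for parent in x_traits.parents)

A = Traits("A")
B = Traits("B", [A])
C = Traits("C", [A])
D = Traits("D", [B])
```

Here `isa(D, A)` has to visit B before reaching A, while `isa(D, C)` exhausts every ancestor of D before answering no.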
Bit Matrix
(dv <traits> (<any>))
(slot <traits> (traits-type-pos <int>))
(slot <traits> (traits-type-row <vec>))
(df isa? (x y => <log>)
  (let ((row-idx (traits-type-pos (object-traits y)))
        (word (elt (traits-type-row (object-traits x)) row-idx)))
    (= word 1)))
• Pros
– Constant time
– Fast
• Cons
– Quadratic size
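A sketch of the bit-matrix test on the same A, B, C, D example. Each type gets a position and a row with one bit per type. As an assumption of this sketch, the row is packed into a Python integer rather than stored as a vector, and the class and field names are hypothetical.

```python
class Traits:
    """Hypothetical per-type record: a column position plus a row of bits."""
    def __init__(self, name, pos, parents=()):
        self.name = name
        self.type_pos = pos
        row = 1 << pos                 # every type is a subtype of itself
        for parent in parents:
            row |= parent.type_row     # inherit all ancestor bits
        self.type_row = row

def isa(x_traits, y_traits):
    """Constant time: test one bit of x's row at y's position."""
    return ((x_traits.type_row >> y_traits.type_pos) & 1) == 1

A = Traits("A", 0)
B = Traits("B", 1, [A])
C = Traits("C", 2, [A])
D = Traits("D", 3, [B])
```

With n types each row needs n bits, which is where the quadratic size comes from.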
Cohen’s Algorithm
(dv <traits> (<any>))
(slot <traits> (traits-tid <int>))
(slot <traits> (traits-type-level <int>))
(slot <traits> (traits-type-row <vec>))
(df isa? (x y => <log>)
  (let ((x-traits (object-traits x))
        (y-traits (object-traits y))
        (y-level (traits-type-level y-traits)))
    (and (>= (traits-type-level x-traits) y-level)
         (= (elt (traits-type-row x-traits) y-level)
            (traits-tid y-traits)))))
• Pros
– Constant time
– Reasonably fast
– Incremental
• Cons
– Must be generalized for multiple inheritance
[Figure: type hierarchy with A at the root, B and C below A, and D below B]
type  tid  lvl  row
A     1    0    1
B     2    1    1 2
C     3    1    1 3
D     4    2    1 2 4
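The table can be reproduced with a short single-inheritance sketch. The class and field names are hypothetical. Each type stores its id, its level, and the row of ids along the path from the root. The test passes when x is at least as deep as y and x's row holds y's id at y's level.

```python
class Traits:
    """Hypothetical per-type record for Cohen-style display tests."""
    def __init__(self, tid, parent=None):
        self.tid = tid
        self.level = 0 if parent is None else parent.level + 1
        # Row = ids of all ancestors from the root down, ending with own id.
        self.row = ([] if parent is None else parent.row) + [tid]

def isa(x_traits, y_traits):
    """Constant time: one level comparison, one indexed load, one compare."""
    return (x_traits.level >= y_traits.level
            and x_traits.row[y_traits.level] == y_traits.tid)

A = Traits(1)
B = Traits(2, A)
C = Traits(3, A)
D = Traits(4, B)
```

Adding a new type only builds its own row from its parent's row, which is what makes the scheme incremental.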
Compressed Bit Matrix
• Insight: type inclusion bit matrices are
sparse
• Idea: reuse columns for unrelated types
• Vitek et al. 1997
• 4 instruction sequence
• Tremendous compression
Slot Access
• Consider generic slot access
– Find most applicable method
– Find slot offset
– Access slot value
• Incredibly expensive!
Combining Slot Access with
Method Dispatch
• Dispatch table leaves become either
methods or slot-offsets
– Works because dispatch is on concrete classes, which gives
enough precision to resolve slots that sit at different
offsets in different objects
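A sketch of a dispatch table whose leaves are either slot offsets or methods, keyed on the concrete class of the argument. The classes `<pt>`, `<tagged-pt>`, and `<polar-pt>`, their layouts, and all helper names are hypothetical and chosen only to show the same slot at different offsets.

```python
import math

class Obj:
    """Hypothetical object: a concrete class name plus a vector of slots."""
    def __init__(self, cls, slots):
        self.cls = cls
        self.slots = slots

def make_generic(table):
    """table maps a concrete class name to an int offset or a method."""
    def generic(obj):
        leaf = table[obj.cls]
        if isinstance(leaf, int):      # slot-offset leaf: direct load
            return obj.slots[leaf]
        return leaf(obj)               # method leaf: ordinary call
    return generic

point_x = make_generic({
    "<pt>": 0,                         # x stored at offset 0
    "<tagged-pt>": 1,                  # x stored at offset 1 (after a label)
    "<polar-pt>": lambda p: p.slots[0] * math.cos(p.slots[1]),
})
```

A single table lookup then replaces the three separate steps of finding the method, finding the offset, and reading the slot.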
Fast Generic Slot Access
[Figure: the program's call (point-x pt) goes through the point-x cache, which maps <pt> and <cpt> each to slot offset 0]
Calling Conventions
• Tie down the protocol for function calls
• Provide pay-as-you-go mechanisms
Entry Points
• Internal entry points
– For intimate calls whose conformance has already been proven
• External entry points
– Checks number of arguments
– Checks argument types
– Stack conses optionals
– Shareable based on arity (and nary)
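A rough sketch of the split between the two entry points. The factory `make_entry_points`, the use of Python types for argument checks, and the tuple used for optionals are all assumptions made for illustration. The external entry does the checks and gathers optionals, then jumps to the unchecked internal entry.

```python
def make_entry_points(internal, required_types):
    """Return (internal, external) entry points for one function.
    The checking code is generic, so it can be shared by every
    function with the same arity."""
    arity = len(required_types)

    def external(*args):
        if len(args) < arity:                        # check argument count
            raise TypeError("too few arguments")
        for arg, expected in zip(args, required_types):
            if not isinstance(arg, expected):        # check argument types
                raise TypeError("argument has wrong type")
        optionals = args[arity:]                     # gather optionals
        return internal(*args[:arity], optionals)

    return internal, external

def add_all(x, y, rest):
    """Body with no checks: callers must already conform."""
    return x + y + sum(rest)

internal, external = make_entry_points(add_all, (int, int))
```

A caller that has already proven the arity and types calls `internal(1, 2, ())` directly and pays for no checks, while an unknown caller goes through `external(1, 2, 3, 4)`.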
Lightweight Optionals
• Calling convention:
– Caller stack allocates optionals
– Callee heapifies optionals unless it can prove
dynamic extent
Open Problems
• Is indirect tagging feasible?
• Type checking in the face of redefinition
• Communicate compile-time analysis to
runtime
Reading List
• Vitek et al. 1997
• Chambers 1999
• Peter Lee 1993 *
* On reserve in the AI Lab reading room
Dispatch Cache Assignment
Due Next Tuesday
• Write table-based decision tree method
dispatch cache
Proto Feedback