Four-A: Component Adaptation and Assurance

Bill Scherlis
Institute for Software Research
School of Computer Science
CMU
[email protected]
412-268-8741

DARPA ITS PI Meeting
22 Feb 00
Aspen, CO

With: John Tang Boyland (UWM), Aaron Greenhouse, Edwin Chan

Carnegie Mellon Scherlis

This Presentation
• Technical objectives
  – Code-level assurance
    • In development and adaptation
  – Application to specific assurance properties
    • Code safety and threading.
    • Frameworks.
• Technical approach
  – Semantics-based manipulation
    • Structural. Threads. Etc.
  – Annotation and analysis
    • Uniqueness. Effects. Etc.
  – Tool-based studies
    • Java source-level manipulation
• Existing practice
• Accomplishments & plans
• Premises and scope
• Transition
  – Adaptation: JDK evolution
  – Security: CERT data
  – Schedule
  – Expected accomplishments
  – Tool
  – Infrastructure

Four-A Technical Objectives
1. Improve source-level software assurance
  – Systematically improve code safety, tolerance, etc., using source-level analysis, annotation, transformation.
  – Improve the extent of formal assurance using analyses, annotation, transformation.
  – Provide scalable and composable approaches for a variety of code-safety properties, based on annotations.
2. Provide ongoing assurance thru evolution
  – Avoid re-verification of code safety, tolerance, and other properties as software components and systems evolve.
  – Support programmer through adaptation by formally analyzing and carrying out changes, preserving and enhancing assurance where possible.

A Simple Motivating Example: Thread Safety
Annotation. Manipulation. Analysis.
1. Thread safety and security
  • CERT vulnerability data
  • Exploitation scenario: incremental thread capture
2. Locks and code evolution
3. Technical elements

Two Documented Vulnerabilities (CERT)
Name: ibm/mknod
• Keywords: IBM, AIX, setuid, root access, race condition
• Description: Some (if not all) versions of AIX have a setuid /usr/sbin/mknod so that ordinary users may create named pipes. This is done with a mknod(2) system call followed by a chown(2) system call, which opens a race condition if the user renames the named pipe and links it to another file before the chown(2) call. Ordinary users may thus “steal” other users' files and thereby gain unauthorized root access.
• Impact: local user gains root access
Name: noclobber timing window
• Keywords: noclobber; timing window; race condition
• Description: There is a race condition with respect to the shell variable noclobber in some implementations of csh/tcsh. Noclobber is supposed to prevent files from being overwritten if they exist. If the file doesn’t exist, some implementations of csh determine that fact with a stat() call. If stat returns ENOENT then the shell proceeds to write on the file. However, the file could be created between the stat and write calls, thus defeating the purpose of the noclobber variable.
• Impact: files are overwritten.
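The noclobber timing window is a check-then-act race. As a minimal sketch (not taken from the talk or from the CERT data), the same pattern can be shown in Java; the class and method names here are hypothetical, chosen only for illustration. The first method repeats the flaw (existence check, then write); the second asks the file system to perform the check and the creation as one operation via the standard `StandardOpenOption.CREATE_NEW` option.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of the Four-A tool.
public class NoClobberSketch {

    // Racy: another process may create the file after the check
    // and before the write, so an existing file can be overwritten.
    static boolean writeIfAbsentRacy(Path p, String text) throws IOException {
        if (Files.exists(p)) {
            return false;
        }
        // <-- timing window between the check and the write
        Files.write(p, text.getBytes(StandardCharsets.UTF_8));
        return true;
    }

    // No window: CREATE_NEW makes the existence check and the creation
    // a single operation that fails if the file already exists.
    static boolean writeIfAbsentAtomic(Path p, String text) throws IOException {
        try {
            Files.write(p, text.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // the file was already there; leave it untouched
        }
    }
}
```

The two methods behave identically in single-threaded tests, which is part of why this class of defect survives ordinary testing.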
An Aside: The CERT Vulnerability Taxonomy (~ 1200 vulnerabilities)
• Assumptions wrong or changed
• Design errors
• Errors in requirements specifications
• Implementation errors
  – Basic programming practices
  – Improper use of a well understood algorithm
  – Privileged programs
  – Timing windows
  – Trusts something not designed to support trust
  – Trusts untrustworthy information
• Other problems
• User interface

Evolving Multi-Threaded Code
Work in progress – Aaron Greenhouse
Why
– Improve code safety and robustness
– Improve performance and flexibility
How
– Annotations
  • Locks associated with regions (encapsulated sets of fields)
  • Assignment of locks to (final) fields or instance variables
  • Lock ordering
– Manipulations
  • Shrink lock
  • Split/merge locks
– Analyses
  • (multiple)
– Tool support

EventQueue
• EventQueue
  – Sends an event to listeners on dequeue.
  – Priority levels.
• Initial code state
  – Free of race conditions
    • All methods are declared to be synchronized.
    • [NB. Deadlocks are still possible.]
• Evolution goal
  – Performance
    • Synchronization is too coarse
    • Remove unneeded synchronization.
    • Introduce multiple locks.
  – Appropriate simultaneous access
Code fragments below illustrate the systematic refinement process.
[ Work in progress by Aaron Greenhouse ]

    class EventQueue {
      public region Listeners;
      public region Normal;
      public region Priority;

      private final unshared List listeners in Listeners { Instance in Instance };
      private final unshared List normal in Normal { Instance in Instance };
      private final unshared List high in Priority { Instance in Instance };
      private int numNormal in Normal;
      private int numHigh in Priority;

      lock this protects Instance;

      public EventQueue()
        reads nothing writes nothing
      { /* ... */ }

      // Continued

      public synchronized void addEQListener( final EQListener l )
        reads nothing writes Listeners
      {
        listeners.add( l );
      }

      private synchronized void fireEQEvent( final Object o )
        reads nothing writes All
      {
        final EQEvent evt = new EQEvent( this, o );
        final List copy = (List)((ArrayList)listeners).clone();
        for( int i = 0; i < copy.size(); i++ ) {
          final EQListener l = (EQListener)copy.get( i );
          l.dequeued( evt );
        }
      }

      public synchronized int getSize()
        reads Normal, Priority writes nothing
      {
        return numNormal + numHigh;
      }

      private synchronized void dispatchEvent()
        reads nothing writes All
      {
        final Object o = dequeue();
        fireEQEvent( o );
      }
      . . .
    } // End of class

Shrink synchronized Blocks
Step 1: Shrink synchronized blocks.
– Convert synchronized methods to methods with synchronized bodies (trivial).
– Use effects analysis to exclude statements that do not affect the region associated with the lock.
  • Method signatures are not changed.
    – Call sites are not affected.
    – Other implementations of the method are not affected.

    class EventQueue {
      //...
      private void fireEQEvent( final Object o )
        reads nothing writes All
      {
        //...
        List copy;
        synchronized( this ) {
          copy = (List)((ArrayList)listeners).clone();
        }
        //...
      }
      //...
      private Object dequeue()
        reads nothing writes Normal, Priority
      {
        Object o = null;
        while( o == null ) {
          if( (o = tryGetPriority()) == null ) {
            o = tryGetNormal();
          }
        }
        return o;
      }

Split the lock
Step 2: Split the lock used by EventQueue.
– In general, replace a lock L on a region R with locks Li on subregions Ri.
– Replace uses of L with uses of appropriate Li.
  • Use effects analysis to determine affected Ri.
  • May need to use multiple locks.
    – Avoid deadlock by enforcing lock ordering
– Changes how fields must be accessed
  • Affects: ancestors and descendant classes.
– Why do this:
  • Improve concurrency
  • E.g., Agenda queue: potential simultaneous actions
    – “Edit” separate queue elements (tasks)
    – Reorder spine

    class EventQueue {
      public region Listeners;
      public region Normal;
      public region Priority;

      private final unshared List listeners in Listeners { Instance in Instance };
      private final unshared List normal in Normal { Instance in Instance };
      private final unshared List high in Priority { Instance in Instance };
      private int numNormal in Normal;
      private int numHigh in Priority;

      lock listeners protects Listeners;
      lock normal protects Normal;
      lock high protects Priority;
      sync high before normal;

      public EventQueue()
        reads nothing writes nothing
      { /* ... */ }

      public void addEQListener( final EQListener l )
        reads nothing writes Listeners
      {
        synchronized( listeners ) {
          listeners.add( l );
        }
      }

      // Continued

      private void fireEQEvent( final Object o )
        reads nothing writes All
      {
        final EQEvent evt = new EQEvent( this, o );
        List copy;
        synchronized( listeners ) {
          copy = (List)((ArrayList)listeners).clone();
        }
        for( int i = 0; i < copy.size(); i++ ) {
          final EQListener l = (EQListener)copy.get( i );
          l.dequeued( evt );
        }
      }

      public int getSize()
        reads Normal, Priority writes nothing
      {
        synchronized( high ) {
          synchronized( normal ) {
            return numNormal + numHigh;
          }
        }
      }

      private Object tryGetPriority()
        reads nothing writes Priority
      {
        Object o = null;
        synchronized( high ) {
          if( numHigh > 0 ) {
            o = high.remove( 0 );
            numHigh -= 1;
          }
        }
        return o;
      }

Case study summary
• The code improvements are routine, but risky
  – Motivated for good reasons …
  – Each entails many small changes …
  – Any change, improperly executed, can create new vulnerabilities
• Much can be done with annotation and manipulation
  – Enabling ongoing assurance with tool support
• For threading
  – Manipulations: Shrink lock, Split/Merge locks, etc.
  – Annotations:
    • Locks and regions, Lock order, Lock variables, Effects, etc.
  – Analyses: Effects, etc.
• Issue: What portion of this activity is “tool feasible”?
  – Interactive tool (manipulation, analysis, annotation)
  – Programmer guidance

Four-A Hypotheses
• In evolving Java systems, semantics-based annotation and analysis techniques can provide a component-based approach to the assurance of a useful range of safety and tolerance properties.
  – Many code-safety properties can be made composable on the basis of added specifications for “mechanical” properties
    • Thread-safety and race conditions
    • Array bounds, exceptions, extended type safety, null references, etc.
  – Annotations and analysis provide a mechanism
    • Effects. Unique references. Uses limitations.
    • Regions for effects, locks.
    • Cf. Extended Static Checking (ESC)
• The safety risks of complex restructuring tasks can be reduced through the use of systematic manipulations
  – Administrative structural changes
    • Boundary movement. Hierarchy restructuring.
    • Representation change.
  – Performance improvements
    • Lock shrink/split. Inlining.
  – Robustness improvements
    • Method harmonization

Four-A Hypotheses
• Manipulations can improve software with respect to safety, tolerance, and robustness properties
  – Examples
    • Introduce redundancies
    • Insert/remove audits, checks, logging
    • Insert techniques for graceful degradation
• The annotation, manipulation, and analysis techniques can be supported in Java-based tools
  – 99% Java
  – Basis for experimentation and evaluation
  – Usable and adoptable
• These techniques can be combined to better support the iterative development of intrusion tolerant systems

[ Preliminary JDK Census results ]

This Presentation
• Technical objectives
  – Code-level assurance
    • In development and adaptation
  – Application to specific assurance properties
    • Code safety and threading.
    • Frameworks.
• Technical approach
  – Semantics-based manipulation
    • Structural. Threads. Etc.
  – Annotation and analysis
    • Uniqueness. Effects. Etc.
  – Tool-based studies
    • Java source-level manipulation
• Existing practice
• Accomplishments & plans
• Premises
• Transition
  – Adaptation: JDK evolution
  – Security: CERT data
  – Schedule
  – Expected accomplishments
  – Tool
  – Infrastructure

Four-A Premises
• Work from code level thru design toward spec
  – Why: Code as ground truth. Snapshot problem.
  – Why: Legacy code. Exploit and improve partial specs.
  – Why: Manage detail design.
• Use partial information about components in a system
  – Why: Trade secret (COTS). Security. Distributed development.
  – Cf. whole-program analysis
• Rely on encapsulation, type safety, composable props
  – Java, (modified) beans, etc.
  – Why: Scalability. Partial information. Manipulation soundness.
• Focus on administrative change in routine SWE
  – Why: Appropriate roles for programmers and tools. Adoptability.
  – Why: Tune for performance, security, robustness

Four-A Technologies (Adaptation, Analysis, Annotation, Accounting)
• Semantics-based program manipulation
  – Source-code and design level
  – Structural manipulations
  – Run-time manipulations
  – Meta-manipulations
• Analysis and models
  – OO effects, mutability, uniqueness, aliasing, uses, . . .
• Annotation and specification
  – Mechanical properties
• Tools for assured adaptation of Java components
  – Information loss and chain of evidence
  – Use of audit data

Systematic Software Adaptation
Routine software structural evolution
– Examples:
  • API change
  • Data representation change
  • Class hierarchy restructuring
  • Signature change
  • Introduce self-adaptation
  • Mobility
  • Encapsulation
  • Split into phases / stages
  • Cloning to produce specialized variants
  • Merging of related functions
  • Replication for robustness
  • Threading changes
Provide tool support for these operations
– With predictable impact on functional and mechanical program properties

Assured Software Change
Structural change in practice
• Costly
  – Changes can be distributed throughout a system.
  – Complex analysis (program understanding) is required.
• Risky
  – Invariants and specifications are not present.
  – Many code elements may need to be changed.
  – Code elements may be inaccessible for analysis or change.
• Avoided
  – Why are we stuck with bad structural design decisions?
    • Decisions are made early
    • Consequences are understood late
    • They often start wrong and stay wrong
  – Why do we tolerate brittleness?
    • Code rot = persistence of abstractions beyond their time.
  – Why do commercial APIs accrete?
  – Why does ad hoc code persist?
  – Why is it so costly to navigate structural trade-offs?
    • Revise interface and component structure
    • Trade off generality and performance

Assured Software Change
Structural change in practice
• Costly
• Risky
• Avoided
• Necessary
  – Structural change enables functional change
    • Localize/encapsulate related software elements
    • Sustain compatibility with evolving APIs
    • Address performance issues
  – Structural change enables code management
    • Code rot = persistence of abstractions beyond their time.
    • Create views to support programming aspects
    • Cf. AOP. SOP. N-Dim.
  – Navigate structural trade-offs during design/evolution
    • Support iterative software processes

Example: Move Field
Move field f from class C to class A.
Programmers can do this using drag-and-drop.
[Diagram: class hierarchy with classes A, B, C, D; field f shown in A and in C (alongside foo and bar)]
Checks
– C is a descendant of A.
– If A is an interface, f must be public static final.
– Shadowing: A and B have no use of ancestral f.
– Unshadowing: No f field in B (capture C’s f uses).
– D (and other sibs) have no uses of f.
– Initializer code can be reordered, by field type.
– Reordering is acceptable for interleaved constructor and field code.
Actions
– Adjust access tags
– Handle special cases
Caveats
– Visibility in D and other sibs
– Visibility in C’s subs
– Promises introduced
– Changes in binary compatibility

Example: Rename Method
Rename methods m from oldName to newName.
Programmers could do this with a simple gesture.
[Diagram: class hierarchy with classes A, B, C, D; methods m labeled oldName() and newName()]
Checks
– Bindings
  • Methods called at a callsite for oldName() or newName() are unchanged
  • Callsites used to dispatch to unchanged methods in override group
– Name conflict
  • Callsites now dispatch to methods in a previously existing override group
– Uses checks and annotations to assure binary compatibility
Actions
– Rename methods
– Rename proved callsites
– Name checks/maps for dynamic sites/classes
Caveats
– Deletion from override and overload groups for A.oldName()
– Addition to override and overload groups for newName()
– Promises introduced
– Changes in binary compatibility (modulo uses annotations)

Manipulations
• Manipulations enable systematic structural change
  – Trade off generality and performance
  – Sacrifice (or introduce) abstractions
  – Reorganize component boundaries
  – Introduce or adjust run-time (later stage) manipulations
    • Managed self-adaptivity
• Manipulations are idiomatic program evolution steps
  – Precise expression of “patterns of evolution” or “refactorings”
  – Enable rapid/dynamic structural change (fluid programming)
  – Enable model-based programming (analytic views)
• Tool role
  – Programmer: Design intent, exploration of structural options
  – Tool: Mechanical details, soundness, design record

Manipulation Techniques (Examples, 1)
• Boundary movement (ISAW’98)
  – Code relocation (expression, statement, method, class)
  – Abstract/unfold (method, variable, class)
  – Clone (class, method, etc.)
• Frequency change
  – Pass separation
  – Tabulation/closure
• Data representation change (ESOP’98)
  – Shift
  – Idempotency, Projection
  – Destructive operations
• Hierarchy restructuring
  – Hoist
  – Insert
  – Split/clone

Manipulation Techniques (Examples, 2)
• Staging, specialization, splitting
  – (Partial evaluation)
  – Merging and generalization
  – Pass separation
• Thread management
  – Shrink, Split, Merge
  – Insert, Remove
• Self-adaptation
  – Meta-manipulation
  – Polyvariance and domain-tolerance
• Integrity
  – Replication
  – Redundant checks

Four-A Technologies (Adaptation, Analysis, Annotation, Accounting)
• Semantics-based program manipulation
  – Source-code and design level
  – Structural manipulations
  – Run-time manipulations
  – Meta-manipulations
• Analysis and models
  – OO effects, mutability, uniqueness, aliasing, uses, . . .
• Annotation and specification
  – Mechanical properties
• Tools for assured adaptation of Java components
  – Information loss and chain of evidence
  – Use of audit data

Specifications for mechanical properties
• Manipulations require analyses
  – Example
    • Manipulation: Reorder code
    • Analyses: Effects, aliasing (may-equal and uniqueness), uses.
• At scale:
  – Development is distributed/collaborative.
  – Functional specifications (and source code) may be lacking.
  – Programs are dynamically linked, mobile, etc.
• Analyses for manipulation
  – Composable: Whole-program analyses are infeasible
  – Goal-directed: Compiler analyses are “opportunistic”
• Analyses require mechanical assertions
  – Annotations (promises) about components and their elements

Properties specified by assertions
• Mechanical properties specified (examples)
  – Read/write effects in OO systems
    • Enable reordering
    • Use aliasing and uniqueness information (ECOOP’99)
    • Region designation
  – Unique references
    • Tolerate temporary loss of uniqueness (borrowed)
  – Structure declarations
    • Precise control over uses
  – Mutability
• Promises as a currency of flexibility (ICSE’98)
  – Promises change less frequently than code
  – Tools identify potential promises
    • Programmer chooses which to offer clients
  – Programmer can request specific promises
  – Tool manages dependency and validation information

Effects Analysis for Manipulation
Manipulation example
Goal: Move statement C;
  Before: A; B; C;
  After:  C; A; B;
1. Compute sets of effects
   For each of: A; B; C;
2. Test for interference among computations:
   For A; , C; and B; , C;
Analyses
1. What are the effects for a given computation?
2. Do two (or more) given targets overlap?

Key Ideas: OO Effects (ECOOP’99)
• Source-level analysis of partial programs
  – Do not want, and may not have, the whole program
  – Use annotations on methods as surrogates for components
• Use of regions and aliases to analyze OO programs
  – Encapsulate state of objects in regions to protect programmer abstractions
  – Use aliasing information (may-equal and unique) to improve results
• Programmer-guided source-level manipulation
  – Goal-directed analysis (vs. compile-time opportunistic analysis)

Code safety: Why Unique Variables?
• Sole access to an object entails certain privileges:
  – Mutations can be performed without regard to rest of program (no other read access)
  – Invariants can be maintained without regard to rest of program (no other write access)
• Program invariants are ideally
  – Explicit (code readability)
  – Checked (code maintainability)

Uniqueness examples
• String buffer character array
  – If unique:
    • Can be coerced to immutable when final string is desired.
• Vector internal array
  – If unique:
    • Mutations of separate vectors can be reordered.
• Hashtable internal array
  – If unique:
    • One can enforce hashing invariants,
    • And can rehash without interference.

Four-A Technologies (Adaptation, Analysis, Annotation, Accounting)
• Semantics-based program manipulation
  – Source-code and design level
  – Structural manipulations
  – Run-time manipulations
  – Meta-manipulations
• Analysis and models
  – OO effects, mutability, uniqueness, aliasing, uses, . . .
• Annotation and specification
  – Mechanical properties
• Tools for assured adaptation of Java components
  – Information loss and chain of evidence
  – Use of audit data

Information Management
The Internal Representation (IR)
[Diagram labels: Attribute name space, Entity name space, Cell]
• Features
  – Global name spaces
    • Entities (fluid.ir.IRNode)
    • Attributes (fluid.ir.SlotInfo)
    • Types
  – Versioning
    • Several policies
    • Possible at cell level
    • Configurations
  – Dependencies
    • Notification
    • Tracking
  – Conventional wrappers
    • Attribute patterns: navigable ordered trees, etc.
  – Collaboration support
    • Persistence
    • Fine-grained concurrency policy
    • Surrogacy

The version forest
[Diagram: version tree. Labels: Initial version; Each transition represents a manipulation; (abandoned); E.g., 400,000 nodes, 10,000 versions; Latest release; A growing tip in the tree; Latest snapshot; Shared manipulations; Experimental configuration]

[ demo ]

This Presentation
• Technical objectives
  – Code-level assurance
    • In development and adaptation
  – Application to specific assurance properties
    • Code safety and threading.
    • Frameworks.
• Technical approach
  – Semantics-based manipulation
    • Structural. Threads. Etc.
  – Annotation and analysis
    • Uniqueness. Effects. Etc.
  – Tool-based studies
    • Java source-level manipulation
• Existing practice
• Accomplishments & plans
• Premises and scope
• Transition
  – Adaptation: JDK evolution
  – Security: CERT data
  – Schedule
  – Expected accomplishments
  – Tool
  – Infrastructure

Four-A Schedule
• Year 1
  – Tool infrastructure
    • 99% Java, analysis, annotation, adaptation, accounting
  – Analysis algorithms (uniqueness, effects, mayEqual, etc.)
  – Demonstrate preservation of assurance properties thru change
  – Manipulations for threading
  – Case studies for thread safety and pattern
• Year 2
  – Class-level structural manipulations
  – Management of uses information
  – Exploitation of aliasing annotations to assure code safety props
  – Threading annotations and analyses
  – Design record to support assurance information
• Year 3
  – Manipulation library for improvement of code safety
    • Prevent. Detect. Tolerate.
  – Large-scale manipulation through analytic views
  – Tool-based case study based on intrusion scenarios

Recent Accomplishments
• Four-A tool prototype
  – Supports non-local manipulations
• Annotations and analyses
  – Unique. MayEqual.
• Support for evolving multi-threaded code safely
  – Manipulations (preliminary form)
  – Annotations
• Software engineering baseline
  – Evolution census: JDK changes: source code, logs

Transition
• Build on mainstream commercial technologies
  – Java, beans, etc.
• Build on existing infrastructure
  – Tool (developed by our team) for Java analysis, manipulation, engineering process, design information management.
  – Platform (UI, IM, VM, syntax) is also usable for other languages.
• Usability/adoptability a priority from the outset
  – Enable experimentation/studies without high adoption cost
  – E.g., gesture-based interface where possible
• Conduct engineering baseline analyses
  – What are the code-level vulnerabilities being exploited?
  – What kinds of changes are routinely made in commercial APIs?
  – What is the impact of those changes on code safety?