Technological Networks: From Empirical Laws to Theory Stephanie Forrest Dept. of Computer Science University of New Mexico http://cs.unm.edu/~forrest [email protected] UNM CS Dept. Profile • 18 Faculty: – 9 tenured (5 full, 4 assoc), 5 assistant,1 lecturer – 3 openings (2 junior, 1 senior) – Prince of Asturias Endowed Chair in Information Technology – External faculty appointments from other departments and national labs. • Close collaborations with SFI and national labs (SNL and LANL) • Students: – Degrees: BS: ~40, MS: ~35, PhD: ~5 – Undergraduate: ~200 majors; BS degrees • >20% Female; >35% Minorities – Graduate: ~120 MS ~80 Ph. D. • >20% Female; 40-60% Foreign • Funding: (2004-2005) – Total: $3.5M – NSF, DARPA, DOE, NIH, Sandia and Los Alamos UNM CS Dept. Research • Strongly Interdisciplinary: – Adaptive computation – New paradigms of computing (molecular and quantum computation) – Computational biology and bioinformatics (phylogenetic tree reconstruction, radiology, RNAi) • • Graphics and visualization High-performance computing: – Light-weight distributed operating systems • Automated reasoning and machine learning: – Otter – POMDP • Complex networks: – Provably robust scalable algorithms for P2P networks – Phase transitions in NP-complete problems Themes of Talk • The real world isn't exactly scale free. • Understanding and predicting network structure is important for engineering: – Network properties can be exploited to enhance computer security (computer epidemiology / border gateway protocol) • We lack theory to explain/predict the structure of technological networks: – Preferential attachment isn’t good enough. • Initial steps toward theory. Technological Networks • Distribute resources: – Energy – Materials – Information • Energy distribution: – Power grids – Gas pipelines • Transportation: – Highways – Airline routes • The Internet: – – – – Physical connectivity Autonomous systems (AS) World-wide web Social contacts, e.g., email • Microarchitecture Network Structure • Network topology affects network properties: – Shortest distance between two nodes – Bisection width – Rate and extent of contagion • Analysis: – Epidemiological models. – The epidemic threshold. • Degree distribution of network: – Scale-free (power law) networks: – Pk = k-c • Controlling infections on scale-free networks: – Random vaccination is ineffective (e.g., anti-virus software). – Targeted vaccination of highconnectivity nodes. QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture. Example 1: Computer Epidemiology Justin Balthrop, Mark Newman, Matt Williamson • Viruses and worms spread over networks of contacts between computers: – Email address books. – URL links. • Different types of networks are exploited by different types of infections. UNM CS Dept. Network of Address Books Degree Distributions of Four Networks Relevant to Computer Security 300 IP network Adminstrator network 250 Email traffic Address books 10000 200 1000 150 100 100 10 50 0 1 0 100 200 300 400 1 Degree k 10 100 1000 Degree k Science 304:527-529 (2004) – Not scale-free. – Targeted vaccination unlikely to be effective: • • > 10% of nodes required for address book data > 87% of nodes required for email traffic data – Computer infections can choose their own topology, so network topology is not static. – Viruses spread faster than repairs. Throttling: Generic Control of Epidemics Matt Williamson and Justin Balthrop • Control network topology in time rather than space. – – • Assumes that virus traffic is significantly different from normal traffic: – – • Effective when the form of infection not known in advance. Reduces amount of traffic generated during an epidemic. Limitations: – • Nimbda infects up to 400 new machines per second compared to normal rate of connections to new Web servers of about one connection per second or slower (Williamson, 2002). Throttling Nimbda would have increased the epidemic time from one day to over one year. Advantages: – – • Limit the rate at which a computer can make new connections Limits spreading rates rather than stopping. Effectiveness decreased if not all nodes are throttled (altruism), stealth attacks. Implementations: – – RIOT HP’s Virus Throttle Responsive I/O Throttle (RIOT) Justin Balthrop and Matt Williamson • Vision: An adaptive desktop firewall: – – – – • Personal desktop firewalls: – – • Hamper worms, viruses, DOS, misconfigurations, etc. Graduated automated response (throttling) on all connections. Adaptive (learn normal). Robust to false positives Coarse-grained Static and preprogrammed • How to detect anomalous connections? – Generic rate limit (throttle): – – Delay network connections that occur at an anomalous rate. Detector activation triggers delay and packets are dropped. IP Address 1 32 bits IP Address 2 32 bits Set of detectors (lymphocytes) that observe TCP connections: • • – – TCP Port 1 16 bits Data Path (below) Meta-information (direction, TCP flags) Each detector matches some portion of IP space. Each detector has its own normal activity level. TCP Port 2 16 bits Learning and Throttling Connections 1. A new connection is initiated (SYN Packet) 2. The connection information is translated to a bitstring 3. The bitstring is shown to all detectors 4. An immature detector exceeds its activation threshold 5. The detector’s activation threshold is raised 6. Detectors with stable activation thresholds become mature Autonomic Responses: A Repertoire LISYS (Steven Hofmeyr) E-mail system administrator when there is an anomalous network connection. Personal Desktop Firewall Block network connections based on user-specified policy. pH Slow down system calls for anomalous processes. (Anil Somayaji) Throttling (Matt Williamson) RIOT (Justin Balthrop) PGBGP (Josh Karlin) Limit the overall rate of new network connections for a computer. Limit rate of all anomalously behaving TCP connections. Adaptive. Limit the rate at which new BGP routes are adopted. Example 2: Inter-Domain Routing Josh Karlin, Stephanie Forrest, and Jennifer Rexford • • ~20,000 Autonomous Systems (ASs) connected via the Border Gateway Protocol (BGP) ASs route blocks of IP’s, know as prefixes AT&T – 64.106.0.0/17 (Owned by UNM) • • ~170,000 prefixes owned by ASs today. Border gateway protocol (BGP): – Tell neighbors about new routes (Announcements). – Tell neighbors about old routes gone bad (Withdrawals). – That’s it. Time Warner Telecom Comcast SWCP UNM BGP Networks are Interesting and Important • Distributed: – Nodes are AUTONOMOUS systems. – No centralized routing information (routes stored and maintained locally). – No authentication of new nodes or routes. • Dynamic: – Network connectivity changes routinely and continually. – Network updates are spread through local contact (BGP). • Confluence of technological and economic constraints: – A “policy” network as well as a routing network. • All inter-domain Internet traffic relies upon BGP. • Vulnerable: – Trivial to inject false routing information into network. – Man-in-the-Middle attacks. • Pretty Good BGP (PGBGP) and Internet Alert Registry (IAR). – Throttle the adoption of new routes. PGBGP Algorithm • Main Idea: Delay Suspicious Routes – Lower the preference of suspicious routes (24hr) • Detection: – Monitor BGP update messages – Treat origin ASs for a prefix seen within the past few days as normal – Treat new origin ASs as suspicious for 24 hours, then accept as normal (possible prefix hijack) – Treat new sub-prefixes as suspicious for 24 hours, then accept as normal (possible sub-prefix hijack) • Response: – Suspicious origin AS routes are temporarily given low local preference – Suspicious sub-prefixes are temporarily ignored (not forwarded to) PGBGP Advantages • Incremental deployability – No change to BGP protocol, just to path selection – Immediate benefits to adopting AS and customers • Automated and immediate response – Avoid using and propagating the bogus route – Network has chance to stop the attack before it spreads • Robust to false positives – Lowering preference for suspicious routes – No loss of reachability – Accidental short-term delays do no harm • Offline investigation of suspicious route – Internet Alert Registry, active probing, … • Adaptive, simple Incremental Deployment Prefix Hijack Subprefix Hijack ICNP (2006) • Limitations: – Doesn’t address path spoofing, redistribution attacks. – Negligence BGP Network Structure Petter Holme and Josh Karlin • Barabasi-Albert (BA) model (Barabasi and Albert, 1999): – – • Vertices and edges added iteratively Probability of attaching to vertex i is proportional to k(i) Inet model (Winick and Jamin, 2000) – – – No simple growth principle Generate random graph with known degree distribution Augment to mimic additional known correlations (e.g., connecting all high-degree nodes) Proc. Royal Acad. A (in press) Tentative Conclusions • Real AS graph is more heterogeneous than can be expected from degree distribution alone: – – – – Core providers in the low-d tail Peak at d=3 (vertices directly connected to the core) Second peak at d=4 (vertices directly connected to d=3 nodes) More structure in periphery than predicted by earlier models. • Preferential attachment is a poor model for network growth. • What constraints determine network architecture and growth? Allometric Scaling in Biology Y Y0 X 3 / 4 ln (metabolic rate *eE/kT) 30 Endotherms Reptiles Fish Amphibians Invertebrates Unicells Plants 10 y = 0.71x + C -10 -30 0 30 ln (body mass) • Dominant design constraint: – Distribute resources to every cell in the organism • • • Internal, space-filling, hierarchical networks (vascular system) Invariant terminal units (capillaries) Optimality (minimize transport time, maximize metabolism) Examples: Scaling in Software H. Inoue HelloWorld: Unique Function Calls vs. Invocation Freq LimeWire Behavior Example: Scaling in Software Dave Ackley and Terry Van Belle • How to measure evolvability? – Likelihood (how likely is a location to change). – Impact (change in one location affects other locations). • • Work = [Likelihood x Impact]: – Software maintenance costs. – Expected time to evolve. Study long-lived Java code bases: – Code change sizes and frequencies are power law ish. Van Belle, 2004 A Theory of Network Scaling: Conclusions • Why it’s important – Networking infrastructure continues to expand by orders of magnitude: • NSF GENI project: Redesign the Internet from the ground up. • Interplanetary networking. – Architecture: • Move from performance-oriented to power-aware designs. • The end of silicon. – Scaling problems in software, security problems everywhere • Why it’s hard – Terminal units aren’t necessarily invariant sized – Dimensionality and geometry are not obvious – May require new mathematics
© Copyright 2026 Paperzz