Structure and Dynamics in Complex Interactive Networks

Dynamics of Learning & Distributed Adaptation
PI: James P. Crutchfield, Santa Fe Institute
Second PI Meeting, 17-19 April 2001, SFe

Dynamics of Learning:
Single-agent learning theory

Emergence of Distributed Adaptation:
Agent-collective learning theory

Strategies:
Simulation: learning dynamics, collective behavior
Theory: basic constraints, quantitative predictions
REF:
Control and Adaptation in Heterogeneous,
Dynamic Environments

Traffic dynamics
 Food delivery to major cities
 Electrical power grid
 Internet packet dynamics
 Market economies
 Dynamic task allocation by ant colonies
Questions
 How do large-scale systems maintain coordination?
 How does one design such large-scale systems?
REF:
Control and Adaptation in Heterogeneous,
Dynamic Environments
Common features
 Distributed systems with many subsystems
 Adaptive response to internal/external change
 No global control, but still perform function
 Local intelligence:
– Controller
• Sensor
• Internal model
• Actuators
Common vocabulary
– Agents
– Environment = Other agents + Exogenous Influences
What is an Intelligent Agent?
The Learning Channel

TLC: Adaptation of Communication Channel

What are fundamental constraints on learning?
– How to measure environmental structure?
– How to measure “cognitive” capacity of learning agents?
– How much data for a given complexity of inferred model?
Computational Mechanics:
Preliminaries
www.santafe.edu/projects/CompMech
Observations: a bi-infinite sequence split into past and future halves:  s↔ = s← s→
Past | Future:  … s_{-L} s_{-L+1} … s_{-1} s_0 | s_1 … s_{L-1} s_L …
Probabilities: Pr(s↔), Pr(s←), Pr(s→)
 Uncertainty: Entropy
H[P] = -i pi log pi [bits]
 Prediction error: Entropy Rate

h = H[Pr(si|si-1si-2si-3…)]

Information transmitted to the future: Excess Entropy
E = I[s←; s→] = D[ Pr(s↔) ‖ Pr(s←) Pr(s→) ]
Measure of independence: Is Pr(s↔) = Pr(s←) Pr(s→)?
Describes information in “raw” sequence blocks
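As a concrete illustration of these block-based quantities (my own sketch, not part of the original slides), the snippet below estimates the block entropy H(L) from empirical length-L block probabilities and, from it, rough finite-L estimates of the entropy rate hμ and excess entropy E. The function names and parameter choices are assumptions made for illustration; the sample source is the Golden Mean Process ("no consecutive 0s") that appears as an example later in the deck.

```python
# Minimal sketch: estimating h_mu and E from raw sequence blocks.
# Assumes a long, stationary binary sample; blocks are counted with overlap.
from collections import Counter
from math import log2
import random

def block_entropy(seq, L):
    """H(L) = -sum_w Pr(w) log2 Pr(w), over length-L blocks w of seq."""
    if L == 0:
        return 0.0
    counts = Counter(tuple(seq[i:i + L]) for i in range(len(seq) - L + 1))
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Example source: the Golden Mean Process ("no consecutive 0s").
random.seed(0)
seq, prev = [], 1
for _ in range(200_000):
    prev = 1 if prev == 0 else random.randint(0, 1)  # a 0 forces the next symbol to be 1
    seq.append(prev)

L = 12
h_est = block_entropy(seq, L) - block_entropy(seq, L - 1)  # h_mu ~ H(L) - H(L-1)
E_est = block_entropy(seq, L) - L * h_est                  # E   ~ H(L) - L * h_mu
print(f"h_mu ~ {h_est:.3f} bits/symbol, E ~ {E_est:.3f} bits")
```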
Computational Mechanics:
Mathematical Foundations

Causal state = Condition of knowledge about the future
 ε-Machines = {Causal states, Transitions}
 Optimality Theorem:
ε-Machines are optimal predictors of the environment.

Minimality Theorem:
Of the optimal predictors, ε-Machines are smallest.

Uniqueness Theorem:
Up to isomorphism, an ε-Machine is unique.

The Point:
Discovering an ε-Machine is the goal of any learning process.
Practicalities may preclude reaching it, but it remains the goal.
(w/ DP Feldman/CR Shalizi)
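To make this concrete, here is a minimal sketch (my own illustration, not the project's code) of one way an ε-Machine could be represented: a set of causal states with probabilistic, symbol-labeled transitions, plus a generator that follows those transitions. The two-state machine used is the Golden Mean Process that appears as an example later; all names are hypothetical.

```python
# Sketch: an epsilon-machine as {state: {symbol: (probability, next_state)}}.
# Example: the Golden Mean Process ("no consecutive 0s") has two causal states:
#   A = last symbol was 1 (or start), B = last symbol was 0 (next symbol forced to 1).
import random

golden_mean = {
    "A": {1: (0.5, "A"), 0: (0.5, "B")},
    "B": {1: (1.0, "A")},
}

def generate(machine, n, state="A", seed=0):
    """Emit n symbols by following the machine's labeled transitions."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        symbols = list(machine[state])
        probs = [machine[state][sym][0] for sym in symbols]
        sym = rng.choices(symbols, weights=probs)[0]
        out.append(sym)
        state = machine[state][sym][1]
    return out

print(generate(golden_mean, 30))   # no two consecutive 0s appear
```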
Computational Mechanics:
Why Model?

Structural Complexity of Information Source
C = H[Pr(S)], S = {Casual states}
 Uses:
– Environmental complexity: Amount/kind of relevant structure
– Agent’s inferential capacity: Sophistication of models?

Theorem: E ≤ Cμ
Conclusion: Build models rather than storing only E bits of history.
– Raw sequence blocks do not allow optimal prediction;
blocks carry only E bits of mutual information.
– Optimal prediction requires a larger model: size 2^Cμ, not 2^E.
– Explicit: 1D range-R Ising spin system: Cμ = E + R·hμ.
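As an illustration of Cμ = H[Pr(S)] (again a hedged sketch with my own naming, not the original analysis code), one can compute the stationary distribution over the causal states of a small machine, here the two-state Golden Mean machine sketched above, and take its Shannon entropy; the block-based estimate of E from earlier should come out no larger, consistent with E ≤ Cμ.

```python
# Sketch: structural complexity C_mu = H[Pr(S)] over causal states S.
# Example machine: Golden Mean, A -> A (p = 1/2), A -> B (p = 1/2), B -> A (p = 1).
from math import log2
import numpy as np

# State-to-state transition matrix T[i][j] = Pr(next state j | current state i),
# obtained by summing the symbol-labeled transition probabilities.
T = np.array([[0.5, 0.5],
              [1.0, 0.0]])

# Stationary distribution Pr(S): left fixed point of T, found by power iteration.
pi = np.array([0.5, 0.5])
for _ in range(1000):
    pi = pi @ T

C_mu = -sum(p * log2(p) for p in pi if p > 0)
print(f"Pr(S) ~ {pi}, C_mu ~ {C_mu:.3f} bits")   # ~ binary entropy H(2/3) ~ 0.918 bits
```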
Synchronizing to the
Environment—
Constraints on Agent Learning
How does an agent come to know the environment?
 An agent is synchronized to the environment when
it knows the (hidden) state of the environment

Here, an information-theoretic answer
 Focus on Entropy Growth: H(L) = H[Pr(s^L)]
 Take derivatives and integrals of H(L)
 Recover in one framework all existing quantities
h, E, and G

Introduce a new quantity: Transient Information T
Entropy Growth H(L)

Entropy Convergence
h(L) = DH(L)

Predictability Gain
Δ²H(L) = ΔH(L) − ΔH(L−1)
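The finite-L quantities above can be read off an estimated H(L) curve by taking differences and sums. The sketch below is my own code (building on the hypothetical block_entropy estimator introduced earlier) and assumes the transient information is the summed convergence gap T = Σ_L [E + hμ·L − H(L)], as in Crutchfield and Feldman's entropy-convergence framework.

```python
# Sketch: finite-L "derivatives" and sums of an entropy-growth curve H(L).
# H is a list with H[L] = block entropy of length-L blocks (e.g. from block_entropy above).

def entropy_convergence(H):
    """h_mu(L) = ΔH(L) = H(L) - H(L-1), for L = 1 .. len(H)-1."""
    return [H[L] - H[L - 1] for L in range(1, len(H))]

def predictability_gain(H):
    """Δ²H(L) = ΔH(L) - ΔH(L-1), for L = 2 .. len(H)-1."""
    dH = entropy_convergence(H)
    return [dH[i] - dH[i - 1] for i in range(1, len(dH))]

def asymptotics(H):
    """Rough estimates of h_mu, E, and T from a finite H(L) curve.

    h_mu ~ ΔH at the largest L available; E ~ H(L) - L * h_mu;
    T ~ sum over L of [E + h_mu * L - H(L)]  (summed convergence gap).
    """
    Lmax = len(H) - 1
    h_mu = H[Lmax] - H[Lmax - 1]
    E = H[Lmax] - Lmax * h_mu
    T = sum(E + h_mu * L - H[L] for L in range(Lmax + 1))
    return h_mu, E, T

# Example usage with an H(L) list from the earlier estimator:
#   H = [block_entropy(seq, L) for L in range(13)]
#   print(asymptotics(H))
```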

Example:
All Period-5 Processes

Three unique templates
– 10000
– 10101
– 11000
Example:
Golden Mean Process

“No consecutive 0s”
Example:
Even Process

“1s occur in blocks of even length”
Example:
RRXOR Process

...S1S2S3S1S2S3...
– S1 random
– S2 random
– S3 = XOR(S1,S2)
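A generator for this process is straightforward; the sketch below (my own illustration, with hypothetical names) emits the … S1 S2 S3 … stream with S1 and S2 independent fair bits and S3 their XOR.

```python
# Sketch: the RRXOR process: emit triples (S1, S2, S3) with
# S1, S2 independent fair bits and S3 = S1 XOR S2.
import random

def rrxor(n_triples, seed=0):
    rng = random.Random(seed)
    out = []
    for _ in range(n_triples):
        s1, s2 = rng.randint(0, 1), rng.randint(0, 1)
        out.extend([s1, s2, s1 ^ s2])
    return out

print(rrxor(5))   # 15 symbols; every third symbol is the XOR of the two before it
```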
Example:
Nondeterministic Process
 A Hidden Markov Model
Example:
Morse-Thue Process

Production rules:
– 1 → 10
– 0 → 01

Infinite Memory
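A sketch of this substitution system (my own code, assuming the standard Thue-Morse rules 1 → 10 and 0 → 01 shown above) grows the sequence by rewriting every symbol in parallel at each generation; the length doubles each time, one way to see the process's unbounded memory.

```python
# Sketch: generate the Morse-Thue sequence by repeated parallel substitution,
# using the production rules 1 -> 10 and 0 -> 01.
RULES = {"1": "10", "0": "01"}

def morse_thue(generations, seed="1"):
    s = seed
    for _ in range(generations):
        s = "".join(RULES[c] for c in s)   # rewrite every symbol in parallel
    return s

print(morse_thue(5))   # 2**5 = 32 symbols
```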
Regularities Unseen,
Randomness Observed

Consequence: Ignore Structure ⇒ More Unpredictable
Regularities Unseen,
Randomness Observed

Consequence:
Assume Instant Synchronization ⇒ More Predictable (False)
Regularities Unseen,
Randomness Observed

Consequence: Assume Synchronization ⇒ Less Memory
Regularities Unseen,
Randomness Observed
Conclusions
 Quantities key to synchronization and agent modeling:
hμ, E, T, and G

Relationships between them via a single framework
 Derived consequences of ignoring them
 Can now distinguish kinds of synchronization
 Improved model building and control system design