Part II: Identity Thesis

The Experimental Logic

Consider a population of self-maintaining patterns in a sufficiently complex CA substrate—or transformer-based agents in a 3D multi-agent environment, initialized with random weights, no pretraining, no human language. Let them learn. Let them interact. Let them develop whatever communication emerges from the pressure to coordinate, compete, and survive.
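
To make the setup concrete, here is a minimal sketch, assuming randomly initialized transformer agents with a discrete signalling head alongside their action head. Every class and parameter name below is illustrative, not a reference implementation.

```python
# Hypothetical setup: transformer agents with random weights, no pretraining,
# no human language, and a discrete signalling channel for emergent communication.
import torch
import torch.nn as nn

class Agent(nn.Module):
    def __init__(self, obs_dim=32, d_model=64, n_signals=16, n_actions=5):
        super().__init__()
        self.embed = nn.Linear(obs_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.core = nn.TransformerEncoder(layer, num_layers=2)
        self.signal_head = nn.Linear(d_model, n_signals)   # emergent channel
        self.action_head = nn.Linear(d_model, n_actions)   # environment actions

    def forward(self, obs_seq):
        # obs_seq: (batch, time, obs_dim) -- recent observations plus any
        # received signals, already encoded into the observation vector.
        h = self.core(self.embed(obs_seq))
        last = h[:, -1]  # summary of the current step
        return self.signal_head(last), self.action_head(last), last

# A population with random weights: whatever protocol appears is learned from
# coordination, competition, and survival pressure, not inherited from human text.
population = [Agent() for _ in range(8)]
```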

The emergent-communication literature establishes the premise: communication protocols arise spontaneously in multi-agent RL environments under sufficient coordination pressure. Not English. Not any human language. Something new. Something uncontaminated.

Now: extract the affect dimensions from their activation space. Valence as viability gradient. Arousal as belief update rate. Integration as partition prediction loss. Effective rank from the eigenvalue distribution. Counterfactual weight as simulation compute fraction. Self-model salience as mutual information between self-representation and action.

These are computable. In a CA, exactly. In a transformer, via the proxies defined above.
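
A minimal sketch of some of those proxies, assuming access to recorded per-step hidden states, discrete actions, and a viability (value) estimate; the array names and the k-means discretization used for the mutual-information proxy are illustrative choices, not fixed definitions.

```python
# Proxies computed from recorded activations. `H` is (T, d) hidden states,
# `a` is (T,) discrete actions, `v` is (T,) a viability/value estimate,
# `H_self` is the (T, d_self) slice taken as the self-representation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

def valence(v):
    # Viability gradient: how fast estimated viability is improving or degrading.
    return float(np.gradient(v).mean())

def arousal(H):
    # Belief update rate: average magnitude of hidden-state change per step.
    return float(np.linalg.norm(np.diff(H, axis=0), axis=1).mean())

def effective_rank(H):
    # exp(entropy) of the normalized eigenvalue spectrum of the activation covariance.
    eig = np.clip(np.linalg.eigvalsh(np.cov(H, rowvar=False)), 1e-12, None)
    p = eig / eig.sum()
    return float(np.exp(-(p * np.log(p)).sum()))

def self_model_salience(H_self, a, k=8, seed=0):
    # Crude MI proxy: discretize the self-representation with k-means and
    # measure mutual information with the (already discrete) actions.
    z = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(H_self)
    return float(mutual_info_score(z, a))

# Integration (partition prediction loss) and counterfactual weight (fraction of
# compute spent on internal rollouts) need ablation runs and compute logging,
# not just recorded activations, so they are omitted from this sketch.
```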

Simultaneously: translate their emergent language into English. Not by teaching them English, but by aligning their signals with VLM interpretations of their situations. If the VLM sees a scene that looks like fear (agent cornered, threat approaching, escape routes closing), and the agent emits signal-pattern σ, then σ maps to fear-language. Build the dictionary from scene-signal pairs, not from instruction.
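
A hedged sketch of the dictionary-building loop; `describe_affect` is a placeholder for whatever VLM labelling step is applied to the scene (not to the agent's internals), and aggregation by most frequent label is just one simple choice.

```python
# Build the signal-to-English dictionary from scene-signal co-occurrence.
from collections import Counter, defaultdict

def build_signal_dictionary(episodes, describe_affect):
    """episodes: iterable of (scene_image, emitted_signal) pairs.
    describe_affect(scene_image) -> a short affect label, e.g. "fear"."""
    counts = defaultdict(Counter)
    for scene, signal in episodes:
        # The VLM only ever sees the scene; the agent's internals never enter.
        counts[signal][describe_affect(scene)] += 1
    # Map each emergent signal to its most frequent scene-level affect label.
    return {sig: c.most_common(1)[0][0] for sig, c in counts.items()}
```

Nothing about the agent's internal states enters the mapping; the English label is induced purely from scene-signal co-occurrence.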

The translation is uncontaminated because:

  1. The agent never learned human concepts
  2. The mapping is induced by environmental correspondence
  3. The VLM interprets the scene, not the agent’s internal states
  4. The agent’s "thoughts" remain in their original emergent form