
Encephagen: Research Development Report — Day 3

Date: April 3-4, 2026
Author: edvatar (toroleapinc)
Repo: https://github.com/toroleapinc/encephagen
AI-assisted: Code development assisted by Claude (Anthropic). All experimental design, analysis, and interpretation by the author. All code reviewed, validated, and tested.


1. Summary of Day 3

Day 3 answered the central question: does human brain structure help learning?

  1. Deep research into C. elegans (OpenWorm) and Drosophila (FlyWire/Eon) brain simulations
  2. Implemented e-prop learning rule (Bellec et al. 2020) — replacing the crude Hebbian outer product
  3. Experiment 22: Connectome vs random with e-prop — structure helps conditioning (p=0.011)
  4. Experiment 23: Analysis of WHY — discovered channeling vs distributing trade-off

Key finding: The human connectome is a channeling architecture. It concentrates signals through specific pathways (VIS→AMYG, VIS→PFC), making survival-critical computations more efficient. Random wiring distributes signals more uniformly. Neither is universally better — each has advantages for different tasks.


2. Research Phase: Learning from OpenWorm and FlyWire

Before building, we conducted in-depth research into the two most successful connectome simulation projects.

OpenWorm (C. elegans, 302 neurons)

  • Timeline: 2011-present (13+ years)
  • Data: 302 neurons, 7,000 synapses (electron microscopy)
  • Key finding: Raw connectome weights do NOT produce locomotion. Required proprioceptive feedback loops and parameter tuning.
  • What failed: Direct weight transcription, uniform neuron models, spiking-only approaches
  • What worked: Muscle-driven body simulation (Sibernetic), proprioceptive feedback chain, careful E/I balance tuning

FlyWire/Eon (Drosophila, 139K neurons)

  • Timeline: 2014-2024
  • Data: 139,000 neurons, 54M synapses at 8nm resolution (EM)
  • Key finding: Visual motion detection emerged from connectivity alone (LIF model, one free parameter: global gain)
  • Embodiment: Pre-trained walking controllers, brain provides high-level commands via descending neurons
  • Critical gap: No learning — all weights fixed from connectome

Implications for Encephagen

  • Our data is 6 orders of magnitude coarser (96 macro-regions vs 139K identified neurons)
  • We're working with a "roadmap" not a "circuit diagram"
  • Both projects confirm: topology constrains but does not determine behavior
  • Neither project attempted learning — we are the first to test whether macro-scale topology provides a learning advantage

3. E-prop Implementation

What was wrong with the old learning rule

Experiments 15-21 used a "three-factor STDP" rule that was in fact just a Hebbian outer product:

```python
# OLD (Exp 21): Not real STDP — no spike timing, no eligibility traces
cs_active = (cs_activity > 2).float()
us_active = (us_activity > 2).float()
dW = learning_rate * torch.outer(cs_active, us_active)
```

Expert reviewers correctly flagged this: "This is Hebbian weight injection, not STDP. No temporal credit assignment."

E-prop: The proper learning rule

Implemented Bellec et al. (2020) "A solution to the learning dilemma for recurrent networks of spiking neurons" (Nature Communications):

Three components:

  1. Eligibility trace per synapse: e_ji = ψ_j × z̄_i where ψ is the surrogate gradient at the postsynaptic neuron and z̄ is the filtered presynaptic spike train
  2. Surrogate gradient: Piecewise linear approximation of the Heaviside derivative: ψ = γ × max(0, 1 - |v - v_thr| / v_thr)
  3. Reward-modulated update: ΔW = lr × reward_signal × e_snapshot

Key design: eligibility snapshot. During CS presentation, eligibility traces accumulate — tracking which synapses causally contributed to the response. At CS offset, we snapshot these traces. When reward arrives (US phase), it modulates the SNAPSHOT, not current eligibility. This implements temporal credit assignment: the reward "reaches back" to strengthen synapses that were active during the stimulus.
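The three components plus the snapshot rule fit in a few lines. The sketch below is a toy reconstruction from the formulas above, not the code in eprop.py: the tau_e eligibility filter is omitted, and the sizes, inputs, and variable names are illustrative.

```python
import torch

# Toy e-prop sketch (Bellec et al. 2020) following the three components above.
torch.manual_seed(0)
n_pre, n_post = 4, 3
alpha = 0.995        # spike-filter coefficient for dt=0.1ms, tau_m=20ms
gamma = 0.5          # surrogate-gradient dampening
lr = 0.1
v_thr = 1.0

z_bar = torch.zeros(n_pre)        # filtered presynaptic spike train z̄
e = torch.zeros(n_post, n_pre)    # eligibility trace per synapse

def step(z_pre, v_post):
    """Update traces for one step, at pre-spike voltage (before reset)."""
    global z_bar, e
    # Normalized low-pass filter keeps z_bar bounded in [0, 1]
    z_bar = alpha * z_bar + (1 - alpha) * z_pre
    # Surrogate gradient: psi = gamma * max(0, 1 - |v - v_thr| / v_thr)
    psi = gamma * torch.clamp(1 - torch.abs(v_post - v_thr) / v_thr, min=0)
    # Eligibility per synapse: e_ji = psi_j * z_bar_i
    e = psi.unsqueeze(1) * z_bar.unsqueeze(0)

# CS phase: traces accumulate while the stimulus is on
for _ in range(100):
    step(torch.randint(0, 2, (n_pre,)).float(), torch.rand(n_post))
e_snapshot = e.clone()            # snapshot at CS offset

# US phase: reward modulates the snapshot, not the current eligibility
reward_signal = 1.0
dW = lr * reward_signal * e_snapshot
```

In the real implementation the reward would additionally be baselined (reward minus a running mean) and the update applied to the brain's sparse weight matrix.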

Debugging journey

Problem 1: Exploding eligibility traces

  • First run: e_max = 1,367, e_mean = 121
  • Trial 1 modified ALL 1.1M synapses, destroyed the network
  • Root cause: z_bar = alpha * z_bar + z accumulates without bound when alpha ≈ 0.995 (dt=0.1ms, tau_m=20ms)
  • Fix: Normalize with (1-alpha) factor: z_bar = alpha * z_bar + (1-alpha) * z
  • After fix: e_max = 0.018, e_mean = 0.0017 ✓
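The bug reproduces in isolation. A minimal sketch with toy values (not the repo's code): with alpha = 0.995, the unnormalized filter converges to 1/(1-alpha) = 200 under sustained spiking, while the normalized form converges to the mean spike rate.

```python
# Exploding-trace bug in isolation: one neuron spiking every step (z = 1).
alpha = 0.995
z = 1.0

z_bar_raw, z_bar_norm = 0.0, 0.0
for _ in range(5000):
    z_bar_raw = alpha * z_bar_raw + z                  # unbounded: -> 1/(1-alpha) = 200
    z_bar_norm = alpha * z_bar_norm + (1 - alpha) * z  # bounded:   -> mean rate = 1.0
```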

Problem 2: Learning signal collapse

  • After 5 trials, weight changes dropped to zero
  • Root cause: reward baseline tracked reward too quickly (decay=0.95), so reward - baseline ≈ 0
  • Fix: Slower baseline (decay=0.99) + apply reward ONCE per trial using snapshot
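Both halves of the fix show up in a toy calculation (illustrative numbers, not the experiment's): a fast baseline updated at every step eats the reward signal within a few trials, while a slow baseline updated once per trial preserves it.

```python
# Bug: baseline with decay 0.95, updated at every step of the US phase
# (here 5 trials x 100 steps, constant reward 1.0).
baseline = 0.0
for _ in range(5 * 100):
    baseline = 0.95 * baseline + 0.05 * 1.0
fast_advantage = 1.0 - baseline   # effectively zero: learning signal collapsed

# Fix: decay 0.99, baseline updated once per trial.
baseline = 0.0
for _ in range(5):
    baseline = 0.99 * baseline + 0.01 * 1.0
slow_advantage = 1.0 - baseline   # ~0.95: signal survives
```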

Problem 3: Reward timing

  • Initially applied reward at each step during US phase
  • But by then, eligibility traces had been overwritten by US-phase activity
  • Fix: Snapshot eligibility at CS offset, apply reward to snapshot when US arrives

Final working parameters:

  • lr=0.1, tau_e=50ms, gamma=0.5, reward_decay=0.99
  • ~1.1M synapses modified per trial, dW_max ≈ 0.0013 per trial
  • Post-training CS response: +28.5% increase

4. Experiments Conducted

Experiment 22: Does brain structure help LEARNING?

Protocol: Same as Experiment 21, but with e-prop instead of Hebbian.

  • 10 connectome brains vs 10 degree-preserving random brains
  • Phase 1: Innate regional differentiation
  • Phase 2: Classical conditioning (30 CS-US pairings with e-prop)
  • Phase 3: Pattern discrimination (5 classes, after e-prop training)
  • Phase 4: Working memory (PFC persistence)

Results:

| Metric | Connectome | Random | p-value | Winner |
|---|---|---|---|---|
| Regional differentiation (CV) | 0.745 | 0.692 | 0.0002* | Connectome |
| Conditioning speed (slope) | -0.000 | -0.000 | 0.473 | |
| Conditioning strength (final) | 0.00018 | 0.00008 | 0.011* | Connectome |
| Pattern discrimination | 38% | 45% | 0.052 | |
| Working memory | 129% | 131% | 0.427 | |

Key finding: E-prop revealed a conditioning advantage (p=0.011, d=1.5) that Hebbian learning couldn't detect. The connectome's VIS→AMYG pathway enables more efficient association learning. This was invisible with the crude Hebbian rule in Exp 21.
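For readers reproducing the statistics: one standard definition of the effect sizes quoted here (e.g. d = 1.5) is Cohen's d with a pooled standard deviation. The sketch below uses toy per-brain values, not the experiment's data, and the exact formula used in the repo is an assumption.

```python
import statistics

# Cohen's d with pooled standard deviation (toy values, not the real data).
def cohens_d(a, b):
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5

connectome = [0.00019, 0.00017, 0.00018, 0.00020, 0.00016]  # final CS responses
random_net = [0.00009, 0.00007, 0.00008, 0.00010, 0.00006]
d = cohens_d(connectome, random_net)
```

The p-values would come from a two-sample t-test, e.g. scipy.stats.ttest_ind(connectome, random_net, equal_var=False) for Welch's variant.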

Comparison with Experiment 21:

  • Exp 21 (Hebbian): 1/5 significant (only regional differentiation)
  • Exp 22 (E-prop): 2/5 significant (differentiation + conditioning strength)
  • The learning rule matters — a better learning algorithm extracts more from the topology

Experiment 23: Why does random wiring trend better at discrimination?

Exp 22 showed random brains trending better at pattern discrimination (p=0.052). Investigated why.

Metrics measured (8 connectome + 8 random brains):

| Metric | Connectome | Random | p-value | Winner |
|---|---|---|---|---|
| Mean cosine distance | 0.0004 | 0.0004 | 0.083 | |
| Trial consistency | 0.9995 | 0.9994 | 0.003 | Connectome |
| Fisher discriminability | 0.355 | 0.345 | 0.382 | |
| Effective dimensionality | 1.75 | 1.74 | 0.879 | |
| Activation entropy | 4.306 | 4.332 | 0.0002 | Random |

Key finding: The discrimination difference was likely noise. Response diversity (cosine distance, Fisher ratio, dimensionality) is NOT significantly different. What IS different:

  • Connectome: higher trial consistency (p=0.003, d=2.2) — more reliable signal
  • Random: higher activation entropy (p=0.0002, d=-9.0) — more evenly distributed activity

Interpretation: The connectome concentrates activity in specific regions through structured pathways. Random wiring distributes it more uniformly. Neither strategy is better for discrimination — the differences cancel out.
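Assuming activation entropy here means the Shannon entropy of the normalized per-region activity distribution (an assumption; the repo's exact metric may differ), the channeling/distributing contrast is easy to see:

```python
import math

def activation_entropy(rates):
    """Shannon entropy (bits) of the normalized activity distribution."""
    total = sum(rates)
    probs = [r / total for r in rates if r > 0]
    return -sum(p * math.log2(p) for p in probs)

concentrated = [50, 2, 2, 2, 2, 2, 2, 2]  # channeling: a few hot regions
uniform = [8] * 8                          # distributing: even activity
# Uniform activity maximizes entropy (log2(8) = 3 bits); concentration lowers it.
```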


5. The Trade-off: Channeling vs Distributing

Experiments 21-23 together reveal a fundamental trade-off:

| Property | Connectome | Random |
|---|---|---|
| Regional organization | High (p=0.0002) | Low |
| Conditioning efficiency | High (p=0.011) | Low |
| Trial-to-trial reliability | High (p=0.003) | Low |
| Activity distribution | Concentrated | Uniform (p=0.0002) |
| Pattern discrimination | ~38% | ~45% (ns) |
| Working memory | ~129% | ~131% (ns) |

The connectome is an optimization, not a universal advantage. It channels signals through specific pathways that evolution selected for survival-critical computations (fear conditioning via VIS→AMYG). Random wiring distributes signals uniformly — no better, no worse for general tasks, but missing the specialized routing.

This parallels the C. elegans/Drosophila lesson: structure constrains but does not determine. The macro-scale connectome is a roadmap that makes certain computations more efficient, not a blueprint that makes all computations possible.


6. New Code

src/encephagen/learning/eprop.py

  • EpropLearner: GPU-native e-prop with eligibility traces (~1.3M traces, ~5MB VRAM)
  • EpropParams: Learning rate, surrogate gradient dampening, eligibility filter tau, reward decay
  • snapshot_eligibility(): Capture CS-phase eligibility for delayed reward
  • apply_reward(): Reward-modulated weight update using snapshot
  • apply_supervised(): Supervised variant with random feedback alignment (not yet used)

Modified: src/encephagen/gpu/spiking_brain_gpu.py

  • enable_learning(params): Initialize e-prop learner on the brain's sparse weights
  • apply_reward(spikes, reward): Apply reward and rebuild sparse weight matrix
  • Eligibility trace update happens inside step() at pre-spike voltage (before reset)

7. Negative Results and Honest Assessment

What didn't work as hoped

  1. Conditioning slope is not significant — both brains learn at similar speeds, the connectome just reaches a slightly higher final response
  2. Pattern discrimination shows no connectome advantage — and random actually trends better
  3. Working memory is topology-independent — NMDA dynamics dominate regardless of wiring

Limitations of this study

  1. Macro-scale data (96 regions) — 6 orders of magnitude coarser than FlyWire. Many structural advantages may only appear at cellular resolution.
  2. Small neurons/region (200) — limited representational capacity per region
  3. Degree-preserving rewiring — preserves degree distribution, so any advantage from degree heterogeneity is invisible. A more aggressive null model (Erdős-Rényi) might show larger differences.
  4. E-prop reward signal is global — real dopamine has region-specific effects. A more biologically detailed reward system might reveal larger topology-dependent effects.
  5. No conduction delays — the connectome has distance-dependent delays (up to 20ms) that we're not modeling. Delays create temporal coding opportunities that topology could exploit.
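On point 3: the two null models differ in what they destroy. A minimal sketch of degree-preserving rewiring via double edge swaps, on a toy undirected graph with hypothetical helper names (the repo's rewiring routine may differ):

```python
import random

def double_edge_swap(edges, n_swaps, rng):
    """Degree-preserving rewiring: (a,b),(c,d) -> (a,d),(c,b).
    Every node keeps its exact degree; an Erdős-Rényi control would
    instead redraw edges uniformly, destroying the degree sequence."""
    edges = [tuple(e) for e in edges]
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if {(a, d), (d, a), (c, b), (b, c)} & edge_set:
            continue  # would create a duplicate edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def degrees(edges, n):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

original = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 4),
            (4, 5), (5, 6), (6, 7), (4, 7)]
rewired = double_edge_swap(original, 100, random.Random(0))
```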

What the expert reviewers would say

  • "The conditioning result (p=0.011) is interesting but the effect size is small in absolute terms (0.00018 vs 0.00008)"
  • "Only 2/5 metrics are significant — this is partial evidence at best"
  • "Need to test with multiple null models, not just degree-preserving rewiring"
  • "Conduction delays should be added — temporal coding may be where structure really matters"

8. Research Context

What this means for the 先天 × 后天 (innate × learned) hypothesis

  • 先天 (innate structure): Provides channeling — efficient routing for specific computations
  • 后天 (learned calibration): E-prop fills in the synaptic weights that macro-scale dMRI can't resolve
  • The interaction: Structure × learning > either alone. The conditioning advantage only appears with a proper learning rule (e-prop), not with crude Hebbian injection.

Comparison with prior work

| Project | Connectome resolution | Learning | Structure advantage |
|---|---|---|---|
| OpenWorm | Synaptic (302 neurons) | None | Locomotion emerges with tuning |
| FlyWire/Eon | Synaptic (139K neurons) | None | Visual motion detection emerges |
| Encephagen | Macro (96 regions) | E-prop | Conditioning advantage (p=0.011) |

We are the first to demonstrate a learning advantage from macro-scale connectome topology.


9. What's Next

Immediate priorities

  1. Add conduction delays (distance-dependent, from tract lengths in TVB96) — temporal coding may be where topology provides the largest advantage
  2. Test with Erdős-Rényi null model — more aggressive control
  3. Scale to 500+ neurons/region for richer within-region dynamics
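A sketch of how priority 1 could look: a ring buffer of past spike vectors, with per-connection delays derived from tract lengths. Everything here (tract lengths, conduction velocity, buffer size, names) is illustrative, not the TVB96 data or the planned implementation.

```python
import torch

torch.manual_seed(0)
n = 4                                   # toy region count
dt = 0.1                                # ms per step
tract_len = torch.rand(n, n) * 100.0    # mm, illustrative (TVB96 would supply these)
velocity = 5.0                          # mm/ms, i.e. 5 m/s conduction speed
BUF = 200                               # buffer depth: covers delays up to 20 ms
delay_steps = (tract_len / velocity / dt).long().clamp(max=BUF - 1)

spike_buffer = torch.zeros(BUF, n)      # ring buffer of the last BUF spike vectors
head = 0                                # row holding the current step's spikes

def delayed_input(W):
    """input_i = sum_j W[i,j] * spike_j(t - delay[i,j])."""
    rows = (head - delay_steps) % BUF   # (n, n) buffer rows to read
    cols = torch.arange(n).expand(n, n) # presynaptic index j per entry
    delayed = spike_buffer[rows, cols]  # spike of pre j seen at post i's delay
    return (W * delayed).sum(dim=1)
```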

Medium term

  1. Implement attractor dynamics in PFC (true working memory, not just NMDA decay)
  2. Sequential learning tasks (where temporal credit assignment matters most)
  3. Multi-sensory integration (auditory + visual → associative learning)

The big question remains

Can a macro-scale connectome provide enough structural advantage to justify building brain-structured AI, vs just using random topology? Current evidence: partial. The conditioning advantage is real but small. Need more tasks and richer dynamics to fully answer this.


10. Files Changed

| File | Change |
|---|---|
| src/encephagen/learning/eprop.py | NEW — E-prop learning rule |
| src/encephagen/learning/__init__.py | Export EpropLearner, EpropParams |
| src/encephagen/gpu/spiking_brain_gpu.py | Added enable_learning(), apply_reward() |
| experiments/22_eprop_connectome_vs_random.py | NEW — Definitive experiment with e-prop |
| experiments/23_discrimination_analysis.py | NEW — Response diversity analysis |
| README.md | Updated experiments table, learning description |
| results/exp22_eprop_connectome_vs_random/results.json | Raw data |
| results/exp23_discrimination_analysis/results.json | Raw data |

Total runtime: ~3.5 hours (Exp 22: ~2h, Exp 23: ~1h, debugging: ~0.5h)
GPU: RTX 5070 (12GB VRAM), peak VRAM usage: ~2GB