Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

active-inference-robotics

Bridge active inference theory with robot control using K-Scale's JAX/MuJoCo stack. Use when connecting predictive coding to locomotion policies, mapping KL divergence minimization to RL training, applying mean field approximation to robotics state estimation, or implementing sim2real as inference about future observations.

In Manus ausführen

Sterne26

Forks8

Aktualisiert10. Juni 2026 um 11:55

Quelle

plurigrid

plurigrid/asi

GitHub-Repository öffnen Creator-Repositorys ansehen

Installationsbefehl

Download

In Manus ausführen

Verwandte BerufeSOC

Basierend auf der SOC-Berufsklassifikation

SoftwareentwicklerInformatik- und Mathematikberufe·SOC 15-1252

Datei-Explorer

3 Dateien

SKILL.md

readonly

Mehr aus diesem Repository

gleiches Repository

2600-magazine

plurigrid/asi

Query and explore the 2600: The Hacker Quarterly magazine archive (1984-present) via DuckDB. Provides structured access to 168+ issues covering hacker culture, security, privacy, telephony, and digital rights without loading full content into context.

2026-06-1026

acsets-algebraic-databases

plurigrid/asi

ACSets (Attributed C-Sets): Algebraic databases with Specter-style bidirectional navigation. Category-theoretic formalism for relational databases.

2026-06-1026

acsets-hatchery

plurigrid/asi

Attributed C-Sets as algebraic databases. Category-theoretic data structures generalizing graphs and dataframes with Gay.jl color integration.

2026-06-1026

acsets-algebraic-databases

plurigrid/asi

ACSets (Attributed C-Sets): Algebraic databases with Specter-style bidirectional

2026-06-1026

affective-taxis

plurigrid/asi

Implement affective valence as directional derivative of interoceptive energy landscapes for AI alignment. Use when building alignment-aware RL agents, validating GF(3) conservation in reward signals, training Langevin-based policies, or analyzing fold-change detection signals in POMDP environments.

2026-06-1026

algebraic-rewriting

plurigrid/asi

Category-theoretic graph rewriting with DPO, SPO, and SqPO pushouts for C-Sets. Declarative transformation of acset data structures.

2026-06-1026

name	active-inference-robotics
description	Bridge active inference theory with robot control using K-Scale's JAX/MuJoCo stack. Use when connecting predictive coding to locomotion policies, mapping KL divergence minimization to RL training, applying mean field approximation to robotics state estimation, or implementing sim2real as inference about future observations.

active-inference-robotics

Synthesizes Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack for predictive coding in robot locomotion.

Use When

Bridging active inference theory with robot control implementations
Connecting KL divergence minimization to RL training loops
Applying mean field approximation to robotics state estimation
Implementing sim2real transfer as inference about future observations
Training locomotion policies with principled entropy regularization

The Constructive Collision

┌─────────────────────────────────────────────────────────────────────────────┐
│  CONSTRUCTIVE COLLISION: Two Threads Converging                              │
│                                                                              │
│  Thread A: Patrick Kenny (Nov 2025)                                          │
│  ════════════════════════════════════                                        │
│  "Active inference can be formulated as constrained KL divergence           │
│   minimization solved by standard mean field methods"                        │
│                                                                              │
│  Key insight: Expected Free Energy ≈ KL Divergence + Entropy Regularizer    │
│                                                                              │
│  Thread B: K-Scale Labs (2024-2025)                                          │
│  ═══════════════════════════════════                                         │
│  "RL-based closed-loop control using policies trained in simulation         │
│   has firmly won as the best way of achieving real-time control"            │
│                                                                              │
│  Key insight: Stateless vs Stateful behaviors as pure/coalgebraic semantics │
│                                                                              │
│  COLLISION POINT: Both minimize surprise about future observations          │
│  ══════════════════════════════════════════════════════════════════         │
│                                                                              │
│       Active Inference              Robotics RL                              │
│       ────────────────              ──────────                               │
│       Predictive Distribution  ←→   Policy π(a|s)                           │
│       Hidden Markov Model      ←→   MDP/POMDP                                │
│       Mean Field Updates       ←→   PPO Gradient Steps                       │
│       Variational Free Energy  ←→   Policy Loss                              │
│       Expected Free Energy     ←→   Value Function + Entropy                 │
│       Perception/Action Loop   ←→   Observation/Action Loop                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Kenny's Key Contribution

From arXiv:2511.20321:

Perception/Action Divergence = VFE(past) + KL(future states)

Where:
- VFE(past) = Standard variational free energy on observed history
- KL(future) = Divergence of predictive distribution from HMM

This differs from Expected Free Energy by an ENTROPY REGULARIZER:
  EFE ≈ Pragmatic Value + Mutual Information
  PAD ≈ Pragmatic Value + Entropy(Q)

Why Entropy Regularization Matters for Robotics

# In ksim PPO training, entropy bonus prevents policy collapse:
loss = policy_loss + value_loss - entropy_coef * entropy

# Kenny's formulation shows this is NOT ad-hoc but principled:
# Entropy regularizer = not being overconfident about predictions
# Biological rationale: know limitations of future predictions

Mapping to ksim Architecture

Active Inference Concept	ksim Implementation
Hidden Markov Model	`PhysicsEngine` (MJX/MuJoCo)
Observation distribution	`Observation.observe(state)`
State inference Q(s)	`Critic.forward(obs, carry)`
Action inference Q(a)	`Actor.forward(obs, carry)`
Mean field factorization	Independent Q(s_t) per timestep
Predictive distribution	Policy rollout trajectory
VFE minimization	PPO policy gradient
EFE/PAD minimization	Value function + entropy bonus

Second-Order Behavior Types

1. Reflexive Control (Kenny's "Sufficient" Model)

# Agent predicts proprioceptive sensations → fulfills reflexively
class ReflexiveController:
    """
    Kenny: "If the agent can successfully predict its future sensations,
    it can fulfill them unconsciously via motor reflexes."
    """
    def step(self, predicted_proprio: Array) -> Action:
        # Low-level PD control fulfills proprioceptive predictions
        return self.pd_controller(predicted_proprio, self.current_state)

2. Deliberative Planning (EFE Extension)

# When reflexive prediction fails, engage deliberative inference
class DeliberativeController:
    """
    Extends reflexive control with policy search over trajectories.
    This is where EFE differs from Kenny's PAD formulation.
    """
    def plan(self, beliefs: Distribution, horizon: int) -> Policy:
        # Tree search over policies weighted by expected free energy
        for policy in self.policy_space:
            efe = self.expected_free_energy(beliefs, policy, horizon)
            # EFE includes mutual information (curiosity/exploration)
            # PAD would use entropy instead (uncertainty awareness)

3. Hierarchical Composition

Level 3: Goal Selection (minimize long-horizon EFE)
    ↓ sets reference for
Level 2: Trajectory Planning (predictive distribution)
    ↓ sets reference for  
Level 1: Reflexive Execution (fulfill proprio predictions)
    ↓ actuates
Level 0: Motor Primitives (PD control, actuator dynamics)

GF(3) Balanced Quad

active-inference (0) ⊗ kscale-ksim (0) ⊗ mujoco-playground (0) = 0 ✓

All three are ERGODIC — coordination/infrastructure skills.
This is a "resonant triad" where all components coordinate.

For generation (+1), add: skill-creator, algorithmic-art
For verification (-1), add: sheaf-cohomology, code-review

Skill Colors (drand seed 12005093902789493003)

Skill	Color	Role
`active-inference`	`#DF8D0F`	Coordination (theory)
`kscale-ksim`	`#25BC3D`	Coordination (simulation)
`mujoco-playground`	`#93DBDA`	Coordination (framework)

2-3-5-7 Prime Sieve Experts

Applying prime-indexed refinement to identify domain experts:

Prime	Expert	Domain	Key Contribution
2	Patrick Kenny	Active Inference	Mean field formulation, PAD criterion
3	Thomas Parr	Active Inference	2022 textbook, EFE derivation
5	Ben Bolte	K-Scale	ksim architecture, open-source humanoids
7	Karl Friston	Free Energy Principle	FEP foundations, continuous formulation
11	(DeepMind team)	MuJoCo Playground	MJX, sim2real zero-shot
13	Wesley Maa	K-Scale	Tooling, visualization

Mutual Awareness

This skill references and is referenced by:

depends_on:
  - kscale-ksim        # Simulation implementation
  - kscale-ecosystem   # Hardware context
  - mujoco-playground  # Framework foundation
  
referenced_by:
  - cognitive-superposition  # Team mental models
  - parametrised-optics-cybernetics  # Category theory bridge
  - reafference-corollary-discharge  # Sensorimotor prediction

Implementation Pattern

# Unified Active Inference + RL Training Loop
class ActiveInferenceTrainer:
    """
    Combines Kenny's PAD criterion with ksim's PPO.
    """
    def __init__(self, hmm: PhysicsEngine, config: Config):
        self.hmm = hmm
        self.actor = Actor(config)
        self.critic = Critic(config)
        
    def perception_action_divergence(
        self, 
        observations: Array,  # O_{1:t} (past)
        q_future: Distribution  # Q(S_{t+1:T}, O_{t+1:T})
    ) -> Scalar:
        """
        Kenny's PAD = VFE(past) + KL(future states from HMM)
        """
        # Past: standard VFE on observation history
        vfe_past = self.variational_free_energy(observations)
        
        # Future: KL divergence of predicted states from HMM
        # Note: Observable emissions cancel out in future KL
        kl_future = self.kl_future_states(q_future, self.hmm)
        
        return vfe_past + kl_future
    
    def train_step(self, trajectory: Trajectory) -> Metrics:
        # PPO updates approximate mean field coordinate ascent
        # Entropy bonus provides Kenny's regularization
        return ppo_update(
            self.actor, 
            self.critic, 
            trajectory,
            entropy_coef=0.01  # ← The regularizer!
        )

References

ACSet Schema

@present SchActiveInferenceRobotics(FreeSchema) begin
    # Objects
    HMM::Ob           # Hidden Markov Model (generative model)
    State::Ob         # Latent state
    Observation::Ob   # Sensory observation
    Action::Ob        # Motor command
    Policy::Ob        # Action sequence
    
    # Morphisms (inference)
    perceive::Hom(Observation, State)    # Perception: O → S
    predict::Hom(State, Observation)     # Prediction: S → O
    act::Hom(State, Action)              # Action selection: S → A
    transition::Hom(State × Action, State)  # Dynamics: S × A → S'
    
    # Attributes
    FreeEnergy::AttrType
    vfe::Attr(State, FreeEnergy)         # Variational free energy
    efe::Attr(Policy, FreeEnergy)        # Expected free energy
    pad::Attr(Policy, FreeEnergy)        # Perception/action divergence
    
    # The key relationship (Kenny's contribution):
    # pad ≈ efe + entropy_regularizer
end

Related Skills

kscale-ksim — simulation implementation partner
mujoco-playground — framework foundation
cognitive-superposition — team mental models
reafference-corollary-discharge — sensorimotor prediction