pg-dpo-non-exponential-discounting

Pontryagin-Guided Direct Policy Optimization (PG-DPO): a variational RL framework that replaces Bellman recursions with the Pontryagin Maximum Principle for non-exponential discounting (hyperbolic, survival-discount). It targets settings where standard value-based and actor-critic methods fail because the discounted return no longer satisfies a one-step Bellman recursion. Use when: RL with non-exponential discounting, hyperbolic discounting RL, human-like time preferences, survival processes, Pontryagin-based RL. Activation: PG-DPO, non-exponential discounting, Pontryagin RL, hyperbolic discounting, Bellman breakdown, Adjoint-MC projection.
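To make the "Bellman breakdown" concrete, here is a minimal sketch of the discounting issue only, not of PG-DPO itself. It compares the one-step ratio D(t+1)/D(t) for an exponential discount and a hyperbolic discount D(t) = 1/(1 + k t). The function names and the values of `gamma` and `k` are illustrative assumptions and do not come from the skill.

```python
import numpy as np

def exponential_discount(t, gamma=0.95):
    """Standard RL discount D(t) = gamma**t (gamma is an illustrative value)."""
    return gamma ** np.asarray(t, dtype=float)

def hyperbolic_discount(t, k=0.1):
    """Hyperbolic discount D(t) = 1 / (1 + k*t) (k is an illustrative value)."""
    return 1.0 / (1.0 + k * np.asarray(t, dtype=float))

def one_step_ratios(discount, horizon=6):
    """Return D(t+1) / D(t) for t = 0 .. horizon-1."""
    t = np.arange(horizon + 1)
    d = discount(t)
    return d[1:] / d[:-1]

exp_ratios = one_step_ratios(exponential_discount)
hyp_ratios = one_step_ratios(hyperbolic_discount)

# Exponential: the ratio is the same gamma at every step, which is what
# lets the return be written recursively as G_t = r_t + gamma * G_{t+1}.
is_exp_constant = bool(np.allclose(exp_ratios, exp_ratios[0]))

# Hyperbolic: the ratio (1 + k*t) / (1 + k*(t+1)) changes with t, so no
# single gamma reproduces the weights and the one-step recursion fails.
is_hyp_constant = bool(np.allclose(hyp_ratios, hyp_ratios[0]))

print("exponential ratios constant:", is_exp_constant)
print("hyperbolic ratios constant:", is_hyp_constant)
```

Because the hyperbolic ratio depends on elapsed time, the weight given to a future reward shifts as the agent moves forward, which is the time inconsistency that value-recursion methods do not accommodate.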


Install command
npx skills add https://github.com/hiyenwong/ai_collection --skill pg-dpo-non-exponential-discounting

Copy and paste this command into Claude Code to install the skill.

Stars: 1
Forks: 0
Updated: June 4, 2026 at 02:00