
Name: Ff New Reward
Author: X-GenGroup

// Complete workflow for adding a new reward model. Covers pointwise vs groupwise design, __call__ contract, registration, YAML config, multi-reward setup, and verification. Trigger: 'add reward', 'new reward model', 'custom reward', 'scoring function'.


updated:April 6, 2026 at 09:22

SKILL.md


name	ff-new-reward
description	Complete workflow for adding a new reward model. Covers pointwise vs groupwise design, __call__ contract, registration, YAML config, multi-reward setup, and verification. Trigger: 'add reward', 'new reward model', 'custom reward', 'scoring function'.

New Reward Model Integration

Authoritative reference: guidance/rewards.md — read it first. Template: src/flow_factory/rewards/my_reward.py

Prerequisites

Determine your reward type:

- Pointwise: Each sample is scored independently (e.g., aesthetic score, CLIP similarity)
- Groupwise: Scores depend on comparison within a group (e.g., ranking, preference)
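The distinction can be illustrated with a minimal, framework-free sketch. Both functions below are hypothetical stand-ins, not Flow-Factory APIs: the pointwise scorer looks at one sample at a time, while the groupwise scorer needs the whole group before it can assign rank-based rewards.

```python
from typing import List


def pointwise_scores(lengths: List[int]) -> List[float]:
    """Toy pointwise score: each value depends only on its own sample."""
    return [min(n / 10.0, 1.0) for n in lengths]


def groupwise_scores(raw: List[float]) -> List[float]:
    """Toy groupwise score: rank each sample within its group, scaled to [0, 1]."""
    if len(raw) < 2:
        return [0.0 for _ in raw]
    order = sorted(range(len(raw)), key=lambda i: raw[i])
    ranks = [0.0] * len(raw)
    for rank, idx in enumerate(order):
        ranks[idx] = rank / (len(raw) - 1)
    return ranks


print(pointwise_scores([5, 20]))
print(groupwise_scores([0.3, 0.9, 0.1]))
```

If removing or adding a sample to the batch would change the scores of the other samples, the reward is groupwise.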

Phase 1: Design

1. Choose base class: PointwiseRewardModel or GroupwiseRewardModel
2. Identify required inputs: Which fields from Sample does your reward need?
   - Common: prompt, image, video, condition_images, condition_videos
   - Set the required_fields tuple accordingly
3. Input format: PIL Images (default) or Tensors?
   - Set use_tensor_inputs = True if your model needs raw tensors
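To make the role of required_fields concrete, here is a minimal sketch of how a declared tuple could be used to pick keyword arguments out of a sample. The select_fields helper and the dict-shaped sample are assumptions made for illustration; the source does not show how RewardProcessor actually does this.

```python
from typing import Any, Dict, Tuple


def select_fields(sample: Dict[str, Any], required_fields: Tuple[str, ...]) -> Dict[str, Any]:
    """Return only the fields a reward declares, failing loudly if one is missing."""
    missing = [name for name in required_fields if name not in sample]
    if missing:
        raise KeyError(f"sample is missing required fields: {missing}")
    return {name: sample[name] for name in required_fields}


# Hypothetical sample; a string stands in for a real PIL image.
sample = {"prompt": "a red bicycle", "image": "<PIL image placeholder>", "video": None}
kwargs = select_fields(sample, ("prompt", "image"))
print(sorted(kwargs))
```

Declaring only the fields you use keeps the call signature small and surfaces a missing field as an early error.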

Phase 2: Implementation

Create the reward model file

```python
# src/flow_factory/rewards/<my_reward>.py
from typing import List, Optional

import torch
from accelerate import Accelerator
from PIL import Image

from ..hparams import RewardArguments
from .abc import PointwiseRewardModel, RewardModelOutput


class MyRewardModel(PointwiseRewardModel):
    # Fields taken from each Sample and passed to __call__
    required_fields = ("prompt", "image")
    # False: inputs arrive as PIL Images; True: inputs arrive as raw tensors
    use_tensor_inputs = False

    def __init__(self, config: RewardArguments, accelerator: Accelerator):
        super().__init__(config, accelerator)
        # Load your model, processor, etc.
        # Use self.device and self.dtype from base class

    @torch.no_grad()
    def __call__(
        self,
        prompt: List[str],
        image: Optional[List[Image.Image]] = None,
        video: Optional[List[List[Image.Image]]] = None,
        condition_images=None,
        condition_videos=None,
        **kwargs,
    ) -> RewardModelOutput:
        # Placeholder scores: replace with your real reward computation.
        # Shape must be (batch_size,) for Pointwise, or (group_size,) for Groupwise.
        rewards = torch.zeros(len(prompt), device=self.device)
        return RewardModelOutput(rewards=rewards)
```

Key constraints for `__call__`:

- Pointwise: Input length = config.batch_size. Return rewards of shape (batch_size,)
- Groupwise: Input length = group_size. You handle batching yourself. Return rewards of shape (group_size,)
- Always use the @torch.no_grad() decorator
- Return RewardModelOutput (not raw tensors)
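For the Groupwise case, "handle batching yourself" usually means slicing the group into smaller chunks before running the model. A minimal sketch of that pattern follows; score_in_chunks and the toy scorer are hypothetical names, and a real implementation would pass tensors or images rather than strings.

```python
from typing import Callable, List, Sequence


def score_in_chunks(
    items: Sequence[str],
    score_fn: Callable[[Sequence[str]], List[float]],
    chunk_size: int,
) -> List[float]:
    """Run score_fn over fixed-size chunks and concatenate the results in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    scores: List[float] = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        chunk_scores = score_fn(chunk)
        if len(chunk_scores) != len(chunk):
            raise ValueError("score_fn must return one score per input")
        scores.extend(chunk_scores)
    return scores


# Toy scorer (assumption for the demo): the score is the prompt length.
group = ["a cat", "a dog on a skateboard", "sunset"]
rewards = score_in_chunks(group, lambda xs: [float(len(x)) for x in xs], chunk_size=2)
print(len(rewards) == len(group))
```

The final list has one entry per group member, which matches the (group_size,) contract regardless of the chunk size.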

Phase 3: Register

Add to _REWARD_MODEL_REGISTRY in src/flow_factory/rewards/registry.py:

```python
'my_reward': 'flow_factory.rewards.<my_reward>.MyRewardModel',
```

Phase 4: Configuration

Use in YAML config:

```yaml
rewards:
  - name: "my_reward"
    reward_model: "my_reward"        # Must match registry key
    model_path: "org/model-name"     # HuggingFace model path (if applicable)
    dtype: "bfloat16"
    device: "cuda"
    batch_size: 16
```

Multi-reward setup:

```yaml
rewards:
  - name: "aesthetic"
    reward_model: "PickScore"
    weight: 0.7
  - name: "custom"
    reward_model: "my_reward"
    weight: 0.3
```
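One plausible reading of the weight fields is a per-sample weighted sum across reward models. The sketch below assumes exactly that; aggregate_rewards is a hypothetical helper, and the framework's real aggregation (for example, whether it normalizes rewards first) is not stated in the source.

```python
from typing import Dict, List


def aggregate_rewards(per_model: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Combine per-model reward lists into one list via a weighted sum per sample."""
    lengths = {len(values) for values in per_model.values()}
    if len(lengths) != 1:
        raise ValueError("all reward lists must have the same length")
    (n,) = lengths
    total = [0.0] * n
    for name, values in per_model.items():
        w = weights[name]
        for i, v in enumerate(values):
            total[i] += w * v
    return total


# Two samples scored by two hypothetical reward models.
combined = aggregate_rewards(
    {"aesthetic": [1.0, 0.0], "custom": [0.0, 1.0]},
    {"aesthetic": 0.7, "custom": 0.3},
)
print(combined)
```

Under this assumption, rewards on very different scales would let one model dominate, so it is worth checking each model's output range before choosing weights.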

Phase 5: Verification

- __init__ loads the model without errors
- __call__ returns the correct reward shape
- Rewards are numerically reasonable (not all zeros, no NaN/Inf)
- Works with RewardProcessor dispatch (Pointwise/Groupwise routing)
- Works in a multi-reward setup with weight aggregation
- Device placement is correct (respects config.device)
- Registry entry resolves: get_reward_model_class('my_reward')
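The shape and numeric checks in this list are easy to automate. Below is a minimal sketch that works on a plain list of floats; check_rewards is a hypothetical helper, and with a tensor you would convert it first (for example with .tolist()).

```python
import math
from typing import List, Sequence


def check_rewards(rewards: Sequence[float], expected_len: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    if len(rewards) != expected_len:
        problems.append(f"expected {expected_len} rewards, got {len(rewards)}")
    if any(not math.isfinite(r) for r in rewards):
        problems.append("rewards contain NaN or Inf")
    if rewards and all(r == 0.0 for r in rewards):
        problems.append("all rewards are zero")
    return problems


print(check_rewards([0.2, 0.8, 0.5], expected_len=3))
print(check_rewards([0.0, float("nan")], expected_len=3))
```

Returning a list of problems instead of raising on the first one makes it easier to see every failure from a single test batch.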

Common Pitfalls

- Wrong return shape: Pointwise must return (batch_size,), Groupwise must return (group_size,)
- Forgetting @torch.no_grad(): the reward computation builds an unnecessary autograd graph, which can lead to OOM
- Hardcoding the device: use self.device from the base class, not torch.device('cuda')
- Not setting required_fields: RewardProcessor won't pass the right data to your model
- Mixing paradigms: don't inherit from PointwiseRewardModel if your reward needs group context


"author": "X-GenGroup"

"repository": "X-GenGroup/Flow-Factory"


