mixing-transforms

Name: Mixing Transforms
Author: albumentations-team

// Policy for AlbumentationsX transforms that combine multiple images or objects. Use when implementing, reviewing, or using Mosaic, CopyAndPaste, OverlayElements, HistogramMatching, PixelDistributionAdaptation, or other mixing transforms.


stars:466

forks:29

updated:May 25, 2026 at 11:18


Mixing Transforms Policy

Apply this skill when implementing, reviewing, or using transforms that combine data from multiple images: Mosaic, CopyAndPaste, OverlayElements, HistogramMatching, PixelDistributionAdaptation, etc.

1. Donor sampling happens OUTSIDE the transform

Mixing transforms never sample which donor image or which instances to use. That is the user's responsibility. The transform receives the final list and processes it verbatim.

Why: deterministic control, class-balanced pasting, curriculum strategies, and hard-example mining all require the user to decide what goes in. One extra line of code outside the transform is a better trade-off than a black-box internal sampler.

import random

# CORRECT: the user picks donors before calling the transform
# (dataset, indices, n, transform, and image come from the user's own pipeline)
donors = [dataset[random.choice(indices)] for _ in range(n)]
result = transform(image=image, mosaic_metadata=donors)

# INCORRECT: the transform decides internally which donors to use
result = MosaicWithSampling(dataset=dataset)(image=image)
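As one illustration of what sampling outside the transform makes possible, the sketch below picks donors in a class-balanced way using only the standard library. The helper name `pick_balanced_donors` and the toy `dataset` of dicts carrying a `class_id` key are assumptions made for this example; they are not part of AlbumentationsX.

```python
import random
from collections import defaultdict


def pick_balanced_donors(dataset, n, rng):
    """Pick n donors: choose a class uniformly, then an item of that class.

    Hypothetical helper. `dataset` is assumed to be a list of dicts that
    each carry a "class_id" key. Nothing here is AlbumentationsX API.
    """
    by_class = defaultdict(list)
    for item in dataset:
        by_class[item["class_id"]].append(item)
    classes = sorted(by_class)
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(n)]


# Toy dataset: class 0 is nine times more frequent than class 1.
dataset = [{"image": f"img{i}", "class_id": 0 if i < 9 else 1} for i in range(10)]
rng = random.Random(0)  # seeded so the run is reproducible
donors = pick_balanced_donors(dataset, n=4, rng=rng)
# `donors` is the final list the user would pass under the metadata key,
# e.g. transform(image=image, mosaic_metadata=donors).
```

Because the sampler lives in user code, swapping it for a curriculum or hard-example strategy would not require touching the transform.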

2. Metadata format: `list[dict]`

All mixing transforms receive auxiliary data as a `list[dict]` under a `metadata_key`. Each dict is one item: one full image for Mosaic, one object instance for CopyAndPaste. The format is the same across all mixing transforms.

mosaic_metadata = [
    {"image": img1, "mask": mask1, "bboxes": bboxes1, "bbox_labels": {...}},
    {"image": img2, ...},
]

copy_paste_metadata = [
    {"image": src_img, "mask": obj_mask, "bbox": [x1, y1, x2, y2], "bbox_labels": {"class_id": 3}},
    {"image": src_img, "mask": obj_mask2, "bbox_labels": {"class_id": 7}},
]
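A quick shape check before calling the pipeline can catch malformed metadata early. The sketch below is a hypothetical helper (`validate_metadata` is not an AlbumentationsX function), and which keys count as required is an assumption chosen by the caller, since the policy above does not list them.

```python
def validate_metadata(metadata, required_keys):
    """Return a list of problems; an empty list means the shape looks fine.

    Hypothetical helper for illustration only, not AlbumentationsX API.
    """
    if not isinstance(metadata, list):
        return [f"expected list, got {type(metadata).__name__}"]
    problems = []
    for i, item in enumerate(metadata):
        if not isinstance(item, dict):
            problems.append(f"item {i}: expected dict, got {type(item).__name__}")
            continue
        missing = [k for k in required_keys if k not in item]
        if missing:
            problems.append(f"item {i}: missing keys {missing}")
    return problems


# Placeholder strings stand in for real image and mask arrays.
good = [{"image": "img1", "mask": "m1"}, {"image": "img2", "mask": "m2"}]
bad = [{"image": "img1"}, ("img2", "m2")]

good_problems = validate_metadata(good, required_keys=("image", "mask"))
bad_problems = validate_metadata(bad, required_keys=("image", "mask"))
```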

3. Label fields: `bbox_labels` and `keypoint_labels` (dicts)

All mixing transforms use the same wrapper dict convention for labels:

- `bbox_labels: dict[str, Any]` maps each label field name (as declared in `BboxParams.label_fields`) to its value(s) for this item.
- `keypoint_labels: dict[str, Any]` maps each label field name (as declared in `KeypointParams.label_fields`) to its value(s) for this item.

For CopyAndPaste (one object per dict), values are scalars (one bbox, one object):

{
    "image": src_image,
    "mask": obj_mask,
    "bbox": [10, 20, 50, 80],        # same coord_format as BboxParams
    "bbox_labels": {
        "class_id": 3,
        "is_crowd": 0,
    },
    "keypoints": [[25, 40]],         # same coord_format as KeypointParams
    "keypoint_labels": {
        "joint_name": "left_eye",
    },
}

For Mosaic (one full image per dict), values are lists — one entry per bbox/keypoint:

{
    "image": img,
    "bboxes": [[10, 20, 50, 80], [5, 5, 30, 30]],
    "bbox_labels": {
        "class_id": [3, 7],
        "is_crowd": [0, 1],
    },
    "keypoints": [[25, 40], [60, 70]],
    "keypoint_labels": {
        "joint_name": ["left_eye", "nose"],
    },
}

Key rule: the dict keys in `bbox_labels` / `keypoint_labels` must exactly match the names declared in `BboxParams(label_fields=[...])` and `KeypointParams(label_fields=[...])`.
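The key rule can be checked mechanically for a Mosaic-style item. The following is a minimal sketch under stated assumptions: `check_labels` is a hypothetical helper, and `label_fields` stands in for whatever list was passed to `BboxParams(label_fields=...)`. It also checks that each label list has one entry per bbox, as the Mosaic example above describes.

```python
def check_labels(item, label_fields, boxes_key="bboxes", labels_key="bbox_labels"):
    """Return a list of problems for one Mosaic-style item (empty list = OK).

    Hypothetical helper for illustration only, not AlbumentationsX API.
    """
    labels = item.get(labels_key, {})
    problems = []
    # Keys must match the declared label fields exactly.
    if set(labels) != set(label_fields):
        problems.append(
            f"{labels_key} keys {sorted(labels)} != declared {sorted(label_fields)}"
        )
    # Each label field needs one value per bbox.
    n_boxes = len(item.get(boxes_key, []))
    for name, values in labels.items():
        if len(values) != n_boxes:
            problems.append(f"{name}: {len(values)} values for {n_boxes} boxes")
    return problems


label_fields = ["class_id", "is_crowd"]  # assumed to mirror BboxParams.label_fields

ok_item = {
    "bboxes": [[10, 20, 50, 80], [5, 5, 30, 30]],
    "bbox_labels": {"class_id": [3, 7], "is_crowd": [0, 1]},
}
bad_item = {
    "bboxes": [[10, 20, 50, 80], [5, 5, 30, 30]],
    "bbox_labels": {"category": [3]},  # wrong key and wrong length
}

ok_problems = check_labels(ok_item, label_fields)
bad_problems = check_labels(bad_item, label_fields)
```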

4. Coordinates use the same format as `BboxParams` / `KeypointParams`

Bboxes and keypoints in metadata dicts must use the same `coord_format` as declared in `Compose` (via `BboxParams` / `KeypointParams`). The processor's `preprocess()` converts them to the internal albumentations format, so no manual conversion is needed.

# BboxParams declared with coord_format='pascal_voc'
# → bboxes in metadata must also be pascal_voc [x_min, y_min, x_max, y_max]
copy_paste_metadata = [
    {"image": img, "mask": m, "bbox": [10, 20, 50, 80], "bbox_labels": {"class_id": 3}},
]
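If donor annotations happen to be stored in a different layout than the one declared for the pipeline, they would need converting before they go into the metadata list. The sketch below assumes the source boxes are in COCO layout `[x_min, y_min, width, height]` and the pipeline declares `pascal_voc`; the helper name `coco_to_pascal_voc` is made up for this example.

```python
def coco_to_pascal_voc(bbox):
    """Convert [x_min, y_min, width, height] to [x_min, y_min, x_max, y_max].

    Hypothetical helper for illustration only, not AlbumentationsX API.
    """
    x_min, y_min, width, height = bbox
    return [x_min, y_min, x_min + width, y_min + height]


coco_box = [10, 20, 40, 60]
voc_box = coco_to_pascal_voc(coco_box)

# Placeholder strings stand in for real image and mask arrays.
copy_paste_metadata = [
    {"image": "img", "mask": "m", "bbox": voc_box, "bbox_labels": {"class_id": 3}},
]
```

Here `voc_box` comes out as `[10, 20, 50, 80]`, the same box used in the example above.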

5. `metadata_key` pattern

Every mixing transform exposes `metadata_key: str` in its constructor and lists it in `targets_as_params`. This ensures `Compose` validates that the key is present.

@property
def targets_as_params(self) -> list[str]:
    return [self.metadata_key]
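To show how the pattern fits together, here is a toy stand-in that does not import the library. `ToyMixingTransform` and `missing_keys` are hypothetical names; the second function only imitates the idea of a pipeline checking `targets_as_params` against the call arguments and makes no claim about how `Compose` implements that check.

```python
class ToyMixingTransform:
    """Toy class mirroring the metadata_key pattern; not a real transform."""

    def __init__(self, metadata_key: str = "mosaic_metadata"):
        self.metadata_key = metadata_key

    @property
    def targets_as_params(self) -> list[str]:
        return [self.metadata_key]


def missing_keys(transform, data: dict) -> list[str]:
    """Return the names from targets_as_params that are absent from data."""
    return [key for key in transform.targets_as_params if key not in data]


# The key name is configurable, so two mixing transforms can coexist
# in one pipeline with different metadata keys.
toy = ToyMixingTransform(metadata_key="donors")
absent = missing_keys(toy, {"image": "img"})
present = missing_keys(toy, {"image": "img", "donors": []})
```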

6. No-op on empty or missing metadata

If the metadata list is empty or missing, the transform must return the input unchanged without raising an error.

metadata = data.get(self.metadata_key)
if not isinstance(metadata, list) or not metadata:
    return self._no_op_params()
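The guard above can be exercised in isolation. The sketch below reuses the same condition in two toy functions; `get_mix_params`, `apply_mix`, and the `"noop"` flag are assumptions for illustration and stand in for whatever `_no_op_params()` returns in a real transform.

```python
def get_mix_params(data, metadata_key):
    """Toy version of the guard: anything but a non-empty list means no-op."""
    metadata = data.get(metadata_key)
    if not isinstance(metadata, list) or not metadata:
        return {"noop": True, "items": []}
    return {"noop": False, "items": metadata}


def apply_mix(image, params):
    """Return the input untouched on no-op; otherwise a toy 'mixed' result."""
    if params["noop"]:
        return image  # the same object comes back, no error raised
    return {"base": image, "mixed_in": len(params["items"])}


image = object()  # placeholder for a real image array

# Missing key, None, empty list, and a non-list value all take the no-op path.
cases = [{}, {"donors": None}, {"donors": []}, {"donors": ()}]
noop_results = [apply_mix(image, get_mix_params(case, "donors")) for case in cases]

mixed = apply_mix(image, get_mix_params({"donors": [{"image": "d0"}]}, "donors"))
```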


package.json

"author": "albumentations-team"

"repository": "albumentations-team/AlbumentationsX"

