chibi-sticker-sheet
// Use when generating a LINE/WeChat-style chibi sticker sheet (4x8 grid, 32 expressions) from an anime character reference image via Gemini, including transparent PNG output and individual cell slicing.
| name | chibi-sticker-sheet |
| description | Use when generating a LINE/WeChat-style chibi sticker sheet (4x8 grid, 32 expressions) from an anime character reference image via Gemini, including transparent PNG output and individual cell slicing. |
Generate a 4×8 chibi sticker sheet from a character reference image with Gemini, then key out the white background and slice into 32 individual transparent PNGs.
- Outputs: sheet_white.png, sheet_transparent.png, cells/*.png (512×512 each)
- Credentials: GOOGLE_AI_STUDIO_API_KEY / GEMINI_API_KEY / GOOGLE_API_KEY (AI Studio) or VERTEX_AI_KEY (Vertex Express) or GOOGLE_GENAI_USE_VERTEXAI=true + GOOGLE_CLOUD_PROJECT/LOCATION (Vertex ADC)
- Tooling: Python + uv; model gemini-3.1-flash-image-preview
- Captions: added post hoc with ImageDraw

The natural "white bg + black bg → α extraction" requires pixel-aligned foregrounds. Gemini ignores bg-change instructions in image-to-image edits and returns the same image. Use edge flood-fill keying instead (see scripts/key_alpha.py).
| Variable | Meaning |
|---|---|
| CHAR_REF | Absolute path to character reference PNG |
| EXPRESSIONS | List of 32 strings, action-first (e.g. "waterfall tears streaming down both cheeks") |
| OUT_DIR | Output directory |
[1] Generate white-bg 4×8 sheet (two Gemini calls, each 4×4)
-> gemini-3.1-flash-image-preview, image-to-image with CHAR_REF
-> Call A: expressions 1–16 -> sheet_white_a.png
-> Call B: expressions 17–32 -> sheet_white_b.png
-> Stitch A and B vertically -> sheet_white.png
[2] Key out background
-> scripts/key_alpha.py: edge flood-fill via scipy.ndimage.label
-> save sheet_transparent.png
[3] Slice 4×8 grid
-> scripts/key_alpha.py: min(cw, ch) square crop, centered per cell
-> save cells/01_*.png … cells/32_*.png (512×512)
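The square crop in step [3] can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in scripts/key_alpha.py: the function name `crop_cell_square` and the `box` argument are hypothetical, and the cell boundaries are assumed to be known already.

```python
from PIL import Image


def crop_cell_square(sheet, box, out_size=512):
    """Centered square crop of one grid cell, resized to out_size x out_size.

    box is (left, top, right, bottom) of the cell in sheet coordinates.
    Both the function name and the box convention are illustrative.
    """
    left, top, right, bottom = box
    cw, ch = right - left, bottom - top
    side = min(cw, ch)                 # largest square that fits the cell
    x0 = left + (cw - side) // 2       # center the square horizontally
    y0 = top + (ch - side) // 2        # center the square vertically
    cell = sheet.crop((x0, y0, x0 + side, y0 + side))
    return cell.resize((out_size, out_size), Image.LANCZOS)


# Synthetic transparent sheet standing in for sheet_transparent.png
sheet = Image.new("RGBA", (1024, 2048), (0, 0, 0, 0))
cell = crop_cell_square(sheet, (0, 0, 256, 300))
print(cell.size)  # (512, 512)
```

Cropping to min(cw, ch) before resizing keeps the sticker's proportions intact even when Gemini returns rows and columns of unequal size.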
Gemini's image output resolution caps around 1024×1024. A single 4×8 canvas (4 columns, 8 rows) would squash each cell to roughly 256×128 px (width × height), which is too small for clean chibi art. Generating two 4×4 sheets and stitching them keeps each cell at about 256×256 px, the same per-cell resolution as a 16-sticker set.
Two calls (Call A: expressions 1–16, Call B: expressions 17–32), each image-to-image with the same character reference.
Use identical art-style and character-lock text for both calls to ensure visual consistency.
Generate a 4x4 grid sticker sheet of 16 chibi stickers of the same character
from the reference image. Seamless pure white (#ffffff) background; no grid
lines, no cell borders, no text, no captions. Stickers evenly spaced.
Art style: LINE / WeChat Japanese chibi sticker, extreme super-deformed
2-head body ratio, oversized round head, tiny stubby body, thick uniform
bold black ink outline, flat cel shading, warm creamy pastel palette, two
large round pink cheek blush dots on every face, large round eyes with a
single bright white highlight, mochi chibi aesthetic.
Character lock (must match in every single cell):
[list each attribute: hair color/style, accessories, eye color, outfit details]
16 expressions, left-to-right top-to-bottom:
1. <action-first description>
...
16. <action-first description>
Run the same prompt template a second time with expressions 17–32 substituted in. Then stitch:
```python
from PIL import Image

a = Image.open("sheet_white_a.png")
b = Image.open("sheet_white_b.png")
# Resize b to match a's width if they differ (Gemini output dims can vary)
if b.width != a.width:
    b = b.resize((a.width, int(b.height * a.width / b.width)), Image.LANCZOS)
# Stack a on top of b on a white canvas
combined = Image.new("RGB", (a.width, a.height + b.height), (255, 255, 255))
combined.paste(a, (0, 0))
combined.paste(b, (0, a.height))
combined.save("sheet_white.png")
```

- Action-first expressions: "waterfall tears streaming" beats "sad". Include a visible prop or gesture.
- Keep the prompt short and simple to avoid MALFORMED_FUNCTION_CALL. No negative-list clauses (must NOT).
- Style keywords: super-deformed 2-head ratio · thick uniform bold black ink outline · flat cel shading · mochi chibi · large round pink cheek blush dots
- Character lock: spell out accessories (e.g. pink cherry blossom sakura flower hair ornament), eye color, every garment piece.

Gemini tends to reuse the same pose for cells that share a column, especially cells 9 and 13 (column-1 of rows 3–4) and the equivalent pairs in the second sheet (25 and 29). Prevent duplicates:
Span all six emotional axes across the 32 slots — no axis should appear more than 6 times:
| Axis | Example actions |
|---|---|
| Joy / excitement | raised fist, jumping, sparkle eyes |
| Sadness / crying | waterfall tears, wilting head, tissues |
| Anger / frustration | puffed cheeks, steam from head, finger-point |
| Surprise / shock | wide-O mouth, hands-on-cheeks, dropped jaw |
| Shy / embarrassed | hands over red face, hiding behind sleeves |
| Calm / smug | arms crossed, side-eye, tea-sipping |
Assign a distinct body verb to every cell. If two cells share the same verb (e.g., both "crying"), Gemini collapses them. Make verbs orthogonal: waterfall-tears arms-spread ≠ single-teardrop hands-clasped.
Critical column-1 pairs: 9 & 13, and 25 & 29 must each differ in both emotion axis AND body posture. Write each pair side by side before finalising and check they are visually distinguishable.
Vary facing direction and limb action across rows. Cells in the same column naturally echo each other; counteract by alternating pose direction (facing left vs. right) or prop presence.
No expression from Call A (1–16) should be repeated in Call B (17–32). Cross-check verbs before submitting the second prompt.
Reference 32-slot layout (copy and customize):
Call A (sheet 1, expressions 1–16):
1. arms raised, jumping for joy, sparkle eyes
2. waterfall tears streaming, arms limp at sides
3. puffed cheeks, steam wisps from head, fists clenched
4. wide-O mouth shock, both hands on cheeks Home-Alone pose
5. smug smile, arms crossed, eyes half-lidded
6. shy embarrassment, hands pressed together, deep blush
7. thumbs-up grin, winking one eye
8. single teardrop rolling, trembling lip, hands clasped
9. hyper-excited wave, leaning forward, mouth wide open ← must differ from #13
10. exhausted slumped, sweat drop, drooping eyes
11. index finger raised, lecture pose, tiny smile
12. heart eyes, both hands framing face, rosy cheeks
13. angry stomp, foot raised, fist shaking at sky ← must differ from #9
14. sleeping ZZZ, head tilted, eyes closed
15. nervous laugh, hand behind head, eye twitch
16. victory peace-sign, tongue out, confetti burst
Call B (sheet 2, expressions 17–32):
17. cheering both fists raised, eyes glittering, mouth wide
18. sobbing face buried in hands, shoulders shaking
19. furious vein-pop, pointing finger, leaning in
20. startled leap, arms flailing outward, pupils tiny
21. lovesick float, dreamy spiral eyes, hearts around head
22. pouty sulk, arms crossed, cheeks puffed, eyes averted
23. excited run, legs spinning wheel, sweat drops flying
24. relieved sigh, hand on chest, eyes closed in relief
25. cheerful skip, one leg up, waving hello ← must differ from #29
26. defeated head-desk slump, tiny sweat rivers
27. determined fist-pump, one eye closed, cape flutter
28. panicked sprint, eyes wide, papers scattering
29. grumpy arms-crossed side-eye, tapping foot impatiently ← must differ from #25
30. yawning stretch, arms out, eyes teary from yawn
31. embarrassed covering ears, blushing furiously, hunched
32. triumphant pose, foot on imaginary podium, sparkle burst
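The diversity rules above can be checked mechanically before any prompt is sent. The sketch below is one possible lint, not part of the skill's scripts: `lint_expressions` is a hypothetical helper, and it assumes you hand-tag each of the 32 expressions with one of the six axis names.

```python
from collections import Counter


def lint_expressions(expressions, axes, cols=4, sheet_rows=4, max_per_axis=6):
    """Return a list of problems; an empty list means the layout passes.

    expressions: prompt strings in cell order (1..32).
    axes: one hand-assigned axis label per expression (an assumption of
    this sketch; the skill itself does not define such tags).
    """
    problems = []
    if len(expressions) != len(axes):
        return ["expressions and axes differ in length"]
    # Rule: no axis may appear more than max_per_axis times.
    for axis, n in Counter(axes).items():
        if n > max_per_axis:
            problems.append(f"axis {axis!r} used {n} times")
    # Rule: no expression repeated, including across Call A and Call B.
    normalized = [e.strip().lower() for e in expressions]
    for text, n in Counter(normalized).items():
        if n > 1:
            problems.append(f"expression repeated {n} times: {text!r}")
    # Rule: column-1 cells of rows 3 and 4 in each sheet must differ in axis.
    per_sheet = cols * sheet_rows
    for start in range(0, len(axes), per_sheet):
        a, b = start + 2 * cols, start + 3 * cols  # cells 9/13, then 25/29
        if b < len(axes) and axes[a] == axes[b]:
            problems.append(f"cells {a + 1} and {b + 1} share axis {axes[a]!r}")
    return problems


AXES = ["joy", "sadness", "anger", "surprise", "shy", "calm"]
expressions = [f"placeholder pose {i + 1}" for i in range(32)]
axes = [AXES[i % 6] for i in range(32)]
print(lint_expressions(expressions, axes))  # []
```

The lint only covers what can be counted. Whether two poses are visually distinct still needs the side-by-side check described above.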
Gemini does not divide the canvas into equal cells. Row/column heights vary (e.g., top row 550px vs. bottom row 480px). Hard image_size / 4 cuts cause sticker overflow into adjacent cells.
After stitching the two 4×4 sheets into one 4×8 canvas, run _find_cuts() on the combined image. Pass n_rows=8, n_cols=4.
Fix: compute per-row and per-column dark-outline occupancy profiles instead of white-fraction profiles. True inter-cell gaps have zero or near-zero dark pixels because the black outline disappears entirely in the white gutter, while white hair or clothing inside a sticker still leaves some outline pixels somewhere on that same row/column.
```python
import numpy as np
from scipy.ndimage import binary_dilation

# rgb: H×W×3 uint8 array of the stitched white-background sheet
dist = np.max(np.abs(rgb.astype(np.int16) - 255), axis=2)  # 0 = white
dark = binary_dilation(dist > OUTLINE_THRESH, iterations=1)  # OUTLINE_THRESH = 150
row_profile = dark.sum(axis=1)  # dark outline count per row
col_profile = dark.sum(axis=0)  # dark outline count per col
# find contiguous near-zero-dark runs, then select the deepest valleys
# under broad cell-size sanity bounds instead of equal-spacing fallback
```
If a gutter is noisy and no exact zero-dark run survives, fall back to the lowest dark-count valleys in the smoothed profile. Cells are saved as cells/01_*.png … cells/32_*.png (512×512). See scripts/key_alpha.py: _find_cuts().
See scripts/key_alpha.py. Core algorithm:
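```python
import numpy as np
from PIL import Image, ImageFilter
from scipy.ndimage import label

WHITE_TOL = 28
rgb = np.asarray(Image.open("sheet_white.png").convert("RGB"))
# dist[y, x] = max channel delta from pure white (0 = white, 255 = farthest)
dist = np.max(np.abs(rgb.astype(np.int16) - 255), axis=2)
near_white = dist < WHITE_TOL
lbl, _ = label(near_white)  # connected components of near-white pixels
# components touching any image border are background (label 0 = not near-white)
border = np.concatenate([lbl[0, :], lbl[-1, :], lbl[:, 0], lbl[:, -1]])
bg_mask = np.isin(lbl, np.setdiff1d(border, [0]))
alpha = np.where(bg_mask, 0, 255).astype(np.uint8)
# feather the boundary: Gaussian blur (radius 1.2) on the alpha channel
alpha_img = Image.fromarray(alpha).filter(ImageFilter.GaussianBlur(radius=1.2))
```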
Interior white pixels (shirt, skin highlights) are enclosed by foreground and never reach the border — they stay opaque.
White clothing leak: if the outline has 1–2 px gaps, the flood leaks through into white fabric. Fix: dilate dark outline pixels (dist > OUTLINE_THRESH=150) by OUTLINE_DILATE=2 iterations before flood-fill to plug gaps.
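The leak and its fix can be reproduced on a synthetic image. The following is a sketch of the idea only, assuming the dilated outline is simply subtracted from the floodable region; `background_mask` is a hypothetical name and the real scripts/key_alpha.py may structure this differently.

```python
import numpy as np
from scipy.ndimage import binary_dilation, label

WHITE_TOL = 28
OUTLINE_THRESH = 150
OUTLINE_DILATE = 2


def background_mask(rgb, dilate_iters=OUTLINE_DILATE):
    """True where a pixel belongs to border-connected near-white background."""
    dist = np.max(np.abs(rgb.astype(np.int16) - 255), axis=2)
    floodable = dist < WHITE_TOL
    if dilate_iters > 0:
        # Thicken the dark outline so 1-2 px gaps no longer let the flood in.
        outline = binary_dilation(dist > OUTLINE_THRESH, iterations=dilate_iters)
        floodable &= ~outline
    lbl, _ = label(floodable)
    border = np.concatenate([lbl[0, :], lbl[-1, :], lbl[:, 0], lbl[:, -1]])
    return np.isin(lbl, np.setdiff1d(border, [0]))


# White canvas, black square outline 2 px thick, white "fabric" inside,
# and a 1 px gap punched through the top edge of the outline.
rgb = np.full((40, 40, 3), 255, dtype=np.uint8)
rgb[10:30, 10:30] = 0
rgb[12:28, 12:28] = 255
rgb[10:12, 20] = 255

leaky = background_mask(rgb, dilate_iters=0)
plugged = background_mask(rgb)
print(leaky[20, 20], plugged[20, 20])  # True False
```

Without dilation the interior pixel at (20, 20) is classified as background and would turn transparent. With two dilation iterations the gap is sealed and the interior stays opaque. One side effect of this sketch is a thin ring of near-white pixels hugging the outline that is no longer flooded; how the actual script treats that ring is not shown here.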
Gemini 3 image models have three transient failure modes:
| Symptom | Action |
|---|---|
| 503 UNAVAILABLE / 429 RESOURCE_EXHAUSTED | Exponential back-off (2^attempt * 5s), 6 retries |
| FinishReason.MALFORMED_FUNCTION_CALL | Shorten/simplify prompt; remove negative clauses |
| resp.parts is None, only text returned | Retry; tighten the lock clause |
See scripts/generate.py for the full retry wrapper.
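For readers without the script at hand, the back-off schedule from the first table row can be sketched generically. This is not the wrapper in scripts/generate.py: `TransientError` and `call_with_backoff` are hypothetical names, and mapping the SDK's actual 503/429 exceptions onto `TransientError` is left to the caller.

```python
import time


class TransientError(Exception):
    """Stand-in for 503 UNAVAILABLE / 429 RESOURCE_EXHAUSTED (hypothetical)."""


def call_with_backoff(fn, retries=6, base_delay=5.0, sleep=time.sleep):
    """Call fn(); on TransientError wait 2**attempt * base_delay, then retry.

    One initial attempt plus up to `retries` retries; the last failure is
    re-raised to the caller.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == retries:
                raise
            sleep(2 ** attempt * base_delay)


# Demo with a fake sleep so nothing actually blocks
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] <= 3:
        raise TransientError("503 UNAVAILABLE")
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [5.0, 10.0, 20.0]
```

Injecting `sleep` keeps the wrapper testable; in real use the default `time.sleep` applies and the full schedule is 5, 10, 20, 40, 80 and 160 seconds.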
Two separate Gemini calls will often produce slightly different rendering — brightness, line weight, palette temperature, or shading style may drift. Control it with the following steps:
Copy the exact art-style paragraph and character-lock paragraph from Call A into Call B without any edits. Even synonym substitutions (circular vs round) can shift Gemini's style.
Pass sheet_white_a.png alongside CHAR_REF as the image input for Call B. Gemini will use it as a visual anchor. Example (google-genai SDK; PIL_to_part is a helper, not shown here, that wraps a PIL image as a Part):

```python
from google.genai import types

parts = [
    PIL_to_part(char_ref_img),
    PIL_to_part(sheet_a_img),  # style anchor
    types.Part(text=prompt_b),
]
```
After stitching, compute the mean luminance of the top half (sheet A cells) and bottom half (sheet B cells). If they differ by more than 8 luma units, apply a PIL.ImageEnhance.Brightness correction that scales the bottom half (sheet B) toward the top half's mean before writing sheet_white.png.

```python
from PIL import Image, ImageEnhance
import numpy as np


def mean_luma(img_crop):
    arr = np.asarray(img_crop.convert("L")).astype(float)
    return arr.mean()


combined = Image.open("sheet_white.png")
h = combined.height // 2  # assumes both sheets have the same height
luma_a = mean_luma(combined.crop((0, 0, combined.width, h)))
luma_b = mean_luma(combined.crop((0, h, combined.width, combined.height)))
if abs(luma_a - luma_b) > 8:
    # brighten or darken the bottom half toward the top half's mean
    factor = luma_a / luma_b
    bottom = combined.crop((0, h, combined.width, combined.height))
    bottom = ImageEnhance.Brightness(bottom).enhance(factor)
    combined.paste(bottom, (0, h))
combined.save("sheet_white.png")
```
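Open sheet_white.png and scan the boundary between rows 4 and 5 for visible drift in shading style, hair color, overall brightness, and outline weight.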
If drift is visible, re-run Call B with a slightly adjusted prompt (e.g., add "same warm creamy pastel palette as reference sheet") and a fresh seed. Repeat QC.
| Symptom | Cause | Fix |
|---|---|---|
| Sheet B looks sketch-like, less shaded | Gemini skipped cel-shading | Add "flat cel shading, no sketch lines" to Call B prompt |
| Hair color changes between row 4 and row 5 | Character lock didn't persist | Repeat exact hex color code in Call B character lock |
| Sheet B overall darker | Different generation context | Apply brightness correction (step 3) |
| Outline weight visibly thinner in sheet B | Gemini style variance | Add "thick uniform bold black ink outline, same line weight as reference" |
| Mistake | Fix |
|---|---|
| Vague accessories (red ribbon) | Spell out species/shape (pink sakura flower, 5 petals) |
| Uniform expressions (all sad-variants) | Mix action verbs: raised fist, palm-push, tilted head, waterfall tears, thumbs-up |
| Cells 9 and 13 look identical | They share column-1; assign different emotion axis and body posture to each — see Expression Diversity Rules |
| Grid lines appear in output | Add seamless pure white, no grid lines, no cell borders to prompt |
| Hair color drifts across cells | Repeat exact color spec as first item in "character lock" |
| image.size AttributeError | part.as_image() returns genai Image, not PIL; convert via Image.open(io.BytesIO(img.image_bytes)) |
| Adjacent sticker bleeds into cell | Gemini grid is uneven; use _find_cuts() dark-profile detection, not image_size // 4 |
| White clothing becomes transparent | Outline gaps let flood reach fabric; set OUTLINE_DILATE=2 to dilate outline before flood-fill |
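The image.size fix from the table can be wrapped in a small helper. This is a sketch assuming only what the table states, namely that the returned object exposes its encoded data as `image_bytes`; `to_pil` is a hypothetical name and the demo uses a stand-in object, not a real API response.

```python
import io
from types import SimpleNamespace

from PIL import Image


def to_pil(genai_image):
    """Decode an object carrying encoded image data in .image_bytes into PIL."""
    return Image.open(io.BytesIO(genai_image.image_bytes))


# Stand-in for the object returned by part.as_image()
buf = io.BytesIO()
Image.new("RGB", (8, 6), "white").save(buf, format="PNG")
fake = SimpleNamespace(image_bytes=buf.getvalue())

pil_img = to_pil(fake)
print(pil_img.size)  # (8, 6)
```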
Each sticker set needs three additional assets. Generate with scripts/generate_extras.py:
uv run generate_extras.py <sticker_dir> <char_ref_image> "<theme hint>"
| Asset | Spec | How produced |
|---|---|---|
| banner.png | 750×400 PNG, colorful bg, no text | Gemini image-to-image, 16:9 → center-crop |
| cover.png | 240×240 transparent PNG, half/full body | Cell 07 (thumbs-up) resized with PIL |
| icon.png | 50×50 transparent PNG, head shot | Cell 07 full cell resized (no crop; chibi proportions fit naturally) |
Theme hints by set:
- "snowy winter wonderland"
- "golden autumn ginkgo forest"
- "sunny summer beach ocean waves"
- "cherry blossom spring school campus"
- "red plum blossom Japanese garden"

WeChat rules:
scripts/generate_banner.py produces a standalone banner for any registered platform — independent of the WeChat extras flow. WeChat itself is one preset; Twitter/X is another. New platforms are added by appending to the PLATFORMS dict.
uv run generate_banner.py <char_ref> "<theme>" <out_path> --platform twitter
| Platform | Size | Gemini AR | Crop |
|---|---|---|---|
| wechat | 750×400 | 16:9 | minimal (~6% vertical trim) |
| twitter | 1500×500 | 16:9 | aggressive (middle 56% only) |
Each preset is a Platform(width, height, aspect_ratio, composition_hint). The hint is appended to the prompt and is the place to encode platform-specific safe-zone layout (avatar overlay, mobile crop, letterbox bars). WeChat needs no hint; Twitter needs explicit letterbox awareness — see below.
Gemini 3 image preview only supports a fixed aspect-ratio menu (1:1, 4:3, 3:4, 16:9, 9:16). For ratios more extreme than 16:9 (Twitter 3:1, LinkedIn 4:1) we generate at 16:9 then center-crop the middle band. The naive prompt strategy fails in two opposite ways:
| Prompt told Gemini | Failure mode |
|---|---|
| "Subjects must fit inside the 22%–78% band" | Gemini adds safety margin → chibis end up at ~35%–65% → tiny in final frame |
| "Subjects must fill the canvas vertically" | Gemini puts hair at 5% of source canvas → heads land in the cropped-off top 22% → decapitation |
The fix is to tell Gemini that the top/bottom 22% are off-screen letterbox bars (showing only background overflow), and place subjects in the middle 25%–75% band of the source canvas. This produces full vertical fill in the final visible frame plus a 3% safety margin against the crop line on each side.
The Twitter preset's composition_hint encodes this verbatim. When adding a new extreme-aspect platform (e.g. LinkedIn 4:1 → middle 44% of 16:9 source), compute the letterbox percentages from the math and follow the same structure.
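The letterbox arithmetic for a new platform can be sketched as below. `visible_band` is a hypothetical helper, and it assumes a full-width center crop from a 16:9 source. Under that assumption the exact figure for 3:1 comes out near 59%, slightly wider than the 56% band quoted above, so the 22% bars told to Gemini err on the cautious side.

```python
from fractions import Fraction


def visible_band(source_ar, target_ar):
    """Return (visible fraction of source height, fraction of each bar).

    Aspect ratios are (width, height) pairs. Assumes a full-width center
    crop, so the target must be at least as wide as the source.
    """
    visible = Fraction(*source_ar) / Fraction(*target_ar)
    if visible > 1:
        raise ValueError("target aspect ratio is narrower than the source")
    bar = (1 - visible) / 2
    return float(visible), float(bar)


for name, target in [("twitter", (3, 1)), ("linkedin", (4, 1)), ("wechat", (750, 400))]:
    visible, bar = visible_band((16, 9), target)
    print(name, round(visible, 3), round(bar, 3))
# twitter 0.593 0.204
# linkedin 0.444 0.278
# wechat 0.948 0.026
```

When writing a composition hint for a new preset, a reasonable pattern is to round the bar percentage up and the subject band inward, so a small safety margin remains against the real crop line.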
# α channel: expect ~30-40% transparent pixels for a sticker sheet
python -c "
from PIL import Image; import numpy as np
a = np.asarray(Image.open('sheet_transparent.png').convert('RGBA'))[...,3]
print('min', a.min(), 'max', a.max(), 'transparent%', (a<10).mean()*100)
"
# Cell sizes must all be square: expect a single size, (512, 512)
python -c "import glob; from PIL import Image; print({Image.open(p).size for p in glob.glob('cells/*.png')})"