Skip to main content
Manusで任意のスキルを実行
ワンクリックで
$pwd:

intel-esimd-fuse

// Expert guidance for implementing fused multi-operation kernels on Intel GPUs using ESIMD. Use this skill whenever the user needs to fuse multiple operations into a single kernel pass to minimize memory traffic, such as softmax + top-K + normalize, or any pipeline that chains reduction, selection, and normalization in one kernel. Also trigger for ESIMD softmax implementation, vectorized exp on simd<float,N> for a full row, detail::sum vs reduce pitfall (reduce silently returns 0), fused attention block selection with probability normalization, or any kernel that computes softmax probabilities and immediately selects the top-K entries. The main example is the fused softmax+topk+normalize V2 variant achieving 43.2 GB/s (43% bandwidth utilization) for seq_len=32K, N=128, K=8.

$ git log --oneline --stat
stars:2,318
forks:204
updated:2026年4月17日 07:14
SKILL.md
readonly