$pwd:

lds-optimization

// Optimize LDS (Local Data Share, i.e. shared memory) access patterns in FlyDSL GPU kernels. Diagnose bank conflicts and long lgkmcnt stalls from ATT trace data, then apply a swizzled or padded layout to eliminate the conflicts. Also increase the distance between an LDS write and the subsequent LDS read to hide LDS latency. An LDS read that follows an LDS write always requires a sync (s_waitcnt lgkmcnt or s_barrier). Use when trace analysis shows ds_read/ds_write/lgkmcnt as a bottleneck. Usage: /lds-optimization
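
The effect of padding and swizzling on bank conflicts can be illustrated with a small host-side model. This is only a sketch in plain Python, not FlyDSL code: the bank count (32), the bank width (4 bytes), the 32x32 tile, and the helper names `bank_of`, `conflict_degree`, `linear`, and `swizzled` are all assumptions chosen for illustration, and real hardware may service a wavefront in several phases that this model ignores.

```python
# Simplified LDS bank model (assumed parameters, not taken from FlyDSL).
NUM_BANKS = 32   # assumption: 32 banks
BANK_WIDTH = 4   # assumption: each bank is 4 bytes (one dword) wide


def bank_of(byte_addr):
    """Bank that a byte address falls into under the assumed model."""
    return (byte_addr // BANK_WIDTH) % NUM_BANKS


def conflict_degree(byte_addrs):
    """Largest number of distinct dwords that land in the same bank.

    1 means conflict-free; N means the access is serialized N ways.
    """
    per_bank = {}
    for addr in byte_addrs:
        per_bank.setdefault(bank_of(addr), set()).add(addr // BANK_WIDTH)
    return max(len(dwords) for dwords in per_bank.values())


def linear(row, col, row_stride):
    """Row-major layout; row_stride is in dwords (32 = plain, 33 = padded)."""
    return (row * row_stride + col) * BANK_WIDTH


def swizzled(row, col, row_stride):
    """XOR swizzle: permute the column index by the row index."""
    return (row * row_stride + (col ^ (row % NUM_BANKS))) * BANK_WIDTH


def column_read(layout, row_stride, lanes=32):
    """Each lane reads column 0 of a different row (a transposed access)."""
    return [layout(lane, 0, row_stride) for lane in range(lanes)]


plain = conflict_degree(column_read(linear, 32))    # every lane hits bank 0
padded = conflict_degree(column_read(linear, 33))   # one dword of padding per row
swz = conflict_degree(column_read(swizzled, 32))    # same footprint as plain

print(plain, padded, swz)
```

Under these assumptions, padding removes the conflict at the cost of one extra dword per row, while the XOR swizzle keeps the original tile footprint but requires the same index transform on both the write and the read side.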

$ git log --oneline --stat
stars:192
forks:56
updated: 2026-05-29 07:14
SKILL.md