$pwd:

lds-optimization

// Optimize LDS (Local Data Share, i.e. shared memory) access patterns in FlyDSL GPU kernels. Diagnose bank conflicts and high lgkmcnt stalls from ATT trace data, then apply a swizzled or padded layout to eliminate the conflicts. Also increase the distance between each LDS write and the subsequent LDS read so that other work hides the LDS latency. An LDS read that follows an LDS write always requires a sync (s_waitcnt lgkmcnt or s_barrier). Use when trace analysis shows ds_read, ds_write, or lgkmcnt waits as a bottleneck. Usage: /lds-optimization
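
To illustrate why padding or swizzling removes bank conflicts, here is a minimal Python sketch of a bank model. It is not FlyDSL code and not taken from the skill itself. It assumes 32 banks of 4-byte words and a 32x32 tile of 4-byte elements read one column at a time; the helper names `max_conflict` and `column_read` are hypothetical. Real hardware services a wavefront in several phases and differs between architectures, so treat the numbers as a simplified model only.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 LDS banks, each one 4-byte word wide


def max_conflict(word_addrs):
    """Largest number of distinct word addresses that map to one bank."""
    per_bank = Counter(addr % NUM_BANKS for addr in set(word_addrs))
    return max(per_bank.values())


def column_read(row_stride, col, rows=32, swizzle=False):
    """Word addresses touched when lane `row` reads element (row, col)."""
    addrs = []
    for row in range(rows):
        # XOR swizzle: permute the column index by the row index so that
        # consecutive rows land in different banks without extra storage.
        c = col ^ (row % NUM_BANKS) if swizzle else col
        addrs.append(row * row_stride + c)
    return addrs


# Row stride of 32 words: every row of a column falls in the same bank.
naive = max_conflict(column_read(32, col=0))
# One padding word per row (stride 33) shifts each row by one bank.
padded = max_conflict(column_read(33, col=0))
# Same storage as the naive layout, with XOR-swizzled columns.
swizzled = max_conflict(column_read(32, col=0, swizzle=True))

print(naive, padded, swizzled)  # prints "32 1 1"
```

In this model padding costs one extra word per row, while the XOR swizzle keeps the original footprint but requires the same index transform on both the write and the read side.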

$ git log --oneline --stat
stars:192
forks:56
updated:May 29, 2026, 07:14
SKILL.md