hls

High-Level Synthesis — C/C++ algorithm analysis, HLS directive optimisation, synthesis execution, and co-simulation verification. Use when converting C/C++ to synthesisable RTL, optimising for latency/throughput/area targets using pragmas, or verifying that generated RTL matches the golden C model.

Install command

npx skills add https://github.com/chuanseng-ng/digital-chip-design-agents --skill hls

Copy this command and paste it into Claude Code to install the skill.

Source

chuanseng-ng/digital-chip-design-agents

Stars: 140

Forks: 36

Updated: 2026-05-31 01:04


name	hls
description	High-Level Synthesis — C/C++ algorithm analysis, HLS directive optimisation, synthesis execution, and co-simulation verification. Use when converting C/C++ to synthesisable RTL, optimising for latency/throughput/area targets using pragmas, or verifying that generated RTL matches the golden C model.
version	1.0.0
author	chuanseng-ng
license	MIT
allowed-tools	Read, Write, Bash

Skill: High-Level Synthesis (HLS)

Invocation

If invoked by a user presenting an HLS task: immediately spawn the digital-chip-design-agents:hls-orchestrator agent and pass the full user request and any available context. Do not execute stages directly.
If invoked by the hls-orchestrator mid-flow: do not spawn a new agent. Treat this file as read-only — return the requested stage rules, sign-off criteria, or loop-back guidance to the calling orchestrator.

Spawning the orchestrator from within an active orchestrator run causes recursive delegation and must never happen.

Pre-run Context

Before executing or advising on any stage, read the following files if they exist:

memory/hls/knowledge.md — known failure patterns, successful tool flags, PDK/tool quirks. Incorporate its guidance into every stage decision. If absent, proceed without it.
memory/hls/run_state.md — current run identity (run_id, design_name, tool, last_stage). Use this to resume correctly after interruption. If absent, a new run is starting; the orchestrator will create this file before the first stage.

This pre-run read applies whether this skill is loaded by a user or called by the orchestrator mid-flow. It ensures the fix database is consulted before any diagnosis step.
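
The pre-run read can be sketched as follows. This is a minimal illustration, not the orchestrator's actual implementation: `read_if_exists`, `PreRunContext`, and `load_pre_run_context` are hypothetical names, and only the two file paths come from this skill.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <sstream>
#include <string>

// Returns the file contents, or std::nullopt when the file is absent.
// An absent file is not an error: the skill says to proceed without it.
std::optional<std::string> read_if_exists(const std::filesystem::path& p) {
    if (!std::filesystem::is_regular_file(p)) return std::nullopt;
    std::ifstream in(p);
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}

// Hypothetical container for the two optional pre-run files.
struct PreRunContext {
    std::optional<std::string> knowledge;  // memory/hls/knowledge.md
    std::optional<std::string> run_state;  // memory/hls/run_state.md
    // No run_state.md means a new run is starting.
    bool is_new_run() const { return !run_state.has_value(); }
};

PreRunContext load_pre_run_context(const std::filesystem::path& root = "memory/hls") {
    return {read_if_exists(root / "knowledge.md"),
            read_if_exists(root / "run_state.md")};
}
```

A caller would check `is_new_run()` to decide between resuming from `last_stage` and starting at algorithm_analysis.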

Purpose

Convert C/C++/SystemC algorithmic descriptions to synthesisable RTL. Covers algorithm analysis for HLS compatibility, pragma/directive optimisation, and co-simulation to verify RTL matches the golden C model.

Supported EDA Tools

Open-Source

Bambu HLS (bambu) — open-source HLS from Politecnico di Milano
LegUp HLS — FPGA-targeted HLS built on LLVM
Calyx / Futil — infrastructure for HLS compilers (academic)
MLIR/CIRCT (circt-opt) — compiler infrastructure for hardware design

Proprietary

Xilinx Vitis HLS (vitis_hls) — C/C++ to RTL for AMD/Xilinx devices
Cadence Stratus (stratus) — SystemC/C++ HLS for ASIC and FPGA
Siemens Catapult (catapult) — algorithmic synthesis from C++/SystemC

Stage: algorithm_analysis

HLS-Hostile Patterns (must fix before synthesis)

Dynamic memory (malloc/new) → replace with fixed-size static arrays
Recursive functions → convert to iterative with explicit stack
Pointer aliasing → use the restrict qualifier (C99; __restrict in C++ compilers that support it) or restructure accesses
System calls (printf, file I/O) → wrap in #ifndef __SYNTHESIS__
Function pointers → replace with switch/case dispatch
Data-dependent loop bounds → add maximum bound + early-exit flag
Floating-point → evaluate fixed-point (ap_fixed<W,I> for Vitis HLS)
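
As an illustration of two of the rewrites above (data-dependent loop bound and system calls), here is a minimal sketch. The function `bounded_sum` and the constant `MAX_LEN` are hypothetical, and the exact form of the early-exit flag is one possible reading of the rule; `__SYNTHESIS__` is the macro named in the list.

```cpp
#include <cstdio>

constexpr int MAX_LEN = 64;  // assumed worst-case length, chosen only for this example

// Sums the first `len` elements of `data`.
// The loop condition carries a compile-time upper bound (MAX_LEN), so a tool can
// derive a worst-case trip count; the `done` flag ends the loop early when the
// data-dependent length is reached.
int bounded_sum(const int data[MAX_LEN], int len) {
    int sum = 0;
    bool done = false;
    for (int i = 0; i < MAX_LEN && !done; ++i) {
        if (i >= len) {
            done = true;      // early exit: requested length reached
        } else {
            sum += data[i];
        }
    }
#ifndef __SYNTHESIS__
    // Debug output is compiled only for C simulation, never for synthesis.
    std::printf("bounded_sum: len=%d sum=%d\n", len, sum);
#endif
    return sum;
}
```

A request longer than `MAX_LEN` is silently clamped in this sketch; a real design would need to decide whether that is acceptable or should be reported.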

Analysis Steps

Identify innermost critical loop — the performance bottleneck
Analyse loop-carried dependencies — limit achievable II
Classify memory access: sequential (burst-able) vs random (expensive)
Calculate the baseline (unpipelined) loop latency: trip_count × body_latency; once the loop is pipelined, the bound becomes (trip_count − 1) × II + body_latency
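
The latency and II arithmetic behind these steps can be sketched with three small helpers. The function names are hypothetical. The formulas are the standard ones for a single loop: back-to-back iterations when unpipelined, one new iteration every II cycles when pipelined, and a recurrence bound on II from a loop-carried dependency.

```cpp
#include <cstdint>

// Unpipelined loop: every iteration finishes before the next one starts.
constexpr std::uint64_t sequential_latency(std::uint64_t trip_count,
                                           std::uint64_t body_latency) {
    return trip_count * body_latency;
}

// Pipelined loop: a new iteration starts every `ii` cycles, and the last
// iteration still needs `body_latency` cycles to drain.
constexpr std::uint64_t pipelined_latency(std::uint64_t trip_count,
                                          std::uint64_t ii,
                                          std::uint64_t body_latency) {
    return trip_count == 0 ? 0 : (trip_count - 1) * ii + body_latency;
}

// Lower bound on II from one loop-carried dependency:
// ceil(dep_latency / dep_distance), where dep_distance (>= 1) is the number
// of iterations between the producer and the consumer of the value.
constexpr std::uint64_t recurrence_min_ii(std::uint64_t dep_latency,
                                          std::uint64_t dep_distance) {
    return (dep_latency + dep_distance - 1) / dep_distance;
}
```

For example, a 1024-iteration loop with a 5-cycle body takes 5120 cycles unpipelined and 1028 cycles at II=1, but a 4-cycle dependency carried to the very next iteration limits II to 4.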

QoR Metrics to Evaluate

All HLS-hostile patterns resolved
Critical loop identified with dependency graph
Theoretical II lower bound computed

Output Required

Algorithm analysis report
Fixed-point type recommendations (if applicable)
Critical loop dependency graph

Stage: directive_planning

Pipelining and Throughput

#pragma HLS PIPELINE II=1          // Pipeline loop, target II=1
#pragma HLS DATAFLOW                // Task-level pipelining
#pragma HLS LOOP_FLATTEN            // Flatten nested loops
#pragma HLS LOOP_MERGE              // Merge sequential loops

Latency and Unrolling

#pragma HLS UNROLL factor=4        // Partial unroll (4 parallel copies)
#pragma HLS UNROLL                  // Full unroll (small trip counts only)

Memory and Interfaces

#pragma HLS ARRAY_PARTITION variable=buf cyclic factor=4
#pragma HLS INTERFACE mode=axis port=data       // AXI4-Stream
#pragma HLS INTERFACE mode=m_axi port=mem       // AXI4 master
#pragma HLS INTERFACE mode=s_axilite port=ctrl  // AXI4-Lite registers

Resource Binding

#pragma HLS BIND_OP variable=prod op=mul impl=dsp      // Force the multiply that assigns prod (placeholder name) onto a DSP
#pragma HLS ALLOCATION operation instances=mul limit=4   // Cap multiplier instances at 4

Strategy by Target

Target	Primary Directives
Low latency	UNROLL + PIPELINE II=1
High throughput	PIPELINE + DATAFLOW + ARRAY_PARTITION
Low area	ALLOCATION limits + no UNROLL
Balanced	PIPELINE II=1 inner loop + ARRAY_PARTITION
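
To show how the directives sit in source, here is a small annotated kernel. It is only a sketch: `dot16` and `N` are hypothetical, the pragma spellings follow the forms listed in this stage, and no achieved II or resource figure is claimed. A host compiler ignores the HLS pragmas (possibly with an unknown-pragma warning), so the same file also serves as the C model.

```cpp
#include <cstdint>

constexpr int N = 16;  // illustrative vector length

// Dot product of two 16-element vectors.
std::int32_t dot16(const std::int16_t a[N], const std::int16_t b[N]) {
// Split each array across 4 banks so 4 elements can be read per cycle.
#pragma HLS ARRAY_PARTITION variable=a cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=b cyclic factor=4
    std::int32_t acc = 0;
    for (int i = 0; i < N; ++i) {
// Request one new iteration per cycle; unroll by the partition factor so the
// 4 parallel multiplies match the 4 memory banks.
#pragma HLS PIPELINE II=1
#pragma HLS UNROLL factor=4
        acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    }
    return acc;
}
```

Matching the unroll factor to the partition factor is the design choice here: unrolling further without more banks would leave the extra copies waiting on memory ports.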

QoR Metrics to Evaluate

Achieved II: ≤ design_state.constraints.hls.target_ii (one of target_ii or target_latency_cycles must be set; prefer target_ii if both — see Constraint Validation section)
Latency: ≤ design_state.constraints.hls.target_latency_cycles cycles (one of target_ii or target_latency_cycles must be set)
Area: within budget
No directive synthesis errors

Output Required

Annotated source with all directives and justifications
Directive justification table

Stage: hls_synthesis

Domain Rules

Synthesise at target clock period
Check HLS report: latency, II, resource usage
Compare achieved vs target — loop back to directives if miss
Flag any warnings: unresolved dependencies, failed II, inferred latches
Verify interface protocols match system integration requirements

QoR Metrics to Evaluate

II: matches or beats design_state.constraints.hls.target_ii (one of target_ii or target_latency_cycles must be set; prefer target_ii if both)
Latency: within design_state.constraints.hls.target_latency_cycles cycles (one of target_ii or target_latency_cycles must be set)
Area: within budget
No latch inference warnings
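
The compare-and-loop-back rule can be sketched as a small decision function. All type and function names here are hypothetical. The sketch checks every target that is set, which is an assumed (stricter) reading of "prefer target_ii if both"; area and warning checks are left out.

```cpp
#include <optional>

// Targets from design_state.constraints.hls; either field may be unset.
struct HlsTargets {
    std::optional<int> target_ii;
    std::optional<int> target_latency_cycles;
};

// Figures read from the HLS synthesis report.
struct HlsReport {
    int achieved_ii;
    int latency_cycles;
};

enum class Verdict { Pass, LoopBackToDirectives };

// Returns LoopBackToDirectives when any target that is set has been missed.
Verdict compare_to_targets(const HlsReport& r, const HlsTargets& t) {
    if (t.target_ii && r.achieved_ii > *t.target_ii) {
        return Verdict::LoopBackToDirectives;
    }
    if (t.target_latency_cycles && r.latency_cycles > *t.target_latency_cycles) {
        return Verdict::LoopBackToDirectives;
    }
    return Verdict::Pass;
}
```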

Output Required

HLS synthesis report (latency, II, resource summary)
Generated RTL files
Unresolved warnings with justification

Stage: rtl_qc

Domain Rules

Run lint on HLS-generated RTL (same rules as rtl-design skill)
Verify no latches in generated RTL
Verify interface signal names match integration requirements
Check all registers reset correctly

QoR Metrics to Evaluate

Lint: 0 errors
No latches inferred
Interface ports match integration spec

Output Required

Lint report on HLS-generated RTL

Stage: cosimulation

Domain Rules

C testbench drives RTL through HLS wrapper
RTL outputs compared against C golden model automatically
Measure actual latency and II: latency must be within cosim_tolerance_pct (default 5%) of the HLS report, and II must match the report exactly
Exercise all code paths; test boundary conditions

Common Failures

Failure	Fix
Output mismatch	Check fixed-point overflow; increase bit widths
AXI handshake error	Fix INTERFACE pragma configuration
Latency differs	Verify loop bounds are static
X propagation	Initialise all variables in C source

QoR Metrics to Evaluate

Co-simulation: 100% output match with C golden model
Latency measured: within design_state.constraints.hls.cosim_tolerance_pct% of HLS report (default: 5%)
II measured: matches HLS report exactly
No simulation errors or X propagation
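
The latency and II checks above reduce to a short comparison. This is a sketch with hypothetical function names; the 5% default mirrors the stated default for cosim_tolerance_pct.

```cpp
#include <cmath>

// True when the measured latency deviates from the reported latency by at most
// tolerance_pct percent (relative to the reported value).
bool latency_within_tolerance(long measured, long reported, double tolerance_pct = 5.0) {
    if (reported <= 0) return false;  // a non-positive report cannot be compared
    double deviation_pct =
        100.0 * std::fabs(static_cast<double>(measured - reported)) /
        static_cast<double>(reported);
    return deviation_pct <= tolerance_pct;
}

// Latency gets a tolerance band; II must match the report exactly.
bool cosim_timing_ok(long measured_latency, long reported_latency,
                     int measured_ii, int reported_ii,
                     double tolerance_pct = 5.0) {
    return measured_ii == reported_ii &&
           latency_within_tolerance(measured_latency, reported_latency, tolerance_pct);
}
```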

Output Required

Co-simulation pass/fail report
Latency and II measurement log

Stage: hls_signoff

Sign-off Checklist

All HLS-hostile patterns resolved
Achieved II ≤ design_state.constraints.hls.target_ii (one of target_ii or target_latency_cycles must be set; prefer target_ii if both)
Latency ≤ design_state.constraints.hls.target_latency_cycles cycles (one of target_ii or target_latency_cycles must be set)
Area within budget
RTL QC: lint clean, no latches
Co-simulation: 100% output match; latency within design_state.constraints.hls.cosim_tolerance_pct% (default: 5%)
Interface ports match system integration spec

Output Required

HLS RTL package (generated .v/.sv files)
Co-simulation pass report
HLS QoR report (latency, II, area)
Interface documentation

Constraint Validation

See plugins/meta/skills/pipeline-orchestration/SKILL.md §Constraints Schema for the authoritative schema and stage-entry validation rule.

Required at entry (algorithm_analysis) — at least one must be non-null:

constraints.hls.target_ii — target initiation interval (one of target_ii or target_latency_cycles must be set; prefer target_ii if both)
constraints.hls.target_latency_cycles — target latency in clock cycles (one of target_ii or target_latency_cycles must be set)

Optional (schema defaults apply when absent):

constraints.hls.cosim_tolerance_pct (default: 5) — acceptable co-simulation latency deviation %
constraints.clock.clk_mhz — target clock for synthesis (used if set; otherwise tool default)
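
A minimal sketch of the entry check follows. `HlsConstraints`, `EntryCheck`, and `validate_entry` are hypothetical names; the authoritative schema lives in the pipeline-orchestration skill, so this only mirrors the three fields described here.

```cpp
#include <optional>
#include <string>

// Mirrors constraints.hls as described in this section; unset means null/absent.
struct HlsConstraints {
    std::optional<int> target_ii;
    std::optional<int> target_latency_cycles;
    std::optional<double> cosim_tolerance_pct;  // default 5 when absent
};

struct EntryCheck {
    bool ok;                      // false when neither target is set
    std::string primary_target;   // which target drives the flow
    double cosim_tolerance_pct;   // resolved value after applying the default
};

EntryCheck validate_entry(const HlsConstraints& c) {
    EntryCheck out{false, "", c.cosim_tolerance_pct.value_or(5.0)};
    if (c.target_ii) {
        // target_ii is preferred when both targets are set.
        out.ok = true;
        out.primary_target = "target_ii";
    } else if (c.target_latency_cycles) {
        out.ok = true;
        out.primary_target = "target_latency_cycles";
    }
    return out;
}
```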

Memory

Write on stage completion

After each stage completes (regardless of whether an orchestrator session is active), write or overwrite one JSON record in memory/hls/experiences.jsonl keyed by run_id. This ensures data is persisted even if the flow is interrupted or called without full orchestrator context.

Use run_id = hls_<YYYYMMDD>_<HHMMSS> (set once at flow start; reuse on each stage update). Every JSON record written must include a top-level "run_id" field whose value matches this key — this is what makes overwrites unambiguous. Set signoff_achieved: false until the final sign-off stage completes.
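
The run_id format and the overwrite-by-key behaviour might look like the sketch below. Several things are assumptions made only for illustration: the helper names, the `last_stage` field in the record (this skill does not define the record's fields beyond `run_id` and `signoff_achieved`), and the matching strategy, which relies on every line having been written by this same code so the key text is byte-identical. A real implementation would parse the JSON instead.

```cpp
#include <cstddef>
#include <ctime>
#include <string>
#include <vector>

// Formats hls_<YYYYMMDD>_<HHMMSS> from a broken-down time.
std::string make_run_id(const std::tm& t) {
    char buf[32];
    std::size_t n = std::strftime(buf, sizeof buf, "hls_%Y%m%d_%H%M%S", &t);
    return std::string(buf, n);
}

// Replaces the line whose top-level "run_id" matches, or appends a new line.
// `lines` holds the contents of experiences.jsonl, one JSON record per entry.
void upsert_record(std::vector<std::string>& lines, const std::string& run_id,
                   const std::string& last_stage, bool signoff_achieved) {
    const std::string key = "\"run_id\":\"" + run_id + "\"";
    const std::string record = "{" + key +
        ",\"last_stage\":\"" + last_stage + "\"" +
        ",\"signoff_achieved\":" + (signoff_achieved ? "true" : "false") + "}";
    for (std::string& line : lines) {
        if (line.find(key) != std::string::npos) {
            line = record;  // overwrite the existing record for this run
            return;
        }
    }
    lines.push_back(record);  // first record for this run
}
```

Calling `upsert_record` after each stage keeps exactly one line per run, with `signoff_achieved` passed as false until hls_signoff completes.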

Run state (write before first stage, update after each stage)

Write memory/hls/run_state.md as the first action before launching any tool:

run_id:      hls_<YYYYMMDD>_<HHMMSS>
design_name: <design>
tool:        <primary tool>
start_time:  <ISO-8601>
last_stage:  <first stage name>

Update last_stage after each stage completes. This file lets wakeup-loop prompts and resumed sessions identify the correct run without relying on in-memory state. Create the file and parent directories if they do not exist.
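
Writing the run-state file could be sketched as below. `RunState`, `render_run_state`, and `write_run_state` are hypothetical names; the five keys and their column alignment follow the template above.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

struct RunState {
    std::string run_id;
    std::string design_name;
    std::string tool;
    std::string start_time;  // ISO-8601 text, supplied by the caller
    std::string last_stage;
};

// Renders the five key/value lines in the layout shown in the template.
std::string render_run_state(const RunState& s) {
    return "run_id:      " + s.run_id + "\n" +
           "design_name: " + s.design_name + "\n" +
           "tool:        " + s.tool + "\n" +
           "start_time:  " + s.start_time + "\n" +
           "last_stage:  " + s.last_stage + "\n";
}

// Creates parent directories if needed, then overwrites the file.
void write_run_state(const std::filesystem::path& file, const RunState& s) {
    if (file.has_parent_path()) {
        std::filesystem::create_directories(file.parent_path());
    }
    std::ofstream out(file, std::ios::out | std::ios::trunc);
    out << render_run_state(s);
}
```

Updating `last_stage` after a stage is then a matter of changing that one field and calling `write_run_state` again with the same path, for example `memory/hls/run_state.md`.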

Optional: claude-mem index

If mcp__plugin_ecc_memory__add_observations is available in this session, emit each applied fix as an observation to entity chip-design-hls-fixes after writing to experiences.jsonl. Skip silently if the tool is absent — JSONL is the canonical record.