llm-inference-optimization

Stars: 2

Forks: 2

Updated: February 9, 2026, 00:33

Quantization, caching, batching, and serving optimization for LLM inference.

Source

cgyudistira

cgyudistira/agentkit

LLM Inference Optimization

Quantization, caching, batching, and serving optimization for LLM inference.

When to Use

Use this skill when working on AI engineering tasks related to LLM inference optimization: quantization, caching, batching, and serving.

Key Concepts

Best Practices: Follow industry standards
Implementation: Step-by-step guidance
Examples: Real-world applications

Guidelines

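Start by understanding the requirements: target latency, throughput, memory budget, and acceptable quality loss
Apply proven patterns such as quantization, caching, and batching before resorting to ad hoc tuning
Test and validate results: measure speed and output quality before and after each change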
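
As an illustration of the last two guidelines, here is a minimal sketch of one common quantization pattern, symmetric per-tensor int8 quantization, together with a check on the error it introduces. It is not taken from the skill itself: the function names (`quantize_int8`, `dequantize`, `max_abs_error`) and the sample weights are hypothetical, and a real deployment would rely on a quantization library and on task-level quality metrics, not on a raw weight error.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization so that w is approximately q * scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]


def max_abs_error(a: List[float], b: List[float]) -> float:
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical sample weights, used only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
error = max_abs_error(weights, restored)

# Validation step: rounding to the nearest code bounds the error by scale / 2.
assert error <= scale / 2 + 1e-9
print(q, scale, error)
```

The assertion encodes the validation guideline: each optimization is paired with an explicit, measurable bound, not assumed to be harmless.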

More Skills from the Same Repository

affiliate-marketing

cgyudistira/agentkit

Affiliate program strategy, link optimization, and commission maximization.

2026-02-09 · 2

ai-code-generation

cgyudistira/agentkit

Code generation with LLMs, code review automation, and AI pair programming.

2026-02-09 · 2

ai-safety-alignment

cgyudistira/agentkit

RLHF, constitutional AI, safety evaluation, and alignment techniques.

2026-02-09 · 2

astro-sites

cgyudistira/agentkit

Astro static site generation, islands architecture, and content collections.

2026-02-09 · 2

brand-identity

cgyudistira/agentkit

Logo design, brand guidelines, and visual identity systems.

2026-02-09 · 2

case-study-writing

cgyudistira/agentkit

Compelling case studies that showcase results and drive conversions.

2026-02-09 · 2