llm-inference-optimization

Stars: 2

Forks: 2

Updated February 9, 2026 at 00:33

Quantization, caching, batching, and serving optimization for LLM inference.

Source

cgyudistira

cgyudistira/agentkit

Related occupations (SOC)

Based on SOC occupation classification

Software Developers · Computer and Mathematical Occupations · SOC 15-1252

SKILL.md

name	LLM Inference Optimization
description	Quantization, caching, batching, and serving optimization for LLM inference.

LLM Inference Optimization

Quantization, caching, batching, and serving optimization for LLM inference.
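As a rough illustration of the first topic named above, the sketch below shows symmetric per-tensor int8 quantization in plain Python. It is a minimal example under stated assumptions, not the method used by this repository or by any particular library; the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Illustrative weights only; real tensors would come from a model checkpoint.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale, which is why a single outlier weight (which inflates the scale) can hurt accuracy for the whole tensor.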

When to Use

Use this skill when working on AI engineering tasks related to LLM inference optimization, such as quantization, caching, batching, and serving.

Key Concepts

Best Practices: Follow industry standards
Implementation: Step-by-step guidance
Examples: Real-world applications
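One possible example of the caching and batching ideas, sketched under simplifying assumptions: an exact-match response cache in front of a model call, with cache misses grouped into fixed-size batches. This is not a KV cache and not the API of any serving framework; `run_batched` and `fake_model` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


def run_batched(
    prompts: List[str],
    model_fn: Callable[[List[str]], List[str]],
    cache: Dict[str, str],
    max_batch_size: int = 4,
) -> List[str]:
    """Answer prompts, reusing cached outputs and batching the cache misses."""
    # Deduplicate misses while preserving first-seen order.
    misses = [p for p in dict.fromkeys(prompts) if p not in cache]
    for start in range(0, len(misses), max_batch_size):
        batch = misses[start:start + max_batch_size]
        for prompt, output in zip(batch, model_fn(batch)):
            cache[prompt] = output
    return [cache[p] for p in prompts]


calls: List[int] = []  # records the size of each batch sent to the model


def fake_model(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for a real model call."""
    calls.append(len(batch))
    return [p.upper() for p in batch]


cache: Dict[str, str] = {}
first = run_batched(["a", "b", "a", "c", "d", "e"], fake_model, cache)
second = run_batched(["a", "e"], fake_model, cache)  # served entirely from cache
```

Here the six prompts contain five unique strings, so the stand-in model is called twice (batches of four and one), and the second request triggers no model call at all.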

Guidelines

Start by understanding the requirements
Apply proven patterns such as quantization, caching, and batching
Test and validate results against an unoptimized baseline
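The last guideline can be sketched as a small harness that times a call and compares optimized outputs against a baseline. This is a minimal sketch using only the standard library; the helper names `benchmark` and `max_abs_diff`, the sample outputs, and the workload being timed are all assumptions for illustration.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def benchmark(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> Dict[str, float]:
    """Time repeated calls to fn and report median and worst-case latency in seconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurements
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {"median_s": statistics.median(samples), "max_s": max(samples)}


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise gap between baseline and optimized outputs."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


# Hypothetical outputs: a baseline and an optimized variant to compare.
baseline_out = [0.10, 0.20, 0.70]
optimized_out = [0.11, 0.19, 0.70]

# Placeholder workload standing in for a real inference call.
stats = benchmark(lambda: sum(x * x for x in range(1000)))
drift = max_abs_diff(baseline_out, optimized_out)
```

Reporting the median alongside the maximum helps separate typical latency from tail latency, and the drift value gives a simple check that an optimization has not changed outputs more than is acceptable.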
