Name: Redis Semantic Cache
Author: redis

stars:64

forks:17

updated: May 27, 2026, 07:58

SKILL.md

name	redis-semantic-cache
description	Redis LangCache guidance for semantic caching of LLM responses on Redis Cloud — calling search/set via the SDK or REST API, tuning the similarity threshold, separating caches per task type, and filtering with custom attributes. Use when caching LLM completions or RAG answers to cut API cost and latency, building a cache-aside layer in front of OpenAI / Anthropic / etc., tuning hit rate vs precision, or splitting one app's LLM workloads into multiple LangCache caches.
license	MIT
metadata	{"author":"Redis, Inc.","version":"0.1.0"}

Redis Semantic Cache

Semantic caching for LLM responses with Redis Cloud's LangCache service. LangCache stores prompts as embeddings, so later prompts that are semantically similar return the cached response without calling the model again.

LangCache is currently in preview on Redis Cloud. Features and behavior may change.

When to apply

Wrapping an LLM call (OpenAI, Anthropic, etc.) with a cache layer to cut cost and latency.
Caching RAG answers, classification outputs, or any deterministic LLM workload.
Tuning the precision/hit-rate trade-off for a semantic cache.
Splitting one application's LLM workloads across multiple cache instances.

1. The cache-aside flow

LangCache fits in front of any LLM call as a standard cache-aside pattern:

Send the user's prompt to LangCache's search.
Cache hit — return the stored response directly.
Cache miss — call the LLM, then set the response so future similar prompts hit.

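import os

from langcache import LangCache

# Connection details come from the environment.
lang_cache = LangCache(
    server_url=f"https://{os.getenv('HOST')}",
    cache_id=os.getenv("CACHE_ID"),
    api_key=os.getenv("API_KEY"),
)

prompt = "What is Redis?"

# 1. Look for a semantically similar cached prompt.
result = lang_cache.search(prompt=prompt, similarity_threshold=0.9)
if result:
    # 2. Cache hit: reuse the stored response.
    response = result[0]["response"]
else:
    # 3. Cache miss: call the model (`llm` stands for your own LLM client),
    #    then store the answer so future similar prompts hit.
    response = llm.generate(prompt)
    lang_cache.set(prompt=prompt, response=response)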

The same operations are available via REST (POST /v1/caches/{cacheId}/entries/search and POST /v1/caches/{cacheId}/entries) when an SDK isn't an option.
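As a rough illustration of the REST route, the sketch below builds the two POST requests with only the Python standard library and does not send them. The endpoint paths come from the text above. The JSON field names (`prompt`, `similarityThreshold`, `response`) and the Bearer `Authorization` header are assumptions that should be checked against the LangCache API reference, and the host, cache ID, and key are placeholders.

```python
import json
import urllib.request


def _post(url, api_key, body):
    """Build a JSON POST request (assumed Bearer-token auth)."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_search_request(host, cache_id, api_key, prompt, similarity_threshold=0.9):
    """Request for POST /v1/caches/{cacheId}/entries/search."""
    # Field names below are assumed, not confirmed by this document.
    body = {"prompt": prompt, "similarityThreshold": similarity_threshold}
    return _post(f"https://{host}/v1/caches/{cache_id}/entries/search", api_key, body)


def build_set_request(host, cache_id, api_key, prompt, response):
    """Request for POST /v1/caches/{cacheId}/entries."""
    body = {"prompt": prompt, "response": response}
    return _post(f"https://{host}/v1/caches/{cache_id}/entries", api_key, body)


req = build_search_request("cache.example.com", "my-cache", "secret", "What is Redis?")
print(req.get_method(), req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen` and decode the JSON body of the reply.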

See references/langcache-usage.md for full SDK + REST samples and attribute-based storage.

2. Tune the similarity threshold

The threshold controls how similar (by embedding cosine similarity) a new prompt must be to a cached one to count as a hit. A higher threshold means a stricter match and fewer false positives; a lower threshold means more hits but more risk of returning an off-topic answer.
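To make the comparison concrete, here is a minimal sketch of cosine similarity between two vectors (higher means closer) and the hit test against a threshold. LangCache computes embeddings and similarity on the server side; the two-dimensional vectors and the helper names `cosine_similarity` and `is_hit` are hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_hit(similarity, threshold):
    """A cached entry counts as a hit when it is at least as similar as the threshold."""
    return similarity >= threshold


# Toy "embeddings" (made up for illustration): one cached prompt, two new prompts.
cached = [1.0, 0.0]
paraphrase = [0.9, 0.4]   # points in nearly the same direction
unrelated = [0.0, 1.0]    # orthogonal to the cached prompt

for name, vec in (("paraphrase", paraphrase), ("unrelated", unrelated)):
    sim = cosine_similarity(cached, vec)
    print(name, round(sim, 3), is_hit(sim, 0.9), is_hit(sim, 0.95))
```

In this toy case the paraphrase clears a 0.9 threshold but not 0.95, which is the kind of borderline match the threshold setting decides.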

Threshold	Behavior	Use when
0.95+	Near-exact match required	Customer-facing answers where wrong responses are costly
0.9	Balanced default	Most workloads — start here
0.8	Loose semantic match	Internal tools, exploratory queries, FAQ deduplication

# Stricter — fewer false positives
result = lang_cache.search(prompt="What is Redis?", similarity_threshold=0.95)

# Looser — higher hit rate
result = lang_cache.search(prompt="What is Redis?", similarity_threshold=0.8)

Adjust by watching the actual cache-hit rate and spot-checking that returned answers are still relevant.
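One way to do that spot-check, sketched under assumptions: log the best-match similarity for a sample of prompts, have someone mark whether the cached answer would have been acceptable, and compare hit rate and precision at each candidate threshold. The `Sample` record, the `evaluate` helper, and the six data points are all made up for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    similarity: float  # best-match similarity logged for one prompt
    relevant: bool     # reviewer verdict: was the cached answer acceptable?


def evaluate(samples, threshold):
    """Return (hit_rate, precision) if the cache had used this threshold."""
    hits = [s for s in samples if s.similarity >= threshold]
    hit_rate = len(hits) / len(samples) if samples else 0.0
    # With no hits, no wrong answer was served, so precision is reported as 1.0.
    precision = sum(s.relevant for s in hits) / len(hits) if hits else 1.0
    return hit_rate, precision


# Hypothetical review log.
samples = [
    Sample(0.97, True),
    Sample(0.93, True),
    Sample(0.91, False),
    Sample(0.85, True),
    Sample(0.82, False),
    Sample(0.60, False),
]

for threshold in (0.95, 0.9, 0.8):
    hit_rate, precision = evaluate(samples, threshold)
    print(f"threshold={threshold:.2f} hit_rate={hit_rate:.2f} precision={precision:.2f}")
```

On this made-up log, lowering the threshold raises the hit rate while precision falls, which is the trade-off to weigh per workload.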

See references/best-practices.md.

3. Separate caches per task type

Different LLM workloads should not share one cache. A code question is semantically close to other code questions but unrelated to a password-reset support query, and a cross-task hit returns an irrelevant answer.

support_cache = LangCache(server_url=..., cache_id="support-cache-id", api_key=...)
code_cache    = LangCache(server_url=..., cache_id="code-cache-id",    api_key=...)

Create distinct cache IDs in Redis Cloud per task, and route each call to the right one. As a finer-grained alternative, store and search with custom attributes (e.g. {"category": "database"}) to keep tasks in the same cache but isolated by attribute filter — useful when the same prompt format spans subtopics.
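The routing idea can be sketched as a lookup from task type to cache, plus an optional attribute filter. `InMemoryCache` below is a hypothetical exact-match stand-in, not the LangCache client, and `cached_answer`, the task names, and the `attributes` argument shape are assumptions made so the example runs without a Redis Cloud account.

```python
class InMemoryCache:
    """Hypothetical stand-in for a cache client: exact prompt match only."""

    def __init__(self):
        self._entries = []  # (prompt, response, attributes)

    def search(self, prompt, attributes=None):
        wanted = attributes or {}
        for stored_prompt, response, attrs in self._entries:
            # Every requested attribute must match the stored entry.
            if stored_prompt == prompt and all(attrs.get(k) == v for k, v in wanted.items()):
                return response
        return None

    def set(self, prompt, response, attributes=None):
        self._entries.append((prompt, response, dict(attributes or {})))


def cached_answer(caches, task, prompt, generate, attributes=None):
    """Cache-aside lookup against the cache that belongs to this task type."""
    cache = caches[task]  # unknown task raises KeyError instead of sharing a cache
    hit = cache.search(prompt, attributes=attributes)
    if hit is not None:
        return hit
    response = generate(prompt)
    cache.set(prompt, response, attributes=attributes)
    return response


caches = {"support": InMemoryCache(), "code": InMemoryCache()}
calls = []


def fake_llm(prompt):
    """Placeholder for a real model call; records how often it runs."""
    calls.append(prompt)
    return f"answer to: {prompt}"


cached_answer(caches, "support", "How do I reset my password?", fake_llm)
cached_answer(caches, "support", "How do I reset my password?", fake_llm)  # served from cache
cached_answer(caches, "code", "How do I reset my password?", fake_llm)     # separate cache, so a miss
print(len(calls))
```

With a real client, the same dictionary would hold one `LangCache` instance per cache ID, and the hit check would follow whatever response shape the SDK returns.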

References

LangCache documentation

package.json

"author": "redis"

"repository": "redis/agent-skills"

