wpt-gen-llm
Best practices for configuring LLM integrations, concurrent networking, context scraping, and managing prompts in WPT-Gen.
| name | wpt-gen-llm |
| description | Best practices for configuring LLM integrations, concurrent networking, context scraping, and managing prompts in WPT-Gen. |
This document outlines the best practices for LLM integrations, prompt pipeline safety, and context extraction within the wpt-gen repository.
WPT-Gen supports Google Gemini, OpenAI, and Anthropic models via a unified abstraction.
- Google Gemini (`google-genai`): Used primarily for deep context reasoning (e.g., `gemini-3.1-pro-preview`) and fast generation (`gemini-3-flash-preview`). API keys are read from `GEMINI_API_KEY`.
- OpenAI (`openai`): Used as an alternative provider. API keys are read from `OPENAI_API_KEY`.
- Anthropic (`anthropic`): Used as an alternative provider (Claude models). API keys are read from `ANTHROPIC_API_KEY`.

The agentic workflow uses different model categories based on the complexity of the task, as configured in `wpt-gen.yml`.
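As an illustration of the environment-variable convention in the provider list, the sketch below resolves an API key for a chosen provider and fails early with a clear message when it is missing. The environment variable names come from this document; the provider labels (`"gemini"`, `"openai"`, `"anthropic"`) and the `resolve_api_key` helper are hypothetical and are not taken from the wpt-gen codebase.

```python
import os
from typing import Mapping, Optional

# Environment variable names are the ones listed in this document.
# The dictionary keys are hypothetical provider labels used only for this sketch.
API_KEY_ENV_VARS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def resolve_api_key(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key for `provider`, raising a descriptive error if unavailable."""
    env = os.environ if env is None else env
    if provider not in API_KEY_ENV_VARS:
        known = ", ".join(sorted(API_KEY_ENV_VARS))
        raise ValueError(f"Unknown provider {provider!r}; expected one of: {known}")
    var_name = API_KEY_ENV_VARS[provider]
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before using {provider!r}")
    return key
```

Accepting an optional `env` mapping keeps the helper easy to exercise in tests without mutating the real process environment.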
Providing context is critical for minimizing hallucinations, but fetching it involves a large amount of blocking network I/O.
- Use `asyncio.gather` combined with `asyncio.to_thread` for concurrent network requests (e.g. fetching 5 MDN explainer URLs at once). Sequential blocking `for` loops across network dependencies create severe UI lag.
- Handle `HTTPError` 429 (Too Many Requests) explicitly. Do not rely on the LLM pipeline blindly surviving a dropped HTTP request.
- Use `trafilatura` to extract dense text from W3C Specification URLs linked to web features.

Prompt structure determines output quality, but unbounded inputs cause out-of-memory (OOM) exceptions at runtime.
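The concurrent-fetch and 429 guidance for context scraping can be sketched with the standard library alone. This is a minimal sketch and not the wpt-gen implementation: `fetch_text`, `fetch_with_retry`, `fetch_all`, the retry count, and the doubling delay are all assumptions chosen for illustration. The blocking fetch function is injectable, so a `trafilatura`-based extractor could be passed in its place.

```python
import asyncio
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Blocking fetch; urlopen raises urllib.error.HTTPError on error statuses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_with_retry(
    url: str,
    fetch: Callable[[str], str] = fetch_text,
    max_attempts: int = 4,    # hypothetical default
    base_delay: float = 1.0,  # hypothetical default, in seconds
) -> str:
    """Retry only on HTTP 429, doubling the wait after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except urllib.error.HTTPError as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if exc.code != 429 or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2**attempt)
    raise RuntimeError("max_attempts must be at least 1")


async def fetch_all(urls: Iterable[str], fetch=fetch_text, **retry_kwargs) -> List:
    """Run each blocking fetch in a worker thread and await them together.

    Results keep the order of `urls`; a failed fetch appears as an exception
    object in its slot instead of cancelling the other requests.
    """
    tasks = [
        asyncio.to_thread(fetch_with_retry, url, fetch, **retry_kwargs)
        for url in urls
    ]
    return await asyncio.gather(*tasks, return_exceptions=True)
```

A caller would run `asyncio.run(fetch_all(urls))` and then filter out any entries that are exceptions before building the prompt context. Using `return_exceptions=True` is a design choice here so that one unreachable URL does not discard the context already fetched from the others.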
- Use `{% if %}` statements to avoid feeding the LLM empty strings when variables map to null.
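The conditional-block guidance might look like the sketch below. It assumes the prompts are Jinja2 templates, which the `{% if %}` syntax suggests but this document does not confirm. The template text, the variable names (`feature`, `spec_text`, `explainer`), and the `max_chars` cap are hypothetical. The cap is one simple way to bound input size; the right limit depends on the model and is not specified here.

```python
from typing import Optional

from jinja2 import Environment

# Hypothetical prompt template: optional sections are wrapped in {% if %}
# so that a missing value removes the whole section, heading included.
PROMPT_TEMPLATE = """\
Generate a Web Platform Test for: {{ feature }}
{% if spec_text %}
Specification excerpt:
{{ spec_text }}
{% endif %}
{% if explainer %}
Explainer notes:
{{ explainer }}
{% endif %}
"""

# trim_blocks/lstrip_blocks keep the block tags from leaving stray blank lines.
_env = Environment(trim_blocks=True, lstrip_blocks=True)
_template = _env.from_string(PROMPT_TEMPLATE)


def render_prompt(
    feature: str,
    spec_text: Optional[str] = None,
    explainer: Optional[str] = None,
    max_chars: int = 20_000,  # hypothetical cap on each optional input
) -> str:
    """Render the prompt, omitting empty sections and truncating long inputs."""
    if spec_text:
        spec_text = spec_text[:max_chars]
    if explainer:
        explainer = explainer[:max_chars]
    return _template.render(feature=feature, spec_text=spec_text, explainer=explainer)
```

Without the guards, a `None` value would render as the literal text `None` under its heading, which gives the model a misleading, empty section.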