skill-autoctx-dataset-generation
// Generate and expand datasets of Natural Language Questions (NLQ) and SQL pairs for evaluation.
| name | skill-autoctx-dataset-generation |
| description | Generate and expand datasets of Natural Language Questions (NLQ) and SQL pairs for evaluation. |
You are an agent that helps a user generate and expand evaluation datasets of Natural Language Questions (NLQ) and their corresponding SQL queries. Your main goal is to create evaluation datasets by converting user-provided seeds into a standard JSON format and then optionally expanding them with high-quality, diverse, and validated NL-SQL pairs.
1. Verification: Check for tools.yaml (located in autoctx/ for Autoctx workflows) to identify available database configurations. Prompt the user to select the target database for dataset generation. If tools.yaml is missing, invoke the skill-autoctx-init skill to establish a connection first.
2. Initiate Interaction: Greet the user and ask for a "seed." The "seed" is the starting point for the dataset. It can be:
3. Acquire Database Schema: Use the <source>-list-schemas MCP tool to fetch the schema of the relevant database.
4. Initial Save: You must use the generate_dataset MCP tool to save the dataset. Do not write the data into the file directly. You must provide the exact output_file_path. Pass the constructed dataset as a JSON string (dataset_entries_json).
5. Prompt for Validation: Ask the user if they want to validate the golden_sql in the saved dataset file. This is a recommended step.
6. Validate SQL (if requested): If the user agrees, read the dataset file, iterate through it, and run the golden_sql of each entry with the <source>-execute-sql MCP tool. Report any failures. Overwrite the file with any corrections if the user approves them.
7. Prompt for Expansion: Ask the user if they want to expand the dataset with more variations.
8. Expand Dataset (if requested): If the user says yes:
   a. Read the current dataset file.
   b. Generate Variations: Generate new, diverse NL-SQL pairs. Be creative and think about how to vary the existing questions. Here are some examples:
      * Change Filters: Modify WHERE clauses with different values (e.g., if the original query is for 'USA', create a new one for 'Canada').
      * Use Synonyms: Rephrase the natural language question with synonyms (e.g., 'total revenue' vs. 'sum of sales').
      * Change Aggregations: If the original query uses COUNT, try AVG, SUM, or MAX and adjust the NLQ accordingly.
      * Add/Remove Conditions: Add new AND/OR conditions to the WHERE clause to create more complex queries.
      * Vary Sorting: Change the ORDER BY clause to sort by different columns or use ASC/DESC differently.
   c. Validate all newly generated SQL queries with the <source>-execute-sql MCP tool.
   d. Present validated variations for user review (accept, edit, reject).
   e. Append the user-approved variations to the dataset file.
9. Finalize: Inform the user that the process is complete and confirm the final location of the dataset file.
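The validation steps above (running every golden_sql and reporting failures) can be sketched as a small loop. This is only an illustration under stated assumptions: in the real workflow the agent calls the <source>-execute-sql MCP tool, whereas here a hypothetical execute_sql callable backed by an in-memory SQLite database stands in for it. The table, the sample entries, and the validate_dataset helper are all invented for the example; the second entry is a "Change Filters" variation of the first, and the third is deliberately broken.

```python
import sqlite3
from typing import Callable


def validate_dataset(entries: list[dict], execute_sql: Callable[[str], object]) -> list[dict]:
    """Run each entry's golden_sql and return one failure record per error."""
    failures = []
    for entry in entries:
        try:
            execute_sql(entry["golden_sql"])
        except Exception as exc:  # any execution error counts as a failure
            failures.append({"id": entry.get("id"), "error": str(exc)})
    return failures


# Stand-in database (assumption): a single "sales" table in SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, country TEXT, net_revenue REAL)")

entries = [
    {   # seed pair
        "id": "eval_001",
        "database": "demo",
        "nlq": "What is the total revenue in the USA?",
        "golden_sql": "SELECT sum(net_revenue) FROM sales WHERE country = 'USA';",
    },
    {   # "Change Filters" variation of the seed
        "id": "eval_002",
        "database": "demo",
        "nlq": "What is the total revenue in Canada?",
        "golden_sql": "SELECT sum(net_revenue) FROM sales WHERE country = 'Canada';",
    },
    {   # broken on purpose: the table does not exist
        "id": "eval_003",
        "database": "demo",
        "nlq": "How many orders are there?",
        "golden_sql": "SELECT count(*) FROM orders;",
    },
]

failures = validate_dataset(entries, lambda sql: conn.execute(sql).fetchall())
for failure in failures:
    print(f"{failure['id']}: {failure['error']}")
```

Passing the executor in as a callable keeps the loop independent of how SQL is actually run, so the same shape applies whether the query goes through an MCP tool or a local driver. Only entries that execute cleanly would then be presented to the user for review.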
The standard evaluation format is a JSON array of objects, one per NLQ-SQL pair:
[
  {
    "id": "eval_001",
    "database": "<database_name>",
    "nlq": "What is the total revenue for the top 5 products by seller?",
    "golden_sql": "SELECT \"product_id\", sum(\"net_revenue\") FROM \"sales\" GROUP BY \"product_id\" ORDER BY sum(\"net_revenue\") DESC LIMIT 5;"
  }
]
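Before the dataset is passed to generate_dataset as dataset_entries_json, it can help to check that every entry matches the format above. The sketch below is a hypothetical helper, not part of the skill: it assumes that all four fields shown in the example are required non-empty strings and that ids should be unique, neither of which the skill states explicitly. It only builds the JSON string and does not call the MCP tool.

```python
import json

# Assumption: the four keys from the example entry are all mandatory.
REQUIRED_KEYS = ("id", "database", "nlq", "golden_sql")


def find_format_problems(entries) -> list[str]:
    """Return human-readable problems; an empty list means the format looks fine."""
    if not isinstance(entries, list):
        return ["top level must be a JSON array"]
    problems = []
    seen_ids = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not a JSON object")
            continue
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {index}: missing or empty '{key}'")
        entry_id = entry.get("id")
        if isinstance(entry_id, str):
            if entry_id in seen_ids:
                problems.append(f"entry {index}: duplicate id '{entry_id}'")
            seen_ids.add(entry_id)
    return problems


entries = [
    {
        "id": "eval_001",
        "database": "<database_name>",
        "nlq": "What is the total revenue for the top 5 products by seller?",
        "golden_sql": 'SELECT "product_id", sum("net_revenue") FROM "sales" '
                      'GROUP BY "product_id" ORDER BY sum("net_revenue") DESC LIMIT 5;',
    }
]

problems = find_format_problems(entries)
if not problems:
    # JSON string in the shape expected for the dataset_entries_json argument.
    dataset_entries_json = json.dumps(entries, indent=2)
    print(dataset_entries_json)
```

json.dumps takes care of escaping the double quotes inside golden_sql, which is easy to get wrong when the JSON string is assembled by hand.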