Print confirmation block. Wait for y.
Order matters: building-block skills first, then project code.
-
target_dir/.gitignore — extend with data/, INVOICE_PDFS/, *.onnx, scripts/__pycache__/.
-
target_dir/pyproject.toml — start from shared/templates/pyproject.toml.template (do NOT hand-roll one). The template already pins the correct build backend (setuptools.build_meta, NOT setuptools.backends.legacy:build — that name does not exist in setuptools>=68 and pip install -e . will fail with ModuleNotFoundError on a fresh venv; v3 friction P0-V3-N4). Then extend dependencies with:
- Always:
fastapi>=0.110, uvicorn[standard]>=0.27, langchain-core>=0.3, langchain-community>=0.3, langchain-openai>=0.2, langgraph>=0.2, openai>=1.40, onnx2oracle, Faker>=24, python-multipart.
- Idea 1: + (no extras).
- Idea 2: + (no extras).
- Idea 3: +
reportlab>=4, pypdf>=4.
- Do NOT add
oci-openai, oci, oracle-database-mcp-server, or hand-rolled ONNX deps. Those are friction P0-1 / P0-2 / P0-4 — superseded by the bearer-token + local-BaseTool + onnx2oracle paths.
- Imports use the installed package name, not the on-disk path. Even though sources live under
src/<package_slug>/, the [tool.setuptools.packages.find] where = ["src"] line in the template means pip install -e . installs the package as <package_slug> (no src. prefix). Always import from <package_slug>.foo import bar — from src.<package_slug>.foo raises ModuleNotFoundError: No module named 'src' (v3 friction P1-V3-F-3).
-
src/<package_slug>/inference.py — copy shared/snippets/oci_chat_factory.py verbatim. It uses the upstream openai SDK against the OCI Generative AI bearer-token endpoint (OCI_GENAI_BASE_URL defaults to us-phoenix-1, auth via OCI_GENAI_API_KEY). Model id is the full xai.grok-4. The earlier OCI-SDK SigV1 path is in archive/ only.
-
Per-idea agent module — IMPORTANT, read both warnings before writing code:
Warning A — LangChain 1.x removed AgentExecutor and create_tool_calling_agent. They were in langchain.agents in 0.3.x; in 1.x the agent loop has moved to LangGraph (langgraph.prebuilt.create_react_agent) or to plain .bind_tools() + manual loop. Do NOT import them from langchain.agents — ImportError on a fresh venv (v3 friction P0-V3-N2).
Warning B — Grok-4 over the OCI OpenAI-compat endpoint stops emitting structured tool_calls after ~2 turns. On the 3rd+ tool call it returns plain text like Function: run_sql({"query": "..."}) instead of an OpenAI-shape tool_calls object, which LangGraph + LangChain agents cannot parse. The reliable shape at this tier is therefore a 2-step pipeline, not an open agent loop (v3 friction P0-V3-N3):
from <package_slug>.inference import get_chat_client
from <package_slug>.tool_registry import get_tools
def answer(user_q: str) -> dict:
tools = get_tools()
llm = get_chat_client()
plan = llm.bind_tools(tools).invoke([{"role": "user", "content": user_q}])
results = [t.run(call["args"]) for call in plan.tool_calls
for t in tools if t.name == call["name"]]
final = llm.invoke([
{"role": "user", "content": user_q},
{"role": "assistant", "content": str(plan.tool_calls)},
{"role": "tool", "content": "\n".join(map(str, results))},
])
return {"answer": final.content, "tool_calls": plan.tool_calls,
"tool_results": results}
The 2-step pipeline produces grounded answers + the SQL/tool args used, which is what the demo needs. If a multi-step loop is essential (e.g. idea 3's "vector then SQL then both" routing), split it into multiple top-level answer() calls and orchestrate from the FastAPI adapter — never let the LLM drive >2 tool turns in one call. The intermediate v3 cold-start walk proved this.
- Idea 2 →
src/<package_slug>/generate.py (one-shot script that walks the schema and INSERTs rows into SCHEMA_DOCS_DOCUMENTS with embeddings via VECTOR_EMBEDDING(MY_MINILM_V1 USING :description)) + src/<package_slug>/agent.py (RAG over the generated docs via vector_search MCP tool).
- Idea 3 →
src/<package_slug>/agent.py with a system prompt that explicitly teaches the agent the two-modality choice (vector for "find similar invoices to this PDF", run_sql for "sum unpaid amounts", both for "find unpaid invoices similar to X").
-
src/<package_slug>/adapter.py — FastAPI /v1/chat/completions wrapping the agent (same shape as beginner; differences: handles tool-call streaming events from the agent executor, surfaces them as OpenAI-compatible "function_call" deltas).
SQLcl-tee logging (folded in by default at this tier — friction-pass decision). Wrap the run_sql BaseTool with shared/snippets/sqlcl_tee.py so every SQL the agent emits gets teed through SQLcl into <target>/logs/sqlcl_<ts>.log. The wrapper appends [sqlcl_log: <path>] to the streamed response. Setup:
- Pre-flight: check that SQLcl is installed before scaffolding the wiring. Run
which sql (or command -v sql); if not on PATH, follow the install steps in shared/references/sqlcl-tee.md (~/opt/sqlcl) BEFORE writing the wiring. Do NOT assume /home/ubuntu/sqlcl/bin/sql or any other host-specific path is present (v3 friction P2-V3-N5). The wrapper degrades gracefully when SQLcl is missing — it appends [sqlcl_tee: skipped — SQLcl not installed] and the inner tool result passes through — but the user loses the inspectable log, so install is strongly recommended.
- In
src/<package_slug>/tool_registry.py, import from shared.snippets.sqlcl_tee import wrap_with_sqlcl_tee and wrap the run_sql tool that comes back from mcp_client.list_tools().
- Document SQLcl install in the project's README (link to
shared/references/sqlcl-tee.md).
- Why MCP+SQLcl: MCP shows the SQL the agent emits; SQLcl shows what the DB actually did (rows, errors, plan). Together you can debug an agent turn end-to-end.
- Observability inherited from
oracle-mcp-server-helper Steps 4.5+4.6: every run_sql call goes out tagged /* LLM in use is <model> */, sessions populate V$SESSION.MODULE/ACTION, and (if the user opts in) one row is inserted into CYP_MCP_LOG per call. README should mention the three diagnostic queries: SELECT module, action FROM v$session, SELECT * FROM v$sql WHERE sql_text LIKE '/* LLM in use is %', and SELECT * FROM CYP_MCP_LOG ORDER BY ts DESC FETCH FIRST 20 ROWS ONLY.
- If the user has SQLcl 25.2+: mention in the README that
sql -mcp is a drop-in alternative for the local-tool transport (Oracle's first-party MCP server, ships with DBTOOLS$MCP_LOG natively). Do not auto-switch — the local-tool scaffold remains the workshop default for portability.
-
verify.py — fill template:
- Round-trip:
len(get_embedder().embed_query("dim check")) == 384.
- Smoke: query the registered ONNX model directly via SQL.
- Smoke: list MCP tools — assert at least the per-idea allowed list is present.
- Smoke: a single chain call asking a simple question of the seeded data.
-
notebook.ipynb — 8 cells:
- Setup (load
.env, smoke verify).
- Show the registered ONNX model (
SELECT * FROM USER_MINING_MODELS WHERE MODEL_NAME='MY_MINILM_V1').
- Show MCP tools list.
- One direct
vector_search MCP call.
- One
run_sql MCP call.
- One full agent turn (idea-specific question).
- Show the chat history table populated.
- "Now run
python -m <pkg>.adapter and open http://localhost:3000."
-
README.md — fill placeholders. "Why Oracle" paragraph names: in-DB ONNX embeddings, AI Vector Search, oracle-database-mcp-server, JSON Duality (idea 3), persistent chat history. Include the "Why in-DB embeddings?" callout from intermediate/project-ideas.md verbatim — it's the load-bearing pitch.
If you grow this intermediate project into a multi-user agent — multiple humans, each wanting their preferences and durable facts auto-extracted and recalled across sessions — swap the manual chat history layer for OAMP (oracleagentmemory PyPI package). OAMP owns per-user threads, automatic memory extraction, and prompt-ready context cards; the advanced tier wires it via shared/snippets/oamp_helpers.py against the same in-DB ONNX embedder + Grok-4 you're using here. See shared/references/oamp.md for the OAMP-vs-OracleVS-vs-OracleChatHistory-vs-SQL decision tree. Until then, the manual chat history is correct for single-user demos.