ops-query

Name: Ops Query
Author: boundless-xyz

Internal -- for Boundless team members only. Cross-reference Boundless indexer API data, broker telemetry, and service logs to investigate production and staging operational issues. Use when the user wants to understand why slashings happened on prod/staging, diagnose prover or service failures in deployed environments, correlate market events with broker behavior, investigate fulfillment rate drops, look at prover/service logs, or perform any analysis that requires combining on-chain indexer data with off-chain broker telemetry and CloudWatch logs. Also use when the user asks to "investigate", "diagnose", or "find root cause" for prover, service, or market issues on live networks. Do NOT use for debugging local code changes, reviewing PRs, or investigating issues in the codebase itself.

$ git log --oneline --stat

stars:194

forks:174

updated: April 16, 2026 at 17:10

File explorer

8 files

SKILL.md

readonly

related-skills.json

same repository

ops-logs-query.md

from "boundless-xyz/boundless"

Internal — for Boundless team members only. Query AWS CloudWatch logs for Boundless services (provers, slasher, distributor, order stream, order generator, indexer, signal) on prod/staging environments. Use when the user asks to look at service logs, debug service behavior from log output, search logs for a request ID, or investigate errors using CloudWatch. Do NOT use for debugging local code changes, reviewing PRs, or investigating issues in the codebase itself.

2026-05-19 · 194 stars

pr.md

from "boundless-xyz/boundless"

Create or update a pull request for the current git changes. Use when the user wants a PR, commit/push for review, or gh pr create flow.

2026-05-19 · 194 stars

ops-check-balances.md

from "boundless-xyz/boundless"

Internal — for Boundless team members only. Audit native ETH, market deposit, prover collateral, and distributor ZKC reserve balances for every operator-managed address (provers, distributor, order generators, signal). Use when the user wants to know which addresses need topping up, asks about the balance of provers/OGs/distributor/signal signers, says something is "running low" or "out of gas", or wants a periodic operational health check on operator wallets. Defaults to prod env (mainnets + prod testnets) — pass `--all` to also include staging.

2026-05-17 · 194 stars

ops-pipelines.md

from "boundless-xyz/boundless"

Internal — for Boundless team members only. Monitor Boundless deployment pipelines (AWS CodePipeline + CodeBuild) on the ops account. Use when the user wants to track a deployment after merging a PR, check whether a commit has rolled out to staging/prod, diagnose a failed deployment, watch the status of a specific pipeline, or get prompted to approve a production rollout once staging succeeds. Do NOT use for service runtime debugging (use ops-logs-query) or for deploying dev infrastructure (use ops-infra-deploy).

2026-05-04 · 194 stars

ops-telemetry-query.md

from "boundless-xyz/boundless"

Internal — for Boundless team members only. Query Boundless broker telemetry tables on Redshift for prod/staging operational data. Use when the user asks about broker health, request evaluations, request completions, proving times, skip rates, telemetry data, or wants to run SQL against the telemetry database on live networks. Also covers historical telemetry through 2026-04-24 stored as Parquet archives in S3 (queried via DuckDB). Do NOT use for debugging local code changes, reviewing PRs, or investigating issues in the codebase itself.

2026-04-24 · 194 stars

setup-prover.md

from "boundless-xyz/boundless"

Set up and deploy a Boundless prover to a GPU server using Ansible. Handles inventory setup, SSH connectivity, NVIDIA drivers, Docker, and the full bento stack. Use when deploying a new prover, redeploying to an existing server, or troubleshooting prover infrastructure.

2026-04-21 · 194 stars

package.json

"author": "boundless-xyz"

"repository": "boundless-xyz/boundless"


$ useful --forSOC

Network and Computer Systems Administrators · Computer and Mathematical Occupations · 15-1244 · L4

Query

Combine on-chain indexer data, off-chain broker telemetry, and CloudWatch service logs to investigate operational issues and find insights.

Setup

Set up the data sources needed for the investigation. Not all sources are needed for every query -- use the ones relevant to the task.

1. Read network_secrets.toml from the repo root. If it exists, it contains credentials for all environments (indexer API keys, telemetry DB URLs/passwords, AWS creds). Also read network_address_labels.json (same directory) for labelling addresses -- it is plain JSON ({"0xaddr": "label", ...}). If network_secrets.toml is not present, recommend the user create it -- instructions and credentials are in the Boundless runbook. If network_address_labels.json is not present, recommend the user create it -- the canonical address mapping is in the Boundless runbook.
2. Read and follow the ops-indexer-query skill at .claude/skills/ops-indexer-query/SKILL.md to set up indexer access (MARKET_INDEXER_URL, ZKC_INDEXER_URL, optional INDEXER_API_KEY, and the indexer_get helper function).
3. Read and follow the ops-telemetry-query skill at .claude/skills/ops-telemetry-query/SKILL.md to set up Redshift access (REDSHIFT_URL). Before writing any telemetry SQL, read crates/boundless-market/src/telemetry.rs for exact column names and enum values.
4. Read and follow the ops-logs-query skill at .claude/skills/ops-logs-query/SKILL.md to set up CloudWatch log access (AWS credentials, log group discovery). Use it when the investigation benefits from raw service logs -- especially for our own operated provers, or for other services like indexer, order stream, slasher, etc.
5. Ask the user which network they want to investigate (this determines which market indexer and ZKC indexer to use).

Data Source Overview

| Source | What it knows | Join fields |
| --- | --- | --- |
| Market Indexer | On-chain request lifecycle: submitted, locked, fulfilled, slashed, expired. Pricing, collateral, tx hashes, timestamps. | `request_id`, `request_digest`, prover/requestor addresses |
| Telemetry (Redshift) | Broker-side operational data: evaluation decisions, skip reasons, proving durations, error codes, queue depths, estimated vs actual proving times. | `request_id`, `request_digest`, `broker_address`, `order_id` |
| CloudWatch Logs | Raw service logs for provers we operate (and other infra services). Detailed error messages, stack traces, runtime behavior. | `request_id`, `request_digest`, timestamps |

The key join between indexer and telemetry is request_id and request_digest. The indexer's lock_prover_address or fulfill_prover_address corresponds to telemetry's broker_address. Logs can be correlated by searching for the same request_id or request_digest within the relevant time window.

Telemetry is opt-in -- not all brokers send telemetry. If a prover address has no telemetry data, note this to the user. CloudWatch logs are only available for services we operate.
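The join described above can be sketched in a few lines of Python. This is a minimal illustration, assuming both sources have already been fetched into lists of dicts; the field names come from the table above, while the function name and the returned record shape are hypothetical.

```python
from collections import defaultdict


def join_indexer_with_telemetry(indexer_rows, telemetry_rows):
    """Attach telemetry rows to each indexer request via (request_id, request_digest)."""
    by_key = defaultdict(list)
    for row in telemetry_rows:
        by_key[(row["request_id"], row["request_digest"])].append(row)

    joined = []
    for req in indexer_rows:
        matches = by_key.get((req["request_id"], req["request_digest"]), [])
        # Separate out rows sent by the prover that locked the request.
        lock_prover = (req.get("lock_prover_address") or "").lower()
        from_locker = [
            t for t in matches
            if lock_prover and t["broker_address"].lower() == lock_prover
        ]
        joined.append({
            "request": req,
            "telemetry": matches,
            "locker_telemetry": from_locker,
            # Telemetry is opt-in, so a missing match is itself worth reporting.
            "has_telemetry": bool(matches),
        })
    return joined
```

Requests whose `has_telemetry` is false are the ones to call out to the user as having no broker-side data.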

Investigation Workflow

All investigations follow the same pattern:

1. Identify targets -- Use the indexer to find the relevant requests/addresses/time periods.
2. Correlate with telemetry -- Use the request IDs, digests, or broker address + time window to look up telemetry data. Telemetry provides a pre-processed view with structured skip reasons, error codes, proving durations, and estimation accuracy -- this should answer most questions about why orders were skipped, failed, or slashed without needing raw logs.
3. Dig into logs (last resort) -- Only go to CloudWatch logs if the indexer and telemetry data are insufficient. Logs are useful when you need raw error messages, stack traces, or runtime details that telemetry doesn't capture (e.g. infrastructure-level failures, panics, or non-prover service issues). One particularly useful check: look for recent deployments in bento prover logs. Nightly deployments restart Docker Compose and can explain gaps in telemetry, sudden behavior changes, or outages. Newly deployed code can also introduce bugs. See the "Checking for Recent Deployments" section in the ops-logs-query skill.
4. Analyze and synthesize -- Combine findings from all sources into a coherent narrative.

Pace requests to each source:

Indexer: sleep 1 between requests, sleep 2 between pagination pages.
Redshift: No rate limit, but use LIMIT on exploratory queries.
CloudWatch: sleep 1 between paginated log queries.
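The pacing rules above could be wrapped in a small pagination helper. The sketch below assumes cursor-style pagination, which is a guess: the real indexer and CloudWatch paging mechanics are defined in their respective skills, and `fetch_page` is a hypothetical callable supplied by the caller.

```python
import time


def fetch_all_pages(fetch_page, page_delay=2.0, sleep=time.sleep):
    """Collect every page from a paginated source, pausing between pages.

    fetch_page(cursor) is a hypothetical callable that returns
    (items, next_cursor), with next_cursor set to None on the last page.
    """
    items, cursor = [], None
    while True:
        page, cursor = fetch_page(cursor)
        items.extend(page)
        if cursor is None:
            return items
        # 2 seconds between indexer pages; pass page_delay=1.0 for CloudWatch.
        sleep(page_delay)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.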

Pre-Built Investigations

Before starting any work, check if the user's question matches a pre-built investigation. These are tested playbooks with the right queries, presentation format, and step-by-step instructions. Using them produces consistent, comprehensive results. Each lives in its own file under references/:

| Investigation | File | When to use |
| --- | --- | --- |
| Market Summary | references/market-summary.md | "How's the market?" / "give me a summary" / overview of health, prover activity, failures, skips |
| Slashing Reasons | references/slashing-reasons.md | Prover was slashed -- find out why |
| Fulfillment Rate Drops | references/fulfillment-rate-drops.md | Market or prover fulfillment rate declined, success rate alarms |
| Prover Performance | references/prover-performance.md | Deep dive into a specific prover's operational health |
| Request Lifecycle | references/request-lifecycle.md | Trace a specific request end-to-end across all data sources |
| Customer Expired Requests | references/customer-expired-requests.md | Customer's requests are expiring -- investigate why provers skip or fail to fulfill them |

If the user's question clearly maps to one of these, read the corresponding file and follow it step by step. If it doesn't fit any pre-built investigation, fall back to the general Investigation Workflow above and build a custom investigation.

Aurora DB Instances

When investigating RDS/Aurora issues (storage, CPU, connections, etc.), never assume the DB identifier reflects the actual instance role. Instance identifiers containing reader or writer may be mislabeled -- the name is set at creation time and does not update if Aurora promotes/demotes instances.

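Always determine the actual role from live metadata. First list the matching instances and the cluster each belongs to (the instance record itself does not state the role):

aws rds describe-db-instances \
  --query 'DBInstances[?contains(DBInstanceIdentifier, `prod-8453-indexer`)].{id: DBInstanceIdentifier, cluster: DBClusterIdentifier, arn: DBInstanceArn}' \
  --output table

Then check the cluster's member list, which reports each member's role directly: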

aws rds describe-db-clusters \
  --db-cluster-identifier "CLUSTER_ID" \
  --query 'DBClusters[0].DBClusterMembers[].{id: DBInstanceIdentifier, isWriter: IsClusterWriter}' \
  --output table

Use IsClusterWriter: true/false as the source of truth. When reporting findings, always state the actual role alongside the identifier, e.g. "instance *-reader-v19 (actual role: writer)".
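When many instances are involved, the role check can be scripted. The sketch below assumes the unfiltered JSON output of `aws rds describe-db-clusters --output json` (no `--query`) is passed in as a string; the function name and the "name is misleading" flag are illustrative additions, not part of the skill.

```python
import json


def describe_actual_roles(describe_clusters_json):
    """Return one line per cluster member stating the role Aurora reports."""
    lines = []
    for cluster in json.loads(describe_clusters_json)["DBClusters"]:
        for member in cluster["DBClusterMembers"]:
            ident = member["DBInstanceIdentifier"]
            role = "writer" if member["IsClusterWriter"] else "reader"
            line = f"instance {ident} (actual role: {role})"
            # Flag identifiers whose name suggests the opposite role.
            other = "reader" if role == "writer" else "writer"
            if other in ident and role not in ident:
                line += " [name is misleading]"
            lines.append(line)
    return lines
```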

Presenting Results

Addresses

Always show the full address when displaying broker/prover addresses. Do not truncate to 0x8305...04b5. If a label exists in network_address_labels.json, show both: 0x83052f16a84e6f2cec4bf3beda45c40c800904b5 (BP1).
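A small formatting helper makes this rule hard to get wrong. This is a sketch under one assumption: addresses are matched case-insensitively, which the labels file may or may not require. The function names are hypothetical.

```python
import json


def load_labels(path="network_address_labels.json"):
    """Load the {"0xaddr": "label"} mapping, keyed by lowercase address."""
    with open(path, encoding="utf-8") as fh:
        return {addr.lower(): label for addr, label in json.load(fh).items()}


def format_address(address, labels):
    """Show the full address, followed by its label when one exists."""
    label = labels.get(address.lower())
    return f"{address} ({label})" if label else address
```

An unlabelled address is returned in full and unchanged, never shortened.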

Our Provers

Provers we operate are labeled with a BP prefix in network_address_labels.json (e.g. BP1, BP2, BPNightlyAWS). When investigating any issue, always highlight what our provers are doing -- did they skip the order, did they fail, did they drop it, what error codes are they hitting? This should be called out explicitly in every investigation, even when the issue is not specifically about our provers.

Prover Summary Tables

When showing prover activity, pivot telemetry outcomes into columns so each prover is one row. Include fulfilled, failures, and skips as separate columns. By default summary tables should cover the top 5 provers by volume plus all provers we operate (from address labels).
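One way to build such a table is sketched below. It assumes telemetry rows have been reduced to `broker_address` plus a single `outcome` string, and that labels are keyed by lowercase address. The outcome names used as columns are placeholders; the real enum values live in crates/boundless-market/src/telemetry.rs.

```python
from collections import Counter, defaultdict

# Placeholder outcome names; check telemetry.rs for the real enum values.
COLUMNS = ("fulfilled", "failed", "skipped", "cancelled")


def prover_summary(rows, labels, top_n=5):
    """Pivot (broker_address, outcome) rows into one row per prover."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["broker_address"].lower()][row["outcome"]] += 1

    by_volume = sorted(counts, key=lambda a: sum(counts[a].values()), reverse=True)
    top = by_volume[:top_n]
    # Provers we operate are always included, even outside the top N.
    ours = [a for a in by_volume if labels.get(a, "").startswith("BP")]
    selected = top + [a for a in ours if a not in top]

    return [
        {"prover": addr, "label": labels.get(addr, ""),
         **{col: counts[addr][col] for col in COLUMNS}}
        for addr in selected
    ]
```

Cancelled gets its own column here so that race losses are never folded into the failure count.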

Failure and Skip Breakdowns

After the summary table, include three separate breakdown sections:

Failure breakdown: For each prover, show a table of outcome, error_code, summarized error_reason, and count. Group by error pattern, not by individual request.

Skip breakdown: Same structure -- per-prover table of skip_code, example reason, and count, sorted by count descending.

Drop breakdown: Same structure -- per-prover table of commitment_skip_code, reason, and count, sorted by count descending.
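The grouping behind these breakdowns might look like the sketch below, shown for skips. It assumes each row carries `broker_address`, `skip_code`, and a free-text `reason`; the `reason` field name is a guess, and the result is a flat list that the caller would still split into one table per prover.

```python
from collections import Counter


def skip_breakdown(rows):
    """Group skipped evaluations by (prover, skip_code), largest groups first."""
    counts = Counter()
    example = {}
    for row in rows:
        key = (row["broker_address"], row["skip_code"])
        counts[key] += 1
        # Keep the first reason seen as the representative example.
        example.setdefault(key, row["reason"])
    return [
        {"prover": prover, "skip_code": code,
         "example_reason": example[(prover, code)], "count": n}
        for (prover, code), n in counts.most_common()
    ]
```

The drop breakdown follows the same pattern with commitment_skip_code in place of skip_code.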

Alerts and Error Codes

Alerts always match on error codes (e.g. [B-PRO-501]), not on string patterns. Seeing a string like ProvingFailed in a log message does NOT mean it counts toward the proving-failed metric or alert. Only entries with the corresponding error code (e.g. [B-PRO-501]) are counted. When investigating alert triggers or counting occurrences for a specific alert, always filter by the error code, not by keyword/string matching.
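The difference between code matching and keyword matching is easy to show. The sketch below assumes error codes always appear in square brackets with the shape `B-<LETTERS>-<DIGITS>`, a pattern generalised from the single `[B-PRO-501]` example; adjust the regex if real codes differ.

```python
import re

# Assumed shape of a bracketed error code, inferred from "[B-PRO-501]".
ERROR_CODE = re.compile(r"\[(B-[A-Z]+-\d+)\]")


def count_alert_hits(log_lines, code):
    """Count lines carrying the exact bracketed code, ignoring look-alike strings."""
    return sum(1 for line in log_lines if code in ERROR_CODE.findall(line))
```

A line that merely mentions ProvingFailed without the bracketed code is not counted, which mirrors how the alerts themselves behave.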

Telemetry Terminology

Locked: Order was priced and the broker decided to try locking it on-chain.
Skipped: Order was rejected during pricing in the OrderPicker (e.g. unprofitable, wrong image, over capacity). It never reached the OrderMonitor.
Committed: Order was successfully committed to the proving pipeline (lock tx succeeded or immediate commitment for FulfillAfterLockExpire).
Dropped: Order reached the OrderMonitor but was NOT committed to the proving pipeline. Reasons include: lock tx failed, order was fulfilled/expired/locked by another prover before we could act, insufficient deadline remaining, or insufficient balance. Check commitment_skip_code for specifics.
Cancelled: Completion outcome meaning the broker finished proving but another prover fulfilled the order first. This is a race loss (wasted proving work), NOT an error. Do not count Cancelled as a failure in summary tables — show it as a separate column.

Secondary Fulfillment

When a prover locks an order but fails to fulfill it before the lock expires, the order becomes available for secondary fulfillment by any other prover in the network. The secondary fulfiller earns the slash collateral as a reward. In telemetry, secondary fulfillments are identified by fulfillment_type = 'FulfillAfterLockExpire' (in both evaluations and completions). In the indexer, a secondary fulfillment shows as fulfill_prover_address differing from lock_prover_address, and market aggregates include total_secondary_fulfillments.

When investigating expired or slashed requests, always check whether secondary fulfillment was attempted -- especially by our BP provers. Did they see the opportunity? Did they skip it, and why? Did they attempt it but fail?
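Both detection rules can be expressed directly. This is a minimal sketch assuming indexer requests and telemetry rows are dicts with the field names quoted above; treating a request that was never locked as not secondary is an assumption, and both function names are hypothetical.

```python
def is_secondary_fulfillment(request):
    """True when an indexer request was fulfilled by a prover other than the locker."""
    locker = request.get("lock_prover_address")
    fulfiller = request.get("fulfill_prover_address")
    if not locker or not fulfiller:
        return False  # never locked, or not fulfilled at all
    return locker.lower() != fulfiller.lower()


def secondary_attempts(telemetry_rows, broker_addresses):
    """Telemetry rows where the given brokers handled a secondary fulfillment."""
    wanted = {a.lower() for a in broker_addresses}
    return [
        t for t in telemetry_rows
        if t["fulfillment_type"] == "FulfillAfterLockExpire"
        and t["broker_address"].lower() in wanted
    ]
```

Passing the BP-labelled addresses to `secondary_attempts` answers whether our provers saw the opportunity at all; an empty result means they never evaluated it, or sent no telemetry for it.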
