ops-logs-query

Name: Ops Logs Query
Author: boundless-xyz

Internal -- for Boundless team members only. Query AWS CloudWatch logs for Boundless services (provers, slasher, distributor, order stream, order generator, indexer, signal) on prod/staging environments. Use when the user asks to look at service logs, debug service behavior from log output, search logs for a request ID, or investigate errors using CloudWatch. Do NOT use for debugging local code changes, reviewing PRs, or investigating issues in the codebase itself.

updated:May 19, 2026 at 20:55

Logs Query

Query AWS CloudWatch Logs for Boundless services on prod/staging.

Prerequisites

Read network_secrets.toml from the repo root. Extract the AWS credentials for the target environment from [aws.prod] or [aws.staging] (access_key_id, secret_access_key). If the file is not present, recommend the user create it -- instructions and credentials are in the Boundless runbook.
Export credentials before running any queries:

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-west-2"
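
One way to avoid copying the values by hand is to read them straight from the file. The sketch below is only an illustration: `aws_secret` is a hypothetical helper, and it assumes each key sits on its own `key = "value"` line under the `[aws.<env>]` table. It is not a full TOML parser and does not handle inline comments.

```shell
# Hypothetical helper: print one key from the [aws.<env>] table.
# Usage: aws_secret <prod|staging> <key> [file]
aws_secret() {
  local env="$1" key="$2" file="${3:-network_secrets.toml}"
  awk -v section="[aws.${env}]" -v key="$key" '
    # Track which table we are in whenever a [header] line appears.
    /^[ \t]*\[/ { gsub(/[ \t]/, ""); in_section = ($0 == section); next }
    in_section {
      line = $0
      if (match(line, "^[ \t]*" key "[ \t]*=[ \t]*")) {
        value = substr(line, RLENGTH + 1)
        gsub(/^"|"[ \t]*$/, "", value)   # strip the surrounding quotes
        print value
        exit
      }
    }
  ' "$file"
}

# Example (run from the repo root):
# export AWS_ACCESS_KEY_ID="$(aws_secret prod access_key_id)"
# export AWS_SECRET_ACCESS_KEY="$(aws_secret prod secret_access_key)"
```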

Finding Log Groups

Prover log groups

Prover log groups follow the pattern /boundless/bento/<hostname>. The hostnames are defined in the Pulumi config files:

Staging: infra/cw-monitoring/Pulumi.staging.yaml
Prod: infra/cw-monitoring/Pulumi.production.yaml

Read the relevant Pulumi config to find current hostnames. As of now:

| Log Group | Environment | Chain ID | Label (approx, in `network_address_labels`) | Description |
|---|---|---|---|---|
| `/boundless/bento/prover-84532-staging-nightly` | staging | 84532 (Base Sepolia) | | Staging nightly prover |
| `/boundless/bento/prover-8453-prod-release` | prod | 8453 (Base) | `BPLatitudeRelease` | Prod release prover. Legacy name: `/boundless/bento/base-mainnet-prover-release` |
| `/boundless/bento/prover-8453-prod-nightly` | prod | 8453 (Base) | `BPLatitudeNightly` | Prod nightly prover. Legacy name: `/boundless/bento/base-mainnet-prover-nightly` |
| `/boundless/bento/prover-84532-prod-nightly` | prod | 84532 (Base Sepolia) | | Prod Base Sepolia nightly prover |
| `/boundless/bento/prover-11155111-prod-nightly` | prod | 11155111 (Eth Sepolia) | | Prod Eth Sepolia nightly prover |
| `/boundless/bento/prover-01` | prod | 8453 (Base) | `BPProver01DC` | Prod datacenter prover 01 |
| `/boundless/bento/prover-02` | prod | 8453 (Base) | `BPProver02DC` | Prod datacenter prover 02 |

The label column lists the network_address_labels.json label that the log group is believed to correspond to. Names may not match exactly -- always confirm against network_address_labels.json and Pulumi config before relying on the mapping.

If these are out of date, check the Pulumi config files for the current list.

We only have prover log groups for provers we operate. External provers do not have queryable logs.

Provers we operate are labeled with a BP prefix in network_address_labels.json (e.g. BP1, BP2, BPNightlyAWS). When investigating any issue, always highlight what our BP provers are doing -- did they skip, fail, drop, or fulfill? This should be called out explicitly even when the investigation is not specifically about our provers.

Discovering log groups for other services

Other services (slasher, distributor, order stream, order generator, indexer backend, indexer API, etc.) have log groups that follow a naming convention but may change. Discover them dynamically rather than hardcoding.

Log group naming convention: l-<staging|prod>-<chain_id>-<service-name>-<chain_id>-<resource>

Example: l-staging-167000-indexer-api-167000-lambda

Known chain IDs:

84532 = Base Sepolia
8453 = Base Mainnet
167000 = Taiko Mainnet
11155111 = Eth Sepolia
1 = Eth Mainnet
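
To sanity-check a discovered name against the convention, a small parser can help. This is a sketch under assumptions: `parse_log_group` and `chain_name` are hypothetical helpers, and the regular expression assumes the second chain ID is the last all-digit segment in the name.

```shell
# Hypothetical helper: split "l-<env>-<chain_id>-<service>-<chain_id>-<resource>".
# Prints "env chain_id service resource"; returns 1 if the name does not fit.
parse_log_group() {
  local parts env chain service chain2 resource
  parts=$(printf '%s\n' "$1" |
    sed -nE 's/^l-(staging|prod)-([0-9]+)-(.+)-([0-9]+)-(.+)$/\1 \2 \3 \4 \5/p')
  [ -n "$parts" ] || return 1
  read -r env chain service chain2 resource <<EOF
$parts
EOF
  # The chain ID appears twice in the convention; both copies must agree.
  [ "$chain" = "$chain2" ] || return 1
  echo "$env $chain $service $resource"
}

# Map a known chain ID to its network name (list taken from above).
chain_name() {
  case "$1" in
    84532) echo "Base Sepolia" ;;
    8453) echo "Base Mainnet" ;;
    167000) echo "Taiko Mainnet" ;;
    11155111) echo "Eth Sepolia" ;;
    1) echo "Eth Mainnet" ;;
    *) echo "unknown"; return 1 ;;
  esac
}
```

Names that do not follow the `l-` convention, such as the bento prover groups, are rejected rather than guessed at.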

To find log groups for a specific service, search by prefix. Use the environment (staging/prod) and optionally the chain ID and service name:

aws logs describe-log-groups \
  --log-group-name-prefix "l-staging-84532" \
  --query 'logGroups[].logGroupName' --output table

To find log groups for a specific service across all chains in an environment:

aws logs describe-log-groups \
  --query 'logGroups[?contains(logGroupName, `staging`) && contains(logGroupName, `indexer`)].logGroupName' \
  --output table

To list all log groups in an environment:

aws logs describe-log-groups \
  --log-group-name-prefix "l-staging" \
  --query 'logGroups[].logGroupName' --output table

Also check bento prover log groups:

aws logs describe-log-groups \
  --log-group-name-prefix "/boundless/bento" \
  --query 'logGroups[].logGroupName' --output table

Some services have multiple log groups for different components (e.g. an indexer may have separate groups for the backend worker and the API lambda). When investigating an issue, check all matching log groups.
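
When several lookups are needed, it may be cheaper to list the groups once and filter the saved list locally. The sketch below assumes one log group name per line on stdin; `match_groups` is a hypothetical helper and the cache path is arbitrary.

```shell
# Hypothetical helper: keep only the names that contain every fragment given.
# Reads one log group name per line on stdin; returns 1 if nothing matches.
match_groups() {
  local names frag
  names=$(cat)
  for frag in "$@"; do
    # -F treats the fragment as a literal string, not a regular expression.
    names=$(printf '%s\n' "$names" | grep -F -- "$frag" || true)
  done
  [ -n "$names" ] && printf '%s\n' "$names"
}

# Example: cache the full list once, then filter it as often as needed.
# aws logs describe-log-groups --query 'logGroups[].logGroupName' --output text \
#   | tr '\t' '\n' > /tmp/log-groups.txt
# match_groups staging indexer < /tmp/log-groups.txt
```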

Service name patterns

Common service name fragments to search for:

| Service | Search fragments |
|---|---|
| Indexer API | `indexer-api` |
| Indexer backend | `indexer`, `market-indexer`, `rewards-indexer` |
| Order stream | `order-stream` |
| Order generator | `order-generator`, `og` |
| Slasher | `slasher` |
| Distributor | `distributor` |
| Signal | `prod-8453-signal` (no `l-` prefix) |
| Prover (bento) | `/boundless/bento/prover` or `/boundless/bento/*-prover-*` |

Querying Logs

Always filter by time range. Log groups are high-volume and queries without time bounds will be slow or hit limits.

Use aws logs filter-log-events for searching. Key parameters:

--log-group-name: the log group to search (required)
--start-time / --end-time: Unix epoch milliseconds (the CLI treats these as optional, but always set both)
--filter-pattern: CloudWatch filter syntax for searching log content
--output json: pipe through jq for readability
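
Because the CLI will happily run without time bounds, or with bounds given in seconds instead of milliseconds, a guard before each query can catch the mistake early. This is a minimal sketch; `check_window` is a hypothetical helper, and the 13-digit check is a heuristic that holds for present-day timestamps.

```shell
# Hypothetical guard: succeed only when both bounds are integer Unix
# milliseconds and the start precedes the end.
check_window() {
  local start_ms="$1" end_ms="$2" v
  for v in "$start_ms" "$end_ms"; do
    case "$v" in
      ''|*[!0-9]*) echo "bounds must be integer milliseconds" >&2; return 1 ;;
    esac
    # Epoch seconds have 10 digits today; epoch milliseconds have 13.
    [ "${#v}" -ge 13 ] || { echo "'$v' looks like seconds, not milliseconds" >&2; return 1; }
  done
  [ "$start_ms" -lt "$end_ms" ] || { echo "start must precede end" >&2; return 1; }
}

# Example:
# check_window "$START_MS" "$END_MS" && aws logs filter-log-events ...
```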

Computing timestamps

Convert human-readable times to Unix milliseconds:

START=$(date -j -u -f "%Y-%m-%dT%H:%M:%SZ" "2026-03-30T00:00:00Z" +%s 2>/dev/null || date -d "2026-03-30T00:00:00Z" +%s)
START_MS=$((START * 1000))

END=$(date -j -u -f "%Y-%m-%dT%H:%M:%SZ" "2026-03-31T00:00:00Z" +%s 2>/dev/null || date -d "2026-03-31T00:00:00Z" +%s)
END_MS=$((END * 1000))
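
The same conversion can be wrapped in a function so the BSD/GNU fallback is written only once. `iso_to_ms` is a hypothetical name, and the sketch assumes the input is a UTC timestamp in the `YYYY-MM-DDTHH:MM:SSZ` form used above.

```shell
# Hypothetical helper: convert a UTC ISO-8601 timestamp to Unix milliseconds.
# Tries BSD/macOS date first, then falls back to GNU date.
iso_to_ms() {
  local secs
  secs=$(date -j -u -f "%Y-%m-%dT%H:%M:%SZ" "$1" +%s 2>/dev/null) ||
    secs=$(date -u -d "$1" +%s) || return 1
  echo "$((secs * 1000))"
}

# Example:
# START_MS=$(iso_to_ms "2026-03-30T00:00:00Z")
# END_MS=$(iso_to_ms "2026-03-31T00:00:00Z")
```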

For relative times:

NOW_MS=$(date +%s)000
ONE_HOUR_AGO_MS=$(( ($(date +%s) - 3600) * 1000 ))
SIX_HOURS_AGO_MS=$(( ($(date +%s) - 21600) * 1000 ))
ONE_DAY_AGO_MS=$(( ($(date +%s) - 86400) * 1000 ))
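
For arbitrary look-back windows, the arithmetic above generalizes to a small helper. This is a sketch: `ms_ago` and its `s`/`m`/`h`/`d` suffixes are illustrative choices, not part of any existing tooling. The optional second argument fixes "now" in epoch seconds, which keeps the start and end of a window consistent.

```shell
# Hypothetical helper: print the Unix-millisecond timestamp for "<N><unit> ago".
# Units: s, m, h, d. Optional second argument: "now" in epoch seconds.
ms_ago() {
  local spec="$1" now="${2:-$(date +%s)}" mult amount
  case "$spec" in
    *s) mult=1 ;;
    *m) mult=60 ;;
    *h) mult=3600 ;;
    *d) mult=86400 ;;
    *) echo "use a suffix: s, m, h or d" >&2; return 1 ;;
  esac
  amount="${spec%?}"   # drop the unit suffix
  case "$amount" in
    ''|*[!0-9]*) echo "amount must be a whole number" >&2; return 1 ;;
  esac
  echo "$(( (now - amount * mult) * 1000 ))"
}

# Example: a six-hour window ending now.
# NOW=$(date +%s); START_MS=$(ms_ago 6h "$NOW"); END_MS=$(ms_ago 0s "$NOW")
```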

Searching by request ID

The most common query pattern. Request IDs appear in log messages as hex values (e.g. 0x2a):

aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --start-time "$ONE_HOUR_AGO_MS" \
  --end-time "$NOW_MS" \
  --filter-pattern '"0xREQUEST_ID"' \
  --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'
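
CloudWatch filter patterns are case-sensitive, so the ID has to be typed the way the logs print it. The sketch below normalizes an ID before it goes into `--filter-pattern`. It assumes logs print IDs as lowercase hex with a `0x` prefix, as the `0x2a` example suggests; confirm this against real log output first. `id_pattern` is a hypothetical helper.

```shell
# Hypothetical helper: normalize a hex ID and wrap it in double quotes so it
# can be passed directly as --filter-pattern.
# Assumption: logs print IDs as lowercase hex with a 0x prefix.
id_pattern() {
  local id
  id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$id" in
    0x*) ;;
    *) id="0x$id" ;;
  esac
  case "${id#0x}" in
    ''|*[!0-9a-f]*) echo "not a hex value: $1" >&2; return 1 ;;
  esac
  printf '"%s"\n' "$id"
}

# Example:
# aws logs filter-log-events --log-group-name "$LOG_GROUP" \
#   --start-time "$ONE_HOUR_AGO_MS" --end-time "$NOW_MS" \
#   --filter-pattern "$(id_pattern 0x2A)" --output json
```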

Searching by request digest

aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --start-time "$ONE_HOUR_AGO_MS" \
  --end-time "$NOW_MS" \
  --filter-pattern '"0xDIGEST"' \
  --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'

Searching for errors

aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --start-time "$ONE_HOUR_AGO_MS" \
  --end-time "$NOW_MS" \
  --filter-pattern '"ERROR"' \
  --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'

Searching across multiple log groups

When a service has multiple log groups, query each one:

for LG in "l-staging-84532-indexer-api-84532-lambda" "l-staging-84532-market-indexer-84532-task"; do
  echo "=== $LG ==="
  aws logs filter-log-events \
    --log-group-name "$LG" \
    --start-time "$ONE_HOUR_AGO_MS" \
    --end-time "$NOW_MS" \
    --filter-pattern '"ERROR"' \
    --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'
  sleep 1
done

Pagination

filter-log-events returns a nextToken when there are more results:

TOKEN=""
while true; do
  # Build the shared arguments once; add --next-token only on follow-up pages.
  ARGS=(
    --log-group-name "$LOG_GROUP"
    --start-time "$START_MS"
    --end-time "$END_MS"
    --filter-pattern '"0xREQUEST_ID"'
    --output json
  )
  if [ -n "$TOKEN" ]; then
    ARGS+=(--next-token "$TOKEN")
  fi
  RESP=$(aws logs filter-log-events "${ARGS[@]}")

  echo "$RESP" | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'
  # An empty token means the last page has been read.
  TOKEN=$(echo "$RESP" | jq -r '.nextToken // empty')
  [ -z "$TOKEN" ] && break
  sleep 1
done

CloudWatch Filter Pattern Syntax

Filter patterns are case-sensitive.

"exact phrase" -- matches logs containing the exact phrase (quotes required)
?term1 ?term2 -- OR: matches logs containing either term
"term1" "term2" -- AND: matches logs containing both terms
"ERROR" "request_id" -- AND in practice: matches only logs containing both ERROR and request_id

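Quoting these patterns by hand inside a shell string is error-prone, so the two forms above can be generated instead. This is a minimal sketch with hypothetical helper names. It does not escape double quotes inside a term, so terms must not contain any.

```shell
# Hypothetical helpers: build OR / AND filter patterns from plain terms.
# Each term is wrapped in double quotes so it is matched as an exact phrase.
or_pattern() {
  local out="" term
  for term in "$@"; do
    out="${out:+$out }?\"$term\""
  done
  printf '%s\n' "$out"
}

and_pattern() {
  local out="" term
  for term in "$@"; do
    out="${out:+$out }\"$term\""
  done
  printf '%s\n' "$out"
}

# Example:
# --filter-pattern "$(or_pattern "Stopping Docker Compose" "Starting Docker Compose")"
```
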
Checking for Recent Deployments

When investigating fulfillment rate drops, prover downtime, or success rate alarms for provers we operate, always check for recent deployments first. Nightly deployments restart the bento Docker Compose stack and can cause extended outages if the new image is broken.

Deployment events appear in the bento prover log groups (e.g. /boundless/bento/prover-11155111-prod-nightly). Look for these patterns:

# Find recent deployments in a time window
aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --start-time "$START_MS" \
  --end-time "$END_MS" \
  --filter-pattern '?"Stopping Docker Compose" ?"Starting Docker Compose"' \
  --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'

A deployment cycle looks like:

"Stopping Docker Compose services" — old containers torn down
"Image ghcr.io/boundless-xyz/boundless/broker:<tag> Pulling" — new image pulled (the tag contains the git commit, e.g. nightly-3b8a71f)
"Container bento-broker-1 Created" / "Starting" — new containers come up
Optionally: "dependency failed to start: container ... is unhealthy" — a container failed its healthcheck, cascading to broker failure
If "Starting Docker Compose" is missing or far behind "Stopping Docker Compose", the broker may be in graceful shutdown drain — search ?"starting graceful shutdown" ?"in-progress orders to complete" ?"Cancelling critical tasks". The broker waits up to 2h (SHUTDOWN_GRACE_PERIOD_SECS) for committed orders before exiting; during this window bento_active=0 and channel closed errors from the chain monitor are expected, not an outage.
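
To read a long run of deployment messages at a glance, each line can be tagged with the phase it belongs to. The sketch below matches only the message fragments quoted above; the phase labels and the `deploy_phase` name are illustrative, and real messages may carry extra prefixes.

```shell
# Hypothetical helper: label one log message with its deployment phase,
# using the message fragments listed above.
deploy_phase() {
  case "$1" in
    *"Stopping Docker Compose"*) echo "teardown" ;;
    *"Image "*" Pulling"*) echo "pull" ;;
    *"dependency failed to start"*|*"is unhealthy"*) echo "healthcheck-failure" ;;
    *"starting graceful shutdown"*|*"in-progress orders to complete"*|*"Cancelling critical tasks"*)
      echo "shutdown-drain" ;;
    *"Starting Docker Compose"*|*"Container "*" Created"*|*"Container "*" Starting"*)
      echo "startup" ;;
    *) echo "other" ;;
  esac
}

# Example: tag every message returned by a query saved in $RESP.
# echo "$RESP" | jq -r '.events[].message' | while IFS= read -r line; do
#   printf '%-20s %s\n' "$(deploy_phase "$line")" "$line"
# done
```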

Deployments are significant events -- they restart the broker (causing a brief gap in telemetry and fulfillments even when healthy) and deploy new code that could introduce bugs or behavior changes. Always note when a deployment occurred relative to the issue being investigated.

If the broker stopped fulfilling shortly after a deployment, check for:

Healthcheck failures: ?"unhealthy" ?"failed to start" ?"Error dependency" — a dependency container (often rest_api) failed, preventing the broker from starting
Container crashes: ?"exit" ?"Exited" ?"Restarting" — the broker or a dependency crashed after startup
Image tag: compare the deployed image tag (git commit hash) against the git log to identify what changed

# Check for container failures after a deployment
aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --start-time "$START_MS" \
  --end-time "$END_MS" \
  --filter-pattern '?"unhealthy" ?"failed to start" ?"Exited" ?"Error dependency"' \
  --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'
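
For the image tag comparison, the commit can be extracted from the "Pulling" message and handed to git. This sketch assumes the tag ends in `-<short commit>`, as in the `nightly-3b8a71f` example; `image_commit` is a hypothetical helper.

```shell
# Hypothetical helper: extract the broker image tag from a "Pulling" message
# and print the part after the last "-" (assumed to be the git commit).
image_commit() {
  local tag
  tag=$(printf '%s\n' "$1" | sed -n 's/.*broker:\([^ ]*\) Pulling.*/\1/p')
  [ -n "$tag" ] || return 1
  printf '%s\n' "${tag##*-}"
}

# Example: show the deployed commit and the few before it.
# git log --oneline -5 "$(image_commit "$MESSAGE")"
```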

Secondary Fulfillment in Logs

When a prover locks an order but fails to fulfill it before the lock expires, the order becomes available for secondary fulfillment by any other prover, who earns the slash collateral as reward. In broker logs, secondary fulfillment attempts appear as FulfillAfterLockExpire entries. When investigating expired or slashed requests, search our BP prover logs for the request ID to see if they evaluated the secondary fulfillment opportunity:

# Search for secondary fulfillment activity on a specific request
aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --start-time "$START_MS" \
  --end-time "$END_MS" \
  --filter-pattern '"0xREQUEST_ID" "FulfillAfterLockExpire"' \
  --output json | jq '.events[] | {timestamp: (.timestamp / 1000 | todate), message: .message}'

If the request ID doesn't appear at all, the prover never saw the secondary opportunity. If it appears with skip or error messages, note the reason -- common issues include the order being unprofitable at the slash collateral price, insufficient remaining deadline, or the prover being at capacity. Always check whether our BP provers attempted secondary fulfillment on orders that expired after being locked.
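
The triage described above can be sketched as a filter over the messages returned by that query. The `skip` and `error` keywords are assumptions about how the broker words these messages, and `secondary_status` and its labels are hypothetical; adjust them to the real log text.

```shell
# Hypothetical helper: summarize the messages for one request (one per line
# on stdin) as not-seen, skipped-or-errored, or seen.
# Assumption: skip/error outcomes contain the word "skip" or "error".
secondary_status() {
  local lines
  lines=$(grep -F -- "FulfillAfterLockExpire" || true)
  if [ -z "$lines" ]; then
    echo "not-seen"              # the prover never saw the opportunity
  elif printf '%s\n' "$lines" | grep -qiE 'skip|error'; then
    echo "skipped-or-errored"    # read the messages for the reason
  else
    echo "seen"
  fi
}

# Example:
# echo "$RESP" | jq -r '.events[].message' | secondary_status
```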

Tips

Keep time windows as narrow as possible (minutes or hours, not days)
Start with a request ID filter, then broaden if needed
Log messages are typically structured (JSON or key=value), so jq is useful for parsing
If the output is very large, add | head -50 or pipe to a file
Use the --limit flag to cap results per API call (default 10000)
When unsure which log group to query, discover them first with describe-log-groups
Some services span multiple log groups -- check all matching groups when investigating
