Jeden Skill in Manus ausführen
mit einem Klick

Jeden Skill in Manus mit einem Klick ausführen

browser-automation

Automate Google Chrome for web tasks — navigate sites, fill forms, click elements, take snapshots, and extract content. Powered by playwright-cli (pw) with a persistent browser profile. Use when asked to interact with a website, fill out a checkout form, scrape content, or perform any browser-based workflow.

In Manus ausführen

Überblick

Installationsbefehl

npx skills add https://github.com/ramp-public/ramp-cli --skill browser-automation

Kopieren Sie diesen Befehl und fügen Sie ihn in Claude Code ein, um den Skill zu installieren

Quelle

ramp-public/ramp-cli

Sterne30

Forks4

Aktualisiert15. Mai 2026 um 23:25

SKILL.md

readonly

Mehr aus diesem Repository

gleiches Repository

agentic-purchase

ramp-public/ramp-cli

Make purchases using Ramp agent cards via browser checkout, then complete all transaction requirements (memo, tracking categories, receipt, trip). Use when asked to buy something with an agent card, make a payment using Ramp, spend from a fund, complete missing transaction items, or test the agent card payment flow. Also use when asked about agent card access, availability, or how to get started with Agentic Cards. Requires ramp CLI and browser-automation skill.

2026-05-1530

approval-dashboard

ramp-public/ramp-cli

Review and approve pending transactions, bills, reimbursements, and requests. Use when: 'approve', 'pending approvals', 'what needs my approval', 'review transactions', 'approve bills', 'reject', 'approval queue', 'clear my approvals'. Do NOT use for: transaction analysis, receipt uploads, or spend tracking.

2026-05-1530

manage-bills

ramp-public/ramp-cli

Search, inspect, and manage vendor bills and invoices. Use when: 'find a bill', 'show me pending bills', 'bill details', 'look up a bill', 'draft bill details', 'bill attachments', 'what bills need my approval'. Do NOT use for: approving bills (use approval-dashboard), uploading vendor documents (use vendor-document-upload), or card transaction management (use transaction-cleanup).

2026-05-1530

manage-procurement

ramp-public/ramp-cli

Search, inspect, summarize, and safely review procurement requests and purchase orders. Use when: 'find a PO', 'show procurement request details', 'purchase order status', 'what procurement requests need approval', 'approve this PO request'. Do NOT use for: bill payment, card transaction cleanup, reimbursements, or vendor document upload.

2026-05-1530

payment-lookup

ramp-public/ramp-cli

Look up vendor bill payments and verify payment status from the terminal. Use when: 'did we pay', 'payment status', 'check if paid', 'verify payment', 'find invoice', 'bill lookup', 'was this bill paid', 'payment confirmation'. Do NOT use for: approving bills (use approval-dashboard), spend analysis across vendors (use spend-analysis), or uploading receipts (use receipt-compliance).

2026-05-1530

spend-analysis

ramp-public/ramp-cli

Analyze spend by vendor, category, or team over a date range. Pulls card transactions and bill payments, aggregates totals, and produces summary tables. Use when: 'how much did we spend on', 'vendor spend', 'SaaS review', 'spend report', 'inference costs', 'total spend', 'spend by vendor', 'spend analysis', 'pull transactions for', 'cost breakdown'. Do NOT use for: approving transactions (use approval-dashboard), uploading receipts (use receipt-compliance), or verifying a single bill payment (use payment-lookup).

2026-05-1530

Quelle

ramp-public

ramp-public/ramp-cli

GitHub-Repository öffnen Creator-Repositorys ansehen

Installationsbefehl

Download

In Manus ausführen

Nützlich fürSOC

SoftwareentwicklerInformatik- und Mathematikberufe15-1252L4

name	browser-automation
description	Automate Google Chrome for web tasks — navigate sites, fill forms, click elements, take snapshots, and extract content. Powered by playwright-cli (pw) with a persistent browser profile. Use when asked to interact with a website, fill out a checkout form, scrape content, or perform any browser-based workflow.

Browser Automation

Automate Google Chrome via playwright-cli (pw). Maintains a persistent Chrome profile at ~/.pw-agent/.playwright-profile/ so logins and cookies survive across sessions.

Prerequisites

Node.js installed
playwright-cli installed: npm install -g @playwright/cli
Minimum version: 0.1.0 (check with playwright-cli --version)
If playwright-cli is not found after install, the npm global bin may not be in PATH. Fix with:
```
export PATH="$(npm root -g)/../bin:$PATH"
```

Setup

~/.pw-agent/ is the runtime home. It holds the pw launcher script and persistent Chrome state.

First-time setup

Create runtime directories:

mkdir -p ~/.pw-agent/.playwright-profile ~/.pw-agent/.playwright-cli ~/.pw-agent/.playwright-mcp

Create a launcher script at ~/.pw-agent/pw:

cat > ~/.pw-agent/pw << 'LAUNCHER'
#!/usr/bin/env bash
cd "$(dirname "$0")"
# Ensure npm global bin is in PATH
NPM_BIN="$(npm root -g 2>/dev/null)/../bin"
[ -d "$NPM_BIN" ] && export PATH="$NPM_BIN:$PATH"
exec playwright-cli "$@"
LAUNCHER
chmod +x ~/.pw-agent/pw

Seed the profile by opening a site:
```
cd ~/.pw-agent && ./pw open https://example.com
```
Log in to any services you need. Close the browser when done. The profile persists for future sessions.

CRITICAL: Never delete ~/.pw-agent/.playwright-profile/ — it contains saved logins and cookies. Re-setup should only recreate the launcher script and mkdir -p missing directories.

Usage

All commands run from ~/.pw-agent/:

cd ~/.pw-agent && ./pw open https://example.com
./pw snapshot        # Capture full DOM as YAML accessibility tree
./pw screenshot      # Capture viewport as PNG
./pw click e42       # Click element by ref ID
./pw fill e42 "text" # Fill form field
./pw press Enter     # Press keyboard key
./pw stop            # Close browser

Headed by default — do not pass `--headless`

The default mode is headed and you should keep it that way. Headed mode lets the user see the browser, intervene on bot checks, and watch the agent — all of which matter most when money is moving.

Never pass --headless for:

Agent-card purchases or any payment flow
Logged-in workflows where a session may need to be re-authenticated
Any task where the user is at the keyboard and would benefit from seeing the browser

--headless is only appropriate for unattended scraping or background inspection where no human is watching, and even then ask the user first.

./pw --headless open https://example.com   # Launch headless (only when explicitly approved)
./pw snapshot                               # Already running — no flag needed

Pass --headless only on the command that launches the browser (open for session 0, start for numbered sessions). Once launched, the session retains its mode for its lifetime — switching requires ./pw stop and re-opening.

Multi-session support

Run parallel browser sessions with isolated contexts:

./pw 1 start https://site-a.com
./pw 1 snapshot
./pw 1 stop

./pw 2 start https://site-b.com   # Parallel session
./pw 2 snapshot
./pw 2 stop

Named sessions for readability:

./pw --session "checkout" open https://store.com
./pw --session "checkout" snapshot
./pw --session "checkout" click e42
./pw --session "checkout" stop

If ./pw N start says "Session already active", pick another number or stop the existing one.

Navigating within a session: Don't use open on an already-running session. Use eval instead:

./pw eval "() => { window.location.href = 'https://example.com/new-page' }"
sleep 3  # Wait for SPA to render
./pw screenshot

Output artifacts

Snapshots: .playwright-cli/page-<timestamp>.yml
Screenshots: .playwright-cli/page-<timestamp>.png
Downloads: .playwright-mcp/

Screenshots vs Snapshots

Screenshots capture the visible viewport only
- Use to verify visual state, debug layout, understand what's visible/hidden
- Use to decide what action to take next (click, scroll, type)
Snapshots capture the entire DOM, not just the visible viewport
- Use to find element refs ([ref=eNNN]) for programmatic interaction
- Use for greppable text content and DOM structure
- Grep snapshots instead of reading hundreds of lines:
```
grep -i "submit\|checkout\|button" .playwright-cli/page-*.yml | head -10
```

Element refs

./pw snapshot produces a YAML accessibility tree with [ref=eNNN] identifiers for interactive elements:

./pw snapshot           # Find element refs
./pw fill e42 "query"   # Interact by ref
./pw click e15          # Click by ref

Refs are invalidated after page changes (navigation, AJAX, dynamic content) — re-snapshot to get fresh refs.

Prefer keyboard over clicks for more reliable interaction:

Tab — next element / form field
Enter — select autocomplete result, activate button
Meta+Enter — submit forms, send messages
Escape — close dialogs, cancel operations
PageUp/PageDown — scroll to load more content

Wait for async content:

SNAPSHOT=$(./pw snapshot 2>&1 | grep -oE '\.playwright-cli/[^ )]+\.yml')
grep -iE "results\|loaded" "$SNAPSHOT" -C 5 || echo "not ready yet"

Dropdown `<select>` elements

fill does NOT work on <select> elements. Use the select command instead:

./pw select <ref> "Option Label"   # Select by visible text
./pw select <ref> "value"          # Select by value attribute

The snapshot shows <select> elements as combobox roles. If fill errors with "Element is not an <input>", switch to select.

Typing vs filling

fill sets the value programmatically — fast but may not trigger React/JS change handlers on some sites. If fill doesn't work:

Click the field first: ./pw click <ref>
Clear it: ./pw press Meta+a then ./pw press Backspace
Type character by character: ./pw press 3 then ./pw press . then ./pw press 4 etc.

Some numeric inputs strip non-numeric characters (e.g., . in donation amount fields that only accept whole numbers).

Nested iframes

Payment forms (Stripe, Braintree) are often inside nested iframes. Playwright-cli handles this transparently — element refs like f20e21 (with f prefix + nested frame IDs) work across iframe boundaries. Just use the ref from the snapshot as you would any other element.

Stripe iframe refs change when the iframe is recreated (e.g., navigating between wizard steps). Always re-snapshot to get fresh refs after page transitions.

Dismissing popups and modals

Many sites show cookie banners, login prompts, or promotional modals:

SNAPSHOT=$(./pw snapshot 2>&1 | grep -oE '\.playwright-cli/[^ )]+\.yml')
grep -iE "dismiss|close|no.thanks|decline|accept.*cookie" "$SNAPSHOT" | head -5
./pw click e42   # Click the dismiss button

JavaScript evaluation

Use heredocs to avoid shell quoting issues:

./pw eval "$(cat <<'EOF'
() => {
  var links = document.querySelectorAll('a[href]');
  return Array.from(links).map(a => a.href).slice(0, 10);
}
EOF
)"

IMPORTANT: eval expects a function definition, not a value expression:

# DO THIS — pass a function
./pw eval "() => document.title"

# NOT THIS — causes "TypeError: result is not a function"
./pw eval "document.title"

Scrolling

mousewheel requires positioning the mouse over the scrollable content first:

./pw mousemove 600 400   # Position over content
./pw mousewheel 500 0    # Scroll down 500px

mousewheel 500 0 = scroll down 500px
mousewheel -500 0 = scroll up 500px
mousewheel 0 500 = scroll right 500px

Authentication tips

When a site requires login:

Check if there's an SSO/OAuth option — prefer it over username/password
If you need credentials, ask the user — never guess passwords
For sites with saved sessions, the persistent profile may already be authenticated
After logging in, the session persists in the profile for future use

Bot checks, CAPTCHAs, and human handoff

When you hit a CAPTCHA, reCAPTCHA, 3DS challenge, login wall, "verify you're human" page, or anything else that requires human input, do not retry programmatically. The agent will not solve these reliably, and on payment flows it should not try. Instead, hand the wheel to the user:

Confirm the browser is headed. If you launched with --headless, the user cannot interact with the page — ./pw stop and re-open without the flag before continuing.
Screenshot the current state so the user knows what they are looking at:
```
./pw screenshot
```
Hand off to the user in plain language. Tell them what is on the screen and what to do in the visible Chrome window. Example: "There's a reCAPTCHA on the Gumroad checkout. Switch to the Chrome window I opened, solve it, and reply when you're past it."
Wait. Do not poll the page or re-snapshot. Wait for the user's reply before doing anything else.
Resume. Once the user confirms, snapshot again and continue from where you stopped. Element refs may have changed during their interaction — re-grep before clicking.

This pattern works because the persistent profile and headed Chrome mean the user shares the same browser session as the agent. They do not need to restart anything, log in again, or copy URLs — they just interact with the window that's already on their screen.

Debugging checkout issues

When a checkout form behaves unexpectedly, check browser console logs:

cat ~/.pw-agent/.playwright-cli/console-*.log | tail -30

Look for HTTP status codes and error events — they reveal what the UI may not show.

Multi-step checkout wizards

Some checkout widgets use multi-step wizards where the Stripe iframe is destroyed between steps. Card data entered in step N may not be preserved when you reach step N+3.

Workaround strategies:

Prefer merchants with single-page checkout forms — fewer points of failure
Look for a direct checkout URL instead of an embedded widget
If stuck with a multi-step wizard, move through the non-card steps as quickly as possible

Best practices

Don't stop to narrate intermediate states — keep clicking through auth flows and loading screens until you hit an actual blocker
Try the action before assuming it won't work
Busywait with exponential backoff for page loads and async content
After a browser session, summarize actions in shorthand: open <url> -> click e42 -> fill e15 "query" -> press Enter
Run date at session start to know current date/time for interpreting relative dates

browser-automation

Mehr aus diesem Repository

Mehr aus diesem Repository

Browser Automation

Prerequisites

Setup

First-time setup

Usage

Headed by default — do not pass --headless

Multi-session support

Output artifacts

Screenshots vs Snapshots

Element refs

Dropdown <select> elements

Typing vs filling

Nested iframes

Dismissing popups and modals

JavaScript evaluation

Scrolling

Authentication tips

Bot checks, CAPTCHAs, and human handoff

Debugging checkout issues

Multi-step checkout wizards

Best practices

Browser Automation

Prerequisites

Setup

First-time setup

Usage

Headed by default — do not pass --headless

Multi-session support

Output artifacts

Screenshots vs Snapshots

Element refs

Dropdown <select> elements

Typing vs filling

Nested iframes

Dismissing popups and modals

JavaScript evaluation

Scrolling

Authentication tips

Bot checks, CAPTCHAs, and human handoff

Debugging checkout issues

Multi-step checkout wizards

Best practices

Headed by default — do not pass `--headless`

Dropdown `<select>` elements

Headed by default — do not pass `--headless`

Dropdown `<select>` elements