chrome
// Browser automation for the user's Chrome browser. Use for browser tasks that require the user's cookies, logged-in sessions, existing tabs, extensions, or remote authenticated sites.
| name | Chrome |
| description | Browser automation for the user's Chrome browser. Use for browser tasks that require the user's cookies, logged-in sessions, existing tabs, extensions, or remote authenticated sites. |
Use this skill when the user mentions @chrome.
Chrome is the routing touchpoint for the Codex Chrome Extension:
- For @chrome requests, do not ask a clarification question just because the request is ambiguous. Proceed with browser automation in this skill using the chrome backend.

Before using this skill for the first time in the current conversation context, read the entire SKILL.md file in one read. Do not use a partial range such as sed -n '1,220p'; read through the end of the file. Do not mention this internal skill-loading step to the user.
On the first Chrome-backed browser task in a session, try a lightweight browser-client call such as listing open tabs after bootstrap. If the call fails, wait 2 seconds and retry the same lightweight browser-client call once. Any non-error response means the extension is installed and working.
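The probe-and-retry step above can be sketched as a small helper. This is a minimal sketch, not part of browser-client: `probeWithOneRetry` and its `delayMs` parameter are hypothetical names, and the usage comment assumes `browser.tabs.list()` is the lightweight call being probed.

```javascript
// Hypothetical helper (not part of browser-client): run a lightweight probe,
// and if it throws, wait and retry the same probe exactly once.
async function probeWithOneRetry(probe, delayMs = 2000) {
  try {
    return { ok: true, attempts: 1, value: await probe() };
  } catch {
    // First attempt failed: pause, then retry the same call once.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    try {
      return { ok: true, attempts: 2, value: await probe() };
    } catch (secondError) {
      // Still failing after the single retry: the caller should move on
      // to the Chrome and extension checks.
      return { ok: false, attempts: 2, error: secondError };
    }
  }
}

// Usage inside node_repl after bootstrap (assumes `browser` is already bound):
//   const result = await probeWithOneRetry(() => browser.tabs.list());
//   console.log(result.ok ? "connected" : "needs setup checks");
```

Returning a result object instead of throwing keeps the decision about which setup check to run next in the calling cell.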
If browser-client still reports that it cannot communicate with Chrome after that retry, confirm that Chrome is installed and running, and that the extension is present in the selected Chrome profile:
From the plugin root, use node_repl to run:
scripts/chrome-is-running.js --check
scripts/installed-browsers.js --check
scripts/check-extension-installed.js --json
scripts/check-native-host-manifest.js --json
Depending on the outcome, work through the checks below. When a check says to ask the user for permission, ask before acting.
Keep the first response short and non-technical unless the user asks for more information.
If Chrome is not installed, then inform the user that this plugin only works with the Chrome browser.
Keep the first response short and non-technical unless the user asks for more information.
If Chrome is not running then ALWAYS ask the User if they would like to launch Chrome. ALWAYS wait for a user response before taking action.
Keep the first response short and non-technical unless the user asks for more information.
Do not install or repair the native host yourself. If native host setup appears broken, tell the user to reinstall the Chrome plugin from the Codex plugin UI.
Keep the first response short and non-technical unless the user asks for more information.
If the Codex Chrome Extension is missing, tell the user:
Cannot communicate with the Codex Chrome Extension. Confirm that the extension is installed and enabled in Chrome.
Ask the User if you can open the Codex Chrome Extension webstore page so they can verify that the extension is installed. ALWAYS wait for a user response before taking action. ALWAYS refer to the extension as the Codex Chrome Extension, and not by its extension ID.
You can construct the URL of the Codex Chrome extension webstore page by appending the extensionId from scripts/extension-id.json to https://chromewebstore.google.com/detail/codex/.
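As a minimal sketch of that URL construction: the helper name `webStoreUrl` is hypothetical, the example id is a placeholder rather than the real extension id, and the code assumes the parsed JSON exposes the id under the key `extensionId`.

```javascript
// Base URL quoted in the text above.
const WEBSTORE_BASE = "https://chromewebstore.google.com/detail/codex/";

// Build the store page URL from the parsed contents of scripts/extension-id.json.
// Assumption: the JSON object exposes the id under the key `extensionId`.
function webStoreUrl(extensionConfig) {
  const id = extensionConfig && extensionConfig.extensionId;
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("extensionId is missing from the extension config");
  }
  return WEBSTORE_BASE + encodeURIComponent(id);
}

// Usage with a placeholder id (not the real extension id):
console.log(webStoreUrl({ extensionId: "exampleextensionid" }));
// prints "https://chromewebstore.google.com/detail/codex/exampleextensionid"
```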
Keep the first response short and non-technical unless the user asks for more information.
If the Codex Chrome Extension is not enabled, ask the User if you can open the Google Chrome Extension Manager so they can verify that the extension is enabled. ALWAYS wait for a user response before taking action. Always refer to the Google Chrome Extension Manager as Google Chrome Extension Manager.
Keep the first response short and non-technical unless the user asks for more information.
If Chrome is running and the extension/native-host checks pass, ask the User if you can open a Chrome window for the selected Chrome profile and retry the connection. ALWAYS wait for a user response before taking action.
If the User agrees, run:
scripts/open-chrome-window.js
Then wait 2 seconds and retry the browser-client setup once.
After one successful setup check in a session, do not repeat extension detection unless browser-client reports an extension connection failure.
If the issue is specifically the native host or extension-backed install path, or if communication still fails after opening a Chrome window and retrying setup once, tell the user to reinstall the Chrome plugin from the Codex plugin UI. Never import or run scripts/installManifest.mjs yourself.
Keep the first response short and non-technical unless the user asks for more information.
If file upload fails when using playwright_file_chooser_set_files, set_files, or similar, tell the user exactly this:
To enable file upload, go to chrome://extensions in Chrome, click Details under the Codex extension, and enable "Allow access to file URLs." See [here](https://developers.openai.com/codex/app/chrome-extension#upload-files) for details.
This script reports which browsers are installed.
From the plugin root, use node_repl to run:
scripts/installed-browsers.js
Use JSON output when another tool or script needs structured data:
scripts/installed-browsers.js --json
This script checks whether Google Chrome is actively running. It exits 0 when Chrome is running, 1 when Chrome is not running, and 2 for usage or runtime errors.
From the plugin root, use node_repl to run:
scripts/chrome-is-running.js --check
Use JSON output when another tool or script needs structured data:
scripts/chrome-is-running.js --json
This script opens about:blank in a Google Chrome window for the same selected Chrome profile used by check-extension-installed.js. Use it only after the User gives permission.
From the plugin root, use node_repl to run:
scripts/open-chrome-window.js
Use dry-run JSON output when another tool or script needs to verify the selected launch command without opening Chrome:
scripts/open-chrome-window.js --dry-run --json
This script checks whether the selected Google Chrome profile has installed version directories for the configured public Chrome Web Store extension ID. It exits 0 when installed and enabled, 1 when installed but not enabled, 2 when not installed, and 3 for usage or runtime errors.
From the plugin root, use node_repl to run:
scripts/check-extension-installed.js
Use JSON output when another tool or script needs structured data:
scripts/check-extension-installed.js --json
The check reads the configured extension ID from scripts/extension-id.json. It detects the Chrome profile from Local State, then falls back to the highest-numbered Profile X or Default directory with Preferences. For debugging or tests, override profile selection with CODEX_CHROME_USER_DATA_DIR=/path/to/chrome-root or CODEX_CHROME_PREFERENCES_PATH=/path/to/Profile/Preferences.
This script checks whether the Chrome Native Messaging Host manifest exists for the configured native host name and allows the Chrome extension ID from scripts/extension-id.json. On Windows it also checks the Chrome NativeMessagingHosts registry key. It exits 0 when correct, 1 when missing or incorrect, and 2 for usage or runtime errors.
From the plugin root, use node_repl to run:
scripts/check-native-host-manifest.js
Use JSON output when another tool or script needs structured data:
scripts/check-native-host-manifest.js --json
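The exit codes described for the three check scripts can be collected into one lookup table. This is only a sketch: `EXIT_CODES` and `describeExit` are hypothetical names, and the table covers only the codes stated above (no exit codes are given for installed-browsers.js, so it is left out).

```javascript
// Exit-code meanings as stated in the script descriptions above.
const EXIT_CODES = {
  "chrome-is-running.js": {
    0: "running",
    1: "not running",
    2: "usage or runtime error",
  },
  "check-extension-installed.js": {
    0: "installed and enabled",
    1: "installed but not enabled",
    2: "not installed",
    3: "usage or runtime error",
  },
  "check-native-host-manifest.js": {
    0: "correct",
    1: "missing or incorrect",
    2: "usage or runtime error",
  },
};

// Hypothetical helper: translate a script name and exit code into its meaning.
function describeExit(script, code) {
  const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  if (!hasOwn(EXIT_CODES, script) || !hasOwn(EXIT_CODES[script], code)) {
    return "unknown";
  }
  return EXIT_CODES[script][code];
}

console.log(describeExit("check-extension-installed.js", 1));
// prints "installed but not enabled"
```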
- To use one of the user's existing tabs, call browser.user.openTabs(), choose the matching returned tab by its visible title, URL, recency, and tab group, then pass that exact object to browser.user.claimTab(tab).
- claimTab(tab) returns a Tab. Reuse that returned tab for navigation, Playwright, screenshots, CUA, and content reads.
- Only claim tabs that appear in the openTabs() result.

Handle file inputs and uploads through the file chooser flow.
Use this pattern:
const chooserPromise = tab.playwright.waitForEvent("filechooser", { timeoutMs: 10000 });
await tab.playwright.locator('input[type="file"]').click();
const chooser = await chooserPromise;
await chooser.setFiles(["/absolute/path/to/file.txt"]);
Notes:
- Start waitForEvent("filechooser") before clicking the file input or its associated upload control.
- Click input[type="file"] when it is available; if the UI uses a visible button or label, click that only when it is the control that opens the chooser.
- Pass absolute file paths to setFiles(...).
- Check chooser.isMultiple() before passing multiple files when needed.
- locator.setInputFiles(...) is not available in this wrapper; uploads are exposed via the chooser object instead.

If the task involves attaching a local file, check for a file input and try the filechooser flow before falling back to a native picker.
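The upload notes can be folded into one reusable function. This is a sketch under stated assumptions: `uploadViaChooser` is a hypothetical name, `tab` is a Tab already obtained from browser-client, and `chooser.isMultiple()` is awaited so the code works whether the wrapper returns a boolean or a promise.

```javascript
// Hypothetical wrapper around the chooser flow described above.
// `tab` is a claimed or created Tab; `selector` targets the control that opens the chooser.
async function uploadViaChooser(tab, selector, absolutePaths) {
  // Arm the listener first so the chooser event is not missed.
  const chooserPromise = tab.playwright.waitForEvent("filechooser", { timeoutMs: 10000 });
  await tab.playwright.locator(selector).click();
  const chooser = await chooserPromise;

  // Only pass several files when the input accepts them.
  if (absolutePaths.length > 1 && !(await chooser.isMultiple())) {
    throw new Error("This file input accepts a single file only");
  }
  await chooser.setFiles(absolutePaths);
}

// Usage inside node_repl (assumes `tab` is already bound):
//   await uploadViaChooser(tab, 'input[type="file"]', ["/absolute/path/to/file.txt"]);
```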
- When browser work for the turn is done, clean up with browser.tabs.finalize({ keep }).
- Call browser.tabs.finalize({ keep }) as the final Chrome browser action of the turn. Do not call Chrome browser tools after finalizing. If more browser work is needed, do it before finalizing, then finalize once with the final tab disposition.
- Pass the tabs that should stay open in keep.
- Use status: "deliverable" when the tab itself is a user-facing output or requested open page: for example a created/edited document, spreadsheet, slide deck, dashboard, checkout/cart, submitted form result, or a page the user explicitly asked to keep open or inspect directly. Deliverable tabs move to the shared ✅ Codex tab group.
- Use status: "handoff" only when the task is still in progress and the user or a later turn should continue from the current task tab group: for example a page waiting for user input, login, approval, payment, CAPTCHA, or an unfinished workflow.

These setup details are internal. User-facing progress updates should be less technical in nature. Never mention Node REPL, node_repl, REPL, JavaScript sessions, or module exports unless a user is asking for that exact information. If setup or recovery is needed, describe it naturally as connecting to the browser or retrying the browser connection.
The browser-client module is the core entry point for browser use, and is available under scripts/browser-client.mjs in this plugin's root directory. ALWAYS import it using an absolute path.
IMPORTANT: If this path cannot be found, stop and report that this plugin is missing scripts/browser-client.mjs. NEVER use the built in browser-client library.
Run browser setup code through the Node REPL js tool. In this environment the callable tool id typically appears as mcp__node_repl__js; js_reset only clears state and is not the execution tool. Run this once per fresh node_repl session:
const { setupAtlasRuntime } = await import("<plugin root>/scripts/browser-client.mjs");
await setupAtlasRuntime({ globals: globalThis });
globalThis.browser = await agent.browsers.get("extension");
Use the browser bound to browser for tasks in this skill.
IMPORTANT: do NOT attempt to dig through source code or control the browser through unrelated mechanisms before attempting the workflow for the selected backend. If you run into issues, follow the steps below FIRST.
- If js_reset is visible but js is not, do not conclude that node_repl is unusable. Use tool discovery for node_repl js, then mcp__node_repl__js, then js, then node_repl js JavaScript execution; run the bootstrap cell with the Node REPL js tool once it is exposed.
- If the js execution tool is still unavailable after those searches, say that explicitly before choosing any fallback browser-control path.
- If node_repl is not available, say that explicitly before choosing any fallback browser-control path.

Browser commands are executed by calling the Node REPL js tool with JavaScript code. Do not look for a browser-specific js tool; the generic Node REPL MCP provides it.
- When starting in node_repl, first set up the runtime using the guarded first-browser-cell pattern below. You do not have access to the display function until setup is complete. There is no tab variable until you define it yourself.
- When work can be done in node_repl, prefer node_repl instead of shell commands.
- node_repl does not automatically print or return the last expression. If you want to see a value, explicitly use console.log(...), display(...), or equivalent.
- Reuse the tab binding across cells. If tab already exists, keep using it instead of reacquiring the same tab.
- Runtime setup and tab acquisition are usually one-time per session unless the kernel resets.
- If you lose the tab binding, prefer recovering current-session tabs with browser.tabs.list() and browser.tabs.get(tab.id).
- Call await browser.nameSession("...") immediately after setup and before opening or selecting tabs. Start the name with a neutral, friendly, task-relevant emoji to make the session easy to scan. If unsure, use 🔎.
- Define tab before using it. Never write tab = ... before tab exists.

If startup may be retried, use a retry-safe setup cell such as:
if (!globalThis.agent) {
const { setupAtlasRuntime } = await import("<plugin root>/scripts/browser-client.mjs");
await setupAtlasRuntime({ globals: globalThis });
}
if (!globalThis.browser) {
globalThis.browser = await agent.browsers.get("extension");
}
await browser.nameSession("🔎 short task name");
if (typeof tab === "undefined") {
globalThis.tab = await browser.tabs.selected();
}
browser.tabs.selected() may fail if the selected browser does not report an active tab.
If there may not be a selected tab, create a new one instead:
if (!globalThis.agent) {
const { setupAtlasRuntime } = await import("<plugin root>/scripts/browser-client.mjs");
await setupAtlasRuntime({ globals: globalThis });
}
if (!globalThis.browser) {
globalThis.browser = await agent.browsers.get("extension");
}
await browser.nameSession("🔎 short task name");
if (typeof tab === "undefined") {
globalThis.tab = await browser.tabs.new();
}
After that, keep using the existing tab binding. Do not alternate between tab = ..., let tab = ..., const tab = ..., and globalThis.tab = ... across retries.
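The two setup variants (selected tab versus new tab) can also be combined into one fallback. This is a minimal sketch: `selectedOrNewTab` is a hypothetical helper, and it relies only on `browser.tabs.selected()` resolving to a tab or `undefined` (or failing) and on `browser.tabs.new()` creating a tab.

```javascript
// Hypothetical helper: prefer the selected tab, otherwise open a new one.
// tabs.selected() may resolve to undefined or fail when no active tab is reported.
async function selectedOrNewTab(browser) {
  try {
    const selected = await browser.tabs.selected();
    if (selected) {
      return selected;
    }
  } catch {
    // No active tab was reported; fall through to creating one.
  }
  return browser.tabs.new();
}

// Usage in a setup cell, keeping the single guarded binding style:
//   if (typeof tab === "undefined") {
//     globalThis.tab = await selectedOrNewTab(browser);
//   }
```

If the browser connection itself is down, `browser.tabs.new()` still fails, so connection errors are not hidden by the fallback.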
If you already created the bindings in an earlier node_repl call in the current session, such as:
if (!globalThis.agent) {
const { setupAtlasRuntime } = await import("<plugin root>/scripts/browser-client.mjs");
await setupAtlasRuntime({ globals: globalThis });
}
if (!globalThis.browser) {
globalThis.browser = await agent.browsers.get("extension");
}
await browser.nameSession("📰 Hacker News");
if (typeof tab === "undefined") {
globalThis.tab = await browser.tabs.new();
}
await tab.goto("https://news.ycombinator.com");
await display(await tab.playwright.screenshot({ fullPage: false }));
GOOD: re-using that variable to maintain state:
await tab.playwright.getByText("Interesting Post", { exact: false }).click();
await tab.playwright.waitForLoadState({ state: "load", timeoutMs: 10000 });
await display(await tab.playwright.screenshot({ fullPage: false }));
GOOD: if you intentionally want the main tab variable to point at a different tab later, declare it once with let and then reassign it:
let tab = await browser.tabs.new();
await tab.goto("https://news.ycombinator.com");
tab = await browser.tabs.get("other-tab-id");
await tab.playwright.getByText("Interesting Post", { exact: false }).click();
await tab.playwright.waitForLoadState({ state: "load", timeoutMs: 10000 });
await display(await tab.playwright.screenshot({ fullPage: false }));
GOOD: if you need both tabs live at once, give the second tab a new descriptive variable:
const detailsTab = await browser.tabs.get("other-tab-id");
await detailsTab.playwright.getByText("Interesting Post", { exact: false }).click();
await detailsTab.playwright.waitForLoadState({ state: "load", timeoutMs: 10000 });
await display(await detailsTab.playwright.screenshot({ fullPage: false }));
BAD: refetching the same tab into a new variable just to avoid reuse:
const tab2 = await browser.tabs.get("tab-id");
await tab2.playwright.getByText("Interesting Post", { exact: false }).click();
await tab2.playwright.waitForLoadState({ state: "load", timeoutMs: 10000 });
await display(await tab2.playwright.screenshot({ fullPage: false }));
BAD: wrapping a whole cell in block scope when there is no specific naming collision to solve:
{
const snap = await tab.playwright.domSnapshot();
console.log(snap);
}
BAD: redeclaring an existing variable (const tab = will fail):
const tab = await browser.tabs.get("tab-id");
await tab.playwright.getByText("Interesting Post", { exact: false }).click();
await tab.playwright.waitForLoadState({ state: "load", timeoutMs: 10000 });
await display(await tab.playwright.screenshot({ fullPage: false }));
GOOD: if you only need a snapshot once, avoid creating a new reusable variable name for it:
console.log(await tab.playwright.domSnapshot());
In node_repl you can use Node filesystem libraries when needed.
For file operations, prefer the Node runtime libraries directly:
const fs = await import("node:fs/promises");
// write a file
await fs.writeFile("hello.txt", "Hello world");
// read a file
const contents = await fs.readFile("hello.txt", "utf-8");
Use the guarded first-browser-cell pattern above when starting browser work. It creates the top-level agent object and display function for browser work.
The ability to interact directly with the browser is exposed through the browser-client runtime via the agent.browsers.* API.
Only the Node REPL js tool (mcp__node_repl__js) can be used to control the Chrome extension. Do not use external MCP browser-control tools, separate browser automation servers, or other browser skills for this surface. References to Playwright mean the in-skill tab.playwright API after browser-client setup.
Image type that can ONLY be put into context by using the top-level display function (e.g. await display(screenshot);).
- Acquire tab once and keep using it. Only re-query a tab when you are intentionally switching to a different tab, after a kernel reset, or after a failed cell that never created the binding.
- Prefer URLs that appear as an a href in the DOM.
- Do not call goto with the same URL. This will reload the page and may lose any in-progress information the user has provided. When you intentionally need to reload, call tab.reload().
- When testing localhost, 127.0.0.1, ::1, or another local development URL in a framework that does not support hot reloading, or where hot reloading is disabled, call tab.reload() after code or build changes before verifying the UI. After reloading, take a fresh DOM snapshot or screenshot before continuing.

Playwright is a critical part of the JavaScript API available to you.
You only have access to a limited subset of the Playwright API, so only call functions that are explicitly defined.
Notably, you do not have access to evaluate.
When using Playwright, keep and reuse a recent tab.playwright.domSnapshot() when it is available and you need it for locator construction or retry decisions. Treat the latest relevant snapshot as the source of truth for locator construction and retry decisions.
- Reuse the latest domSnapshot() until the page state changes or the snapshot proves stale.
- Take a fresh domSnapshot() after navigation or any major UI state change.
- Take a fresh domSnapshot() after opening or closing a menu, modal, dropdown, accordion, or filter.
- When a retry is needed, take a fresh domSnapshot() before forming the next locator.
- Do not take another snapshot when count(), a specific attribute, or a direct locator check would answer the question with fewer tokens.
- Do not iterate over all() and call getAttribute(...), textContent(), or innerText() on each match. Each read crosses the browser boundary and becomes extremely expensive on large pages.
- locator.getAttribute(...) is a single-element read, not a batch read. If the locator matches multiple elements, expect a strict-mode error rather than an array of attributes.
- Do not use locator(...).allTextContents(), locator("body").textContent(), or locator("body").innerText() as exploratory search tools across a page or large container.
- Instead, take one domSnapshot() and parse the relevant lines, use the site's own search/filter UI, or navigate directly to a focused results page. Only fall back to per-element reads for a small, already-scoped set of candidates.
- Do not use __NEXT_DATA__, or repeated full-page extraction across multiple candidate pages as an exploratory search strategy.
- Do not pass a non-string name to getByRole(...) in this environment. Use a plain string name only.
- Do not use .first(), .last(), or .nth() unless you have just called count() on the same locator and explicitly confirmed why that position is correct.
- Base locators on the latest domSnapshot().
- tab.playwright.waitForTimeout(...) is available in this environment.
- locator(...).selectOption(...) exists in this environment.

Before every click, fill, select-like action, or press:
- Use the latest domSnapshot() for the current UI state.
- Build the locator, then call count() on that locator.

If count() is 0:
If count() is greater than 1:
- Do not use .first() as a shortcut.

Build locators from what the snapshot actually shows, not what looks visually obvious.
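Searching a snapshot string locally, rather than issuing per-element reads, might look like the sketch below. `findSnapshotLines` is a hypothetical helper; the exact snapshot format is not assumed beyond it being newline-separated text, which is itself an assumption.

```javascript
// Hypothetical helper: search a domSnapshot() string line by line instead of
// issuing per-element reads. Assumption: the snapshot is newline-separated text.
function findSnapshotLines(snapshot, needle, limit = 20) {
  const wanted = needle.toLowerCase();
  const matches = [];
  const lines = snapshot.split("\n");
  for (let i = 0; i < lines.length && matches.length < limit; i += 1) {
    if (lines[i].toLowerCase().includes(wanted)) {
      // Record the 1-based line number and the trimmed text for locator building.
      matches.push({ line: i + 1, text: lines[i].trim() });
    }
  }
  return matches;
}

// Usage inside node_repl (assumes `tab` is already bound):
//   const snapshot = await tab.playwright.domSnapshot();
//   console.log(findSnapshotLines(snapshot, "add to cart"));
```

The whole search runs on one string already in the REPL, so it costs a single browser round trip no matter how many lines match.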
Prefer the most stable contract, in this order:
1. data-testid
2. data-* attributes
3. href (prefer exact or strong matches over broad substrings)
4. name (the accessible name used with getByRole)
5. getByText(...)
6. locator(...)

Use the most specific locator that is still durable.
Treat a stable href as a strong hint, not proof of uniqueness. If multiple elements share the same href, scope to the correct card or container and confirm count() before clicking.
Treat generic labels like Menu, Main Menu, Help, Close, Default, Color, Size, single-letter size labels such as S, M, L, XL, Sort by, Search, and Add to cart as ambiguous by default. Scope them to the correct container before acting.
On search results, product grids, carousels, and modal-heavy pages, repeated hrefs and repeated generic labels are ambiguous by default. First identify the stable card or container, then scope the locator inside that container before clicking.
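The count-before-acting rule can be captured in a small guard. This is a sketch, not part of the wrapper: `clickIfUnique` is a hypothetical helper built only on the `count()` and `click()` locator methods, and the selector in the usage comment is a made-up example.

```javascript
// Hypothetical helper: act only when the locator resolves to exactly one element.
async function clickIfUnique(locator) {
  const matches = await locator.count();
  if (matches === 0) {
    // Nothing matched: re-check the snapshot before building another locator.
    return { clicked: false, matches, reason: "no match" };
  }
  if (matches > 1) {
    // Several matches: scope to the right container instead of using .first().
    return { clicked: false, matches, reason: "ambiguous" };
  }
  await locator.click();
  return { clicked: true, matches };
}

// Usage: scope to a container first, then gate the click on the count.
//   const card = tab.playwright.locator('[data-testid="product-card-123"]');
//   const result = await clickIfUnique(card.getByRole("button", { name: "Add to cart" }));
```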
For getByRole(..., { name }):
- name is the accessible name, which may differ from visible text.
- A snapshot entry such as link "X" usually reflects the accessible name.
- Use getByRole only when the accessible name is clearly present and likely unique in the latest snapshot.
- After calling count() on a locator, store the result in a local variable and reuse it unless the DOM changes.
- Do not add timeoutMs to routine click, fill, check, or setChecked calls unless you have a concrete reason the target is slow to become actionable.
- Prefer tab.goto(url) over a brittle locator click.
- Do not reacquire tab inside each node_repl call. Reuse the existing tab binding to save tokens and preserve state. Only reacquire or reassign it when you intentionally switch tabs, after a kernel reset, or after a failed call that did not create the binding.
- Target stable attributes (data-*, href, etc.), not brittle CSS structure.
- Prefer href values copied from the snapshot over guessed URL patterns.
- Use getByText(...) only when role-based or attribute-based locators are not reliable, and scope it to a container whenever possible.

Because Browser Use can trigger external side effects through live browser actions, follow the policy below and request user confirmation before risky actions. Normal non-browser actions do not need the same policy.
This policy is strictly limited to actions taken in the browser, such as navigating, clicking, typing, scrolling, dragging, uploading, downloading, submitting forms, or changing browser or web app state. The assistant should not follow this policy when performing non-browser actions.
The agent should ask the user to take over or find an alternative.
Blocking confirmation required immediately before the action.
If explicitly permitted in the initial prompt, proceed without re-confirming; otherwise confirm right before the action.
Use this as the supported agent.browsers.* surface.
// Installed by setupAtlasRuntime({ globals: globalThis }).
const browser = await agent.browsers.get("extension");
interface Agent {
browsers: Browsers; // API for finding and selecting browsers.
}
interface Browsers {
get(id: string): Promise<Browser>; // Get a browser by id or client type.
list(): Promise<Array<BrowserInfo>>; // List available browsers.
}
interface Browser {
browserId: string; // Browser id selected by `agent.browsers.get()`.
capabilities: BrowserCapabilityCollection; // Browser-scoped optional capabilities advertised by the connected backend; discover IDs with `await browser.capabilities.list()`, then read `docs/capabilities/browser/{id}.md` in plugin output or `references/capabilities/browser/{id}.md` in skill output for method details.
tabs: Tabs; // API for interacting with browser tabs.
user: BrowserUser; // Readonly context about tabs and history in the user's browser windows.
nameSession(name: string): Promise<void>; // Name the current browser automation session.
}
interface BrowserUser {
claimTab(tab: string | BrowserUserTabInfo): Promise<Tab>; // Claim a user tab returned by `openTabs()` and return it as a controllable agent tab.
history(options: BrowserHistoryOptions): Promise<Array<BrowserHistoryEntry>>; // List recent browsing history ordered by `dateVisited` descending.
openTabs(): Promise<Array<BrowserUserTabInfo>>; // List open top-level tabs across the user's browser windows ordered by `lastOpened` descending.
}
interface Tabs {
finalize(options: FinalizeTabsOptions): Promise<void>; // Finalize the browser session's tabs by cleaning up tabs that are no longer needed.
get(id: string): Promise<Tab>; // Get a tab by id.
list(): Promise<Array<TabInfo>>; // List open tabs in the browser.
new(): Promise<Tab>; // Create and return a new tab in the browser.
selected(): Promise<undefined | Tab>; // Return the currently selected tab, if any.
}
interface Tab {
capabilities: TabCapabilityCollection; // Tab-scoped optional capabilities advertised by the connected backend; discover IDs with `await tab.capabilities.list()`, then read `docs/capabilities/tab/{id}.md` in plugin output or `references/capabilities/tab/{id}.md` in skill output for method details.
clipboard: TabClipboardAPI; // API for interacting with clipboard content in this tab.
cua: CUAAPI; // API for interacting with the tab via the cua api
dev: TabDevAPI; // API for developer-oriented tab inspection.
dom_cua: DomCUAAPI; // API for interacting with the tab via the dom based cua api
id: string; // A tab's unique identifier
playwright: PlaywrightAPI; // API for interacting with the tab via the playwright api
back(): Promise<void>; // Navigate this tab back in history.
close(): Promise<void>; // Close this tab.
forward(): Promise<void>; // Navigate this tab forward in history.
goto(url: string): Promise<void>; // Open a URL in this tab.
reload(): Promise<void>; // Reload this tab.
title(): Promise<undefined | string>; // Get the current title for this tab.
url(): Promise<undefined | string>; // Get the current URL for this tab.
}
interface CUAAPI {
click(options: ClickOptions): Promise<void>; // Click at a coordinate in the current viewport.
double_click(options: DoubleClickOptions): Promise<void>; // Double click at a coordinate in the current viewport.
drag(options: DragOptions): Promise<void>; // Drag from a point to a point by the provided path.
get_visible_screenshot(): Promise<Image>; // Capture the visible portion of the page as an image.
keypress(options: KeypressOptions): Promise<void>; // Press control characters at the current focused element (focus it first via click/dblclick).
move(options: MoveOptions): Promise<void>; // Move the mouse to a point by the provided x and y coordinates.
scroll(options: ScrollOptions): Promise<void>; // Scroll by a delta from a specific viewport coordinate.
type(options: TypeOptions): Promise<void>; // Type text at the current focus.
}
interface DomCUAAPI {
click(options: DomClickOptions): Promise<void>; // Click a DOM node by its id from the visible DOM snapshot.
double_click(options: DomClickOptions): Promise<void>; // Double-click a DOM node by its id.
get_visible_dom(): Promise<unknown>; // Return a filtered DOM with node ids for interactable elements.
keypress(options: DomKeypressOptions): Promise<void>; // Press control characters at the currently focused element (focus it first via click/dblclick).
scroll(options: DomScrollOptions): Promise<void>; // Scroll either the page or a specific node (if node_id provided) by deltas.
type(options: DomTypeOptions): Promise<void>; // Type text into the currently focused element (focus via click first).
}
interface PlaywrightAPI {
domSnapshot(): Promise<string>; // Return a snapshot of the current DOM as a string.
expectNavigation<T>(action: () => Promise<T>, options: { timeoutMs?: number; url?: string; waitUntil?: LoadState }): Promise<T>; // Expect a navigation triggered by an action.
frameLocator(frameSelector: string): PlaywrightFrameLocator; // Create a frame-scoped locator builder.
getByLabel(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by label text within the page.
getByPlaceholder(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by placeholder text within the page.
getByRole(role: string, options: { exact?: boolean; name?: TextMatcher }): PlaywrightLocator; // Find elements by ARIA role within the page.
getByTestId(testId: string): PlaywrightLocator; // Find elements by test id within the page.
getByText(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by text within the page.
locator(selector: string): PlaywrightLocator; // Create a locator scoped to this tab.
screenshot(options: ScreenshotOptions): Promise<Image>; // Capture a screenshot of the current page.
waitForEvent(event: "download", options?: WaitForEventOptions): Promise<PlaywrightDownload>; // Wait for the next event on the page.
waitForLoadState(options: PageWaitForLoadStateOptions): Promise<void>; // Wait for the page to reach a specific load state.
waitForTimeout(timeoutMs: number): Promise<void>; // Wait for a fixed duration.
waitForURL(url: string, options: PageWaitForURLOptions): Promise<void>; // Wait for the page URL to match the provided value.
}
interface PlaywrightFrameLocator {
getByLabel(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by label within this frame.
getByPlaceholder(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by placeholder within this frame.
getByRole(role: string, options: { exact?: boolean; name?: TextMatcher }): PlaywrightLocator; // Find elements by ARIA role within this frame.
getByTestId(testId: string): PlaywrightLocator; // Find elements by test id within this frame.
getByText(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by text within this frame.
locator(selector: string): PlaywrightLocator; // Create a locator scoped to this frame.
}
interface PlaywrightLocator {
all(): Promise<Array<PlaywrightLocator>>; // Resolve to a list of locators for each matched element.
allTextContents(options: { timeoutMs?: number }): Promise<Array<string>>; // Return `textContent` for *all* elements matched by this locator.
and(locator: PlaywrightLocator): PlaywrightLocator; // Return a locator matching elements that satisfy both this locator and `locator`.
check(options: LocatorCheckOptions): Promise<void>; // Check a checkbox or switch-like control.
click(options: LocatorClickOptions): Promise<void>; // Click the element matched by this locator.
count(): Promise<number>; // Number of elements matching this locator.
dblclick(options: LocatorClickOptions): Promise<void>; // Double-click the element matched by this locator.
downloadMedia(options: LocatorDownloadMediaOptions): Promise<void>; // Trigger a media download for the first matched element.
fill(value: string, options: { timeoutMs?: number }): Promise<void>; // Replace the element's value with the provided text.
filter(options: LocatorFilterOptions): PlaywrightLocator; // Narrow this locator by additional constraints.
first(): PlaywrightLocator; // Return a locator pointing at the first matched element.
getAttribute(name: string, options: { timeoutMs?: number }): Promise<null | string>; // Return an attribute value from the first matched element.
getByLabel(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by label text, scoped to this locator.
getByPlaceholder(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by placeholder text, scoped to this locator.
getByRole(role: string, options: { exact?: boolean; name?: TextMatcher }): PlaywrightLocator; // Find elements by ARIA role, scoped to this locator.
getByTestId(testId: string): PlaywrightLocator; // Find elements by test id, scoped to this locator.
getByText(text: TextMatcher, options: { exact?: boolean }): PlaywrightLocator; // Find elements by text content, scoped to this locator.
innerText(options: { timeoutMs?: number }): Promise<string>; // Return the rendered (visible) text of the first matched element.
isEnabled(): Promise<boolean>; // Whether the first matched element is currently enabled.
isVisible(): Promise<boolean>; // Whether the first matched element is currently visible.
last(): PlaywrightLocator; // Return a locator pointing at the last matched element.
locator(selector: string, options: LocatorLocatorOptions): PlaywrightLocator; // Create a descendant locator scoped to this locator.
nth(index: number): PlaywrightLocator; // Return a locator pointing at the Nth matched element.
or(locator: PlaywrightLocator): PlaywrightLocator; // Return a locator matching elements that satisfy either this locator or `locator`.
press(value: string, options: { timeoutMs?: number }): Promise<void>; // Press a keyboard key while this locator is focused.
selectOption(value: SelectOptionInput | Array<SelectOptionInput>, options: { timeoutMs?: number }): Promise<void>; // Select one or more options on a native `<select>` element.
setChecked(checked: boolean, options: LocatorCheckOptions): Promise<void>; // Set a checkbox or switch-like control to a checked/unchecked state.
textContent(options: { timeoutMs?: number }): Promise<null | string>; // Return the raw `textContent` of the first matched element (or null if missing).
type(value: string, options: { timeoutMs?: number }): Promise<void>; // Type text into the element without clearing existing content.
uncheck(options: LocatorCheckOptions): Promise<void>; // Uncheck a checkbox or switch-like control.
waitFor(options: LocatorWaitForOptions): Promise<void>; // Wait for the element to reach a specific state.
}
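As a rough illustration of how `count()`, `nth()`, and `innerText()` compose, the sketch below gathers the rendered text of every match. `LocatorLike`, `collectInnerText`, and `fakeLocator` are hypothetical names that are not part of the API above; the fake stands in for a real locator so the snippet runs on its own, and `nth()` is assumed to be zero-based as in Playwright.

```typescript
// Hypothetical structural subset of PlaywrightLocator; only these three methods are needed.
type LocatorLike = {
  count(): Promise<number>;
  nth(index: number): LocatorLike;
  innerText(options: { timeoutMs?: number }): Promise<string>;
};

// Read the rendered text of each matched element, in match order.
// Assumption: nth() is zero-based, as in Playwright.
async function collectInnerText(locator: LocatorLike, timeoutMs = 2000): Promise<string[]> {
  const total = await locator.count();
  const texts: string[] = [];
  for (let i = 0; i < total; i++) {
    texts.push(await locator.nth(i).innerText({ timeoutMs }));
  }
  return texts;
}

// In-memory stand-in for a real locator (illustration only).
function fakeLocator(items: string[]): LocatorLike {
  return {
    count: async () => items.length,
    nth: (index: number) => fakeLocator(items.slice(index, index + 1)),
    innerText: async () => {
      const first = items[0];
      if (first === undefined) throw new Error("no element matched");
      return first;
    },
  };
}
```

When raw `textContent` is enough, `allTextContents()` covers the same ground in a single call; the loop is only useful when rendered text is wanted per element.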
interface PlaywrightDownload {
}
interface TabClipboardAPI {
read(): Promise<Array<TabClipboardItem>>; // Read clipboard items, including text and binary payloads.
readText(): Promise<string>; // Read plain text from the browser clipboard.
write(items: Array<TabClipboardItem>): Promise<void>; // Write clipboard items.
writeText(text: string): Promise<void>; // Write plain text to the browser clipboard.
}
interface TabDevAPI {
logs(options: TabDevLogsOptions): Promise<Array<TabDevLogEntry>>; // Read console log messages captured for this tab.
}
interface Image {
toBase64(): string;
}
interface BrowserInfo {
capabilities: ClientCapabilities;
id: string;
name: string;
type: ClientType;
}
type BrowserCapabilityCollection = {
get(id: string): Promise<unknown>;
list(): Promise<Array<{ id: string; description: string }>>;
};
interface BrowserUserTabInfo {
id: string; // Opaque identifier for this browser tab.
lastOpened?: string; // ISO 8601 timestamp for the last time the tab was opened or focused.
tabGroup?: string; // User-visible tab group name when the tab belongs to one.
title?: string; // User-visible tab title.
url?: string; // Current tab URL.
}
interface BrowserHistoryOptions {
from?: string | Date; // Lower bound for visit timestamps.
limit?: number; // Maximum number of history entries to return.
query?: string; // Optional term to filter browser history with.
to?: string | Date; // Upper bound for visit timestamps.
}
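A minimal sketch of building an options object of this shape for a "last N days" query. `HistoryOptions` is a local mirror of the interface above, and `historyOptionsForLastDays` and its default `limit` of 50 are hypothetical, chosen only for illustration.

```typescript
// Local mirror of the BrowserHistoryOptions shape declared above.
type HistoryOptions = {
  from?: string | Date;
  limit?: number;
  query?: string;
  to?: string | Date;
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Build options covering the `days` days that end at `now`.
// Timestamps are emitted as ISO 8601 strings; Date objects would also fit the type.
function historyOptionsForLastDays(
  days: number,
  now: Date,
  query?: string,
  limit = 50, // hypothetical default, not taken from the API
): HistoryOptions {
  const options: HistoryOptions = {
    from: new Date(now.getTime() - days * MS_PER_DAY).toISOString(),
    to: now.toISOString(),
    limit,
  };
  if (query !== undefined) options.query = query; // omit the key when no filter is wanted
  return options;
}
```

Passing `now` explicitly keeps the helper deterministic, which makes the bounds easy to check before handing the object to the browser.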
interface BrowserHistoryEntry {
dateVisited: string; // ISO 8601 timestamp for the visit.
title?: string; // Page title captured for the visit.
url: string; // Visited URL.
}
interface TabsContentOptions {
timeoutMs?: number; // Maximum time to wait for each page load, in milliseconds.
urls: Array<string>; // URLs to load in temporary background tabs.
}
interface TabsContentResult {
title: null | string; // The resolved page title when available.
url: string; // The resolved page URL when available, otherwise the requested URL.
}
interface FinalizeTabsOptions {
keep?: Array<FinalizeTabsKeep>; // Tabs to keep open.
}
// Metadata describing an open tab.
interface TabInfo {
id: string;
title?: string;
url?: string;
}
type TabCapabilityCollection = {
get(id: string): Promise<unknown>;
list(): Promise<Array<{ id: string; description: string }>>;
};
type ClickOptions = {
button?: number; // Mouse button (1 = left, 2 = middle/wheel, 3 = right, 4 = back, 5 = forward).
keypress?: Array<string>; // Modifier keys held during the click.
x: number;
y: number;
};
type DoubleClickOptions = {
keypress?: Array<string>; // Modifier keys held during the double click.
x: number;
y: number;
};
type DragOptions = {
keys?: Array<string>; // Optional modifier keys held during the drag.
path: Array<{ x: number; y: number }>; // Drag path as a list of points.
};
type KeypressOptions = {
keys: Array<string>; // Key combination to press.
};
type MoveOptions = {
keys?: Array<string>; // Optional modifier keys held while moving.
x: number;
y: number;
};
type ScrollOptions = {
keypress?: Array<string>; // Modifier keys held during scroll.
scrollX: number;
scrollY: number;
x: number;
y: number;
};
type TypeOptions = {
text: string;
};
type DomClickOptions = {
node_id: string; // Node id from `get_visible_dom()`.
};
type DomKeypressOptions = {
keys: Array<string>; // Key combination to press.
};
type DomScrollOptions = {
node_id?: string; // Optional node id to scroll within.
x: number; // Horizontal scroll delta.
y: number; // Vertical scroll delta.
};
type DomTypeOptions = {
text: string; // Text to type into the currently focused element.
};
type ElementInfoOptions = {
includeNonInteractable?: boolean; // When true, include non-interactable elements in addition to interactable targets.
x: number;
y: number;
};
type ElementInfo = {
ariaName?: string | null; // Accessible name if available.
boundingBox?: ElementInfoRect | null; // Element bounds in screenshot coordinates.
preview: string; // Compact human-readable node preview.
role?: string | null; // Computed ARIA role if available.
selector: ElementInfoSelector; // Suggested selector data for this element.
tagName: string; // Lowercased HTML tag name.
testId?: string | null; // Configured test id attribute if present.
visibleText?: string | null; // Rendered visible text, selected option text, or visible form value when available.
};
type ElementScreenshotOptions = {
includeNonInteractable?: boolean; // When true, highlight non-interactable elements in addition to interactable targets.
x: number;
y: number;
};
type LoadState = "load" | "domcontentloaded" | "networkidle";
type TextMatcher = string | RegExp;
type ScreenshotOptions = {
clip?: ClipRect; // Crop to a specific rectangle instead of the full viewport.
fullPage?: boolean; // Capture the full page instead of the viewport.
};
type WaitForEventOptions = {
timeoutMs?: number;
};
type PageWaitForLoadStateOptions = {
state?: LoadState;
timeoutMs?: number;
};
type PageWaitForURLOptions = {
timeoutMs?: number;
waitUntil?: WaitUntil;
};
type LocatorCheckOptions = {
force?: boolean;
timeoutMs?: number;
};
type LocatorClickOptions = {
button?: MouseButton;
force?: boolean;
modifiers?: Array<KeyboardModifier>;
timeoutMs?: number;
};
type LocatorDownloadMediaOptions = {
timeoutMs?: number;
};
type LocatorFilterOptions = {
has?: PlaywrightLocator;
hasNot?: PlaywrightLocator;
hasNotText?: TextMatcher;
hasText?: TextMatcher;
visible?: boolean;
};
type LocatorLocatorOptions = {
has?: PlaywrightLocator;
hasNot?: PlaywrightLocator;
hasNotText?: TextMatcher;
hasText?: TextMatcher;
};
type SelectOptionInput = string | SelectOptionDescriptor;
type LocatorWaitForOptions = {
state: WaitForState;
timeoutMs?: number;
};
type TabClipboardItem = {
entries: Array<TabClipboardEntry>;
presentationStyle?: "unspecified" | "inline" | "attachment";
};
interface TabDevLogsOptions {
filter?: string; // Optional substring filter applied to the rendered log message.
levels?: Array<"debug" | "info" | "log" | "warn" | "error" | "warning">; // Optional levels to include.
limit?: number; // Maximum number of logs to return.
}
interface TabDevLogEntry {
level: "debug" | "info" | "log" | "warn" | "error"; // Console log level.
message: string; // Rendered log message text.
timestamp: string; // ISO 8601 timestamp for when the runtime captured the log.
url?: string; // Source URL reported by the browser runtime, when available.
}
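One way to summarize a batch of entries of this shape is to tally them by level, sketched below. `LogLevel`, `LogEntry`, and `countByLevel` are hypothetical local names that mirror the `TabDevLogEntry` fields.

```typescript
// Local mirror of the TabDevLogEntry shape declared above.
type LogLevel = "debug" | "info" | "log" | "warn" | "error";
type LogEntry = { level: LogLevel; message: string; timestamp: string; url?: string };

// Tally entries per console level; levels with no entries stay at 0.
function countByLevel(entries: LogEntry[]): Record<LogLevel, number> {
  const counts: Record<LogLevel, number> = { debug: 0, info: 0, log: 0, warn: 0, error: 0 };
  for (const entry of entries) {
    counts[entry.level] += 1;
  }
  return counts;
}
```

A non-zero `error` count is a cheap signal that the page deserves a closer look before continuing with an automation step.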
interface ClientCapabilities {
browser?: Array<CapabilityInfo>;
tab?: Array<CapabilityInfo>;
}
type ClientType = "iab" | "extension" | "cdp";
type TabsContentType = "html" | "text" | "domSnapshot";
interface FinalizeTabsKeep {
status: FinalizeTabStatus; // Where the kept tab belongs after cleanup.
tab: string | Tab | TabInfo; // Tab to keep open after browser cleanup.
}
type ElementInfoRect = {
height: number;
width: number;
x: number;
y: number;
};
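A sketch of turning an element's bounding box into coordinate-based click options. It assumes that `x`/`y` in the rect are the top-left corner and that `ClickOptions` uses the same screenshot coordinates as `boundingBox`; neither is stated explicitly above. `Rect`, `PointClick`, and `clickAtCenter` are hypothetical local names.

```typescript
// Local mirrors of ElementInfoRect and ClickOptions as declared above.
type Rect = { height: number; width: number; x: number; y: number };
type PointClick = { button?: number; keypress?: Array<string>; x: number; y: number };

// Aim at the center of the rect.
// Assumption: rect.x / rect.y are the top-left corner, in the same coordinate
// space that the click expects.
function clickAtCenter(rect: Rect, button = 1): PointClick {
  return {
    button, // 1 is the left button per the ClickOptions comment
    x: Math.round(rect.x + rect.width / 2),
    y: Math.round(rect.y + rect.height / 2),
  };
}
```

Targeting the center rather than a corner leaves some margin if the element's edges are rounded or partially covered.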
type ElementInfoSelector = {
candidates: Array<string>; // Ranked selector candidates for the element.
frameSelectors?: Array<string>; // Frame selectors to enter before using the element selector.
primary?: string | null; // The preferred selector for the element when available.
};
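A small sketch of choosing a selector from this structure: prefer `primary`, otherwise fall back to the first ranked candidate. `SelectorInfo` and `pickSelector` are hypothetical names, and treating an empty `primary` string as absent is an assumption.

```typescript
// Local mirror of the ElementInfoSelector shape declared above.
type SelectorInfo = {
  candidates: Array<string>;
  frameSelectors?: Array<string>;
  primary?: string | null;
};

// Return the preferred selector, or the top-ranked candidate, or undefined when
// neither exists. An empty `primary` is treated as missing (assumption).
function pickSelector(info: SelectorInfo): string | undefined {
  if (info.primary) return info.primary;
  return info.candidates[0];
}
```

When `frameSelectors` is present, the chosen selector only makes sense after entering those frames, so a caller would need to handle that separately.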
type ClipRect = {
height: number;
width: number;
x: number;
y: number;
};
type WaitUntil = LoadState | "commit";
type MouseButton = "left" | "right" | "middle";
type KeyboardModifier = "Alt" | "Control" | "ControlOrMeta" | "Meta" | "Shift";
type SelectOptionDescriptor = {
index?: number;
label?: string;
value?: string;
};
type WaitForState = "attached" | "detached" | "visible" | "hidden";
type TabClipboardEntry = {
base64?: string;
mimeType: string;
text?: string;
};
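A minimal sketch of assembling a clipboard item from these shapes. `ClipEntry`, `ClipItem`, and `textClipboardItem` are hypothetical local names; the use of the `text/plain` and `text/html` MIME types, and the idea that several entries in one item are alternative representations of the same content, are assumptions rather than documented behavior.

```typescript
// Local mirrors of TabClipboardEntry and TabClipboardItem as declared above.
type ClipEntry = { base64?: string; mimeType: string; text?: string };
type ClipItem = {
  entries: Array<ClipEntry>;
  presentationStyle?: "unspecified" | "inline" | "attachment";
};

// Build one item holding plain text and, optionally, an HTML representation.
// Assumption: text payloads go in `text` and binary payloads would go in `base64`.
function textClipboardItem(text: string, html?: string): ClipItem {
  const entries: Array<ClipEntry> = [{ mimeType: "text/plain", text }];
  if (html !== undefined) {
    entries.push({ mimeType: "text/html", text: html });
  }
  return { entries };
}
```

For plain text alone, `writeText()` is the simpler path; building items by hand is only worthwhile when more than one representation is needed.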
interface CapabilityInfo {
description: string;
docs?: string; // Model-facing pointer to the generated capability usage docs.
id: string;
}
type FinalizeTabStatus = "handoff" | "deliverable";