| name | e2e-testing |
| description | Write and review Playwright E2E tests for Langflow. Trigger when the user asks to write, fix, or review E2E tests, spec files, Playwright tests, or integration tests that exercise the full UI. Also trigger when modifying data-testid attributes, test helpers in tests/utils/, or fixture configuration. |
Langflow E2E Testing (Playwright)
When to Apply
- User asks to write E2E tests for a feature or flow
- User asks to fix a failing E2E test
- User asks to review E2E test coverage
- User modifies data-testid attributes in components (may break existing tests)
- User changes test utilities in src/frontend/tests/utils/
Do NOT apply when:
- User asks about unit tests (use the frontend-testing skill for Jest)
- User asks about backend tests (use the backend-code-review skill for pytest)
Tech Stack
| Tool | Version | Purpose |
|---|---|---|
| Playwright | 1.59.1 | E2E test runner + browser automation |
| Chromium | (bundled) | Default browser (Firefox/Safari disabled) |
| Custom fixtures | tests/fixtures.ts | Auto-detects API errors and flow execution failures |
Key Commands
npx playwright test
npx playwright test --grep "@release"
npx playwright test --grep "@workspace"
npx playwright test --grep "@starter-projects"
npx playwright test tests/core/features/run-flow.spec.ts
npx playwright test --debug
npx playwright show-report
npx playwright test --update-snapshots
Configuration
File: src/frontend/playwright.config.ts
| Setting | Value | Why |
|---|---|---|
| fullyParallel | true | Tests run in parallel for speed |
| timeout | 5 minutes | Flow builds can be slow; prevents false timeouts |
| retries | 3 (local), 2 (CI) | Flaky network/rendering issues; retries catch them |
| workers | 2 | Balances speed and resource usage |
| actionTimeout | 20s | Individual action timeout (click, fill, etc.) |
| trace | on-first-retry | Captures a trace on the first retry of a failed test, for debugging |
| baseURL | http://localhost:3000 | Vite dev server |
WebServer: Playwright auto-starts backend (uvicorn on 7860) + frontend (npm start on 3000).
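The settings in the table map onto a Playwright config roughly as follows. This is a minimal sketch and not the contents of src/frontend/playwright.config.ts: `SettingsSketch` and `buildSettings` are hypothetical names, and the real file would typically wrap the object in `defineConfig` and also declare the web servers, which are omitted here.

```typescript
// Hypothetical, trimmed-down shape of the options listed in the table.
interface SettingsSketch {
  fullyParallel: boolean;
  timeout: number; // per-test timeout in milliseconds
  retries: number;
  workers: number;
  use: {
    actionTimeout: number; // per-action timeout in milliseconds
    trace: "on-first-retry";
    baseURL: string;
  };
}

// isCI is passed in explicitly so the sketch stays free of environment access.
function buildSettings(isCI: boolean): SettingsSketch {
  return {
    fullyParallel: true,
    timeout: 5 * 60 * 1000, // 5 minutes
    retries: isCI ? 2 : 3,
    workers: 2,
    use: {
      actionTimeout: 20000, // 20 seconds
      trace: "on-first-retry",
      baseURL: "http://localhost:3000",
    },
  };
}
```

In Playwright, actionTimeout, trace, and baseURL live under the `use` key, while timeout, retries, and workers are top-level options.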
Directory Structure
src/frontend/tests/
├── fixtures.ts # Custom test fixture with error detection
├── globalTeardown.ts # Cleanup (removes temp DB after tests)
├── core/
│ ├── features/ # Main feature tests (run-flow, playground, etc.)
│ ├── integrations/ # Starter project / template tests
│ ├── regression/ # Bug regression tests
│ └── unit/ # Component-level Playwright tests
├── extended/
│ ├── features/ # Extended features (MCP, auto-save, etc.)
│ ├── integrations/ # Extended integrations
│ └── regression/ # Extended regressions
└── utils/ # 37+ shared helper functions
File Naming
- kebab-case with .spec.ts suffix: run-flow.spec.ts, playground.spec.ts, flow-lock.spec.ts
- Template tests may use spaces: Document QA.spec.ts, Social Media Agent.spec.ts
- Sharded tests for parallelization: chatInputOutputUser-shard-0.spec.ts
Note: E2E tests use .spec.ts (Playwright convention). Unit tests use .test.tsx (Jest convention). Do not mix them.
Test Anatomy
Basic Test
import { expect, test } from "../../fixtures";
import { awaitBootstrapTest } from "../../utils/await-bootstrap-test";

test(
  "user should be able to run a flow successfully",
  { tag: ["@release", "@workspace"] },
  async ({ page }) => {
    // Wait for the app to finish loading
    await awaitBootstrapTest(page);
    // Start from a blank flow and search for a component
    await page.getByTestId("blank-flow").click();
    await page.getByTestId("sidebar-search-input").fill("Chat Output");
    // ...intermediate steps (adding and running the component) omitted...
    await expect(page.getByTestId("build-status-success")).toBeVisible({ timeout: 30000 });
  },
);
With test.describe
import { test } from "../../fixtures";

test.describe("Flow Lock Feature", () => {
  test(
    "should lock and unlock a flow",
    { tag: ["@release", "@api"] },
    async ({ page }) => {
      // test steps go here
    },
  );

  test(
    "should prevent editing when locked",
    { tag: ["@release"] },
    async ({ page }) => {
      // test steps go here
    },
  );
});
With Serial Mode (tests that depend on order)
test.describe.configure({ mode: "serial" });
test("step 1: create flow", async ({ page }) => { });
test("step 2: edit flow", async ({ page }) => { });
test("step 3: delete flow", async ({ page }) => { });
With Event Delivery Modes (streaming/polling/direct)
import { withEventDeliveryModes } from "../../utils/withEventDeliveryModes";

withEventDeliveryModes(
  "Document Q&A should work",
  { tag: ["@release", "@starter-projects"] },
  async ({ page }) => {
    // test steps go here
  },
);
Tags System
Every test MUST have at least one tag. Tags enable filtering and CI pipeline configuration.
| Tag | Purpose | When to Use |
|---|---|---|
| @release | Tests that must pass before release | Critical user flows |
| @workspace | Workspace/flow management | Creating, editing, deleting flows |
| @api | API-dependent features | Tests that call backend endpoints |
| @database | Database operations | Tests involving persistence |
| @components | Component-level tests | Individual component behavior |
| @starter-projects | Template/starter project tests | Pre-built flow templates |
| @regression | Bug regression tests | Tests for specific fixed bugs |
// Good: at least one tag
test("my feature test", { tag: ["@release", "@workspace"] }, async ({ page }) => { ... });
// Bad: no tags
test("my feature test", async ({ page }) => { ... });
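The tagging rule can be checked mechanically. The sketch below is one hypothetical way to do it: `KNOWN_TAGS` copies the tags from the table and assumes that list is exhaustive, and `checkTags` is an illustrative name, not an existing Langflow utility.

```typescript
// Tags copied from the table above; treating the list as exhaustive is an assumption.
const KNOWN_TAGS: readonly string[] = [
  "@release",
  "@workspace",
  "@api",
  "@database",
  "@components",
  "@starter-projects",
  "@regression",
];

// Returns a list of problems; an empty list means the tags are acceptable.
function checkTags(tags: string[]): string[] {
  const problems: string[] = [];
  if (tags.length === 0) {
    problems.push("test has no tags");
  }
  for (const tag of tags) {
    if (KNOWN_TAGS.indexOf(tag) === -1) {
      problems.push("unknown tag: " + tag);
    }
  }
  return problems;
}
```

For example, `checkTags(["@release", "@workspace"])` returns an empty list, while `checkTags([])` reports the missing tag.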
Custom Fixtures: Error Detection
Always import test and expect from ../../fixtures, NOT from @playwright/test.
// Good: custom fixtures with error detection
import { expect, test } from "../../fixtures";
// Bad: bypasses the error detection
import { expect, test } from "@playwright/test";
Why: The custom fixture automatically monitors all /api/ responses and fails the test if:
- HTTP 400, 404, 422, or 500 errors occur
- Flow execution returns error: true in event streams
- Python exceptions appear in streamed responses
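The HTTP part of these checks amounts to a predicate over each response. A minimal sketch, assuming the rule is exactly as listed: `ObservedResponse` and `shouldFailTest` are hypothetical names and do not reproduce the code in tests/fixtures.ts, and the event-stream checks are left out.

```typescript
// Hypothetical summary of one observed network response.
interface ObservedResponse {
  url: string;
  status: number;
}

// Status codes the list above treats as fatal for /api/ calls.
const FATAL_STATUSES: readonly number[] = [400, 404, 422, 500];

// True when the response would fail the test under the HTTP rule above.
function shouldFailTest(response: ObservedResponse): boolean {
  // Only API traffic is monitored; other requests are ignored.
  if (response.url.indexOf("/api/") === -1) {
    return false;
  }
  return FATAL_STATUSES.indexOf(response.status) !== -1;
}
```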
To opt in to expected errors (e.g., when testing error handling):
test("should show error on invalid input", { tag: ["@release"] }, async ({ page }) => {
page.allowFlowErrors();
});
Selector Strategy
Priority (in order of preference)
1. getByTestId: most stable, used 95% of the time in Langflow
2. getByRole: for buttons, headings, and form elements
3. getByText: for visible text content
4. waitForSelector: for CSS selectors and dynamic elements
5. locator: for complex selectors (CSS, XPath)
Common data-testid Patterns
Canvas & Navigation:
- blank-flow: New blank flow button
- sidebar-search-input: Component search
- canvas_controls_dropdown: Canvas controls menu
- fit_view, zoom_out, zoom_in: Canvas controls
- react-flow-id: ReactFlow canvas container
Component Fields:
- popover-anchor-input-{fieldname}: Input field for a component parameter
- input-chat-playground: Playground chat input
- div-chat-message: Chat message in playground
Actions:
- add-component-button-{component}: Add component to canvas
- button-send: Send chat message
- button_run_{component}: Run specific component
- publish-button, save-flow-button: Flow actions
- edit-fields-button: Toggle inspection panel field editor
Modals & Panels:
- modal-title: Modal heading
- icon-Globe: Global variables
- icon-Lock: Flow lock toggle
- session-selector: Playground session switcher
Important: Global Variables and Badges
When a component field has a global variable selected (load_from_db: true + value: "OPENAI_API_KEY"), the field renders a badge instead of an <input> element. This means getByTestId("popover-anchor-input-api_key") will NOT find the element — it doesn't exist in the DOM.
Templates with global variables pre-selected: Market Research, Price Deal Finder, Research Agent.
Templates without (input IS rendered): Instagram Copywriter.
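A test that has to work with both kinds of template can check whether the input exists before filling it. The sketch below is one hedged way to do that: `LocatorLike` and `PageLike` are hypothetical structural stand-ins for the Playwright `Locator` and `Page` types (mirroring their `getByTestId`, `count`, and `fill` methods), and `fillFieldIfRendered` is not an existing Langflow helper.

```typescript
// Minimal structural stand-ins so the sketch compiles without Playwright installed.
interface LocatorLike {
  count(): Promise<number>;
  fill(value: string): Promise<void>;
}

interface PageLike {
  getByTestId(testId: string): LocatorLike;
}

// Fills the field and resolves to true when the <input> is rendered;
// resolves to false when a global-variable badge has replaced it.
async function fillFieldIfRendered(
  page: PageLike,
  fieldName: string,
  value: string,
): Promise<boolean> {
  const input = page.getByTestId("popover-anchor-input-" + fieldName);
  if ((await input.count()) === 0) {
    return false;
  }
  await input.fill(value);
  return true;
}
```

Because `count()` does not wait for elements to appear, a helper like this should only be called once the component node is already visible on the canvas.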
Core Helper Functions
Located in src/frontend/tests/utils/:
| Function | What it Does | When to Use |
|---|---|---|
| awaitBootstrapTest(page) | Waits for app to fully load | Start of every test |
| initialGPTsetup(page) | Full setup: adjustView → updateComponents → selectModel → addKey → adjustView → unselectNodes | Tests that need OpenAI configured |
| adjustScreenView(page, opts?) | Fit view + zoom out | After adding components to canvas |
| zoomOut(page, times) | Zoom out N times | When components are too small |
| selectGptModel(page) | Selects gpt-4o-mini for all Language Model nodes | GPT-dependent tests |
| addOpenAiInputKey(page) | Fills OPENAI_API_KEY for all openai_api_key fields | Tests requiring API key |
| enableInspectPanel(page) | Toggles inspection panel ON | MUST call before edit-fields-button |
| disableInspectPanel(page) | Toggles inspection panel OFF | Cleanup after inspection |
| updateOldComponents(page) | Clicks "Update all" if outdated components exist | After loading saved flows |
| unselectNodes(page) | Clicks empty canvas area to deselect all nodes | After node operations |
| renameFlow(page, { flowName }) | Renames the current flow | Flow management tests |
| uploadFile(page, filename) | Uploads a file from test assets | File upload tests |
| withEventDeliveryModes(...) | Runs test 3x: streaming, polling, direct | Starter project tests |
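The last row describes a wrapper that turns one test body into one test per delivery mode. The sketch below illustrates that idea only and assumes nothing about the real implementation in tests/utils/withEventDeliveryModes: `registerPerMode` and its `register` callback are hypothetical, and the real helper presumably also configures the delivery mode for each run, which is omitted here.

```typescript
// The three modes named in the table.
const DELIVERY_MODES: readonly string[] = ["streaming", "polling", "direct"];

// Calls `register` once per mode with a mode-specific title, so a single
// test body yields three separately reported tests.
function registerPerMode(
  title: string,
  register: (fullTitle: string, mode: string) => void,
): void {
  for (const mode of DELIVERY_MODES) {
    register(title + " (" + mode + ")", mode);
  }
}
```

Giving each run its own title keeps the three results distinguishable in the report, which matters when only one delivery mode fails.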
initialGPTsetup Options
// Default: run every setup step
await initialGPTsetup(page);

// Skip individual steps when the test handles them itself
await initialGPTsetup(page, {
  skipAdjustScreenView: true,
  skipUpdateOldComponents: true,
  skipSelectGptModel: true,
});
Inspection Panel Pattern (CRITICAL)
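// 1. Turn the inspection panel on first
await enableInspectPanel(page);
// 2. Select the node to inspect
await page.getByTestId("title-OpenAI").click();
// 3. Open the field editor
await page.getByTestId("edit-fields-button").click();
// 4. Toggle the model_name field
await page.getByTestId("showmodel_name").click();
// 5. Toggle the field editor closed again
await page.getByTestId("edit-fields-button").click();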
If you skip enableInspectPanel(page), the edit-fields-button will NOT be visible.
Skip Patterns
test.skip(!process?.env?.OPENAI_API_KEY, "OPENAI_API_KEY required to run this test");
test.skip(true, "Feature not yet implemented with new designs");
Writing Good E2E Tests
Do:
- Tag every test with at least one tag
- Import from ../../fixtures, not @playwright/test
- Always start with awaitBootstrapTest(page)
- Use getByTestId for stable selectors
- Set explicit timeouts on waitForSelector and expect(...).toBeVisible() for async operations
- Test the complete user flow: setup → action → verification
- Use withEventDeliveryModes for tests that involve flow execution (chat, build)
Don't:
- Don't use page.waitForTimeout() unless absolutely necessary; prefer waitForSelector or expect().toBeVisible()
- Don't hardcode API keys; read them from process.env.OPENAI_API_KEY
- Don't skip tests without a reason; always provide the second argument to test.skip()
- Don't import from @playwright/test; use the custom fixtures
- Don't forget enableInspectPanel(page) before accessing edit-fields-button
- Don't assume input fields exist when global variables are selected (badge renders instead)
Challenge Tests (Apply Here Too)
E2E tests should also cover adversarial scenarios:
- Invalid input: paste 10K characters, special characters (<script>alert(1)</script>), empty submissions
- Network interruption: what happens if the user loses connection mid-build?
- Permission boundaries: can a user access another user's flow via direct URL?
- Concurrent actions: double-click delete, rapid chat messages
- Error recovery: does the UI recover gracefully from a 500 error?
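The invalid-input scenario lends itself to a reusable data set. A minimal sketch, in which `adversarialInputs` is a hypothetical helper name and the three cases simply restate the examples from the list above:

```typescript
// Returns labelled inputs matching the "Invalid input" bullet above.
function adversarialInputs(): { label: string; value: string }[] {
  return [
    { label: "10K characters", value: "a".repeat(10000) },
    { label: "script tag", value: "<script>alert(1)</script>" },
    { label: "empty submission", value: "" },
  ];
}
```

A test could loop over these values, fill each one into a field such as input-chat-playground, and assert that the UI neither crashes nor renders the markup as HTML.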
References