Run any Skill in Manus with one click

exploratory-qa

Stars0

Forks0

UpdatedJune 16, 2026 at 19:20

First-time-user exploratory QA walkthrough for web apps that FEEDS THE LIFECYCLE. Use when asked to experience an app the way a brand-new human user would — landing cold on the home page and clicking through to find anything confusing, broken, or hard to understand (unclear purpose or audience, human-facing jargon, contextless extracted data, machine-style labels, raw dates/enums, sparse data with no explanation, wrong control semantics, slow or unclear loads, late meaningful content, cramped or cut-off UI, inconsistent/non-standard UX, awkward scroll behavior, unclear affordances, dead-end flows that strand a user — e.g. a login page with no way to register or recover a password, or a primary action that drops the user into an incomplete/unsatisfiable state with no way to finish) across all breakpoints. Instead of writing a report file, it files every finding as a tracked work item via lisa:tracker-write (bugs and usability/UX issues). A `ready` parameter controls whether those tickets are created build-read

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

CodySwannGT

CodySwannGT/harperstarter

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

Related occupationsSOC

Based on SOC occupation classification

Software Quality Assurance Analysts and TestersComputer and Mathematical Occupations·SOC 15-1253

File Explorer

2 files

SKILL.md

readonly

Exploratory QA

Overview

Experience the app the way a brand-new human user would: land cold on the home page with no prior knowledge, then click through and actually try to use it — just like a real person. The goal is to surface anything confusing, broken, or hard to understand, and to do so at every breakpoint.

This is a usability/experience pass, not a test-coverage audit. It does not look at the Playwright suite or hunt for coverage gaps — for that, use the e2e-coverage-gaps skill. Here, every finding is filed as a tracked work item so it enters the Lisa lifecycle — no static report file.

Parameters

target-url | env (first positional) — what to explore.
ready=true|false — the build-ready state for the tickets this pass creates.
- ready=true → created build-ready, so lisa:intake / the build-intake scanner auto-picks them up.
- ready=false (default) → created in the backlog (not build-ready) for a human to review and promote into the queue.

Core Workflow

1. Establish Scope

Identify the target environment, account type, and browser requirement, and read the ready flag (default false).
Confirm the tracker is configured. Findings are filed as tickets, so read tracker from .lisa.config.json (local overrides global). If it is unset, stop and report that the tracker must be configured (via /lisa:setup:jira / :github / :linear) before exploratory QA can file findings — do not silently fall back to a report file.
If credentials, tenant, or seed data are missing and cannot be discovered safely, ask one concise clarifying question.
Treat production-like environments conservatively. Do not mutate production data unless the user explicitly approves it. Prefer a test user, dev/staging environment, or isolated seeded account.

2. Arrive Cold

Start at the home/landing page with no prior knowledge of the app. Do not pre-read the codebase to learn the intended flows — discover them the way a user would, by looking and clicking.
Form a first impression: is it obvious what this app is, what to do first, and where to go next?
On each major page, ask what a cold user would think the page is trying to do or tell them. If the page could plausibly be read as several different products or workflows (public browser vs admin workbench vs data-quality dashboard vs review queue), file a clarity/usability ticket.

3. Use It Like a Human

Click through the visible paths and actually attempt real tasks — a first-time user explores, makes mistakes, and tries the obvious thing. Cover at least these dimensions unless the user narrows scope:

Comprehension & labeling: human-facing copy must sound like something a normal first-time user would understand. Flag machine-style or developer labels shown to users (raw IDs, enum keys, snake_case, null/undefined, untranslated i18n keys), admin/database terms such as "metadata", "rows", "bucket", "record", "entity", or "loaded rows", implementation identifiers such as slugs, unexplained domain jargon, unclear button/menu names, and icons with no discernible meaning. Flag all-caps enum/source labels, raw timestamps, and typo-like machine strings. If a heading, label, or field would make a non-technical user ask "what does that mean?", file a usability/clarity ticket with plainer wording.
Data usefulness & context: extracted facts, metrics, summaries, and structured tables must help a person understand the surface. Flag machine residue that only proves extraction happened, such as repeated generic fields (Money Mention, Entity, Record) paired with values but no sentence, source, category, or explanation of why the value matters. If a user cannot tell what a number, fact, or field refers to without rereading the raw source, file a usability/clarity ticket to hide it from the default UI or add context such as excerpts, labels, grouping, or provenance.
Data volume and trust: compare the amount of visible data to what the page promises. A rich explorer, dashboard, or workbench with only a few rows/items can look broken, filtered, still loading, sample-only, or untrusted. File a finding when sparse data is not explicitly explained with result counts, active filters, reset affordances, ingestion/coverage status, or sample/demo labeling.
Dates, numbers, and source metadata: normal product UI should not expose storage formats unless it is intentionally a technical log. Flag ISO timestamps, inconsistent relative/absolute date styles, unexplained generated/imported/loaded dates, raw score/null states, and source metadata that does not explain what was sourced, when, and why it matters.
Controls and mental model: controls must match what they do. Sort is not a filter; search is not a facet; finite data domains usually need selects/typeaheads rather than blank text inputs. Flag controls that make users guess exact spelling/casing, hide the available option universe, mix filtering with sorting/view settings, or use the wrong component for the task.
Information hierarchy and panel value: scan cards, sidebars, workbenches, summaries, and metric tiles for whether they communicate anything useful at a glance. File findings for blank-looking panels, repeated cards with unclear labels, counts without named subjects, decorative summaries that consume more attention than the primary workflow, or duplicated navigation that squeezes the actual task area.
Navigation clarity: is it obvious how to get somewhere and back? Dead ends, hidden entry points, surprising redirects, broken links, no clear "home". Flag duplicated nav regions that compete for space without adding page-specific value, and icon systems that degrade into raw punctuation or mix unrelated visual languages.
Flow completeness & expected counterparts: a screen that gates access or shows one side of a standard paired flow must offer the other side — or a clear path to it. A brand-new user must never hit a dead end with no next step. Flag missing companion actions, especially on auth and entry screens:
- Sign-in with no sign-up: a login page with no "Create account" / "Register" link strands anyone who does not already have an account; likewise a registration page with no link back to sign in.
- No account recovery: login with no "Forgot password?", no way to reset, and no way to resend a verification email.
- No exit from a state: a signed-in app with no visible sign-out, or a modal / wizard / detail view with no back, close, or cancel.
- One-way actions: create/add with no matching edit or delete (or the reverse) where a user would reasonably expect both.
- Unreachable entry points: a feature only reachable by guessing a URL, or an empty state with no primary action to populate it. When the missing counterpart makes a core task impossible for a whole class of users (e.g. a new user literally cannot create an account), file a Bug; otherwise file a usability Improvement.
Action preconditions & incomplete end-states: an action whose result only makes sense with multiple inputs (compare, merge, combine, bulk-edit) or with some prerequisite met should guide the user to satisfy that precondition — by disabling/explaining it until it is met, by collecting the inputs first (e.g. a selection tray), or by giving the destination an obvious in-place control to complete it. Actually trigger these actions and watch where they land; flag when a primary action:
- Fires under-satisfied and strands the user: e.g. a per-row "Compare" that navigates to a comparison of a single item, shows an "under limit / add at least 2" notice, but exposes no visible "add another" control — the user is told what is wrong with no in-place means to fix it.
- Lands on an incomplete / empty end-state with no next step: a results or detail screen that announces it is empty, partial, or "needs more" yet offers no affordance to add, retry, or return to the selection that produced it.
- Is offered where it cannot succeed: a multi-item action exposed on a single item, or an action left enabled while its precondition (a selection, a minimum count, a required field) is unmet, with no explanation. When the user is left unable to complete the action they started, file a Bug; when it eventually works but the path is confusing or roundabout, file a usability Improvement.
Visual/layout quality: cut-off or truncated text, overlap, cramped/crowded density, offscreen or unreachable controls, accidental horizontal scroll, awkward empty space. Do not judge this by eyeballing a screenshot alone — a control clipped by a few pixels or pushed just past a container edge looks fine in a thumbnail. Confirm it with the programmatic layout-integrity sweep in §5 at every width.
Consistency / standard UX: components, spacing, button styles, terminology, and interaction patterns should be consistent across the app and follow common conventions. Flag anything non-standard or that differs screen-to-screen.
Load & responsiveness: long or unclear load times, blank screens, skeleton-only shells, spinners / Loading... / Connecting... with no progress, anything that feels slow or janky. Flag pages where the browser reports loaded / complete but meaningful content arrives much later, or where the visible shell appears quickly while the real task content remains missing. Capture user-perceived timings: shell visible, first meaningful content, and stable/complete content. If the delay is noticeable, file a usability/performance ticket even if the eventual content is correct.
Scroll behavior: unexpected scroll position, scroll jumps, nested or locked scroll, sticky elements that cover content, content that cannot be reached.
Behavior correctness: does the obvious action do what a user expects? Confusing errors, silent failures, disabled controls with no explanation, state that does not persist.
Affordance clarity: can the user tell what is clickable, required, in-progress, or complete?

4. Cover All Breakpoints

Discover breakpoints from the app (design tokens, CSS, responsive layout changes) when possible; if unknown, use a practical baseline: phone, tablet/narrow, desktop, plus any app-specific cutoff.
Do not test only the named breakpoints. Clipping and overflow most often appear at the in-between widths — where a row can no longer fit its contents but has not yet collapsed to the next layout. Sweep a range of widths (e.g. 360, 390, 414, 600, 768, 834, 1024, 1280, 1440) plus a few intermediate steps (e.g. ~900–1180) and re-check the key paths at each.
At each width, walk the key paths again and confirm the experience holds: expected shell/navigation variant, critical controls visible and reachable, no unintended horizontal overflow, intentional scroll containers still usable, nothing cut off or crowded.

5. Run Layout-Integrity Checks — Don't Eyeball Alone

A screenshot glance misses controls clipped by a few pixels or pushed just past a container edge. At every width, in addition to looking, take DOM measurements via the browser automation tool (Playwright, Chrome MCP, etc.) and treat any of these as a finding:

Document / container overflow: document.documentElement.scrollWidth > clientWidth, or a horizontal scrollbar on a container that should not scroll → accidental horizontal overflow.
Clipped or offscreen controls: for every interactive control (buttons, links, inputs, selects, menu items), compare its getBoundingClientRect() against the viewport and against each ancestor that has overflow: hidden | clip | auto | scroll. If any edge of the control falls outside those bounds, it is partially or fully clipped / unreachable — even when the page looks fine in a thumbnail. This is exactly the case that gets missed: a submit/apply button whose right edge is cut off by its filter card.
Truncated meaningful text: an element whose scrollWidth > clientWidth (or that renders an ellipsis) where the hidden text carries meaning — e.g. a select showing "Any CRD state" jammed into its chevron, a label cut mid-word.
Colliding controls: a label or value overlapping an adjacent control (icon, chevron, button) with no gap between them.

Record which width(s) trigger each, the offending element, and a screenshot. A primary or interactive control that is clipped, offscreen, or unreachable is a Bug, not merely an Improvement — a user literally cannot see or click all of it.

6. Watch Load & Latency

Measure separate milestones: visible app shell, document.readyState, first meaningful route-specific content, and visually stable/full route content.
Do not treat a visible shell, completed document, or technically clickable page as loaded if the route is still blank, skeleton-only, placeholder-only, or waiting for primary data.
Treat skeleton-only/placeholder-only screens as loading states. If they persist for a noticeable delay, they need clear progress/loading messaging that explains what is happening.
Treat long waits to meaningful content or stable/full content without clear progress, error, retry, or cancellation as findings. Use practical labels (noticeable, slow, unacceptable) and include observed durations when available.

Mutation Discipline

A real first-time user creates, edits, and deletes things — exercise those flows when the environment is safe.

Use unique names with a clear prefix such as qa- or codex-.
Before mutating, identify the cleanup path. After mutating, make a best effort to clean up through the UI, then verify cleanup. If UI cleanup is unavailable, that itself is a usability finding.
Avoid destructive bulk actions unless the user explicitly asks or the account is clearly disposable.
Record all mutations performed, cleanup attempts, and any residue left behind.

Filing findings as tracked work

This skill does not write a report file. Every finding becomes a leaf work item created via lisa:tracker-write (the vendor-neutral writer — it dispatches to the configured tracker and runs the validation gate; never call a vendor *-write-* skill directly). Map each finding to a type:

Finding	`issue_type`	`build_ready`
User-visible bug (broken behavior)	`Bug`	the `ready` flag (default `false`)
Usability / UX / clarity issue	`Improvement`	the `ready` flag (default `false`)

A control that is clipped, offscreen, or otherwise unreachable (per §5) counts as broken behavior → file it as a Bug, not an Improvement. Pure crowding/clarity with the control still fully usable is an Improvement.

Each finding is a flat leaf (no children), so build_ready applies directly — pass it explicitly on every create. Each ticket MUST be a complete spec (the validator rejects thin tickets):

Three-audience description (context / business value, technical approach, stakeholder impact).
For a bug: exact reproduction steps, observed-vs-expected, the env / account / breakpoint it occurred at, and console/network evidence.
For a usability issue: the observed friction (what was confusing, cramped, inconsistent, or hard to understand), who it affects, where (route + breakpoint), and the proposed improvement.
Gherkin acceptance criteria describing the fixed behavior.

Idempotency — don't spam duplicates

Re-running a pass must not refile the same finding. Before creating a ticket, search the tracker for an open ticket carrying a stable marker [lisa-exploratory-qa] <finding-key> in its body (the <finding-key> is a stable slug of surface + symptom, e.g. settings-modal/horizontal-overflow@tablet). If one exists, reference/update it instead of creating a duplicate; only create when none exists. Match by the marker, never by title (titles get edited). A closed prior ticket does not suppress a new one — a recurrence after a fix is a genuine regression.

Output

No report file. Emit a concise in-session summary:

Scope: target URL/env, browser/tool, account type, build/version if visible, date.
First impression: could a new user tell what the app is and what to do first?
Findings filed, bucketed by type — bugs, usability/clarity issues — each with its created or referenced ticket ref and its build-ready state (ready vs triage/backlog).
Observed but not filed: anything noticed but intentionally not ticketed, with why.

Quality Bar

Explore as a true first-time user — judge clarity, not whether you (who can read the code) can figure it out.
Prefer concrete, reproducible findings. Every ticket must stand alone for an implementer who was not in the session.
Do not claim cleanup succeeded unless verified.
File tickets per the ready flag (default: backlog for human triage).
This skill is about the human experience only — route automated-coverage gaps to e2e-coverage-gaps.
Preserve unrelated repo changes.

exploratory-qa

More from this repository

More from this repository

Exploratory QA

Overview

Parameters

Core Workflow

1. Establish Scope

2. Arrive Cold

3. Use It Like a Human

4. Cover All Breakpoints

5. Run Layout-Integrity Checks — Don't Eyeball Alone

6. Watch Load & Latency

Mutation Discipline

Filing findings as tracked work

Idempotency — don't spam duplicates

Output

Quality Bar

Exploratory QA

Overview

Parameters

Core Workflow

1. Establish Scope

2. Arrive Cold

3. Use It Like a Human

4. Cover All Breakpoints

5. Run Layout-Integrity Checks — Don't Eyeball Alone

6. Watch Load & Latency

Mutation Discipline

Filing findings as tracked work

Idempotency — don't spam duplicates

Output

Quality Bar