| name | opencli-getyourguide |
| description | Execute and troubleshoot opencli getyourguide commands for GetYourGuide activities — search, detail, pricing, probe. Use when the user mentions GetYourGuide, GYG, getyourguide.com, or provides a GYG activity URL (pattern `/city-l{id}/title-t{activityId}/`) or the trailing `t{N}` activity ID; also use when opencli-router dispatches a GYG task or a tours / compare workflow targets the getyourguide platform. |
opencli-getyourguide
- Platform: getyourguide
- Owner: klook-cli core team (ryan.huang@klook.com); scrape logic is maintained here, listing-page discovery is owned by the BD team (see docs/colleague-handoff.md)
- Strategy: BROWSER_BRIDGE (GYG is SSR, no public API). Verify Browser Bridge with opencli doctor.
- Domain: getyourguide.com
- Source: src/clis/getyourguide/ (search.ts, detail.ts, pricing.ts, plus probe.ts and probe2.ts debug helpers)
Identifiers
- GYG activity IDs are numeric, extracted from the trailing t{N} segment of the URL.
- Canonical URL: https://www.getyourguide.com/<city>-l{cityId}/<slug>-t{activityId}/. parseActivityId in src/clis/getyourguide/detail.ts accepts either the full URL or the bare numeric ID, so there is no need to pass the whole URL if you already have the trailing number.
- City prefix (l{cityId}) is separate from the activity ID and not used by any command directly; it only shows up in search result URLs.
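The ID rules above can be sketched as follows. This is a hypothetical re-implementation for illustration only: the real parseActivityId lives in src/clis/getyourguide/detail.ts and may handle more cases. The name parseActivityIdSketch and the choice to return null on failure are assumptions.

```typescript
// Hypothetical sketch; not the actual parseActivityId from detail.ts.
function parseActivityIdSketch(input: string): string | null {
  const trimmed = input.trim();
  // Bare numeric ID, e.g. "797962".
  if (/^\d+$/.test(trimmed)) return trimmed;
  // Drop any query string or fragment before looking at the path.
  const path = trimmed.replace(/[?#].*$/, "");
  // The activity ID is the trailing "-t{N}" segment, with or without a final slash.
  const match = path.match(/-t(\d+)\/?$/);
  return match?.[1] ?? null;
}
```

A city-only URL such as https://www.getyourguide.com/bangkok-l169/ yields null here, because the l{cityId} segment is deliberately not treated as an activity ID.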
Commands
search
opencli getyourguide search-activities "<query>" --limit <N> -f json
Browser Bridge DOM extraction; expect ~10s. Returns { id, title, price, rating, review_count, url }[]. Rating pattern on result cards is 4.3(1,032) — the scraper parses both score and review count from that string.
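As a rough illustration of the rating parse described above, the sketch below splits a card string such as 4.3(1,032) into score and review count. The function name parseRatingText is hypothetical, and the code assumes the review count uses commas as thousands separators; the scraper in search.ts may do this differently.

```typescript
// Hypothetical helper; field names mirror the search output shape.
interface RatingInfo {
  rating: number;
  review_count: number;
}

function parseRatingText(text: string): RatingInfo | null {
  // Score (integer or decimal), then the review count in parentheses.
  const match = text.match(/(\d+(?:\.\d+)?)\s*\(([\d,]+)\)/);
  if (!match) return null;
  const [, score = "", count = ""] = match;
  return {
    rating: Number(score),
    // Assumption: commas are thousands separators, as in "1,032".
    review_count: Number(count.replace(/,/g, "")),
  };
}
```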
detail
opencli getyourguide get-activity <id-or-url> -f json
- Title from h1; description from meta[name="description"].
- Language dropdown: the detail scraper auto-clicks the language button (matches English / 日本語 / 中文 / 한국어 / Français / Deutsch / Español / Italiano / Thai / Vietnamese / etc.) and enumerates available language options. This is a variant axis GYG buries in UI — it's part of your package matrix.
- Generic dropdown walker runs after the language pass to catch passenger-tier / vehicle / other variants.
pricing
opencli getyourguide get-pricing-matrix <id-or-url> --days 7 -f json
Per-variant × per-date matrix. Flow:
- Click "Check availability" (the sticky CTA) to open the datepicker.
- For each target date, click the cell whose aria-label matches "Weekday, Month Day, Year" (e.g. Saturday, April 25, 2026).
- Wait for the variants pane to re-render, scrape per-variant price.
- Re-open the datepicker between dates — after the first selection, the CTA is replaced by a date input that re-opens the picker on click.
The datepicker does embed a "special price" badge per day, but the scraper ignores it — those badges are promotional and don't reflect the real per-variant price that materializes after selection.
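For probe scripts, the "Weekday, Month Day, Year" label used in the flow above can be generated with the standard Intl.DateTimeFormat API. This is a minimal sketch for English pages only; the helper names are hypothetical, and pricing.ts may build or match the label in another way.

```typescript
// Hypothetical helpers for building English datepicker labels.
function ariaLabelForDate(year: number, month: number, day: number): string {
  // Work in UTC so the label does not shift with the machine's timezone.
  const date = new Date(Date.UTC(year, month - 1, day));
  return new Intl.DateTimeFormat("en-US", {
    weekday: "long",
    month: "long",
    day: "numeric",
    year: "numeric",
    timeZone: "UTC",
  }).format(date);
}

// Labels for `days` consecutive dates, mirroring a --days 7 style run.
function ariaLabelsFrom(year: number, month: number, day: number, days: number): string[] {
  const labels: string[] = [];
  for (let offset = 0; offset < days; offset++) {
    // Date.UTC normalises day overflow, so day + offset rolls into the next month.
    const d = new Date(Date.UTC(year, month - 1, day + offset));
    labels.push(ariaLabelForDate(d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate()));
  }
  return labels;
}
```

For example, ariaLabelForDate(2026, 4, 25) produces "Saturday, April 25, 2026".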
probe / probe2 (debug)
opencli getyourguide probe <url>
opencli getyourguide probe2 <url>
Debug-only. Use when pricing breaks and you need to see whether the "Check availability" button / datepicker selectors changed.
trending
Not supported on GYG. Do not offer this command.
Quirks
- Language is a first-class variant: GYG's UX nests language inside a button rather than a price matrix cell. The scraper pulls it separately — your output will have more "variants" than a naïve price-matrix scrape would suggest.
- Datepicker re-opens differently after first selection: the initial button literally reads "Check availability"; after a date is picked, it becomes a date input. The pricing scraper handles both but keep this in mind when writing probe scripts.
- Lazy card images: search cards lazy-load images; waiting for the image is NOT a reliable proxy for "card ready". The scraper keys off -t\d+ URL matching instead.
- "Top pick" / "Booked N times" badges: these get stripped from titles with a regex — if titles look polluted, check whether GYG added a new badge format.
- Cancellation policy is an inline chip, not a heading: GYG renders "Free cancellation / Cancel up to 24 hours in advance for a full refund" as a small chip near the title. There is no h2 for it, so the section walker misses it. The cancellation_policy field is filled by the body-text fallback (extractCancellationFromBody) matching the Free\s+cancellation pattern. Body text inserts a newline between the chip's two lines, so anything the regex captures after that prefix must be able to cross lines: use [\s\S] rather than ., which stops at a newline.
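The multiline point in the cancellation quirk can be shown with a small sketch. It is not the actual extractCancellationFromBody; the function name, the 120-character window, and the "full refund" terminator are assumptions chosen to fit the example chip text.

```typescript
// Hypothetical sketch of a body-text fallback for the cancellation chip.
function extractCancellationSketch(bodyText: string): string {
  // [\s\S] matches newlines too, so the capture can span both chip lines.
  // Assumption: the chip's second line ends with "full refund" within 120 chars.
  const match = bodyText.match(/Free\s+cancellation(?:[\s\S]{0,120}?full refund)?/i);
  if (!match) return ""; // No chip: empty string, not a failure.
  // Collapse the newline between the two chip lines into a single space.
  return match[0].replace(/\s+/g, " ").trim();
}
```

With "." in place of [\s\S], the optional group would never match across the newline, and only the first line of the chip would be captured.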
Fallback playbook
When opencli getyourguide get-pricing-matrix <id> fails or returns empty:
- Retry once. The datepicker click sequence is fragile to render jitter.
- Fall back to detail via node dist/cli.js tours ingest-from-detail getyourguide <id>. Detail returns a less granular package list, but that is enough for coverage tracking.
- Snapshot replay from data/snapshots/getyourguide-<id>-*.json with tours ingest-from-snapshot getyourguide <file>.
- Manual capture via the browse skill → navigate, select a date, save JSON matching PricingRunRaw → tours ingest-snapshot.
Known failure modes
- Empty variants after date click: the variant pane sometimes renders behind a modal / overlay on first visit. The scraper waits, but if GYG tightens the animation, the first date will be empty. Symptom: variant count = 0 for date[0] but normal for dates[1..6].
- Language dropdown not found: on some single-language tours the language button is absent. In that case languages: [] is the expected output, not a failure.
- Currency inconsistency: prices appear in whatever currency the Browser Bridge cookie pins. Pin to a canonical currency per environment if you need cross-run comparability.
- Locale support (partial, observed 2026-04-28; date click and bar price fixed, variant cards still TODO): pricing under non-English locales (zh-TW etc.) used to return 0 rows. Three fixes shipped (commit c2b0329):
  - Date click no longer reads localised aria-labels; it uses #month-YYYY-M (id is locale-blind, verified) plus a day text-content match.
  - Bar-price extractor accepts ISO currency prefixes like "USD373" / "TWD12000" alongside symbols ($/€/£/¥/₩).
  - When both prices are present, the extractor prefers [class*="actual-price"] (post-discount) over [class*="from-base-price"] (strikethrough original).

  Still TODO: option / variant cards (the multiple package choices like "At Low Price / In Crown / In Prado" on t797962) aren't found by the current selectors [data-testid*="option-card"], [class*="option-card"], [class*="variant-card"]. After a date click the bar-price fallback succeeds, but it yields a single row with variant_index=0 rather than the 3 cards visible in the UI. To fix: probe the post-click DOM (write a one-off in pricing.ts that dumps cards after first click + wait), find the actual class/structure GYG uses on this product, and update READ_VARIANTS_JS.
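The bar-price rule from the locale fixes (ISO currency prefix or symbol, followed by the amount) can be sketched as below. The function name parseBarPriceSketch is hypothetical, and the code assumes commas are thousands separators and a dot is the decimal mark; the extractor in pricing.ts may differ.

```typescript
// Hypothetical sketch of a bar-price parser; not the code in pricing.ts.
interface BarPrice {
  currency: string; // ISO code ("USD") or symbol ("$"), as found in the text
  amount: number;
}

function parseBarPriceSketch(text: string): BarPrice | null {
  // Either a three-letter ISO prefix or one of the symbols named in the fix.
  const match = text.trim().match(/^([A-Z]{3}|[$€£¥₩])\s*(\d[\d,]*(?:\.\d+)?)$/);
  if (!match) return null;
  const [, currency = "", digits = ""] = match;
  // Assumption: commas are thousands separators.
  return { currency, amount: Number(digits.replace(/,/g, "")) };
}
```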
Touchpoints
- src/clis/getyourguide/search.ts: card URL regex (-t\d+), title badge stripping
- src/clis/getyourguide/detail.ts: parseActivityId, language dropdown handler, generic variant walker
- src/clis/getyourguide/pricing.ts: datepicker open/re-open, aria-label date click, variant scrape
- src/clis/getyourguide/probe.ts / probe2.ts: debug-only DOM dumps
After any change: npm run build.
I/O Schema
Canonical reference: docs/io-schemas.md — input args, output JSON shapes, DB column mappings.
GYG-specific nuances:
- Language is a first-class axis: get-activity / get-packages may return more packages[] entries than a naïve "package matrix" would, one per language option. Store under packages.available_languages (JSON array) rather than duplicating SKUs. get-pricing-matrix returns per-variant × per-date where "variant" combines language + passenger tier.
- No dedicated trending: skip that section when inserting.
- cancellation_policy (cross-platform field, GYG-specific extraction path): filled by body-text fallback because the policy is rendered as an inline chip (no heading). Output is typically a short one-liner like "Free cancellation Cancel up to 24 hours in advance for a full refund". For non-cancellable activities GYG simply omits the chip, so an empty string is the expected result, not a scraper failure.

Writes when called via tours pipeline: activities, packages, skus, sku_observations (same as other platforms). Use the languages array from get-activity output to populate packages.available_languages JSON.
Listing (new pipeline)
GYG city URL: https://www.getyourguide.com/{slug}-l{LID}/. The country
page caps at the top ~700–1000 activities; for full coverage iterate every
verified city LID.
Reference data: data/listings/gyg-destinations.csv, verified Thailand
LIDs (from the colleague archive). Stale LIDs that redirect to other
countries do occur; always verify document.title after navigation.
Reference (verbatim playbook):
docs/listings/gyg-skill-archive.md, the colleague's Claude-driven Show-more
+ extraction recipe. Useful as the engineering reference for the future
adapter command.
Locality: search.ts:41 currently returns cityName: ''. The slug
carries the city (/bangkok-l169/...-t12345/) — derive from URL when
producing listing JSON; or extend the adapter when convenient.
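Deriving the city from the URL, as suggested above, might look like the following sketch. The function name cityFromGygUrl and the returned shape are hypothetical; the code scans path segments for the first one ending in -l{LID}, and returns the raw slug rather than guessing a display name.

```typescript
// Hypothetical sketch for filling cityName from a GYG URL.
interface CityRef {
  slug: string; // e.g. "bangkok"
  lid: number;  // e.g. 169
}

function cityFromGygUrl(url: string): CityRef | null {
  let pathname: string;
  try {
    pathname = new URL(url).pathname;
  } catch {
    return null; // Not a parseable absolute URL.
  }
  // Scan segments so a leading locale segment, if present, does not break the match.
  for (const segment of pathname.split("/")) {
    const match = segment.match(/^(.+)-l(\d+)$/);
    if (match) {
      const [, slug = "", lid = ""] = match;
      return { slug, lid: Number(lid) };
    }
  }
  return null;
}
```

Since stale LIDs can redirect to other countries, a slug derived this way is only as trustworthy as the URL it came from; checking document.title after navigation still applies.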
TODO: implement opencli getyourguide search-by-url <url> (parses
city from slug, walks Show-more, emits raw cards). See
docs/listings/design.md §6.