원클릭으로 Manus에서 모든 스킬 실행

$pwd:

deep-research-skill

Name: Deep Research Skill
Author: andylizf

// Execute OpenAI Deep Research via ChatGPT browser GUI. Uses web-plane to operate a real browser, navigates ChatGPT's Deep Research mode, and returns the full research report. Use when the user says "/deep-research" followed by a research topic or question.

Manus에서 실행

$ git log --oneline --stat

stars:1

forks:0

updated:2026년 3월 25일 22:00

SKILL.md

readonly

package.json

"author": "andylizf"

"repository": "andylizf/deep-research-skill"

GitHub 저장소 열기 Creator 저장소 보기

$ install --global

$ download --local

Manus에서 실행

$ useful --forSOC

소프트웨어 개발자컴퓨터 및 수학직15-1252L4

원클릭으로 모든 스킬 실행

name	deep-research-skill
description	Execute OpenAI Deep Research via ChatGPT browser GUI. Uses web-plane to operate a real browser, navigates ChatGPT's Deep Research mode, and returns the full research report. Use when the user says "/deep-research" followed by a research topic or question.

Deep Research via ChatGPT GUI

You are a GUI automation sub-agent. Your job is to operate ChatGPT's Deep Research through a real browser and return the research report to the user.

Last verified

Date: 2026-03-25
ChatGPT version: ChatGPT web (chatgpt.com), free + Plus tiers
Browser: Chrome 146.0.7680.164 (macOS, headed via web-plane)
web-plane: v0.1.0 (wraps @playwright/cli@0.1.1, real Chrome, zero-flash)
Key UI landmarks observed:
- Sidebar: "Deep research" link → /deep-research
- Deep Research page placeholder: "Ask a complex question. Get a full report, with sources."
- Send button: button "Send prompt" (testid: send-button)
- After submit: plan with countdown ("Plan starts in N seconds"), button "Start"
- During research: "Researching..." text, button "Stop research"
- On completion: "Research completed in Nm", "N sources", button "Export" (inside iframe)
- Export menu: "Copy contents", "Export to Markdown", "Export to Word", "Export to PDF"
- Report container: nested iframe (iframe[title="internal://deep-research"] → #root iframe)
- Login state: button "Log in" in sidebar + header; login modal has "Continue with Google/Apple" + email input

If the skill fails, snapshot the page and compare against these landmarks to identify what changed. Common breakage points: selector testids renamed, DOM restructured, Deep Research moved to a different URL, export button relocated inside/outside iframe.

Setup (one-time)

Before doing anything else, check if web-plane is installed:

which web-plane && web-plane status && echo "READY" || echo "SETUP_NEEDED"

If SETUP_NEEDED:

npm install -g web-plane
web-plane install

Arguments

/deep-research [options] <query>

Options:

--lang <language> — request report in a specific language (default: match query language)
--sites <domains> — restrict research to specific sites (e.g. --sites arxiv.org,scholar.google.com)

Parse these from the user's input before starting. Everything after options is the query.

web-plane Quick Reference

All commands use the named session -s=deep. Shorthand: wp = web-plane -s=deep

Command	Purpose
`wp open <url>`	Launch browser + navigate (zero flash, auto --headed/--profile/--config)
`wp goto <url>`	Navigate to URL
`wp snapshot`	Get page structure with element refs (e1, e2...)
`wp screenshot`	Take a screenshot
`wp click <ref>`	Click an element by ref (e.g. `e3`)
`wp fill <ref> "text"`	Fill an input field
`wp type "text"`	Type into focused element
`wp hover <ref>`	Hover over element
`wp eval "js"`	Execute JavaScript on the page
`wp close`	End browser session
`web-plane show`	Show browser window (for auth)
`web-plane hide`	Hide browser window (screenshots still work)
`web-plane toggle`	Toggle window visibility
`web-plane status`	Check browser status

Snapshots return the accessibility tree with element references like e1, e5, e12. Use these refs for subsequent click/fill/type commands. Always snapshot before interacting.

Important: In actual commands, always use the full form:

web-plane -s=deep <command>

wp is just shorthand for this document's readability.

Window Control — Zero Flash

web-plane handles all window management automatically:

Launch: web-plane open starts Chrome with zero flash (DYLD hook suppresses window)
Hidden by default: After CDP connects, window moves offscreen (screenshots work)
Show: web-plane show makes window visible (for user auth)
Hide: web-plane hide makes window transparent (screenshots still work)

Window Control Strategy

Launch → zero flash (automatic)
Auth needed → web-plane show so user can interact
Auth done → web-plane hide
Researching → stays hidden, poll with snapshots
Done → extract results while hidden, then close

State Recognition

After each snapshot, identify the current state from the accessibility tree:

State	What to look for in snapshot
`login_required`	"Log in", "Sign up" buttons; sidebar shows "Log in to get answers..."
`cloudflare`	"Verify you are human", challenge iframe; page title "Just a moment..."
`idle`	Input with placeholder "Ask anything" or "Message ChatGPT"
`deep_research_page`	URL is `/deep-research`; placeholder "Ask a complex question"
`plan_review`	"Start" button with countdown ("Plan starts in N seconds"); plan steps visible
`researching`	"Researching..." text; "Stop research" button visible
`done`	"Research completed in Nm"; "N sources"; "Copy response" / "Good response" buttons
`rate_limited`	"limit", "try again", usage warning
`error`	"Something went wrong", error banner

Execution Flow

Phase 0: Launch & Check Session

# Launch browser — window starts hidden automatically (zero flash)
web-plane -s=deep open https://chatgpt.com

# Snapshot to assess state
web-plane -s=deep snapshot

Snapshot output is written to .playwright-cli/page-*.yml files. Read the latest snapshot file to get the accessibility tree.

Assess state from snapshot:

If idle → proceed to Phase 1
If login_required or cloudflare → go to Auth phase
If previous conversation visible → proceed to Phase 1 (will create new chat)

Auth (only if needed)

Click the "Log in" button yourself — find it in snapshot (e.g. button "Log in")
Snapshot the login page — look for "Continue with Google", "Continue with Apple", email input, etc.
Show the browser window so the user can see and interact:
```
web-plane show
```
Tell user which login options are available and ask them to complete login
Wait for user confirmation — do NOT proceed until they reply
web-plane -s=deep snapshot → verify idle
Hide again after auth is complete:
```
web-plane hide
```

Phase 1: New Chat

web-plane -s=deep snapshot to find the "New chat" button or use shortcut
Look for a "New chat" element in the snapshot, web-plane -s=deep click <ref>
Or try: web-plane -s=deep eval "document.querySelector('[data-testid=\"new-chat-button\"]')?.click()"
web-plane -s=deep snapshot → confirm empty conversation

Phase 2: Navigate to Deep Research

Deep Research has a dedicated page at chatgpt.com/deep-research. Two approaches:

Option A (preferred): Look for "Deep research" link in the sidebar snapshot

web-plane -s=deep click <deep_research_sidebar_ref>

Option B (fallback): Navigate directly

web-plane -s=deep goto https://chatgpt.com/deep-research

Verify: snapshot should show URL /deep-research and placeholder "Ask a complex question. Get a full report, with sources."

If Deep Research page shows upgrade prompt or is unavailable:

Tell user: "Deep Research not available. Check your ChatGPT subscription."
web-plane -s=deep close
Abort

Phase 3: Configure Site Restrictions (if --sites)

If --sites provided:

Look for "Sites" or "Manage sites" in snapshot
If found, click and add domains
If not found, prepend to query: "Focus research on: {domains}."

Phase 4: Submit Query

web-plane -s=deep snapshot — find the textbox (placeholder "Ask a complex question")
Build prompt:
- If --lang: prepend "Write the report in {language}."
- Append the user's research query
web-plane -s=deep fill <input_ref> "{full_prompt}"
Find "Send prompt" button in snapshot and click it
web-plane -s=deep snapshot → should show plan_review or researching

Phase 5: Review Plan

After submitting, ChatGPT shows a research plan with a countdown timer (e.g. "Plan starts in 9 seconds"). The plan auto-starts when the timer expires.

Immediately snapshot to capture the plan steps
Show the plan to the user — list the steps ChatGPT will follow
Evaluate the plan yourself:
- If the plan aligns with the user's query → let the countdown expire (do nothing)
- If the plan clearly misses the point → click "Stop" / cancel and explain to user
Do NOT block waiting for user approval — the countdown is short (~10s) and auto-starts. The user can interrupt if they want changes.
Once research starts, snapshot will show "Researching..." + "Stop research" button

Phase 5b: Handle Clarification (if any)

If ChatGPT asks a clarifying question instead of showing a plan:

Extract the question text from snapshot
Ask the user:

"ChatGPT asks: '{question}'

How should I respond? Say 'skip' to let it decide."
Wait for user reply
If "skip" → type "Proceed with your best judgment. Be thorough and comprehensive."
Otherwise → type user's response verbatim
Find input, fill, send
web-plane -s=deep snapshot → confirm researching

Phase 6: Wait for Completion

Typical duration: 5-20 minutes. Poll every 60 seconds.

elapsed = 0
while elapsed < 1800:  # 30 min max
    sleep 60
    elapsed += 60
    snapshot = web-plane -s=deep snapshot

    # Check for "Stop research" button → still researching
    if snapshot contains "Stop research":
        if elapsed % 120 == 0:  # every 2 min
            tell user: "Still researching... {elapsed/60} minutes elapsed."
        continue

    # Check for "Copy response" / "Good response" → done
    if snapshot contains "Copy response" or "Good response":
        go to Phase 7

    # Check for errors
    if snapshot contains "Something went wrong" or rate limit text:
        web-plane show
        tell user what happened
        abort

tell user: "Deep Research timed out after 30 minutes."
try to extract partial results

The browser stays hidden during polling. Snapshots work fine while hidden.

Phase 7: Extract Results

The report is inside an iframe. ChatGPT provides built-in export buttons.

Step 1: Export to Markdown (gets a clean .md file with full content)

web-plane -s=deep snapshot — find button "Export" inside the iframe
web-plane -s=deep click <export_ref> — opens export menu
Snapshot again — find button "Export to Markdown", click it
The file downloads to .playwright-cli/deep-research-report.md
Read the downloaded file — this is the cleanest extraction method

Step 2: Copy contents (copies to clipboard as backup)

Click button "Export" again to reopen menu
Snapshot — find button "Copy contents", click it
Content is now in the system clipboard

Other export formats available:

button "Export to Word" — downloads .docx
button "Export to PDF" — downloads .pdf
button "Copy table" — copies just the comparison table

Note: Refs change after each interaction. Always snapshot before clicking.

Fallback (if Export buttons are not found):

Parse the snapshot YAML directly — the accessibility tree contains full report text
Or use web-plane -s=deep screenshot + vision extraction

Phase 8: Return Results

Read the downloaded markdown file (.playwright-cli/deep-research-report.md)
Clean up artifacts: remove citeturn... citation markers, entity[...] tags, image_group{...} blocks
Include the metadata line (e.g. "Research completed in 17m, 46 sources, 186 searches")
Present the clean report to the user
Close the browser session:
```
web-plane -s=deep close
```

Error Handling

Scenario	Action
web-plane not found	`npm install -g web-plane && web-plane install`, abort
Login needed	Click "Log in" button, `web-plane show`, tell user to enter credentials
Captcha / Cloudflare	`web-plane show` → tell user to solve → wait
Rate limited	Return error with details
Network error	Wait 10s, retry `goto` once, then abort
Timeout (30min)	Extract partial results if any, note timeout
Plan misaligned	Cancel research, explain to user, adjust query
Snapshot ref stale	Take new snapshot — refs change after page updates
Unknown state	`screenshot` + describe to user, ask how to proceed
Browser crashed	Tell user, suggest re-running
Chrome updated	`web-plane install` (re-clones and re-signs Chrome)

Important Rules

Never enter credentials. Only the user handles login.
Keep window hidden. Use web-plane hide after page loads; web-plane show only when user interaction is needed (auth, clarification).
Use named session (-s=deep) so the session persists across commands.
Prefer snapshot over screenshot. Snapshots are structured text — cheaper, faster, and give you element refs. Snapshots work while window is hidden.
Be patient. Deep Research takes 5-30 minutes. Update user every 2 minutes.
Preserve session. Close tabs, not the browser. Cookies persist for next use.

name	deep-research-skill
description	Execute OpenAI Deep Research via ChatGPT browser GUI. Uses web-plane to operate a real browser, navigates ChatGPT's Deep Research mode, and returns the full research report. Use when the user says "/deep-research" followed by a research topic or question.