Name: Deploy Workspace
Author: dlt-hub

// Deploy dlt pipelines to dltHub Platform. Use when the user says "deploy to dltHub", "launch on dltHub", "run on dltHub", "schedule pipeline", or wants to deploy a pipeline or notebook to dltHub.

updated: 21 May 2026 at 15:44

package.json

"author": "dlt-hub"

"repository": "dlt-hub/dlthub-start"


name	deploy-workspace
description	Deploy dlt pipelines to dltHub Platform. Use when the user says "deploy to dltHub", "launch on dltHub", "run on dltHub", "schedule pipeline", or wants to deploy a pipeline or notebook to dltHub.

Deploy to dltHub Platform

If this is a first deployment, complete (setup-runtime) and (prepare-deployment) first — they set up the workspace, configure credentials, and log in to runtime. Otherwise, continue from here.

Step 1: Prepare scripts for production

Review each script being deployed and fix patterns that are safe locally but harmful in production:

- Remove dev_mode=True from dlt.pipeline() calls: it drops and recreates the dataset on every run, destroying production data.
- Remove or externalize dev limits: limit=N parameters, .add_limit(N) calls, or hardcoded date ranges meant for testing. Either remove them or make them configurable (e.g. via dlt.config.value).
- Verify write_disposition: "replace" is fine for full-refresh pipelines, but confirm the user doesn't actually want "merge" or "append" for incremental loads.
- Check for the if __name__ == "__main__": block: every script must have one, or the runtime job does nothing. The block must NOT contain interactive or debug-only code.
- Pin the dlt version exactly in pyproject.toml: use == rather than >= to prevent unexpected upgrades on the runtime. If the user has a pre-release (e.g. 1.23.0a3), install it with uv pip install and pin it with == in pyproject.toml (do NOT use uv add, which may downgrade to the latest stable release).
- Notebooks (marimo apps):
  - Verify they use dlt.attach() (not dlt.pipeline()) and that destination and dataset_name are passed explicitly (this is a temporary limitation of the dltHub Platform).
  - Verify that all visualization dependencies (altair, ibis-framework, pandas, etc.) are listed in pyproject.toml.
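Part of the review above can be automated. The following is a minimal sketch, not part of dlt or the dltHub CLI: find_production_hazards is a hypothetical helper that uses Python's standard ast module to flag dev_mode=True keyword arguments, .add_limit() calls, and a missing top-level main guard. It does not check write_disposition or limit=N parameters, which still need a manual look.

```python
import ast


def find_production_hazards(source: str) -> list[str]:
    """Return warnings for a few Step 1 patterns found in a script's source.

    Hypothetical helper for illustration; it only inspects the syntax tree
    and never imports or runs the script.
    """
    tree = ast.parse(source)
    warnings = []

    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # dev_mode=True passed as a keyword argument to any call
        for kw in node.keywords:
            is_true = isinstance(kw.value, ast.Constant) and kw.value.value is True
            if kw.arg == "dev_mode" and is_true:
                warnings.append(f"line {kw.value.lineno}: dev_mode=True")
        # any method call named add_limit, e.g. my_source().add_limit(10)
        if isinstance(node.func, ast.Attribute) and node.func.attr == "add_limit":
            warnings.append(f"line {node.lineno}: add_limit() call")

    # A top-level `if` whose condition mentions both __name__ and "__main__"
    has_main_guard = any(
        isinstance(node, ast.If)
        and "__name__" in ast.dump(node.test)
        and "__main__" in ast.dump(node.test)
        for node in tree.body
    )
    if not has_main_guard:
        warnings.append('missing if __name__ == "__main__": block')
    return warnings


SAMPLE = (
    "import dlt\n"
    "pipeline = dlt.pipeline(pipeline_name='p', destination='duckdb', dev_mode=True)\n"
    "pipeline.run(my_source().add_limit(10))\n"
)

if __name__ == "__main__":
    for warning in find_production_hazards(SAMPLE):
        print(warning)
```

Because the check is purely syntactic, it is safe to run on scripts whose dependencies are not installed locally.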

Step 2: Deploy, launch, debug

Reference: scheduling-triggers.md | advanced-patterns.md

Step 2a. Deploy a workspace

Skip this step for simple workspaces without a deployment manifest. If __deployment__.py is set up, first run dlthub deploy --dry-run to preview changes, then STOP: show the plan and get approval from the user before deploying.

dlthub deploy  # synchronizes deployment module with runtime

Summarize the output (which jobs were created, updated, or archived).

Step 2b. Run pipelines and notebooks

dlthub run my_pipeline.py              # sync code + run batch job on cloud
dlthub run my_pipeline.py -f           # sync + run, stream logs while running
dlthub run my_pipeline.py --refresh    # sync + run with a refresh signal
dlthub serve my_notebook.py           # sync code + run interactive job on cloud
dlthub serve my_notebook.py -f        # sync + serve, stream logs
dlthub local run <job_name>            # run locally (uses deployment manifest, no sync)
dlthub local run <job_name> --profile prod             # run under a specific profile
dlthub local run <job_name> --start 2024-01-01 --end 2024-02-01  # interval override (ISO 8601)
dlthub local run <job_name> --config KEY=VALUE         # ad-hoc config override (short: -c)
dlthub local run <job_name> --dry-run                  # resolve entry point without launching
dlthub local serve my_notebook.py     # serve locally
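When these commands are launched from a script rather than typed by hand, it can help to assemble the argument list in one place. The sketch below only builds the argv for dlthub local run using the flags listed above; build_local_run_command is a hypothetical helper, and repeating --config once per key is an assumption that the listing above does not confirm.

```python
from typing import Optional


def build_local_run_command(
    job_name: str,
    profile: Optional[str] = None,
    start: Optional[str] = None,
    end: Optional[str] = None,
    config: Optional[dict] = None,
    dry_run: bool = False,
) -> list:
    """Build the argument list for `dlthub local run` (hypothetical helper)."""
    cmd = ["dlthub", "local", "run", job_name]
    if profile:
        cmd += ["--profile", profile]
    if start:
        cmd += ["--start", start]  # ISO 8601, as in the listing above
    if end:
        cmd += ["--end", end]
    # Assumption: one --config KEY=VALUE pair per override
    for key, value in (config or {}).items():
        cmd += ["--config", f"{key}={value}"]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it; pass the list to
    # subprocess.run(...) only after checking it.
    print(" ".join(build_local_run_command("my_pipeline", profile="prod", dry_run=True)))
```

Returning a list rather than a single string avoids shell-quoting problems if the command is later passed to subprocess.run.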

Step 2c. Read logs and debug

dlthub job logs my_pipeline            # check output (use job name)
dlthub job logs my_pipeline -f         # stream logs in real-time

After launching:

Check the first run completes successfully with dlthub job logs
If it fails, use (debug-deployment) to diagnose
Once successful, run dlthub show to open the dltHub web UI and show the user their pipeline is live

Step 3: Schedule a pipeline (cron)

Scheduling requires a __deployment__.py manifest. Go back to (prepare-deployment) and execute Step 5 if not yet done.

Add a trigger to the @run.pipeline decorator:

import dlt
from dlt.hub import run
from dlt.hub.run import trigger

@run.pipeline("my_pipeline", trigger=trigger.schedule("0 0 * * *"))  # daily at midnight UTC
def run_my_pipeline():
    pipeline = dlt.pipeline(
        pipeline_name="my_pipeline",
        destination="warehouse",
        dataset_name="my_dataset",
    )
    pipeline.run(my_source())

A bare cron string also works: trigger="0 0 * * *".
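For readers less familiar with cron syntax, the expression "0 0 * * *" follows the standard five-field layout (minute, hour, day of month, month, day of week). The sketch below simply names the fields; describe_cron is a hypothetical helper, and it assumes the platform accepts the standard five-field form without confirming which extensions, if any, it supports.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


if __name__ == "__main__":
    # "0 0 * * *": minute 0 of hour 0, every day of every month
    for name, value in describe_cron("0 0 * * *").items():
        print(f"{name}: {value}")
```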

Then deploy:

dlthub deploy                    # sync manifest to Runtime
dlthub deploy --dry-run          # preview without applying
dlthub job list                  # confirm triggers are set

Other trigger types (from dlt.hub.run.trigger):

trigger.every("6h") -- every 6 hours
trigger.once("2026-12-31T23:59:59Z") -- one-shot at a timestamp
upstream_job.success -- chain after another job succeeds (followup trigger)

Notes:

Triggers declared in code are the source of truth -- there is no CLI command for adding/removing schedules.
dlthub deploy reconciles all jobs -- new ones are added, removed ones are archived, unchanged ones are left alone.
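The reconciliation rule in the notes can be pictured as a comparison between the jobs declared in code and the jobs currently deployed. This is an illustrative sketch only, not dltHub's implementation: plan_reconciliation is a hypothetical function, and a job is represented here by nothing more than its name and trigger string.

```python
def plan_reconciliation(declared: dict, deployed: dict) -> dict:
    """Classify jobs by comparing declared state with deployed state.

    Both arguments map job name -> trigger string (a simplification made
    for this sketch).
    """
    plan = {"created": [], "updated": [], "archived": [], "unchanged": []}
    for name, spec in declared.items():
        if name not in deployed:
            plan["created"].append(name)    # new in code
        elif deployed[name] != spec:
            plan["updated"].append(name)    # declared differently than deployed
        else:
            plan["unchanged"].append(name)  # left alone
    for name in deployed:
        if name not in declared:
            plan["archived"].append(name)   # removed from code
    return {action: sorted(names) for action, names in plan.items()}


if __name__ == "__main__":
    declared = {"ingest": "0 0 * * *", "transform": "ingest.success"}
    deployed = {"ingest": "0 6 * * *", "legacy_report": "0 0 * * 0"}
    print(plan_reconciliation(declared, deployed))
```

This is why the code is the source of truth: deleting a decorated function and redeploying is what archives its job.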

See scheduling-triggers.md for the full trigger types table and more examples.

Step 4: Advanced trigger and scheduling options

See advanced-patterns.md for full examples of each pattern:

Followup jobs -- chain pipelines with trigger=ingest_job.success. The transform runs automatically after ingest succeeds. Use when you have non-incremental pipelines that should run in sequence.
Scheduler-driven intervals -- for incremental pipelines, declare interval={"start": "2026-01-01T00:00:00Z"} and read run_context["interval_start"] / interval_end from the scheduler. Runtime handles continuity and refresh resets.
Freshness gates -- freshness=[upstream.is_fresh] prevents a job from running until upstream's interval is complete. Use for transforms that shouldn't observe half-loaded data.
Refresh cascade -- a backfill job with refresh="always" cascades a full-refresh signal to all downstream jobs without loading data itself.
Non-pipeline jobs -- @run.job for general batch work (DQ checks, reports), @run.interactive for MCP servers, dashboards, REST APIs.
Dependency groups -- require={"dependency_groups": ["ibis"]} installs extra packages only for jobs that need them. Declare groups in [dependency-groups] in pyproject.toml.
Timeouts -- execute={"timeout": "6h"} overrides the default 120-minute limit. Use for backfill or long-running jobs.
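To make the scheduler-driven interval pattern more concrete, here is a minimal sketch of how a job might turn the scheduler's window into datetimes before querying a source. It assumes run_context behaves like a dict whose interval_start and interval_end values are ISO 8601 strings; the source does not state the value types, and read_interval is a hypothetical helper. See advanced-patterns.md for the actual usage.

```python
from datetime import datetime


def _parse_iso(value: str) -> datetime:
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def read_interval(run_context: dict) -> tuple:
    """Return (start, end) datetimes for the current scheduler window.

    Assumption: both values arrive as ISO 8601 strings.
    """
    start = _parse_iso(run_context["interval_start"])
    end = _parse_iso(run_context["interval_end"])
    if end <= start:
        raise ValueError(f"empty or inverted interval: {start} .. {end}")
    return start, end


if __name__ == "__main__":
    context = {
        "interval_start": "2026-01-01T00:00:00Z",
        "interval_end": "2026-01-02T00:00:00Z",
    }
    start, end = read_interval(context)
    print(f"loading rows with {start.isoformat()} <= updated_at < {end.isoformat()}")
```

Treating the window as half-open (start inclusive, end exclusive) is a choice made in this sketch so that consecutive intervals do not load the same row twice.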

Step 5: Public links for interactive jobs

Share an interactive job (notebook or dashboard) publicly:

dlthub job publish <job_name>    # generate a public URL
dlthub job unpublish <job_name>  # revoke public access

Note: the argument is a job name (e.g. my_notebook), not a file path. Drop any .py extension — passing my_notebook.py will fail because the CLI looks for a job literally named my_notebook.py.
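If the job name is being derived from a file path in a script, the extension can be dropped with pathlib. The sketch below assumes the job is named after the file stem, as in the my_notebook example above; a job given a different name in its decorator would not follow this rule, and job_name_from_path is a hypothetical helper.

```python
from pathlib import Path


def job_name_from_path(path_or_name: str) -> str:
    """Strip directories and a trailing .py so the result can be used as a job name.

    Assumption: the job is named after the script's file stem.
    """
    path = Path(path_or_name)
    return path.stem if path.suffix == ".py" else path.name


if __name__ == "__main__":
    name = job_name_from_path("notebooks/my_notebook.py")
    print(f"dlthub job publish {name}")
```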

Important

Scripts must have if __name__ == "__main__": or the job does nothing.
Runtime installs from pyproject.toml — add all needed packages (e.g. uv add numpy pandas if using .df()).
Jobs are killed after 120 minutes. Override the timeout in the decorator (e.g. execute={"timeout": "6h"}) for long-running jobs such as backfills.
One workspace per GitHub account — connecting a new repo replaces existing deployments.
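Since the runtime installs from pyproject.toml, a quick pre-deploy check that dlt is pinned with == can catch an accidental >= before it reaches production. The following is a rough sketch under stated assumptions: dlt_is_pinned_exactly is a hypothetical helper that takes the dependency strings as a plain list (reading them out of pyproject.toml is left out), and its pattern covers common requirement forms rather than the full PEP 508 grammar.

```python
import re

# Matches a requirement for the package named exactly "dlt" (optionally with
# extras) and captures the version specifier up to any ";" environment marker.
_DLT_REQUIREMENT = re.compile(
    r"^\s*dlt(?![\w.-])(?:\[[^\]]*\])?\s*(?P<spec>[^;]*)", re.IGNORECASE
)
# A single == clause with no wildcard and no additional comma-separated clauses.
_EXACT_PIN = re.compile(r"^==\s*[0-9A-Za-z.!+_-]+$")


def dlt_is_pinned_exactly(dependencies: list) -> bool:
    """Return True only if a dlt requirement exists and uses an exact == pin."""
    for dependency in dependencies:
        match = _DLT_REQUIREMENT.match(dependency)
        if match is None:
            continue  # some other package, e.g. "dlthub" or "pandas"
        return bool(_EXACT_PIN.match(match.group("spec").strip()))
    return False  # dlt is not listed at all


if __name__ == "__main__":
    print(dlt_is_pinned_exactly(["dlt==1.23.0a3", "pandas>=2"]))
    print(dlt_is_pinned_exactly(["dlt[duckdb]>=1.0"]))
```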
