| name | alibabacloud-ecs-patch-management |
| description | Alibaba Cloud ECS Patch Management Skill. Use for scanning and installing OS patches on ECS instances via OOS (Operation Orchestration Service).
Triggers: "patch management", "scan patches", "install patches", "OS update", "security patches", "ACS-ECS-BulkyApplyPatchBaseline", "oos patch", "系统补丁扫描", "系统补丁安装".
|
ECS Patch Management
Scan and install operating system patches on Alibaba Cloud ECS instances using the OOS ACS-ECS-BulkyApplyPatchBaseline template.
Architecture
ECS + OOS + Cloud Assistant + Snapshot (optional)
- OOS Template:
ACS-ECS-BulkyApplyPatchBaseline — built-in public template for batch patch operations
- Cloud Assistant: Executes patch commands on ECS instances (requires Cloud Assistant client installed)
- Snapshot Service: Creates pre-patch disk snapshots for rollback safety (optional)
- Patch Baseline: Defines which patches to scan/install (uses default system baseline or custom)
Pre-check: Aliyun CLI >= 3.3.3 Required
Run aliyun version to verify >= 3.3.3. If not installed or version too low,
run curl -fsSL https://aliyuncli.alicdn.com/setup.sh | bash to install/update,
or see references/cli-installation-guide.md for installation instructions.
Pre-check: Aliyun CLI plugin update required
[MUST] run aliyun configure set --auto-plugin-install true to enable automatic plugin installation.
[MUST] run aliyun plugin update to ensure that any existing plugins are always up-to-date.
Pre-check: Alibaba Cloud Credentials Required
Security Rules:
- NEVER read, echo, or print AK/SK values
- NEVER ask the user to input AK/SK directly in the conversation or command line
- NEVER use
aliyun configure set with literal credential values
- ONLY use
aliyun configure list to check credential status
aliyun configure list
Check the output for a valid profile (AK, STS, or OAuth identity).
If no valid profile exists, STOP here.
- Obtain credentials from Alibaba Cloud Console
- Configure credentials outside of this session (via
aliyun configure in terminal or environment variables in shell profile)
- Return and re-run after
aliyun configure list shows a valid profile
AI-Mode Lifecycle Management
At the start of the Core Workflow (before any CLI invocation):
[MUST] Enable AI-Mode — AI-mode is required for Agent Skill execution.
Run the following commands before any CLI invocation:
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-ecs-patch-management"
[MUST] Disable AI-Mode at EVERY exit point — Before delivering the final response for ANY reason, always disable AI-mode first. This applies to ALL exit paths: workflow success, workflow failure, error/exception, user cancellation, session end, or any other scenario where no further CLI commands will be executed.
AI-mode is only used for Agent Skill invocation scenarios and MUST NOT remain enabled after the skill stops running.
aliyun configure ai-mode disable
RAM Policy
Required permissions for this skill:
| Service | Actions | Purpose |
|---|
| OOS | StartExecution, ListExecutions, CancelExecution, ListTemplates, GetTemplate | Manage patch executions |
| OOS | ListPatchBaselines, GetPatchBaseline, ListInstancePatches, ListInstancePatchStates | Patch baseline and status queries |
| ECS | DescribeInstances, DescribeInvocations, DescribeInvocationResults, InvokeCommand, | Instance verification and Cloud Assistant |
| ECS | CreateSnapshot, DescribeSnapshots | Snapshot management (optional) |
Full details: references/ram-policies.md
[MUST] Permission Failure Handling: When any command or API call fails due to permission errors at any point during execution, follow this process:
- Read
references/ram-policies.md to get the full list of permissions required by this SKILL
- Use
ram-permission-diagnose skill to guide the user through requesting the necessary permissions
- Pause and wait until the user confirms that the required permissions have been granted
Parameters Requiring User Confirmation
IMPORTANT: Parameter Confirmation — Before executing any command or API call,
ALL user-customizable parameters (e.g., RegionId, instance IDs, action type,
snapshot settings, etc.) MUST be confirmed with the user. Do NOT assume or use
default values without explicit user approval.
| Parameter | Required | Description | Default |
|---|
RegionId | Yes | Alibaba Cloud region ID (e.g., cn-hangzhou, cn-shanghai) | None |
InstanceIds | Yes | Target ECS instance IDs (e.g., ["i-bp1example0000000001"]) | None |
Action | Yes | Operation type: scan (scan only) or install (scan + install) | None |
rebootIfNeed | No (install only) | Whether to reboot the instance if patches require it | false |
whetherCreateSnapshot | No (install only) | Whether to create a snapshot before installing patches | false |
retentionDays | No (install only) | Snapshot retention period in days (1-65536) | 7 |
patchBaselineName | No | Custom patch baseline name (uses default system baseline if not specified) | System default |
rebootMethod | No (install only) | Reboot method: later (reboot later), immediately (reboot immediately), never (never reboot) | later |
loopMode | No | Batch mode: Automatic, FirstBatchPause, EveryBatchPause | Automatic |
safetyCheck | No | Security check: Skip, ConfirmEveryHighRiskAction | ConfirmEveryHighRiskAction |
Core Workflow
Step 0: Enable AI-Mode
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-ecs-patch-management"
Step 1: Verify CLI and Credentials
aliyun version
aliyun configure list
Step 2: Verify Target Instances
Confirm the target ECS instances exist and are in Running status:
aliyun ecs describe-instances \
--region <RegionId> \
--instance-ids '<InstanceIds_JSON>' \
--cli-query 'Instances.Instance[].{InstanceId:InstanceId, Status:Status, OSName:OSName}'
Prerequisite: Target ECS instances must have the Cloud Assistant client installed and running. Most Alibaba Cloud public images include it by default.
Step 3: Start Patch Execution
[MUST] StartExecution is asynchronous. The response only confirms that the execution has been submitted, not that it has finished. The response returns an ExecutionId in the format exec-xxx (e.g., exec-example0000000001). You MUST capture this ExecutionId and poll ListExecutions (Step 4) until Status reaches a terminal value (Success or Failed) before considering the operation complete. Do NOT treat a successful StartExecution response as proof that the patch operation succeeded.
Option A: Scan Only
aliyun oos start-execution \
--region <RegionId> \
--biz-region-id <RegionId> \
--template-name ACS-ECS-BulkyApplyPatchBaseline \
--parameters '{"regionId":"<RegionId>","action":"scan","targets":{"ResourceIds":<InstanceIds_JSON>,"RegionId":"<RegionId>","Type":"ResourceIds"}}'
Option B: Install Patches
aliyun oos start-execution \
--region <RegionId> \
--biz-region-id <RegionId> \
--template-name ACS-ECS-BulkyApplyPatchBaseline \
--parameters '{"regionId":"<RegionId>","action":"install","rebootIfNeed":<true/false>,"whetherCreateSnapshot":<true/false>,"retentionDays":<number>,"targets":{"ResourceIds":<InstanceIds_JSON>,"RegionId":"<RegionId>","Type":"ResourceIds"}}'
Example — Scan for instance i-bp1example0000000001 in cn-hangzhou:
aliyun oos start-execution \
--region cn-hangzhou \
--biz-region-id cn-hangzhou \
--template-name ACS-ECS-BulkyApplyPatchBaseline \
--parameters '{"regionId":"cn-hangzhou","action":"scan","targets":{"ResourceIds":["i-bp1example0000000001"],"RegionId":"cn-hangzhou","Type":"ResourceIds"}}'
Example — Install patches with snapshot and auto-reboot:
aliyun oos start-execution \
--region cn-hangzhou \
--biz-region-id cn-hangzhou \
--template-name ACS-ECS-BulkyApplyPatchBaseline \
--parameters '{"regionId":"cn-hangzhou","action":"install","rebootIfNeed":true,"whetherCreateSnapshot":true,"retentionDays":7,"targets":{"ResourceIds":["i-bp1example0000000001"],"RegionId":"cn-hangzhou","Type":"ResourceIds"}}'
Step 4: Monitor Execution Status
Extract the ExecutionId (format: exec-xxx) from Step 3's response, then poll ListExecutions until the execution reaches a terminal status:
aliyun oos list-executions \
--region <RegionId> \
--biz-region-id <RegionId> \
--execution-id <ExecutionId> \
--cli-query 'Executions.Execution[0].{ExecutionId:ExecutionId, Status:Status, StartDate:StartDate, EndDate:EndDate}'
Terminal statuses (stop polling when Status matches one of these):
| Status | Meaning | Next action |
|---|
Success | Execution finished successfully | Proceed to Step 5 (logs) and Step 6 (verify patches) |
Failed | Execution failed | Inspect logs in Step 5 to diagnose the failure |
Cancelled | Execution was cancelled | No further action; re-run if needed |
Non-terminal statuses (keep polling): Started, Running, Queued, Waiting. Continue polling at a reasonable interval (e.g., every 10–30 seconds) until a terminal status is reached. Only consider the patch operation complete once Status is Success or Failed.
Step 5: View Execution Logs
aliyun oos list-execution-logs \
--region <RegionId> \
--execution-id <ExecutionId>
Step 6: Verify Patch Results
After a scan or install execution reaches Success, two complementary APIs report patch state on each instance:
6a. ListInstancePatches — per-patch detail on a single instance
Returns the full list of individual patches detected on the instance, with metadata such as patch name/KB, classification, severity, and per-patch status (e.g., Installed, Missing, NotApplicable). Use this to inspect which patches are present, missing, or failed.
aliyun oos list-instance-patches \
--region <RegionId> \
--biz-region-id <RegionId> \
--instance-id <InstanceId>
6b. ListInstancePatchStates — per-state count summary across instances
Returns aggregate counts of patches grouped by state for each instance (e.g., InstalledCount, MissingCount, FailedCount, NotApplicableCount). Use this to quickly assess how many patches fall into each category without enumerating individual patches.
aliyun oos list-instance-patch-states \
--region <RegionId> \
--biz-region-id <RegionId> \
--instance-ids '<InstanceIds_JSON>'
When to use which:
- Need a high-level compliance summary (e.g., "instance i-xxx still has 3 missing patches")? →
ListInstancePatchStates
- Need to identify specific patches (e.g., "which CVEs are still missing")? →
ListInstancePatches
Step 7: Disable AI-Mode
aliyun configure ai-mode disable
[MUST] Always disable AI-mode at every exit point — success, failure, error, or cancellation.
Success Verification
See references/verification-method.md for detailed verification steps.
Quick verification checklist:
| Step | Check | Command |
|---|
| 1 | CLI version >= 3.3.3 | aliyun version |
| 2 | Credentials valid | aliyun configure list |
| 3 | Instance running | aliyun ecs describe-instances |
| 4 | Execution started | Response contains ExecutionId |
| 5 | Execution succeeded | Status = Success via list-executions |
| 6 | Patches applied | list-instance-patches shows reduced Missing count |
Cleanup
Cancel a Running Execution
aliyun oos cancel-execution \
--region <RegionId> \
--execution-id <ExecutionId>
Delete Old Snapshots (if created)
aliyun ecs describe-snapshots \
--region <RegionId> \
--instance-id <InstanceId>
aliyun ecs delete-snapshot \
--region <RegionId> \
--snapshot-id <SnapshotId>
Best Practices
- Always scan before install — Run
action: scan first to understand what patches are available before committing to installation.
- Enable snapshots for production — Set
whetherCreateSnapshot: true with an appropriate retentionDays for production instances to enable rollback.
- Use
FirstBatchPause for large fleets — When patching many instances, set loopMode: FirstBatchPause to validate the first batch before proceeding.
- Schedule during maintenance windows — Patch installation may require reboots. Coordinate with business stakeholders.
- Test on non-production first — Always validate patches on a staging/dev instance before applying to production.
- Monitor execution logs — Use
list-execution-logs to track real-time progress and troubleshoot failures.
- Handle reboot carefully — Set
rebootIfNeed: true only when you can tolerate instance downtime. For critical services, use rebootMethod: later and reboot manually.
- Keep retentionDays reasonable — Set snapshot retention between 7-30 days. Longer retention increases storage costs.
Reference Links
Related Documentation