aws-sagemaker
Amazon SageMaker for building, training, and deploying machine learning models. Use for SageMaker AI endpoints, model training, inference, MLOps, and AWS machine learning services.
| name | aws-sagemaker |
| description | Amazon SageMaker for building, training, and deploying machine learning models. Use for SageMaker AI endpoints, model training, inference, MLOps, and AWS machine learning services. |
Comprehensive assistance with Amazon SageMaker development, covering the complete ML lifecycle from data preparation to model deployment and monitoring.
This skill should be triggered when:
Model Training & Development
Model Deployment & Inference
Data Preparation
Model Management & MLOps
SageMaker Studio & Environments
Edge Deployment
SageMaker Domain: A centralized environment for ML workflows, providing authentication, authorization, and resource management for teams.
Model Registry: Versioned catalog of ML models with metadata, approval workflows, and deployment tracking.
Endpoints: Deployed models that provide real-time or serverless inference capabilities.
Model Monitor: Automated monitoring for data quality, model quality, bias drift, and feature attribution drift in production.
Training Jobs: Managed infrastructure for training ML models at scale with automatic resource provisioning.
Model Packages: Versioned entities in Model Registry containing model artifacts, inference specifications, and metadata.
Monitor your model's performance by checking execution history:
import time

# List the latest monitoring executions
mon_executions = my_default_monitor.list_executions()

# Poll once a minute until the first execution appears
while len(mon_executions) == 0:
    print("Waiting for the 1st execution to happen...")
    time.sleep(60)
    mon_executions = my_default_monitor.list_executions()
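The polling loop above waits indefinitely if no execution is ever scheduled. A minimal sketch of a bounded wait follows. The helper name `wait_for_first_execution` and its parameters are hypothetical, and `list_executions` is assumed to be any zero-argument callable such as `my_default_monitor.list_executions`.

```python
import time
from typing import Callable, List, Optional


def wait_for_first_execution(
    list_executions: Callable[[], List],
    poll_seconds: float = 60.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[List]:
    """Poll until at least one execution exists or the timeout elapses.

    Returns the list of executions, or None if the timeout was reached.
    """
    waited = 0.0
    executions = list_executions()
    while len(executions) == 0:
        if waited >= timeout_seconds:
            return None
        print("Waiting for the 1st execution to happen...")
        sleep(poll_seconds)
        waited += poll_seconds
        executions = list_executions()
    return executions
```

With a monitor object in scope, a call might look like `wait_for_first_execution(my_default_monitor.list_executions, timeout_seconds=7200)`. The `sleep` parameter exists only so the loop can be exercised without real delays.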
Set up AWS CLI for SageMaker operations:
# Configure AWS credentials
aws configure
# This will prompt for:
# - AWS Access Key ID
# - AWS Secret Access Key
# - Default region name
# - Default output format
If you are behind a firewall, add these Data Wrangler URLs to its allowlist:
https://ui.prod-1.data-wrangler.sagemaker.aws/
https://ui.prod-2.data-wrangler.sagemaker.aws/
https://ui.prod-3.data-wrangler.sagemaker.aws/
https://ui.prod-4.data-wrangler.sagemaker.aws/
Deploy models with fine-grained control using the SageMaker Python SDK:
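import sagemaker
from sagemaker.pytorch import PyTorchModel

# IAM role that SageMaker assumes (resolved automatically inside SageMaker environments)
execution_role = sagemaker.get_execution_role()

# Describe the model artifact and serving framework.
# PyTorchModel is used here because its constructor accepts these arguments;
# model.tar.gz is assumed to bundle its own inference code.
model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",
    role=execution_role,
    framework_version="1.12",
    py_version="py38",
)

# Deploy to a real-time endpoint
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)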
Model packages in Registry follow this ARN structure:
arn:aws:sagemaker:region:account:model-package/group-name/version
Example:
arn:aws:sagemaker:us-east-1:123456789012:model-package/my-model-group/1
Grant permissions for Partner AI Apps:
# Attach the AWS managed policy for AWS Marketplace subscriptions
policy_arn = "arn:aws:iam::aws:policy/AWSMarketplaceManageSubscriptions"
# This policy allows administrators to:
# - Subscribe to Partner AI Apps
# - Manage marketplace subscriptions
# - Purchase apps from AWS Marketplace
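As a rough illustration of attaching this policy programmatically, the sketch below builds the managed-policy ARN and calls the IAM `attach_role_policy` operation. The function names and the role name are hypothetical; `iam_client` is assumed to be `boto3.client("iam")`, and the role is assumed to exist already.

```python
def managed_policy_arn(policy_name: str, partition: str = "aws") -> str:
    """Build the ARN of an AWS managed policy from its name."""
    return f"arn:{partition}:iam::aws:policy/{policy_name}"


def attach_marketplace_policy(iam_client, role_name: str) -> str:
    """Attach AWSMarketplaceManageSubscriptions to an existing IAM role.

    iam_client is expected to behave like boto3.client("iam");
    role_name is a hypothetical administrator role in the account.
    """
    arn = managed_policy_arn("AWSMarketplaceManageSubscriptions")
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    return arn
```

A call could look like `attach_marketplace_policy(boto3.client("iam"), "MyAdminRole")`, where `MyAdminRole` is a placeholder.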
Key CloudWatch metric and error type for serverless endpoints:
# Monitor for cold starts (CloudWatch metric)
metric_name = "OverheadLatency"
# Handle validation errors (error type returned by the API, not a metric)
error_type = "ValidationError"
# Together these help you understand:
# - Cold start frequency and duration
# - Request validation failures
# - Overall endpoint performance
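To make the cold-start check concrete, here is a hedged sketch that assembles the arguments for CloudWatch's `get_metric_statistics` call. The helper name is hypothetical, and the default variant name `AllTraffic` and the 5-minute period are assumptions to adjust for your endpoint.

```python
from datetime import datetime, timedelta, timezone


def overhead_latency_query(endpoint_name, variant_name="AllTraffic", hours=1, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "OverheadLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # 5-minute buckets
        "Statistics": ["Average", "Maximum"],
    }
```

The result can be passed as `boto3.client("cloudwatch").get_metric_statistics(**overhead_latency_query("my-endpoint"))`, where `my-endpoint` is a placeholder. Spikes in the `Maximum` statistic are one plausible signal of cold starts.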
Work with model metadata using resource groups:
# Resource groups help organize and manage models
resource_group_tag = "sagemaker"
# Model artifacts should include this tag for:
# - Easier discovery in Model Registry
# - Integration with IAM policies
# - Automated resource management
Configure processing jobs with custom environment:
from sagemaker.processing import ProcessingInput, ProcessingOutput

processing_job_config = {
    "Environment": {
        "MY_VARIABLE": "value",
        "DATA_PATH": "/opt/ml/processing/input",
    },
    "ProcessingInputs": [
        ProcessingInput(
            source="s3://my-bucket/data/",
            destination="/opt/ml/processing/input",
        )
    ],
}
# Environment variables follow pattern: [a-zA-Z_][a-zA-Z0-9_]*
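Checking keys against this pattern before submitting a job can catch naming mistakes early. The sketch below is one way to do that; `invalid_env_keys` is a hypothetical helper and the third key is a deliberately invalid illustration.

```python
import re

# Key pattern quoted in the text above
ENV_KEY_PATTERN = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def invalid_env_keys(environment: dict) -> list:
    """Return the keys that do not match the documented key pattern."""
    return [key for key in environment if not ENV_KEY_PATTERN.fullmatch(key)]


env = {
    "MY_VARIABLE": "value",
    "DATA_PATH": "/opt/ml/processing/input",
    "2ND-VAR": "x",  # invalid: starts with a digit and contains a hyphen
}
print(invalid_env_keys(env))  # ['2ND-VAR']
```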
Check model quality violations:
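# Locate the reports written by the latest monitoring execution
latest_execution = monitor.list_executions()[-1]
report_uri = latest_execution.output.destination
# Load the violations report for issues
violations = monitor.latest_monitoring_constraint_violations()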
# Violations are generated when:
# - Data quality degrades below threshold
# - Model predictions drift from baseline
# - Bias metrics exceed acceptable limits
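Once a violations report is loaded as a dictionary, grouping entries by check type gives a quick overview. This is a minimal sketch that assumes the report follows the `constraint_violations.json` layout, with a top-level `violations` list whose entries carry `feature_name`, `constraint_check_type` and `description`. The helper name and the sample data are illustrative only.

```python
from collections import Counter


def summarize_violations(report: dict) -> Counter:
    """Count violations per constraint check type."""
    return Counter(v["constraint_check_type"] for v in report.get("violations", []))


# Illustrative data only, not output from a real monitoring job.
sample_report = {
    "violations": [
        {
            "feature_name": "age",
            "constraint_check_type": "data_type_check",
            "description": "example description",
        },
        {
            "feature_name": "income",
            "constraint_check_type": "baseline_drift_check",
            "description": "example description",
        },
        {
            "feature_name": "zip",
            "constraint_check_type": "data_type_check",
            "description": "example description",
        },
    ]
}

summary = summarize_violations(sample_report)
print(summary["data_type_check"])  # 2
```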
This skill includes comprehensive documentation organized by topic:
Start Here:
getting_started.md for prerequisites and domain setup
studio.md for JumpStart pre-trained models
endpoints.md for low-code ML
First Tasks:
Focus Areas:
training.md - Create custom training jobs
inference.md - Deploy models with Inference Recommender
models.md - Set up Model Registry for version control
Common Workflows:
Advanced Topics:
Best Practices:
When working with SageMaker domains, IDs follow this pattern:
d-(-*[a-z0-9]){1,61}
Example: d-abc123def456
User profiles use this naming convention:
[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}
Example: data-scientist-john-doe
Full ARN structure for user profiles:
arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:user-profile/.*
SageMaker execution roles follow this format:
arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+
Processing job environment variables must match:
Key: [a-zA-Z_][a-zA-Z0-9_]*
Value: [\S\s]*
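The patterns above can be applied directly with Python's `re` module. The sketch below covers the domain ID, user profile name and execution role patterns as quoted in the text. The helper `is_valid` and the role ARN in the usage lines are hypothetical.

```python
import re

# Patterns copied from the text above
DOMAIN_ID = re.compile(r"d-(-*[a-z0-9]){1,61}")
USER_PROFILE_NAME = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
ROLE_ARN = re.compile(r"arn:aws[a-z\-]*:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+")


def is_valid(pattern: re.Pattern, value: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


print(is_valid(DOMAIN_ID, "d-abc123def456"))                   # True
print(is_valid(USER_PROFILE_NAME, "data-scientist-john-doe"))  # True
print(is_valid(USER_PROFILE_NAME, "-starts-with-hyphen"))      # False
print(is_valid(ROLE_ARN, "arn:aws:iam::123456789012:role/MyRole"))  # True
```

`fullmatch` is used rather than `match` so that trailing characters outside the pattern cause a rejection.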
1. Build model in Canvas (low-code)
2. Deploy model to endpoint with one click
3. Model hosted on SageMaker infrastructure
4. Invoke endpoint for real-time predictions
5. Integrate with applications via API
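Steps 4 and 5 of the workflow above usually come down to one `invoke_endpoint` call on the SageMaker runtime client. The sketch below wraps that call. The function name is hypothetical, `runtime_client` is assumed to be `boto3.client("sagemaker-runtime")`, and the `text/csv` default is an assumption that depends on what the deployed model accepts.

```python
def invoke(runtime_client, endpoint_name: str, body: str, content_type: str = "text/csv") -> str:
    """Send one request to a real-time endpoint and return the decoded response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        Body=body,
    )
    # The response body is a stream; read it fully and decode as UTF-8 text.
    return response["Body"].read().decode("utf-8")
```

An application could call `invoke(boto3.client("sagemaker-runtime"), "my-endpoint", "5.1,3.5,1.4,0.2")`, where the endpoint name and the CSV row are placeholders.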
1. Train model (SageMaker training job or external)
2. Register model version in Model Registry
3. Evaluate model performance
4. Update approval status (Approved/Rejected)
5. Deploy approved models to production
6. Track deployment history and lineage
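Step 4 of the workflow above maps to the SageMaker `update_model_package` operation. A minimal sketch follows. The function name is hypothetical, `sm_client` is assumed to be `boto3.client("sagemaker")`, and the ARN is supplied by the caller.

```python
VALID_STATUSES = {"Approved", "Rejected", "PendingManualApproval"}


def set_approval_status(sm_client, model_package_arn: str, status: str, description: str = "") -> dict:
    """Update the approval status of a registered model version."""
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}, got {status!r}")
    request = {"ModelPackageArn": model_package_arn, "ModelApprovalStatus": status}
    if description:
        # Optional free-text note recorded alongside the status change
        request["ApprovalDescription"] = description
    sm_client.update_model_package(**request)
    return request
```

Rejecting unknown statuses locally avoids a round trip to the service for an obviously invalid request.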
1. Deploy model to endpoint
2. Configure Model Monitor baseline
3. Schedule monitoring jobs (hourly/daily)
4. Monitor metrics in Model Dashboard
5. Set CloudWatch alarms for violations
6. Receive alerts when quality degrades
7. Investigate and retrain model if needed
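For the last two steps of the workflow above, recent monitoring executions can be scanned for problem statuses. The sketch below uses the SageMaker `list_monitoring_executions` operation. The function name is hypothetical, `sm_client` is assumed to be `boto3.client("sagemaker")`, and the schedule name is supplied by the caller.

```python
def executions_needing_attention(sm_client, schedule_name: str, max_results: int = 10) -> list:
    """Return (status, failure_reason) pairs for recent problem executions."""
    response = sm_client.list_monitoring_executions(
        MonitoringScheduleName=schedule_name,
        SortBy="ScheduledTime",
        SortOrder="Descending",
        MaxResults=max_results,
    )
    flagged = []
    for summary in response["MonitoringExecutionSummaries"]:
        status = summary["MonitoringExecutionStatus"]
        if status in ("CompletedWithViolations", "Failed"):
            # FailureReason is only present on some executions
            flagged.append((status, summary.get("FailureReason", "")))
    return flagged
```

An empty result suggests the most recent executions completed cleanly. Any `CompletedWithViolations` entry is a cue to inspect the violations report before deciding whether to retrain.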
OverheadLatency CloudWatch metric
CompletedWithViolations status
FailureReason and ExitMessage in logs
AmazonSageMakerFullAccess or create custom policy
AmazonSageMakerModelRegistryFullAccess
To refresh this skill with updated documentation: