ec2-cluster-provision
| name | ec2-cluster-provision |
| description | uses the code present in this repository to provision an EC2 cluster for benchmarking purposes |
This project runs its benchmarks on a remote EC2 cluster defined with AWS CDK in benchmarks/cdk.
The benchmarks/cdk/package.json file defines the relevant deployment commands.
Authentication for this skill should be handled in this order of preference:
1. AWS SSO profile commands (preferred)
2. Command prefix wrapper taken from the ./claude/settings.local.json key aws-commands-prefix (for example aws-vault exec <profile> --)
3. Explicit environment credentials (AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_SESSION_TOKEN)
For method 2, define an awscmd shell function in your session that applies the chosen prefix.
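The awscmd wrapper for method 2 could be built from the settings key roughly as follows. This is a minimal sketch, not the project's own tooling: it assumes jq is installed and that aws-commands-prefix is a top-level string key in the settings file (the file layout is an assumption). The SETTINGS_FILE and AWS_COMMANDS_PREFIX variable names are hypothetical.

```shell
# Assumption: the prefix is stored as a top-level string key in this file.
SETTINGS_FILE="${SETTINGS_FILE:-./claude/settings.local.json}"

# Read the prefix; stays empty if jq, the file, or the key is missing.
AWS_COMMANDS_PREFIX=$(jq -r '."aws-commands-prefix" // empty' "$SETTINGS_FILE" 2>/dev/null || true)

# Run a command through the configured prefix, or directly when no prefix is set.
awscmd() {
  if [ -n "$AWS_COMMANDS_PREFIX" ]; then
    # Unquoted on purpose: the prefix is a command plus its arguments.
    $AWS_COMMANDS_PREFIX "$@"
  else
    "$@"
  fi
}
```

With a prefix such as aws-vault exec <profile> --, calling awscmd aws sts get-caller-identity expands to the full aws-vault invocation. The unquoted expansion relies on POSIX word splitting, so a prefix containing quoted arguments with spaces would need a different approach.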
Setup references:
aws-vault project: https://github.com/99designs/aws-vault
Before running AWS/CDK commands, ensure auth is valid for the chosen method:
# Method 1: AWS SSO profile commands (preferred)
# How to get these values:
# - aws configure sso
# - aws configure list-profiles
# - aws configure get region --profile <profile>
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN AWS_SECURITY_TOKEN AWS_CREDENTIAL_EXPIRATION
export AWS_PROFILE=<profile>
export AWS_REGION=${AWS_REGION:-us-east-1}
export AWS_DEFAULT_REGION="$AWS_REGION"
export AWS_SDK_LOAD_CONFIG=1
aws sso login --profile "$AWS_PROFILE"
aws sts get-caller-identity --profile "$AWS_PROFILE" --region "$AWS_REGION"
# Method 2: Command prefix wrapper (example: aws-vault)
# How to get these values:
# - same <profile> discovery as method 1
# - aws-vault list
# - aws-vault exec <profile> -- aws sts get-caller-identity
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN AWS_SECURITY_TOKEN AWS_CREDENTIAL_EXPIRATION
export AWS_REGION=${AWS_REGION:-us-east-1}
export AWS_DEFAULT_REGION="$AWS_REGION"
awscmd() { aws-vault exec <profile> -- "$@"; }
awscmd aws sts get-caller-identity --region "$AWS_REGION"
# Method 3: Explicit environment credentials
# How to get these values:
# - https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html
# - include AWS_SESSION_TOKEN when using temporary credentials
unset AWS_PROFILE
export AWS_ACCESS_KEY_ID=<access-key-id>
export AWS_SECRET_ACCESS_KEY=<secret-access-key>
# export AWS_SESSION_TOKEN=<session-token> # when credentials are temporary
export AWS_REGION=${AWS_REGION:-us-east-1}
export AWS_DEFAULT_REGION="$AWS_REGION"
aws sts get-caller-identity --region "$AWS_REGION"
Bootstrap once per account/region:
# Method 1: AWS SSO profile commands
ACCOUNT_ID=$(aws sts get-caller-identity --profile "$AWS_PROFILE" --query Account --output text)
npm run bootstrap -- aws://$ACCOUNT_ID/$AWS_REGION
# Method 2: Command prefix wrapper (example: aws-vault)
ACCOUNT_ID=$(awscmd aws sts get-caller-identity --region "$AWS_REGION" --query Account --output text)
awscmd npm run bootstrap -- aws://$ACCOUNT_ID/$AWS_REGION
# Method 3: Explicit environment credentials
ACCOUNT_ID=$(aws sts get-caller-identity --region "$AWS_REGION" --query Account --output text)
npm run bootstrap -- aws://$ACCOUNT_ID/$AWS_REGION
Running npm run deploy provisions the cluster with the resources specified in benchmarks/cdk/lib/.
This typically takes about 5 minutes. If the user data of the EC2 machines has changed and you want those changes
to take effect, prefix the deployment command with USER_DATA_CAUSES_REPLACEMENT=true.
Deployment writes .cdk-outputs.json, which the benchmark scripts use for bucket resolution.
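The two deployment variants can be wrapped in one helper. This is a sketch under assumptions: the deploy_cluster function and the CDK_DIR variable are hypothetical names, and it assumes the npm scripts are run from benchmarks/cdk, where package.json lives.

```shell
# Assumption: npm scripts are run from the CDK project directory.
CDK_DIR="${CDK_DIR:-benchmarks/cdk}"

# deploy_cluster [true|false]
# Pass "true" to force instance replacement after a user-data change.
deploy_cluster() {
  replace_instances="${1:-false}"
  (
    # Subshell keeps the caller's working directory unchanged.
    cd "$CDK_DIR" || exit 1
    if [ "$replace_instances" = "true" ]; then
      USER_DATA_CAUSES_REPLACEMENT=true npm run deploy
    else
      npm run deploy
    fi
  )
}
```

Call deploy_cluster for a regular deployment, or deploy_cluster true after editing user data. With the command prefix auth method, the npm invocations inside the helper would need the same awscmd wrapping used for the bootstrap step.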
Once the deployment is complete, the list of instance IDs will be printed to stdout. Pick one instance for validation commands with:
# Method 1: AWS SSO profile commands
INSTANCE_ID=$(aws cloudformation describe-stacks \
--stack-name DataFusionDistributedBenchmarks \
--profile "$AWS_PROFILE" \
--region "$AWS_REGION" \
--query "Stacks[0].Outputs[?OutputKey=='WorkerInstanceIds'].OutputValue" \
--output text | cut -d',' -f1)
# Method 2: Command prefix wrapper (example: aws-vault)
INSTANCE_ID=$(awscmd aws cloudformation describe-stacks \
--stack-name DataFusionDistributedBenchmarks \
--region "$AWS_REGION" \
--query "Stacks[0].Outputs[?OutputKey=='WorkerInstanceIds'].OutputValue" \
--output text | cut -d',' -f1)
# Method 3: Explicit environment credentials
INSTANCE_ID=$(aws cloudformation describe-stacks \
--stack-name DataFusionDistributedBenchmarks \
--region "$AWS_REGION" \
--query "Stacks[0].Outputs[?OutputKey=='WorkerInstanceIds'].OutputValue" \
--output text | cut -d',' -f1)
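The commands above keep only the first entry of the comma-separated WorkerInstanceIds output. If every instance needs to be checked, the same value can be split into one ID per line. This is a sketch: the helper names are hypothetical and the instance IDs in the usage comment are made-up placeholders.

```shell
# first_instance "id1,id2,..." prints the first ID (same as cut -d',' -f1 above).
first_instance() {
  printf '%s\n' "$1" | cut -d',' -f1
}

# each_instance "id1,id2,..." prints one ID per line, dropping any spaces.
each_instance() {
  printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n'
}

# Usage with placeholder IDs:
#   for id in $(each_instance "i-0aaa,i-0bbb,i-0ccc"); do
#     echo "would validate $id"
#   done
```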
For all aws examples below, keep using the selected auth style:
Method 1: aws ... --profile "$AWS_PROFILE" --region "$AWS_REGION"
Method 2: awscmd aws ... --region "$AWS_REGION"
Method 3: aws ... --region "$AWS_REGION"
It is usually necessary to verify that everything was deployed correctly and is running fine. To do so, perform the following steps for each engine:
Port 9000:
aws ssm start-session --target $INSTANCE_ID --document-name AWS-StartPortForwardingSession --parameters "portNumber=9000,localPortNumber=9000"
curl http://localhost:9000/info | jq .
Trino (port 8080):
aws ssm start-session --target $INSTANCE_ID --document-name AWS-StartPortForwardingSession --parameters "portNumber=8080,localPortNumber=8080"
curl -s -H "X-Trino-User: admin" http://localhost:8080/v1/node | jq .
curl -s -H "X-Trino-User: admin" http://localhost:8080/v1/info | jq .
Check the results against benchmarks/cdk/lib/trino.ts.
Spark (port 9003, benchmarks/cdk/bin/spark_http.py):
aws ssm start-session --target $INSTANCE_ID --document-name AWS-StartPortForwardingSession --parameters "portNumber=9003,localPortNumber=9003"
Query http://localhost:9003/health and http://localhost:9003/query to double-check that
everything is consistent with what's expected from benchmarks/cdk/lib/spark.ts.
Port 9002:
aws ssm start-session --target $INSTANCE_ID --document-name AWS-StartPortForwardingSession --parameters "portNumber=9002,localPortNumber=9002"
Check that --external-host is not incorrectly set to localhost rather than the actual scheduler EC2 private IP.
When running port-forward commands in the background, remember that they take about 5 seconds before the "waiting for connections" message appears. Until then, the port is not yet forwarded.
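Because a backgrounded port forward is not usable right away, it can help to poll the local port before sending requests. This is a minimal sketch, assuming bash (it relies on bash's /dev/tcp pseudo-device); the wait_for_port name and its default timeout are hypothetical.

```shell
# wait_for_port PORT [TIMEOUT_SECS]
# Returns 0 once something accepts connections on 127.0.0.1:PORT,
# or 1 if the timeout (default 15 seconds) elapses first.
wait_for_port() {
  port="$1"
  timeout_secs="${2:-15}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout_secs" ]; do
    # Bash-specific: opening /dev/tcp/HOST/PORT succeeds only if a listener exists.
    # The subshell closes the connection again as soon as it exits.
    if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage: start the forward in the background, then wait before querying.
#   aws ssm start-session --target "$INSTANCE_ID" \
#     --document-name AWS-StartPortForwardingSession \
#     --parameters "portNumber=9000,localPortNumber=9000" &
#   wait_for_port 9000 && curl http://localhost:9000/info | jq .
```

Polling the port avoids a fixed sleep, and the timeout keeps the script from hanging if the session never starts.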
If at some point you need to run a command on all machines and get its output, you can do so
with npm run send-command your custom command