Load the latest model checkpoint, run evaluation on the test set, and generate a metrics report with a confusion matrix. Use this after training to assess model performance or to re-evaluate a specific checkpoint.
Generate a comprehensive summary report of the latest experiment, including metrics, plots, and a comparison with the baseline. Use this after training and evaluation to create a shareable experiment summary.
Run the full data science pipeline: validate raw data, preprocess it, engineer features, train the model, and evaluate it. Use this when you want to execute the end-to-end ML pipeline or re-run it after data or code changes.
Run API integration tests against the running backend and verify that endpoints return the expected responses and status codes. Use this after deploying a preview or starting the dev server.
Install dependencies, run type checking, linting, and tests, and build the project. Use this after making code changes to verify nothing is broken.
Build Docker images and launch a local preview environment with docker-compose. Use this to test the full stack locally before merging.
Build the Xcode project and run the full test suite. Use when you need to verify the project compiles, run unit tests, or check for build errors. Reports pass/fail results with detailed error output.
Build and launch the app in the iOS Simulator. Automatically selects an appropriate simulator device, boots it if needed, and installs and launches the app.