MLA-C01 Domain 3 Complete Guide: Deployment and Orchestration (22% of the Exam)

Domain 3, Deployment and Orchestration of ML Workflows, is 22% of MLA-C01. It is about two things: delivering a model as an endpoint and automating the workflow that keeps it running. This guide covers choosing inference endpoints, infrastructure as code, container strategy, autoscaling, and CI/CD.
- The Big Picture: Delivering and Automating, Worth 22%
- Task 3.1: Choosing Among Four Inference Endpoint Types
- Deployment Is Not Only SageMaker: Multi-Model and Alternative Targets
- Task 3.2: IaC – CloudFormation versus CDK
- Container Strategy: ECR, ECS, EKS, BYOC
- Autoscaling: What Triggers the Stretch and Shrink
- Task 3.3: CI/CD – CodePipeline, CodeBuild, CodeDeploy
- Deployment Strategies: Blue/Green, Canary, Linear
- High-Frequency Checklist: Self-Diagnosis for Exam Day
- Conclusion: Deployment Is a Chain of Choices, Orchestration Is Automation Design
The Big Picture: Delivering and Automating, Worth 22%
| Task | Theme | What is tested |
| --- | --- | --- |
| Task 3.1 | Choose deployment infrastructure | Endpoint type, compute selection, containers, edge optimization |
| Task 3.2 | Build and script infrastructure | IaC (CloudFormation / CDK), container operations, autoscaling |
| Task 3.3 | CI/CD pipelines | CodePipeline family, deployment strategies, automated tests, retraining |
Task 3.1: Choosing Among Four Inference Endpoint Types
This is one of the most heavily tested tables on the exam. Match the endpoint type to the payload size, latency, and traffic pattern in the scenario.
| Type | Payload limit | Processing time | Best for |
| --- | --- | --- | --- |
| Real-time | 25 MB | 60 sec (8 min for streaming) | Sustained traffic, millisecond latency |
| Serverless | 4 MB | 60 sec | Intermittent, unpredictable traffic, no charge when idle |
| Asynchronous | 1 GB | Up to 1 hour | Large payloads, long processing, request queuing |
| Batch transform | GB-scale datasets | Long-running | Bulk inference on prepared data, no persistent endpoint |
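The table can be read as a decision procedure. The sketch below encodes it in plain Python, using only the limits listed above; the `Workload` type, the traffic labels, and the function name are hypothetical, and real scenarios will weigh cost and cold starts as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    payload_mb: float          # size of one request payload
    processing_seconds: float  # expected time to process one request
    traffic: str               # "sustained", "intermittent", or "offline"


def choose_endpoint_type(w: Workload) -> str:
    """Map a workload to an inference option using the limits in the table."""
    # Bulk scoring of a prepared dataset needs no persistent endpoint.
    if w.traffic == "offline":
        return "batch transform"
    # Anything over the real-time limits (25 MB, 60 sec) must be queued.
    if w.payload_mb > 25 or w.processing_seconds > 60:
        if w.payload_mb <= 1024 and w.processing_seconds <= 3600:
            return "asynchronous"
        return "batch transform"
    # Small, bursty requests fit serverless (4 MB, 60 sec).
    if w.traffic == "intermittent" and w.payload_mb <= 4:
        return "serverless"
    return "real-time"
```

For example, `choose_endpoint_type(Workload(500, 600, "sustained"))` lands on asynchronous inference, because a 500 MB payload exceeds the real-time limit but fits under 1 GB and 1 hour.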
Deployment Is Not Only SageMaker: Multi-Model and Alternative Targets
To cut cost, multi-model endpoints host many models behind one endpoint, while multi-container endpoints either chain containers into a serial inference pipeline or host different containers that are invoked individually. Beyond SageMaker, models can run on Lambda for lightweight inference, on ECS or EKS for container-based serving, or at the edge with SageMaker Neo and AWS IoT Greengrass. The exam asks you to pick the target that fits the latency and cost constraints.
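With a multi-model endpoint, the caller names the model artifact on each request. The sketch below only builds the keyword arguments for the `invoke_endpoint` call of the boto3 `sagemaker-runtime` client, so it runs without AWS credentials; the endpoint name, artifact name, and JSON body shape are assumptions for illustration.

```python
import json


def build_invoke_request(endpoint_name, target_model, features):
    """Build kwargs for invoke_endpoint against a multi-model endpoint."""
    return {
        "EndpointName": endpoint_name,
        # TargetModel selects which artifact behind the shared endpoint to use.
        "TargetModel": target_model,
        "ContentType": "application/json",
        # The body format is an assumption; it depends on the serving container.
        "Body": json.dumps({"features": features}),
    }


request = build_invoke_request("shared-endpoint", "customer-a.tar.gz", [1.0, 2.0])

# To send it (requires boto3 and AWS credentials):
# import boto3
# response = boto3.client("sagemaker-runtime").invoke_endpoint(**request)
```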
Task 3.2: IaC – CloudFormation versus CDK
Infrastructure as code makes deployments repeatable. CloudFormation uses declarative JSON or YAML templates, while the AWS CDK lets you define the same resources in a programming language such as Python or TypeScript. CDK suits teams who want loops, conditions, and reuse; CloudFormation suits straightforward declarative stacks.
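To make the "loops and reuse" point concrete, the sketch below generates a CloudFormation template from a Python loop. It is plain Python standing in for the idea behind CDK, not CDK itself; the variant names, S3 paths, image URI, and role ARN are placeholders, and the property names should be checked against the CloudFormation resource reference before use.

```python
import json


def endpoint_template(variants, image_uri, role_arn):
    """variants maps a logical name to (model_data_url, instance_type)."""
    resources = {}
    production_variants = []
    # One loop emits a Model resource and a production variant per entry,
    # which a hand-written declarative template would have to repeat.
    for name, (model_data_url, instance_type) in variants.items():
        model_id = f"{name}Model"
        resources[model_id] = {
            "Type": "AWS::SageMaker::Model",
            "Properties": {
                "ExecutionRoleArn": role_arn,
                "PrimaryContainer": {
                    "Image": image_uri,
                    "ModelDataUrl": model_data_url,
                },
            },
        }
        production_variants.append({
            "VariantName": name,
            "ModelName": {"Fn::GetAtt": [model_id, "ModelName"]},
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1.0,
        })
    resources["EndpointConfig"] = {
        "Type": "AWS::SageMaker::EndpointConfig",
        "Properties": {"ProductionVariants": production_variants},
    }
    resources["Endpoint"] = {
        "Type": "AWS::SageMaker::Endpoint",
        "Properties": {
            "EndpointConfigName": {
                "Fn::GetAtt": ["EndpointConfig", "EndpointConfigName"]
            }
        },
    }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


template = endpoint_template(
    {
        "Champion": ("s3://example-bucket/champion.tar.gz", "ml.m5.large"),
        "Challenger": ("s3://example-bucket/challenger.tar.gz", "ml.m5.large"),
    },
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/example:latest",
    role_arn="arn:aws:iam::<account>:role/example-sagemaker-role",
)
template_json = json.dumps(template, indent=2)
```

Adding a third variant is one more dictionary entry here, whereas a hand-written template needs another full resource block.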
Container Strategy: ECR, ECS, EKS, BYOC
Amazon ECR stores your images, ECS runs containers with less operational overhead, and EKS gives you Kubernetes when you need its ecosystem. Bring Your Own Container (BYOC) lets you package a custom runtime for SageMaker when the built-in images do not fit. Know when managed simplicity (ECS) beats Kubernetes flexibility (EKS).
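For BYOC, the contract for a SageMaker hosting container is small: answer health checks on `GET /ping` and serve predictions on `POST /invocations`, listening on port 8080. The sketch below implements that contract with the Python standard library only. The `predict` function is a placeholder model and the JSON payload shape is an assumption; a production container would normally sit behind a proper web server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload: dict) -> dict:
    """Placeholder model: returns the sum of the input features."""
    return {"prediction": sum(payload["features"])}


class InferenceHandler(BaseHTTPRequestHandler):
    def _send(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: an empty 200 response tells the platform we are ready.
        self._send(200 if self.path == "/ping" else 404)

    def do_POST(self):
        if self.path != "/invocations":
            self._send(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        self._send(200, json.dumps(predict(payload)).encode())


def serve(port: int = 8080) -> None:
    """Block forever, serving /ping and /invocations."""
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

Calling `serve()` as the container's entry point starts the server; the image would then be pushed to ECR and referenced from the SageMaker model definition.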
Autoscaling: What Triggers the Stretch and Shrink
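SageMaker endpoints scale through Application Auto Scaling, using either the predefined invocations-per-instance metric or a custom CloudWatch metric such as CPU utilization or model latency. Target tracking is the common pattern: you set a target value and instances are added or removed to hold it. For spiky traffic, serverless inference scales automatically, so there is no scaling policy to manage. Expect questions that pair a traffic shape with the right scaling approach.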
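A target-tracking setup takes two Application Auto Scaling calls: register the endpoint variant as a scalable target, then attach a policy. The sketch below builds the arguments for both as plain dictionaries so it runs without AWS access; the endpoint name, variant name, capacity bounds, target value, and cooldowns are illustrative assumptions.

```python
def scaling_requests(endpoint_name, variant_name, min_instances,
                     max_instances, target_invocations):
    """Build kwargs for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    register = {
        **common,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Hold average invocations per instance near this value.
            "TargetValue": float(target_invocations),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleOutCooldown": 60,   # seconds; illustrative values
            "ScaleInCooldown": 300,
        },
    }
    return register, policy


register, policy = scaling_requests("churn-endpoint", "AllTraffic", 1, 4, 70)

# To apply (requires boto3 and AWS credentials):
# import boto3
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(**register)
# client.put_scaling_policy(**policy)
```

Scaling in more slowly than scaling out, as the cooldowns here do, is a common way to avoid thrashing when traffic fluctuates.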
Task 3.3: CI/CD – CodePipeline, CodeBuild, CodeDeploy
The developer tools chain automates the path from commit to production: CodePipeline orchestrates the stages, CodeBuild compiles and tests, and CodeDeploy rolls out the release. SageMaker Pipelines adds ML-specific orchestration for data prep, training, evaluation, and registration, and a pipeline run can be started automatically (for example, by an Amazon EventBridge rule) to retrain when new data arrives or drift is detected.
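The key idea in an ML pipeline is the gate between evaluation and registration: a model that misses its quality bar never reaches the registry. The sketch below is a plain-Python stand-in for that flow, not the SageMaker Pipelines SDK; the stage functions, the `accuracy` metric, and the threshold are all hypothetical.

```python
def run_pipeline(stages, context):
    """Run stages in order and stop at the first one that returns False.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, stage in stages:
        if not stage(context):
            break
        completed.append(name)
    return completed


def prepare(ctx):
    return True  # placeholder for data preparation


def train(ctx):
    return True  # placeholder for a training job


def evaluate(ctx):
    # Quality gate: fail the pipeline if the metric is below the bar.
    return ctx["accuracy"] >= ctx["min_accuracy"]


def register(ctx):
    ctx["registered"] = True  # placeholder for a model registry call
    return True


STAGES = [("prepare", prepare), ("train", train),
          ("evaluate", evaluate), ("register", register)]

good = {"accuracy": 0.91, "min_accuracy": 0.90}
bad = {"accuracy": 0.80, "min_accuracy": 0.90}
good_run = run_pipeline(STAGES, good)
bad_run = run_pipeline(STAGES, bad)
```

In the failing run the pipeline stops after training, so the registration stage never executes.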
Deployment Strategies: Blue/Green, Canary, Linear
Reduce release risk by controlling how traffic shifts. A blue/green deployment stands up a new fleet beside the old one, which keeps rollback fast. Traffic can then move all at once, as a canary (a small percentage first to test in production, then the rest), or linearly in equal increments. SageMaker supports these for endpoint updates, and the exam asks which one fits a stated risk tolerance.
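The three traffic-shifting patterns differ only in how the share of traffic on the new fleet grows over time. The sketch below returns that share after each step; the mode names and default percentages are assumptions for illustration, not SageMaker API values.

```python
def traffic_schedule(mode, canary_percent=10, linear_step_percent=25):
    """Return the percent of traffic on the new fleet after each shift."""
    if mode == "all_at_once":
        # Everything moves in a single step.
        return [100]
    if mode == "canary":
        # A small slice first, then the remainder once it looks healthy.
        return [canary_percent, 100]
    if mode == "linear":
        # Equal increments, with a final step capped at 100 percent.
        steps = list(range(linear_step_percent, 100, linear_step_percent))
        return steps + [100]
    raise ValueError(f"unknown mode: {mode}")
```

With the defaults, `traffic_schedule("linear")` gives `[25, 50, 75, 100]` and `traffic_schedule("canary")` gives `[10, 100]`. The more steps there are, the smaller the share of users exposed to a bad release before rollback, at the cost of a longer rollout.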
High-Frequency Checklist: Self-Diagnosis for Exam Day
Conclusion: Deployment Is a Chain of Choices, Orchestration Is Automation Design
Domain 3 is a sequence of trade-offs: endpoint type, compute target, IaC tool, container platform, and rollout strategy. Get fluent in those choices and in the CI/CD chain that automates them, and 22% of the exam turns into a set of clear decisions.





