MLA-C01 Domain 2 Complete Guide: ML Model Development (26% of the Exam)

Domain 2, ML Model Development, is worth 26% of MLA-C01 and behaves like a diagnosis-to-prescription exam: you read a symptom in the data or training run and pick the right fix. This guide covers the three task statements – choosing a modeling approach, training and refining, and analyzing performance – with the AWS tools behind each.
- The Character of Domain 2: Symptom to Prescription
- Task 2.1: Consider Not Building Before You Build
- A Map of Built-in Algorithms, from XGBoost to DeepAR
- Task 2.2: The Vocabulary of Training – Epochs, Batches, Distributed Training
- Fighting Overfitting: Regularization, Dropout, Catastrophic Forgetting
- Hyperparameter Tuning: AMT Retires Brute Force
- Slimming and Combining Models: Ensembles to the Model Registry
- Task 2.3: Choosing Evaluation Metrics, from Confusion Matrix to AUC
- Clarify and Model Debugger: Interpretation, Bias, Convergence
- High-Frequency Checklist: Self-Diagnosis for Exam Day
- Conclusion: Learn Model Development as Diagnostics
The Character of Domain 2: Symptom to Prescription
| Task | Theme | What is tested |
| --- | --- | --- |
| Task 2.1 | Choose a modeling approach | Algorithm fit, when to use pre-built AI services, cost and interpretability |
| Task 2.2 | Train and refine models | Training control, regularization, hyperparameter tuning, versioning |
| Task 2.3 | Analyze model performance | Metric selection, baselines, overfitting detection, convergence debugging |
Task 2.1: Consider Not Building Before You Build
The cheapest model is the one you do not train. Before custom modeling, the exam wants you to weigh managed AI services such as Amazon Rekognition, Amazon Comprehend, Amazon Transcribe, and Amazon Bedrock. If a ready-made service solves the problem, that is often the right answer on cost and time to value. Reserve custom modeling on SageMaker for cases where none of those services fits.
A Map of Built-in Algorithms, from XGBoost to DeepAR
| Algorithm | Task | In a phrase |
| --- | --- | --- |
| XGBoost | Classification / regression | Gradient-boosted trees, the first pick for tabular data |
| Linear Learner | Classification / regression | Fast, interpretable linear baseline |
| K-Means | Clustering | Unsupervised grouping |
| PCA | Dimensionality reduction | Compress features, prep for visualization |
| Random Cut Forest | Anomaly detection | Unsupervised outlier scoring |
| DeepAR | Time-series forecasting | RNN-based probabilistic forecasts across many series |
Task 2.2: The Vocabulary of Training – Epochs, Batches, Distributed Training
Know the levers: an epoch is one full pass over the training data, batch size sets how many samples contribute to each weight update, and the learning rate sets the step size of that update. For large models, distributed training splits the work: data parallelism replicates the model on every GPU and gives each replica a different slice of the data, while model parallelism splits the model itself across devices when it does not fit on one. SageMaker provides libraries for both.
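To make the three levers concrete, here is a minimal sketch in plain Python, not SageMaker code. It fits a one-feature linear model with mini-batch gradient descent on a toy dataset. The function names, the dataset, and the default values are illustrative assumptions, chosen only so the roles of `epochs`, `batch_size`, and `learning_rate` are visible.

```python
import math
import random


def updates_per_epoch(n_samples: int, batch_size: int) -> int:
    """Weight updates in one epoch; the last batch may be smaller."""
    return math.ceil(n_samples / batch_size)


def train_linear(xs, ys, epochs=200, batch_size=4, learning_rate=0.1, seed=0):
    """Fit y ~ w*x + b with mini-batch gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):                      # one epoch = one pass over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average gradient of (w*x + b - y)^2 over the samples in this batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * grad_w          # learning rate scales the step
            b -= learning_rate * grad_b
    return w, b


# Toy, noise-free data generated from y = 3x + 1 (an assumption for illustration).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = train_linear(xs, ys)
print(updates_per_epoch(len(xs), 4), round(w, 3), round(b, 3))
```

With 20 samples and a batch size of 4, each epoch performs 5 weight updates. Halving the batch size doubles the updates per epoch, and each of those updates is computed from fewer samples.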
Fighting Overfitting: Regularization, Dropout, Catastrophic Forgetting
When a model memorizes the training set (training loss keeps falling while validation loss stalls or rises), reach for L1 and L2 regularization, dropout, early stopping, or more data and augmentation. In transfer learning and fine-tuning, watch for catastrophic forgetting, where new training erases earlier capability. The exam frames these as fixes for a described symptom.
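Early stopping is the easiest of these fixes to show in code. The sketch below is framework-independent Python. The function name, the `patience` and `min_delta` parameters, and the validation-loss values are assumptions made for illustration, not a SageMaker API.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once validation loss has failed to improve by more than
    min_delta for `patience` consecutive epochs. stop_epoch is None if the
    rule never triggers.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation curve: improves, then turns upward (the overfitting symptom).
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch, best_epoch)  # 5 3
```

The model you keep is the checkpoint from `best_epoch`, not the one from the epoch where training stopped.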
Hyperparameter Tuning: AMT Retires Brute Force
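SageMaker Automatic Model Tuning (AMT) searches the hyperparameter space for you. Grid and random search are the baselines. Bayesian optimization usually reaches a good configuration in fewer training jobs because it uses the results of past trials to choose the next ones, and Hyperband stops weak runs early. Knowing why Bayesian optimization beats grid search on cost is a common question: the number of grid trials multiplies with every hyperparameter you add, and each trial is a billed training job.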
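The cost argument is mostly arithmetic, which a short sketch can show. This is plain Python and not the AMT API. The hyperparameter names echo XGBoost, but the candidate values and the trial budget are assumptions for illustration.

```python
import random

# Hypothetical search space: three hyperparameters with a few candidate values each.
search_space = {
    "eta": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [3, 5, 7, 9],
    "subsample": [0.6, 0.8, 1.0],
}


def grid_trial_count(space):
    """Grid search trains one model per combination, so the counts multiply."""
    count = 1
    for values in space.values():
        count *= len(values)
    return count


def random_trials(space, n_trials, seed=0):
    """Random search samples a fixed budget of configurations instead."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_trials)
    ]


print(grid_trial_count(search_space))        # 48 training jobs
print(len(random_trials(search_space, 10)))  # 10 training jobs
```

Adding a fourth hyperparameter with four candidate values would push the grid to 192 jobs, while a budgeted strategy still runs only as many jobs as you allow. Bayesian optimization spends that same fixed budget more carefully by picking each new configuration based on earlier results.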
Slimming and Combining Models: Ensembles to the Model Registry
Ensembles such as bagging and boosting raise accuracy, while distillation, pruning, and quantization shrink models for cheaper inference. Once a model is ready, the SageMaker Model Registry versions it and gates approval before deployment, linking Domain 2 to the deployment workflow in Domain 3.
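Of the slimming techniques, quantization is the simplest to sketch. The example below shows symmetric 8-bit quantization of a weight list in plain Python. It is a simplified illustration under assumed weight values, not how any particular SageMaker tool or framework implements it.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights from one layer.
weights = [0.50, -1.27, 0.03, 0.88]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)  # [50, -127, 3, 88]
```

Each weight now takes one byte instead of four (for 32-bit floats), at the price of a rounding error of at most half the scale per weight.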
Task 2.3: Choosing Evaluation Metrics, from Confusion Matrix to AUC
Pick the metric that matches the business cost. Accuracy misleads on imbalanced data, so reach for precision, recall, F1, and ROC-AUC for classification, and RMSE, MAE, or R-squared for regression. The exam loves scenarios where recall matters more than precision (fraud, disease) or the reverse, and expects you to read the confusion matrix accordingly.
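A worked example shows why accuracy misleads on imbalanced data. The sketch below computes the standard metrics from the four cells of a confusion matrix. The counts describe a hypothetical fraud dataset and are assumptions for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of flagged cases, how many were real
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of real cases, how many were caught
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical: 1,000 transactions, 20 fraudulent; the model catches only 5 of them.
metrics = classification_metrics(tp=5, fp=5, fn=15, tn=975)
print(metrics)
```

Accuracy comes out at 0.98 even though recall is only 0.25, meaning three out of four fraud cases slip through. When a miss is the expensive error, as in fraud or disease screening, recall is the number to optimize. When a false alarm is the expensive error, precision is.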
Clarify and Model Debugger: Interpretation, Bias, Convergence
SageMaker Clarify explains predictions with SHAP values and checks for bias both before training (in the data) and after training (in the model's predictions), while SageMaker Debugger captures tensors during training to diagnose vanishing gradients, overfitting, and stalled convergence. Together they cover interpretability, fairness, and training health.
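To give a feel for what a post-training bias check measures, the sketch below computes the difference in positive proportions in predicted labels (DPPL), one of the metrics Clarify reports, by hand. It does not use the Clarify SDK. The facet groups and predictions are hypothetical, and the function names are illustrative.

```python
def positive_rate(predictions):
    """Share of predictions that are the positive class (1)."""
    return sum(predictions) / len(predictions)


def dppl(preds_facet_a, preds_facet_d):
    """Difference in positive proportions in predicted labels.

    Positive values mean facet a receives positive predictions more often
    than facet d; a value of 0 means both receive them at the same rate.
    """
    return positive_rate(preds_facet_a) - positive_rate(preds_facet_d)


# Hypothetical loan-approval predictions (1 = approved) for two groups.
preds_facet_a = [1, 1, 1, 0, 1, 0, 1, 1]   # 6 of 8 approved
preds_facet_d = [1, 0, 0, 0, 1, 0, 0, 0]   # 2 of 8 approved
print(dppl(preds_facet_a, preds_facet_d))  # 0.5
```

A gap this large would be a prompt to investigate further, not proof of unfairness on its own.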
High-Frequency Checklist: Self-Diagnosis for Exam Day
Conclusion: Learn Model Development as Diagnostics
Domain 2 rewards the engineer who reads symptoms and prescribes the right tool. Internalize the algorithm map, the training levers, the tuning strategies, and the metrics, and 26% of the exam becomes a series of familiar diagnoses.




