2026.06.09 2026.07.04

AIF-C01 Domain 1 Complete Guide: AI/ML Fundamentals and the ML Lifecycle (20%)

swiftwand

Domain 1, Fundamentals of AI and ML, is the foundation of the AIF-C01 exam at 20% of the scored content. It does not ask you to build models, but it does demand a precise vocabulary: the difference between AI, machine learning, deep learning, and generative AI, the types of learning, and the metrics used to judge a model. Domains 2 and 3 (52% combined) are built on the terms you learn here, so a solid Domain 1 pays compounding dividends.



The shape of Domain 1 — three task statements

The official exam guide splits Domain 1 into three task statements: explain basic AI concepts and terminology; identify practical use cases for AI; and describe the machine-learning development lifecycle. In plain terms, the domain tests whether you can place a problem on the AI map, decide whether ML is even the right approach, and name the stages a model passes through from data to deployment.

The nested structure of AI, ML, deep learning, and generative AI

Think of four nested circles. Artificial intelligence is the outermost: any system that performs tasks normally requiring human intelligence. Machine learning sits inside it: systems that learn patterns from data rather than following hand-written rules. Deep learning is a subset of ML that uses multi-layer neural networks. Generative AI lives inside deep learning: foundation models that generate new text, images, or code. Exam questions often test this hierarchy directly, so be able to place any example in the correct circle.

Two forms of inference and the types of data

Inference comes in two forms. Batch inference processes large volumes of stored data on a schedule and suits cases where latency does not matter. Real-time inference returns a prediction immediately for a single request, typically at higher cost because an endpoint has to stay available to answer. On data, distinguish structured data (rows and columns, as in a database), semi-structured data (JSON, logs), and unstructured data (text, images, audio). Labeled data carries the answer and unlabeled data does not, a distinction that decides which learning type applies.
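The two forms of inference can be contrasted in a few lines. This is only a sketch: `predict` is a hypothetical stand-in for a trained model (a fixed threshold, not a real one), and the function names are illustrative rather than any AWS API.

```python
from typing import Iterable, List


def predict(amount: float) -> str:
    """Hypothetical stand-in for a trained model: flags large amounts."""
    return "review" if amount > 1000.0 else "ok"


def real_time_inference(amount: float) -> str:
    """One request in, one prediction out, returned immediately."""
    return predict(amount)


def batch_inference(amounts: Iterable[float]) -> List[str]:
    """Score a whole stored dataset in one scheduled run."""
    return [predict(amount) for amount in amounts]


# A single live request versus a stored batch of records.
single_result = real_time_inference(1500.0)
batch_results = batch_inference([20.0, 2500.0, 999.0])
print(single_result, batch_results)
```

The model is the same in both cases. What differs is how requests arrive and how quickly an answer is needed.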

Three types of learning — supervised, unsupervised, reinforcement

Supervised learning trains on labeled data to predict a known target — classification (spam or not) and regression (predict a price).
Unsupervised learning finds structure in unlabeled data — clustering customers, detecting anomalies, reducing dimensions.
Reinforcement learning learns by trial and error through rewards — robotics, game playing, and the RLHF used to align foundation models.
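The three learning types above can be told apart by what each algorithm is given. The toy functions below are deliberately tiny illustrations (a nearest-neighbour lookup, a one-dimensional two-group clustering, and a try-every-action loop), chosen for this sketch and not taken from the exam guide. All names and data are hypothetical.

```python
def nearest_neighbor(labeled_examples, x):
    """Supervised: uses (feature, label) pairs to predict a label for x."""
    closest = min(labeled_examples, key=lambda pair: abs(pair[0] - x))
    return closest[1]


def two_means(values, iterations=10):
    """Unsupervised: splits unlabeled numbers into two groups by closeness."""
    low_center, high_center = min(values), max(values)
    low_group, high_group = list(values), []
    for _ in range(iterations):
        low_group = [v for v in values
                     if abs(v - low_center) <= abs(v - high_center)]
        high_group = [v for v in values
                      if abs(v - low_center) > abs(v - high_center)]
        if not high_group:  # all values identical: nothing to separate
            break
        low_center = sum(low_group) / len(low_group)
        high_center = sum(high_group) / len(high_group)
    return low_group, high_group


def best_action_by_trial(reward_fn, actions, rounds=3):
    """Reinforcement (toy): tries each action, keeps the best-rewarded one."""
    totals = {action: 0.0 for action in actions}
    for _ in range(rounds):
        for action in actions:
            totals[action] += reward_fn(action)  # observe the reward
    return max(actions, key=lambda action: totals[action])


label = nearest_neighbor([(1.0, "cheap"), (100.0, "expensive")], 90.0)
groups = two_means([1, 2, 3, 50, 52, 55])
action = best_action_by_trial({"left": 0.0, "right": 1.0}.get, ["left", "right"])
print(label, groups, action)
```

Only the first function ever sees a label, the second sees raw values alone, and the third sees nothing but rewards.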

When to use AI — and when not to

A recurring exam pattern asks you to judge fit. Use this three-way test: if a deterministic result is required, ML is unsuitable; if the rules can be fully written out, ML is unnecessary; only when patterns are too complex to codify and some error is tolerable does ML fit. Tasks such as tax calculation, inventory allocation, or regulatory determinations — where being 95% right is not acceptable — should not use ML. Recognizing this saves you from over-engineering and answers a whole class of questions mechanically.
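The three-way test above is mechanical enough to write down as a function. This is a minimal sketch. The function name, parameter names, and return strings are made up for illustration. The final fallback (rules cannot be written out, yet error is not tolerable) is an assumption that follows from the "only when" wording.

```python
def ml_fit(needs_deterministic_result: bool,
           rules_fully_writable: bool,
           error_tolerable: bool) -> str:
    """Apply the three-way test in order and return a verdict."""
    if needs_deterministic_result:
        return "unsuitable"   # a guaranteed exact answer is required
    if rules_fully_writable:
        return "unnecessary"  # plain rules already solve it
    if error_tolerable:
        return "fits"         # complex patterns, some error is acceptable
    # Assumption: too complex to codify but no error allowed means ML
    # still does not fit.
    return "unsuitable"


# Tax calculation: exact answer required, rules are written in law.
print(ml_fit(True, True, False))
# Tagging product photos: no complete rule set, occasional mistakes are fine.
print(ml_fit(False, False, True))
```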

Choosing an ML approach — regression, classification, clustering

Match the problem to the method. Predicting a continuous number is regression. Sorting items into categories is classification (binary or multi-class). Grouping similar items with no labels is clustering. The exam phrases these as business scenarios, so practice translating “estimate next month’s demand” into regression and “flag fraudulent transactions” into classification.
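The matching rule above reduces to two questions: are there labels, and is the target a continuous number? The helper below is a sketch of that decision only. Its name and parameters are hypothetical, and it ignores the many other approaches that exist outside these three.

```python
def choose_approach(has_labels: bool, target_is_continuous: bool = False) -> str:
    """Map a scenario to regression, classification, or clustering."""
    if not has_labels:
        return "clustering"  # no known answers: group similar items
    if target_is_continuous:
        return "regression"  # predict a number
    return "classification"  # predict a category


# "Estimate next month's demand": labeled history, numeric target.
print(choose_approach(has_labels=True, target_is_continuous=True))
# "Flag fraudulent transactions": labeled history, category target.
print(choose_approach(has_labels=True, target_is_continuous=False))
# "Group similar customers": no labels at all.
print(choose_approach(has_labels=False))
```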

Real-world use cases and the AWS AI service map

Domain 1 expects you to map a use case to the right managed AWS AI service. A reliable way to remember them is by function:

Use case	AWS service
Image and video analysis	Amazon Rekognition
Text to speech	Amazon Polly
Speech to text	Amazon Transcribe
Translation	Amazon Translate
Text analysis and entities	Amazon Comprehend
Document data extraction	Amazon Textract
Demand and time-series forecasts	Amazon SageMaker AI / Canvas
Generative AI and foundation models	Amazon Bedrock

A memory hook helps: Transcribe writes down what it hears; Polly speaks like a parrot. Linking the service to its function this way cuts down mix-ups on matching questions.

The ML development lifecycle — the official nine stages

The exam guide presents the lifecycle as an ordered pipeline, and ordering questions test it. The nine stages to memorize here are: business goal framing, ML problem framing, data collection, data preparation, feature engineering, model training and tuning, model evaluation, deployment, and monitoring. The key insight is that the cycle loops: monitoring feeds back into data and retraining. Memorize this order end to end.
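For drilling the order, the nine stages can be held in a list and queried. This is a study aid sketched for this guide, not an AWS artifact. The choice to loop from monitoring back to data collection is an assumption based on "monitoring feeds back into data and retraining", and the re-entry point is left as a parameter.

```python
STAGES = [
    "business goal framing",
    "ML problem framing",
    "data collection",
    "data preparation",
    "feature engineering",
    "model training and tuning",
    "model evaluation",
    "deployment",
    "monitoring",
]


def comes_before(first: str, second: str) -> bool:
    """True if `first` appears earlier in the lifecycle than `second`."""
    return STAGES.index(first) < STAGES.index(second)


def next_stage(stage: str, loop_back_to: str = "data collection") -> str:
    """Return the following stage; the last stage loops back (assumed target)."""
    position = STAGES.index(stage)
    if position == len(STAGES) - 1:
        return loop_back_to
    return STAGES[position + 1]


print(comes_before("feature engineering", "model evaluation"))
print(next_stage("deployment"))
print(next_stage("monitoring"))
```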

The SageMaker AI family mapped to the lifecycle

Amazon SageMaker AI provides a tool for each stage: Data Wrangler for preparation, Feature Store for features, Training and Automatic Model Tuning for training, Clarify for bias and explainability at evaluation, and endpoints plus Model Monitor for deployment and monitoring. You do not need to operate these, but you should recognize which stage each one serves.

Model evaluation metrics — technical and business

For classification, accuracy is the baseline, but on imbalanced data it misleads: a model that calls everything “normal” can be 99% accurate yet useless for fraud. That is why precision (of the items flagged, how many were truly positive), recall (of the true positives, how many were caught), their harmonic mean F1, and the threshold-independent AUC matter. For regression, error metrics such as RMSE apply. Crucially, the exam also tests business metrics such as ROI, cost reduction, and customer satisfaction, because a technically strong model that does not move a business metric has not succeeded.
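The imbalanced-data trap is easy to reproduce with plain arithmetic. The sketch below assumes an invented dataset of 1,000 transactions with 10 frauds (label 1) and a model that predicts "normal" (label 0) every time. The helper names are hypothetical, and AUC is left out because it needs predicted scores instead of hard labels.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (nothing flagged, or no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def rmse(actual, predicted):
    """Root mean squared error for regression predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


y_true = [1] * 10 + [0] * 990  # 10 frauds among 1,000 transactions
always_normal = [0] * 1000     # a model that never flags anything
metrics = classification_metrics(y_true, always_normal)
print(metrics)
print(rmse([2.0, 4.0], [0.0, 2.0]))
```

Accuracy comes out at 0.99 while recall and F1 are 0.0: the model catches no fraud at all, which is exactly why accuracy alone is not enough on imbalanced data.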

Summary — the scoring strategy for Domain 1

Domain 1 (20%) is half memorization, half judgment. Lock in the AI/ML/DL/GenAI hierarchy, the three learning types, the service map, and the nine-stage lifecycle as recall; practice the “when not to use ML” and metric-selection judgments. With that foundation set, move on to the 24% generative-AI domain.