2026.06.11 2026.07.04

AIF-C01 Domain 3 Complete Guide: Applications of Foundation Models (28%)

swiftwand

Domain 3, Applications of Foundation Models, is the largest domain on AIF-C01 at 28% of the scored content — the section that most often decides pass or fail. It is also the most practical: model selection, inference parameters, RAG, vector databases, the customization cost gradient, agents, and prompt engineering. If your study time is limited, spend it here.

The shape of Domain 3 — four task statements
Model-selection criteria — eight design factors
Inference parameters — controlling output with temperature
RAG and Amazon Bedrock Knowledge Bases
Vector databases — five options on AWS
The cost gradient of the four customization methods
Agents — automating multi-step tasks
Prompt engineering — choosing the technique
Prompt attack risks — four threats
Training, fine-tuning, and data preparation
Evaluating foundation models — ROUGE, BLEU, BERTScore
Domain 3 on the workbench — a real customization decision
Summary — the scoring strategy for Domain 3
References

忍者AdMax

The shape of Domain 3 — four task statements

The four task statements cover: design considerations for applications that use foundation models; choosing effective prompt-engineering techniques; the training and fine-tuning process for foundation models; and how to evaluate foundation-model performance. In short, how to take a pre-trained model and make it useful, safely and cost-effectively, for a specific job.

Model-selection criteria — eight design factors

Selecting a model weighs eight factors: capability for the task, latency requirements, cost per token, context-window size, modality, language coverage, customizability, and licensing or compliance constraints. The exam rewards the instinct to pick the smallest, cheapest model that satisfies the requirement rather than the most powerful one available.

Inference parameters — controlling output with temperature

Three parameters shape output. Temperature controls randomness: low for deterministic, factual answers; high for creative variety. Top-p (nucleus sampling) and top-k limit the candidate token pool. A common exam scenario asks how to make output more consistent — the answer is to lower temperature.

RAG and Amazon Bedrock Knowledge Bases

Retrieval-Augmented Generation grounds a model in your own data: documents are chunked, embedded, and stored in a vector database; at query time the most relevant passages are retrieved and added to the prompt. This reduces hallucination and lets the model answer from private, current information without retraining. Amazon Bedrock Knowledge Bases is the managed service that wires this pipeline together.

Vector databases — five options on AWS

Amazon OpenSearch Serverless — the default managed vector store for Knowledge Bases.
Amazon Aurora PostgreSQL with the pgvector extension.
Amazon Neptune Analytics for graph plus vector search.
Pinecone — managed third-party vector database.
Redis Enterprise Cloud — low-latency vector search.

The cost gradient of the four customization methods

This is the single highest-yield concept in Domain 3. Customization runs from cheapest and fastest to most expensive and slowest: prompt engineering (no training, change the prompt) to RAG (add retrieval, no model change) to fine-tuning (train on labeled examples) to continued pre-training (further train on large unlabeled domain data). The exam expects you to pick the lowest-cost method that solves the problem — and RAG, not fine-tuning, is the right answer whenever the need is current or private knowledge.

Method	Cost	Best for
Prompt engineering	Lowest	Steering behavior and format
RAG	Low	Current or private knowledge
Fine-tuning	Medium	Consistent style or a narrow task
Continued pre-training	Highest	Deep domain adaptation

Agents — automating multi-step tasks

Agents let a foundation model plan and carry out multi-step tasks by calling tools and APIs — for example, looking up an order, checking inventory, and drafting a reply. Amazon Bedrock Agents orchestrate this. Recognize agents as the answer when a scenario needs the model to take actions, not just generate text.

Prompt engineering — choosing the technique

Zero-shot: ask directly with no examples.
Few-shot: include a few examples to set the pattern.
Chain-of-thought: ask the model to reason step by step for harder problems.
System prompts: set role, tone, and guardrails up front.

Prompt attack risks — four threats

Security shows up here as four threats: prompt injection (malicious instructions hidden in input), jailbreaking (bypassing safety guardrails), prompt leaking (extracting the hidden system prompt), and poisoning (corrupting training or retrieval data). Mitigations include input validation, guardrails, and least-privilege access for any tools the model can call.

Training, fine-tuning, and data preparation

Fine-tuning needs well-labeled, representative data; poor data produces a worse model, not a better one. Know the difference between instruction fine-tuning (task behavior) and continued pre-training (domain knowledge), and remember that RAG is often the cheaper path to the same business outcome.

Evaluating foundation models — ROUGE, BLEU, BERTScore

Generative output needs its own metrics: ROUGE for summarization (overlap with reference text), BLEU for translation quality, and BERTScore for semantic similarity. For subjective quality, human evaluation remains the gold standard. Match the metric to the task on the exam.

Domain 3 on the workbench — a real customization decision

Suppose you want an assistant that answers questions about your own 3D-printing build logs. The instinct might be to fine-tune, but the right call is RAG: embed the logs, retrieve the relevant passages, and let the model answer from them. It is cheaper, updates instantly when you add a log, and avoids baking stale data into a model. That single judgment captures the heart of Domain 3.

Summary — the scoring strategy for Domain 3

Domain 3 (28%) is where the exam is won. Master the customization cost gradient, know when RAG beats fine-tuning, control output with temperature, and match evaluation metrics to tasks. Over-invest here and the rest of the exam becomes much easier.