AIF-C01 Domain 3 Complete Guide: Applications of Foundation Models (28%)

Domain 3, Applications of Foundation Models, is the largest domain on AIF-C01 at 28% of the scored content — the section that most often decides pass or fail. It is also the most practical: model selection, inference parameters, RAG, vector databases, the customization cost gradient, agents, and prompt engineering. If your study time is limited, spend it here.
- The shape of Domain 3 — four task statements
- Model-selection criteria — eight design factors
- Inference parameters — controlling output with temperature
- RAG and Amazon Bedrock Knowledge Bases
- Vector databases — five options on AWS
- The cost gradient of the four customization methods
- Agents — automating multi-step tasks
- Prompt engineering — choosing the technique
- Prompt attack risks — four threats
- Training, fine-tuning, and data preparation
- Evaluating foundation models — ROUGE, BLEU, BERTScore
- Domain 3 on the workbench — a real customization decision
- Summary — the scoring strategy for Domain 3
- References
The shape of Domain 3 — four task statements
The four task statements cover: design considerations for applications that use foundation models; choosing effective prompt-engineering techniques; the training and fine-tuning process for foundation models; and how to evaluate foundation-model performance. In short, how to take a pre-trained model and make it useful, safely and cost-effectively, for a specific job.
Model-selection criteria — eight design factors
Selecting a model weighs eight factors: capability for the task, latency requirements, cost per token, context-window size, modality, language coverage, customizability, and licensing or compliance constraints. The exam rewards the instinct to pick the smallest, cheapest model that satisfies the requirement rather than the most powerful one available.
Inference parameters — controlling output with temperature
Three parameters shape output. Temperature controls randomness: low for deterministic, factual answers; high for creative variety. Top-p (nucleus sampling) and top-k limit the candidate token pool. A common exam scenario asks how to make output more consistent — the answer is to lower temperature.
RAG and Amazon Bedrock Knowledge Bases
Retrieval-Augmented Generation grounds a model in your own data: documents are chunked, embedded, and stored in a vector database; at query time the most relevant passages are retrieved and added to the prompt. This reduces hallucination and lets the model answer from private, current information without retraining. Amazon Bedrock Knowledge Bases is the managed service that wires this pipeline together.
Vector databases — five options on AWS
- Amazon OpenSearch Serverless — the default managed vector store for Knowledge Bases.
- Amazon Aurora PostgreSQL with the pgvector extension.
- Amazon Neptune Analytics for graph plus vector search.
- Pinecone — managed third-party vector database.
- Redis Enterprise Cloud — low-latency vector search.
The cost gradient of the four customization methods
This is the single highest-yield concept in Domain 3. Customization runs from cheapest and fastest to most expensive and slowest: prompt engineering (no training, change the prompt) to RAG (add retrieval, no model change) to fine-tuning (train on labeled examples) to continued pre-training (further train on large unlabeled domain data). The exam expects you to pick the lowest-cost method that solves the problem — and RAG, not fine-tuning, is the right answer whenever the need is current or private knowledge.
| Method | Cost | Best for |
|---|---|---|
| Prompt engineering | Lowest | Steering behavior and format |
| RAG | Low | Current or private knowledge |
| Fine-tuning | Medium | Consistent style or a narrow task |
| Continued pre-training | Highest | Deep domain adaptation |
Agents — automating multi-step tasks
Agents let a foundation model plan and carry out multi-step tasks by calling tools and APIs — for example, looking up an order, checking inventory, and drafting a reply. Amazon Bedrock Agents orchestrate this. Recognize agents as the answer when a scenario needs the model to take actions, not just generate text.
Prompt engineering — choosing the technique
- Zero-shot: ask directly with no examples.
- Few-shot: include a few examples to set the pattern.
- Chain-of-thought: ask the model to reason step by step for harder problems.
- System prompts: set role, tone, and guardrails up front.
Prompt attack risks — four threats
Security shows up here as four threats: prompt injection (malicious instructions hidden in input), jailbreaking (bypassing safety guardrails), prompt leaking (extracting the hidden system prompt), and poisoning (corrupting training or retrieval data). Mitigations include input validation, guardrails, and least-privilege access for any tools the model can call.
Training, fine-tuning, and data preparation
Fine-tuning needs well-labeled, representative data; poor data produces a worse model, not a better one. Know the difference between instruction fine-tuning (task behavior) and continued pre-training (domain knowledge), and remember that RAG is often the cheaper path to the same business outcome.
Evaluating foundation models — ROUGE, BLEU, BERTScore
Generative output needs its own metrics: ROUGE for summarization (overlap with reference text), BLEU for translation quality, and BERTScore for semantic similarity. For subjective quality, human evaluation remains the gold standard. Match the metric to the task on the exam.
Domain 3 on the workbench — a real customization decision
Suppose you want an assistant that answers questions about your own 3D-printing build logs. The instinct might be to fine-tune, but the right call is RAG: embed the logs, retrieve the relevant passages, and let the model answer from them. It is cheaper, updates instantly when you add a log, and avoids baking stale data into a model. That single judgment captures the heart of Domain 3.
Summary — the scoring strategy for Domain 3
Domain 3 (28%) is where the exam is won. Master the customization cost gradient, know when RAG beats fine-tuning, control output with temperature, and match evaluation metrics to tasks. Over-invest here and the rest of the exam becomes much easier.





