The AIP-C01 is AWS’s professional-level certification for developers building generative AI applications on AWS. It tests your ability to integrate foundation models into production workflows, build RAG pipelines, implement agentic AI patterns, and handle the security and governance requirements that come with deploying GenAI at scale. This is not a foundational cert. AWS expects 2+ years of production experience on AWS and at least 1 year of hands-on GenAI implementation.
The exam has 75 questions (65 scored, plus 10 unscored pilot questions you cannot identify). Question types are multiple choice and multiple response. You get 170 minutes and need a scaled score of 750 out of 1,000 to pass. Scoring is compensatory, meaning you do not need to pass each domain individually. The exam costs $300.
The 5 Domains
1. Foundation Model Integration, Data Management, and Compliance (31%)
This is the largest of the five domains, and it covers the core of what a GenAI developer actually does: selecting the right foundation model for a use case, building data pipelines that feed those models, designing vector store architectures, implementing RAG, and managing prompt engineering at scale.
Six task areas live here. You need to know how to evaluate FMs against business requirements using performance benchmarks and capability analysis. You need to understand vector store implementation with Amazon OpenSearch Service, Amazon Aurora with pgvector, Amazon Bedrock Knowledge Bases, and DynamoDB for metadata. RAG architecture is tested in depth: document chunking strategies, embedding selection (Amazon Titan embeddings), hybrid search combining keywords and vectors, and query decomposition patterns.
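Chunking strategy shows up as a scenario detail often enough that the mechanics are worth making concrete. A minimal sketch of fixed-size chunking with overlap, so context spanning a boundary lands in both neighboring chunks; the sizes are illustrative, and production pipelines typically chunk by tokens or semantic boundaries rather than characters:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks; each chunk repeats the last
    `overlap` characters of the previous one so boundary context survives."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)
            if text[i:i + chunk_size]]
```

Smaller chunks improve retrieval precision but lose context; larger chunks do the opposite. That tradeoff is the usual axis exam scenarios turn on.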
Prompt engineering goes well beyond writing good prompts. The exam tests prompt management and governance: Amazon Bedrock Prompt Management for parameterized templates, version control for prompts, and quality assurance systems using Lambda and Step Functions to test prompt regression. If you have only written ad-hoc prompts, this section will be difficult.
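To make the governance idea concrete, here is a minimal sketch of a versioned, parameterized prompt store. The names, version scheme, and template text are invented for illustration; this is not the Bedrock Prompt Management API, which provides the managed equivalent:

```python
from string import Template

# Hypothetical versioned template registry; Bedrock Prompt Management
# plays this role as a managed service.
PROMPT_TEMPLATES = {
    ("support-summary", "v2"): Template(
        "Summarize the following support ticket in $max_words words or fewer.\n"
        "Ticket:\n$ticket_text"
    ),
}

def render_prompt(name: str, version: str, **params) -> str:
    # Pinning a version means a template change is an explicit, reviewable event.
    return PROMPT_TEMPLATES[(name, version)].substitute(**params)

prompt = render_prompt("support-summary", "v2",
                       max_words=50, ticket_text="App crashes on login.")
```

A regression suite then renders each template version against golden inputs and diffs model outputs, which is the Lambda/Step Functions QA pattern the domain describes.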
2. Implementation and Integration (26%)
This domain tests whether you can build real systems, not just prototype them. Agentic AI is a major focus. The exam covers Strands Agents, AWS Agent Squad for multi-agent orchestration, and the Model Context Protocol (MCP) for tool integrations. You need to understand ReAct patterns, chain-of-thought reasoning, stopping conditions, and how to build safeguarded workflows with timeout mechanisms and circuit breakers.
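The stopping conditions the exam cares about can be sketched in a few lines. This toy agent loop enforces two safeguards, a step budget and a wall-clock timeout; `llm_step` is a stand-in for a real model call, and the action tuple format is invented for illustration:

```python
import time

def run_agent(goal, llm_step, tools, max_steps=5, timeout_s=30.0):
    """Minimal agentic loop. llm_step(history) returns either
    ("final", answer) or ("tool", tool_name, args)."""
    deadline = time.monotonic() + timeout_s
    history = [goal]
    for _ in range(max_steps):
        if time.monotonic() > deadline:
            return "aborted: timeout"          # safeguard 1: wall-clock limit
        action = llm_step(history)
        if action[0] == "final":
            return action[1]
        _, name, args = action
        history.append(f"{name} -> {tools[name](args)}")
    return "aborted: step budget exhausted"    # safeguard 2: iteration cap

def scripted_llm(history):
    # Stand-in for the model: use the tool once, then finish.
    if len(history) == 1:
        return ("tool", "add", (2, 3))
    return ("final", history[-1])

result = run_agent("What is 2 + 3?", scripted_llm,
                   {"add": lambda a: a[0] + a[1]})
```

A circuit breaker extends the same idea across invocations: after N consecutive failures, stop calling the model entirely for a cool-down period.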
Model deployment strategy matters here. Know the difference between on-demand invocation through Lambda, Bedrock provisioned throughput, and SageMaker AI endpoints for fine-tuned models. The exam asks about deployment patterns specific to LLMs: container-based deployments optimized for GPU utilization, parameter-efficient techniques like LoRA, and model lifecycle management through SageMaker Model Registry.
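For on-demand invocation it helps to know the request shape. A sketch of building an InvokeModel body in the Anthropic Messages format used by Claude models on Bedrock; field names follow the public docs, but verify them and the model ID against the current model reference before relying on them:

```python
import json

def build_claude_body(prompt: str, max_tokens: int = 256,
                      temperature: float = 0.2) -> str:
    # Anthropic Messages request body as documented for Claude on Bedrock.
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    })

# With AWS credentials configured, the body would be sent via boto3:
#   bedrock = boto3.client("bedrock-runtime")
#   resp = bedrock.invoke_model(
#       modelId="anthropic.claude-3-haiku-20240307-v1:0",
#       body=build_claude_body("Hello"))
```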
Enterprise integration gets specific. Expect questions about API Gateway patterns for GenAI (streaming responses, token limit management, retry strategies), event-driven architectures with EventBridge, CI/CD pipelines for GenAI components using CodePipeline and CodeBuild, and cross-environment compliance with AWS Outposts and Wavelength.
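Retry-strategy questions usually come down to exponential backoff with jitter for throttling errors. A minimal sketch of the full-jitter schedule; the base and cap values are illustrative:

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 0.5,
                   cap: float = 8.0) -> list[float]:
    """Full jitter: each delay is drawn uniformly from
    [0, min(cap, base * 2**attempt)], which spreads retry storms out
    instead of synchronizing clients on the same schedule."""
    return [random.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(max_retries)]
```

In a real client, each delay would precede a retried invocation after a `ThrottlingException`, with the request abandoned once the schedule is exhausted.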
3. AI Safety, Security, and Governance (20%)
People underestimate this domain. A fifth of the exam is dedicated to making GenAI systems safe, and the questions are specific. Input and output safety controls are tested thoroughly: Amazon Bedrock Guardrails for content filtering, Lambda-based custom moderation workflows, accuracy verification using Bedrock Knowledge Base grounding, hallucination reduction through structured outputs, and adversarial defense including prompt injection detection and jailbreak prevention.
Data security covers VPC endpoints for FM isolation, IAM policies for model and data access, Amazon Comprehend and Amazon Macie for PII detection, and Amazon Bedrock native privacy features. Governance questions test model cards, data lineage tracking with AWS Glue Data Catalog, CloudTrail for audit logging, and automated compliance monitoring for bias drift and policy violations.
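To see how Comprehend's PII output gets applied, here is a redaction sketch that consumes entities in the shape `detect_pii_entities` returns (`Type`, `BeginOffset`, `EndOffset`). The sample entity list is hard-coded here where a real Comprehend call would supply it:

```python
def redact_pii(text: str, entities: list[dict]) -> str:
    """Replace each detected PII span with its type tag. Processing in
    reverse offset order keeps earlier offsets valid as the text shrinks."""
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text

# In production the entity list would come from:
#   comprehend = boto3.client("comprehend")
#   entities = comprehend.detect_pii_entities(
#       Text=text, LanguageCode="en")["Entities"]
sample = "Contact jane@example.com for details."
entities = [{"Type": "EMAIL", "BeginOffset": 8, "EndOffset": 24}]
```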
Responsible AI principles are explicitly tested: transparency in FM outputs (reasoning displays, confidence metrics, source attribution), fairness evaluations, and policy-compliant AI systems using Bedrock guardrails.
4. Operational Efficiency and Optimization (12%)
Smaller domain, but the questions are practical. Cost optimization is the biggest topic: token efficiency (context window optimization, prompt compression, response limiting), cost-effective model selection with tiered FM usage based on query complexity, provisioned throughput vs. on-demand pricing, and semantic caching to reduce unnecessary FM invocations.
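Semantic caching is worth understanding mechanically: a new query reuses a cached answer when its embedding is close enough to a previous query's, skipping an FM invocation entirely. A toy sketch with a cosine-similarity threshold; the threshold value and linear scan are illustrative, and a real implementation would use a vector index:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

class SemanticCache:
    """Cache keyed by query embedding rather than exact query text."""
    def __init__(self, threshold: float = 0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer)

    def get(self, embedding):
        # Return a cached answer if any stored query is similar enough.
        for emb, answer in self.entries:
            if cosine(embedding, emb) >= self.threshold:
                return answer
        return None

    def put(self, embedding, answer):
        self.entries.append((embedding, answer))
```

The threshold is the knob exam scenarios probe: too low and users get stale or mismatched answers, too high and the cache never hits.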
Performance optimization covers latency-cost tradeoffs, retrieval performance tuning for RAG (index optimization, hybrid search), FM throughput optimization (batching, concurrent invocation), and model-specific parameter tuning (temperature, top-k/top-p). Monitoring questions test CloudWatch metrics for GenAI workloads, Bedrock Model Invocation Logs, and observability dashboards tracking token usage, hallucination rates, and response quality.
5. Testing, Validation, and Troubleshooting (11%)
The smallest domain, but do not ignore it. Evaluation systems go beyond traditional ML metrics. The exam tests evaluation for relevance, factual accuracy, consistency, and fluency. You need to know Amazon Bedrock Model Evaluations, A/B testing, LLM-as-a-Judge techniques, RAG evaluation (relevance scoring, context matching), and agent performance frameworks (task completion rates, tool usage effectiveness).
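LLM-as-a-Judge setups typically have the judge model emit per-criterion ratings that you then aggregate into a verdict. A sketch of the aggregation step, with hypothetical ratings standing in for parsed judge output; the 1-5 scale, weights, and passing bar are all assumptions:

```python
def judge_verdict(scores: dict[str, int], weights=None,
                  passing: float = 4.0) -> bool:
    """Combine per-criterion 1-5 ratings into a weighted pass/fail verdict."""
    weights = weights or {k: 1.0 for k in scores}
    total = sum(scores[k] * weights[k] for k in scores) / sum(weights.values())
    return total >= passing

# Hypothetical ratings a judge model might return for one response:
ratings = {"relevance": 5, "factual_accuracy": 4,
           "consistency": 4, "fluency": 5}
```

Running this over a golden dataset for two prompt or model variants is the basic shape of an A/B evaluation.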
Troubleshooting is GenAI-specific. Context window overflow diagnostics, prompt engineering debugging (version comparison, systematic refinement), retrieval system issues (embedding quality, chunking problems, vectorization drift), and hallucination detection using golden datasets and output diffing. These are the problems you actually encounter in production, and the exam expects you to know how to diagnose them.
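Context window overflow is often caught with a cheap pre-flight estimate before invocation. A sketch using the rough ~4 characters/token heuristic for English text; real diagnostics should use the model's own tokenizer, and the budget numbers are illustrative:

```python
def fits_context(prompt: str, max_context_tokens: int,
                 reserved_output_tokens: int = 512) -> bool:
    """Pre-flight check: leave headroom for the response, not just the prompt."""
    estimated = len(prompt) / 4  # crude chars-per-token heuristic
    return estimated + reserved_output_tokens <= max_context_tokens

def trim_history(messages: list[str], budget_tokens: int) -> list[str]:
    """Drop the oldest turns until the estimated total fits the budget."""
    while messages and sum(len(m) for m in messages) / 4 > budget_tokens:
        messages = messages[1:]
    return messages
```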
Key AWS Services to Know
The exam is heavily weighted toward Amazon Bedrock and its ecosystem. You should be comfortable with:
- Amazon Bedrock: Model invocation, Knowledge Bases, Agents, Guardrails, Prompt Management, Prompt Flows, Model Evaluations, Cross-Region inference
- Amazon SageMaker AI: Endpoint deployment, Model Registry, Processing, fine-tuned model hosting with LoRA adapters
- Amazon OpenSearch Service: Vector search, hybrid search, sharding strategies for semantic retrieval
- AWS Lambda & Step Functions: Agentic workflows, custom moderation, prompt QA pipelines, orchestration patterns
- Amazon Q Developer: Code generation, API assistance, troubleshooting GenAI applications
- Security services: IAM, KMS, Amazon Comprehend (PII), Amazon Macie, VPC endpoints, CloudTrail
- Monitoring: CloudWatch, X-Ray, Bedrock Model Invocation Logs
Model development and training are explicitly out of scope. You are not expected to train models, perform feature engineering, or implement advanced ML techniques. The exam assumes you are consuming foundation models, not building them.
What Trips People Up
- RAG vs. fine-tuning: The exam frequently presents scenarios where you need to choose between retrieval augmentation and model customization. RAG is the answer when you need current information or domain-specific data that changes. Fine-tuning (with LoRA or adapters) is for consistent behavioral changes or domain-specific language. Many questions are designed to see if you default to fine-tuning when RAG is the simpler and cheaper solution.
- Agentic AI patterns: Know when to use single-agent vs. multi-agent patterns, when to add tool integrations vs. keeping the workflow model-only, and how to implement proper stopping conditions. The exam tests Strands Agents and AWS Agent Squad specifically.
- Token economics: Questions about cost optimization require you to understand how token pricing works (input vs. output tokens, context window costs), when provisioned throughput makes financial sense vs. on-demand, and how semantic caching reduces costs.
- Guardrails configuration: The safety domain requires specific knowledge of how Bedrock Guardrails work: content filters, denied topics, word filters, PII redaction, and grounding checks. Generic knowledge of “AI safety” is not enough.
- Multiple response questions: Unlike multiple choice, multiple response questions require you to select all correct answers. Partial credit is not given. If a question asks you to select three, you need all three correct or you get zero for that question.
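The provisioned-vs-on-demand decision above is ultimately break-even arithmetic: metered token cost against a flat hourly rate. A sketch with placeholder numbers; none of these prices or volumes are real AWS pricing:

```python
def monthly_on_demand_cost(requests, in_tokens, out_tokens,
                           price_in_per_1k, price_out_per_1k):
    """On-demand cost scales with token volume (input and output priced
    separately); provisioned throughput is a flat hourly rate instead."""
    return requests * (in_tokens / 1000 * price_in_per_1k +
                       out_tokens / 1000 * price_out_per_1k)

# Hypothetical workload: 2M requests/month, 1.5k input + 0.5k output tokens each.
on_demand = monthly_on_demand_cost(2_000_000, 1500, 500, 0.003, 0.015)
provisioned = 40.0 * 24 * 30   # assumed $40/hour for one model unit, always on
cheaper = "provisioned" if provisioned < on_demand else "on-demand"
```

The crossover moves with volume: double the request rate in this sketch and provisioned throughput becomes the cheaper option, which is exactly the kind of flip the exam scenarios hinge on.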
Study Strategy
Allocate study time proportional to domain weights. Domain 1 at 31% should get roughly a third of your total preparation time. Domains 4 and 5, at 12% and 11%, are smaller but still represent nearly a quarter of the exam combined.
- Weeks 1-3: Domain 1. Foundation model selection, vector stores, RAG architectures, prompt engineering and governance. This is your largest investment. Get hands-on time with Bedrock Knowledge Bases and OpenSearch.
- Weeks 3-5: Domain 2. Agentic AI, deployment strategies, enterprise integration. Build something with Bedrock Agents if you can. Understanding the agent lifecycle (planning, tool use, iteration) matters more than memorizing API signatures.
- Weeks 5-6: Domain 3. Safety, security, governance. Read the Bedrock Guardrails documentation thoroughly. Set up guardrails with content filters and test them against adversarial inputs. This domain rewards specific knowledge over general principles.
- Week 7: Domains 4 and 5. Cost optimization, monitoring, evaluation, troubleshooting. These are practical topics that benefit from real experience. If you have run GenAI workloads in production, this will feel familiar.
- Week 8: Full-length practice exams under timed conditions. Review every wrong answer. Pay attention to which domain is costing you points and go back to the source material.
The official AWS Skill Builder has a free exam prep course for the AIP-C01. Use it. The AWS Well-Architected Framework Generative AI Lens is also worth reading for Domains 1 and 2.
TechPrep AWS GenAI Developer
2,500+ practice questions across all 5 AIP-C01 domains. Confidence calibration catches the topics where you think you know the answer but are getting it wrong. Spaced repetition keeps Bedrock APIs, guardrail configurations, and RAG patterns locked in. 50 questions free, no account required.