Over the first eleven lessons, we built a strong foundation in how agents work - reasoning, tools, memory, planning, multi-agent systems, RAG, evaluation, safety, and production operations. All of those concepts are platform-agnostic.
Now we shift to practice. This lesson maps those concepts to the specific tools and services available on Google Cloud. The goal is to give you a clear picture of what exists, how the pieces fit together, and which service to reach for when you are building a particular type of agent.
Imagine you are setting up a woodworking workshop. You need raw materials (wood, metal), power tools (saws, drills), a workbench (a stable surface to build on), safety equipment (goggles, gloves), and a space to display or sell your finished products.
The Google Cloud AI stack works similarly:
You can use these pieces independently or together. Not every project needs every tool.
Key takeaway: Google Cloud provides a full stack for building agents, from models to managed runtime. Understanding which piece does what helps you pick the right tool for each job.
Here is a map of the major components and how they relate to each other:
+------------------------------------------------------------------+
| Your Agent Application |
+------------------------------------------------------------------+
| |
| +--------------------+ +----------------------------------+ |
| | Agent Development | | Agent Engine | |
| | Kit (ADK) | | (Managed Runtime) | |
| | | | | |
| | - Build agents | | - Deploy and run agents | |
| | - Define tools |--->| - Session management | |
| | - Orchestration | | - Scaling and monitoring | |
| +--------------------+ +----------------------------------+ |
| | | |
| v v |
| +----------------------------------------------------+ |
| | Vertex AI Platform | |
| | | |
| | - Model hosting (Gemini, partner models) | |
| | - Evaluation tools | |
| | - Context caching | |
| | - Grounding with Google Search | |
| +----------------------------------------------------+ |
| | | |
| v v |
| +----------------------+ +------------------------+ |
| | Vertex AI Search & | | Model Armor | |
| | RAG Engine | | (Safety & Guardrails) | |
| +----------------------+ +------------------------+ |
| |
+------------------------------------------------------------------+
| Gemini Models |
| Pro (complex reasoning) | Flash (balanced) | Flash-Lite (volume) |
+------------------------------------------------------------------+
Let’s walk through each component.
Gemini is Google’s family of multimodal AI models. For agent development, you will primarily work with three tiers:
| Model | Best For | Characteristics |
|---|---|---|
| Gemini Pro | Complex reasoning, multi-step planning, nuanced decisions | Highest capability, higher latency, higher cost |
| Gemini Flash | Balanced tasks - tool use, summarization, conversation | Good capability, fast, moderate cost |
| Gemini Flash-Lite | High-volume, simpler tasks - classification, routing, extraction | Fast, lowest cost, good for high-throughput use cases |
Think of model selection like choosing the right vehicle for a trip:
Most production agents use multiple model tiers through model routing (covered in Lesson 11). Use Flash-Lite for the simple steps, Flash for the core logic, and Pro only when the task genuinely requires it.
Gemini models can process text, images, audio, and video. This means your agents can:
This is a significant advantage over text-only models because it lets you build agents that interact with the real world in richer ways.
For model details and capabilities, see the Gemini model documentation.
Vertex AI is Google Cloud’s machine learning platform. For agent developers, the most relevant capabilities are:
Vertex AI provides API access to Gemini models along with partner and open-source models. You get:
Vertex AI includes evaluation capabilities that align with what we covered in Lesson 9:
These tools integrate with your CI/CD pipeline for evaluation-gated deployment (Lesson 11).
The Vertex AI Model Garden provides access to a wide range of models beyond Gemini - including open-source models and models from partner companies. This is useful when you need specialized models for particular tasks or want to compare different options.
For the full platform overview, see the Vertex AI documentation.
ADK is Google’s open-source, code-first framework for building AI agents. If Vertex AI is the workbench, ADK is the set of power tools you use to actually build things.
| Feature | Detail |
|---|---|
| Open source | Available on GitHub, MIT licensed |
| Multi-language | Python, TypeScript/JavaScript, Go, Java |
| Model-agnostic | Works with Gemini, but also supports other LLMs |
| Deployment-agnostic | Run locally, on Agent Engine, on Cloud Run, on any container platform |
| Opinionated but flexible | Provides structure without locking you in |
You could build agents by calling the Gemini API directly - writing your own tool-calling loop, managing conversation state, and handling orchestration. ADK saves you from reinventing these common patterns:
Think of it like the difference between writing raw HTTP handlers and using a web framework. You can do either, but the framework handles the boilerplate so you can focus on your agent’s unique logic.
ADK organizes agents into three categories:
These are agents powered by a language model that can reason, plan, and decide which tools to call. This is the most common type and corresponds to the ReAct-style agents we covered in Lesson 4.
from google.adk.agents import Agent
from google.adk.tools import FunctionTool
# Define a tool
def get_weather(city: str) -> str:
"""Get the current weather for a city."""
# In practice, this would call a weather API
return f"The weather in {city} is sunny, 72F."
# Create an agent
weather_agent = Agent(
name="weather_agent",
model="gemini-2.0-flash",
instruction="You are a helpful weather assistant. Use the get_weather "
"tool to answer questions about weather conditions.",
tools=[get_weather],
)
These agents follow predefined orchestration patterns rather than relying on the LLM to decide the flow. ADK provides three built-in workflow types:
| Workflow Type | How It Works | Use When |
|---|---|---|
| SequentialAgent | Runs sub-agents one after another in a fixed order | Steps must happen in sequence (e.g., validate -> process -> respond) |
| ParallelAgent | Runs sub-agents simultaneously | Steps are independent and can happen at the same time (e.g., search multiple sources) |
| LoopAgent | Runs a sub-agent repeatedly until a condition is met | Iterative refinement (e.g., generate -> evaluate -> improve) |
from google.adk.agents import SequentialAgent
# A pipeline that validates input, processes it, and formats the response
pipeline = SequentialAgent(
name="order_pipeline",
sub_agents=[
input_validator_agent,
order_processor_agent,
response_formatter_agent,
],
)
Workflow agents map directly to the agentic design patterns from Lesson 4. Sequential corresponds to pipeline patterns. Parallel corresponds to fan-out/fan-in. Loop corresponds to reflection and iterative refinement.
For orchestration patterns that do not fit the built-in types, you can create custom agents by subclassing the base agent class and implementing your own control flow.
One of ADK’s biggest strengths is its tools ecosystem. Tools are how agents interact with the outside world (Lesson 3), and ADK provides several ways to define them:
| Tool Type | What It Is | Example |
|---|---|---|
| Function Tools | Plain Python/JS functions decorated as tools | A function that queries your database |
| MCP Tools | Tools from Model Context Protocol servers | Connect to any MCP-compatible tool server |
| OpenAPI Tools | Auto-generated from OpenAPI/Swagger specs | Wrap any REST API as agent tools |
| Built-in Tools | Pre-built integrations provided by ADK | Google Search, code execution, RAG |
ADK includes 60+ pre-built tool integrations, covering common needs like:
from google.adk.tools import FunctionTool
# A simple function tool
def search_products(query: str, max_results: int = 5) -> list[dict]:
"""Search the product catalog.
Args:
query: The search query string.
max_results: Maximum number of results to return.
Returns:
A list of matching products with name, price, and description.
"""
# Your implementation here
return product_database.search(query, limit=max_results)
# ADK automatically generates the tool schema from the function signature
# and docstring, so the LLM knows how to call it correctly.
Skills are a newer ADK concept that packages agent capabilities as self-contained, reusable units. Think of them as “plugins” for agents.
Skills have three levels of increasing complexity:
| Level | What It Includes | Example |
|---|---|---|
| L1 - Metadata | Name, description, and tags that help the agent understand when to use the skill | “This skill handles flight booking” |
| L2 - Instructions | Detailed instructions for how the agent should use the skill | Step-by-step guide for the booking flow |
| L3 - Resources | Tools, data sources, and sub-agents that the skill needs | Flight search API tool, airline database |
Skills make it easier to share and compose agent capabilities across teams and projects.
For complete ADK documentation, see the ADK docs site.
Agent Engine is a managed runtime service on Google Cloud for deploying and running agents. If ADK is how you build agents, Agent Engine is how you run them in production without managing infrastructure.
| Capability | What It Does |
|---|---|
| Managed hosting | Run your agent without provisioning servers or containers |
| Session management | Built-in conversation state persistence |
| Scaling | Automatic scaling based on traffic |
| Monitoring | Integration with Cloud Monitoring and Cloud Logging |
| Security | IAM-based access control, VPC Service Controls support |
| Deployment Option | Best For |
|---|---|
| Agent Engine | Production agents where you want managed infrastructure and do not want to handle scaling, session management, or deployment yourself |
| Cloud Run | Agents that need custom runtime environments, specific dependencies, or more control over the container |
| GKE (Kubernetes) | Agents that are part of a larger microservices architecture already running on Kubernetes |
| Local / Self-hosted | Development, testing, or when you cannot use cloud services |
ADK agents can be deployed to any of these targets. Agent Engine is the most managed option - you give it your agent code and it handles the rest.
For deployment details, see the Agent Engine documentation.
In Lesson 8, we covered how agents can use retrieval-augmented generation (RAG) to access your data. Vertex AI provides managed services for this:
A managed search service that can index and search across:
It handles chunking, embedding, indexing, and retrieval - the full RAG pipeline we discussed in Lesson 8 - as a managed service.
RAG Engine provides a managed retrieval pipeline specifically designed for grounding LLM responses in your data:
The advantage of these managed services is that you do not have to run your own vector database, manage embeddings, or build retrieval pipelines. The tradeoff is less control over the details.
For RAG capabilities, see the RAG Engine overview.
Model Armor is Google Cloud’s managed guardrails service (we covered guardrails in depth in Lesson 10). It provides:
| Feature | What It Does |
|---|---|
| Prompt screening | Detect and block harmful or adversarial prompts before they reach the model |
| Response filtering | Screen model outputs for harmful, toxic, or inappropriate content |
| Prompt injection detection | Identify attempts to override model instructions |
| Configurable policies | Set your own thresholds for different content categories |
| Integration | Works with Vertex AI endpoints and can be added to any generative AI application |
Model Armor gives you a production-ready Layer 2 defense (from the defense-in-depth model in Lesson 10) without building content filtering from scratch.
Here is how to get started with ADK and Google Cloud for agent development.
pip install google-adk
You have two options for authenticating:
Option A: API Key (simplest for getting started)
export GOOGLE_API_KEY="your-api-key-here"
You can get an API key from Google AI Studio.
Option B: Google Cloud project (for production and Vertex AI features)
# Install the Google Cloud CLI
# https://cloud.google.com/sdk/docs/install
# Authenticate
gcloud auth application-default login
# Set your project
gcloud config set project YOUR_PROJECT_ID
Create a file called agent.py:
from google.adk.agents import Agent
root_agent = Agent(
name="greeting_agent",
model="gemini-2.0-flash",
instruction="You are a friendly assistant that greets users "
"and answers basic questions.",
)
adk web
This starts a local web interface where you can chat with your agent and inspect its behavior. The ADK dev UI shows you the agent’s reasoning steps, tool calls, and state - which is invaluable for debugging.
From here, you can add tools, create multi-agent systems, integrate RAG, and eventually deploy to Agent Engine or Cloud Run. The ADK getting started guide walks through these steps in detail.
When building an agent, use this decision tree to pick the right Google Cloud services:
"I want to build..."
|
+-- "A simple chatbot with no tools"
| --> Gemini API directly (no framework needed)
|
+-- "An agent with tools and reasoning"
| --> ADK + Gemini Flash
| |
| +-- "...and I need it in production"
| --> Deploy to Agent Engine or Cloud Run
|
+-- "An agent that searches my documents"
| --> ADK + Vertex AI Search or RAG Engine
|
+-- "A multi-agent system"
| --> ADK (multi-agent orchestration built in)
|
+-- "An agent with strict safety requirements"
| --> ADK + Model Armor + custom guardrails
|
+-- "A high-volume, cost-sensitive application"
| --> Model routing (Flash-Lite for simple tasks,
| Flash for complex) + context caching
|
+-- "An agent that needs to use external APIs"
--> ADK with OpenAPI tools or MCP tools
| I Need… | Use… |
|---|---|
| An LLM to call | Gemini models via Vertex AI or AI Studio |
| A framework to build agents | Agent Development Kit (ADK) |
| Managed agent hosting | Agent Engine |
| Custom container hosting | Cloud Run or GKE |
| Document search / RAG | Vertex AI Search or RAG Engine |
| Content safety guardrails | Model Armor |
| Model evaluation | Vertex AI Evaluation tools |
| Prompt management | Vertex AI prompt management |
| Interop with external tools | MCP tools in ADK |
| Interop with other agents | A2A protocol (covered in Lesson 14) |
Here is how a typical production agent uses multiple Google Cloud services together:
User asks: "What is the return policy for my recent order?"
1. [ADK Agent] receives the request
|
2. [Model Armor] screens the input for safety
|
3. [Gemini Flash] reasons about the request:
"I need to look up the order and find the return policy"
|
4. [ADK Tool: Order Lookup] calls your order database
|
5. [RAG Engine] searches your policy documents
for the relevant return policy
|
6. [Gemini Flash] synthesizes a response from
the order details and policy documents
|
7. [Model Armor] screens the output for safety
|
8. [Agent Engine] manages the session state
and returns the response to the user
Each Google Cloud service handles one part of the puzzle. ADK orchestrates the flow. Gemini provides the reasoning. RAG Engine provides the knowledge. Model Armor provides the safety. Agent Engine provides the runtime.
Here is a reference connecting the concepts from earlier lessons to specific Google Cloud services:
| Lesson | Concept | Google Cloud Service |
|---|---|---|
| 2 - How Agents Think | LLM reasoning | Gemini models |
| 3 - Tools | Function calling | ADK Function Tools, MCP Tools, OpenAPI Tools |
| 4 - Design Patterns | Orchestration | ADK Sequential/Parallel/Loop Agents |
| 5 - Memory | Session state, long-term memory | ADK session management, Agent Engine |
| 6 - Planning | Multi-step reasoning | Gemini Pro for complex planning |
| 7 - Multi-Agent | Agent coordination | ADK multi-agent support |
| 8 - RAG | Knowledge retrieval | Vertex AI Search, RAG Engine |
| 9 - Evaluation | Testing agents | Vertex AI Evaluation |
| 10 - Safety | Guardrails | Model Armor |
| 11 - Production | Deployment, CI/CD | Agent Engine, Cloud Run, Agent Starter Pack |
Google Cloud provides a full stack for agents. From Gemini models at the base to Agent Engine at the top, you can build and deploy complete agent systems.
ADK is the code-first framework. It is open-source, multi-language, model-agnostic, and deployment-agnostic. It handles the common patterns (tool calling, orchestration, state management) so you can focus on your agent’s logic.
Pick the right model tier. Use Flash-Lite for simple tasks, Flash for most agent work, and Pro for complex reasoning. Model routing across tiers is a key cost optimization strategy.
Managed services reduce operational burden. Agent Engine, RAG Engine, and Model Armor handle infrastructure and operations so you can focus on building. The tradeoff is less control over implementation details.
Everything is composable. You can use ADK without Agent Engine, use Vertex AI without ADK, or use the full stack together. Start with what you need and add services as your requirements grow.
ADK agents map directly to the concepts in this course. LLM Agents for reasoning, Workflow Agents for orchestration patterns, tools for external interaction, and skills for reusable capabilities.
Next lesson: Building Your First Agent