Lesson 12: getting started with Vertex AI and ADK

Introduction

Over the first eleven lessons, we built a strong foundation in how agents work - reasoning, tools, memory, planning, multi-agent systems, RAG, evaluation, safety, and production operations. All of those concepts are platform-agnostic.

Now we shift to practice. This lesson maps those concepts to the specific tools and services available on Google Cloud. The goal is to give you a clear picture of what exists, how the pieces fit together, and which service to reach for when you are building a particular type of agent.

ELI5: Think of the Google Cloud AI stack like a workshop

Imagine you are setting up a woodworking workshop. You need raw materials (wood, metal), power tools (saws, drills), a workbench (a stable surface to build on), safety equipment (goggles, gloves), and a space to display or sell your finished products.

The Google Cloud AI stack works similarly:

You can use these pieces independently or together. Not every project needs every tool.

Key takeaway: Google Cloud provides a full stack for building agents, from models to managed runtime. Understanding which piece does what helps you pick the right tool for each job.


The Google Cloud AI stack for agents

Here is a map of the major components and how they relate to each other:

+------------------------------------------------------------------+
|                        Your Agent Application                     |
+------------------------------------------------------------------+
|                                                                    |
|  +--------------------+    +----------------------------------+   |
|  |  Agent Development |    |  Agent Engine                    |   |
|  |  Kit (ADK)         |    |  (Managed Runtime)               |   |
|  |                    |    |                                  |   |
|  |  - Build agents    |    |  - Deploy and run agents         |   |
|  |  - Define tools    |--->|  - Session management            |   |
|  |  - Orchestration   |    |  - Scaling and monitoring        |   |
|  +--------------------+    +----------------------------------+   |
|           |                            |                          |
|           v                            v                          |
|  +----------------------------------------------------+          |
|  |              Vertex AI Platform                     |          |
|  |                                                     |          |
|  |  - Model hosting (Gemini, partner models)           |          |
|  |  - Evaluation tools                                 |          |
|  |  - Context caching                                  |          |
|  |  - Grounding with Google Search                     |          |
|  +----------------------------------------------------+          |
|           |                            |                          |
|           v                            v                          |
|  +----------------------+    +------------------------+           |
|  | Vertex AI Search &   |    |  Model Armor           |           |
|  | RAG Engine           |    |  (Safety & Guardrails) |           |
|  +----------------------+    +------------------------+           |
|                                                                    |
+------------------------------------------------------------------+
|                     Gemini Models                                  |
|  Pro (complex reasoning) | Flash (balanced) | Flash-Lite (volume) |
+------------------------------------------------------------------+

Let’s walk through each component.


Gemini Models

Gemini is Google’s family of multimodal AI models. For agent development, you will primarily work with three tiers:

Model Best For Characteristics
Gemini Pro Complex reasoning, multi-step planning, nuanced decisions Highest capability, higher latency, higher cost
Gemini Flash Balanced tasks - tool use, summarization, conversation Good capability, fast, moderate cost
Gemini Flash-Lite High-volume, simpler tasks - classification, routing, extraction Fast, lowest cost, good for high-throughput use cases

Choosing the right model

Think of model selection like choosing the right vehicle for a trip:

Most production agents use multiple model tiers through model routing (covered in Lesson 11). Use Flash-Lite for the simple steps, Flash for the core logic, and Pro only when the task genuinely requires it.

Multimodal capabilities

Gemini models can process text, images, audio, and video. This means your agents can:

This is a significant advantage over text-only models because it lets you build agents that interact with the real world in richer ways.

For model details and capabilities, see the Gemini model documentation.


Vertex AI Platform

Vertex AI is Google Cloud’s machine learning platform. For agent developers, the most relevant capabilities are:

Model hosting and API access

Vertex AI provides API access to Gemini models along with partner and open-source models. You get:

Evaluation tools

Vertex AI includes evaluation capabilities that align with what we covered in Lesson 9:

These tools integrate with your CI/CD pipeline for evaluation-gated deployment (Lesson 11).

Model garden

The Vertex AI Model Garden provides access to a wide range of models beyond Gemini - including open-source models and models from partner companies. This is useful when you need specialized models for particular tasks or want to compare different options.

For the full platform overview, see the Vertex AI documentation.


Agent Development Kit (ADK)

ADK is Google’s open-source, code-first framework for building AI agents. If Vertex AI is the workbench, ADK is the set of power tools you use to actually build things.

Key characteristics

Feature Detail
Open source Available on GitHub, MIT licensed
Multi-language Python, TypeScript/JavaScript, Go, Java
Model-agnostic Works with Gemini, but also supports other LLMs
Deployment-agnostic Run locally, on Agent Engine, on Cloud Run, on any container platform
Opinionated but flexible Provides structure without locking you in

Why a framework?

You could build agents by calling the Gemini API directly - writing your own tool-calling loop, managing conversation state, and handling orchestration. ADK saves you from reinventing these common patterns:

Think of it like the difference between writing raw HTTP handlers and using a web framework. You can do either, but the framework handles the boilerplate so you can focus on your agent’s unique logic.

ADK core concepts

ADK organizes agents into three categories:

1. LLM agents

These are agents powered by a language model that can reason, plan, and decide which tools to call. This is the most common type and corresponds to the ReAct-style agents we covered in Lesson 4.

from google.adk.agents import Agent
from google.adk.tools import FunctionTool

# Define a tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # In practice, this would call a weather API
    return f"The weather in {city} is sunny, 72F."

# Create an agent
weather_agent = Agent(
    name="weather_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful weather assistant. Use the get_weather "
                "tool to answer questions about weather conditions.",
    tools=[get_weather],
)

2. workflow agents

These agents follow predefined orchestration patterns rather than relying on the LLM to decide the flow. ADK provides three built-in workflow types:

Workflow Type How It Works Use When
SequentialAgent Runs sub-agents one after another in a fixed order Steps must happen in sequence (e.g., validate -> process -> respond)
ParallelAgent Runs sub-agents simultaneously Steps are independent and can happen at the same time (e.g., search multiple sources)
LoopAgent Runs a sub-agent repeatedly until a condition is met Iterative refinement (e.g., generate -> evaluate -> improve)
from google.adk.agents import SequentialAgent

# A pipeline that validates input, processes it, and formats the response
pipeline = SequentialAgent(
    name="order_pipeline",
    sub_agents=[
        input_validator_agent,
        order_processor_agent,
        response_formatter_agent,
    ],
)

Workflow agents map directly to the agentic design patterns from Lesson 4. Sequential corresponds to pipeline patterns. Parallel corresponds to fan-out/fan-in. Loop corresponds to reflection and iterative refinement.

3. custom agents

For orchestration patterns that do not fit the built-in types, you can create custom agents by subclassing the base agent class and implementing your own control flow.

ADK tools ecosystem

One of ADK’s biggest strengths is its tools ecosystem. Tools are how agents interact with the outside world (Lesson 3), and ADK provides several ways to define them:

Tool Type What It Is Example
Function Tools Plain Python/JS functions decorated as tools A function that queries your database
MCP Tools Tools from Model Context Protocol servers Connect to any MCP-compatible tool server
OpenAPI Tools Auto-generated from OpenAPI/Swagger specs Wrap any REST API as agent tools
Built-in Tools Pre-built integrations provided by ADK Google Search, code execution, RAG

ADK includes 60+ pre-built tool integrations, covering common needs like:

from google.adk.tools import FunctionTool

# A simple function tool
def search_products(query: str, max_results: int = 5) -> list[dict]:
    """Search the product catalog.

    Args:
        query: The search query string.
        max_results: Maximum number of results to return.

    Returns:
        A list of matching products with name, price, and description.
    """
    # Your implementation here
    return product_database.search(query, limit=max_results)

# ADK automatically generates the tool schema from the function signature
# and docstring, so the LLM knows how to call it correctly.

ADK Skills

Skills are a newer ADK concept that packages agent capabilities as self-contained, reusable units. Think of them as “plugins” for agents.

Skills have three levels of increasing complexity:

Level What It Includes Example
L1 - Metadata Name, description, and tags that help the agent understand when to use the skill “This skill handles flight booking”
L2 - Instructions Detailed instructions for how the agent should use the skill Step-by-step guide for the booking flow
L3 - Resources Tools, data sources, and sub-agents that the skill needs Flight search API tool, airline database

Skills make it easier to share and compose agent capabilities across teams and projects.

For complete ADK documentation, see the ADK docs site.


Agent engine

Agent Engine is a managed runtime service on Google Cloud for deploying and running agents. If ADK is how you build agents, Agent Engine is how you run them in production without managing infrastructure.

What agent engine provides

Capability What It Does
Managed hosting Run your agent without provisioning servers or containers
Session management Built-in conversation state persistence
Scaling Automatic scaling based on traffic
Monitoring Integration with Cloud Monitoring and Cloud Logging
Security IAM-based access control, VPC Service Controls support

When to use Agent Engine vs. other deployment options

Deployment Option Best For
Agent Engine Production agents where you want managed infrastructure and do not want to handle scaling, session management, or deployment yourself
Cloud Run Agents that need custom runtime environments, specific dependencies, or more control over the container
GKE (Kubernetes) Agents that are part of a larger microservices architecture already running on Kubernetes
Local / Self-hosted Development, testing, or when you cannot use cloud services

ADK agents can be deployed to any of these targets. Agent Engine is the most managed option - you give it your agent code and it handles the rest.

For deployment details, see the Agent Engine documentation.


Vertex AI Search And RAG engine

In Lesson 8, we covered how agents can use retrieval-augmented generation (RAG) to access your data. Vertex AI provides managed services for this:

A managed search service that can index and search across:

It handles chunking, embedding, indexing, and retrieval - the full RAG pipeline we discussed in Lesson 8 - as a managed service.

RAG Engine

RAG Engine provides a managed retrieval pipeline specifically designed for grounding LLM responses in your data:

The advantage of these managed services is that you do not have to run your own vector database, manage embeddings, or build retrieval pipelines. The tradeoff is less control over the details.

For RAG capabilities, see the RAG Engine overview.


Model Armor

Model Armor is Google Cloud’s managed guardrails service (we covered guardrails in depth in Lesson 10). It provides:

Feature What It Does
Prompt screening Detect and block harmful or adversarial prompts before they reach the model
Response filtering Screen model outputs for harmful, toxic, or inappropriate content
Prompt injection detection Identify attempts to override model instructions
Configurable policies Set your own thresholds for different content categories
Integration Works with Vertex AI endpoints and can be added to any generative AI application

Model Armor gives you a production-ready Layer 2 defense (from the defense-in-depth model in Lesson 10) without building content filtering from scratch.


Quick setup guide

Here is how to get started with ADK and Google Cloud for agent development.

Prerequisites

Step 1: install ADK

pip install google-adk

Step 2: configure authentication

You have two options for authenticating:

Option A: API Key (simplest for getting started)

export GOOGLE_API_KEY="your-api-key-here"

You can get an API key from Google AI Studio.

Option B: Google Cloud project (for production and Vertex AI features)

# Install the Google Cloud CLI
# https://cloud.google.com/sdk/docs/install

# Authenticate
gcloud auth application-default login

# Set your project
gcloud config set project YOUR_PROJECT_ID

Step 3: Create your first agent

Create a file called agent.py:

from google.adk.agents import Agent

root_agent = Agent(
    name="greeting_agent",
    model="gemini-2.0-flash",
    instruction="You are a friendly assistant that greets users "
                "and answers basic questions.",
)

Step 4: run it locally

adk web

This starts a local web interface where you can chat with your agent and inspect its behavior. The ADK dev UI shows you the agent’s reasoning steps, tool calls, and state - which is invaluable for debugging.

Step 5: Add tools and complexity

From here, you can add tools, create multi-agent systems, integrate RAG, and eventually deploy to Agent Engine or Cloud Run. The ADK getting started guide walks through these steps in detail.


Decision tree: which service do i use?

When building an agent, use this decision tree to pick the right Google Cloud services:

"I want to build..."
    |
    +-- "A simple chatbot with no tools"
    |       --> Gemini API directly (no framework needed)
    |
    +-- "An agent with tools and reasoning"
    |       --> ADK + Gemini Flash
    |       |
    |       +-- "...and I need it in production"
    |               --> Deploy to Agent Engine or Cloud Run
    |
    +-- "An agent that searches my documents"
    |       --> ADK + Vertex AI Search or RAG Engine
    |
    +-- "A multi-agent system"
    |       --> ADK (multi-agent orchestration built in)
    |
    +-- "An agent with strict safety requirements"
    |       --> ADK + Model Armor + custom guardrails
    |
    +-- "A high-volume, cost-sensitive application"
    |       --> Model routing (Flash-Lite for simple tasks,
    |           Flash for complex) + context caching
    |
    +-- "An agent that needs to use external APIs"
            --> ADK with OpenAPI tools or MCP tools

Quick reference table

I Need… Use…
An LLM to call Gemini models via Vertex AI or AI Studio
A framework to build agents Agent Development Kit (ADK)
Managed agent hosting Agent Engine
Custom container hosting Cloud Run or GKE
Document search / RAG Vertex AI Search or RAG Engine
Content safety guardrails Model Armor
Model evaluation Vertex AI Evaluation tools
Prompt management Vertex AI prompt management
Interop with external tools MCP tools in ADK
Interop with other agents A2A protocol (covered in Lesson 14)

How the pieces connect: a full example

Here is how a typical production agent uses multiple Google Cloud services together:

User asks: "What is the return policy for my recent order?"

1. [ADK Agent] receives the request
       |
2. [Model Armor] screens the input for safety
       |
3. [Gemini Flash] reasons about the request:
       "I need to look up the order and find the return policy"
       |
4. [ADK Tool: Order Lookup] calls your order database
       |
5. [RAG Engine] searches your policy documents
       for the relevant return policy
       |
6. [Gemini Flash] synthesizes a response from
       the order details and policy documents
       |
7. [Model Armor] screens the output for safety
       |
8. [Agent Engine] manages the session state
       and returns the response to the user

Each Google Cloud service handles one part of the puzzle. ADK orchestrates the flow. Gemini provides the reasoning. RAG Engine provides the knowledge. Model Armor provides the safety. Agent Engine provides the runtime.


Where each lesson concept maps to Google Cloud

Here is a reference connecting the concepts from earlier lessons to specific Google Cloud services:

Lesson Concept Google Cloud Service
2 - How Agents Think LLM reasoning Gemini models
3 - Tools Function calling ADK Function Tools, MCP Tools, OpenAPI Tools
4 - Design Patterns Orchestration ADK Sequential/Parallel/Loop Agents
5 - Memory Session state, long-term memory ADK session management, Agent Engine
6 - Planning Multi-step reasoning Gemini Pro for complex planning
7 - Multi-Agent Agent coordination ADK multi-agent support
8 - RAG Knowledge retrieval Vertex AI Search, RAG Engine
9 - Evaluation Testing agents Vertex AI Evaluation
10 - Safety Guardrails Model Armor
11 - Production Deployment, CI/CD Agent Engine, Cloud Run, Agent Starter Pack

Key takeaways

  1. Google Cloud provides a full stack for agents. From Gemini models at the base to Agent Engine at the top, you can build and deploy complete agent systems.

  2. ADK is the code-first framework. It is open-source, multi-language, model-agnostic, and deployment-agnostic. It handles the common patterns (tool calling, orchestration, state management) so you can focus on your agent’s logic.

  3. Pick the right model tier. Use Flash-Lite for simple tasks, Flash for most agent work, and Pro for complex reasoning. Model routing across tiers is a key cost optimization strategy.

  4. Managed services reduce operational burden. Agent Engine, RAG Engine, and Model Armor handle infrastructure and operations so you can focus on building. The tradeoff is less control over implementation details.

  5. Everything is composable. You can use ADK without Agent Engine, use Vertex AI without ADK, or use the full stack together. Start with what you need and add services as your requirements grow.

  6. ADK agents map directly to the concepts in this course. LLM Agents for reasoning, Workflow Agents for orchestration patterns, tools for external interaction, and skills for reusable capabilities.


Further reading


Next lesson: Building Your First Agent