In Lesson 3, we learned about tools - functions that let agents take actions like calling APIs, querying databases, and running code. Tools are about doing things.
Skills are about knowing things. A skill packages domain expertise - instructions, best practices, decision frameworks, and reference materials - into a modular unit that an agent can discover and use when needed.
Think about the difference between giving someone a wrench (a tool) and giving them a repair manual (a skill). The wrench lets them turn bolts. The manual tells them which bolts to turn, in what order, and what to watch out for.
A professional kitchen has tools (knives, pans, ovens) and recipe cards. A new chef can pick up a knife without instructions. But to make a specific dish, they need the recipe card - it tells them which tools to use, in what order, at what temperature, and what the result should look like.
Agent skills work the same way. They are the recipe cards that tell an agent how to approach a specific type of task, which tools to use, and what good output looks like.
Key takeaway: Skills encode domain expertise as portable, reusable packages. Tools let agents act. Skills tell agents how and when to act.
Consider this scenario: your team has an agent that helps with code reviews. You want it to follow your team’s specific review checklist, flag common patterns you care about, and format its feedback in a particular way.
You could put all of this in the agent’s system prompt. But system prompts get crowded fast. If you add review instructions, deployment procedures, documentation standards, and testing conventions all into one prompt, you end up with a bloated context window and an agent that is mediocre at everything.
Skills solve this by letting you:

- Package each area of expertise as a separate, self-contained unit
- Load a skill's full instructions only when the current task needs them
- Reuse and share the same skill across agents and platforms
This is the key technical motivation. Every token in the context window has a cost - both in money and in attention. If you load 50,000 tokens of instructions at startup, the agent pays that cost on every single turn, even when most of those instructions are irrelevant.
Skills use progressive disclosure to keep this cost low: only lightweight metadata loads at startup, and the full instructions load on demand. One published analysis found that this approach reduced a 150,000-token workflow to roughly 2,000 tokens at startup.
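The arithmetic behind that kind of saving is easy to sketch. The figures below are illustrative, using the rough per-level costs discussed later in this lesson:

```python
# Illustrative token-cost comparison (figures are examples, not measurements).

SKILLS = 50                # installed skills
METADATA_TOKENS = 100      # name + description per skill (Level 1)
FULL_SKILL_TOKENS = 3000   # full instructions per skill (Level 2, midpoint)

# Eager loading: every skill's full instructions in context on every turn.
eager_cost = SKILLS * FULL_SKILL_TOKENS

# Progressive disclosure: metadata for all skills, full text for one.
progressive_cost = SKILLS * METADATA_TOKENS + FULL_SKILL_TOKENS

print(eager_cost)        # 150000
print(progressive_cost)  # 8000
```

The eager cost is paid on every turn; the progressive cost grows only when a skill is actually activated.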
Skills follow an open specification maintained at agentskills.io. The format is simple:
```
my-skill/
  SKILL.md      # Required: metadata + instructions
  references/   # Optional: additional documentation
  assets/       # Optional: templates, schemas, data files
  scripts/      # Optional: executable code
```
The only required file is SKILL.md. Everything else is optional.
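That layout is easy to check mechanically. A minimal validation sketch (the function name is invented) might look like:

```python
from pathlib import Path

def validate_skill(skill_dir: str) -> list[str]:
    """Check a skill directory against the layout above.

    Only SKILL.md is required; references/, assets/, and scripts/
    are optional. Returns a list of problems (empty = valid).
    """
    root = Path(skill_dir)
    problems = []
    if not (root / "SKILL.md").is_file():
        problems.append("missing required SKILL.md")
    for optional in ("references", "assets", "scripts"):
        path = root / optional
        if path.exists() and not path.is_dir():
            problems.append(f"{optional} exists but is not a directory")
    return problems
```

For example, `validate_skill("skills/code-review")` returns an empty list when the directory is well-formed.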
A SKILL.md file has two parts: YAML frontmatter for metadata, and Markdown content for instructions.
```markdown
---
name: code-review
description: >
  Reviews pull requests following team standards. Checks for
  security issues, test coverage, naming conventions, and
  documentation. Use when asked to review code or a PR.
---

## Code Review Process

When reviewing code, follow these steps in order:

### 1. Security check

- Look for hardcoded secrets, SQL injection, XSS vulnerabilities
- Check that user input is validated and sanitized
- Verify authentication and authorization on new endpoints

### 2. Test coverage

- New public functions should have tests
- Edge cases should be covered (empty input, null values, errors)
- Check that tests actually assert meaningful behavior

### 3. Naming and structure

- Functions and variables should have descriptive names
- Files should be in the correct directory per project conventions
- No single function should exceed 50 lines

### 4. Documentation

- Public APIs should have docstrings
- Non-obvious logic should have inline comments
- README should be updated if behavior changes

### Output format

Present findings as a list grouped by category (Security, Tests,
Style, Docs). For each finding, include the file path, line number,
severity (high/medium/low), and a suggested fix.
```
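To make that output format concrete, here is a rough sketch of code that would produce conforming output; the findings data and the `render` helper are invented for illustration:

```python
# Hypothetical findings in the shape the skill's output format asks for:
# grouped by category, with file path, line, severity, and a suggested fix.
findings = [
    {"category": "Security", "path": "src/api/auth.py", "line": 45,
     "severity": "high", "issue": "password compared with ==",
     "fix": "use hmac.compare_digest()"},
    {"category": "Tests", "path": "src/api/auth.py", "line": 12,
     "severity": "medium", "issue": "no test for empty password",
     "fix": "add an edge-case test"},
]

def render(findings):
    """Group findings by category, in the order the skill specifies."""
    lines = []
    for category in ("Security", "Tests", "Style", "Docs"):
        group = [f for f in findings if f["category"] == category]
        if not group:
            continue
        lines.append(f"## {category}")
        for f in group:
            lines.append(f"- {f['path']}:{f['line']} [{f['severity']}] "
                         f"{f['issue']} -> {f['fix']}")
    return "\n".join(lines)

print(render(findings))
```

In practice the agent produces this format directly from the prose instructions; the code is just a precise restatement of the target.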
| Field | Required | Purpose |
|---|---|---|
| `name` | Yes | Unique identifier, lowercase with hyphens (e.g., `code-review`) |
| `description` | Yes | What the skill does and when to trigger it (up to 1024 chars) |
| `license` | No | License for the skill |
| `compatibility` | No | Environment requirements (e.g., "Requires Python 3.10+") |
| `metadata` | No | Arbitrary key-value pairs (author, version, tags) |
The description field is critical. It is the primary way agents decide whether to activate a skill. Write it to clearly describe both what the skill does and when it should be used.
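As a sketch of how a runtime might read that metadata without loading the whole skill into context, here is a minimal frontmatter reader using only the standard library (the function name is invented; real implementations would use a proper YAML parser):

```python
import re

def read_metadata(skill_md: str) -> dict:
    """Extract name and description from SKILL.md frontmatter.

    Simplified: assumes the frontmatter is delimited by --- lines and
    that description uses YAML's folded style (>), as in the example.
    """
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        raise ValueError("no YAML frontmatter found")
    front = match.group(1)
    name = re.search(r"^name:\s*(\S+)", front, re.MULTILINE).group(1)
    # Folded scalar: join the indented lines after "description: >".
    desc_lines = re.search(r"^description:\s*>\n((?:[ \t]+.*\n?)+)",
                           front, re.MULTILINE).group(1)
    description = " ".join(line.strip() for line in desc_lines.splitlines())
    return {"name": name, "description": description}
```

Only this small dictionary needs to stay in context at startup; the Markdown body below the frontmatter is left on disk until the skill is activated.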
Skills are designed to load incrementally. This is the key architectural idea that makes them practical:
Level 1 (metadata): At startup, the agent loads only the name and description from the frontmatter of every installed skill. This costs roughly 100 tokens per skill, so even with 50 skills installed, the startup cost is only about 5,000 tokens.
The agent uses this metadata to decide: “Given the current task, is this skill relevant?”
Level 2 (instructions): When the agent decides a skill is relevant, it loads the full SKILL.md body. This is where the step-by-step instructions, decision frameworks, and examples live. The recommendation is to keep this under 5,000 tokens.
Level 3 (resources): Files in the references/, assets/, and scripts/ directories are loaded only when the Level 2 instructions reference them. These might include:
- `references/security-checklist.md` - Extended security review criteria
- `assets/api-schema.json` - API specification for validation
- `assets/response-template.md` - Template for formatted output
- `scripts/run-linter.sh` - Script the agent can execute

This three-level approach means you can write very detailed skills without paying the context cost upfront.
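The three levels can be sketched as a small loader class. This is an illustrative sketch, not a real runtime's API; the class name and the crude frontmatter split are inventions:

```python
from pathlib import Path

class SkillIndex:
    """Sketch of three-level progressive loading for skills.

    Level 1: only frontmatter metadata is held in memory at startup.
    Level 2: the SKILL.md body is returned when a skill is activated.
    Level 3: files under references/, assets/, scripts/ are read only
    when the Level 2 instructions point at them.
    """

    def __init__(self, skills_dir: str):
        self.skills = {}
        for skill_md in Path(skills_dir).glob("*/SKILL.md"):
            text = skill_md.read_text()
            # Crude frontmatter split; real code would parse the YAML.
            _, front, body = text.split("---", 2)
            self.skills[skill_md.parent.name] = {
                "front": front.strip(),   # Level 1: always in context
                "body": body.strip(),     # Level 2: loaded on activation
                "dir": skill_md.parent,
            }

    def activate(self, name: str) -> str:
        """Level 2: return the full instructions for one skill."""
        return self.skills[name]["body"]

    def load_reference(self, name: str, rel_path: str) -> str:
        """Level 3: read a file the instructions point at."""
        return (self.skills[name]["dir"] / rel_path).read_text()
```

The key design choice is that `__init__` keeps only `front` in memory as agent context; `activate` and `load_reference` are called lazily, task by task.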
```
Startup:       [L1: name + description]   ~100 tokens per skill
      |
Task matches:  [L2: full instructions]    ~2,000-5,000 tokens
      |
As needed:     [L3: reference files]      variable
```
These three concepts work at different layers. Understanding the distinction helps you decide which to use:
| Dimension | Skills | Tools / Function Calling | MCP |
|---|---|---|---|
| What it provides | Knowledge and instructions | Executable functions | Standardized protocol for tool integration |
| Analogy | A recipe card | A kitchen appliance | A power outlet standard |
| Nature | Natural language guidance | Code that runs | JSON-RPC communication layer |
| Execution | LLM interprets instructions | Deterministic function call | Protocol for calling remote tools |
| Latency | Local (just text) | Depends on function | Network round-trip |
| Best for | Encoding expertise, workflows, review criteria | Taking actions (API calls, file ops, queries) | Connecting to external services with auth and discovery |
| Context cost | Low (progressive loading) | Medium (schema per tool) | Higher (full schemas upfront) |
In a typical agent, all three are used together: skills supply the workflow knowledge, tools perform the actions, and MCP connects the agent to external services.
Example: A “deploy-to-staging” skill might include instructions like:

- Run the test suite using the `run_tests` tool
- If the tests pass, trigger the deployment using the `deploy` tool

The skill provides the workflow logic. The tools and MCP servers provide the execution capability.
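That division of labor can be seen in miniature below: the tools are deterministic functions, while the skill is plain text the model interprets. The tool bodies here are stand-ins, not real implementations:

```python
# Hypothetical tools an agent runtime would expose (deterministic code).
def run_tests() -> bool:
    """Stand-in test runner; a real tool would shell out to the CI suite."""
    return True  # all tests pass in this sketch

def deploy(target: str) -> str:
    """Stand-in deploy action; a real tool would call the platform API."""
    return f"deployed to {target}"

# The skill is just text: workflow logic the model reads and follows.
DEPLOY_SKILL = """
## Deploy to staging
1. Call the run_tests tool. If any test fails, stop and report.
2. Only if tests pass, call the deploy tool with target "staging".
"""

# What an agent following the skill would end up doing:
if run_tests():
    result = deploy("staging")
```

Note that the conditional ordering lives in the skill text; the tools themselves know nothing about when they should be called.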
A skill should do one thing well. Instead of a “development” skill that covers everything, create separate skills for code review, deployment, documentation, and testing.
Skills are interpreted by a language model, so be explicit about the order of steps, the criteria for each decision, and the exact output format you expect.
Real expertise includes knowing when to deviate from the standard process:
```markdown
### Handling large PRs (>500 lines changed)

If the PR changes more than 500 lines:

- Focus review on the most critical files first (API endpoints, auth, data models)
- Skip cosmetic issues (formatting, naming) unless they affect readability
- Suggest splitting the PR if the changes cover multiple unrelated concerns
```
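The skill states this rule in prose for the model, but the triage it encodes is simple enough to express as code. The following sketch uses invented file heuristics mirroring the text:

```python
def review_plan(lines_changed: int, files: list[str]) -> dict:
    """Sketch of the large-PR rule above: >500 changed lines narrows scope."""
    # Crude stand-in for "critical files" (API endpoints, auth, data models).
    critical = [f for f in files
                if any(key in f for key in ("api/", "auth", "model"))]
    if lines_changed > 500:
        return {
            "focus": critical or files,   # critical files first
            "skip_cosmetic": True,        # formatting/naming deferred
            "suggest_split": len(critical) < len(files),
        }
    return {"focus": files, "skip_cosmetic": False, "suggest_split": False}
```

In a real skill this stays as prose; the point of the sketch is only that each bullet corresponds to a concrete, checkable decision.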
Include examples of what the skill’s output should look like:
```markdown
### Example output

**Security - High**
`src/api/auth.py:45` - Password is compared using `==` instead of
`hmac.compare_digest()`. This is vulnerable to timing attacks.
Suggested fix: Replace with `hmac.compare_digest(stored_hash, provided_hash)`
```
If your instructions are getting long, move detailed reference material to L3 files in the references/ directory and reference them from the main instructions:
```markdown
For the full security checklist, refer to `references/security-checklist.md`.
```
Google’s Agent Development Kit (ADK) supports skills through the `SkillToolset` class. Here is a conceptual overview of how it works:
Place skill directories in a skills/ folder within your agent project:
```
my-agent/
  agent.py
  skills/
    code-review/
      SKILL.md
      references/
        security-checklist.md
    deploy/
      SKILL.md
      scripts/
        pre-deploy-check.sh
```
The agent discovers and loads skills from this directory. Only the L1 metadata is loaded at startup. Full instructions load when the agent activates the skill.
For dynamic skill creation or modification, ADK also supports defining skills in code using the `Skill` model class. This is useful when skill content needs to change based on runtime conditions.
For detailed implementation guidance, see the ADK Skills documentation.
The Agent Skills specification has been adopted by multiple platforms:
| Platform | Support | Details |
|---|---|---|
| Claude Code (Anthropic) | Yes | Skills as /slash-commands, anthropics/skills repo |
| Google ADK | Yes | SkillToolset class, file-based and code-based |
| GitHub Copilot | Yes | Works in VS Code, CLI, and Copilot coding agent |
| OpenAI | Yes | Agents SDK with skills support |
| Spring AI | Yes | Java ecosystem via spring-ai-agent-utils |
The specification is maintained by a community working group and published at agentskills.io. Because the format is just Markdown files in a directory, skills are portable across platforms that support the spec.
Here is a complete SKILL.md for a database-migration skill:

```markdown
---
name: database-migration
description: >
  Creates and reviews database migrations. Use when the user asks to
  add, modify, or remove database tables or columns, or when reviewing
  migration files.
---

## Creating Migrations

1. Verify the current migration state: run `alembic heads` to check for conflicts
2. Create the migration: `alembic revision --autogenerate -m "description"`
3. Review the generated migration file for:
   - Correct up/down operations (both directions should work)
   - No data loss in down migration
   - Appropriate indexes for new columns
   - Nullable columns for existing tables (to avoid breaking existing rows)
4. Test the migration: `alembic upgrade head` then `alembic downgrade -1`

## Common Pitfalls

- Adding a NOT NULL column to an existing table without a default value
  will fail if the table has existing rows. Always add a default or make
  it nullable first, then backfill.
- Renaming columns requires a two-step migration: add new column, migrate
  data, drop old column. Alembic's autogenerate does not handle renames.
- Large table alterations should be done in batches on production. Add a
  note in the migration file if the table has >1M rows.
```
Here is an incident-response skill that mixes runbook commands with a documentation workflow:

```markdown
---
name: incident-response
description: >
  Guides incident response and post-mortem creation. Use when there is
  a production incident, outage, or when creating post-mortem documents.
---

## During an Incident

1. Assess severity using the service dashboard at `monitoring.internal/overview`
2. Check recent deployments: `gcloud run revisions list --service=api --limit=5`
3. Check error rates: `gcloud logging read "severity>=ERROR" --limit=50 --freshness=1h`
4. If a recent deployment is suspect, roll back:
   `gcloud run services update-traffic api --to-revisions=PREVIOUS_REVISION=100`

## After Resolution

Create a post-mortem document using the template in `assets/postmortem-template.md`
with these sections filled in:

- Timeline of events (with timestamps)
- Root cause analysis
- Impact (users affected, duration, data loss if any)
- What went well in the response
- Action items with owners and due dates
```
| Situation | Use Skills | Use Something Else |
|---|---|---|
| Team has specific review criteria | Yes | - |
| Agent needs to follow a multi-step workflow | Yes | - |
| Agent needs to call an API | No | Use a tool or MCP |
| Agent needs project context (build commands, structure) | No | Use AGENTS.md |
| Workflow is simple and one-off | No | Just put it in the prompt |
| Knowledge changes rarely and is domain-specific | Yes | - |
| Knowledge changes frequently or needs live data | No | Use RAG or tools |