Guardrails

Guardrails are the bumpers on the bowling lane - they don't tell the agent what to do, but they keep it from doing something catastrophic.

The simple explanation

Guardrails are constraints you set up to limit what an AI agent can do wrong. They're not instructions - they're boundaries. Like a bowling lane with bumper rails, the agent can still aim wherever it wants, but it can't throw the ball into the gutter.

In software, guardrails come in many forms: type checkers that catch incorrect data shapes, test suites that catch regressions, linters that enforce code style, file access restrictions that prevent agents from touching production configs, and mandatory human review before code gets merged.
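One of these forms, a file access restriction, can be sketched in a few lines. This is a minimal illustration, not a production sandbox: the paths and the `check_write_allowed` helper are hypothetical, and a real setup would enforce this at the OS or container level rather than in the agent's own process.

```python
import os

# Hypothetical policy: the agent may only write inside its working tree,
# and never to a production config, even one inside that tree.
ALLOWED_ROOT = "/workspace/project"
BLOCKED_PATHS = {"/etc", "/workspace/project/prod.config"}

def check_write_allowed(path: str) -> bool:
    """Return True if the agent is allowed to write to `path`."""
    resolved = os.path.realpath(path)  # normalize ../ tricks and symlinks
    for blocked in BLOCKED_PATHS:
        if resolved == blocked or resolved.startswith(blocked + os.sep):
            return False
    return resolved == ALLOWED_ROOT or resolved.startswith(ALLOWED_ROOT + os.sep)
```

Note that the check resolves the path first, so the agent can't sidestep the boundary with `../` segments or symlinks.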

The key insight is that guardrails are automated. They don't require you to watch the agent constantly. They fire automatically when something goes wrong, giving the agent feedback in its observe phase or blocking a bad change before it lands.
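That feedback loop can be sketched as a small runner that executes each check and collects failures for the agent to observe. The specific tools shown (mypy, pytest, ruff) are illustrative assumptions - substitute whatever checkers your project already uses.

```python
import subprocess

# Illustrative defaults; a real project would point these at its own tooling.
DEFAULT_CHECKS = {
    "types": ["mypy", "src/"],
    "tests": ["pytest", "-q"],
    "lint": ["ruff", "check", "src/"],
}

def run_guardrails(checks=None):
    """Run each check; return failure messages for the agent's observe phase.

    An empty return value means every guardrail passed and the change
    is clear to proceed to review.
    """
    if checks is None:
        checks = DEFAULT_CHECKS
    failures = []
    for name, cmd in checks.items():
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"{name} failed:\n{result.stdout}{result.stderr}")
    return failures
```

The point is structural: nothing here requires a human watching. The failures are fed straight back to the agent as observations, or used to block the change from landing.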

Why it matters for agentic engineering

Without guardrails, you're doing vibe coding - letting the AI do whatever it wants and hoping for the best. Guardrails are what make agentic engineering a disciplined practice. They let you give agents more autonomy without proportionally increasing risk.

Think of it as a trust gradient. New agents or unfamiliar tasks get tight guardrails - sandboxed environments, small scopes, mandatory review. As you build confidence in the agent's capabilities for specific types of tasks, you can relax certain guardrails. But you never remove them entirely.
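The trust gradient can be made concrete as a tiered policy. The tier names and fields below are hypothetical, a sketch of the idea rather than a standard schema; note that even the most trusted tier keeps mandatory review, because guardrails get relaxed, never removed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GuardrailPolicy:
    sandboxed: bool              # run in an isolated environment?
    max_files_changed: int       # scope limit per task
    human_review_required: bool  # never set to False

# Illustrative trust tiers: tighter guardrails for unfamiliar work,
# looser (but never absent) guardrails as confidence grows.
TRUST_TIERS = {
    "new":     GuardrailPolicy(sandboxed=True,  max_files_changed=5,   human_review_required=True),
    "proven":  GuardrailPolicy(sandboxed=True,  max_files_changed=50,  human_review_required=True),
    "trusted": GuardrailPolicy(sandboxed=False, max_files_changed=200, human_review_required=True),
}
```

Moving an agent up a tier widens its scope and relaxes sandboxing, but the review guardrail stays in place at every level.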

In practice

The best guardrails are the ones you'd want in place anyway, even without AI. Type checking, testing, linting, and code review are good engineering practice on their own; AI agents just make them essential.