The units AI models think in - roughly ¾ of a word each.
Tokens are the basic units that AI models use to process text. They're not exactly words - they're subword chunks, somewhere between a syllable and a word. The word "engineering" might split into two tokens ("engine" + "ering"), while common short words like "the" are a single token. Code has its own tokenization patterns - a variable name like getUserById might be three or four tokens.
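Exact token counts depend on the model's tokenizer, but a widely used rule of thumb for English text is roughly 4 characters per token. A minimal sketch of that heuristic (the function name and the 4-character constant are assumptions, not any provider's API):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the common ~4-characters-per-token
    rule of thumb for English. Real tokenizers (BPE variants) differ
    by model, so treat this as a ballpark, not a bill."""
    return max(1, round(len(text) / 4))

# "engineering" has 11 characters, so the heuristic guesses ~3 tokens;
# an actual tokenizer may merge it into fewer.
print(estimate_tokens("engineering"))
```

For precise counts, use the tokenizer library published for your specific model rather than a heuristic.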
Think of tokens like the atoms of AI language. Everything the model reads (input tokens) and everything it writes (output tokens) is measured in tokens. When someone says a model has a "200K context window," they mean it can process 200,000 tokens at once.
Tokens directly translate to two things you care about: cost and capability.
Cost: API pricing is per-token. Input tokens (what you send to the model) are cheaper than output tokens (what the model generates). A coding agent that reads 50 files and generates a large implementation can burn through millions of tokens in a session. Understanding token economics helps you design cost-efficient workflows - like using smaller models for simple tasks and reserving large models for complex reasoning.
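The cost arithmetic is simple enough to sketch. The prices below are placeholders chosen only to illustrate the input/output asymmetry - check your provider's actual pricing:

```python
# Hypothetical per-million-token prices (assumptions for illustration).
PRICE_PER_M_INPUT = 3.00    # USD per 1M input tokens
PRICE_PER_M_OUTPUT = 15.00  # USD per 1M output tokens

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Total cost of one session: input and output billed at different rates."""
    return (input_tokens / 1_000_000) * PRICE_PER_M_INPUT + \
           (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

# An agent session that read 2M input tokens (many files, many turns)
# and generated 100K output tokens: $6.00 + $1.50 = $7.50.
print(f"${session_cost(2_000_000, 100_000):.2f}")
```

Note how input volume dominates agent sessions even at the lower rate - rereading 50 files every turn adds up faster than the code the model writes.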
Capability: The token budget determines how much the agent can see and do in one turn. More tokens in the context means the agent can reason over more code at once. But there's a quality tradeoff - very long contexts can degrade attention and accuracy.
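One practical consequence: input and reserved output must both fit inside the window, so a budget check looks like this (function name and the 200K default are illustrative assumptions):

```python
def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int = 200_000) -> bool:
    """True if the prompt plus the output budget fit in the window.
    The window bounds input and output together, so a huge prompt
    leaves less room for the model's response."""
    return prompt_tokens + max_output_tokens <= context_window

print(fits_context(150_000, 8_000))   # room to spare
print(fits_context(195_000, 8_000))   # prompt crowds out the output budget
```

Agents typically handle the failing case by summarizing or dropping older context before the next turn rather than truncating blindly.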