courses May 31, 2026 · 8 min read

Lesson 1 — The Agent Loop

Building Agentic Systems · 1 of 7

Welcome. Across seven lessons we are going to build a small, honest agent framework called agentkit — provider-agnostic, dependency-free, and runnable end-to-end on your laptop with no API key. Today we pour the foundation: the loop everything else extends.

Course map: 1 foundation · 2 tools · 3–6 memory, planning, sub-agents, guardrails (4 introduces real budgets) · 7 real provider.

What is an agent, really?

Strip away the marketing and an LLM agent is one Python loop:

1. Send the conversation so far to a model.
2. The model replies with either plain text (we're done) or a tool call (do this thing).
3. If it asked for a tool, run the tool, append the result, go back to step 1.

That's it. Memory, planning, sub-agents, guardrails: every "advanced" agent feature in the next six lessons is a knob bolted onto this loop. So we are going to write it from scratch, in about thirty lines including the signature, and we are going to drive it with a fake model so the whole thing runs deterministically and for free.

Design rule: never hardcode a vendor

The agent loop must not know whether it is talking to OpenAI, Anthropic, an Ollama-hosted Llama 3, or our MockLLM. So before we touch the loop, we define a protocol — one method — and have every model implementation satisfy it. Same loop, swappable brain. We'll cash this in during lesson 7 when we plug in a real provider without changing a single line of loop.py.

Package layout

We build everything inside agentkit/. After this lesson the tree looks like:

agentkit/
├── __init__.py        # re-exports the public surface
├── types.py           # Message, ToolCall, Role
├── llm.py             # the LLM Protocol
├── tools.py           # Tool + ToolRegistry
├── loop.py            # run_agent — the loop
└── providers/
    ├── __init__.py
    └── mock.py        # MockLLM — scripted, deterministic, zero-cost
examples/
└── lesson1_hello.py   # the runnable demo

Step 1 — The shared types (`agentkit/types.py`)

The model, the loop, and every provider need to agree on what a "message" looks like. We use plain dataclasses — no Pydantic, no validators, just data:

from dataclasses import dataclass, field
from typing import Any, Literal

Role = Literal["system", "user", "assistant", "tool"]

@dataclass
class ToolCall:
    id: str                          # provider-supplied; we echo it back
    name: str
    arguments: dict[str, Any]

@dataclass
class Message:
    role: Role
    content: str = ""
    tool_calls: list[ToolCall] = field(default_factory=list)
    tool_call_id: str | None = None  # set on role="tool" messages

Four roles, two structural rules:

An assistant message carries content, tool_calls, or both.
A tool message carries the result of one call and points back at it with tool_call_id.

That tool_call_id matters: when the model issues several calls in one turn (lesson 2 will hit this), the loop runs them all and the model needs to know which result belongs to which request.

Step 2 — The provider contract (`agentkit/llm.py`)

One method. That's the whole interface every model in this course will implement.

from typing import Any, Protocol, runtime_checkable
from .types import Message

@runtime_checkable
class LLM(Protocol):
    def complete(
        self,
        messages: list[Message],
        tools: list[dict[str, Any]],
    ) -> Message: ...

Each entry in tools is a spec dict with keys name, description, and parameters (a JSON Schema object); pass [] when no tools are registered.

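A Python Protocol is structural: anything with a matching complete method satisfies it; no inheritance required. @runtime_checkable lets us write isinstance(x, LLM) as a quick sanity check (we'll do that in a moment). Keep in mind that the runtime check only confirms that a complete attribute exists; it does not compare signatures, so a static type checker is still what catches a wrong parameter list.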
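
A short sketch of what structural typing buys us, and where the runtime check stops. The protocol is restated with Any in place of Message so the snippet is self-contained; EchoLLM, WrongShape, and NoComplete are hypothetical classes written only for this demonstration.

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class LLM(Protocol):
    def complete(self, messages: list[Any], tools: list[dict[str, Any]]) -> Any: ...

class EchoLLM:                       # note: no base class at all
    def complete(self, messages, tools):
        return messages[-1]

class WrongShape:                    # right method name, wrong parameter list
    def complete(self):
        return None

class NoComplete:                    # no complete method
    pass

print(isinstance(EchoLLM(), LLM))     # True
print(isinstance(WrongShape(), LLM))  # True: only the method's presence is checked
print(isinstance(NoComplete(), LLM))  # False
```

The second result is the one to remember: isinstance against a runtime-checkable protocol is a presence check, not a signature check.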

The return contract is the loop's contract too:

If the returned Message has empty tool_calls, the loop stops and returns content as the final answer.
If it has any tool_calls, the loop runs them, appends results, and asks the model again.

Step 3 — Tools (`agentkit/tools.py`)

A tool is just a Python function plus enough metadata to describe it to a model. The registry handles dispatch and — crucially — never lets a tool exception kill the loop:

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Tool:
    name: str
    description: str
    parameters: dict[str, Any]   # JSON-schema-style
    fn: Callable[..., Any]

    def spec(self) -> dict[str, Any]:
        return {
            "name": self.name,
            "description": self.description,
            "parameters": self.parameters,
        }

class ToolRegistry:
    def __init__(self, tools: list[Tool]):
        self._tools: dict[str, Tool] = {t.name: t for t in tools}

    def specs(self) -> list[dict[str, Any]]:
        return [t.spec() for t in self._tools.values()]

    def invoke(self, name: str, arguments: dict[str, Any]) -> str:
        if name not in self._tools:
            return f"error: unknown tool {name!r}"
        try:
            result = self._tools[name].fn(**arguments)
        except Exception as e:
            return f"error: {type(e).__name__}: {e}"
        return str(result)

Errors come back to the model as a string. The model gets to see what broke and decide whether to recover — exactly the behavior we want from a tool-using agent.
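
Here is a quick sketch of the outcomes invoke can produce. Tool and ToolRegistry are copied in trimmed form so the snippet runs standalone; the divide tool is a made-up example, not something agentkit ships.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Tool:
    name: str
    description: str
    parameters: dict[str, Any]
    fn: Callable[..., Any]

class ToolRegistry:
    def __init__(self, tools: list[Tool]):
        self._tools = {t.name: t for t in tools}

    def invoke(self, name: str, arguments: dict[str, Any]) -> str:
        if name not in self._tools:
            return f"error: unknown tool {name!r}"
        try:
            result = self._tools[name].fn(**arguments)
        except Exception as e:
            return f"error: {type(e).__name__}: {e}"
        return str(result)

def divide(a: float, b: float) -> float:   # hypothetical example tool
    return a / b

registry = ToolRegistry([
    Tool(name="divide", description="Divide a by b.",
         parameters={"type": "object"}, fn=divide),
])

ok = registry.invoke("divide", {"a": 6, "b": 3})          # normal result
unknown = registry.invoke("multiply", {"a": 6, "b": 3})   # tool not registered
crashed = registry.invoke("divide", {"a": 6, "b": 0})     # tool raised
malformed = registry.invoke("divide", {"a": 6})           # missing "b"

for outcome in (ok, unknown, crashed, malformed):
    print(outcome)
```

All four calls return a str, so the loop can append the outcome as a tool message without any special-casing.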

Note: we don't validate arguments against the parameters schema here — a malformed call from the model will raise inside fn(**arguments) and surface as a generic error string. Schema-driven validation is deliberately deferred (treat it as an exercise for now).

Step 4 — The loop (`agentkit/loop.py`)

This is the heart of lesson 1. Read it carefully — almost everything in the next six lessons will be expressed as something layered onto these few lines.

from dataclasses import dataclass
from typing import Callable, Optional

from .llm import LLM
from .tools import ToolRegistry
from .types import Message

@dataclass
class RunResult:
    answer: str
    messages: list[Message]
    turns: int

def run_agent(
    goal: str,
    llm: LLM,
    tools: ToolRegistry,
    system: str | None = None,
    max_turns: int = 10,
    on_event: Optional[Callable[[str, object], None]] = None,
) -> RunResult:
    emit = on_event or (lambda *_: None)

    messages: list[Message] = []
    if system:
        messages.append(Message(role="system", content=system))
    messages.append(Message(role="user", content=goal))
    emit("user", goal)

    specs = tools.specs()
    for turn in range(1, max_turns + 1):
        reply = llm.complete(messages, specs)
        messages.append(reply)
        emit("assistant", reply)

        if not reply.tool_calls:                                # (A) done
            return RunResult(answer=reply.content, messages=messages, turns=turn)

        for call in reply.tool_calls:                           # (B) keep going
            result = tools.invoke(call.name, call.arguments)
            messages.append(Message(role="tool",
                                    tool_call_id=call.id,
                                    content=result))
            emit("tool", {"call": call, "result": result})

    raise RuntimeError(f"agent exceeded max_turns={max_turns} without final answer")

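Two branches inside the loop:

(A) the model spoke without asking for tools: return its text as the answer.
(B) the model asked for tools: every call gets executed, every result gets appended, and we loop.

Only (A) returns normally. The other way out is the RuntimeError on the last line, raised when max_turns runs out before the model gives a final answer.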

max_turns is a safety belt. Real agents need a budget — tokens, dollars, wall-clock. We'll replace this naive count with a real budget in lesson 4. on_event is a tracing hook so we can watch the loop think; if you don't pass one, emit is a no-op.

Step 5 — A model we can actually run (`agentkit/providers/mock.py`)

Here is the trick that makes this course tractable: we ship a MockLLM that takes a script — a hand-written list of assistant replies — and returns them in order.

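from ..types import Message

class MockLLM:
    """Replays a fixed script of assistant messages, one per complete() call."""

    def __init__(self, script: list[Message]):
        self._script = script
        self._i = 0                      # index of the next reply to hand out

    def complete(self, messages: list[Message], tools: list[dict]) -> Message:
        # The inputs are ignored on purpose: the script alone decides the reply.
        if self._i >= len(self._script):
            raise RuntimeError("MockLLM script exhausted")
        reply = self._script[self._i]
        self._i += 1
        return reply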

Determinism by construction. Every example in every lesson runs the same way on your machine, in CI, in a coffee-shop with no wifi. Zero cost. Lesson 7 swaps MockLLM for an AnthropicLLM (or OpenAILLM) and the loop above doesn't change one character.
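
One property worth seeing once: the mock is stateful. The sketch below repeats the class so it runs on its own and, as a simplification, scripts plain strings instead of Message objects.

```python
class MockLLM:
    def __init__(self, script):
        self._script = script
        self._i = 0

    def complete(self, messages, tools):
        if self._i >= len(self._script):
            raise RuntimeError("MockLLM script exhausted")
        reply = self._script[self._i]
        self._i += 1
        return reply

# Plain strings stand in for Message objects to keep the sketch short.
llm = MockLLM(script=["first reply", "second reply"])

replies = [llm.complete([], []), llm.complete([], [])]
print(replies)

# The cursor lives on the instance, so a third call (for example, a second
# run that reuses this object) fails loudly instead of repeating a reply.
try:
    llm.complete([], [])
except RuntimeError as exc:
    error = str(exc)
print(error)
```

In practice that means building a fresh MockLLM for each run, with exactly as many scripted replies as the run will consume.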

Step 6 — Drive it (`examples/lesson1_hello.py`)

Goal: "What is 12 + 30?". One tool: add(a, b). Script the mock to (1) call the tool, then (2) answer.

from agentkit import Message, Tool, ToolCall, ToolRegistry, run_agent
from agentkit.providers.mock import MockLLM

def add(a: int, b: int) -> int:
    return a + b

def trace(kind: str, payload: object) -> None:
    if kind == "user":
        print(f"[user]      {payload}")
    elif kind == "assistant":
        if payload.tool_calls:
            for c in payload.tool_calls:
                print(f"[assistant] -> tool_call {c.name}({c.arguments})  id={c.id}")
        else:
            print(f"[assistant] {payload.content}")
    elif kind == "tool":
        print(f"[tool]      {payload['call'].name} -> {payload['result']!r}")

tools = ToolRegistry([
    Tool(name="add",
         description="Add two integers and return the sum.",
         parameters={"type": "object",
                     "properties": {"a": {"type": "integer"},
                                    "b": {"type": "integer"}},
                     "required": ["a", "b"]},
         fn=add),
])

llm = MockLLM(script=[
    Message(role="assistant",
            tool_calls=[ToolCall(id="call_1", name="add",
                                 arguments={"a": 12, "b": 30})]),
    Message(role="assistant", content="12 + 30 = 42."),
])

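result = run_agent(
    goal="What is 12 + 30?",
    llm=llm,
    tools=tools,
    system="You are a careful calculator. Use the tools provided.",
    on_event=trace,
)

# Summarize the run (these lines produce the last three lines of the output in Step 7).
print()
print(f"final answer: {result.answer!r}")
print(f"turns:        {result.turns}")
print(f"messages:     {len(result.messages)} in transcript")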

One honest caveat: the mock is scripted, so the second assistant reply is hardcoded. It would say "12 + 30 = 42." even if add returned 0. A real model would read the tool message and ground its answer in it; that's the behavior we're standing in for here.

Step 7 — Run it

$ python3 examples/lesson1_hello.py
[user]      What is 12 + 30?
[assistant] -> tool_call add({'a': 12, 'b': 30})  id=call_1
[tool]      add -> '42'
[assistant] 12 + 30 = 42.

final answer: '12 + 30 = 42.'
turns:        2
messages:     5 in transcript

That is the loop, working end-to-end. Read the transcript top to bottom and you can see every move it made:

Turn 1: the model asked for add(12, 30). The loop executed it, appended '42' as a tool message, and looped.
Turn 2: the model produced plain content, so the loop returned.

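Five messages in the transcript: system, user, assistant(tool_call), tool, assistant(answer). Request first, matching result next, answer last: that ordering is the shape tool-calling chat APIs generally expect, even though each vendor encodes it a little differently on the wire.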

One more check — the protocol holds

The whole architectural claim of this lesson is that the loop talks to any LLM. Let's prove the mock satisfies the protocol:

$ python3 -c "
from agentkit import LLM
from agentkit.providers.mock import MockLLM
print('isinstance(MockLLM(...), LLM):', isinstance(MockLLM(script=[]), LLM))
"
isinstance(MockLLM(...), LLM): True

That True is the seam every future provider will plug into.

What we built

Message / ToolCall — the wire format every layer agrees on.
LLM — the one-method protocol that keeps the loop vendor-neutral.
Tool / ToolRegistry — typed callables with safe dispatch.
MockLLM — deterministic replies so the whole course runs for free.
run_agent — the small loop the rest of the framework hangs off.

What's next — lesson 2: Tools That Do Real Work

We're going to give the agent a tool with side effects (a tiny in-memory filesystem), let the model issue multiple calls per turn, and watch the loop fan them out correctly using those tool_call_ids we set up today. Same loop. New behavior.

See you there.

— The Resident
