Deep Dive

AI Agents in Depth

From chatbots to autonomous digital workers

An AI agent is an LLM with agency: the ability to perceive its environment, make decisions, take actions, and adjust its behavior based on the results. Where a chatbot produces a single response per prompt, an agent runs in a loop: think, act, observe, adjust.

The key components: a reasoning engine (LLM), tools (APIs, search, code execution), memory (conversation history + vector storage), and a planning system (goal decomposition). Together, these turn a text generator into a digital worker.

How It Works

Goal Decomposition

The agent breaks a complex goal into subtasks. 'Book me a flight to Tokyo' becomes: search flights -> compare prices -> check calendar -> select best option -> book -> confirm.
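One way to picture the decomposed goal is as a dependency graph of subtasks. The sketch below is illustrative only: the plan is written by hand here (in a real agent the LLM would produce it), the subtask names are taken from the flight example, and the dependency edges are assumptions. It uses Python's standard graphlib module to find a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each subtask maps to the set of subtasks it depends on.
# In a real agent this structure would come from the LLM, not from a literal.
PLAN = {
    "search_flights": set(),
    "check_calendar": set(),
    "compare_prices": {"search_flights"},
    "select_best_option": {"compare_prices", "check_calendar"},
    "book": {"select_best_option"},
    "confirm": {"book"},
}


def execution_order(plan):
    """Return one order in which every subtask runs after its dependencies."""
    return list(TopologicalSorter(plan).static_order())


order = execution_order(PLAN)
```

Modeling the plan as a graph rather than a flat list makes it explicit that independent subtasks, such as searching flights and checking the calendar, could run in either order or in parallel.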

ReAct Loop

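ReAct (short for Reason + Act) interleaves reasoning with tool calls in a Reason-Act-Observe cycle. The agent thinks about what to do (Reason), executes a tool call (Act), examines the result (Observe), then decides the next step. This loop repeats until the goal is met or a step limit is reached.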
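The loop can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: scripted_model is a hard-coded stand-in for an LLM, calculator is a toy tool, and the dictionary format of each step is invented for this example rather than taken from any framework.

```python
from typing import Callable


def calculator(expression: str) -> str:
    """Toy tool: evaluate 'a + b' or 'a * b' for two integers."""
    left, op, right = expression.split()
    a, b = int(left), int(right)
    return str(a + b if op == "+" else a * b)


TOOLS: dict[str, Callable[[str], str]] = {"calculator": calculator}


def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM: chooses the next step from the history so far."""
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return {"thought": "I should multiply.", "tool": "calculator", "input": "6 * 7"}
    answer = observations[-1].removeprefix("Observation: ")
    return {"thought": "I have the result.", "answer": answer}


def react(goal: str, model: Callable[[list[str]], dict], max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        step = model(history)                        # Reason
        history.append(f"Thought: {step['thought']}")
        if "answer" in step:                         # goal met, stop looping
            return step["answer"]
        result = TOOLS[step["tool"]](step["input"])  # Act
        history.append(f"Observation: {result}")     # Observe
    raise RuntimeError("step budget exhausted before the goal was met")


result = react("What is 6 times 7?", scripted_model)
```

The max_steps cap is a design choice worth keeping in any real loop: without it, a model that never produces an answer would run indefinitely.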

Tool Use

The LLM generates structured function calls - search the web, run code, query a database, send an email. MCP (Model Context Protocol) standardizes how agents connect to tools.
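A structured function call is typically a small JSON object naming a tool and its arguments, which the host program validates and dispatches. The sketch below assumes a made-up call format ({"name": ..., "arguments": ...}) and a hypothetical get_weather tool with canned output. It does not implement MCP or any vendor's function-calling API.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool returning canned data instead of calling an API."""
    return f"Sunny in {city}"


# Registry of available tools: callable plus the expected argument types.
TOOLS = {
    "get_weather": {"function": get_weather, "parameters": {"city": str}},
}


def dispatch(raw_call: str) -> str:
    """Parse a model-generated call, validate it, and run the matching tool."""
    call = json.loads(raw_call)
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    spec = TOOLS.get(call.get("name"))
    if spec is None:
        raise ValueError(f"unknown tool: {call.get('name')!r}")
    args = call.get("arguments", {})
    # Reject arguments that are missing or have the wrong type.
    for name, expected in spec["parameters"].items():
        if not isinstance(args.get(name), expected):
            raise ValueError(f"argument {name!r} must be {expected.__name__}")
    # Reject arguments the tool does not declare.
    extra = set(args) - set(spec["parameters"])
    if extra:
        raise ValueError(f"unexpected arguments: {sorted(extra)}")
    return spec["function"](**args)


reply = dispatch('{"name": "get_weather", "arguments": {"city": "Tokyo"}}')
```

Validating before dispatch matters because the call is model output: it may name a tool that does not exist or pass arguments of the wrong shape.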

Memory Management

Short-term memory: conversation context. Long-term memory: vector database of past interactions. Working memory: current task state. The agent retrieves relevant memories to inform decisions.
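The three tiers can be illustrated with standard-library data structures. This is a toy sketch under stated assumptions: the AgentMemory class and its method names are invented for this example, and embed uses bag-of-words counts where a real system would use a learned embedding model and a vector database.

```python
import math
from collections import Counter, deque


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A stand-in for a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class AgentMemory:
    def __init__(self, short_term_size: int = 4):
        self.short_term = deque(maxlen=short_term_size)  # recent conversation turns
        self.long_term = []                              # (vector, text) pairs
        self.working = {}                                # current task state

    def remember(self, text: str) -> None:
        """Store text in both the rolling context and the long-term store."""
        self.short_term.append(text)
        self.long_term.append((embed(text), text))

    def recall(self, query: str, k: int = 1) -> list:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.long_term, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


memory = AgentMemory(short_term_size=2)
memory.remember("user prefers window seats")
memory.remember("user is vegetarian")
memory.remember("user lives in Berlin")
memory.working["current_step"] = "compare_prices"
```

With a short-term size of 2, the oldest message has already dropped out of the rolling context, yet recall can still find it in the long-term store. That is the point of keeping the tiers separate.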

Multi-Agent Orchestration

Complex tasks use multiple specialized agents: a researcher, a coder, a reviewer. Frameworks like CrewAI and LangGraph coordinate agent teams with defined roles and communication protocols.
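The simplest form of orchestration is a sequential handoff, where each agent's output becomes the next agent's input. The sketch below is a hand-rolled illustration, not the CrewAI or LangGraph API: the Agent and Pipeline classes are hypothetical, and each agent's act function is a lambda standing in for an LLM call with a role-specific prompt.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    role: str
    act: Callable[[str], str]  # stand-in for an LLM call with a role prompt


@dataclass
class Pipeline:
    agents: list
    transcript: list = field(default_factory=list)  # (role, message) pairs

    def run(self, task: str) -> str:
        """Pass the task through each agent in order, logging every handoff."""
        message = task
        for agent in self.agents:
            message = agent.act(message)
            self.transcript.append((agent.role, message))
        return message


team = Pipeline([
    Agent("researcher", lambda task: f"notes on: {task}"),
    Agent("coder", lambda notes: f"code based on [{notes}]"),
    Agent("reviewer", lambda code: f"approved: {code}"),
])
final = team.run("parse CSV files")
```

Real frameworks add what this sketch leaves out, such as branching, loops back to an earlier agent when a review fails, and persistent state. The transcript is kept so that each handoff can be inspected afterwards.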

Error Recovery

When an action fails, the agent reasons about why and tries an alternative approach. Good agents have retry logic, fallback strategies, and human-in-the-loop escalation.
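The three mechanisms named above (retries, fallbacks, escalation) compose naturally into one wrapper. The following is a minimal sketch under stated assumptions: run_with_recovery, EscalateToHuman, and the two example strategies are hypothetical names, and the step where the agent reasons about the cause of a failure is reduced here to collecting error messages.

```python
import time


class EscalateToHuman(Exception):
    """Raised when every automated strategy has failed."""


def run_with_recovery(strategies, retries=2, base_delay=0.0, sleep=time.sleep):
    """Try each strategy in order, retrying with exponential backoff.

    Returns the first successful result. If all strategies fail on all
    attempts, raises EscalateToHuman carrying the collected error messages.
    """
    errors = []
    for strategy in strategies:              # fallback: move to the next strategy
        for attempt in range(retries):       # retry the same strategy
            try:
                return strategy()
            except Exception as exc:
                errors.append(f"{strategy.__name__} attempt {attempt + 1}: {exc}")
                sleep(base_delay * 2 ** attempt)
    raise EscalateToHuman("; ".join(errors))


calls = {"primary": 0}


def primary_api():
    """Hypothetical action that always times out."""
    calls["primary"] += 1
    raise TimeoutError("timed out")


def cached_copy():
    """Hypothetical fallback that succeeds with older data."""
    return "stale but usable"


outcome = run_with_recovery([primary_api, cached_copy])
```

Here the primary action is attempted twice, then the fallback succeeds. Passing only primary_api would exhaust every option and raise EscalateToHuman, which is where a human-in-the-loop handler would take over.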

Key Frameworks and Protocols

LangGraph

Stateful, graph-based agent workflows with persistence and human-in-the-loop

CrewAI

Multi-agent framework - define agents with roles, goals, and backstories

AutoGen

Microsoft's multi-agent conversation framework for complex task solving

MCP (Model Context Protocol)

Open standard for connecting AI agents to tools and data sources

A2A (Agent-to-Agent)

Open protocol, introduced by Google, for agent interoperability and communication

Semantic Kernel

Microsoft's SDK for building AI agents with plugins and planners

Who's Building With This

Anthropic

Claude Code - agentic coding in the terminal. Computer use lets Claude operate desktop software by reading screenshots and controlling the mouse and keyboard.

OpenAI (Codex)

Codex agent - cloud-based coding agent that runs in sandboxed environments, ships PRs autonomously.

Cursor / Windsurf

AI-powered IDEs where agents write, test, and refactor entire codebases with background execution.

Devin (Cognition)

Autonomous software engineer - takes tickets, writes code, runs tests, opens PRs end-to-end.

Key Takeaway

Agents are the application layer of AI. The shift from 'AI as a tool' to 'AI as a teammate' is happening now. The winners will build agents that are reliable, recoverable, and trustworthy enough to operate autonomously.
