Agent Builder's Almanac

tl;dr

SWEs at companies with enterprise AI should use Codex to prototype and ship agents, until they need to optimize message state or built-in tools, or mix and match models.

The number of packages for building AI agents is bewildering. For personal projects, it's fun to switch back and forth between solutions like Amp, Cursor, CrewAI, n8n, and LangGraph. But as of this writing, long-running coding-agent style packages are stable enough for common use cases. SWEs employed at companies with enterprise AI should just start with Codex or the OpenAI Agents SDK.

Do you work at a company that buys cloud inference from Azure, Bedrock, or Vertex? Is there an enterprise SLA? SSO access? Team token budgets? Chances are you should start with Codex or the OpenAI Agents SDK.

That is, if you aren't building a chatbot, setting up your claw, or Claude-pilled.

Prototype

Are you building an app or a workflow? Use an SDK for an app, or a CLI for a workflow. A few open source SDKs ship with CLIs.

The charts that follow highlight features and differentiators between packages, and omit categories where there's feature parity. Remember, these capabilities change weekly.

Codex CLI and the OpenAI Agents SDK come out ahead on most of these differentiators.

Feedback

OpenAI and Anthropic offerings make collecting feedback easier with built-in async cloud sandboxes, such as Codex (Web), and native desktop apps, such as Claude Cowork.

Pick the SDK or CLI that makes it easiest to collect feedback from your target user.

Production

In production, you might want to control state and memory. You might have token cost and latency constraints. You might want to remove unused bundled features. And you might want to mix and match models across agents and subagents. This is when some 'older' agent packages can make more sense.
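To make the production concerns concrete, here is a minimal, library-agnostic sketch in plain Python of two of them: assigning a different model to each agent, and bounding message state to control token cost. Everything in it is an assumption made for illustration. `AgentConfig`, `Agent`, `remember`, `max_history`, and the model labels `large-model` and `small-model` are hypothetical names, not part of Codex, the OpenAI Agents SDK, or any other package.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical per-agent settings; not tied to any real SDK."""
    name: str
    model: str        # placeholder model label, not a real model ID
    max_history: int  # number of most recent messages kept in state


@dataclass
class Agent:
    config: AgentConfig
    history: list[dict] = field(default_factory=list)

    def remember(self, role: str, content: str) -> None:
        """Append a message, then drop the oldest ones beyond the window."""
        self.history.append({"role": role, "content": content})
        excess = len(self.history) - self.config.max_history
        if excess > 0:
            # Bounding history is one simple way to cap token cost.
            del self.history[:excess]


# Mix and match: a larger model for planning, a cheaper one for a subagent.
planner = Agent(AgentConfig(name="planner", model="large-model", max_history=20))
summarizer = Agent(AgentConfig(name="summarizer", model="small-model", max_history=4))

for i in range(6):
    summarizer.remember("user", f"chunk {i}")

print(summarizer.config.model, len(summarizer.history))
```

The point of the sketch is the shape of the decision, not the implementation: once you need per-agent knobs like these, a package that exposes them directly may serve you better than one that manages state for you.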

Reach for purpose-built packages only when it's time to optimize.