Codex vs. Jules

Published on May 20, 2025

My post on the spectrum of coding agents is not even cold yet, and already two new names are trending: OpenAI's Codex in the cloud and Google's Jules.

On paper they feel familiar. Picture terminal-based coding agents, but with a web UI glued on top and the development environment moved to the cloud.

To evaluate them fairly, I presented both with the same task: migrating SQLAlchemy to SQLModel.

My verdict: it does not work. Both claimed success, yet neither achieved working static typing (pyright) or passing tests. Jules did not even attempt to run any linting or tests.
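The gap between "claimed success" and "actually passes" can be checked mechanically. The following is a minimal sketch of such a gate, not what either product does internally: it runs each check as a subprocess and treats a non-zero exit code (or a missing executable) as a failure. The names `CheckResult`, `run_checks`, `all_passed`, and `DEFAULT_CHECKS` are made up for this illustration, and the commands assume `pyright` and `pytest` are installed and on the PATH.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool


# Hypothetical default commands; adjust them to the project at hand.
DEFAULT_CHECKS: dict[str, list[str]] = {
    "pyright": ["pyright"],
    "pytest": ["pytest", "-q"],
}


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run every command and record whether it exited with status 0."""
    results = []
    for name, command in checks.items():
        try:
            completed = subprocess.run(command, capture_output=True, text=True)
            passed = completed.returncode == 0
        except FileNotFoundError:
            # The tool is not installed, so the check cannot have passed.
            passed = False
        results.append(CheckResult(name=name, passed=passed))
    return results


def all_passed(results: list[CheckResult]) -> bool:
    """A change counts as done only if every single check passed."""
    return all(result.passed for result in results)
```

Calling `all_passed(run_checks(DEFAULT_CHECKS))` in the repository root after an agent reports success gives a yes-or-no answer that does not depend on the agent's own summary.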

Part of the problem lies in their approach to development environments. Rather than using a Dockerfile as the environment definition, they rely on prompt-based instructions (see AGENTS.md). I could envision this working for simple tasks, similar to Sweep's original vision of a junior developer that handles issues and creates pull requests through GitHub integration. (I recently discovered that Sweep has since pivoted to developing coding assistants for JetBrains.) For complex tasks, however, skipping the step of defining a development environment does not work yet.

One interesting observation: while Jules demonstrated a better understanding of SQLModel and thus performed the conversion more effectively, it chose to refactor my Python code into something Java-like. This was neither requested nor Pythonic. Codex, on the other hand, hallucinated the SQLModel API, but at least maintained the original coding style.

[Screenshots: the Jules web interface and the Codex web interface]

I am confident both will improve over time, but I do not yet see them as a threat to existing coding agents. They represent a new (browser-based) interface for coding agents rather than a new class of agents. The UI/UX is well suited to simple issues, but it requires direct integration with CI to be truly valuable. Once they natively integrate with GitHub or your preferred distributed VCS, I would certainly consider using this coding agent interface, initially for simple issues.

Addendum (2025-05-21): GitHub Copilot Coding Agent

Less than a day has passed, and GitHub has announced its own cloud-based coding agent: GitHub Copilot Coding Agent. In this post, I argued that a cloud-based coding agent should be more closely integrated with the version control system. I got what I asked for!