Video Summary

I Hated Every Coding Agent, So I Built My Own — Mario Zechner (Pi)

Mastra

Main takeaways
01

Mario disliked existing coding agents for feature bloat, hidden context injection, and poor observability, so he built pi.

02

Pi intentionally provides only four tools: read, write, edit, and bash — keeping the core minimal and predictable.

03

Pi uses a tiny system prompt, tree-structured sessions, full cost tracking, and hot-reloadable TypeScript extensions for customization.

04

Extensions enable custom tools, UIs, multi-agent setups (pi-messenger), and visual feedback (pi-annotate) without hidden behaviors.

05

On TerminalBench, pi (with Claude Opus 4.5) scored close to Terminus even before advanced optimizations like compaction.

Key moments
Questions answered

What motivated Mario Zechner to build pi?

He was frustrated by existing agents' feature bloat, unpredictable hidden context injection, poor observability, and lack of power-user extensibility, so he built a minimal alternative he could control.

What are the four core tools pi exposes?

Pi intentionally exposes only read (file), write (file), edit (file), and bash — the minimal operations needed for code-oriented workflows.

How does pi differ from agents like Claude Code or OpenCode?

Pi strips away added features and hidden behaviors, uses a very short system prompt, emphasizes observability and cost tracking, and supports hot-reloadable TypeScript extensions for user-defined tools.

What is a tree-structured session in pi?

Rather than a linear chat history, pi stores sessions as a tree so sub-agents or branches can read files and operate independently while preserving context and lineage.

What kinds of extensions has the community built for pi?

Early community extensions include pi-annotate (visual feedback), pi-messenger (multi-agent chatroom), custom UIs, and even demos like running Doom — all built as hot-reloadable TypeScript modules.

The Journey of Building a Coding Agent 02:06

“I eventually thought, I hate all the existing coding agents or harnesses. How hard can it be to write one myself?”

  • Mario Zechner recounts his initial dissatisfaction with existing coding agents, leading him to consider creating his own. This reflects a common sentiment among developers who seek tools that align better with their personal workflows and preferences.

  • The speaker shares his collaborative experience working overnight with peers to create various tools, many of which they never used. This highlights the iterative process of innovation often characterized by experimentation without immediate application.

Evolution of Coding Tools 04:14

“Claude Code was not a good tool when it comes to observability and actually managing your context.”

  • In discussing the development of coding tools, Mario describes how Claude Code changed the genre by using models trained with reinforcement learning that explore codebases on their own.

  • However, he became frustrated with Claude Code's lack of stability and predictability, noting that frequent updates disrupted his workflows. Tools that evolve this quickly are hard to rely on.

Challenges with Existing Tools 07:49

“You’re working with this new tool, and then the tool vendor changes a tiny little thing under the hood that makes the LLM go crazy with your existing workflows.”

  • Mario is frustrated that changes in tools like Claude Code can break established workflows. Because developers have no control over such tools, they cannot count on the reliability and stability that efficient work requires.

  • Even as these tools advance and become more widely adopted, they may not cater to the specific needs of every developer, thus necessitating the development of customized solutions.

Frustrations with Existing Coding Agents 09:50

"So obviously they’re doing things right, but not for me."

  • Mario Zechner shares his dissatisfaction with existing coding agents, expressing that while they may perform well, they do not align with his preferences or needs as an older developer.

  • He discusses several coding tools he explored, starting with Codex CLI, whose user interface and initial model performance he found lacking.

  • However, he acknowledges that Codex has improved significantly since his first impressions.

  • Zechner praises Amp, a coding harness created by engineers formerly from Sourcegraph, stating, "They managed to build a commercial coding harness where they take away features instead of adding them, and most of the choices make a lot of sense to me."

OpenCode and Context Management Issues 11:12

"The problem with OpenCode is that it's also not very good at managing your context."

  • Zechner reflects on his experience with OpenCode, noting its reliance on session compaction and criticizing its poor context management.

  • He points out that an agent often needs several edits before the code compiles again, so the intermediate states between edits are expected to be broken.

  • His concern is the real-time interaction between the coding agent and the language server (LSP): diagnostics reported before the agent has finished its modifications can be misleading.

The Nature of Coding Agent Evaluations 16:10

"It has a bunch of computer use and coding-related tasks that an agent needs to fulfill."

  • Zechner introduces TerminalBench, a benchmarking tool for evaluating coding agents, which includes numerous tasks ranging from simple setups to complex simulations.

  • He describes how Terminus is considered a top performer among coding harnesses, even though its approach is simply to let the model interact through a basic terminal interface.

  • This leads to an important observation that contemporary coding agents may not require the plethora of features presented by various harnesses to achieve good results. Instead, simplification can lead to effective performance without unnecessary complexities.

  • "We are in the messing around and finding out stage, and nobody has any idea what the perfect coding agent should look like."

Custom Terminal User Interface Coding Agent 19:39

"The coding agent itself is both an SDK you can use in headless mode or a full terminal user interface coding agent."

  • Mario Zechner discusses the creation of a terminal user interface that is efficient and compact, consisting of only 600 lines of code. This coding agent, described as both an SDK and a full terminal interface, offers users flexibility in how they choose to interact with coding tasks.

  • He emphasizes how short the entire system prompt is compared to other coding harnesses. Zechner attributes this to frontier models being trained through reinforcement learning, which removes the need for long system prompts.

Limitations of Current Coding Agents 20:31

"Most coding agent harnesses at the moment have two modes: either the agent can do whatever it wants, or it requires user approval for every action."

  • Current coding agents operate under two predominant modes: one where agents operate autonomously and another where they seek user approval for actions, such as deleting files or listing directory contents. Zechner critiques this by indicating that requiring user approval can lead to fatigue, which might cause users to either disable the mode or automate their responses, undermining usability.

  • He suggests that relying solely on containerization as a security measure against data exfiltration and prompt injection is inadequate, and points out that there are better solutions than the common approval dialogs.

Key Features of the pi Coding Agent 21:20

"It only has four tools: read a file, write a file, edit a file, and bash. That's all you need."

  • Zechner highlights the limited yet powerful toolset of the pi coding agent: read a file, write a file, edit a file, and run bash commands. He argues that additions such as more complex agents or built-in to-dos are unnecessary, and advocates a simple, manageable core that users extend themselves.

  • Capabilities such as task management can instead be added as extensions, which in pi are simple TypeScript files.
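As a rough illustration of how small such a toolset can be, the sketch below dispatches the four operations in TypeScript on Node.js. The `ToolCall` shapes, the `runTool` function, and the rule that an edit must match exactly once are assumptions made for this example, not pi's actual implementation.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { execSync } from "node:child_process";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical call shapes for illustration; pi's real tool schemas may differ.
type ToolCall =
  | { tool: "read"; path: string }
  | { tool: "write"; path: string; content: string }
  | { tool: "edit"; path: string; oldText: string; newText: string }
  | { tool: "bash"; command: string };

function runTool(call: ToolCall): string {
  switch (call.tool) {
    case "read":
      return readFileSync(call.path, "utf8");
    case "write":
      writeFileSync(call.path, call.content, "utf8");
      return `wrote ${call.path}`;
    case "edit": {
      // Assumed rule: the text to replace must occur exactly once.
      const before = readFileSync(call.path, "utf8");
      const first = before.indexOf(call.oldText);
      if (first === -1) throw new Error(`text not found in ${call.path}`);
      if (before.indexOf(call.oldText, first + 1) !== -1) {
        throw new Error(`text matches more than once in ${call.path}`);
      }
      const after =
        before.slice(0, first) + call.newText + before.slice(first + call.oldText.length);
      writeFileSync(call.path, after, "utf8");
      return `edited ${call.path}`;
    }
    case "bash":
      // Runs through the platform's default shell, which is not necessarily bash.
      return execSync(call.command, { encoding: "utf8" });
  }
}

// Usage: write, edit, and read back a file in a temporary directory.
const workDir = mkdtempSync(join(tmpdir(), "four-tools-"));
const target = join(workDir, "greeting.txt");
runTool({ tool: "write", path: target, content: "hello world\n" });
runTool({ tool: "edit", path: target, oldText: "world", newText: "pi" });
console.log(runTool({ tool: "read", path: target }));
console.log(runTool({ tool: "bash", command: "echo done" }));
```

Everything else a model needs, such as searching or running tests, can go through the shell tool, which is the argument the talk makes for stopping at four.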

Extensible Functionality of pi 22:23

"You can extend tools and give the LLM tools that you define, and I think no other coding agent harness currently offers that."

  • pi lets users define custom tools and hand them to the LLM as part of their own workflow. According to Zechner, other coding harnesses do not currently offer this; in pi it takes only a small amount of TypeScript.

  • Extensions can be specific to a user's tasks and are hot-reloadable, so changes take effect on the fly without restarting the agent.
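The idea of user-defined tools that can be swapped while the agent runs can be sketched as a small registry. This is not pi's extension API: `ToolDefinition`, `ToolRegistry`, and the `word_count` tool are hypothetical names chosen for illustration, and hot reloading is approximated by letting a later registration replace an earlier one with the same name.

```typescript
// Hypothetical shapes for illustration; not pi's actual extension API.
interface ToolDefinition {
  name: string;
  description: string;
  execute(args: Record<string, unknown>): string;
}

class ToolRegistry {
  private tools = new Map<string, ToolDefinition>();

  // Registering a tool under an existing name replaces the old definition,
  // which stands in for reloading an edited extension file.
  register(tool: ToolDefinition): void {
    this.tools.set(tool.name, tool);
  }

  list(): string[] {
    return Array.from(this.tools.keys());
  }

  call(name: string, args: Record<string, unknown>): string {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool.execute(args);
  }
}

const registry = new ToolRegistry();

// First version of a user-defined tool.
registry.register({
  name: "word_count",
  description: "Counts whitespace-separated words in the given text.",
  execute: (args) => {
    const text = String(args["text"] ?? "");
    return String(text.trim().split(/\s+/).filter((w) => w.length > 0).length);
  },
});

const firstResult = registry.call("word_count", { text: "read write edit bash" });

// "Reload": the same tool name now returns a more descriptive result.
registry.register({
  name: "word_count",
  description: "Counts words and reports the result as a sentence.",
  execute: (args) => {
    const text = String(args["text"] ?? "");
    const count = text.trim().split(/\s+/).filter((w) => w.length > 0).length;
    return `${count} words`;
  },
});

const secondResult = registry.call("word_count", { text: "read write edit bash" });
console.log(firstResult, "|", secondResult);
```

Keying the registry by tool name keeps the model-facing interface stable while the implementation behind it changes.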

Communication and Collaboration Features 24:11

"pi-messenger acts like a chat room for multiple pi agents, allowing them to communicate with a custom UI."

  • Zechner introduces pi-messenger, which lets different agents communicate with each other. Users can monitor what the agents are doing through a custom UI and potentially engage them in a gaming experience while tasks are executing.

  • Another extension lets users annotate live websites, so they can give coding agents visual feedback in real time without switching editing environments.

Session Management and Cost Tracking 25:00

"Your session is a tree, not a linear list of chats, which allows for sub-agents that can read all the files in a directory."

  • Sessions in pi are organized as a tree rather than a linear chat history, so users can branch a conversation and navigate between branches and sub-agents. This helps with complex tasks where several lines of work need to be tracked.

  • Zechner also highlights cost tracking. Many coding harnesses lack it, but pi includes it to make spending transparent during development.
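A minimal sketch of how a tree-shaped session and cost tracking can fit together is shown below. The data model is an assumption for illustration and not pi's session format: each node stores its parent and its token usage, the context for a branch is the path from the root to that node, and the cost of a branch is derived from the usage along that path. The prices are placeholders, not real model pricing.

```typescript
// Hypothetical data model for illustration; pi's real session format may differ.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface Pricing {
  inputPerMillion: number; // USD per million input tokens (placeholder)
  outputPerMillion: number; // USD per million output tokens (placeholder)
}

interface SessionNode {
  id: number;
  parentId: number | null;
  text: string;
  usage: Usage;
}

class SessionTree {
  private nodes = new Map<number, SessionNode>();
  private nextId = 1;

  // Appending under any existing node creates a new branch from that point.
  append(parentId: number | null, text: string, usage: Usage): number {
    if (parentId !== null && !this.nodes.has(parentId)) {
      throw new Error(`unknown parent: ${parentId}`);
    }
    const id = this.nextId++;
    this.nodes.set(id, { id, parentId, text, usage });
    return id;
  }

  // The context of a branch is the path from the root down to the given node.
  branch(id: number): SessionNode[] {
    let current = this.nodes.get(id);
    if (!current) throw new Error(`unknown node: ${id}`);
    const path: SessionNode[] = [];
    while (current) {
      path.unshift(current);
      current = current.parentId === null ? undefined : this.nodes.get(current.parentId);
    }
    return path;
  }

  // Cost of one branch: token usage summed along the path, times the price.
  branchCost(id: number, pricing: Pricing): number {
    let input = 0;
    let output = 0;
    for (const node of this.branch(id)) {
      input += node.usage.inputTokens;
      output += node.usage.outputTokens;
    }
    return (input * pricing.inputPerMillion + output * pricing.outputPerMillion) / 1e6;
  }
}

const pricing: Pricing = { inputPerMillion: 2, outputPerMillion: 10 };
const tree = new SessionTree();
const root = tree.append(null, "Fix the failing test", { inputTokens: 400000, outputTokens: 100000 });
const planA = tree.append(root, "Try approach A", { inputTokens: 600000, outputTokens: 400000 });
const planB = tree.append(root, "Try approach B", { inputTokens: 100000, outputTokens: 100000 });

console.log(tree.branch(planB).map((n) => n.text).join(" > "));
console.log(tree.branchCost(planA, pricing), tree.branchCost(planB, pricing));
```

Both branches share the root message but not each other's history, which is the property that distinguishes a tree from a linear list of chats.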