Video Summary

Full Walkthrough: Workflow for AI Coding — Matt Pocock

AI Engineer

Main takeaways
01

Start by converting ambiguous briefs into a concise PRD that defines destination and out-of-scope items.

02

Break work into vertical 'tracer bullet' slices and create a Kanban backlog so agents can work in parallel.

03

Keep LLM context small—stay in the 'smart zone' by chunking tasks and compacting session history.

04

Use TDD (red-green-refactor) and strong feedback loops so agents produce verifiable code and tests.

05

Design deep modules with simple interfaces to make codebases testable and agent-friendly; avoid shallow, noisy module graphs.

Key moments
Questions answered

What does Matt mean by the LLM 'smart zone' and 'dumb zone'?

The smart zone is when a model operates from a small, fresh context and yields best results. As you add tokens and long histories the attention relationships multiply and performance drops into the 'dumb zone'—so size tasks to stay within the model's effective context window.

How should teams prepare work for autonomous (AFK) agents?

Create a clear PRD (destination), break it into vertical tracer-bullet issues on a Kanban board, include out-of-scope items, and ensure each ticket has tests and a defined definition of done so agents can pick tasks autonomously.

Why is test-driven development (TDD) emphasized for AI agents?

TDD forces a red-green-refactor cycle: write failing tests first, run them, then implement. That creates reliable feedback loops and prevents agents from 'cheating' tests, improving QA and making agent output verifiable.

How should you structure a codebase to make it easier for AI agents to work in?

Prefer deep modules with small public interfaces that encapsulate complexity. Avoid many tiny shallow modules and noisy dependency graphs—deep, testable modules make it easier to define test boundaries and for agents to implement features correctly.

What is Sandcastle and how does it fit the workflow?

Sandcastle is a TypeScript library Matt demonstrates for running agent loops in isolated Docker sandboxes. It helps run prompts, execute code, and manage git branches so multiple agents can work in parallel and safely against a repo.

Introduction to AI and Coding Concepts 00:19

"Welcome, my name's Matt, I teach AI."

  • Matt introduces himself and establishes the purpose of the workshop, which is to guide the audience through AI coding concepts over a two-hour session.

  • He emphasizes that AI is perceived as a new paradigm but stresses that foundational software engineering principles still apply and work effectively with AI technologies.

  • The intent of the workshop is to highlight these critical fundamentals while also exploring the nuances of integrating AI into coding practices.

Audience Engagement and AI Experience 01:41

"Raise your hand if you've ever coded with AI."

  • Matt encourages audience participation by asking them to raise their hands based on their experiences with AI coding.

  • He notes different levels of familiarity with AI, ranging from those who have coded with it to those who face frustrations when using AI technologies.

  • This engagement technique fosters a sense of community and shared experience among participants, setting a collaborative tone for the workshop.

Understanding the Smart and Dumb Zones of LLMs 03:00

"When you're working with LLMs, they have a smart zone and a dumb zone."

  • Matt introduces the concept of "smart zones" and "dumb zones" in the context of working with Large Language Models (LLMs).

  • He explains that LLMs perform best when starting from a clean slate, as adding more tokens increases complexity and leads to diminished performance.

  • Matt likens adding tokens to adding teams in a football league: every new team has a relationship with every existing team, so the number of relationships grows much faster than the number of teams. Staying within manageable limits avoids dropping into the "dumb zone."
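
The league analogy can be made concrete with a small calculation. This is a minimal sketch, assuming every pair of items counts as exactly one relationship; it simplifies how attention works in a real model, and the function name is illustrative.

```typescript
// Number of distinct pairings among n items (teams in a league, or tokens in a context).
// Assumption: every pair counts as one relationship, so the count is n * (n - 1) / 2.
function pairwiseRelationships(n: number): number {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("n must be a non-negative integer");
  }
  return (n * (n - 1)) / 2;
}

// Doubling the number of items roughly quadruples the number of pairings.
for (const n of [10, 20, 40]) {
  console.log(`${n} items -> ${pairwiseRelationships(n)} pairings`);
}
```

Under this assumption, 10 items give 45 pairings, 20 give 190, and 40 give 780, which is why a context that is only a few times larger can feel much harder for the model to handle.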

Strategies for Handling Large Tasks with AI 05:02

"How do you tackle big tasks?"

  • Matt discusses the challenge of managing large coding tasks with AI and emphasizes the importance of breaking them down into smaller, more digestible components.

  • He mentions the practice of employing multi-phase plans to ensure that each task remains within the smart zone, thereby maximizing efficiency and accuracy.

  • The discussion pivots to the iterative approach of refining and looping through tasks in order to achieve the desired results while minimizing potential overwhelm from the AI.

The Role of Compacting in AI Sessions 08:54

"Every session with an LLM goes through the same stages."

  • Matt highlights that each session with an LLM has specific stages: system prompting, exploratory phase, implementation, and testing.

  • He advises that context should be kept minimal to avoid entering the dumb zone, stressing the importance of compacting information effectively during sessions to maintain optimal performance.

  • This insight serves as technical guidance for developers working with LLMs, enhancing their ability to manage context and workflow effectively.

Using AI for Coding Workflows 09:30

"It's asking me a bunch of questions, and I can highly recommend you do this."

  • Engaging in conversations with large language models (LLMs) can help plan out the next steps in coding sessions. This interactive dialogue stimulates ideas and enhances clarity.

  • Monitoring token usage during AI interactions is crucial since it provides insight into how close you are to the 'dumb zone' of LLMs, where performance may degrade. A status line displaying token count can aid in maintaining this awareness.
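
A status line of this kind could be approximated as follows. This is only a sketch: the four-characters-per-token estimate is a rough rule of thumb rather than a real tokenizer, and `smartZoneLimit` is a hypothetical budget you would choose yourself, since the summary gives no specific number.

```typescript
// Rough token estimate. Assumption: about 4 characters per token, a common rule of thumb
// for English text; a real tokenizer would give exact counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Builds a one-line status string. `smartZoneLimit` is a hypothetical budget chosen by
// the user, not a value taken from the talk.
function contextStatusLine(messages: string[], smartZoneLimit: number): string {
  const used = messages.reduce((sum, message) => sum + estimateTokens(message), 0);
  const percent = Math.round((used / smartZoneLimit) * 100);
  const zone = used <= smartZoneLimit ? "smart zone" : "over budget";
  return `${used}/${smartZoneLimit} tokens (${percent}%) - ${zone}`;
}

console.log(contextStatusLine(["Plan the gamification feature", "a".repeat(400)], 1000));
```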

Compacting AI Conversations 10:07

"So if you're able to do that and you're able to optimize for that, then you're in a great spot."

  • Compacting is a technique where past conversations with an LLM are condensed into a smaller amount of information. This creates a usable history of interactions that helps developers refer back to previous discussions.

  • However, preferences differ among developers when it comes to compacting: some find it beneficial, while others prefer to keep the context open for a more consistent flow of interaction, similar to the character from "Memento."
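
The idea of compacting can be sketched in a few lines. This is not how any particular tool implements it; the `summarize` callback is a hypothetical stand-in for an LLM call, and the message shape is an assumption.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Compacts a session: keeps system messages (moved to the front), replaces older turns
// with one summary message, and keeps the most recent `keepRecent` turns verbatim.
// `summarize` is a hypothetical callback; in practice it would be an LLM call.
function compactSession(
  messages: ChatMessage[],
  keepRecent: number,
  summarize: (older: ChatMessage[]) => string,
): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const turns = messages.filter((m) => m.role !== "system");
  if (turns.length <= keepRecent) return messages; // nothing worth compacting
  const older = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  const summary: ChatMessage = {
    role: "assistant",
    content: `Summary of earlier conversation: ${summarize(older)}`,
  };
  return [...system, summary, ...recent];
}
```

The trade-off is visible in the code: the summary is much smaller than the turns it replaces, but whatever `summarize` leaves out is gone for the rest of the session.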

Setting Up the Challenge: Gamification of Learning 13:55

"I'd love to add some gamification to the platform."

  • A scenario is described where a client, Sarah Chen, indicates a need for gamification to improve student retention on a course platform. Understanding client briefs effectively is essential for transforming ideas into actionable coding tasks.

  • It emphasizes the importance of communication, even with hypothetical clients, to clarify project goals, objectives, and any constraints involved in the development process.

The Grill Me Skill: Enhancing AI Communication 14:39

"Interview me relentlessly about every aspect of this plan until we reach a shared understanding."

  • The ‘Grill Me Skill’ is introduced as a method to ensure that both the developer and the AI have a shared understanding of project goals. Instead of jumping to a plan, this skill involves a series of detailed inquiries to clarify objectives and dependencies.

  • This approach counters the common pitfall in development of relying too heavily on specification documents that may not translate well into effective code. Emphasizing comprehension over mere execution is vital for project success.

Starting Simple with AI Recommendations 18:56

"Keep it simple: two point sources to start."

  • The speaker emphasizes simplicity by accepting the recommendation to start the gamification feature with only two point sources.

  • They note that this approach not only provides clarity in the task but also aligns well with the project goals.

  • Often, the AI's suggestions are found to be very useful, demonstrating the value of leveraging AI recommendations.

Engaging in Discussions with AI 19:25

"I usually dictate to the AI instead of typing."

  • The speaker discusses their preference for dictating to the AI during interactions, enhancing the efficiency of communication.

  • They encounter technical difficulties with their new laptop, which affects their typical workflow.

  • A critical query arises about whether lesson progress points should be retroactive, underscoring the need for alignment on such decisions for proper feature fulfillment.

The Importance of Collaborative Alignment 21:01

"You're having discussions with the AI to get aligned."

  • Engaging with AI can lead to in-depth discussions, sometimes involving up to a hundred questions in a session.

  • This process creates a conversation history that becomes a valuable asset in the design and planning phase of projects.

  • The same collaborative approach can be utilized with domain experts, making it a versatile tool for validating assumptions and refining ideas.

Handling Q&A and Interaction with AI 22:09

"Let's start a little Q&A session now."

  • The speaker transitions into a Q&A format to engage with participants while continuing to interact with the AI for insights.

  • They invite viewers to participate and explore questions, underscoring the collaborative nature of this process.

  • Through interaction with the AI, they aim to discover more about gamification and specific functionalities, demonstrating the dynamic nature of AI-assisted discussions.

The Role of Humans in AI Collaboration 26:41

"There are two types of tasks in the AI age: human-in-the-loop tasks, and AFK tasks."

  • The speaker highlights the necessity of human involvement during certain phases of project planning while recognizing tasks that can be handled independently by AI.

  • Certain activities, such as implementation, can become automated or delegated to AI, freeing humans for more critical, creative tasks.

  • This delineation asserts the importance of human expertise in planning and alignment, particularly in collaborative environments with AI.

Defining the Task Structure 28:42

"We need to essentially find some way of turning it into a destination."

  • The process begins by determining the objectives for the AI coding session, likened to a grilling session where tasks and goals are clarified.

  • Two essential documents are identified as necessary: one that outlines the destination (what is being aimed for) and another that documents the journey (the tasks needed to reach the destination).

  • The speaker emphasizes the importance of understanding both the overall goals and the detailed breakdown of tasks, which leads to a clear definition of "done."

Creating the Product Requirements Document (PRD) 29:41

"All we're really doing is summarizing the design concept that we have so far."

  • The next step in the workflow is to write a Product Requirements Document (PRD). This document serves as the destination outline in the workflow.

  • The PRD includes problem statements, solutions, user stories, and implementation decisions—key components that contribute to a well-defined project outline.

  • The speaker prefers a specific structure for the PRD but notes that teams can adapt the format to fit their needs or practices.
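
One way to picture the PRD is as a small data structure. The sketch below uses the sections named in this summary, plus the out-of-scope section mentioned later in the session; the field names and the rendering format are illustrative, not the speaker's actual template.

```typescript
// Sections named in the workshop summary; the field names themselves are illustrative.
interface Prd {
  title: string;
  problemStatement: string;
  solution: string;
  userStories: string[];
  implementationDecisions: string[];
  outOfScope: string[];
}

// Renders the PRD as plain text that can be pasted into an issue or a prompt.
function renderPrd(prd: Prd): string {
  const list = (items: string[]) => items.map((item) => `- ${item}`).join("\n");
  return [
    `PRD: ${prd.title}`,
    `Problem statement\n${prd.problemStatement}`,
    `Solution\n${prd.solution}`,
    `User stories\n${list(prd.userStories)}`,
    `Implementation decisions\n${list(prd.implementationDecisions)}`,
    `Out of scope\n${list(prd.outOfScope)}`,
  ].join("\n\n");
}

// Example content is invented for illustration only.
console.log(
  renderPrd({
    title: "Gamification",
    problemStatement: "Students drop off before finishing courses.",
    solution: "Award points for progress through the course.",
    userStories: ["As a student, I earn points when I complete a lesson."],
    implementationDecisions: ["Start with two point sources."],
    outOfScope: ["Leaderboards"],
  }),
);
```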

Importance of Code and Continuous Design 33:10

"We actually keep the code in mind throughout the whole process."

  • Once the PRD is established, it is crucial to keep code considerations at the forefront of the development process.

  • The speaker discusses the idea of continuously designing the system by proposing modifications to existing modules, ensuring that the focus remains on practical coding implications.

  • The recording of this workflow is presented through practical examples, connecting theory with application.

Evaluating the Document's Usefulness 35:11

"I don't tend to read these documents because I know that LLMs are great at summarization."

  • The speaker mentions that they do not review these documents rigorously, because they trust large language models (LLMs) to summarize effectively.

  • This reflects a strategic decision to trust the model’s output based on a shared design concept rather than getting bogged down by extensive document reviews.

  • The speaker also opens the floor for a Q&A session, encouraging participant engagement and feedback on the process just discussed.

Comfort Break and PRD Overview 38:27

"I can just notice some sleepy eyes, and I want to make sure that we're awake for the next bit."

  • Matt Pocock announces a five-minute comfort break to help the audience refresh before continuing with the discussion. He emphasizes the importance of staying alert for the next section of the workflow presentation.

  • After the break, he highlights the significance of the Product Requirements Document (PRD) as a destination document for their project. He suggests scanning for questions before diving deeper into the content.

Software Engineering Disciplines and Planning Challenges 39:03

"So, how do we actually get to our destination? How do we split it so that we don't put things into the dumb zone?"

  • The discussion shifts to identifying important software engineering disciplines in today's world. Although Matt gives a humorous response when asked for top disciplines, he acknowledges the complexities of breaking down a PRD without losing focus.

  • He poses a challenge: how to effectively plan the phases of a project. He introduces the idea of creating a multi-phase plan to reach their project destination without falling into common pitfalls.

The Kanban Board Approach 40:10

"A Kanban board is essentially just a set of tickets that you put on the wall that have blocking relationships to each other."

  • Matt introduces the concept of a Kanban board as a tool for organizing tasks and projects in a visually manageable way. This approach has been common practice since the Agile methodology gained traction.

  • He explains that the Kanban board will help visualize the project’s workflow by breaking it down into multiple tasks with their specific interdependencies, which aids in better planning and execution.
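
A board of tickets with blocking relationships can be modelled as below. This is a minimal sketch: the `Ticket` shape and the example tickets are assumptions, not the format used in the workshop.

```typescript
interface Ticket {
  id: string;
  title: string;
  blockedBy: string[]; // ids of tickets that must be done first
  done: boolean;
}

// Returns tickets that are not done and whose blockers are all done.
// These are the ones that could be picked up in parallel right now.
function readyTickets(board: Ticket[]): Ticket[] {
  const doneIds = new Set(board.filter((t) => t.done).map((t) => t.id));
  return board.filter((t) => !t.done && t.blockedBy.every((id) => doneIds.has(id)));
}

// Hypothetical tickets for the gamification example.
const exampleBoard: Ticket[] = [
  { id: "t1", title: "Award points when a lesson is completed", blockedBy: [], done: false },
  { id: "t2", title: "Show total points on the dashboard", blockedBy: ["t1"], done: false },
  { id: "t3", title: "Award points for a second source", blockedBy: [], done: false },
];

console.log(readyTickets(exampleBoard).map((t) => t.id));
```

With this board, t1 and t3 are ready immediately and t2 waits on t1, so two pieces of work can proceed at the same time.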

Importance of Vertical Slices 41:58

"What you really need to do is think about vertical layers. You need to think about thin slices of functionality that cross all of the layers that you need to."

  • An essential part of effective software planning involves focusing on vertical slices instead of building one horizontal layer at a time. Matt emphasizes the significance of testing and integrating layers of functionality together.

  • He discusses the concept of "tracer bullets": thin end-to-end slices that provide feedback early in the development process. This feedback shows how the tasks are progressing and whether they are aligned with the overall objectives.

Creating Issues and Iterating on Feedback 43:48

"We increase our level of feedback and we get near-instant feedback on what we’re building."

  • As part of the planning process, Matt discusses breaking down the PRD into smaller tracer-bullet issues to ensure continuous feedback. This helps maintain focus and clarity on the tasks at hand.

  • He explains that without timely feedback, development becomes inefficient, as the AI coding lacks direction until much later phases of the project. The goal is to keep adjustments agile and responsive to the evolving project requirements.

Tuning AI Skills for Better Collaboration 47:36

"If you feel like the AI is just absolutely hammering you, then you can just tell it to pull back a little bit or get it to do stop points."

  • AI skills can be adjusted based on user needs, allowing for finer control if the interaction becomes overwhelming.

  • If the AI is frequently too aggressive, users can tell it to pull back or ask it to add stop points.

Importance of Concision in Documentation 47:55

"When using plan mode, you can say, 'Okay, yeah, approve that.'"

  • Conciseness in documentation is emphasized to increase readability, especially when preparing plans.

  • The speaker notes their initial focus on brevity for easier understanding but later shifts their approach to a more collaborative, interactive style.

Transitioning from Concise Plans to Interactive Sessions 48:16

"I wanted to get on the same wavelength as the LLM. I wanted it to ask aggressive questions."

  • A shift from reading concise plans to engaging with the large language model (LLM) directly fosters a deeper collaboration and understanding.

  • This transition signifies a move toward leveraging the LLM's capabilities to enhance inquiry and interaction rather than solely focusing on simplification and grammar.

The Role of the Kanban Board in Task Management 49:31

"This means that you can start to parallelize and get agents working at the same time on these tasks."

  • The Kanban board is structured to allow for tasks to be independently manageable, facilitating parallelization.

  • A well-organized Kanban board allows multiple agents to work simultaneously on different tasks, thereby optimizing workflow and efficiency.

Implementing Parallelization in Workflows 50:22

"You can do really nice parallelization, and that's why I prefer a Kanban board set up like this to a sequential plan."

  • The use of a Kanban board over sequential plans allows for multiple agents to tackle tasks concurrently, which enhances productivity.

  • By creating independent tasks, the workflow supports flexibility and adaptability in project management.

Flow of Work for Human and AI Collaboration 52:27

"Now, the implementation stage, we step back and let an agent work through that Kanban board."

  • After thorough planning and review, the human’s role shifts, allowing the AI to execute tasks autonomously.

  • This model resembles a collaborative shift between human and AI, akin to transitioning from day shifts (planning) to night shifts (execution).

Preparing for AFK (Away From Keyboard) Automation 53:51

"We’re essentially going to run Claude and encourage it to work completely AFK."

  • Preparation for running an AFK agent involves collecting tasks and issues that will be tackled without active human supervision.

  • The AFK mode allows for continuous workflow and efficiency, even when the human team is not actively engaged.

Structuring the Backlog for Effective Task Execution 56:30

"We're essentially creating a backlog of tasks for the night shift to pick up."

  • A structured approach to creating a backlog for automation ensures that tasks are well-defined and ready for AI take-over.

  • This method allows for prioritization, enabling the AI to efficiently identify and address pressing tasks in a systematic manner.
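
The "night shift" can be pictured as a loop over the backlog. The sketch below is an assumption-heavy simplification: `runAgent` is a hypothetical stand-in for whatever actually does the work, it is synchronous to keep the example short, and lower `priority` numbers are treated as more urgent.

```typescript
interface BacklogItem {
  id: string;
  priority: number; // lower number = more urgent (an assumption of this sketch)
  blockedBy: string[];
  done: boolean;
}

// `runAgent` stands in for a real agent run. It returns true when the item meets its
// definition of done. A real runner would be asynchronous.
type AgentRunner = (item: BacklogItem) => boolean;

// Works through the backlog one item at a time until nothing is ready to start.
// Marks items as done in place and returns the ids that were completed, in order.
function nightShift(backlog: BacklogItem[], runAgent: AgentRunner): string[] {
  const completed: string[] = [];
  const attempted = new Set<string>();
  for (;;) {
    const doneIds = new Set(backlog.filter((i) => i.done).map((i) => i.id));
    const [next] = backlog
      .filter((i) => !i.done && !attempted.has(i.id))
      .filter((i) => i.blockedBy.every((id) => doneIds.has(id)))
      .sort((a, b) => a.priority - b.priority);
    if (next === undefined) return completed; // nothing unblocked is left
    attempted.add(next.id);
    next.done = runAgent(next);
    if (next.done) completed.push(next.id);
  }
}
```

Items that fail are attempted only once and left open, so they are still on the board for a human to look at in the morning.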

Running AI and Setup Observations 57:38

"While it's running, you probably have some questions about this setup and the decisions I've made to delegate all of my coding to AI."

  • The video shows Claude, the AI agent, running against the backlog. The presenter encourages viewers to follow along if they wish.

  • While the AI runs, a discussion is initiated to address common queries related to this setup. The aim is to clarify the rationale behind choosing to automate coding tasks using AI.

Managing Decision Outcomes in AI Sessions 58:23

"In the PRD write-up, there's a section at the bottom for out-of-scope items, which is important for giving a definition of done."

  • Viewers inquire about how to manage negative decisions and rationales in coding sessions with AI.

  • The response highlights the significance of including a section in the Product Requirements Document (PRD) to list out-of-scope items. This approach provides clarity on project boundaries and helps define what constitutes completion.

The Challenge of Code Reviews with AI Delegation 59:22

"If we delegate all of our coding to agents, we still need to QA and code review the work."

  • The speaker expresses concern that delegating coding tasks to AI might lead to an increased burden of code reviews and quality assurance.

  • When implementing multiple issues simultaneously, adhering to the practice of keeping pull requests small can be cumbersome. The prevalent sentiment is that developers might face more extensive code review tasks than before.

Collaboration and Feedback Loops in Team Environments 01:00:31

"If you have an idea and you conduct a grilling session on it without a clear answer, you need to loop in your team."

  • A question arises regarding the collaborative approach to working as a team while handling ideas that may evolve or change.

  • It’s emphasized that the entire workflow from idea generation to quality assurance stages requires team involvement, ensuring that all perspectives are considered. Creating prototypes and exchanging feedback iteratively can enhance the development process.

Prototyping Best Practices with AI 01:02:35

"The prototype is supposed to give you feedback earlier on the process."

  • The dialogue transitions to prototyping, underlining its importance in the front-end development process.

  • While AI may struggle with creating polished front-end designs, it excels at generating initial concepts. These prototypes serve as tools for gathering feedback and refining ideas through subsequent grilling sessions.

Integrating Architecture and Design Constraints 01:04:01

"How do you make it conform to code standards, the architecture design, and security rules?"

  • The discussion shifts to integrating existing software architecture and design constraints, questioning how new implementations can align with established API contracts and security protocols.

  • Responding to this inquiry, the presenter suggests examining how prototypes and architectural guidelines can be subsequently adapted to ensure consistency with the overall system architecture.

The Importance of Test-Driven Development (TDD) 01:06:48

"TDD is test-driven development, which follows the red-green-refactor cycle."

  • TDD, or test-driven development, is essential for maximizing the effectiveness of AI agents. The process involves writing a failing test first, known as the "red" phase, before implementing the code to ensure it passes, referred to as the "green" phase. This cycle encourages a structured approach to coding, as it requires developers to clearly define the expected functionality of their code before it is written.

  • By incorporating TDD, developers introduce valuable tests into the codebase as a by-product, often leading to better testing outcomes. In the workshop, this is applied to the gamification service being built for the course platform.

  • However, AI sometimes struggles to write valid tests and often "cheats" by writing the entire implementation first and then fitting the tests to it. Certain tools enforce more rigorous TDD practices, making it difficult for AI to bypass the process.
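
The red-green order can be shown in a single runnable file by writing the test first and running it against a stub before the implementation exists. This is a sketch only: the function names and the ten-points-per-lesson rule are invented for the example, and a real project would use a test runner rather than a hand-rolled check.

```typescript
type AwardPoints = (completedLessons: number) => number;

// Step 1 (red): the test is written first, against behaviour that does not exist yet.
// Returns true when the implementation passes.
function lessonPointsTest(awardPoints: AwardPoints): boolean {
  try {
    return awardPoints(0) === 0 && awardPoints(3) === 30;
  } catch {
    return false;
  }
}

// Before any implementation exists, the test must fail.
const notImplemented: AwardPoints = () => {
  throw new Error("not implemented");
};
console.log("red:", lessonPointsTest(notImplemented));

// Step 2 (green): the simplest implementation that makes the test pass.
// The 10-points-per-lesson rule is invented for this example.
const awardPointsForLessons: AwardPoints = (completedLessons) => completedLessons * 10;
console.log("green:", lessonPointsTest(awardPointsForLessons));
```

Seeing the test fail first is what shows that it checks something; a test that has only ever passed may have been fitted to the implementation.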

Feedback Loops and Their Significance 01:09:54

"Feedback loops are essential to getting AI to produce anything reasonable."

  • Feedback loops play a crucial role in AI development, providing necessary guidance to produce quality code. Without these loops, AI operates "blindly," incapable of discerning how to navigate or improve the codebase effectively.

  • The quality of feedback loops directly influences the output of the AI, serving as a ceiling for its capability to generate effective code. If outputs from AI are subpar, enhancing the quality of feedback loops is often a necessary solution.

  • Running tests and checks, such as npm run test or a type-check script (for example, npm run typecheck), allows for immediate corrective action when issues arise, further underscoring the importance of these feedback mechanisms.
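
A feedback loop of this kind boils down to running checks and handing the failures back to the agent. In the sketch below, `exec` is injected so the example runs anywhere; in a real project it would shell out to the commands (for instance through Node's child_process module). The function names and the result shape are assumptions.

```typescript
interface CheckResult {
  command: string;
  ok: boolean;
  output: string;
}

// Runs each command through the injected `exec` and records whether it succeeded.
// Assumption: exit code 0 means the check passed.
function runFeedbackLoop(
  commands: string[],
  exec: (command: string) => { exitCode: number; output: string },
): CheckResult[] {
  return commands.map((command) => {
    const { exitCode, output } = exec(command);
    return { command, ok: exitCode === 0, output };
  });
}

// Turns failures into text that can be handed back to the agent for its next attempt.
function feedbackForAgent(results: CheckResult[]): string {
  const failures = results.filter((r) => !r.ok);
  if (failures.length === 0) return "All checks passed.";
  return failures.map((r) => `${r.command} failed:\n${r.output}`).join("\n\n");
}
```

The richer the output captured here, the more the agent has to work with, which is one way to read the point that feedback quality sets a ceiling on output quality.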

Managing Code Quality and Module Structure 01:14:10

"Bad tests often stem from trying to wrap every tiny function in its own test boundary."

  • The structure of a codebase profoundly impacts the quality of AI development, especially when dealing with complicated or poorly designed modules. Shallow modules with numerous small files can create a highly convoluted dependency graph that is challenging for both AI and human developers to navigate.

  • Properly establishing test boundaries is essential; care must be taken to not isolate functions unnecessarily, which can lead to missed interactions and order problems between integrated modules. This approach to testing requires a balance between individual module testing and collective testing to ensure thorough validation of interdependent functionalities.

  • By understanding these dynamics, developers can improve their codebases, making them more amenable to effective AI productivity and overall enhancing outcomes.

The Importance of Testable Codebases 01:16:25

"Figuring out how to build a codebase that is easy to test is essential here."

  • Having a codebase that is straightforward to test enhances the feedback loops, ultimately leading to improved performance from AI systems integrated into the software.

  • Good codebases should consist of "deep modules" that present a simple interface while containing substantial functionality within. This design allows for easier testing by encapsulating dependencies within individual modules.

  • The practice of wrapping test boundaries around these deep modules facilitates capturing extensive functionality, making the testing process simpler for developers.

Deep vs. Shallow Modules 01:16:58

"Deep modules versus shallow modules is a good understanding."

  • A deep module is characterized by a limited interface while encompassing considerable internal functionality, promoting effective testing. Conversely, shallow modules expose a lot of interface for little functionality, which can complicate the testing process.

  • Without proper oversight from developers, AI can unintentionally generate shallow module structures, which may lead to inefficient codebases and hinder testability. Active direction is necessary to maintain a clear and manageable code structure.
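
A deep module might look like the following. This is an illustrative sketch built on the workshop's gamification example; the interface, the names, and the scoring rule are all assumptions.

```typescript
// Public interface: small. Callers and tests only see these two methods.
interface Gamification {
  recordLessonCompleted(studentId: string): void;
  pointsFor(studentId: string): number;
}

// Implementation: everything else (storage, scoring rules) stays hidden inside the
// module, so it can change without touching callers or tests.
function createGamification(pointsPerLesson = 10): Gamification {
  const lessonsByStudent = new Map<string, number>();
  const score = (lessons: number) => lessons * pointsPerLesson; // internal rule
  return {
    recordLessonCompleted(studentId) {
      lessonsByStudent.set(studentId, (lessonsByStudent.get(studentId) ?? 0) + 1);
    },
    pointsFor(studentId) {
      return score(lessonsByStudent.get(studentId) ?? 0);
    },
  };
}

const gamification = createGamification();
gamification.recordLessonCompleted("student-1");
gamification.recordLessonCompleted("student-1");
console.log(gamification.pointsFor("student-1"));
```

The test boundary sits at the two public methods. A shallow alternative would export the map, the scoring function, and the counter separately, and each would need its own tests and mocks.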

Mental Strategies for Code Mastery 01:19:30

"If we lose the sense of our codebase, we're not going to be able to improve it."

  • As projects rush forward and responsibilities are delegated, developers may lose familiarity with their codebase, which can impair their ability to enhance the system. This highlights the need for a balanced approach that allows for rapid development while maintaining a strong awareness of the overall structure.

  • A proposed solution is to design the interfaces of modules carefully, allowing for delegation of implementation that can simplify the developer's mental load. By understanding module shapes and behaviors, while relying on AI for actual implementation, developers can preserve their grasp on the codebase without becoming overwhelmed.

Improving Codebase Architecture 01:21:28

"We have a skill called 'improve codebase architecture.'"

  • Tools exist that can help analyze and refine the architecture of a codebase, ensuring modules are appropriately deep and well-structured. Running these analysis tools on existing repos can lead to discoveries about where improvements can be made.

  • For example, encapsulating broad functionalities like a video editor into a single module enables comprehensive testing from the front end to the back end. This allows AI to operate effectively within a well-defined structure, drastically improving coding efficiency and quality.

Importance of the PRD and its Optimization 01:26:01

“The journey is really just a hint of where you want to go; the place that you need to be putting the work is in QA.”

  • During the discussion, it is noted that the PRD (Product Requirements Document) should not be constantly optimized to an ideal state before initiating work. Instead, it should serve as a directional tool guiding the team toward their objectives.

  • The speaker emphasizes that the real focus should be on QA (Quality Assurance), which will significantly impact the project's success rather than endlessly refining the PRD.

Coding Standards: Push vs. Pull Approaches 01:28:03

“When you have an implementer, you want the coding standards to be available via pull; the reviewer needs to have the coding standards pushed to them.”

  • The speaker introduces two approaches to ensure that coding standards are met during the development process: push and pull.

  • The push approach refers to pushing instructions to the LLM (Large Language Model) so it consistently follows specified coding standards, while the pull approach allows the LLM to seek out the necessary information when needed.

  • For implementers, coding standards should be accessible via a pull method, encouraging them to seek clarification when they encounter uncertainties. In contrast, for reviewers, coding standards should be pushed to ensure they can verify that the code aligns with the expected standards.
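
The difference between push and pull can be sketched as two ways of building a prompt. The prompt wording, the function name, and the file path are hypothetical; only the push/pull split comes from the session.

```typescript
type AgentRole = "implementer" | "reviewer";

// Push: the full standards text is placed in the prompt, so the reviewer always sees it.
// Pull: the prompt only says where the standards live, so the implementer reads them
// on demand. The prompt wording is hypothetical.
function buildPrompt(
  role: AgentRole,
  task: string,
  standards: string,
  standardsPath: string,
): string {
  if (role === "reviewer") {
    return `Review this change: ${task}\n\nCoding standards:\n${standards}`;
  }
  return `Implement: ${task}\n\nIf unsure about conventions, read ${standardsPath}.`;
}

// The path below is a made-up example.
console.log(buildPrompt("implementer", "award lesson points", "Use named exports.", "docs/standards.md"));
```

Pull keeps the implementer's context small, while push guarantees the reviewer checks against the full standards every time.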

Sandcastle Tool Overview 01:29:50

“Sandcastle is essentially a TypeScript library for running these loops, allowing you to run a prompt inside a Docker container.”

  • The speaker describes a tool called Sandcastle, designed to enhance the workflow of running agents without manual involvement. It is a TypeScript library that facilitates the creation of a work environment using Docker containers.

  • Sandcastle enables developers to run code in a sandbox and manage Git branches, allowing for streamlined parallel processing across various tasks.

  • This tool is particularly useful for automating loops and managing multiple agents effectively, optimizing coding workflows.

Workflow Process and Quality Assurance 01:34:04

“Throughout this process, we're being very intentional with the kind of modules and the shape of the code base that we want.”

  • The speaker outlines a structured workflow process that focuses on developing a code base that is planned and intentional rather than simply generated from requirements.

  • The workflow involves breaking down tasks into parallelizable issues that agents can work on simultaneously.

  • Quality Assurance is integral to this process, generating additional issues for the Kanban board while code is being implemented. This iterative approach promotes continuous improvement and accountability.

  • Lastly, it is emphasized that once the work reaches a satisfactory state, sharing it with the team for a comprehensive review is essential to ensure quality standards are met.

Customization and Recommendations for AI Coding Workflow 01:35:20

"All of this can be customized by you. What I recommend, if you take one thing away from this session, is that you should head to Amazon and just buy a ton of those old books because I found it so enlightening reading them."

  • The speaker emphasizes that the workflow discussed is not rigid and can be tailored to fit individual needs and preferences.

  • There is a strong recommendation to seek out older, foundational software engineering books written before the AI era. The speaker reflects on how enlightening these materials were for them, suggesting that others may find similar value.

  • The act of exploring pre-AI writing is highlighted as beneficial, with the speaker noting that each page offers useful insights that can aid in understanding contemporary practices in AI development.

  • This advice is presented as a key takeaway from the session, indicating the importance of grounding oneself in traditional knowledge even as technology evolves.