Understanding Local Development and Coding Agents

This is the kind of foundational knowledge that I think you need to work with today's AI coding agents if you're not that technical.

## Local vs Remote

Firstly, you have to understand what the hell local and remote mean. Local just means it's on your computer. Remote means it's in the cloud.

If you're working on a Google Doc with no internet access on your phone, you're working on it locally to that device (that machine). It only lives there. But when you connect to the internet and save it to your Google Drive, it then has a remote destination. You can then go on your computer and access the same document that you were editing on your phone.

That is the difference between local and remote: what's only on your computer vs what's in the cloud.

## Terminals

When you're using a coding agent, you may be using it in your terminal. A terminal is just an app that can run commands on your computer. It can list, edit, move, and create files, all things like that. The tricky thing is that you need to know the exact commands to run.

With coding agents, you no longer need to know those commands. Except maybe 'cd', which means change directory (folder). So you'd say 'cd /Users/bentossell/project/bensbites/' and then start your agent IN that folder.

## Directories & Paths

Computers have their own kind of language, which is a little difficult to understand at first, but when you see enough of it, you start recognising it.

A directory is a folder. So when you're creating a directory, you're just creating a folder. That's all it is.

When you're trying to find a specific file or folder, it has a sort of destination URL. If you're on a website, website.com/product/category, that full URL is the URL path. With your file system, you have the same thing. It's a path to your file. So you can say it's in my home directory, which is /Users/your_username (as I showed above with cd). Then ~/projects/bensbites/ would be the path to that folder or file.
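
As a rough illustration of the path idea, here is a small Python sketch. The folder names are only examples taken from the text, not something that necessarily exists on your machine, and `expand` is a made-up helper name.

```python
from pathlib import Path

def expand(path_text: str) -> Path:
    """Turn a path that may start with ~ into a full path."""
    return Path(path_text).expanduser()

home = Path.home()                        # on a Mac this looks like /Users/your_username
project = expand("~/projects/bensbites")  # example folder, assumed for illustration

print(home)
print(project)              # the home folder followed by projects/bensbites
print(project.parts[-2:])   # the last two folder names in the path
print(project.exists())     # False unless you really have this folder
```

Running it prints your own home folder, so the output will differ from machine to machine.
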
The ~ (called a tilde) is a shortcut for your home directory, so ~/ is a replacement for /Users/your_username/. You can just say ~/projects/bensbites/ and it will know where to go, whichever folder you're currently in.

## What Coding Agents Are

So now when you hear about a coding agent, it's effectively like talking to ChatGPT or Claude. It's an LLM, an AI system that can respond to you, but it also has access to the tools that your computer has, as mentioned above.

BTW: I posted this on my cookbook site, where I share my sessions and have explainers for terms/tools that you may find hard to remember.

## System Prompts

Coding agents come with their own system prompts. Claude Code has its own system prompt, Factory's Droid has one, Codex has one, and so on. Some of these companies actually publish their system prompt so you can view it. Other companies do not, but there are accounts on Twitter that trick the system into exposing its own system prompt. So you can always go and look at them.

Effectively, what they will say is, "You are an expert software engineer. These are the things you can do. These are the things you shouldn't do. This is the company that created you. Your knowledge cutoff is this date, but it is currently this date." (because LLMs have knowledge only up to a certain date).

Coding agents can use web search tools, so you can get up-to-date information. They also might include a reference to their own documentation. So if you ask, "Okay, how do I set this thing up in my coding agent?", it can look at its own documentation to guide you (or do it for you).

## Markdown Files & agents.md

Another thing that is very common and talked about is markdown files (.md). A markdown file is just a plain-text file with some formatting (headers, lists, etc).

If you use PowerPoint then you'll know that when you export a presentation, it'll be in a PPTX format.
If you export one from Keynote on a Mac, it'll be a .key file. If you use Excel, it'll be a .xlsx file, but if you use Google Sheets, it'll be a .csv or something else.

These markdown files allow you to give your agent your own instructions beyond the system prompt. Agents look for an AGENTS.md file (except Claude, which looks for CLAUDE.md). In that file, you could have a prompt that says, "I'm not very technical at all, so anytime you're talking about code, please explain exactly what it is at all times," or, "Explain it to me like I'm five," and you'll see how the output differs between not having that instruction and having it.

Developers often put a lot of different things in here, like, "This is my tech stack. These are the key files to look at. These are the commands you need to use to run this particular project."

And it's all hierarchical, where the closest markdown file to what you're working on will be read first, before the ones in the folders above it. When editing `Button.tsx`, the agent reads the closest AGENTS.md first (components/), then works its way up to the root (home).

## Skills

And now there are things called skills. You can think of skills as reusable workflows, something you use over and over again. Skills are their own instruction files.

Say you work with a lot of projects and you want to work in a particular format. You're working on newsletter content, you've got your /project/newsletter/, and you have a skill for writing newsletter headlines. You give it your newsletter content and say, "Write a headline based on this content," and in your skill.md you have instructions that say, "Never use emoji, always use lowercase, make sure it's maximum four words. Try and pick out the most important topic from the newsletter content," and things of that nature.

Skills can also be instructions and tools. So it could be the same instructions with a tool that generates an image based on the content.
For actual code projects you'd likely have a skill for frontend design (what users see). I use one called agent-browser, which has instructions for the agent on how to use the agent-browser tool (so an agent can look at Chrome, click, scroll, read, type, etc).

## Frontmatter & Progressive Disclosure

The differences between skills and agents.md are actually very small. The skill basically just gets called by the agent as and when it needs it, instead of always being added to the system prompt.

A couple of terms worth mentioning are progressive disclosure and frontmatter.

Frontmatter is effectively like metadata. If you were writing a blog post and you've got your title and your tagline, you're writing the metadata, the SEO data, to say which search terms it should match. When the user types these words, it should hit the tagline and description. So if you're talking about AI, you would have AI in there somewhere. Frontmatter is that for these documents. So you just have `name: Newsletter-content-writer` and `description: Use this skill when asked to write any newsletter content.`

The agent itself loads all of those names and descriptions up front. It can look at that and say, "Okay, I've taken in my system prompt. That's the first thing that I know, and then I've also taken in the agents.md file to know any instructions that the user has specified. And then I can also see they've got these ten skills that I have access to. So I now know that if they ask me anything about newsletter content, I will pick up these skills and use them proactively."

And that leads us to what progressive disclosure is. It's just the 'loads them itself' part. You don't have to specify, "Use this skill to do this task." Although I actually do that a lot, because I just wanna make sure that it is definitely using that skill.

## How It All Connects

Everything you're doing is giving extra context to the agent, and we'll get into context later.
Extra context, extra instructions, extra abilities for the agent that works for you in a particular project or across all of your projects. You can have specific instructions and skills per project or for all projects.

## Dot Files & Where Skills Live

Where do your skills live? In your filesystem, some files are hidden by default. You can view them by pressing Command+Shift+. (on Mac). These are called dot files. They look like this: .factory, .claude, .codex, etc. They hold your configuration, session information, settings and skills. ~/.factory/skills/ is where your skills will be. And there's often an agents.md in there, so you should make sure that is where your key instructions are.

## Commands

Custom slash commands are for when you want to run a workflow quickly. So you could have /newsletter, and it would pull the transcript from your Transcripts folder, format it using a specific skill, and rewrite it. Your agent can set them up for you, or you can create them manually in ~/.factory/commands/

## Bash

Touching on the tools briefly. When you start working with an agent, you'll see your computer typing out a bunch of commands. This is basically what you'd be doing if you were really technical and lived in a terminal all day (without an agent). The main tool here is called Bash. Agents love it because they can just do stuff: run commands, look things up, move files around, chain a bunch of operations together.

## Context Windows & Tokens

AI models have a memory limit. It's called the context window, and it's measured in tokens. A token is roughly 4 characters or 3/4 of a word. So if a model has a 100k context window, it can hold about 100,000 tokens in its memory at once.

The agent can only see what fits in that window. So if you've got a massive codebase with hundreds of files, it can't just load all of them at once. It has to be smart about what it looks at.
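
To make the rule of thumb concrete, here is a very rough Python sketch that estimates tokens as characters divided by 4 and checks whether a pile of text would fit in a 100k window. Real tokenizers count differently, so treat the numbers as ballpark only; the function names and the sizes are made up for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Ballpark estimate: about 4 characters per token."""
    return len(text) // 4

def fits_in_window(texts: list[str], window_tokens: int = 100_000) -> bool:
    """True if the combined estimate is within the context window."""
    return sum(estimate_tokens(t) for t in texts) <= window_tokens

# Made-up sizes: a system prompt, an agents.md, and twelve large code files.
system_prompt = "x" * 20_000     # about 5,000 tokens
agents_md = "x" * 8_000          # about 2,000 tokens
code_files = ["x" * 40_000] * 12 # about 10,000 tokens each

print(fits_in_window([system_prompt, agents_md]))               # True
print(fits_in_window([system_prompt, agents_md, *code_files]))  # False
```

The second check fails because the twelve files alone come to roughly 120,000 tokens, which is more than the window can hold.
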
The agent will read the files it needs, do some work, maybe forget some stuff to make room for other stuff.

Or if the tool has a big system prompt, and you've given it a massive agents.md file - that all takes up space. The more you cram in there, the worse it gets at focusing on what matters. (Although Factory's Droid compaction is phenomenal - compaction/compression is the agent summarising the context and trying to keep the important bits).

When you're working with agents, you want to give them the right context. It's almost the entire job really. That's why your agents.md should be focused. That's why skills are separate files that only get loaded when needed. You're managing the agent's attention.

And if the agent seems to forget something you told it earlier in a long conversation, that's why. It fell out of the context window. You might need to remind it or start a fresh session.

# Now let's talk about coding projects.

## Git & Version Control

Git is probably the most important thing to understand if you're going to be doing any kind of real work with coding agents. Git is version control: it's like saving your work, except you can go back to any previous save at any time.

When you're working on a project and you make a change, you commit it. A commit is just a snapshot. It's like saying, "Okay, this is what the project looks like right now, and I'm happy with it, so I'm going to save this checkpoint." And you write a little message, like "added the new header" or "fixed that bug," so future you knows what that change was.

And then you've got branches. A branch is like a parallel version of your project. You might have your main branch, which is the real version, the live thing, and then you create a new branch to try something out. Like, "I want to experiment with this new design": you branch off, do your work, and if it works, you merge it back into main (bring the new work back onto the main work to incorporate it).
If it doesn't, you just delete that branch and nothing's changed.

Local and remote come back here. When you're working on your computer, your commits are local to your machine. They only exist on your computer. When you push, you're sending those commits to the remote, which is usually GitHub, somewhere in the cloud. And when you pull, you're bringing down changes that someone else pushed, or that you pushed from a different machine. So push is upload, pull is download, basically.

And a repo, a repository, is just a project folder that has git set up. That's all it is. It's got a hidden .git folder inside it that tracks all of these commits and branches. So when someone says "clone this repo," they're saying download this project from GitHub to your machine so you can work on it locally.

The coding agent will do all of this for you. It'll commit, it'll push, it'll create branches. It's good to understand what it's doing, as it may ask you "Do you want me to commit this?" and you'll know what that means. If (when) something goes wrong, git is how you undo it.

## Environment Variables & API Keys

These are basically secrets that your code needs to work but you don't want to write directly into your code. So if you're connecting to Stripe or OpenAI or whatever service, they give you an API key (effectively a unique password). You don't want to just put that password in your code because if you push that to GitHub, now everyone can see your password. Bad.

So what you do is you create a file called .env, and you put your secrets in there. Like, OPENAI_API_KEY=sk-whatever. And then your code reads from that file instead of having the actual key written in it. The .env file stays on your machine, you never push it. There's usually a .gitignore file that tells git, "Hey, don't ever upload the .env file," so it stays safe.

Environment variables aren't just for API keys. You might have different settings for when you're developing versus when it's live.
Like, your database URL might be different locally than in production. So you'd have different .env files for different environments.

When you're setting up a new project and it asks you to "add your API keys to .env," that's what it means. Create that file, put your keys in there, and the code will pick them up. The coding agent can help you set this up, but it'll ask you for the actual keys because obviously it doesn't know your passwords. Open the file directly and add your API keys - don't just paste them to the agent (although I've done this a lot).

## Dependencies & Package Managers

When you're building stuff, you're not writing every piece of code from scratch. You're using code that other people have already written. These are called dependencies, packages, or libraries, kind of used interchangeably. If you want to connect to a database, there are many packages for that. If you want to send emails, there are packages for that. You just install them and use them.

And the thing that manages all of these is called a package manager. If you're working with JavaScript, which is what most web stuff uses, you've got npm, which stands for Node Package Manager. There's also Bun, which is a newer, faster alternative that's getting popular. Does the same thing, just quicker. If you're working with Python, you've got pip. There are others, but npm and pip are the main two you'll see.

So when you clone a repo and you want to run it, the first thing an agent will do is run npm install or pip install (for Python, usually pip install -r requirements.txt). It looks at a file in the project, package.json for JavaScript or requirements.txt for Python, and it downloads all the packages that project needs. For JavaScript they go into a folder called node_modules, and you'll notice that folder is massive. Like, thousands of files. That's normal. Don't worry about it. Don't touch it. It's just all the packages you need.

The coding agent does this automatically most of the time. When it needs a new package, it'll install it.
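
If you're curious what "looks at a file in the project" means, here is a small Python sketch that reads a package.json and lists the packages it names. It writes a throwaway example file first so it runs anywhere; the project name and versions are invented, and a real npm install does much more (it also downloads the packages those packages depend on).

```python
import json
import tempfile
from pathlib import Path

def list_dependencies(project_dir: Path) -> dict[str, str]:
    """Return {package name: version range} from a project's package.json."""
    manifest = json.loads((project_dir / "package.json").read_text())
    deps = dict(manifest.get("dependencies", {}))
    deps.update(manifest.get("devDependencies", {}))
    return deps

# Build a throwaway example project (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "package.json").write_text(json.dumps({
        "name": "example-site",
        "dependencies": {"react": "^18.0.0"},
        "devDependencies": {"typescript": "^5.0.0"},
    }))
    deps = list_dependencies(project)

for name, version in sorted(deps.items()):
    print(f"{name} {version}")
```

This is only the "read the list" half. The package manager then fetches each listed package into node_modules.
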
## Running Projects Locally

So once you've got your project set up and your dependencies installed, you need to actually run the thing. And this is where you'll experience localhost. Localhost is just your computer pretending to be a web server. Your agent will start the server and you go to localhost, and it'll show you your project running on your machine.

You'll see localhost:3000 or localhost:8080 or some number. That number is the port. It's like different channels. So you could have one project running on 3000 and another on 3001 at the same time. They don't interfere with each other because they're on different ports.

To run a project, you usually do something like npm run dev or npm start. It depends on how the project is set up. There'll usually be a README file that tells you, or the coding agent will figure it out. And when you run it, the terminal will show you the URL, like "ready on localhost:3000," and you just open that in your browser.

It's running, it's live, but only on your machine. No one else can see it, you can't share that link with anyone. It's local. You're developing, you're testing, you're making sure it works before you deploy it to the actual internet where everyone can see it.

When you make changes to your code while the dev server is running, most of the time it'll automatically refresh. You don't have to restart it. That's called hot reloading. It just picks up the changes.

## Deployment

You've built something locally and it works. Now you want to put it on the actual internet so other people can see it. You have to deploy it.

There's a few popular platforms for this. Vercel is probably the most common for anything built with React or Next.js. Netlify is similar, really good for static sites and simple apps. Cloudflare has Pages and Workers, which are great for anything that needs to be fast everywhere.

All of these have their own CLI tools, which makes it easier for the agent to deploy for you.
You can say "deploy this to Vercel" and the agent will run the Vercel CLI, push your code, and give you a live URL. These tools also have something called MCP servers (MCP stands for Model Context Protocol), which let the agent interact with these services even more directly (similar to CLIs). It can check your deployments, see logs, manage environment variables, all that.

The one thing to remember with deployment is environment variables. Your .env file doesn't get deployed, that's the point. So any secrets your app needs, you have to add them in the platform's dashboard. Vercel has environment variables in project settings, Cloudflare has the same, they all do. If your app works locally but breaks when deployed, check your environment variables first.

## Reading Errors

So when something breaks, and things will break, the terminal is going to show you an error. It'll usually be red text, looks scary, very intimidating. But it's actually trying to help you.

The key thing to look for is the error message itself, which is usually at the top or the bottom of all that red text. It'll say something like "cannot find module" or "undefined is not a function" or "connection refused." And then there's usually a stack trace, which is all those file paths and line numbers. It's showing you where the error happened. Like, "the problem is in this file, on line 47." So you can go look at that line and see what's wrong.

The good news is you don't need to understand most of this yourself. You copy the error, you paste it to the coding agent, and you say "fix this." The agent is really good at reading errors because it's seen millions of them. It'll go, "Oh, this means you forgot to install that dependency," or "You've got a typo here." And it'll fix it for you.

Over time, you'll start recognizing common ones. Like, "ENOENT" just means file not found. "ECONNREFUSED" means it couldn't connect to something, maybe your database isn't running. You'll pick up the patterns.
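
Those short codes come from the operating system, so the same names show up whatever language a project uses. As a small sketch, this Python snippet triggers a "file not found" error on purpose and translates the code into plain English. The `explain` helper and its hint text are made up for illustration, and the file path is deliberately fake.

```python
import errno

def explain(error: OSError) -> str:
    """Translate a low-level error code into a plain-English hint."""
    name = errno.errorcode.get(error.errno, "UNKNOWN")
    hints = {
        "ENOENT": "file or folder not found",
        "ECONNREFUSED": "could not connect, maybe the server or database isn't running",
    }
    return f"{name}: {hints.get(name, 'paste the full error to your agent')}"

try:
    open("/definitely/not/a/real/file.txt")  # deliberately missing file
except OSError as err:
    print(explain(err))

# A refused connection carries its own code; built by hand here to avoid real networking.
refused = ConnectionRefusedError(errno.ECONNREFUSED, "Connection refused")
print(explain(refused))
```

The first line printed starts with ENOENT and the second with ECONNREFUSED, matching the two codes mentioned above.
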
## Chrome DevTools & Console Errors

There's another source of errors that's really useful, and that's your browser's developer tools. Chrome DevTools, specifically. If something's not working on your website, like a button doesn't do anything or data isn't loading, the answer is usually in the console.

You open DevTools by right-clicking anywhere on the page and clicking Inspect, or pressing Command+Option+I on Mac or Ctrl+Shift+I on Windows. Then you click the Console tab. And you'll see any errors that JavaScript is throwing, any failed network requests, all sorts of useful stuff.

The red text in there is what you want to pay attention to. It might say something like "Failed to fetch" or "Uncaught TypeError" or "404 Not Found." That's your clue. You copy that, you paste it to the agent, and you say "I'm seeing this in the console when I click the submit button." That's way more useful than just "it doesn't work."

And there's a Network tab too, which shows you all the requests your page is making. If you're trying to load data from an API and it's not working, you can see in the Network tab whether the request is even being sent, what it's sending, and what the server responded with. Super useful for debugging. You can screenshot that or copy the error details and give them to the agent.

This is another use-case for agent-browser - the agent can look at the site for you, inspect the console, read the error messages and pass them back to the agent's context to fix.

## Errors, Bug Fixing & Tests

You'll hit errors and bugs all the time. Actual developers do too.

The wrong way to handle them is what most people do at first. Something breaks, you go "fix this," the agent tries something, it doesn't work, you go "still broken," it tries something else, still doesn't work, and you're going round and round in circles. You can burn through hours like this (I have).

Before you say "fix this," give the agent context. What were you trying to do?
What did you expect to happen? What actually happened? Paste the errors.

You should ask the agent to investigate before it fixes. Like, "Don't fix it yet. First, tell me what you think is wrong and why." Because if it just starts changing stuff without understanding the problem, it might fix the symptom but not the cause. And then you've got a different bug next week.

Tests are huge for this. A test is just code that checks if your other code works. So you might have a test that says, "When I click this button, this thing should happen." And you can run all your tests to make sure nothing's broken. The agent can write tests for you/itself.

A good workflow is: bug happens, you ask the agent to write a test that reproduces the bug first. So now you've got a failing test. Then your agent fixes the bug, and the test passes. Now you know it's actually fixed, and if that bug ever comes back, the test will catch it. This is called test-driven development, kind of.

Also, logs. You can ask the agent to add logging so you can see what's happening. Like, print statements that say "got to this point" or "this variable is this value." Helps you and the agent figure out where things go wrong.

The best agents have tools for this now. They'll run the code, see the error, inspect variables, check what's happening at each step.

## Before You Go In Circles

I mentioned not going round and round trying to fix something. But everyone does it. So what I find helpful to remind myself is:

1. Commit your work. Before you start chasing a bug, make sure everything you've done so far is saved. Git commit. That way, if you make things worse, you can always go back.
2. Consider starting a fresh session. Sometimes the agent has gone down a rabbit hole on one feature and it's got confused context from all the failed attempts.
3. Check your setup. Is your .env file actually there? Are all the keys correct?
Did you maybe set up the environment variables locally but forget to add them to your hosting platform like Vercel or GitHub secrets?
4. Think about updating your context. Your agents.md, your skills, whatever instructions you've given. If you keep running into the same type of problem, maybe the agent needs better guidance.

Say the agent keeps trying to use a tool that doesn't work, or it keeps formatting something wrong, or it keeps forgetting that your project has a specific setup. That's your cue to add that information to your agents.md. "We use pnpm instead of npm." "Always run migrations before testing." "The API endpoint is this, not that."

It's tricky if you're not technical. How do you know what to add? When the agent finally does solve it, ask it. "What was the problem? What should I add to my instructions so we don't hit this again?" It can diagnose the problem and suggest an instruction to help avoid it.

## The Agent Loop

Agents are basically a loop. They plan, they act, they observe, and then they repeat.

You give it a task. "Build me a landing page." First, it plans. It thinks, okay, I need to create these files, I need to install these dependencies, I need to set up this structure. Then it acts. It starts creating files, writing code, running commands. Then it observes. It looks at what happened. Did it work? Is there an error? Does the page actually load? And based on what it observes, it either moves on to the next thing or it goes back and fixes what broke.

And this loop keeps going until the task is done or it gets stuck. When it gets stuck, that's when it asks you for help. "I tried this but it didn't work, what should I do?" And you give it more context or a different direction, and it continues the loop.

## Autonomy Modes

Different agents have different levels of autonomy. Some will ask you before they do anything. "I'm about to create this file, is that okay?" And you have to approve every single action. Safer but slow.
Others will just make changes, create files, run commands, all without asking. Much faster, but you have to trust it. And sometimes it'll go down a wrong path and you've got to reel it back in.

Most agents let you configure this. Like, "Ask me before deleting files but go ahead and create new ones." Or you might have a mode for exploring where it asks a lot, and a mode for execution where it just does the work.

I like to let it run and just watch. I can see what it's doing in real time. If it starts doing something weird, I'll jump in. But I'm not approving every single thing. You'll start getting more comfortable with more autonomy the more you use it.

That's the foundational stuff. You don't need to memorise it all. Start building something, let the agent guide you, and you'll pick it up as you go. The best way to learn is by doing.

What'd I miss?
