Code Factory Repository Setup for Automated Agents


📝 Code Factory: How to set up your repo so your agent can auto-write and review 100% of your code

The goal

You want one loop:

- The coding agent writes code
- The repo enforces risk-aware checks before merge
- A code review agent validates the PR
- Evidence (tests + browser + review) is machine-verifiable
- Findings turn into repeatable harness cases

The specific review agent can be @greptile, @coderabbitai, CodeQL + policy logic, custom LLM review, or another service. The control-plane pattern stays the same.

I took inspiration from this helpful blog post by @_lopopolo

The high-level flow

1) Keep one machine-readable contract

Your contract should define:

- risk tiers by path
- required checks by tier
- docs drift rules for control-plane changes
- evidence requirements for UI/critical flows

Why it matters: it removes ambiguity and prevents silent drift between scripts, workflow files, and policy docs.

2) Gate preflight before expensive CI

A reliable pattern is:

- run `risk-policy-gate` first
- verify deterministic policy + review-agent state
- only then start `test/build/security` fanout jobs

This avoids wasting CI minutes on PR heads that are already blocked by policy or unresolved review findings.

3) Enforce current-head SHA discipline

This was the biggest practical lesson from real PR loops. Treat review state as valid only when it matches the current PR head commit:

- wait for the review check run on `headSha`
- ignore stale summary comments tied to older SHAs
- fail if the latest review run is non-success or times out
- require reruns after each synchronize/push
- clear stale gate failures by rerunning policy gate on the same head

If you skip this, you can merge a PR using stale “clean” evidence.

4) Use a single rerun-comment writer with SHA dedupe

When multiple workflows can request reruns, duplicate bot comments and race conditions appear. Use exactly one workflow as the canonical rerun requester and dedupe by marker + `sha:<head>`.
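Steps 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the `CheckRun` fields are modeled loosely on the shape of a check-run record, while `RERUN_MARKER`, the `@review-bot` mention, and all function names are hypothetical. It assumes full commit SHAs and a list of runs ordered oldest to newest.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical hidden marker that identifies comments written by the
# single canonical rerun workflow.
RERUN_MARKER = "<!-- review-rerun-request -->"


@dataclass(frozen=True)
class CheckRun:
    name: str
    head_sha: str
    status: str                # e.g. "queued", "in_progress", "completed"
    conclusion: Optional[str]  # e.g. "success", "failure", or None if unfinished


def review_is_clean(runs: List[CheckRun], review_check: str, head_sha: str) -> bool:
    """Return True only if the latest review run on the current head succeeded.

    Runs tied to older SHAs are ignored entirely, so stale "clean"
    evidence can never satisfy the gate. `runs` is assumed to be
    ordered oldest to newest.
    """
    current = [r for r in runs if r.name == review_check and r.head_sha == head_sha]
    if not current:
        return False  # no review yet for this head: keep waiting or fail
    latest = current[-1]
    return latest.status == "completed" and latest.conclusion == "success"


def build_rerun_comment(head_sha: str) -> str:
    """Build the rerun request body: marker plus a `sha:<head>` token."""
    return f"{RERUN_MARKER}\n@review-bot please re-review\nsha:{head_sha}"


def rerun_already_requested(comment_bodies: Iterable[str], head_sha: str) -> bool:
    """Dedupe check: has a rerun already been requested for this exact head?"""
    token = f"sha:{head_sha}"
    return any(
        RERUN_MARKER in body and token in body.split()
        for body in comment_bodies
    )
```

In this sketch the rerun workflow would call `rerun_already_requested` before posting, and the policy gate would call `review_is_clean` with the PR's current head SHA on every synchronize event. Matching the `sha:` token as a whole whitespace-delimited word avoids false positives on SHA prefixes.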
5) Add an automated remediation loop (optional, high leverage)

If review findings are actionable, trigger a coding agent to:

- read review context
- patch code
- run focused local validation
- push fix commit to the same PR branch

Then let PR synchronize trigger the normal rerun path.

Keep this deterministic:

- pin model + effort for reproducibility
- skip stale comments not matching current head
- never bypass policy gates

6) Auto-resolve bot-only threads only after clean rerun

A useful quality-of-life step, after a clean current-head rerun:

- auto-resolve unresolved threads where all comments are from the review bot
- never auto-resolve human-participated threads

Then rerun policy gate so required-conversation-resolution reflects the new state.

7) Keep browser evidence as first-class proof

For UI or user-flow changes, require evidence manifests and assertions in CI (not just screenshots in PR text):

- required flows exist
- expected entrypoint was used
- expected account identity is present for logged-in flows
- artifacts are fresh and valid

8) Preserve incident memory with a harness-gap loop

This keeps fixes from becoming one-off patches and grows long-term coverage.

9) What we learned running this in PRs

The most important lessons were:

- Deterministic ordering matters: preflight gate must complete before CI fanout.
- Current-head SHA matching is non-negotiable.
- Review rerun requests need one canonical writer.
- Review summary parsing should treat vulnerability language and weak-confidence summaries as actionable.
- Auto-resolving bot-only threads reduces friction, but only after clean current-head evidence.
- A remediation agent can shorten loop time significantly if guardrails stay strict.

10) General pattern vs. one implementation

General pattern terms:

- `code review agent`
- `remediation agent`
- `risk policy gate`

One concrete implementation (ours):

- code review agent: Greptile
- remediation agent: Codex Action
- canonical rerun workflow: `greptile-rerun.yml`
- stale-thread cleanup workflow: `greptile-auto-resolve-threads.yml`
- preflight policy workflow: `risk-policy-gate.yml`

If you use a different reviewer, keep the same control-plane semantics and swap integration points.

Useful command set

Final pattern to copy

- Put risk + merge policy into one contract.
- Enforce preflight gate before expensive CI.
- Require clean code-review-agent state for current head SHA.
- If findings exist, remediate in-branch and rerun deterministically.
- Auto-resolve only bot-only stale threads after clean rerun.
- Require browser evidence for UI/flow changes.
- Convert incidents into harness cases and track loop SLOs.

That gives you a repo where agents can implement, validate, and be reviewed with deterministic, auditable standards.

http://x.com/i/article/20230017902585733…
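As a rough illustration of the first two items in the final pattern (one contract, then a preflight gate before CI fanout), the sketch below maps changed paths to a risk tier and looks up the checks that tier requires. Everything in it is an assumption made for illustration: the `CONTRACT` layout, the path patterns, the tier names, and the check names are hypothetical, and a real repo would more likely load the contract from a JSON or YAML file.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Tuple

# Hypothetical contract. Rules are evaluated top to bottom; first match wins.
CONTRACT: Dict = {
    "risk_tiers": [
        {"pattern": ".github/workflows/*", "tier": "high"},
        {"pattern": "src/payments/*", "tier": "high"},
        {"pattern": "src/*", "tier": "medium"},
        {"pattern": "*", "tier": "low"},
    ],
    "required_checks": {
        "low": ["lint"],
        "medium": ["lint", "test"],
        "high": ["lint", "test", "security", "code-review-agent"],
    },
}

# Tiers from least to most strict, used to pick the highest tier in a PR.
TIER_ORDER = ["low", "medium", "high"]


def tier_for_path(path: str, contract: Dict = CONTRACT) -> str:
    """Return the risk tier of a single changed path."""
    for rule in contract["risk_tiers"]:
        # fnmatchcase's "*" also matches "/" so "src/*" covers nested files.
        if fnmatchcase(path, rule["pattern"]):
            return rule["tier"]
    return TIER_ORDER[-1]  # unmatched paths fall back to the strictest tier


def required_checks(changed_paths: List[str], contract: Dict = CONTRACT) -> Tuple[str, List[str]]:
    """Return the highest tier touched by the PR and its required checks."""
    tiers = [tier_for_path(p, contract) for p in changed_paths]
    highest = max(tiers, key=TIER_ORDER.index, default="low")
    return highest, contract["required_checks"][highest]


def preflight_allows_fanout(checks: List[str], review_clean: bool) -> bool:
    """Block expensive CI when the tier requires a review that is not clean."""
    return review_clean or "code-review-agent" not in checks
```

Under these assumptions, a PR touching only `README.md` resolves to the low tier and can fan out immediately, while a PR touching `src/payments/` stays blocked until the review agent reports clean state for the current head. Keeping both the tier rules and the check lists in one structure is what lets scripts, workflows, and docs read from the same source.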

