
Code Factory Repository Setup for Automated Agents

📝 Code Factory: How to set up your repo so your agent can auto-write and review 100% of your code

The goal

You want one loop:

- The coding agent writes code
- The repo enforces risk-aware checks before merge
- A code review agent validates the PR
- Evidence (tests + browser + review) is machine-verifiable
- Findings turn into repeatable harness cases

The specific review agent can be @greptile, @coderabbitai, CodeQL + policy logic, custom LLM review, or another service. The control-plane pattern stays the same.

I took inspiration from this helpful blog post by @_lopopolo

The high-level flow

1) Keep one machine-readable contract

Your contract should define:

- risk tiers by path
- required checks by tier
- docs drift rules for control-plane changes
- evidence requirements for UI/critical flows

Why it matters: it removes ambiguity and prevents silent drift between scripts, workflow files, and policy docs.

2) Gate preflight before expensive CI

A reliable pattern is:

- run `risk-policy-gate` first
- verify deterministic policy + review-agent state
- only then start `test/build/security` fanout jobs

This avoids wasting CI minutes on PR heads that are already blocked by policy or unresolved review findings.

3) Enforce current-head SHA discipline

This was the biggest practical lesson from real PR loops. Treat review state as valid only when it matches the current PR head commit:

- wait for the review check run on `headSha`
- ignore stale summary comments tied to older SHAs
- fail if the latest review run is non-success or times out
- require reruns after each synchronize/push
- clear stale gate failures by rerunning policy gate on the same head

If you skip this, you can merge a PR using stale “clean” evidence.

4) Use a single rerun-comment writer with SHA dedupe

When multiple workflows can request reruns, duplicate bot comments and race conditions appear. Use exactly one workflow as canonical rerun requester and dedupe by marker + `sha:<head>`.

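To make steps 1 to 3 more concrete, here is a minimal Python sketch of a contract lookup and a current-head review check. Everything named here is an assumption for illustration: the `CONTRACT` layout, the tier names, the path globs, the check name `code-review-agent`, and the `CheckRun` shape are hypothetical and not taken from the original post or from any specific reviewer's API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional

# Hypothetical contract: tier names, globs, and check names are illustrative.
CONTRACT = {
    "risk_tiers": {
        "high": ["db/migrations/*", ".github/workflows/*", "src/auth/*"],
        "medium": ["src/*"],
        "low": ["docs/*", "*.md"],
    },
    "required_checks": {
        "high": ["risk-policy-gate", "code-review-agent", "test", "build", "security"],
        "medium": ["risk-policy-gate", "code-review-agent", "test", "build"],
        "low": ["risk-policy-gate"],
    },
}
TIER_ORDER = ["high", "medium", "low"]  # strictest first; first match wins


def tier_for_path(path: str) -> str:
    """Return the strictest tier whose globs match the path."""
    for tier in TIER_ORDER:
        if any(fnmatchcase(path, glob) for glob in CONTRACT["risk_tiers"][tier]):
            return tier
    return "high"  # assumption: fail closed for paths the contract does not list


def required_checks(changed_paths: List[str]) -> List[str]:
    """Required checks are those of the strictest tier touched by the PR."""
    tiers = {tier_for_path(p) for p in changed_paths}
    strictest = next((t for t in TIER_ORDER if t in tiers), "high")
    return CONTRACT["required_checks"][strictest]


@dataclass(frozen=True)
class CheckRun:
    name: str
    head_sha: str
    run_id: int                # assumed to increase with each new run
    status: str                # e.g. "completed" or "in_progress"
    conclusion: Optional[str]  # e.g. "success", "failure", "timed_out"


def review_state_ok(head_sha: str, runs: List[CheckRun],
                    review_check: str = "code-review-agent") -> bool:
    """Accept review state only from the latest run on the current head SHA."""
    current = [r for r in runs if r.name == review_check and r.head_sha == head_sha]
    if not current:
        return False  # no run for this head yet: stale runs do not count
    latest = max(current, key=lambda r: r.run_id)
    return latest.status == "completed" and latest.conclusion == "success"
```

In this sketch a preflight gate would call `review_state_ok` with the PR's current head SHA and start the fanout jobs returned by `required_checks` only when it returns `True`. A successful run on an older SHA never satisfies the gate.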
5) Add an automated remediation loop (optional, high leverage)

If review findings are actionable, trigger a coding agent to:

- read review context
- patch code
- run focused local validation
- push fix commit to the same PR branch

Then let PR synchronize trigger the normal rerun path.

Keep this deterministic:

- pin model + effort for reproducibility
- skip stale comments not matching current head
- never bypass policy gates

6) Auto-resolve bot-only threads only after clean rerun

A useful quality-of-life step:

- after a clean current-head rerun
- auto-resolve unresolved threads where all comments are from the review bot
- never auto-resolve human-participated threads

Then rerun policy gate so required-conversation-resolution reflects the new state.

7) Keep browser evidence as first-class proof

For UI or user-flow changes, require evidence manifests and assertions in CI (not just screenshots in PR text):

- required flows exist
- expected entrypoint was used
- expected account identity is present for logged-in flows
- artifacts are fresh and valid

8) Preserve incident memory with a harness-gap loop

This keeps fixes from becoming one-off patches and grows long-term coverage.

9) What we learned running this in PRs

The most important lessons were:

- Deterministic ordering matters: preflight gate must complete before CI fanout.
- Current-head SHA matching is non-negotiable.
- Review rerun requests need one canonical writer.
- Review summary parsing should treat vulnerability language and weak-confidence summaries as actionable.
- Auto-resolving bot-only threads reduces friction, but only after clean current-head evidence.
- A remediation agent can shorten loop time significantly if guardrails stay strict.

10) General pattern vs. one implementation

General pattern terms:

- `code review agent`
- `remediation agent`
- `risk policy gate`

One concrete implementation (ours):

- code review agent: Greptile
- remediation agent: Codex Action
- canonical rerun workflow: `greptile-rerun.yml`
- stale-thread cleanup workflow: `greptile-auto-resolve-threads.yml`
- preflight policy workflow: `risk-policy-gate.yml`

If you use a different reviewer, keep the same control-plane semantics and swap integration points.

Useful command set

Final pattern to copy

- Put risk + merge policy into one contract.
- Enforce preflight gate before expensive CI.
- Require clean code-review-agent state for current head SHA.
- If findings exist, remediate in-branch and rerun deterministically.
- Auto-resolve only bot-only stale threads after clean rerun.
- Require browser evidence for UI/flow changes.
- Convert incidents into harness cases and track loop SLOs.

That gives you a repo where agents can implement, validate, and be reviewed with deterministic, auditable standards.

http://x.com/i/article/20230017902585733…
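
Two of the comment-handling rules above, the SHA-deduped rerun request from step 4 and the bot-only thread resolution from step 6, can be sketched as pure functions. This is a rough illustration only: the marker string, the bot login `review-bot[bot]`, the comment wording, and the `Thread` shape are hypothetical, and a real workflow would fetch comments and threads from its code host's API rather than take them as arguments.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical values: swap in whatever marker and bot login your setup uses.
RERUN_MARKER = "<!-- review-rerun-request -->"
REVIEW_BOT = "review-bot[bot]"


def rerun_comment_body(head_sha: str) -> str:
    """Build the canonical rerun comment: marker plus a `sha:<head>` line."""
    return f"{RERUN_MARKER}\nsha:{head_sha}\nPlease re-review this commit."


def needs_rerun_comment(existing_bodies: List[str], head_sha: str) -> bool:
    """True only if no earlier comment has both the marker and this head SHA."""
    token = f"sha:{head_sha}"
    for body in existing_bodies:
        # Match the whole line so one SHA cannot match as a prefix of another.
        if RERUN_MARKER in body and token in body.splitlines():
            return False
    return True


@dataclass(frozen=True)
class Thread:
    thread_id: str
    resolved: bool
    comment_authors: List[str]


def threads_to_auto_resolve(threads: List[Thread],
                            current_head_review_clean: bool) -> List[str]:
    """Pick unresolved threads where every comment is from the review bot."""
    if not current_head_review_clean:
        return []  # never resolve anything without clean current-head evidence
    return [
        t.thread_id
        for t in threads
        if not t.resolved
        and t.comment_authors
        and all(author == REVIEW_BOT for author in t.comment_authors)
    ]
```

Keeping both decisions as side-effect-free functions makes them easy to unit test. The single canonical workflow then only has to post the comment or resolve the returned thread IDs, and rerun the policy gate afterwards.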
