AI coding agents are good at writing code. They are not yet good at the engineering judgment around code: which callers will break, which tests cover the change, whether the new loop is quadratic, whether the function the agent just edited has two near-identical siblings that didn't get the patch.
Each demo below is a real failure pattern we see in agent PRs. Roam ships a deterministic local tool for each.
1. The clone the AI didn't notice
Scenario. The agent fixes a bug in UserService.normalize_email(). It looks correct. Tests pass. Reviewers approve. Two weeks later the same bug fires from AdminService.normalize_email() and SignupFlow._normalize_email() — copy-pasted siblings the agent never saw.
What Roam does. roam clones --persist populates a clone-class table. git diff | roam critique then reports clones-not-edited as a HIGH-severity finding and returns a BLOCK verdict when a patch touches one member of a clone class without touching the others.
    $ roam clones --persist
    clone classes: 47 (3 high-similarity, 21 medium, 23 low)
    class C-12 [3 members, 0.91 similarity]:
      src/users/service.py:142 normalize_email
      src/admin/service.py:89 normalize_email
      src/signup/flow.py:201 _normalize_email

    $ git diff | roam critique
    VERDICT: BLOCK
    [1] clones-not-edited (HIGH)
      Edited: src/users/service.py:142 normalize_email
      Unedited siblings in class C-12:
        src/admin/service.py:89 normalize_email
        src/signup/flow.py:201 _normalize_email
      Hint: copy-paste fix or extract a shared helper.
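The check itself is easy to picture. The following is a minimal sketch of the idea, not Roam's implementation: the clone-class table layout and the unedited_siblings helper are hypothetical, and the member names simply mirror the example output.

```python
from typing import Dict, List, Set

# Hypothetical clone-class table: class id -> member symbols.
CLONE_CLASSES: Dict[str, List[str]] = {
    "C-12": [
        "src/users/service.py:normalize_email",
        "src/admin/service.py:normalize_email",
        "src/signup/flow.py:_normalize_email",
    ],
}


def unedited_siblings(
    edited: Set[str], classes: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """For each clone class with at least one edited member, list the untouched members."""
    findings: Dict[str, List[str]] = {}
    for class_id, members in classes.items():
        touched = [m for m in members if m in edited]
        missed = [m for m in members if m not in edited]
        if touched and missed:
            findings[class_id] = missed
    return findings


# The patch edits only one of the three siblings.
findings = unedited_siblings({"src/users/service.py:normalize_email"}, CLONE_CLASSES)
print(findings)
```

A patch that edits all three members, or none of them, produces no finding under this rule.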
2. The deletion that breaks production
Scenario. The agent is "cleaning up dead code" and deletes resolve_legacy_token(). Tests pass — there are no direct callers in source. The function is reached at runtime through a string lookup in a dispatcher map. Production breaks the next morning.
What Roam does. roam impact follows static + bridge edges (REST, template, config, ORM, dispatcher tables). roam safe-delete returns a binary verdict.
    $ roam safe-delete resolve_legacy_token
    VERDICT: UNSAFE
      reachable_from: 3 entry points (login_handler, sso_callback, mobile_v1)
      indirect callers: 1 (dispatcher_map['legacy_token'] in src/auth/dispatch.py:67)
      runtime hits: 1,247 in last 7 days
      recommendation: do not delete. Add deprecation warning instead.
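To see why a plain search for callers misses this case, here is a small sketch of a string-keyed dispatcher. The Handlers class, the resolve_jwt sibling, and the dispatch function are hypothetical stand-ins for the scenario. The point is only that nothing in the source calls resolve_legacy_token by name.

```python
class Handlers:
    """Hypothetical handler namespace; methods are looked up by string at runtime."""

    @staticmethod
    def resolve_legacy_token(token: str) -> str:
        return f"legacy:{token}"

    @staticmethod
    def resolve_jwt(token: str) -> str:
        return f"jwt:{token}"


# The only reference to resolve_legacy_token is this string value.
DISPATCHER_MAP = {
    "legacy_token": "resolve_legacy_token",
    "jwt": "resolve_jwt",
}


def dispatch(kind: str, token: str) -> str:
    """Resolve the handler name from the map, then fetch it with getattr."""
    handler = getattr(Handlers, DISPATCHER_MAP[kind])
    return handler(token)


print(dispatch("legacy_token", "abc"))
```

Deleting Handlers.resolve_legacy_token leaves every import and every direct call intact, and the failure only appears as an AttributeError when a request with kind "legacy_token" arrives.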
3. The accidental O(n²) the linter approved
Scenario. The agent rewrites a list-deduplication loop. It's clean, idiomatic, and uses list.append + if x not in seen. Linters love it. At 100 records it runs in 2 ms; at 50,000 records it takes 18 seconds and the request times out.
What Roam does. roam math (alias roam algo) detects nested-loop lookups (LIST-MEMBERSHIP-IN-LOOP), regex-in-loop, JSON-parse-in-loop, quadratic string concatenation, and seven other algorithmic anti-patterns the AI agent's training distribution doesn't penalise.
    $ roam math --confidence high
    VERDICT: 3 high-confidence findings
    [1] LIST-MEMBERSHIP-IN-LOOP (O(n²))
      src/etl/dedupe.py:34 in dedupe_records
      Hint: use a set for O(1) membership; switch list[] to set() at line 32.
    [2] REGEX-COMPILE-IN-LOOP
      src/parsers/log.py:88 in parse_log_lines
      Hint: hoist re.compile() outside the loop. ~40× speedup at 10k lines.
    [3] JSON-PARSE-IN-LOOP
      src/sync/users.py:212 in sync_user_batch
      Hint: parse once outside the loop, iterate the parsed object.
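The first finding is the pattern from the scenario. As an illustration (the function names below are hypothetical, not taken from the codebase in the example), the two versions differ by a single data structure:

```python
from typing import Hashable, Iterable, List


def dedupe_quadratic(records: Iterable[Hashable]) -> List[Hashable]:
    """Order-preserving dedupe with a list as the 'seen' store: O(n^2) overall."""
    seen: list = []
    out: list = []
    for record in records:
        if record not in seen:  # linear scan of a list on every iteration
            seen.append(record)
            out.append(record)
    return out


def dedupe_linear(records: Iterable[Hashable]) -> List[Hashable]:
    """Same behavior with a set as the 'seen' store: O(n) on average."""
    seen: set = set()
    out: list = []
    for record in records:
        if record not in seen:  # average O(1) hash lookup
            seen.add(record)
            out.append(record)
    return out


sample = ["a", "b", "a", "c", "b"]
print(dedupe_quadratic(sample), dedupe_linear(sample))
```

Both functions return the same result and both read cleanly, which is why a style linter has nothing to say about the first one.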
4. The N+1 query no one noticed
Scenario. The agent adds an order.customer.name reference to the order-list template. Each rendered row issues a fresh SELECT against the customers table. Page load goes from 80 ms to 4.2 seconds at 200 orders. Tests pass — the test fixture has 3 orders.
What Roam does. roam n1 walks the call graph and matches DB-call patterns inside loops; roam missing-index matches WHERE-clause shapes against the schema's index list.
    $ roam n1
    VERDICT: 2 N+1 hotspots
    [1] OrderList.render_rows -> Customer.find_by_id (loop in render_rows:42)
      Hint: prefetch with .includes(:customer) in the query at line 30.
    [2] InvoiceExport.export -> LineItem.tax_for_country (loop in export:88)
      Hint: batch-load tax rates for the unique country set.
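The shape of the problem can be reproduced with nothing but the standard library. This sketch uses an in-memory SQLite database with a hypothetical two-table schema. It counts the queries each approach issues, which is the number that grows with the row count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
    """
)


def rows_n_plus_one(db: sqlite3.Connection):
    """One query for the orders, then one more query per order row."""
    orders = db.execute("SELECT id, customer_id FROM orders ORDER BY id").fetchall()
    queries = 1
    rows = []
    for order_id, customer_id in orders:
        name = db.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        queries += 1
        rows.append((order_id, name))
    return rows, queries


def rows_joined(db: sqlite3.Connection):
    """A single JOIN returns the same rows in one round trip."""
    rows = db.execute(
        "SELECT o.id, c.name FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    ).fetchall()
    return rows, 1


print(rows_n_plus_one(conn))
print(rows_joined(conn))
```

With the three-order fixture the difference is four queries against one, which is small enough to pass unnoticed in tests. At 200 orders the first version issues 201.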
5. The layer violation that quietly couples your domain to your HTTP stack
Scenario. The agent is asked to "use the existing HTTP retry helper" and imports http.retry directly inside domain/billing.py. The change passes review — it's a small import. Six months later you can't change HTTP libraries without touching the entire domain layer.
What Roam does. roam layers infers architectural layers; roam fitness + .roam-gates.yml declare layer constraints; roam critique blocks PRs that violate them.
    $ git diff | roam critique --gates fitness
    VERDICT: BLOCK
    [1] layer-violation (HIGH)
      Edge added: domain/billing.py -> http/retry.py
      Rule: domain/* MUST NOT import from http/* (.roam-gates.yml:18)
      Suggestion: route through application/retry_policy or pass an injected client.
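A layer rule of this kind reduces to checking import edges against a deny list. The sketch below is an assumption about how such a check could look, built on Python's ast module. The RULES format and the with_retry name are hypothetical and do not reflect the .roam-gates.yml schema.

```python
import ast
from typing import List, Tuple

# Hypothetical rule format: (path prefix of the layer, forbidden top-level package).
RULES: List[Tuple[str, str]] = [("domain/", "http")]


def imported_modules(source: str) -> List[str]:
    """Collect module names from 'import x' and 'from x import y' statements."""
    modules: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
    return modules


def violations(path: str, source: str, rules=RULES) -> List[Tuple[str, str]]:
    """Return (file, imported module) pairs that break a layer rule."""
    found: List[Tuple[str, str]] = []
    for layer_prefix, forbidden in rules:
        if not path.startswith(layer_prefix):
            continue
        for module in imported_modules(source):
            if module == forbidden or module.startswith(forbidden + "."):
                found.append((path, module))
    return found


print(violations("domain/billing.py", "from http.retry import with_retry\n"))
```

The source is parsed, never imported, so the check runs without executing project code. The same import placed in a file outside domain/ passes.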
6. The change with no tests touching it
Scenario. The agent edits PricingEngine.discount_for_segment(). The full test suite is green. But the only test that actually exercises that branch was deleted six months ago when its fixture broke. Coverage looks fine because other tests pass through the function without entering the changed branch.
What Roam does. roam affected-tests walks the call graph in reverse and lists tests reachable from changed lines, not files. roam test-gaps highlights changed code with no reaching test.
    $ roam affected-tests
    changed: PricingEngine.discount_for_segment (lines 88-104)
    tests reaching this branch: 0
    nearest tests:
      test_pricing_engine.py::test_basic_discount (covers lines 60-79, not 88-104)
      test_pricing_engine.py::test_segment_lookup (covers segment_for_user, not the discount branch)

    $ roam test-gaps --diff
    VERDICT: 1 high-risk gap
      src/pricing/engine.py:88-104 discount_for_segment
      reach-from-tests: 0
      hint: add a test that exercises segment="enterprise" + amount > 10_000.
7. The refactor that takes the codebase down for an afternoon
Scenario. The agent says "I'll move EmailQueue to infrastructure/messaging/ — it's a clean rename." It updates the imports it can see. CI passes. Production stops sending emails because three runtime registries reference the old import path through importlib.import_module.
What Roam does. roam simulate move clones the graph, applies the move to the copy, and reports what breaks before any source file changes.
    $ roam simulate move EmailQueue infrastructure/messaging/email_queue.py
    VERDICT: BLOCK (3 unresolved references)
    direct callers updatable: 14
    indirect references requiring manual fix:
      config/celery.py:12 'app.queues.email.EmailQueue' (string ref in CELERY_BEAT_SCHEDULE)
      scripts/seed_demo.py:88 importlib.import_module('app.queues.email')
      src/admin/registry.py:204 QUEUE_REGISTRY['email'] = 'app.queues.email:EmailQueue'
    run: roam plan-refactor EmailQueue --target infrastructure/messaging/email_queue.py
         to get a step-by-step plan with caller-update ordering.
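The indirect references in that output are all string literals, which import-rewriting tools do not touch. One way to surface them, shown here as a rough sketch and not as Roam's method, is to scan the syntax tree for string constants that mention the old module path. The string_refs helper and the SAMPLE source are hypothetical.

```python
import ast
from typing import List, Tuple


def string_refs(source: str, old_module: str) -> List[Tuple[int, str]]:
    """Return (line, literal) for every string constant containing the old module path."""
    hits: List[Tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if old_module in node.value:
                hits.append((node.lineno, node.value))
    return sorted(hits)


SAMPLE = '''import importlib
QUEUE_REGISTRY = {"email": "app.queues.email:EmailQueue"}
mod = importlib.import_module("app.queues.email")
'''

for line, literal in string_refs(SAMPLE, "app.queues.email"):
    print(line, literal)
```

A substring match like this will over-report, for example on log messages that mention the path, so treat the hits as candidates for manual review.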
8. The PR that ships unsigned
Scenario. Quarterly review arrives. Compliance asks "what was checked on this PR before it merged?" You point at green checkmarks. They want a signed, tamper-evident artifact, not a screenshot of CI: the kind of evidence that EU AI Act, SOC 2 CC8.1, and ISO 42001 reviews ask for.
What Roam does. roam attest emits an in-toto v1 attestation (roam-code.dev/CodeGraph/v1) covering the inputs, the gates that ran, and the verdicts. Optional cosign signing produces an offline-verifiable artifact you can store with the PR.
    $ roam attest --cosign --sign --output cga.intoto.jsonl
    attestation: roam-code.dev/CodeGraph/v1
    inputs:
      graph_merkle: sha256:e2b0...c4f
      diff_merkle: sha256:9a17...8d2
    gates:
      critique: PASS (severity max=LOW)
      fitness: PASS
      math: PASS (0 high-confidence findings)
      test-gaps: PASS (0 high-risk gaps)
    signed by: keyless OIDC (github-actions)
    written: cga.intoto.jsonl

    $ roam audit-trail-verify cga.intoto.jsonl
    VERDICT: VERIFIED
      signature: valid
      graph_merkle: matches HEAD
      gates: PASS
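The tamper-evidence part rests on recording a digest of what was checked next to the verdicts. The sketch below shows only that digest comparison, using a hypothetical record layout that is not the in-toto schema. It leaves out signing entirely, and without a signature the record itself could be rewritten.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def make_record(diff: bytes, gates: dict) -> str:
    """Serialize a hypothetical evidence record: digest of the diff plus gate verdicts."""
    return json.dumps({"diff_sha256": sha256_hex(diff), "gates": gates}, sort_keys=True)


def diff_matches(record_json: str, diff: bytes) -> bool:
    """True only if the diff is byte-for-byte the one the record was made from."""
    return json.loads(record_json)["diff_sha256"] == sha256_hex(diff)


record = make_record(b"- old line\n+ new line\n", {"critique": "PASS", "fitness": "PASS"})
print(diff_matches(record, b"- old line\n+ new line\n"))
print(diff_matches(record, b"- old line\n+ other line\n"))
```

Any change to the diff after the record is written changes the digest, so the comparison fails even for a one-character edit.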
The pattern
Each demo follows the same shape: an AI agent ships a plausible patch, conventional CI passes, and a structural problem slips through. Roam runs alongside CI as a deterministic graph-aware gate that catches the problem before merge.
None of these checks requires a network call, an API key, or access to anything outside the repo. The graph is local, the analysis is local, and the verdict is reproducible.
Try them yourself: install the free CLI, or read the agent contract for the discipline that strings them together.