Verification is the single highest-leverage addition to any agentic workflow. I am not exaggerating. Nothing else I have adopted has had a bigger impact on the quality of what ships.
Without verification, you are trusting Claude to be right on the first try. Sometimes it is. Often enough, it is not. A subtle type error. A missing import. A function that works but does not handle the edge case. These slip through unnoticed until they hit production.
With verification, you catch these before they leave your machine.
The minimum viable verification
The minimum verification is embarrassingly simple: run the build and check that it passes.
Run pnpm build and make sure it passes before we move on.
That is it. This alone catches roughly 70% of the issues I would otherwise have to debug later. Missing imports, type mismatches, broken references, unused variables — the build catches them all.
Most people do not even do this consistently. They finish a task, glance at the code, and commit. The build fails in CI fifteen minutes later. They context-switch back, debug the issue, fix it, push again. Thirty minutes wasted because they skipped a thirty-second check.
The 6-phase verification loop
The build is the minimum. Here is the full pipeline I run before anything gets committed. Six phases, each catching a different category of issue.
Build
Does it compile? This catches broken imports, syntax errors, and configuration issues. If the build fails, nothing else matters.
Type check
Run tsc --noEmit. Catches type errors that the build might miss (especially in files not directly imported by a page). Stricter than the build alone.
Lint
Run your linter. Catches style violations, unused imports, accessibility issues, and patterns your team has decided to avoid. Not just cosmetic — lint rules encode team decisions.
Tests
Do existing tests still pass? What about new tests for the code you just wrote? If you have tests, run them. If you do not, at least check that you did not break someone else's.
Security
Any secrets in the code? Hardcoded API keys? SQL injection vectors? XSS vulnerabilities? A quick scan prevents the kind of mistake that ends up on Hacker News.
Diff review
Run git diff and look at every change. Are there unintended modifications? Files that should not have been touched? Debug logs left in production code? The diff is the final sanity check.
Not every phase is necessary every time. For a small CSS change, build and lint might be enough. For a new API endpoint, you want all six. Use judgment — but default to running more, not fewer.
A verification-loop skill can codify this entire pipeline — including decision logic for which phases to run. The skill checks diff size (small diffs skip the full test suite), runs a coverage gate (are the changed lines covered?), and includes a failure decision tree: if lint fails, auto-fix and re-run; if tests fail, check if the test or the implementation is wrong; if security finds something, block the commit entirely. This turns verification from a manual checklist into an adaptive workflow that scales its rigor based on the scope of the change.
Making verification automatic
The pipeline above is useless if you have to remember to run it. You will forget. Not sometimes — regularly. Especially when you are in flow, moving fast, excited about the feature you just built.
There are three ways to make it automatic:
As a hook. A Stop hook that runs verification after Claude finishes a task. This is the most aggressive option — it runs every time, whether the task was a one-line fix or a full feature.
As a skill. A /verify command you invoke before committing. Less aggressive, but you still have to remember to call it. Good enough for most workflows.
As a CLAUDE.md rule. "Before committing, always run the verification pipeline." This works 80% of the time, which means 20% of your commits ship unverified. Acceptable for low-stakes projects. Not acceptable for production.
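For the hook route, a Stop hook can be declared in `.claude/settings.json`. The shape below follows Claude Code's hook configuration format as I understand it, and `./verify.sh` is a placeholder for whatever script runs your pipeline — treat it as a sketch to adapt, not a drop-in.

```json
{
  "hooks": {
    "Stop": [
      { "hooks": [{ "type": "command", "command": "sh ./verify.sh" }] }
    ]
  }
}
```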
Verification as a skill
Here is the skill I use. Drop this in .claude/skills/verify.md and you can run it with /verify before any commit.
---
name: verify
description: Run the full verification loop before committing
---
Run the following checks in order. Stop at the first failure and report it.
1. Run `pnpm build` — must pass with zero errors
2. Run `tsc --noEmit` — must pass with zero type errors
3. Run `pnpm lint` — must pass with zero warnings
4. Run `pnpm test` if tests exist — all must pass
5. Run `git diff` — review all changes for:
- Unintended file modifications
- Debug logs or console.log statements
- Hardcoded secrets or API keys
- Files that should not have been committed
Report: READY if all checks pass. NOT READY with specific details if any check fails.

This is not complicated. It is not clever. It is just a checklist that runs the same way every time. That is exactly why it works.
Confidence-based code review
When you use a reviewer agent — a separate Claude instance that reviews code — there is a critical principle: filter by confidence.
Only report issues the reviewer is more than 80% confident about. Low-confidence reports ("this might be a problem, I'm not sure") create noise. After a few false alarms, you train yourself to ignore all feedback from the reviewer. That is worse than having no reviewer at all.
High-confidence reports are actionable. "This function does not handle the case where user is null, and it is called from a context where null is possible." That is specific, verifiable, and almost certainly correct. You fix it immediately.
Review the diff below. Only report issues you are more than 80% confident are real problems. For each issue, state:
- The file and line
- What the issue is
- Why you are confident it is a problem
Do NOT report style preferences, minor naming suggestions, or potential issues you are unsure about.
The same principle applies to automated checks. If your lint rule fires a hundred warnings about minor style issues, you will stop reading the lint output. Five high-confidence errors are worth more than a hundred low-confidence warnings.
The ROI of verification
I want to share numbers from my own experience. Over the last three months of using this verification pipeline:
- The build check caught issues in roughly 40% of sessions
- Type checking caught additional issues (beyond the build) in about 15% of sessions
- The diff review caught unintended changes in about 25% of sessions
- Total time spent on verification: roughly 30-60 seconds per commit
- Total time saved by catching issues early: impossible to measure exactly, but conservatively hours per week
The math is not close. Verification is the highest-return investment you can make in your agentic workflow.
Where this fits in the bigger picture
Verification closes the loop. You give Claude a task (Chapter 3). Claude executes within context constraints (Chapter 4). You capture friction and automate rules (this chapter). And verification confirms that the output is correct before it ships.
Without verification, the loop is open. Work goes out and you hope it is right. With verification, the loop is closed. Work goes out and you know it is right.
That is the difference between using an AI tool and having an AI workflow. The tool does the work. The workflow guarantees the quality.