Trust Fund Baby, TFBthumb

Kill the screenshot loop.

A browser agent that sees the page continuously: DOM mutations, in-flight network requests, animations, and the accessibility tree, fused into one perceptual stream. It acts against stable element ids, not guessed coordinates. It settles by observation, not by a fixed wait. And it cannot land a consequential action without a single-use human token. Independently verified by a third-party reviewer.
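To make "settles by observation, not by a fixed wait" concrete, here is a minimal sketch in Python. It is not TFBthumb's implementation (the substrate source is closed). The `Event` record, the event kinds, the `settled_at` function, and the 100 ms quiet window are all assumptions made for illustration. The idea: the page counts as settled once nothing is in flight and the stream has been quiet for a short window.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical event record; the kinds and field names are illustrative only.
@dataclass(frozen=True)
class Event:
    t_ms: int   # timestamp in milliseconds
    kind: str   # "mutation", "request_start", "request_end", "anim_start", "anim_end"

STARTS = {"request_start", "anim_start"}
ENDS = {"request_end", "anim_end"}

def settled_at(events: List[Event], quiet_ms: int = 100) -> Optional[int]:
    """Return the earliest time the page can be called settled, or None.

    Settled here means: no request or animation is in flight, and no event
    of any kind arrives for `quiet_ms` milliseconds. Assumes `events` is the
    complete stream up to the returned time.
    """
    in_flight = 0
    ordered = sorted(events, key=lambda e: e.t_ms)
    for i, ev in enumerate(ordered):
        if ev.kind in STARTS:
            in_flight += 1
        elif ev.kind in ENDS:
            in_flight -= 1
        is_last = i + 1 == len(ordered)
        gap_ok = is_last or ordered[i + 1].t_ms - ev.t_ms >= quiet_ms
        if in_flight == 0 and gap_ok:
            return ev.t_ms + quiet_ms
    return None  # something never finished (or nothing was observed)

stream = [
    Event(0, "request_start"),
    Event(50, "mutation"),
    Event(400, "request_end"),
    Event(420, "mutation"),
]
print(settled_at(stream))  # 520
```

In this made-up stream a fixed 200 ms wait would act while the request is still in flight. The observation-based rule waits until 520 ms, which is 100 ms after the last mutation.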

What the receipts say, in numbers

0 / 200

Flakes on the harness vs 130 / 200 for the screenshot loop. Same page, same latencies.

7.7×

Fewer tokens per decision on a content-rich page. 55× fewer bytes.

2.0×

Faster end to end on the agent gate vs the screenshot baseline.

0

Ungated consequential dispatches across every gate run. Structurally impossible by the Ceiling's construction.

How the seven gates prove it

Every claim above is backed by a gate script you can re-run on your own machine. The gates exit with PASS only when they observe the specific receipt, not when a number sounds right.

Phase 0: Retina seed. The page must finish loading and finish animating before settled fires. No screenshots used.
Phase 1: the Thumb. Hover/focus/type/submit, each consequence detected from the same stream the action lands in. A dropped keystroke is recovered mid-word.
Phase 2: the Settle Engine. 200 randomized runs at 50–1500 ms latency. TFBthumb 0 flakes; the screenshot-diff baseline 130 flakes on the same harness.
Phase 3: the Ceiling. No consequential dispatch without a human-minted single-use token. The verifier closure carries only the public key. The ledger detects byte-flip tampering. A swap-label attack (the page mutates a button's name to Send after observation) is caught at authorize time.
Phase 4: agent in the loop. The agent fills a form, is blocked on Submit, the human approves with a single-use token, the agent completes the task, and is faster than the screenshot baseline.
v0.2 effect-gate: the Sentinel. A button mislabeled OK that POSTs to /charge gets its click past the keyword classifier, but the resulting POST is blocked at the wire before it reaches the server.
v0.2.2 identity-break gate: a page swaps a button's underlying DOM node so a different affordance now occupies the same slot. The Brain refuses to act on the slot-replaced element. Closes the reversible-tier swap-attack class.
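The Phase 3 and v0.2.2 gates both come down to binding an approval to exactly what was observed. The sketch below is one way that could look, not the substrate's actual code. `Observed`, `Approver`, `Gate`, and the field names are hypothetical. HMAC from the Python standard library stands in for Ed25519 signatures, which means this toy verifier holds a shared secret, whereas the real verifier is described as carrying only a public key.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Set, Tuple

@dataclass(frozen=True)
class Observed:
    node_id: str  # stable id of the underlying DOM node (hypothetical field)
    name: str     # accessible name at observation time
    action: str   # e.g. "click"

def _digest(o: Observed) -> bytes:
    # Hash of everything the human saw when approving.
    joined = "\x1f".join((o.node_id, o.name, o.action))
    return hashlib.sha256(joined.encode()).digest()

Token = Tuple[bytes, bytes]  # (nonce, tag)

class Approver:
    """Human side: mints a token for exactly one observed action."""
    def __init__(self, key: bytes) -> None:
        self._key = key

    def mint(self, seen: Observed) -> Token:
        nonce = secrets.token_bytes(16)
        tag = hmac.new(self._key, nonce + _digest(seen), hashlib.sha256).digest()
        return nonce, tag

class Gate:
    """Agent side: re-checks the token against the element as it is NOW."""
    def __init__(self, key: bytes) -> None:
        self._key = key  # HMAC stand-in; a signature scheme would need no secret here
        self._spent: Set[bytes] = set()

    def authorize(self, token: Token, current: Observed) -> bool:
        nonce, tag = token
        if nonce in self._spent:
            return False  # single use: already redeemed
        expected = hmac.new(self._key, nonce + _digest(current), hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return False  # label, node, or action changed since observation
        self._spent.add(nonce)
        return True

key = secrets.token_bytes(32)
approver, gate = Approver(key), Gate(key)
seen = Observed("n42", "Save draft", "click")
token = approver.mint(seen)

swap_label = gate.authorize(token, Observed("n42", "Send", "click"))       # renamed
swap_node = gate.authorize(token, Observed("n99", "Save draft", "click"))  # slot replaced
first_use = gate.authorize(token, seen)
second_use = gate.authorize(token, seen)
```

Here `swap_label` and `swap_node` are both refused because the digest at authorize time no longer matches what was approved. `first_use` succeeds and `second_use` fails because the nonce has been spent.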

The verification chain

Every claim above is paired with a receipt the reviewer signs off on. We do not state what we have not had checked.

Verified

Independent reviewer packet: the bounded-claim set, the source SHAs the reviewer must match, the reproduce script.

Gemini reviewed the v0.2 substrate against the packet's 12 falsifiable questions and returned verified.

Verified

Gemini decision receipt: the verdict, question-by-question record, and what the verdict does (and does not) authorize.

Section 5 explicitly states the bounded-claim set holds; sections 6–7 name what wider claims the verdict does NOT open.

Internal

DAVID+ security review: agency ceiling, Ed25519 key custody, untrusted-page fence, swap-attack defense at both tiers.

First pass at v0.2 named 3 class-shaped gaps; second pass at v0.2.2 confirmed all 3 healed structurally.

Internal

CODE_MECHANIC engineering review: parse-gate every file, contract honesty across the module graph, structured Result returns where booleans would lie.

First pass named 4 notes; second pass confirmed all 4 healed. Marked as a Claude Opus 4.7 proxy of the gpt-5.1 substrate.

Heals

v0.2 → v0.2.1 · v0.2.1 → v0.2.2: the heal trail.

Every named gap and engineering note from the inspector reviews has a receipt explaining what got changed, where, and why.

Logs

Phase 2 n=200 receipt · Analytics JSON · Gate run tails

The literal output of the gate scripts on the current substrate. No retouching.

Try it without committing

The playground runs a sandboxed Chromium against any URL you paste. Read-only and reversible actions only. Consequential actions are refused at the wrapper, so a playground run cannot complete a checkout, send a message, or delete an account, even if you ask it to.
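One way a wrapper could refuse consequential actions is to decide on the outgoing request rather than on what a button is called. The sketch below is an illustration under assumptions, not the playground's code. `Request`, `label_looks_consequential`, `wire_allows`, and the keyword list are hypothetical. The safe-method set follows the HTTP definition of safe methods, so it does not catch a GET with side effects.

```python
from dataclasses import dataclass

# Methods HTTP defines as safe (read-only by intent).
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})

# Hypothetical keyword list for a label-based classifier.
CONSEQUENTIAL_WORDS = ("send", "pay", "buy", "delete", "submit")

@dataclass(frozen=True)
class Request:
    method: str
    url: str

def label_looks_consequential(label: str) -> bool:
    """Label-only check: easy to fool with a misleading button name."""
    lowered = label.lower()
    return any(word in lowered for word in CONSEQUENTIAL_WORDS)

def wire_allows(req: Request, has_token: bool = False) -> bool:
    """Wire-level check: non-safe methods need an approval token.

    In a read-only sandbox, has_token would simply never be True.
    """
    return has_token or req.method.upper() in SAFE_METHODS

# A button labeled "OK" that actually POSTs to /charge:
passes_label_check = not label_looks_consequential("OK")
blocked_at_wire = not wire_allows(Request("POST", "/charge"))
browse_ok = wire_allows(Request("GET", "/products"))
```

The label check lets the "OK" click through, the wire check stops the POST, and plain browsing is unaffected.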

Open the playground

Paste a URL. Describe what you want done. Watch the agent move. Download the ledger receipt. → Playground

For beta evaluation

If you want to point TFBthumb at a real workflow that includes consequential actions (submitting a form, completing a checkout, sending a message), the beta API gives you that with a human-in-the-loop approver. Every consequential action surfaces to your approver before it dispatches; nothing fires without your single-use token.

Beta access is paid, scoped to one evaluation, under a one-page agreement. → Request beta access

What this surface does NOT promise

The boundaries that hold

The bounded-claim set above is what is verified. Specifically not claimed: a streaming embodied perception model (the blueprint's Phase 5 — marked unverified); cross-origin iframe orchestration; multi-tab; closed shadow roots; GET-with-side-effects coverage at the wire gate; unattended autonomous loops. Each is named as a residual or a fresh decision in the reviewer packet.

Built by Trust Fund Baby. Verified by independent peer review. Methods are protected; receipts are open. The substrate runs at v0.2.2. The reviewer packet pairs to the v0.2 SHAs; the heal trail pairs each subsequent change to a re-run gate set.

Browse-only public surface. The substrate source is closed; the receipts and gates are open so a reproducer can verify what is claimed without seeing how it is done.