Trust Fund Baby, TFBthumb

Kill the screenshot loop.

A browser agent that sees the page continuously: DOM mutations, in-flight network requests, animations, and the accessibility tree, fused into one perceptual stream. Acts against stable element ids, not guessed coordinates. Settles by observation, not by a fixed wait. And cannot land a consequential action without a single-use human token. Independently verified by a third-party reviewer.
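To make "settles by observation, not by a fixed wait" concrete, here is a minimal sketch. It is not TFBthumb's implementation: `SettleTracker`, `wait_until_settled`, and the 250 ms quiet window are hypothetical names and values chosen for illustration. The idea is that the page counts as settled only when no request is in flight and no mutation or network event has been seen for a short quiet window.

```python
import time
from dataclasses import dataclass


@dataclass
class SettleTracker:
    """Hypothetical tracker: settled = nothing in flight + a quiet window."""
    last_activity: float          # timestamp of the most recent observed event
    quiet_window: float = 0.25    # assumed value, seconds of silence required
    in_flight: int = 0            # network requests started but not finished

    def on_mutation(self, now: float) -> None:
        self.last_activity = now

    def on_request_start(self, now: float) -> None:
        self.in_flight += 1
        self.last_activity = now

    def on_request_end(self, now: float) -> None:
        self.in_flight = max(0, self.in_flight - 1)
        self.last_activity = now

    def is_settled(self, now: float) -> bool:
        quiet_for = now - self.last_activity
        return self.in_flight == 0 and quiet_for >= self.quiet_window


def wait_until_settled(tracker, timeout=5.0, poll=0.05,
                       clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll the tracker until it reports settled or the deadline passes.

    Returns True when settled, False on timeout. The tracker is assumed to
    be fed by event callbacks running elsewhere (not shown here).
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if tracker.is_settled(clock()):
            return True
        sleep(poll)
    return False
```

Unlike a fixed `sleep(2)`, the wait ends as soon as the page goes quiet and keeps waiting while a slow request is still open, up to the timeout.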

What the receipts say, in numbers

0 / 200
Flakes on the harness vs 130 / 200 for the screenshot loop. Same page, same latencies.
7.7×
Fewer tokens per decision on a content-rich page. 55× fewer bytes.
2.0×
Faster end to end on the agent gate vs the screenshot baseline.
0
Ungated consequential dispatches across every gate run. Structurally impossible by the Ceiling's construction.

How the seven gates prove it

Every claim above is backed by a gate script you can re-run on your own machine. The gates exit with PASS only when they observe the specific receipt, not when a number sounds right.
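As an illustration of "PASS only when the specific receipt is observed", the sketch below shows the shape such a check could take. The receipt fields `runs` and `flakes`, the function names, and the exit-code convention are assumptions made for this example, not the actual gate scripts or their receipt schema.

```python
def flake_gate(receipt: dict, required_runs: int = 200) -> tuple[bool, str]:
    """Hypothetical gate: pass only on an exact, well-formed receipt."""
    runs = receipt.get("runs")
    flakes = receipt.get("flakes")
    # A missing or mistyped field is a failure, never a silent pass.
    if type(runs) is not int or type(flakes) is not int:
        return False, "receipt missing integer 'runs' or 'flakes'"
    if runs != required_runs:
        return False, f"expected {required_runs} runs, observed {runs}"
    if flakes != 0:
        return False, f"observed {flakes} flakes in {runs} runs"
    return True, "PASS"


def main(receipt: dict) -> int:
    """Print the verdict and return a process-style exit code (0 = PASS)."""
    ok, reason = flake_gate(receipt)
    print("PASS" if ok else f"FAIL: {reason}")
    return 0 if ok else 1
```

The point of the design is that the gate fails closed: an incomplete run, a malformed receipt, or a single flake all produce a non-zero exit rather than a number that merely sounds right.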

The verification chain

Every claim above is paired with a receipt the reviewer signs off on. We do not state what we have not had checked.

Verified
Independent reviewer packet: the bounded-claim set, the source SHAs the reviewer must match, the reproduce script.
Gemini reviewed the v0.2 substrate against the packet's 12 falsifiable questions and returned verified.
Verified
Gemini decision receipt: the verdict, question-by-question record, and what the verdict does (and does not) authorize.
Section 5 explicitly states the bounded-claim set holds; sections 6–7 name what wider claims the verdict does NOT open.
Internal
DAVID+ security review: agency ceiling, Ed25519 key custody, untrusted-page fence, swap-attack defense at both tiers.
First pass at v0.2 named 3 class-shaped gaps; second pass at v0.2.2 confirmed all 3 healed structurally.
Internal
CODE_MECHANIC engineering review: parse-gate every file, contract honesty across the module graph, structured Result returns where booleans would lie.
First pass named 4 notes; second pass confirmed all 4 healed. Marked as a Claude Opus 4.7 proxy of the gpt-5.1 substrate.
Heals
v0.2 → v0.2.1 · v0.2.1 → v0.2.2: the heal trail.
Every named gap and engineering note from the inspector reviews has a receipt explaining what got changed, where, and why.
Logs
Phase 2 n=200 receipt · Analytics JSON · Gate run tails
The literal output of the gate scripts on the current substrate. No retouching.

Try it without committing

The playground runs a sandboxed Chromium against any URL you paste. Read-only and reversible actions only. Consequential actions are refused at the wrapper, so a playground run cannot complete a checkout, send a message, or delete an account, even if you ask it to.
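A wrapper-level refusal of this kind can be sketched as follows. The action names, the `Tier` enum, and `playground_dispatch` are hypothetical and stand in for whatever classification the playground actually uses; the sketch only illustrates the fail-closed idea that anything not known to be read-only or reversible is treated as consequential.

```python
from enum import Enum


class Tier(Enum):
    READ_ONLY = "read_only"
    REVERSIBLE = "reversible"
    CONSEQUENTIAL = "consequential"


class RefusedAction(Exception):
    """Raised when the wrapper refuses to run an action."""


# Assumed example classification; a real table would be far larger.
ACTION_TIERS = {
    "read_text": Tier.READ_ONLY,
    "scroll": Tier.REVERSIBLE,
    "submit_checkout": Tier.CONSEQUENTIAL,
    "send_message": Tier.CONSEQUENTIAL,
    "delete_account": Tier.CONSEQUENTIAL,
}


def playground_dispatch(action_name, execute):
    """Run `execute` only for read-only or reversible actions.

    Unknown actions default to CONSEQUENTIAL, so the wrapper fails closed.
    """
    tier = ACTION_TIERS.get(action_name, Tier.CONSEQUENTIAL)
    if tier is Tier.CONSEQUENTIAL:
        raise RefusedAction(f"{action_name!r} is refused in the playground")
    return execute()
```

Because the check sits in the wrapper rather than in the agent's prompt, asking the agent to complete a checkout changes nothing: the call never reaches `execute`.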

Open the playground

Paste a URL. Describe what you want done. Watch the agent move. Download the ledger receipt. → Playground

For beta evaluation

If you want to point TFBthumb at a real workflow that includes consequential actions (submitting a form, completing a checkout, sending a message), the beta API gives you that with a human-in-the-loop approver. Every consequential action surfaces to your approver before it dispatches; nothing fires without your single-use token.
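The single-use token flow can be sketched roughly as below. This is an assumption-laden illustration, not the beta API: `Approver`, `Dispatcher`, and the token format are hypothetical, and a shared-key HMAC from the Python standard library stands in for the Ed25519 signatures mentioned in the security review. What it shows is the two properties that matter: a token is bound to one specific action, and it can be spent only once.

```python
import hashlib
import hmac
import secrets


def _mac(key: bytes, nonce: str, action: str) -> bytes:
    # Bind the token to both a fresh nonce and the exact action text.
    msg = f"{nonce}:{action}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest().encode()


class Approver:
    """Hypothetical human-side component that mints one token per approval."""

    def __init__(self, key: bytes):
        self._key = key

    def approve(self, action: str) -> str:
        nonce = secrets.token_hex(16)
        return f"{nonce}.{_mac(self._key, nonce, action).decode()}"


class Dispatcher:
    """Hypothetical gate in front of every consequential action."""

    def __init__(self, key: bytes):
        self._key = key
        self._spent = set()   # nonces that have already been used

    def dispatch(self, action: str, token: str) -> str:
        nonce, _, mac = token.partition(".")
        expected = _mac(self._key, nonce, action)
        if not hmac.compare_digest(mac.encode(), expected):
            return "bad-token"    # wrong action, wrong key, or malformed
        if nonce in self._spent:
            return "replayed"     # token was already consumed
        self._spent.add(nonce)
        return "dispatched"
```

A token approved for one checkout cannot be reused for a second checkout or redirected to a different action; each of those attempts returns a reason string instead of dispatching.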

Beta access is paid, scoped to one evaluation, under a one-page agreement. → Request beta access

What this surface does NOT promise

The boundaries that hold

The bounded-claim set above is what is verified. Specifically not claimed: a streaming embodied perception model (the blueprint's Phase 5 — marked unverified); cross-origin iframe orchestration; multi-tab; closed shadow roots; GET-with-side-effects coverage at the wire gate; unattended autonomous loops. Each is named as a residual or a fresh decision in the reviewer packet.