Root cause of 'I clicked many times but only got two screenshots':
finishing/pausing a session called backend.stop(), which cancelled every
in-flight frame request to null. Clicks whose PNG had not finished
encoding yet were then dropped — only the first few survived.
Fixes:
- Stream backend now *drains* on stop: it stops accepting new requests but
keeps the worker alive until frames already selected for queued clicks
finish encoding. stop({ immediate: true }) keeps the old abandon-now
behavior for an unhealthy worker.
- Two-stage worker reply: a fast 'frame-selected' ack pins the pairing and
proves liveness; the slow PNG payload follows. A slow encode (seconds on
software-rendered hosts) is no longer mistaken for a dead worker, which
had been forcing the post-click fresh-shot fallback (late screenshots).
- Queued clicks carry their guide id and are stored even if the session
ends while they wait in the queue.
- The tray gesture that stops a session is discarded by matching its
recorded screen position, not a time window — a fast workflow click near
the stop is no longer collateral damage. (Replaces the earlier grace
window, which dropped whole bursts.)
- A click on a display with no ready stream resolves null so the caller
fresh-shots the correct monitor instead of returning another screen.
- STEPFORGE_CAPTURE_LOG=1 prints one line per click decision; the
second-instance handler now surfaces the running window instead of
exiting silently.
- Self-test gains a fast-burst-then-finish scenario (8/8 saved) and the
marker/coordinate checks remain at 0.00% offset.
Tests: 133 unit + all repo checks passing.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Implements the architecture change from ai_prompts/prompt3.md:
- New app/click-frames.js: shared timestamped frame ring + strict
click-to-frame pairing (never a frame whose grab started after the
click); legacy slack behavior kept behind capture.strictClickFrames=false.
- New stream capture backend (app/stream-backend.js + hidden worker
window): per-display desktop media streams sampled into ring buffers
and PNG-encoded entirely off the main process, so click delivery is
never starved by capture work. Auto-degrades to the legacy in-process
frame loop when streams cannot start or the worker stops answering.
- Clicks are paired with their frame at event time (eager pairing in
enqueueClickCapture); only the storing is serialized, so slow encodes
cannot skew later clicks in a fast burst.
- Linux watcher: restored event-time root coordinates from
xinput test-xi2 and merge raw/regular twin events structurally.
- Replaced the 40ms time debounce with source-aware duplicate
suppression: fast legitimate clicks are never dropped.
- New app/coords.js: physical-to-DIP conversion with multi-monitor and
scale-factor handling; Windows keeps screenToDipPoint.
- STEPFORGE_CLICK_SELFTEST end-to-end hook: 3/3 clicks become steps via
the stream backend with 0.00% marker offset on this host.
- Tests rewritten/added: strict selection, coords, stream backend,
Linux coordinate parsing, twin merge, burst clicking (126 passing).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- New Capture sessions now start paused; the window only tucks away once
the user clicks "Start recording" in the capture bar instead of hiding
~1.2s after starting.
- The capture status bar is shown only in the editor view, not over the
library.
- Fix openModal/confirmDialog resolving as cancelled when an action button
is clicked, which made the step "Delete" button (and other modal actions)
silently no-op.
- Click-triggered captures now use the click-time cursor position for the
marker and arm the capture cache as soon as recording starts, so the
first click is captured instantly and accurately placed.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>