The remaining 'captured slightly after the click' reports came from the
fresh-shot fallback, which grabs the screen when the click is processed
(after it). The previous lead change made that fallback *more* likely: a
frame now had to be >=120ms before the click to qualify, so on machines
where the capture stream can't always keep a frame that old buffered, more
clicks fell through to the post-click shot.
Make the click-lead a two-tier preference instead of a hard gate in
selectFrameForClick:
1. newest frame captured at least leadMs before the click (ideal margin), else
2. newest frame captured before the click at all.
Only when no pre-click frame exists does the caller fresh-shot. leadMs is
threaded through the stream backend to the worker so both selection paths
agree. Verified end to end: frames land ~120-170ms before each click,
markers stay at 0.00%, and the 8-click burst still saves all 8.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Real-world recording now saves every click with exact markers; the only
remaining nit was screenshots feeling a touch late. Add a configurable
click-lead (capture.clickLeadMs, default 120ms) that targets the screen
just before the hook timestamp, and tighten the stream sampling cadence to
50ms so a frame near that target always exists. Verified end to end: frames
now land ~120-160ms before the click (was 25-57ms), markers stay at 0.00%
offset, and the 8-click burst still saves all 8.
Also document the milestone in docs/CHANGELOG.md and remove an accidental
paste of Gitea commit-page text from it.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root cause of 'I clicked many times but only got two screenshots':
finishing/pausing a session called backend.stop(), which cancelled every
in-flight frame request to null. Clicks whose PNG had not finished
encoding yet were then dropped — only the first few survived.
Fixes:
- Stream backend now *drains* on stop: it stops accepting new requests but
keeps the worker alive until frames already selected for queued clicks
finish encoding. stop({ immediate: true }) keeps the old abandon-now
behavior for an unhealthy worker.
- Two-stage worker reply: a fast 'frame-selected' ack pins the pairing and
proves liveness; the slow PNG payload follows. A slow encode (seconds on
software-rendered hosts) is no longer mistaken for a dead worker, which
had been forcing the post-click fresh-shot fallback (late screenshots).
- Queued clicks carry their guide id and are stored even if the session
ends while they wait in the queue.
- The tray gesture that stops a session is discarded by matching its
recorded screen position, not a time window — a fast workflow click near
the stop is no longer collateral damage. (Replaces the earlier grace
window, which dropped whole bursts.)
- A click on a display with no ready stream resolves null so the caller
fresh-shots the correct monitor instead of returning another screen.
- STEPFORGE_CAPTURE_LOG=1 prints one line per click decision; the
second-instance handler now surfaces the running window instead of
exiting silently.
- Self-test gains a fast-burst-then-finish scenario (8/8 saved) and the
marker/coordinate checks remain at 0.00% offset.
Tests: 133 unit + all repo checks passing.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Implements the architecture change from ai_prompts/prompt3.md:
- New app/click-frames.js: shared timestamped frame ring + strict
click-to-frame pairing (never a frame whose grab started after the
click); legacy slack behavior kept behind capture.strictClickFrames=false.
- New stream capture backend (app/stream-backend.js + hidden worker
window): per-display desktop media streams sampled into ring buffers
and PNG-encoded entirely off the main process, so click delivery is
never starved by capture work. Auto-degrades to the legacy in-process
frame loop when streams cannot start or the worker stops answering.
- Clicks are paired with their frame at event time (eager pairing in
enqueueClickCapture); only the storing is serialized, so slow encodes
cannot skew later clicks in a fast burst.
- Linux watcher: restored event-time root coordinates from
xinput test-xi2 and merge raw/regular twin events structurally.
- Replaced the 40ms time debounce with source-aware duplicate
suppression: fast legitimate clicks are never dropped.
- New app/coords.js: physical-to-DIP conversion with multi-monitor and
scale-factor handling; Windows keeps screenToDipPoint.
- STEPFORGE_CLICK_SELFTEST end-to-end hook: 3/3 clicks become steps via
the stream backend with 0.00% marker offset on this host.
- Tests rewritten/added: strict selection, coords, stream backend,
Linux coordinate parsing, twin merge, burst clicking (126 passing).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>