Add tests/checks/test_click_capture_selftest.sh: runs the real Electron
STEPFORGE_CLICK_SELFTEST session and asserts every scenario passes — 3/3
markers at 0.00% offset, 8/8 burst clicks kept on finish, the first armed
click captured (warmup click ignored), and the debounce (4/4). Picked up
automatically by tests/run_test.sh. Skips cleanly when the host has no
capture environment so it never falsely fails CI, but fails the suite on any
real regression in click->screenshot->step behavior.
Document the guard in ARCHITECTURE.md and CHANGELOG.md.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root cause: warm-before-hide kept the window visible during backend warmup,
and on a restart that warmup could take several seconds (the stream backend
start waits up to 8s). During that visible window, clicks over the app were
skipped by the userIsInApp guard and clicks elsewhere were shot post-click,
so a restarted session looked like it stopped after one click.
- Recording is now 'armed' only after the window is hidden and the buffer is
primed. A new warmingUp flag makes onOsClick ignore clicks during warmup
(the window is covering the user's work anyway) instead of mishandling
them. Cleared on pause/finish.
- armRecording caps the warmup wait (WARMUP_MAX_MS=1500): the window hides
and the session arms even if the backend start hangs, so it can never sit
visible for seconds dropping clicks. The backend keeps coming up in the
background; the first click or two may take the fresh-shot fallback.
- A generation token invalidates an in-flight backend start whose session
has since finished, so a slow start can't install into a new session or
leave the starting-guard stuck and block the restart from starting one.
Tests: 4 new behavioral capture tests (warmup ignores clicks; pause/finish
clear it; armRecording warms-then-hides-then-arms; a hung start still arms
within the cap; a stale start is discarded and frees the guard) plus a new
end-to-end self-test scenario (warmup click ignored, first armed click
captured). 152 unit tests + all repo checks pass.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Per request: clicks of the same button closer together than
capture.clickDebounceMs (default 200ms) now collapse into a single step, so
accidental fast/double clicks don't each become a step. It is a leading-edge
debounce measured from the last *accepted* click, so a run of fast clicks
can't push the next deliberate click out — two clicks spaced beyond the
window (e.g. the reported 400-500ms apart) always register.
Replaces the prior 8ms duplicate-delivery suppression (subsumed by the
window). Configurable; 0 captures every click.
Tests (the point of this change is that it can't silently regress):
- 13 behavioral unit tests in capture.test.js that drive real onOsClick
calls with controlled timestamps and assert which clicks survive — the
reported 400/450/500ms cases, sub-window collapse, the 200ms boundary,
per-button independence, configurability, debounce=0, last-accepted (not
last-dropped) reference, session reset, and a full onOsClick -> queue ->
store integration check. No keyword/comment assertions.
- A fourth end-to-end self-test scenario (burst of 40ms clicks collapses to
1; three 300ms-apart clicks each register => 4 total). The marker/drain
scenarios set debounce to 0 so they keep stressing the frame pipeline.
147 unit tests + all repo checks pass.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The first screenshot of a session was late while every later one was fine.
Cause: on 'Start recording' the window hid first and the capture backend
started warming up after — creating worker, getUserMedia, first frame takes
~1s. A click in that gap found no buffered frame and took the post-click
fresh shot.
armRecording() now warms the recorder while the window is still visible and
only hides once frames are buffering (with a brief post-hide settle so the
first frame shows the user's screen, not the dismissed app window). Verified
end to end with a new self-test scenario that clicks 250ms after start: the
first click is now served a pre-click frame instead of a post-click shot.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The remaining 'captured slightly after the click' reports came from the
fresh-shot fallback, which grabs the screen when the click is processed
(after it). The previous lead change made that fallback *more* likely: a
frame now had to be >=120ms before the click to qualify, so on machines
where the capture stream can't always keep a frame that old buffered, more
clicks fell through to the post-click shot.
Make the click-lead a two-tier preference instead of a hard gate in
selectFrameForClick:
1. newest frame captured at least leadMs before the click (ideal margin), else
2. newest frame captured before the click at all.
Only when no pre-click frame exists does the caller fresh-shot. leadMs is
threaded through the stream backend to the worker so both selection paths
agree. Verified end to end: frames land ~120-170ms before each click,
markers stay at 0.00%, and the 8-click burst still saves all 8.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Real-world recording now saves every click with exact markers; the only
remaining nit was screenshots feeling a touch late. Add a configurable
click-lead (capture.clickLeadMs, default 120ms) that targets the screen
just before the hook timestamp, and tighten the stream sampling cadence to
50ms so a frame near that target always exists. Verified end to end: frames
now land ~120-160ms before the click (was 25-57ms), markers stay at 0.00%
offset, and the 8-click burst still saves all 8.
Also document the milestone in docs/CHANGELOG.md and remove an accidental
paste of Gitea commit-page text from it.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root cause of 'I clicked many times but only got two screenshots':
finishing/pausing a session called backend.stop(), which cancelled every
in-flight frame request to null. Clicks whose PNG had not finished
encoding yet were then dropped — only the first few survived.
Fixes:
- Stream backend now *drains* on stop: it stops accepting new requests but
keeps the worker alive until frames already selected for queued clicks
finish encoding. stop({ immediate: true }) keeps the old abandon-now
behavior for an unhealthy worker.
- Two-stage worker reply: a fast 'frame-selected' ack pins the pairing and
proves liveness; the slow PNG payload follows. A slow encode (seconds on
software-rendered hosts) is no longer mistaken for a dead worker, which
had been forcing the post-click fresh-shot fallback (late screenshots).
- Queued clicks carry their guide id and are stored even if the session
ends while they wait in the queue.
- The tray gesture that stops a session is discarded by matching its
recorded screen position, not a time window — a fast workflow click near
the stop is no longer collateral damage. (Replaces the earlier grace
window, which dropped whole bursts.)
- A click on a display with no ready stream resolves null so the caller
fresh-shots the correct monitor instead of returning another screen.
- STEPFORGE_CAPTURE_LOG=1 prints one line per click decision; the
second-instance handler now surfaces the running window instead of
exiting silently.
- Self-test gains a fast-burst-then-finish scenario (8/8 saved) and the
marker/coordinate checks remain at 0.00% offset.
Tests: 133 unit + all repo checks passing.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Implements the architecture change from ai_prompts/prompt3.md:
- New app/click-frames.js: shared timestamped frame ring + strict
click-to-frame pairing (never a frame whose grab started after the
click); legacy slack behavior kept behind capture.strictClickFrames=false.
- New stream capture backend (app/stream-backend.js + hidden worker
window): per-display desktop media streams sampled into ring buffers
and PNG-encoded entirely off the main process, so click delivery is
never starved by capture work. Auto-degrades to the legacy in-process
frame loop when streams cannot start or the worker stops answering.
- Clicks are paired with their frame at event time (eager pairing in
enqueueClickCapture); only the storing is serialized, so slow encodes
cannot skew later clicks in a fast burst.
- Linux watcher: restored event-time root coordinates from
xinput test-xi2 and merge raw/regular twin events structurally.
- Replaced the 40ms time debounce with source-aware duplicate
suppression: fast legitimate clicks are never dropped.
- New app/coords.js: physical-to-DIP conversion with multi-monitor and
scale-factor handling; Windows keeps screenToDipPoint.
- STEPFORGE_CLICK_SELFTEST end-to-end hook: 3/3 clicks become steps via
the stream backend with 0.00% marker offset on this host.
- Tests rewritten/added: strict selection, coords, stream backend,
Linux coordinate parsing, twin merge, burst clicking (126 passing).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>