Rearchitect click capture: strict click-time frames, off-main-process recorder, exact marker coordinates
Template tests / tests (push) Successful in 1m50s
Implements the architecture change from ai_prompts/prompt3.md:

- New app/click-frames.js: shared timestamped frame ring + strict
  click-to-frame pairing (never a frame whose grab started after the
  click); legacy slack behavior kept behind
  capture.strictClickFrames=false.
- New stream capture backend (app/stream-backend.js + hidden worker
  window): per-display desktop media streams sampled into ring buffers
  and PNG-encoded entirely off the main process, so click delivery is
  never starved by capture work. Auto-degrades to the legacy in-process
  frame loop when streams cannot start or the worker stops answering.
- Clicks are paired with their frame at event time (eager pairing in
  enqueueClickCapture); only the storing is serialized, so slow encodes
  cannot skew later clicks in a fast burst.
- Linux watcher: restored event-time root coordinates from
  xinput test-xi2 and merge raw/regular twin events structurally.
- Replaced the 40ms time debounce with source-aware duplicate
  suppression: fast legitimate clicks are never dropped.
- New app/coords.js: physical-to-DIP conversion with multi-monitor and
  scale-factor handling; Windows keeps screenToDipPoint.
- STEPFORGE_CLICK_SELFTEST end-to-end hook: 3/3 clicks become steps via
  the stream backend with 0.00% marker offset on this host.
- Tests rewritten/added: strict selection, coords, stream backend, Linux
  coordinate parsing, twin merge, burst clicking (126 passing).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@@ -102,6 +102,41 @@ IPC API (`stepforge.*`), and `app/main.js` routes calls into `core/`. Screen
capture uses Electron's `desktopCapturer` (full screen, window) and an
overlay window for region selection; hotkeys use `globalShortcut`.

## Click-Capture Pipeline

Workflow recording must behave like one click → one step, with the
screenshot showing the screen *at* the click and the marker on the exact
click position. Three pieces make that hold:

1. **OS click events** (`app/capture.js`): a low-level mouse hook on Windows
(`CLICK x y button unixMs` lines), an `xinput test-xi2 --root` watcher on
X11. The Linux parser carries event-time `root:` coordinates and merges
raw/regular twin blocks structurally; there is no time-based debounce
that could drop fast clicks, only suppression of identical duplicate
deliveries. Physical coordinates convert to DIP via
`screen.screenToDipPoint` on Windows or display-geometry math in
`app/coords.js` elsewhere (multi-monitor and scale-factor aware).

2. **Frame recorders**: while recording, a hidden worker window
(`app/stream-backend.js` + `app/renderer/capture-worker.js`) samples a
desktop media stream per display into a timestamped ring buffer,
entirely off the main process, so click delivery is never delayed by
capture work, and PNG encoding happens in the worker. If streams can't
start (portal-less Wayland), or the worker stops answering, the service
degrades to the legacy in-process `desktopCapturer` loop.

3. **Click ↔ frame pairing** (`app/click-frames.js`, shared by the main
process, the worker, and tests): each click is paired *at event time*
with the newest frame captured at or before its hook timestamp. In strict
mode (`capture.strictClickFrames`, default on) a frame whose grab started
after the click is never used: when nothing qualifies, the service takes
an explicit fresh shot instead of passing a post-click frame off as the
click-time screen. Storing is serialized per click; pairing is not, so
slow encodes never skew later clicks.
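The pairing rule in step 3 can be illustrated with a minimal sketch. This is not the code in `app/click-frames.js`: the `FrameRing` class, the `startedMs` field, and the `slackMs` option are hypothetical names, and the non-strict branch is only one assumed reading of the legacy slack behavior. The sketch also assumes frames are pushed in chronological order.

```javascript
// Hypothetical sketch; names and shapes are illustrative only and are not
// taken from app/click-frames.js.

// Fixed-capacity ring of frames, each stamped with the time its grab started.
class FrameRing {
  constructor(capacity) {
    this.capacity = capacity;
    this.frames = []; // oldest first, assumed pushed in chronological order
  }

  push(frame) {
    this.frames.push(frame);
    // Drop the oldest frame once the ring is full.
    if (this.frames.length > this.capacity) this.frames.shift();
  }

  // Strict mode: newest frame whose grab started at or before the click.
  // Non-strict mode (assumed legacy behavior): a frame whose grab started up
  // to `slackMs` after the click is also accepted.
  pick(clickMs, { strict = true, slackMs = 0 } = {}) {
    const limit = strict ? clickMs : clickMs + slackMs;
    for (let i = this.frames.length - 1; i >= 0; i--) {
      if (this.frames[i].startedMs <= limit) return this.frames[i];
    }
    return null; // nothing qualifies: the caller takes an explicit fresh shot
  }
}

const ring = new FrameRing(3);
ring.push({ id: "a", startedMs: 100 });
ring.push({ id: "b", startedMs: 200 });
ring.push({ id: "c", startedMs: 300 });

// Click at t=250: strict mode returns "b", never the post-click frame "c".
const strictPick = ring.pick(250);
// With 60 ms of slack, the assumed legacy mode would accept "c".
const legacyPick = ring.pick(250, { strict: false, slackMs: 60 });
// Click before any frame exists: no pairing is possible.
const noPick = ring.pick(50);
```

Returning `null` rather than the nearest later frame keeps the "fresh shot" case explicit, so a caller cannot silently mislabel a post-click frame as the click-time screen.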

`STEPFORGE_CLICK_SELFTEST=1 npm start` exercises the whole pipeline in a
real Electron session and reports steps-per-click and marker offsets.
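Marker offsets depend on the physical-to-DIP conversion from step 1. As a rough sketch of display-geometry math of that kind (not the actual `app/coords.js`), the function below maps a physical point to DIP across several displays. The `physicalToDip` name and the `physicalBounds` field are assumptions: the sketch takes each display's physical pixel rectangle as given by the caller, alongside Electron-style `bounds` (in DIP) and `scaleFactor`.

```javascript
// Hypothetical sketch; `physicalToDip` and `physicalBounds` are illustrative
// names, not the API of app/coords.js.

// Map a point in physical pixels to DIP by locating the display whose
// physical rectangle contains it and scaling the offset inside that display.
function physicalToDip(point, displays) {
  const hit = displays.find(({ physicalBounds: p }) =>
    point.x >= p.x && point.x < p.x + p.width &&
    point.y >= p.y && point.y < p.y + p.height);
  if (!hit) return null; // point lies outside every known display

  const { bounds, physicalBounds, scaleFactor } = hit;
  return {
    x: bounds.x + (point.x - physicalBounds.x) / scaleFactor,
    y: bounds.y + (point.y - physicalBounds.y) / scaleFactor,
  };
}

// Two side-by-side displays: a 1x primary and a 2x secondary to its right.
const displays = [
  {
    bounds: { x: 0, y: 0, width: 1920, height: 1080 },
    physicalBounds: { x: 0, y: 0, width: 1920, height: 1080 },
    scaleFactor: 1,
  },
  {
    bounds: { x: 1920, y: 0, width: 1280, height: 720 },
    physicalBounds: { x: 1920, y: 0, width: 2560, height: 1440 },
    scaleFactor: 2,
  },
];

const onPrimary = physicalToDip({ x: 100, y: 100 }, displays);
const onSecondary = physicalToDip({ x: 2920, y: 500 }, displays);
const offScreen = physicalToDip({ x: 9000, y: 9000 }, displays);
```

Here the click on the secondary display lands 1000 physical pixels into a 2x display, so it maps to 500 DIP past that display's DIP origin.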

## Security Rules

- Zero network code paths: no sockets, no telemetry, no update or license