Files
autodoc/docs/ARCHITECTURE.md
T
Iisyourdad a0b69f8cc7
Template tests / tests (push) Successful in 1m50s
Rearchitect click capture: strict click-time frames, off-main-process recorder, exact marker coordinates
Implements the architecture change from ai_prompts/prompt3.md:

- New app/click-frames.js: shared timestamped frame ring + strict
  click-to-frame pairing (never a frame whose grab started after the
  click); legacy slack behavior kept behind capture.strictClickFrames=false.
- New stream capture backend (app/stream-backend.js + hidden worker
  window): per-display desktop media streams sampled into ring buffers
  and PNG-encoded entirely off the main process, so click delivery is
  never starved by capture work. Auto-degrades to the legacy in-process
  frame loop when streams cannot start or the worker stops answering.
- Clicks are paired with their frame at event time (eager pairing in
  enqueueClickCapture); only the storing is serialized, so slow encodes
  cannot skew later clicks in a fast burst.
- Linux watcher: restored event-time root coordinates from
  xinput test-xi2 and merge raw/regular twin events structurally.
- Replaced the 40ms time debounce with source-aware duplicate
  suppression: fast legitimate clicks are never dropped.
- New app/coords.js: physical-to-DIP conversion with multi-monitor and
  scale-factor handling; Windows keeps screenToDipPoint.
- STEPFORGE_CLICK_SELFTEST end-to-end hook: 3/3 clicks become steps via
  the stream backend with 0.00% marker offset on this host.
- Tests rewritten/added: strict selection, coords, stream backend,
  Linux coordinate parsing, twin merge, burst clicking (126 passing).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 21:33:31 -05:00

7.4 KiB

Architecture

StepForge is split into a dependency-free Node.js core and a thin Electron desktop shell. All product logic lives in core/ and exporters/ and runs (and is tested) headlessly with plain node. The shell in app/ only provides the window, the canvas UI, screen capture, hotkeys, and the clipboard.

Repository Layout

app/            Electron shell: main process, preload bridge, renderer UI
core/           Dependency-free domain logic (schema, store, archive, search,
                placeholders, render AST, png/gif/pdf/zip primitives, locks,
                snapshots, settings)
exporters/      One module per output format, all consuming the Render AST
scripts/        bootstrap / verify / build / package scripts (sh + ps1)
tests/
  run_test.sh   entrypoint — runs every tests/checks/test_*.sh
  checks/       shell wrappers that invoke the node test suites
  unit/         node:test workflow suites
  fixtures/     test images and guides
examples/       sample guide + sample exports
assets/         app icon and packaged static assets
build/          agent_audit.md, build_report.md, artifacts_manifest.json
docs/           file-format and data-model documentation
vendor/         reserved for vendored deps (reuse only; nothing fetched)

Data Model

Internal working storage is folder-based; sharing/backups use a single-file zip archive (.sfgz).

<data root>/                      (~/.local/share/stepforge, %APPDATA%/stepforge,
  settings/                        or $STEPFORGE_DATA_DIR)
    app-settings.json
    placeholders.json             global placeholders
    templates/<format>/<name>.template.json
  library/
    folders.json                  folder tree + guide->folder mapping
    guides/<guide-id>/
      guide.json                  guide metadata (schema below)
      steps/<step-id>/
        step.json
        original.png              never mutated after capture
        working.png               crop target; annotations stay vector JSON
      history/snapshots/*.zip     automated + manual backups
    index/search-index.json       inverted full-text index
  temp/                           previews; cleaned on close
  shared-links/                   linked-guide registry
  • guide.json — schemaVersion, guideId, title, descriptionHtml, placeholders, flags (focusedViewDefault, ...), stepsOrder, favorite, linkedSource, exportProfiles, createdAt/updatedAt.
  • step.json — stepId, parentStepId, kind (image | empty | content), status (todo | in-progress | done), title, descriptionHtml, hidden, skipped, focusedView {enabled, zoom, panX, panY}, image paths + size, annotations[] (normalized scene graph, coordinates in 0..1 fractions of the image), textBlocks[], codeBlocks[], tableBlocks[], links[].

All writes are atomic (write to *.tmp, fsync, rename). Deleting a guide moves it to library/trash/ first.

Annotation Scene Graph

Annotations are stored as normalized JSON, never as an editor-library blob. Coordinates are fractions of the image (resolution-independent). Types: rect, oval, line, arrow, text, tooltip, number, blur, highlight, magnify, cursor. The same geometry is rendered by the editor canvas (HTML5 canvas) and by the export rasterizer (core/raster.js), so what you see is what exports.

Render Pipeline

guide.json + step.json + settings
        │ core/renderast.js  (placeholder expansion, numbering, filtering
        ▼                     hidden/skipped, focused-view geometry)
   Render AST  ──► exporters/json.js        .json + steps-<title>/ images
               ──► exporters/markdown.js    .md  + steps-<title>/ images
               ──► exporters/html-simple.js single self-contained .html
               ──► exporters/html-rich.js   checkboxes + floating TOC
               ──► exporters/pdf.js         native PDF writer (core/pdf.js)
               ──► exporters/gif.js         GIF89a encoder (core/gif.js)
               ──► exporters/image-bundle.js annotated PNGs + metadata
               ──► exporters/docx.js        zip+XML (core/zip.js)
               ──► exporters/pptx.js        zip+XML (core/zip.js)

Image-bearing exporters rasterize annotations with core/raster.js on top of PNG pixels decoded by core/png.js. Every exporter accepts a template object (per-format settings persisted under settings/templates/, shareable as .sfglt zip files).

Shell / Core Boundary

The renderer never touches the filesystem. app/preload.js exposes a typed IPC API (stepforge.*), and app/main.js routes calls into core/. Screen capture uses Electron's desktopCapturer (full screen, window) and an overlay window for region selection; hotkeys use globalShortcut.

Click-Capture Pipeline

Workflow recording must behave like one click → one step, with the screenshot showing the screen at the click and the marker on the exact click position. Three pieces make that hold:

  1. OS click events (app/capture.js): a low-level mouse hook on Windows (CLICK x y button unixMs lines), an xinput test-xi2 --root watcher on X11. The Linux parser carries event-time root: coordinates and merges raw/regular twin blocks structurally — there is no time-based debounce that could drop fast clicks, only suppression of identical duplicate deliveries. Physical coordinates convert to DIP via screen.screenToDipPoint on Windows or display-geometry math in app/coords.js elsewhere (multi-monitor and scale-factor aware).

  2. Frame recorders: while recording, a hidden worker window (app/stream-backend.js + app/renderer/capture-worker.js) samples a desktop media stream per display into a timestamped ring buffer — entirely off the main process, so click delivery is never delayed by capture work, and PNG encoding happens in the worker. If streams can't start (portal-less Wayland), or the worker stops answering, the service degrades to the legacy in-process desktopCapturer loop.

  3. Click ↔ frame pairing (app/click-frames.js, shared by the main process, the worker, and tests): each click is paired at event time with the newest frame captured at or before its hook timestamp. In strict mode (capture.strictClickFrames, default on) a frame whose grab started after the click is never used — when nothing qualifies, the service takes an explicit fresh shot instead of passing a post-click frame off as the click-time screen. Storing is serialized per click; pairing is not, so slow encodes never skew later clicks.

STEPFORGE_CLICK_SELFTEST=1 npm start exercises the whole pipeline in a real Electron session and reports steps-per-click and marker offsets.

Security Rules

  • Zero network code paths: no sockets, no telemetry, no update or license checks, no remote fonts in exports.
  • Archive imports validate every entry name against path traversal and absolute paths before extraction (core/zip.js).
  • Linked guides use sidecar *.lock-sfgz lock files; conflicts surface a keep-editing / discard dialog and last-write-wins is documented.
  • Renderer runs with contextIsolation: true, nodeIntegration: false, and sandbox enabled; only the preload API is exposed.

Workflow

  1. Change core/exporter logic together with its workflow tests in tests/unit/.
  2. Put shell checks in tests/checks/ so the shared runner picks them up.
  3. Run bash tests/run_test.sh locally.
  4. Open a pull request so CI can verify on PR open.