Module: pipeline/stateFingerprint

State fingerprinting for the state-based exploration engine.

Produces a deterministic fingerprint of the current browser state that goes beyond the existing module:pipeline/smartCrawl.fingerprintStructure (which only hashes element tags). A state fingerprint captures:

  • URL route (pathname + hash, with significant query params)
  • Route param pattern (numeric segments normalised to :id)
  • DOM structural shape (reuses smartCrawl.fingerprintStructure)
  • Visible text content hash (with dynamic value normalisation)
  • UI component inventory (modals, sidebars, dropdowns, toasts, etc.)
  • SPA framework markers and loading/error states
  • Form field states (empty vs filled vs error)

Two states are considered identical when their fingerprints match. This lets the explorer avoid infinite exploration loops and detect meaningful state transitions.

Exports

  • fingerprintState(snapshot) → string
  • statesEqual(fp1, fp2) → boolean

Methods

(static) fingerprintState(snapshot) → {string}

Produce a deterministic fingerprint of the current application state.

Combines route (with significant query params and normalised path params), DOM structure, visible content (with dynamic value normalisation), UI component inventory, SPA markers, and form state into a single hash string. Used by the state explorer to detect whether an action caused a meaningful state transition.

Parameters:
Name      Type    Description
snapshot  object  page snapshot from module:pipeline/pageSnapshot.takeSnapshot

Returns:

deterministic fingerprint string

Type
string
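
The combination step described above can be sketched as follows. This is a minimal illustration, not the module's implementation: fingerprintStateSketch, hashForSketch, and the parts object (with the individual signals already computed as strings) are hypothetical names, and the hash is a stand-in.

```javascript
// Stand-in hash for illustration only (31-multiplier rolling hash, base-36).
function hashForSketch(str) {
  let h = 0;
  for (let i = 0; i < str.length; i++) {
    h = (Math.imul(h, 31) + str.charCodeAt(i)) | 0; // stay within 32 bits
  }
  return (h >>> 0).toString(36);
}

// Hypothetical: `parts` holds the already-computed signals as strings.
function fingerprintStateSketch(parts) {
  // A fixed key order keeps the result deterministic regardless of how the
  // caller assembled the object.
  const order = ["route", "structure", "content", "components", "spa", "forms"];
  const joined = order.map((key) => `${key}=${parts[key] ?? ""}`).join("|");
  return hashForSketch(joined);
}

const before = fingerprintStateSketch({ route: "/users/:id", components: "sidebar" });
const after = fingerprintStateSketch({ components: "sidebar", route: "/users/:id" });
console.log(before === after); // true: key order in the input does not matter
```

Joining labelled parts with a separator before hashing means a change in any single signal (for example, a modal opening) changes the input string, so the explorer would see a different fingerprint.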

(static) statesEqual(fp1, fp2) → {boolean}

Check if two state fingerprints represent the same application state.

Parameters:
Name  Type    Description
fp1   string
fp2   string
Returns:
Type
boolean
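
A sketch of how fingerprint comparison might be used to stop an explorer from revisiting states. It assumes fingerprints are plain strings compared by value, which this page does not confirm; statesEqualSketch and makeVisitedTracker are hypothetical names.

```javascript
// Assumption: fingerprints are plain strings compared by value.
function statesEqualSketch(fp1, fp2) {
  return typeof fp1 === "string" && fp1 === fp2;
}

// Hypothetical explorer bookkeeping: remember every fingerprint seen so far.
function makeVisitedTracker() {
  const seen = new Set();
  return {
    // Returns true the first time a fingerprint is recorded, false afterwards.
    markVisited(fp) {
      if (seen.has(fp)) return false;
      seen.add(fp);
      return true;
    },
  };
}

const tracker = makeVisitedTracker();
console.log(tracker.markVisited("abc123")); // true: new state, worth exploring
console.log(tracker.markVisited("abc123")); // false: already explored
```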

(inner) componentInventory(snapshot) → {string}

Build a sorted, deterministic inventory of visible UI component types.

Goes beyond the original hasModals / hasTabs boolean flags to enumerate the full set of component types present on the page. This ensures that two pages with the same headings but different component layouts (e.g. sidebar visible vs collapsed) produce different fingerprints.

Parameters:
Name      Type    Description
snapshot  object  page snapshot from takeSnapshot

Returns:

sorted component inventory string

Type
string
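
A minimal sketch of a sorted inventory, assuming each snapshot element carries a componentType string and a visible flag. Both field names are hypothetical; the real snapshot shape is not shown on this page.

```javascript
// Hypothetical element shape: { componentType: "modal", visible: true }.
function componentInventorySketch(snapshot) {
  const types = new Set();
  for (const el of snapshot.elements ?? []) {
    if (el.visible && el.componentType) types.add(el.componentType);
  }
  // Sorting makes the result independent of DOM order.
  return [...types].sort().join(",");
}

const expanded = componentInventorySketch({
  elements: [
    { componentType: "sidebar", visible: true },
    { componentType: "modal", visible: true },
  ],
});
const collapsed = componentInventorySketch({
  elements: [
    { componentType: "sidebar", visible: false },
    { componentType: "modal", visible: true },
  ],
});
console.log(expanded);  // "modal,sidebar"
console.log(collapsed); // "modal"
```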

(inner) extractRoute(url) → {string}

Extract the route portion of a URL with significant query params.

Normalises trailing slashes so /about/ and /about fingerprint the same. Numeric path segments are normalised to :id so /users/123 and /users/456 produce the same route pattern (#52 defect #2). Significant query params (category, sort, view, etc.) are included in sorted order; noise params are stripped (#52 defect #1).

Parameters:
Name  Type    Description
url   string
Returns:
Type
string
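
The normalisation rules above can be sketched with the standard URL API. The allow-list below contains only the three params named in the description and is an assumption, as are the function name and the exact output layout (path, then query, then hash).

```javascript
// Assumed allow-list: only the params named in the description above.
const SIGNIFICANT_PARAMS = new Set(["category", "sort", "view"]);

function extractRouteSketch(url) {
  const u = new URL(url);
  // Replace purely numeric segments with :id, then drop trailing slashes.
  const path =
    u.pathname
      .split("/")
      .map((segment) => (/^\d+$/.test(segment) ? ":id" : segment))
      .join("/")
      .replace(/\/+$/, "") || "/";
  // Keep significant params only, sorted by key for a stable order.
  const query = [...u.searchParams.entries()]
    .filter(([key]) => SIGNIFICANT_PARAMS.has(key))
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, value]) => `${key}=${value}`)
    .join("&");
  return path + (query ? `?${query}` : "") + u.hash;
}

console.log(extractRouteSketch("https://example.com/about/"));
// "/about"
console.log(
  extractRouteSketch("https://example.com/users/123?utm_source=x&sort=asc&category=books#tab")
);
// "/users/:id?category=books&sort=asc#tab"
```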

(inner) formStateSignature(formStructures) → {string}

Compute a form-state signature from the snapshot's formStructures. Captures which fields are filled vs empty and whether required fields have values — this distinguishes "clean form" from "form with errors".

Parameters:
Name            Type   Description
formStructures  Array  form structures from pageSnapshot.js

Returns:
Type
string
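
One way to express such a signature is to count filled, empty, and required-but-empty fields per form. The sketch below assumes each form structure has a fields array whose entries carry value and required; that shape, the function name, and the output encoding are all hypothetical.

```javascript
// Hypothetical shape: [{ fields: [{ value: "", required: true }, ...] }, ...]
function formStateSignatureSketch(formStructures) {
  return (formStructures ?? [])
    .map((form) => {
      let filled = 0;
      let empty = 0;
      let requiredEmpty = 0;
      for (const field of form.fields ?? []) {
        const hasValue = String(field.value ?? "").trim() !== "";
        if (hasValue) {
          filled++;
        } else {
          empty++;
          if (field.required) requiredEmpty++;
        }
      }
      // One compact token per form: filled, empty, required-but-empty.
      return `f${filled}e${empty}r${requiredEmpty}`;
    })
    .join("|");
}

const cleanForm = [{ fields: [{ value: "", required: true }, { value: "" }] }];
const filledForm = [{ fields: [{ value: "sam@example.com", required: true }, { value: "hi" }] }];
console.log(formStateSignatureSketch(cleanForm));  // "f0e2r1"
console.log(formStateSignatureSketch(filledForm)); // "f2e0r0"
```

Counting rather than recording the typed values keeps the signature stable when only the entered text differs.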

(inner) hashVisibleContent(elements) → {string}

Hash the visible text content from a snapshot's elements.

Only uses STRUCTURAL text signals (headings, button labels, link text) — not dynamic content like timestamps, counters, or personalised greetings. This prevents trivially different snapshots of the same page (e.g. google.com with different doodle text) from being treated as distinct states.

Dynamic values (order numbers, counts, prices) are normalised before hashing so "Order #12345" and "Order #12346" produce the same hash (#52 defect #5).

Parameters:
Name      Type   Description
elements  Array
Returns:
Type
string
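
The selection of structural text can be sketched as below. The element shape (tag and text fields), the tag list, and the function name are assumptions, and a crude digit collapse stands in for the module's dynamic value normalisation. The sketch returns the key string that would then be hashed; the hashing step itself is omitted.

```javascript
// Assumed set of "structural" tags: headings, buttons, and links.
const STRUCTURAL_TAGS = new Set(["h1", "h2", "h3", "h4", "h5", "h6", "button", "a"]);

// Hypothetical element shape: { tag: "h1", text: "Order #12345" }.
function structuralTextKeySketch(elements) {
  return (elements ?? [])
    .filter((el) => STRUCTURAL_TAGS.has(String(el.tag).toLowerCase()))
    // Stand-in normalisation: collapse every digit run to a single "0".
    .map((el) => String(el.text ?? "").replace(/\d+/g, "0").trim())
    .filter((text) => text !== "")
    .join("\n");
}

const visitOne = structuralTextKeySketch([
  { tag: "h1", text: "Order #12345" },
  { tag: "p", text: "Good morning, Sam" },
  { tag: "button", text: "Pay" },
]);
const visitTwo = structuralTextKeySketch([
  { tag: "h1", text: "Order #12346" },
  { tag: "p", text: "Good evening, Alex" },
  { tag: "button", text: "Pay" },
]);
console.log(visitOne === visitTwo); // true: only non-structural or dynamic text differs
```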

(inner) normaliseDynamicText(text) → {string}

Normalise dynamic text fragments in a string.

Strips order/ticket numbers, counts with units, currency amounts, timestamps, and other dynamic values that would cause trivially different fingerprints for the same logical state (#52 defect #5).

Parameters:
Name  Type    Description
text  string
Returns:
Type
string
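
A minimal sketch of this kind of normalisation using ordered regular-expression replacements. The patterns and placeholder tokens are illustrative assumptions, not the module's actual rules, and they cover only a few of the value types listed above.

```javascript
function normaliseDynamicTextSketch(text) {
  return text
    // Clock times such as 10:42, 10:42:07, or 9:05 pm.
    .replace(/\b\d{1,2}:\d{2}(?::\d{2})?(?:\s?(?:am|pm)\b)?/gi, "<time>")
    // Currency amounts such as $1,299.00 or €5.
    .replace(/[$€£]\s?\d[\d,]*(?:\.\d+)?/g, "<money>")
    // Order or ticket numbers written as #12345.
    .replace(/#\d+/g, "#<id>")
    // Any remaining standalone number, e.g. the count in "3 items".
    .replace(/\b\d[\d,]*(?:\.\d+)?\b/g, "<n>")
    // Collapse whitespace so spacing differences do not matter.
    .replace(/\s+/g, " ")
    .trim();
}

console.log(normaliseDynamicTextSketch("Order #12345"));     // "Order #<id>"
console.log(normaliseDynamicTextSketch("3 items in cart"));  // "<n> items in cart"
console.log(normaliseDynamicTextSketch("Total: $1,299.00")); // "Total: <money>"
console.log(normaliseDynamicTextSketch("Updated 10:42 am")); // "Updated <time>"
```

The order of the replacements matters: the more specific patterns (times, money, #-prefixed ids) run before the generic number pattern so that their digits are not consumed first.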

(inner) simpleHash(str) → {string}

Simple deterministic hash — reuses the same algorithm as module:pipeline/smartCrawl.fingerprintStructure and module:pipeline/deduplicator.simpleHash.

Parameters:
Name  Type    Description
str   string
Returns:

base-36 hash

Type
string
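
This page does not show the algorithm itself. As an illustration of what a simple deterministic hash with base-36 output can look like, the sketch below uses a 31-multiplier rolling hash over UTF-16 code units; this is an assumed stand-in, and the shared algorithm in smartCrawl and deduplicator may differ.

```javascript
// Assumed stand-in algorithm: 32-bit rolling hash, rendered in base 36.
function simpleHashSketch(str) {
  let h = 0;
  for (let i = 0; i < str.length; i++) {
    // Math.imul keeps the multiplication within 32-bit integer arithmetic.
    h = (Math.imul(h, 31) + str.charCodeAt(i)) | 0;
  }
  // >>> 0 reinterprets the value as unsigned so the output has no minus sign.
  return (h >>> 0).toString(36);
}

console.log(simpleHashSketch("a")); // "2p"
console.log(simpleHashSketch("route=/about") === simpleHashSketch("route=/about")); // true
```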