State fingerprinting for the state-based exploration engine.
Produces a deterministic fingerprint of the current browser state that goes
beyond the existing module:pipeline/smartCrawl.fingerprintStructure
(which only hashes element tags). A state fingerprint captures:
- URL route (pathname + hash, with significant query params)
- Route param pattern (numeric segments normalised to :id)
- DOM structural shape (reuses smartCrawl.fingerprintStructure)
- Visible text content hash (with dynamic value normalisation)
- UI component inventory (modals, sidebars, dropdowns, toasts, etc.)
- SPA framework markers and loading/error states
- Form field states (empty vs filled vs error)
Two states are considered identical when their fingerprints match, preventing infinite exploration loops and detecting meaningful transitions.
Exports
- fingerprintState(snapshot) → string
- statesEqual(fp1, fp2) → boolean
Methods
(static) fingerprintState(snapshot) → {string}
Produce a deterministic fingerprint of the current application state.
Combines route (with significant query params and normalised path params), DOM structure, visible content (with dynamic value normalisation), UI component inventory, SPA markers, and form state into a single hash string. Used by the state explorer to detect whether an action caused a meaningful state transition.
Parameters:
| Name | Type | Description |
|---|---|---|
| snapshot | object | page snapshot from |
Returns:
deterministic fingerprint string
- Type
- string
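The combination step described above can be sketched as follows. This is a minimal illustration, not the module's actual code: `combineFingerprintParts`, `PART_ORDER`, and the part names are hypothetical, the hash is a stand-in, and each component signature is assumed to have been computed already.

```javascript
// Stand-in hash (assumption): 32-bit polynomial hash rendered in base 36.
function simpleHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = (Math.imul(hash, 31) + str.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(36);
}

// Hypothetical part names. A fixed order keeps the result deterministic
// regardless of the key order of the object passed in.
const PART_ORDER = ['route', 'structure', 'content', 'components', 'spa', 'forms'];

// Joins the component signatures in a fixed order and hashes the result.
function combineFingerprintParts(parts) {
  const joined = PART_ORDER
    .map((name) => `${name}=${parts[name] ?? ''}`)
    .join('\n');
  return simpleHash(joined);
}
```

Because every part feeds the same hash input, a change in any single signal (for example the form state) is enough to change the fingerprint.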
(static) statesEqual(fp1, fp2) → {boolean}
Check if two state fingerprints represent the same application state.
Parameters:
| Name | Type | Description |
|---|---|---|
| fp1 | string | |
| fp2 | string | |
Returns:
- Type
- boolean
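A minimal sketch of how fingerprint comparison can prevent exploration loops, assuming fingerprints are plain strings. The `statesEqual` body shown here is an assumption (strict string equality), and `createVisitedTracker` is a hypothetical helper that does not appear in the module.

```javascript
// Assumed behaviour: two fingerprints match when they are the same string.
function statesEqual(fp1, fp2) {
  return typeof fp1 === 'string' && fp1 === fp2;
}

// Hypothetical helper: remembers fingerprints already explored and reports
// whether a given fingerprint is being seen for the first time.
function createVisitedTracker() {
  const seen = new Set();
  return {
    isNewState(fp) {
      if (seen.has(fp)) return false;
      seen.add(fp);
      return true;
    },
  };
}

// Usage: an action only counts as a transition when the fingerprint changes.
const tracker = createVisitedTracker();
const before = 'k3x9a1';
const after = 'k3x9a1';
const transitioned = !statesEqual(before, after);
const firstVisit = tracker.isNewState(before);
const secondVisit = tracker.isNewState(after);
```

In this usage, `transitioned` is `false`, `firstVisit` is `true`, and `secondVisit` is `false`, so an explorer built this way would not queue the same state twice.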
(inner) componentInventory(snapshot) → {string}
Build a sorted, deterministic inventory of visible UI component types.
Goes beyond the original hasModals / hasTabs boolean flags to enumerate
the full set of component types present on the page. This ensures that
two pages with the same headings but different component layouts (e.g.
sidebar visible vs collapsed) produce different fingerprints.
Parameters:
| Name | Type | Description |
|---|---|---|
| snapshot | object | page snapshot from takeSnapshot |
Returns:
sorted component inventory string
- Type
- string
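One way such an inventory could be built is sketched below. The snapshot shape (`snapshot.elements` with `tag`, `role`, and `visible` fields) and the detector table are assumptions made for illustration; the real detection rules are not documented here.

```javascript
// Hypothetical detectors keyed by component type. Each receives one element.
const COMPONENT_DETECTORS = {
  modal: (el) => el.role === 'dialog',
  sidebar: (el) => el.tag === 'aside' || el.role === 'complementary',
  dropdown: (el) => el.role === 'listbox' || el.role === 'menu',
  toast: (el) => el.role === 'alert' || el.role === 'status',
  tabs: (el) => el.role === 'tablist',
};

function componentInventory(snapshot) {
  const present = new Set();
  for (const el of snapshot.elements || []) {
    // Hidden elements (e.g. a collapsed sidebar) do not count as present.
    if (el.visible === false) continue;
    for (const [type, matches] of Object.entries(COMPONENT_DETECTORS)) {
      if (matches(el)) present.add(type);
    }
  }
  // Sorting makes the result independent of DOM order.
  return [...present].sort().join(',');
}
```

With these assumptions, a page whose sidebar is collapsed yields a different inventory string from the same page with the sidebar visible, which is the distinction the description above calls for.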
(inner) extractRoute(url) → {string}
Extract the route portion of a URL with significant query params.
Normalises trailing slashes so /about/ and /about fingerprint the same.
Numeric path segments are normalised to :id so /users/123 and
/users/456 produce the same route pattern (#52 defect #2).
Significant query params (category, sort, view, etc.) are included in
sorted order; noise params are stripped (#52 defect #1).
Parameters:
| Name | Type | Description |
|---|---|---|
| url | string | |
Returns:
- Type
- string
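The normalisation rules above can be sketched with the standard `URL` class. This is an illustrative sketch, not the module's code: the `SIGNIFICANT_PARAMS` allow-list is an assumption containing only the three params named in the description, and the exact output format is a guess.

```javascript
// Assumed allow-list. The description names category, sort and view "etc."
// without giving the full set.
const SIGNIFICANT_PARAMS = new Set(['category', 'sort', 'view']);

function extractRoute(url) {
  const { pathname, hash, searchParams } = new URL(url);

  // Replace purely numeric segments with :id, then drop trailing slashes.
  const path = pathname
    .split('/')
    .map((segment) => (/^\d+$/.test(segment) ? ':id' : segment))
    .join('/')
    .replace(/\/+$/, '') || '/';

  // Keep only significant params, sorted by key for determinism.
  const query = [...searchParams.entries()]
    .filter(([key]) => SIGNIFICANT_PARAMS.has(key))
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([key, value]) => `${key}=${value}`)
    .join('&');

  return path + (query ? `?${query}` : '') + hash;
}
```

Under these assumptions, `/users/123` and `/users/456` both map to `/users/:id`, and a tracking param such as `utm_source` never reaches the fingerprint.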
(inner) formStateSignature(formStructures) → {string}
Compute a form-state signature from the snapshot's formStructures. Captures which fields are filled vs empty and whether required fields have values — this distinguishes "clean form" from "form with errors".
Parameters:
| Name | Type | Description |
|---|---|---|
| formStructures | Array | from pageSnapshot.js |
Returns:
- Type
- string
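A minimal sketch of such a signature follows. The shape of `formStructures` is not documented here, so the `fields`, `value`, and `required` properties are assumptions, as is the signature format.

```javascript
// Assumed input shape: [{ fields: [{ name, value, required }] }]
function formStateSignature(formStructures) {
  const isEmpty = (field) => String(field.value ?? '').trim() === '';

  return (formStructures || [])
    .map((form, index) => {
      const fields = form.fields || [];
      const filled = fields.filter((f) => !isEmpty(f)).length;
      // Required fields left empty are what separates a clean form
      // from one that would fail validation.
      const missingRequired = fields.filter((f) => f.required && isEmpty(f)).length;
      return `form${index}:${filled}/${fields.length}:missing=${missingRequired}`;
    })
    .join(';');
}
```

The signature records counts rather than field values, so typing different text into the same field does not by itself produce a new state.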
(inner) hashVisibleContent(elements) → {string}
Hash the visible text content from a snapshot's elements.
Only uses STRUCTURAL text signals (headings, button labels, link text) — not dynamic content like timestamps, counters, or personalised greetings. This prevents trivially different snapshots of the same page (e.g. google.com with different doodle text) from being treated as distinct states.
Dynamic values (order numbers, counts, prices) are normalised before hashing so "Order #12345" and "Order #12346" produce the same hash (#52 defect #5).
Parameters:
| Name | Type | Description |
|---|---|---|
| elements | Array | |
Returns:
- Type
- string
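The filtering and hashing described above might look like the sketch below. The element shape (`tag` and `text`), the `STRUCTURAL_TAGS` set, and the simplified helper bodies are all assumptions. They stand in for the module's own `normaliseDynamicText` and `simpleHash`, whose implementations are not shown in this documentation.

```javascript
// Simplified stand-ins for the module's helpers (assumed behaviour).
const normaliseDynamicText = (text) =>
  text.replace(/\d+/g, '<n>').replace(/\s+/g, ' ').trim();

const simpleHash = (str) => {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = (Math.imul(hash, 31) + str.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(36);
};

// Assumed set of tags treated as structural text signals.
const STRUCTURAL_TAGS = new Set(['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'button', 'a']);

function hashVisibleContent(elements) {
  const signals = elements
    // Keep headings, buttons and links. Body text is ignored entirely.
    .filter((el) => STRUCTURAL_TAGS.has(String(el.tag).toLowerCase()) && el.text)
    // Normalise dynamic values before they reach the hash.
    .map((el) => normaliseDynamicText(el.text));
  return simpleHash(signals.join('|'));
}
```

In this sketch a paragraph holding a timestamp never contributes to the hash, while a heading such as "Order #12345" contributes only its normalised form.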
(inner) normaliseDynamicText(text) → {string}
Normalise dynamic text fragments in a string.
Strips order/ticket numbers, counts with units, currency amounts, timestamps, and other dynamic values that would cause trivially different fingerprints for the same logical state (#52 defect #5).
Parameters:
| Name | Type | Description |
|---|---|---|
| text | string | |
Returns:
- Type
- string
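A sketch of this kind of normalisation using regular expressions is shown below. The specific patterns and the placeholder tokens (`<date>`, `<time>`, `<amount>`, `<n>`) are assumptions chosen for illustration; the module's real rules may differ.

```javascript
function normaliseDynamicText(text) {
  return text
    // ISO-style dates, e.g. 2024-05-01
    .replace(/\b\d{4}-\d{2}-\d{2}\b/g, '<date>')
    // Clock times, e.g. 14:32 or 14:32:07
    .replace(/\b\d{1,2}:\d{2}(?::\d{2})?\b/g, '<time>')
    // Currency amounts, e.g. $1,299.00 or €15
    .replace(/[$€£]\s?\d[\d,]*(?:\.\d+)?/g, '<amount>')
    // Order or ticket identifiers, e.g. #12345
    .replace(/#\d+/g, '#<n>')
    // Any remaining digit run, e.g. the count in "3 items"
    .replace(/\d+/g, '<n>')
    // Collapse whitespace so spacing differences do not matter
    .replace(/\s+/g, ' ')
    .trim();
}
```

The order of the replacements matters in this sketch: the specific patterns run first so that the generic digit rule only sees what is left over.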
(inner) simpleHash(str) → {string}
Simple deterministic hash — reuses the same algorithm as
module:pipeline/smartCrawl.fingerprintStructure and
module:pipeline/deduplicator.simpleHash.
Parameters:
| Name | Type | Description |
|---|---|---|
| str | string | |
Returns:
base-36 hash
- Type
- string
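The shared algorithm is not reproduced in this documentation, so the following is only a plausible sketch of a deterministic string hash with base-36 output. The multiplier 31 and the 32-bit width are assumptions.

```javascript
// Hypothetical sketch: the real algorithm may use a different mixing step.
function simpleHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    // hash * 31 + char code, truncated to a signed 32-bit integer by `| 0`
    hash = (Math.imul(hash, 31) + str.charCodeAt(i)) | 0;
  }
  // Convert to unsigned first so the base-36 string has no minus sign.
  return (hash >>> 0).toString(36);
}
```

A hash like this is fast and stable across runs, but it is not collision-resistant, which is acceptable for state deduplication and unsuitable for anything security-related.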