Module: pipeline/stateExplorer

State-based exploration engine — discovers multi-step user flows by executing real UI actions and tracking state transitions.

Reuses

  • pipeline/pageSnapshot.takeSnapshot — DOM snapshot capture
  • pipeline/smartCrawl.extractPathPattern — path normalisation
  • pipeline/stateFingerprint.fingerprintState — state identity
  • pipeline/actionDiscovery.discoverActions — action enumeration
  • pipeline/flowGraph.extractFlows / flowToJourney — flow extraction
  • utils/abortHelper.throwIfAborted — abort signal support
  • utils/runLogger.* — SSE logging

Tuning (from Test Dials → options.explorerTuning)

Parameter      Range        Default  Description
maxStates      5–100        30       Max unique states before stopping
maxDepth       1–10         3        Exploration depth from start URL
maxActions     1–20         8        Actions to try per state
actionTimeout  1000–15000   5000     Per-action timeout in ms
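
One way to apply these ranges is to clamp each incoming value and fall back to the default when a value is missing. The sketch below is illustrative only: `resolveTuning` and `TUNING_SPEC` are hypothetical names, not part of this module's documented API, and the choice to clamp out-of-range values (rather than reject them) is an assumption.

```javascript
// Ranges and defaults copied from the tuning table above.
const TUNING_SPEC = {
  maxStates:     { min: 5,    max: 100,   def: 30 },
  maxDepth:      { min: 1,    max: 10,    def: 3 },
  maxActions:    { min: 1,    max: 20,    def: 8 },
  actionTimeout: { min: 1000, max: 15000, def: 5000 },
};

// Hypothetical helper (assumption): clamps each dial into its documented range.
function resolveTuning(explorerTuning) {
  const input = explorerTuning ?? {};
  const resolved = {};
  for (const [name, { min, max, def }] of Object.entries(TUNING_SPEC)) {
    const raw = input[name];
    // Missing or non-numeric values fall back to the documented default.
    const value = typeof raw === "number" && Number.isFinite(raw) ? raw : def;
    resolved[name] = Math.min(max, Math.max(min, Math.round(value)));
  }
  return resolved;
}

console.log(resolveTuning({ maxStates: 500, actionTimeout: "fast" }));
// { maxStates: 100, maxDepth: 3, maxActions: 8, actionTimeout: 5000 }
```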

Exports

  • exploreStates — full state exploration from a project URL

Methods

(inner) effectiveUrlCap(existingSnapshots) → {number}

Compute the effective per-URL state cap based on fingerprint diversity.

If the existing states at this URL all have different DOM structures or component inventories, the cap is raised to allow deeper exploration of multi-step wizards and SPA flows. If the states are structurally similar (same DOM, different timestamps), the base cap applies.

Parameters:
Name               Type   Description
existingSnapshots  Array  snapshots already captured at this URL

Returns:

effective cap for this URL

Type
number
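
The diversity rule described above can be sketched roughly as follows. This is not the module's implementation: the cap values, the `structureKey` field on each snapshot, and the name `effectiveUrlCapSketch` are all assumptions made for illustration. The real function presumably compares fingerprints produced by pipeline/stateFingerprint.

```javascript
// Assumed values; the real base and raised caps are not documented here.
const BASE_URL_CAP = 3;
const RAISED_URL_CAP = 8;

// Sketch only. Each snapshot is assumed to carry a `structureKey` string
// summarising its DOM structure / component inventory (hypothetical field).
function effectiveUrlCapSketch(existingSnapshots) {
  // With fewer than two snapshots there is no diversity to measure.
  if (!Array.isArray(existingSnapshots) || existingSnapshots.length < 2) {
    return BASE_URL_CAP;
  }
  const keys = existingSnapshots.map((snapshot) => snapshot.structureKey);
  // Raise the cap only when every state at this URL is structurally distinct.
  const allDistinct = new Set(keys).size === keys.length;
  return allDistinct ? RAISED_URL_CAP : BASE_URL_CAP;
}

// Three structurally different wizard steps at one URL: cap is raised.
console.log(effectiveUrlCapSketch([
  { structureKey: "step-1" },
  { structureKey: "step-2" },
  { structureKey: "step-3" },
])); // 8

// Same structure captured twice (e.g. only a timestamp changed): base cap.
console.log(effectiveUrlCapSketch([
  { structureKey: "dashboard" },
  { structureKey: "dashboard" },
])); // 3
```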

(inner) isSameEffectiveOrigin(urlA, urlB) → {boolean}

Check if two URLs share the same effective origin (protocol + normalised host). Treats www.example.com and example.com as equivalent.

Parameters:
Name Type Description
urlA string
urlB string
Returns:
Type
boolean
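
A minimal sketch of this comparison, built on the standard `URL` class. Two choices here are assumptions rather than documented behaviour: the port is ignored (only `protocol` and `hostname` are compared), and unparseable input yields `false` instead of throwing.

```javascript
// Strip a leading "www." so both host forms compare equal.
function normaliseHost(hostname) {
  return hostname.toLowerCase().replace(/^www\./, "");
}

function isSameEffectiveOrigin(urlA, urlB) {
  try {
    const a = new URL(urlA);
    const b = new URL(urlB);
    return (
      a.protocol === b.protocol &&
      normaliseHost(a.hostname) === normaliseHost(b.hostname)
    );
  } catch {
    // Assumption: a URL that cannot be parsed never matches.
    return false;
  }
}

console.log(isSameEffectiveOrigin("https://www.example.com/a", "https://example.com/b")); // true
console.log(isSameEffectiveOrigin("http://example.com/", "https://example.com/"));        // false
```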

(inner) isSameOriginAndValid(currentUrl, projectOrigin) → {boolean}

Check if the current page URL is still on the same origin as the project. Returns false if the action navigated to a third-party domain, a bot detection page, or an error page.

Treats www/non-www as equivalent (e.g. google.com ≡ www.google.com).

Parameters:
Name           Type    Description
currentUrl     string  page.url() after the action
projectOrigin  string  the resolved project origin (after redirect)

Returns:
Type
boolean
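
The check might look something like the sketch below. How the real function recognises bot detection and error pages is not documented here, so the pattern list is purely an assumption: `chrome-error://` is the scheme Chromium uses for its internal error page, while the captcha/challenge path pattern is a hypothetical placeholder. The function name carries a `Sketch` suffix to make clear it is not the module's code.

```javascript
// Assumed patterns; the module's actual detection rules may differ.
const BLOCKED_URL_PATTERNS = [
  /^chrome-error:\/\//i,        // Chromium's internal error page
  /^about:blank$/i,             // navigation that never committed
  /\/(captcha|challenge)\b/i,   // hypothetical bot-detection paths
];

const stripWww = (hostname) => hostname.toLowerCase().replace(/^www\./, "");

function isSameOriginAndValidSketch(currentUrl, projectOrigin) {
  // Reject known error / bot-detection URLs before comparing origins.
  if (BLOCKED_URL_PATTERNS.some((pattern) => pattern.test(currentUrl))) {
    return false;
  }
  try {
    const current = new URL(currentUrl);
    const project = new URL(projectOrigin);
    return (
      current.protocol === project.protocol &&
      stripWww(current.hostname) === stripWww(project.hostname)
    );
  } catch {
    return false; // unparseable URL: treat as having left the project
  }
}

console.log(isSameOriginAndValidSketch("https://www.example.com/cart", "https://example.com")); // true
console.log(isSameOriginAndValidSketch("https://ads.example.net/", "https://example.com"));     // false
console.log(isSameOriginAndValidSketch("chrome-error://chromewebdata/", "https://example.com")); // false
```

In an exploration loop, a `false` result would typically mean the explorer should discard the new state and return to the previous page before trying the next action.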

(inner) normaliseHost(hostname) → {string}

Normalise a hostname for origin comparison by stripping a leading www. prefix. This treats google.com and www.google.com as the same origin, which holds for most real-world sites because one form typically redirects to the other. Other subdomains are left unchanged.

Parameters:
Name Type Description
hostname string
Returns:
Type
string
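
A one-line sketch of the normalisation, with a few edge cases. Lower-casing the hostname is an added assumption (the description only mentions stripping www.); the anchored pattern means only a leading `www.` label is removed.

```javascript
function normaliseHost(hostname) {
  // Assumption: lower-case first so "WWW." is handled too.
  return hostname.toLowerCase().replace(/^www\./, "");
}

console.log(normaliseHost("www.google.com"));       // "google.com"
console.log(normaliseHost("google.com"));           // "google.com"
console.log(normaliseHost("WWW.Example.com"));      // "example.com"
console.log(normaliseHost("www2.example.com"));     // "www2.example.com" (not a "www." prefix)
console.log(normaliseHost("shop.www.example.com")); // "shop.www.example.com" (prefix must lead)
```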