JSDoc: Module: crawler

Autonomous QA pipeline — thin orchestration layer for the 8-stage test generation pipeline.

Pipeline stages

#	Stage	Module
1	Smart crawl / Explore	`pipeline/crawlBrowser.js` or `pipeline/stateExplorer.js`
	↳ HAR capture	`pipeline/harCapture.js` (attached to BrowserContext)
2	Element filtering	`pipeline/elementFilter.js`
3	Intent classification	`pipeline/intentClassifier.js`
4	Journey generation	`pipeline/journeyGenerator.js`
4b	API test generation	`pipeline/journeyGenerator.js` + `prompts/apiTestPrompt.js`
5	Deduplication	`pipeline/pipelineOrchestrator.js`
6	Assertion enhancement	`pipeline/pipelineOrchestrator.js`
7	Validate tests	`pipeline/pipelineOrchestrator.js`
8	Feedback loop	`pipeline/feedbackLoop.js`

Explorer modes (Test Dials `exploreMode`)

crawl (default) — link-only BFS crawl via crawlBrowser.js
state — state-based exploration via stateExplorer.js that executes real UI actions (click, fill, submit) and tracks state transitions to discover multi-step user flows
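
As a rough illustration of what "link-only BFS" means for the default mode, the sketch below walks a small in-memory link graph breadth-first. The `linkGraph` map, the `bfsCrawl` name, and the `maxPages` cap are assumptions made for this example; they are not taken from `crawlBrowser.js`, which drives a real browser rather than a static map.

```javascript
// Hypothetical link graph: URL -> outgoing links (stands in for real page loads).
const linkGraph = new Map([
  ["/", ["/login", "/products"]],
  ["/login", ["/", "/reset"]],
  ["/products", ["/products/1", "/"]],
  ["/reset", []],
  ["/products/1", []],
]);

// Breadth-first traversal: pages are visited in order of link distance from the start.
function bfsCrawl(startUrl, graph, maxPages = 50) {
  const visited = new Set([startUrl]);
  const queue = [startUrl];
  const order = [];
  while (queue.length > 0 && order.length < maxPages) {
    const url = queue.shift();
    order.push(url);
    for (const next of graph.get(url) ?? []) {
      if (!visited.has(next)) {
        visited.add(next); // mark on enqueue so a page is never queued twice
        queue.push(next);
      }
    }
  }
  return order;
}

console.log(bfsCrawl("/", linkGraph));
// [ '/', '/login', '/products', '/reset', '/products/1' ]
```

A link-only traversal like this never clicks, fills, or submits anything, which is the gap the state mode is described as covering.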

Exports

generateFromUserDescription — Generate test(s) from a user description (skips crawl).
crawlAndGenerateTests — Full 8-stage pipeline from URL crawl or state exploration.

Source:

crawler.js, line 1

Methods

(static) crawlAndGenerateTests(project, run, [options]) → {Promise.<void>}

Full 8-stage pipeline: crawl a project URL, classify pages, generate tests, deduplicate, enhance, validate, and persist.

Parameters:

Name	Type	Attributes	Description
`project`	Object		The project `{ id, name, url, credentials? }`.
`run`	Object		The run record (mutated in place with results).
`options`	Object	<optional>	See Properties below.

Properties

Name	Type	Attributes	Description
`dialsPrompt`	string	<optional>	Pre-built prompt fragment from Test Dials config.
`testCount`	string	<optional>	Test count hint (`"one"` | `"small"` | `"medium"` | `"large"` | `"ai_decides"`).
`explorerMode`	string	<optional>	`"crawl"` (default) or `"state"` — from Test Dials.
`explorerTuning`	Object	<optional>	Numeric tuning for state explorer `{ maxStates, maxDepth, maxActions, actionTimeout }`.
`signal`	AbortSignal	<optional>	Abort signal for cancellation.

Source:

crawler.js, line 325

Returns:

Type: Promise.<void>
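
The sketch below shows one way a caller might assemble and sanity-check the options object before invoking the pipeline. `checkOptions` is a hypothetical helper, not part of `crawler.js`; the allowed values are copied from the Properties table above, and the `explorerTuning` numbers are arbitrary placeholders (the source does not state their units or defaults). `AbortController` and `AbortSignal` are the standard globals available in current Node.js releases.

```javascript
// Values documented for the options object above.
const TEST_COUNTS = ["one", "small", "medium", "large", "ai_decides"];
const EXPLORER_MODES = ["crawl", "state"];

// Hypothetical helper (not part of crawler.js): validate options before the call.
function checkOptions(options = {}) {
  const { testCount, explorerMode = "crawl", signal } = options;
  if (testCount !== undefined && !TEST_COUNTS.includes(testCount)) {
    throw new TypeError(`unknown testCount: ${testCount}`);
  }
  if (!EXPLORER_MODES.includes(explorerMode)) {
    throw new TypeError(`unknown explorerMode: ${explorerMode}`);
  }
  if (signal !== undefined && !(signal instanceof AbortSignal)) {
    throw new TypeError("signal must be an AbortSignal");
  }
  // Return a copy with the documented default applied.
  return { ...options, explorerMode };
}

const controller = new AbortController();
const options = checkOptions({
  testCount: "small",
  // Placeholder numbers for illustration only.
  explorerTuning: { maxStates: 20, maxDepth: 3, maxActions: 50, actionTimeout: 5000 },
  signal: controller.signal,
});
console.log(options.explorerMode); // crawl

// The call itself would then look like this, assuming the module is imported:
//   await crawlAndGenerateTests(project, run, options);
// Calling controller.abort() elsewhere requests cancellation through `signal`.
```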

(static) generateFromUserDescription()

Generates test(s) from a user-provided name and description (no crawl needed).

Uses a dedicated AI prompt that produces tests matching the user's stated intent. The number of tests is controlled by the testCount dial (1–20, default "one"). Unlike the crawl pipeline, which discovers pages automatically, this function skips Steps 1-3 and goes straight to AI generation.

Pipeline:

Steps 1-3: SKIPPED (Crawl, Filter, Classify; the user provides intent directly)
Step 4 (Generate): AI generates test(s) from name + description
Step 5 (Deduplicate): Check against existing project tests
Step 6 (Enhance): Strengthen assertions
Step 7 (Validate): Reject malformed / placeholder tests
Step 8: Done

Source:

crawler.js, line 231

(async, inner) filterAndClassify(snapshots, snapshotsByUrl, project, run, [signal]) → {Promise.<{filteredSnapshots: Array.<object>, classifiedPages: Array.<object>, classifiedPagesByUrl: Record.<string, object>}>}

Shared Steps 2 & 3: Element filtering + intent classification. Extracted to avoid duplication between the "state" and "crawl" branches.

Parameters:

Name	Type	Attributes	Description
`snapshots`	Array.<object>		Raw page snapshots from crawl or explore.
`snapshotsByUrl`	Record.<string, object>		URL → snapshot map (mutated in place).
`project`	object		Project record (`url` is used for log trimming).
`run`	object		Mutable run record.
`signal`	AbortSignal	<optional>	Abort signal for cancellation.

Source:

crawler.js, line 177

Returns:

Type: Promise.<{filteredSnapshots: Array.<object>, classifiedPages: Array.<object>, classifiedPagesByUrl: Record.<string, object>}>
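
The return value pairs an array of classified pages with a URL-keyed lookup. A minimal sketch of how such a lookup could be derived from the array follows; the `indexByUrl` helper and the `url` / `intent` fields are assumptions for illustration and do not describe the actual logic in `crawler.js`.

```javascript
// Hypothetical helper: index page records by URL (the last record wins on duplicates).
function indexByUrl(pages) {
  return Object.fromEntries(pages.map((page) => [page.url, page]));
}

// Example records; the `intent` values are made up for this sketch.
const classifiedPages = [
  { url: "https://example.com/login", intent: "auth" },
  { url: "https://example.com/cart", intent: "checkout" },
];

const classifiedPagesByUrl = indexByUrl(classifiedPages);
console.log(Object.keys(classifiedPagesByUrl).length); // 2
console.log(classifiedPagesByUrl["https://example.com/cart"].intent); // checkout
```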

(inner) runDiffAwareBaseline(project, run, snapshots, mode, [opts]) → {Object}

AUTO-002 / AUTO-002b: shared diff-aware baseline runner. Compares the current crawl's snapshots against the persisted baseline, emits the pages_changed SSE event, and merges the new fingerprints into the baseline table.

Two callers, two key-derivation strategies:

Link crawl (mode="crawl") keys baselines by snapshot URL, one row per page. The caller filters snapshots[] down to changed pages so generation only runs on what changed.
State explorer (mode="state") keys baselines by a composite url#fp=<fingerprint>, so distinct states at the same URL (login form blank vs login form with errors) are tracked as separate baseline rows. The caller does not filter snapshots[] post-diff because journeys reference unchanged states for context; filtering would break flow generation. In this mode the diff is informational and persisted only, although no-change crawls still short-circuit the generation pipeline.
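
The two key-derivation strategies can be sketched as follows. `baselineKey` is a hypothetical helper, and the `url` and `fingerprint` field names are assumptions; the real runner receives snapshots whose `.url` is already synthetic in state mode.

```javascript
// Hypothetical helper: derive the baseline key for one snapshot.
// Assumption: snapshots expose `url` and, in state mode, a pre-computed `fingerprint`.
function baselineKey(snapshot, mode) {
  if (mode === "crawl") return snapshot.url;
  if (mode === "state") return `${snapshot.url}#fp=${snapshot.fingerprint}`;
  throw new Error(`unknown mode: ${mode}`);
}

// Two distinct states observed at the same URL (fingerprints are made-up values).
const blankLogin = { url: "https://example.com/login", fingerprint: "a1" };
const errorLogin = { url: "https://example.com/login", fingerprint: "b2" };

// Crawl mode collapses both states into one baseline row.
console.log(baselineKey(blankLogin, "crawl") === baselineKey(errorLogin, "crawl")); // true

// State mode keeps them apart.
console.log(baselineKey(blankLogin, "state")); // https://example.com/login#fp=a1
console.log(baselineKey(errorLogin, "state")); // https://example.com/login#fp=b2
```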

Parameters:

Name	Type	Attributes	Description
`project`	object		Project record (must carry `id` + `canonicalUrl`/`url`).
`run`	object		Mutable run record.
`snapshots`	Array.<object>		Normalised snapshots (with synthetic `.url` for state mode).
`mode`	string		`"crawl"` | `"state"`
`opts`	object	<optional>	See Properties below.

Properties

Name	Type	Attributes	Description
`fingerprintOf`	function	<optional>	Forwarded to `diffCrawlSnapshots`. State mode supplies a function that returns a pre-computed fingerprint so the composite `url#fp=<fp>` key doesn't feed back into `fingerprintState`'s URL-derived computation (which would make every state-mode re-crawl look "changed" — the bug AUTO-002b's first round shipped with).

Source:

crawler.js, line 90

Returns:

skipped=true when the diff was bypassed (preview crawl or zero snapshots). noChanges=true when there's an existing baseline and nothing changed. changedSet is the set of keys (URLs or composite keys) that changed; the caller decides whether to filter snapshots[] against it.

Type: Object
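
A caller might act on the returned flags roughly as sketched below. `snapshotsToGenerateFrom` is a hypothetical helper; it assumes `changedSet` is a JavaScript `Set` and that a skipped diff means generation proceeds on every snapshot, neither of which is confirmed by the source.

```javascript
// Hypothetical caller-side helper built on the documented result fields.
function snapshotsToGenerateFrom(result, snapshots, mode) {
  if (result.skipped) return snapshots; // assumption: diff bypassed, use everything
  if (result.noChanges) return [];      // nothing changed: short-circuit generation
  if (mode === "crawl") {
    // Crawl mode: generate only for pages whose URL key changed.
    return snapshots.filter((s) => result.changedSet.has(s.url));
  }
  return snapshots; // state mode keeps unchanged states for journey context
}

const snapshots = [{ url: "/a" }, { url: "/b" }];
const result = { skipped: false, noChanges: false, changedSet: new Set(["/b"]) };

console.log(snapshotsToGenerateFrom(result, snapshots, "crawl").map((s) => s.url)); // [ '/b' ]
console.log(snapshotsToGenerateFrom(result, snapshots, "state").length); // 2
```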
