Members
(constant) ACTION_CALL_RE
Pattern that matches any method call on page/locator/expect in Playwright code. Captures the receiver expression and the method name, e.g. page.clicks(...) → method = "clicks"; locator.fillup() → method = "fillup".
- Source:
(constant) ASSERTION_RE
Matches the full assertion chain after expect(), for example:
- expect(page).toHaveURL(...)
- expect(locator).not.toBeVisible()
- expect(value).toBe(...)
- expect(page.locator('...').first()).toBeVisible()
Uses greedy .+ so the regex backtracks from the last ) on the
line, correctly handling nested parentheses inside the expect()
expression (e.g. .locator(...).first()).
Groups:
- [1] target expression inside expect(...)
- [2] optional ".not" negation
- [3] matcher name
- Source:
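The greedy capture and the three groups described for ASSERTION_RE can be illustrated with a small stand-in pattern. This is a sketch only: `ASSERTION_SKETCH_RE` and `parseAssertion` are hypothetical names, and the real pattern may differ in detail.

```javascript
// Hypothetical stand-in for ASSERTION_RE (the real pattern may differ).
// [1] target inside expect(...), [2] optional ".not", [3] matcher name.
const ASSERTION_SKETCH_RE = /expect\((.+)\)(\.not)?\.(\w+)\(/;

function parseAssertion(line) {
  const m = ASSERTION_SKETCH_RE.exec(line);
  if (!m) return null;
  return { target: m[1], negated: m[2] === ".not", matcher: m[3] };
}

// The greedy .+ backtracks to the ")" that closes expect(), not the one after first(.
const nested = parseAssertion("await expect(page.locator('.row').first()).toBeVisible();");
const negated = parseAssertion("await expect(locator).not.toBeVisible();");
console.log(nested, negated);
```

Because the capture is greedy, the target keeps its nested parentheses intact instead of stopping at the first `)`.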
(constant) AUTO_APPROVER_USER
Pseudo-user attributed to machine-made approvals in tests.approvedBy and
activities.userName. The literal "auto-approver" is pinned by the
audit-trail contract in ROADMAP.md (AUTO-003b) and NEXT.md, so consumers
(UI badges, activity log filters, route handlers) should reference this
constant rather than re-typing the string.
- Source:
(constant) CRAWL_NETWORKIDLE_TIMEOUT
pageSnapshot.js — Captures a serialised DOM snapshot from a live Playwright page
Extracts interactive elements, form structures, semantic sections, headings, and page-level signals (modals, tabs, tables, login forms) so the AI has rich context for test generation.
Exports: takeSnapshot(page) → snapshot object
- Source:
(constant) CSS_LOCATOR_RE
Captures CSS selector arguments passed to .locator(), .querySelector*, or .waitForSelector().
Three alternations handle the three JS string delimiters so that a
quote character different from the outer delimiter (e.g. " inside a
'-delimited string) does not prematurely terminate the capture.
Without this, selectors like 'button[type="submit"]' or XPaths like
'//div[@id="main"]' would be truncated at the inner ".
- Source:
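The three-alternation idea can be sketched as follows. The pattern below is a hypothetical simplification (it only covers `.locator()`, and `LOCATOR_ARG_SKETCH_RE` / `extractLocatorArg` are illustrative names), not the actual CSS_LOCATOR_RE.

```javascript
// Hypothetical stand-in for CSS_LOCATOR_RE, limited to .locator() for brevity.
// One alternation per delimiter: '...', "..." and `...`.
const LOCATOR_ARG_SKETCH_RE = /\.locator\(\s*(?:'([^']*)'|"([^"]*)"|`([^`]*)`)/;

function extractLocatorArg(line) {
  const m = LOCATOR_ARG_SKETCH_RE.exec(line);
  // Exactly one of the three groups is defined on a match.
  return m ? (m[1] ?? m[2] ?? m[3]) : null;
}

const single = extractLocatorArg(`page.locator('button[type="submit"]')`);
const double = extractLocatorArg(`page.locator("[data-id='x']")`);
console.log(single, double);
```

Each alternation excludes only its own delimiter, so a quote of a different kind inside the selector does not end the capture.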
(constant) DEVICE_PRESETS :Array.<{label: string, value: string}>
Curated device profiles for the UI dropdown.
Each entry maps a display label to its Playwright devices key.
Type:
- Array.<{label: string, value: string}>
- Source:
(constant) EXPECT_LOCATOR_RE
Captures expect(page.locator(<literal>)).[not.]<matcher>(...) chains.
Only literal-string locator arguments are matched — dynamic locator
expressions (variables, chained .first(), etc.) are outside scope and
get a pass. Three quote alternations mirror the other RE helpers so
inner quotes ("[data-id='x']") don't truncate the capture.
- Source:
(constant) FEEDBACK_TIMEOUT_MS
Maximum time the AI feedback loop is allowed to run before being abandoned. Default 180s — generous enough for Ollama (local models are slow on large prompts) while still preventing indefinite hangs on cloud providers. Override via FEEDBACK_TIMEOUT_MS env var.
- Source:
(constant) FUZZY_NAME_THRESHOLD
Fuzzy name similarity threshold — names this similar are treated as duplicates
- Source:
(constant) HAS_PAGE_LOAD_ASSERTION_RE
Regex that matches toHaveURL or toHaveTitle only when they appear as
method calls after an expect( expression — i.e. inside a real assertion
chain. Bare mentions in comments (// TODO: add toHaveURL) or string
literals ('toHaveURL') are NOT matched.
Pattern: expect( … ) … .toHaveURL( or .toHaveTitle(
The .+ is greedy so it backtracks from the last ) on the line,
correctly handling nested parens like expect(page.locator('x').first()).
- Source:
(constant) NOISE_PARAMS
Query parameter patterns that are always noise. Exported so stateFingerprint.js can reuse the same list (DRY).
- Source:
(constant) NON_VISUAL_PATTERNS
Patterns that match non-visual Playwright actions at the end of a test body. If the last non-blank, non-comment line matches any of these, we skip screenshot capture on success since the page hasn't visually changed.
- Source:
(constant) NO_NEGATE_MATCHERS
Matchers that must NOT be used with .not because the negated form is logically redundant or always-passes (Playwright warns/errors on these).
- Source:
(constant) PROMISE_CHAIN_METHODS
Promise-chain methods that can appear after an expect() assertion chain
but are NOT assertion matchers. The greedy ASSERTION_RE can capture these
when .catch(() => {}) or .then(...) follows an expect chain (e.g.
expect(loc).toContainText(/x/).catch(() => {})). Skip them silently.
- Source:
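A minimal sketch of why a greedy assertion pattern can pick up `.catch` and how a skip list avoids a false report. Both constants below are hypothetical stand-ins; the contents of the real PROMISE_CHAIN_METHODS set are not shown in this reference.

```javascript
// Hypothetical stand-ins; the real constants may differ.
const ASSERTION_SKETCH_RE = /expect\((.+)\)(\.not)?\.(\w+)\(/;
const PROMISE_CHAIN_SKETCH = new Set(["then", "catch"]);

function capturedMatcher(line) {
  const m = ASSERTION_SKETCH_RE.exec(line);
  if (!m) return null;
  // Greedy backtracking lands on the last ").name(" pair, which may be a promise method.
  return PROMISE_CHAIN_SKETCH.has(m[3]) ? null : m[3];
}

const skipped = capturedMatcher("await expect(loc).toContainText(/x/).catch(() => {});");
const kept = capturedMatcher("await expect(loc).toContainText(/x/);");
console.log(skipped, kept);
```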
(constant) SAFE_HELPER_MATCHERS
Matchers that have a safe-helper equivalent and therefore should not be
chained off a raw page.locator(<cssSelector>) expression. Tests that
combine these with raw-CSS locators are rejected so the generator retries.
NOTE: toHaveCount, toBeHidden, toHaveValue, toHaveAttribute,
toHaveClass, and toHaveCSS are intentionally not listed here —
the SELF_HEALING_PROMPT_RULES in selfHealing.js explicitly tells the
AI to use page.locator(...) for count/state/attribute assertions, so
rejecting those would contradict the generation prompt. Only visibility
and textual-content assertions are enforced to go through safeExpect.
- Source:
(constant) SEMANTIC_SIMILARITY_THRESHOLD
Semantic (TF-IDF cosine) similarity threshold
- Source:
(constant) SIGNIFICANT_PARAMS
Query parameter names that carry state-significant meaning. Exported so stateFingerprint.js can reuse the same set (DRY).
- Source:
(constant) STOP_WORDS
buildTfIdfVector(text) → Map<term, tfidf-weight>
Single-document TF vector (no corpus IDF — we compare pairs at call time so a true IDF isn't available). Sufficient for cosine similarity between two short test descriptions.
Common English stop-words and common QA/Playwright verbs are removed so the signal comes from domain-specific nouns (page names, feature keywords, form field names, etc.).
- Source:
(constant) VALID_CSS_PSEUDOS
CSS pseudo-classes that are valid in a browser context.
Any other :
- Source:
(constant) VALID_MATCHERS
All Playwright matcher names (with and without "not." prefix). Source: https://playwright.dev/docs/api/class-locatorassertions
- Source:
(constant) VALID_PAGE_ACTIONS
Complete whitelist of Playwright API methods that Sentri-generated tests
are expected to call. Any method call on page, locator(), or expect()
that is NOT in this set is flagged as an invalid action.
Grouped for readability; the Set is what drives validation.
- Source:
(constant) XPATH_LOCATOR_RE
Captures XPath strings (detected by leading // or (// patterns).
Same three-alternation strategy as CSS_LOCATOR_RE above.
- Source:
Methods
analyzeRunResults()
analyzeRunResults(runResults, tests, snapshots) → improvement plan
Returns a list of tests that need regeneration with failure context.
- Source:
applyFeedbackLoop()
applyFeedbackLoop(run, { signal } = {}) → summary
Full feedback loop: analyzes results, regenerates failing tests. Called after a test run completes. Accepts an optional AbortSignal so long-running AI calls can be cancelled.
- Source:
attachPageListeners()
Attach network & console listeners to a page.
Returns { networkLogs, consoleLogs, dispose } — the arrays are mutated
in-place as events arrive. Call dispose() before closing the page to
prevent async response handlers from accessing a closed page (which
throws unhandled rejections that crash Node.js).
- Source:
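The attach/dispose contract can be sketched as below. The log entry shapes are assumptions made for illustration, and a plain Node `EventEmitter` stands in for a Playwright page so the example runs without a browser.

```javascript
const { EventEmitter } = require("node:events");

// Sketch only: log entry shapes are assumptions, not the module's real format.
function attachPageListeners(page) {
  const networkLogs = [];
  const consoleLogs = [];
  const onResponse = (res) => networkLogs.push({ url: res.url(), status: res.status() });
  const onConsole = (msg) => consoleLogs.push({ type: msg.type(), text: msg.text() });
  page.on("response", onResponse);
  page.on("console", onConsole);
  // Detach both handlers so nothing touches the page after it closes.
  const dispose = () => {
    page.off("response", onResponse);
    page.off("console", onConsole);
  };
  return { networkLogs, consoleLogs, dispose };
}

// A plain EventEmitter stands in for a Playwright page here.
const fakePage = new EventEmitter();
const logs = attachPageListeners(fakePage);
fakePage.emit("console", { type: () => "log", text: () => "hello" });
logs.dispose();
fakePage.emit("console", { type: () => "log", text: () => "ignored" });
console.log(logs.consoleLogs);
```

Events emitted after `dispose()` are no longer recorded, which is the behaviour the description relies on when the page is about to close.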
buildObservedActionsBlock(observedActions) → {string}
Format observed actions from the state explorer into a prompt block.
Only included when journey._observedActions is present (state explorer mode).
Parameters:
| Name | Type | Description |
|---|---|---|
| observedActions | Array | from flowToJourney()._observedActions |
- Source:
Returns:
- Type
- string
buildPipelineStats(params) → {object}
Build the pipelineStats summary object attached to run records.
Parameters:
| Name | Type | Description |
|---|---|---|
| params | object | |
- Source:
Returns:
- Type
- object
buildQualityAnalytics()
buildQualityAnalytics(improvements, testMap) → analytics object
Produces a structured breakdown of failures for the run record.
- Source:
buildSandboxContext(exposed) → {Object}
Build a vm context for executing AI-generated Playwright code.
Injects only the objects the test needs (page, context, expect, etc.) plus Node.js globals that vm.createContext() doesn't provide automatically. Dangerous globals (process, require, global, etc.) are explicitly blocked.
NOTE: Any injected host object can be used to reach the host's Function
constructor via .constructor.constructor. The env-stripping in
runWithStrippedEnv() is the actual security boundary, not this context.
Parameters:
| Name | Type | Description |
|---|---|---|
| exposed | Object | caller-provided objects to inject |
- Source:
Returns:
A vm context object
- Type
- Object
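A minimal sketch of building such a context with Node's `vm` module. The exact set of injected and blocked names is an assumption; as the note above says, this shadowing is not a security boundary on its own.

```javascript
const vm = require("node:vm");

// Sketch only: the real list of injected and blocked globals may differ.
function buildSandboxContext(exposed) {
  return vm.createContext({
    ...exposed,
    console,
    setTimeout,
    clearTimeout,
    // Shadow dangerous names so generated code sees undefined.
    process: undefined,
    require: undefined,
    global: undefined,
  });
}

const ctx = buildSandboxContext({ answer: 42 });
const seen = vm.runInContext("[answer, typeof process, typeof require]", ctx);
console.log(Array.from(seen));
```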
buildUserJourneys()
buildUserJourneys(classifiedPages, snapshotsByUrl?) → Array of journey objects
Chains related pages into GENUINE multi-page user journeys. Single-page intents are NOT wrapped as journeys — they are handled separately by generateIntentTests in journeyGenerator.js.
Detection strategies (applied in order):
- Intent-based patterns — AUTH→dashboard, multi-CHECKOUT, multi-SEARCH, multi-CRUD
- Link-graph analysis — when snapshots are provided, discover cross-intent journeys by following outbound links between classified pages
- Form→confirmation — FORM_SUBMISSION page linking to a CONTENT/NAVIGATION page
- Source:
captureBoundingBoxes()
captureBoundingBoxes(page) → Array<{ x, y, width, height }>
Collects bounding boxes of the last interacted / focused elements so the frontend OverlayCanvas can draw highlights.
- Source:
captureDomSnapshot()
Serialises a shallow representation of the current DOM (max depth 4) for debugging and AI context. Returns null on any failure.
- Source:
captureScreenshot(page, runId, stepIndex, [opts])
captureScreenshot(page, runId, stepIndex, opts) → { base64, artifactPath }
Takes a PNG screenshot, writes it to disk, and returns both the base64 string (for SSE) and the artifact path (for the DB).
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| page | Object | | |
| runId | string | | |
| stepIndex | number | | test index within the run |
| opts | Object | <optional> | |
- Source:
captureWebVitals()
captureWebVitals(page) — AUTO-017.1
Reads the metrics accumulated on window.__sentriVitals by the observers
installed via registerWebVitalsInitScript at context creation. Because the
observers have been running during the entire page lifecycle, LCP / CLS /
TTFB reflect actual measurements rather than post-hoc buffered replays.
Waits up to 800ms (early-exiting as soon as LCP + TTFB + CLS are populated)
to let any final reportAllChanges callbacks flush. INP is reported only
after a user interaction — it stays null for non-interactive tests, which
the evaluator treats as "not measured" rather than a failure.
Falls back to the empty-metrics shape if the init script was never registered (e.g. web-vitals not installed, or context is an older run started before AUTO-017.1 landed).
- Source:
checkCssSelector(selector) → {string|null}
Validates a CSS selector string for obvious structural errors. Not a full CSS parser — catches the most common AI mistakes.
Parameters:
| Name | Type | Description |
|---|---|---|
| selector | string | |
- Source:
Returns:
Error description or null if OK
- Type
- string | null
checkXPath(xpath) → {string|null}
Validates an XPath string for common structural errors.
Parameters:
| Name | Type | Description |
|---|---|---|
| xpath | string | |
- Source:
Returns:
- Type
- string | null
classifyElement()
classifyElement(element) → { element, intent, confidence }
Uses weighted scoring where element TYPE matters more than text content. A password input strongly signals AUTH; a link containing "password" does not.
- Source:
classifyPage()
classifyPage(snapshot, filteredElements) → page intent summary
Returns the dominant intent for the page, classified elements, and priority tier. Priority is based on the dominant intent — interactive pages get more test coverage.
- Source:
classifyPageWithAI([signal])
classifyPageWithAI(snapshot, filteredElements, { signal }) → page intent summary
Same as classifyPage but falls back to the AI when heuristic confidence is below AI_THRESHOLD. Call this from the crawler pipeline instead of classifyPage when an AI provider is available.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| signal | AbortSignal | <optional> | forwarded to AI calls so abort stops classification |
- Source:
codeToHumanStep()
Convert a Playwright code line into a human-readable step description. e.g. "await page.goto('https://example.com')" → "Navigate to https://example.com"
- Source:
cosineSimilarity(vecA, vecB) → {number}
cosineSimilarity(vecA, vecB) → number 0–1
Standard cosine similarity between two sparse TF vectors. Threshold: ≥ 0.65 is treated as a semantic duplicate (defect #1).
Parameters:
| Name | Type | Description |
|---|---|---|
| vecA | Map.<string, number> | |
| vecB | Map.<string, number> | |
- Source:
Returns:
- Type
- number
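Cosine similarity over sparse `Map` vectors can be sketched as follows. Returning 0 for an empty vector is an assumption made here to avoid dividing by zero.

```javascript
// Sketch: cosine similarity between two sparse term-weight maps.
function cosineSimilarity(vecA, vecB) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [term, weight] of vecA) {
    normA += weight * weight;
    if (vecB.has(term)) dot += weight * vecB.get(term);
  }
  for (const weight of vecB.values()) normB += weight * weight;
  // Assumption: an empty vector is treated as "no similarity".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const a = new Map([["login", 2], ["form", 1]]);
const b = new Map([["login", 1], ["checkout", 1]]);
const score = cosineSimilarity(a, b); // 2 / (sqrt(5) * sqrt(2)), about 0.632
console.log(score >= 0.65);
```

With the 0.65 threshold quoted above, this pair would fall just short of being treated as a semantic duplicate.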
createPiiContext([opts])
Create a sanitizer context. Pass the same context into multiple
sanitizeDomSnapshot calls to guarantee that identical input
values resolve to identical placeholder IDs across all calls (so the AI
can correlate references across snapshots + classified pages).
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| opts | object | <optional> | |
- Source:
deduplicateAcrossRuns()
deduplicateAcrossRuns(newTests, existingTests) → filtered new tests
Prevents re-adding tests that already exist for the project.
Four-layer strategy:
- Structural hash — existing behaviour
- Normalized name — existing behaviour (renamed tests, same URL)
- Fuzzy name match — Levenshtein ≥ 0.80 (defect #3)
- Semantic TF-IDF — cosine ≥ 0.65 on name+desc+steps (defects #1, #2)
- Source:
deduplicateTests()
deduplicateTests(tests) → { unique: Array, removed: number, stats: object }
Main deduplication function. Returns only the best unique tests.
Three-layer strategy:
- Structural hash — exact Playwright-action fingerprint (fast, O(n))
- Fuzzy name match — Levenshtein similarity ≥ 0.80 (defect #3)
- Semantic TF-IDF — cosine similarity ≥ 0.65 on name+desc+steps (defects #1, #2)
- Source:
detectFlakyTests()
detectFlakyTests(projectId) → Map<testId, flakyInfo>
Scans all run results for a project and identifies tests that have both passed and failed across different runs.
- Source:
dispose()
Call before page.close() to stop handlers from accessing the closed page.
- Source:
endsWithNonVisualAction(playwrightCode) → {boolean}
Returns true when the test body's last meaningful line is a non-visual action (assertion, wait, evaluate) — meaning the page hasn't visually changed since the last interaction and a screenshot would be redundant.
Parameters:
| Name | Type | Description |
|---|---|---|
| playwrightCode | string \| null | The raw AI-generated code. |
- Source:
Returns:
- Type
- boolean
enhanceTest()
enhanceTest(test, snapshot, classifiedPage) → enhanced test
Adds or strengthens assertions in a generated test based on context.
Fast-path: if the test already has strong assertions AND a page-load assertion (toHaveURL or toHaveTitle), skip all enhancement work and return immediately. On re-crawls of a well-covered application this eliminates string manipulation for the majority of tests.
- Source:
enhanceTests()
enhanceTests(tests, snapshots, classifiedPages) → enhanced tests array
- Source:
(async) executeApiTest()
executeApiTest(test, runId, stepIndex, runStart) → result object
Runs an API-only test (one that uses request.newContext()) without
spinning up a browser page. Skips screenshots, video, DOM snapshots,
and screencast — none of which apply to API tests.
- Source:
executeTest(test, browser, runId, stepIndex, runStart, [opts])
executeTest(test, browser, runId, stepIndex, runStart, opts) → result object
Runs a single test case inside a fresh browser context and returns a result object suitable for pushing into run.results.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| test | Object | | |
| browser | Object | | Playwright Browser instance. |
| runId | string | | |
| stepIndex | number | | |
| runStart | number | | |
| opts | Object | <optional> | |
- Source:
executeTestIterations(test, fixtureRows, runSingle) → {Promise.<Array.<Object>>}
CAP-001: run runSingle(iterTest) once per fixture row, substituting
{{key}} placeholders in playwrightCode from the row values. When
fixtureRows is empty/missing the test runs once unchanged (zero-
regression contract — fixture-less tests behave exactly as before).
Every iteration runs to completion so failures are attributable to a
specific row (NEXT.md acceptance criterion: "5-row CSV → 5 iteration
results"). Each result carries iterationIndex + fixtureRow snapshot
so the run UI can surface per-row attribution; callers decide retry/abort
semantics based on the returned array.
Parameters:
| Name | Type | Description |
|---|---|---|
| test | Object | |
| fixtureRows | Array.<Object> \| undefined | |
| runSingle | function | |
- Source:
Returns:
- Type
- Promise.<Array.<Object>>
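The per-row substitution can be sketched as below. Two assumptions are made for illustration: unknown `{{key}}` placeholders are left untouched, and `runSingle` resolves with a result object rather than throwing. `substitutePlaceholders` is a hypothetical helper name.

```javascript
// Sketch only: unknown keys are left untouched, which is an assumption.
function substitutePlaceholders(code, row) {
  return code.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(row, key) ? String(row[key]) : match);
}

async function executeTestIterations(test, fixtureRows, runSingle) {
  // No fixtures: run once with the test unchanged.
  if (!Array.isArray(fixtureRows) || fixtureRows.length === 0) {
    return [await runSingle(test)];
  }
  const results = [];
  for (const [iterationIndex, fixtureRow] of fixtureRows.entries()) {
    const playwrightCode = substitutePlaceholders(test.playwrightCode, fixtureRow);
    const result = await runSingle({ ...test, playwrightCode });
    // Attach per-row attribution to each result.
    results.push({ ...result, iterationIndex, fixtureRow: { ...fixtureRow } });
  }
  return results;
}

const filled = substitutePlaceholders(
  "await page.fill('#email', '{{email}}');",
  { email: "a@example.com" }
);
const pending = executeTestIterations(
  { playwrightCode: "login('{{user}}')" },
  [{ user: "ann" }, { user: "bob" }],
  async (t) => ({ status: "passed", code: t.playwrightCode })
);
pending.then((results) => console.log(filled, results.length));
```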
executeWithRetries(fn, maxRetries) → {Promise.<{result: Object, retryCount: number}>}
Execute an async function with retries.
fn receives the current zero-based attempt index so callers can branch
(e.g. log differently on retry). On exhaustion, the last error is rethrown
with err.retryCount set to the number of retry attempts actually made
(i.e. total attempts minus the initial try) — callers should prefer this
over assuming maxRetries, since fn itself may have thrown synchronously
before reaching its first retry.
Artifact overwrite behaviour (AUTO-005 + testRunner.js integration)
The test runner (backend/src/testRunner.js:229-240) calls
executeWithRetries with the same (runId, stepIndex) on every attempt.
Each attempt recreates its temp Playwright context, records a video, and
writes screenshots / step captures keyed by (runId, stepIndex) — so the
last attempt's artifacts are the ones that survive; earlier attempts'
files are overwritten via fs.renameSync during teardown.
This is intentional — reviewers want to see what happened on the winning
(or final failing) attempt, not the noise of prior flaky attempts. But it
means you cannot replay intermediate retries: if attempt 1 failed, attempt
2 passed, the DB records retryCount=1, status=passed and only attempt
2's video/screenshots/trace exist on disk.
If per-attempt artifact retention is ever needed (e.g. for flake-root-cause
investigation), scope the artifact paths by (runId, stepIndex, attempt)
in backend/src/runner/executeTest.js and add a retention policy — the
storage hit is N× video size per retried test.
Parameters:
| Name | Type | Description |
|---|---|---|
| fn | function | |
| maxRetries | number | Number of retries after the first attempt. |
- Source:
Returns:
- Type
- Promise.<{result: Object, retryCount: number}>
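A minimal sketch of the retry contract described above, assuming `maxRetries` is a non-negative integer. It illustrates the return shape and the `retryCount` attached on exhaustion; it is not the module's actual implementation and ignores artifact handling.

```javascript
// Sketch: retry an async function, passing the zero-based attempt index to fn.
async function executeWithRetries(fn, maxRetries) {
  let lastErr;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const result = await fn(attempt);
      // attempt equals the number of retries that were needed.
      return { result, retryCount: attempt };
    } catch (err) {
      lastErr = err;
    }
  }
  // All attempts failed: the loop made maxRetries retries after the first try.
  if (lastErr && typeof lastErr === "object") lastErr.retryCount = maxRetries;
  throw lastErr;
}

const outcome = executeWithRetries(async (attempt) => {
  if (attempt === 0) throw new Error("flaky first attempt");
  return "passed";
}, 2);
outcome.then((o) => console.log(o));
```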
extractPathPattern()
extractPathPattern(url) → string
Converts /products/123 and /products/456 to /products/:id so we only crawl one version.
- Source:
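The collapsing step can be sketched as follows. This version treats only purely numeric path segments as IDs, which is an assumption; the real heuristics may also cover UUIDs, slugs, or other dynamic segments.

```javascript
// Sketch only: numeric segments become ":id"; query strings are ignored.
function extractPathPattern(url) {
  const { pathname } = new URL(url);
  return pathname
    .split("/")
    .map((segment) => (/^\d+$/.test(segment) ? ":id" : segment))
    .join("/");
}

const p1 = extractPathPattern("https://shop.example/products/123");
const p2 = extractPathPattern("https://shop.example/products/456?ref=home");
console.log(p1, p1 === p2);
```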
extractPathPatternWithParams(url) → {string}
extractPathPatternWithParams(url) → string
Like extractPathPattern but includes significant query parameters
in the pattern so /products?category=electronics and
/products?category=books produce different patterns.
Used by the state explorer where query params are preserved (#52 defect #1).
The original extractPathPattern (without params) is still used by
crawlBrowser.js where query params are stripped before pattern extraction.
Parameters:
| Name | Type | Description |
|---|---|---|
| url | string | |
- Source:
Returns:
- Type
- string
extractTestBody()
extractTestBody(playwrightCode)
Pulls the async function body out of the generated Playwright test so we can run it directly against an already-open page/context — without needing to spawn a whole new Playwright test runner process.
Handles both common shapes the AI produces:
- test('name', async ({ page }) => { ... })
- test('name', async ({ page, context }) => { ... })
- Source:
extractTestsArray()
extractTestsArray(parsed) — normalise the 3 common AI response shapes into a plain array of test objects:
- Already an array → return as-is
- { tests: [...] } → unwrap
- Single object { name } → wrap in array
- Anything else → empty array
- Source:
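The four cases listed above can be sketched in a few lines. Treating a truthy `name` property as the marker of a single test object is an assumption drawn from the `{ name }` shape in the list.

```javascript
// Sketch of the normalisation rules listed above.
function extractTestsArray(parsed) {
  if (Array.isArray(parsed)) return parsed;                        // already an array
  if (parsed && Array.isArray(parsed.tests)) return parsed.tests;  // { tests: [...] }
  if (parsed && typeof parsed === "object" && parsed.name) return [parsed]; // single object
  return [];                                                       // anything else
}

const fromWrapper = extractTestsArray({ tests: [{ name: "a" }, { name: "b" }] });
const fromSingle = extractTestsArray({ name: "only" });
const fromJunk = extractTestsArray("not json");
console.log(fromWrapper.length, fromSingle.length, fromJunk.length);
```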
finalizePiiContext()
Emit the per-run pipeline.pii_redacted audit log. total sums only
the five non-overlapping categories (email/phone/ssn/card/token) —
jwt, bearer, and queryAuth are subdivisions of token and would
double-count if added to the aggregate.
- Source:
fingerprintHash(str) → {string}
fingerprintHash(str) → 16-char hex string (64-bit via SHA-256 truncation)
Replaces the previous 32-bit djb2 implementation. A 32-bit hash has a ~1-in-4-billion collision rate per pair, which becomes non-negligible once a project reaches ~1 000 tests (~500 k pairs). This implementation uses the first 8 bytes of SHA-256 (64 bits), reducing the per-pair collision probability to ~1-in-18-quintillion — safe at any realistic test suite size.
Uses Node's built-in node:crypto (no new dependency). Synchronous
createHash is used rather than crypto.subtle.digest so the function
stays synchronous and callers require no changes.
Parameters:
| Name | Type | Description |
|---|---|---|
| str | string | Input string to hash. |
- Source:
Returns:
16-character lowercase hex fingerprint.
- Type
- string
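A minimal sketch of the truncation described above, using Node's built-in `node:crypto`. The UTF-8 input encoding is an assumption; the first 16 hex characters correspond to the first 8 bytes of the SHA-256 digest.

```javascript
const { createHash } = require("node:crypto");

// Sketch: first 8 bytes (16 hex chars) of the SHA-256 digest.
function fingerprintHash(str) {
  return createHash("sha256").update(str, "utf8").digest("hex").slice(0, 16);
}

const fp = fingerprintHash("goto|click|expect");
console.log(fp.length);
```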
fingerprintStructure()
fingerprintStructure(snapshot) → string
Creates a structural fingerprint of a page based on its DOM shape, not its content. Used to detect "template" pages (e.g. blog post A vs B).
- Source:
formatTestError()
Extract a clean, UI-safe error message from an Error (or AggregateError).
- Source:
fuzzyNameSimilarity(a, b) → {number}
fuzzyNameSimilarity(a, b) → number 0–1
Returns 1.0 for identical strings, 0.0 for completely different. Threshold: ≥ 0.80 is treated as a duplicate name match.
Parameters:
| Name | Type | Description |
|---|---|---|
| a | string | Already-normalized string |
| b | string | Already-normalized string |
- Source:
Returns:
- Type
- number
generateAllTests() → {Object}
generateAllTests(classifiedPages, journeys, snapshotsByUrl) → { tests, rateLimitHit, rateLimitError }
Orchestrates full test generation: journeys first, then per-page intent tests. ALL pages get comprehensive tests — not just high-priority ones.
- Source:
Returns:
- Type
- Object
generateApiTests(apiEndpoints, appUrl, [opts]) → {Promise.<Array.<object>>}
generateApiTests(apiEndpoints, appUrl, opts) → Array of test objects
Generates Playwright request API tests from HAR-captured endpoint summaries.
Returns an empty array if no endpoints were captured or the AI call fails.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| apiEndpoints | Array.<ApiEndpoint> | | from summariseApiEndpoints() |
| appUrl | string | | project base URL |
| opts | object | <optional> | |
- Source:
Returns:
- Type
- Promise.<Array.<object>>
generateFromDescription()
generateFromDescription(name, description, appUrl) → Array of test objects
Generates test(s) focused on the user's provided name + description.
The number of tests is controlled by the testCount dial (1–20).
Used by the POST /api/projects/:id/tests/generate endpoint instead of the
generic generateIntentTests which produces crawl-oriented tests.
When the description indicates API test intent (mentions endpoints, HTTP
methods, status codes, etc.), automatically routes to the API test prompt
which generates Playwright request API tests instead of UI tests.
- Source:
generateIntentTests()
generateIntentTests(classifiedPage, snapshot) → Array of test objects
- Source:
generateJourneyTest()
generateJourneyTest(journey, snapshotsByUrl) → array of test objects or []
- Source:
getExpect()
getExpect()
Returns Playwright's expect function by lazy-importing it from the
test runner module. We don't import at the top level because Playwright's
expect lives in @playwright/test which we don't load globally.
- Source:
hashTest()
hashTest(test) → string fingerprint
Generates a fingerprint from the test's structural content, ignoring surface-level wording differences.
- Source:
injectStepCaptures(code) → {string}
Inject await __captureStep(N) calls after each // Step N: comment in the
test body so we capture a screenshot + timing after each logical step.
If the code has no // Step N: comments (older tests, manual code), the
original code is returned unchanged — the caller falls back to a single
end-of-test screenshot.
Parameters:
| Name | Type | Description |
|---|---|---|
| code | string | cleaned test body |
- Source:
Returns:
instrumented code
- Type
- string
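One way to sketch the instrumentation is a multiline regex replace. This version places the call on the line directly after each `// Step N:` comment, which is a literal reading of the description and an assumption; the real function may position the call differently within the step.

```javascript
// Sketch only: insert a capture call on the line after each "// Step N:" comment.
function injectStepCaptures(code) {
  return code.replace(
    /^([ \t]*)\/\/ Step (\d+):.*$/gm,
    (line, indent, n) => `${line}\n${indent}await __captureStep(${n});`
  );
}

const instrumented = injectStepCaptures("// Step 1: open\nawait page.goto(url);");
const untouched = injectStepCaptures("await page.goto(url);");
console.log(instrumented);
```

Code without step comments passes through unchanged, matching the fallback behaviour described above.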
isApiIntent(name, description) → {boolean}
Detect whether the user's test name + description indicate API test intent.
Parameters:
| Name | Type | Description |
|---|---|---|
| name | string | |
| description | string | |
- Source:
Returns:
- Type
- boolean
isApiTest()
isApiTest(playwrightCode)
Returns true when the generated code is an API-only test that uses
request.newContext() (Playwright's APIRequestContext) rather than
browser-based page interactions.
API tests:
- Do NOT need a browser page or page.goto()
- Need a real Playwright request fixture instead of the undefined stub
- Should skip browser-specific artifacts (screenshots, DOM snapshots, video)
- Source:
isAutoApprovalDisabled()
Global kill-switch for auto-approval (AUTO-003b). Read on every persist
call from DISABLE_AUTO_APPROVAL — any truthy value ("1", "true",
"yes", case-insensitive) forces every generated test to land in Draft
regardless of the project-level autoApproveThreshold.
Intended for ops incidents: if an AI provider starts producing bad tests faster than reviewers can revoke them, setting this env var is a one-step rollback that doesn't require a code deploy or per-project threshold reset. Per-project thresholds stay intact and take effect again as soon as the env var is removed.
The check runs per-call (one string compare and a process.env read,
neither measurable at the persist hot path) so operators don't have to
restart the backend to flip the switch — and so test fixtures can drive
the behaviour by mutating process.env between cases. Matches the
convention used by other env-var gates in the codebase (e.g.
ALLOW_PRIVATE_URLS in routes/system.js).
Exported so the test suite can call it directly without round-tripping
through persistGeneratedTests.
- Source:
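The per-call environment check can be sketched as below. Trimming whitespace before comparing is an assumption; the accepted values follow the description above.

```javascript
// Sketch: read the env var on every call so the switch can be flipped at runtime.
function isAutoApprovalDisabled() {
  const raw = String(process.env.DISABLE_AUTO_APPROVAL ?? "").trim().toLowerCase();
  return raw === "1" || raw === "true" || raw === "yes";
}

process.env.DISABLE_AUTO_APPROVAL = "TRUE";
const whenSet = isAutoApprovalDisabled();
delete process.env.DISABLE_AUTO_APPROVAL;
const whenUnset = isAutoApprovalDisabled();
console.log(whenSet, whenUnset);
```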
isProviderExhausted()
True when the error reaching this layer represents a durably exhausted provider — i.e. aiProvider.js has already retried with exponential backoff AND tried every configured fallback provider, and they all failed with either a rate-limit (429) or a transient 5xx ("high demand", "service unavailable", etc.). In both cases hammering the provider with more requests is pointless until its quota / outage window resets, so the pipeline should short-circuit the same way for either class of error.
- Source:
launchBrowser([overrides]) → {Promise.<Object>}
Launch a browser with the shared config.
All modules that need a browser should call this instead of
chromium.launch() / firefox.launch() / webkit.launch() directly, so
launch args, env overrides, and the cross-browser selector stay in one
place.
The browser-specific executablePath env var is only applied when its
engine is selected — e.g. PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH has no
effect when launching firefox, where Playwright bundles its own binary.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| overrides | Object | <optional> | Playwright LaunchOptions merged on top |
- Source:
Returns:
Playwright Browser instance
- Type
- Promise.<Object>
levenshteinDistance(a, b) → {number}
levenshteinDistance(a, b) → integer edit distance
Classic DP implementation. Used by fuzzyNameSimilarity() to catch paraphrased test names (defect #3).
Parameters:
| Name | Type | Description |
|---|---|---|
| a | string | |
| b | string | |
- Source:
Returns:
- Type
- number
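A minimal sketch of the classic dynamic-programming edit distance, plus one plausible way to turn it into a 0–1 similarity. The normalisation `1 - distance / longerLength` is an assumption; the reference does not state the exact formula fuzzyNameSimilarity() uses.

```javascript
// Sketch: two-row DP edit distance (insert, delete, substitute each cost 1).
function levenshteinDistance(a, b) {
  // prev[j] holds the distance between the processed prefix of a and b.slice(0, j).
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalisation: 1 - distance / length of the longer string.
function fuzzyNameSimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshteinDistance(a, b) / longest;
}

const distance = levenshteinDistance("kitten", "sitting");
const similarity = fuzzyNameSimilarity("login works", "login work");
console.log(distance, similarity >= 0.8);
```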
listByProjectIds(projectIds) → {Array.<Object>}
Batch-fetch environments for a set of project IDs in one SQL round-trip.
Returns rows in the same order SQLite produces them; callers that need a
per-project map can group on row.projectId.
Added for the dashboard's per-environment pass-rate aggregation
(routes/dashboard.js) — the previous per-project loop issued N+1
listByProject queries which scaled poorly on workspaces with many
projects. Mirrors the pattern in githubCheckSettingsRepo.listByProjectIds.
Parameters:
| Name | Type | Description |
|---|---|---|
| projectIds | Array.<string> | |
Returns:
environment rows (credentials JSON-parsed, still
AES-encrypted — call decryptCredentials() to peel that layer).
- Type
- Array.<Object>
normalizeQualityToConfidence(quality) → {number}
normalizeQualityToConfidence(quality) → number 0–1
Single source of truth for converting the 0–100 quality rubric output
(scoreTest / _quality) into the 0–1 confidenceScore scale used by
AUTO-003b's autoApproveThreshold comparison. Previously this /100
normalization was inlined in three places (deduplicator.js,
pipelineOrchestrator.js, testPersistence.js); centralizing avoids
drift if the rubric range ever changes.
Coerces non-finite / negative inputs to 0 and clamps to [0, 1] so callers can safely use the result without revalidating.
Parameters:
| Name | Type | Description |
|---|---|---|
| quality | number | 0–100 score from scoreTest / scoreTestWithFactors |
- Source:
Returns:
0–1 confidence
- Type
- number
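The coercion and clamping rules above can be sketched as follows. Treating non-number inputs (such as numeric strings) as 0 is an assumption that follows from using `Number.isFinite`.

```javascript
// Sketch: map a 0–100 quality score onto the 0–1 confidence scale.
function normalizeQualityToConfidence(quality) {
  // Non-finite or negative input collapses to 0.
  if (!Number.isFinite(quality) || quality < 0) return 0;
  // Divide by 100 and clamp the upper bound to 1.
  return Math.min(quality / 100, 1);
}

const typical = normalizeQualityToConfidence(85);
const clamped = normalizeQualityToConfidence(150);
const invalid = normalizeQualityToConfidence(NaN);
console.log(typical, clamped, invalid);
```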
normalizeText()
normalizeText(s) → lowercase, whitespace-collapsed string
Used so minor phrasing differences don't create false uniqueness.
- Source:
parseEndpointHints(description, appUrl) → {Array.<ApiEndpoint>}
Parse lightweight endpoint hints from the user's description. Extracts patterns like "GET /api/users", "POST /login" and builds minimal ApiEndpoint-shaped objects for buildApiTestPrompt.
Parameters:
| Name | Type | Description |
|---|---|---|
| description | string | |
| appUrl | string | |
- Source:
Returns:
- Type
- Array.<ApiEndpoint>
patchNetworkIdle()
patchNetworkIdle(code)
Rewrites any waitForLoadState('networkidle') or waitForLoadState("networkidle") calls that the AI may have generated into the safe domcontentloaded equivalent.
Many real-world sites (SPAs, e-commerce like Amazon) fire continuous background XHR/fetch requests for ads, personalisation, and tracking — they never reach networkidle, so Playwright always times out after 30 s. domcontentloaded is sufficient to guarantee the primary DOM content is ready for interaction.
Also rewrites page.goto() calls that use waitUntil:'networkidle' to use waitUntil:'domcontentloaded' for the same reason.
Additionally, wraps bare element.click() calls that are immediately followed by a waitForNavigation/waitForLoadState pattern into a safer Promise.all so the navigation promise is registered before the click fires.
- Source:
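The first two rewrites (waitForLoadState and waitUntil) can be sketched with regex replacements. `patchNetworkIdleSketch` is a hypothetical name, and the click/navigation `Promise.all` wrapping is deliberately left out of this simplified version.

```javascript
// Sketch only: covers the two 'networkidle' rewrites, not the Promise.all wrapping.
function patchNetworkIdleSketch(code) {
  return code
    // waitForLoadState('networkidle') or ("networkidle"), keeping the original quote style.
    .replace(/waitForLoadState\(\s*(['"])networkidle\1/g, "waitForLoadState($1domcontentloaded$1")
    // waitUntil: 'networkidle' inside page.goto() options.
    .replace(/waitUntil\s*:\s*(['"])networkidle\1/g, "waitUntil: $1domcontentloaded$1");
}

const patchedWait = patchNetworkIdleSketch("await page.waitForLoadState('networkidle');");
const patchedGoto = patchNetworkIdleSketch('await page.goto(url, { waitUntil: "networkidle" });');
console.log(patchedWait, patchedGoto);
```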
persistHealingEvents(testId, events)
persistHealingEvents(testId, events)
Writes healing events to the DB so future runs benefit from what we learned. Safe to call with an empty or undefined events array.
Parameters:
| Name | Type | Description |
|---|---|---|
| testId | string | the test these events belong to |
| events | Array | healing events from runGeneratedCode |
- Source:
regenerateFailingTest()
regenerateFailingTest(improvement, signal) → improved test or null
Calls the AI to produce a fixed version of a failing test. Accepts an optional AbortSignal so the operation can be cancelled.
- Source:
registerWebVitalsInitScript()
registerWebVitalsInitScript(context) — AUTO-017.1
Installs the web-vitals IIFE + observer bootstrap on the browser context
via addInitScript, so observers are active from the first byte of every
navigation. Must be called once per context immediately after creation and
before the first page.goto().
No-ops when the web-vitals package isn't installed — callers should still
invoke captureWebVitals(page), which returns the empty-metrics shape in
that case.
- Source:
repairBrokenStringLiterals()
repairBrokenStringLiterals(code)
AI output occasionally breaks CSS/XPath selectors across lines inside single/double-quoted literals, creating invalid JavaScript:
page.$('button[name=btnI]
[type=submit]')
JavaScript does not allow raw newlines in single/double quotes, so parsing fails with "Invalid or unexpected token". This repair pass replaces newline characters that occur while inside a single/double-quoted string with a space, preserving content while restoring valid syntax.
- Source:
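Example:
The repair pass can be approximated with a small character scanner that tracks whether it is inside a quoted literal. This is a sketch under simplifying assumptions: it does not understand comments, template literals, or regex literals, so an apostrophe inside a comment would confuse it.

```javascript
// Sketch: replace raw newlines that occur inside '...' or "..." with a single space.
function repairBrokenStringLiterals(code) {
  let out = "";
  let quote = null; // quote character of the literal we are inside, or null when outside
  for (let i = 0; i < code.length; i++) {
    const ch = code[i];
    if (quote === null) {
      if (ch === "'" || ch === '"') quote = ch;
      out += ch;
    } else if (ch === "\\") {
      // Copy escape sequences verbatim so \' and \" do not end the literal.
      out += ch + (code[i + 1] ?? "");
      i++;
    } else if (ch === "\n" || ch === "\r") {
      // Treat \r\n as one line break.
      if (ch === "\r" && code[i + 1] === "\n") i++;
      out += " ";
    } else {
      if (ch === quote) quote = null;
      out += ch;
    }
  }
  return out;
}

console.log(repairBrokenStringLiterals("page.$('button[name=btnI]\n[type=submit]')"));
```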
resolveBrowser(nameopt) → {Object}
Resolve a browser name string to a Playwright BrowserType.
Unknown values fall back to chromium so a typo doesn't crash a run.
Non-string truthy inputs (numbers, objects, booleans) are treated as
unknown rather than throwing — the route layer can pass req.body.browser
straight through without a typeof guard.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| name | * | <optional> | One of |
- Source:
Returns:
The Playwright BrowserType and its canonical name.
- Type
- Object
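Example:
The fallback behaviour can be sketched without Playwright by using stand-in objects for the three engines. BROWSER_TYPES, the returned field names (browserType, name), and the trim/lowercase step are assumptions for illustration only.

```javascript
// Stand-ins for Playwright's BrowserType objects, so the sketch runs on its own (assumption).
const BROWSER_TYPES = {
  chromium: { id: "chromium" },
  firefox: { id: "firefox" },
  webkit: { id: "webkit" },
};

// Sketch: unknown names and non-string inputs fall back to chromium instead of throwing.
function resolveBrowser(name) {
  const key = typeof name === "string" ? name.trim().toLowerCase() : "";
  const known = Object.prototype.hasOwnProperty.call(BROWSER_TYPES, key);
  const canonical = known ? key : "chromium";
  return { browserType: BROWSER_TYPES[canonical], name: canonical };
}

console.log(resolveBrowser("firefox").name, resolveBrowser("chrmium").name, resolveBrowser(42).name);
```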
resolveDevice(deviceName) → {Object|null}
Resolve a device name to a Playwright device descriptor.
Returns null for empty/unknown names (caller should use default context).
Parameters:
| Name | Type | Description |
|---|---|---|
| deviceName | string | A key from |
- Source:
Returns:
Playwright device descriptor with viewport, userAgent, etc.
- Type
- Object | null
resolveTestCountInstruction(testCount, localopt) → {string}
Resolve the test count instruction for prompt builders.
Maps the validated testCount dial value to an authoritative instruction string that replaces the previously hardcoded "Generate 3-5 / 5-8 tests" ranges. The instruction is worded imperatively so the LLM treats it as a hard constraint rather than a suggestion.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| testCount | string | | Validated dial value (one, small, medium, large, or ai_decides) |
| local | boolean | <optional> | true when using a local provider (Ollama). Defaults to isLocalProvider() when omitted. |
- Source:
Returns:
e.g. "Generate EXACTLY 1 test" or "Generate 5-8 tests"
- Type
- string
runApiTestCode(playwrightCode, expect, optionsopt)
runApiTestCode(playwrightCode, expect)
Executes an API-only test that uses Playwright's request.newContext()
instead of a browser page. Creates a real APIRequestContext, runs the
generated code, and cleans up afterwards.
Returns { passed: true, apiLogs } or throws with the error.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| playwrightCode | string | | The AI-generated Playwright test code. |
| expect | function | | Playwright's expect function. |
| options | Object | <optional> | |
- Source:
runFeedbackLoop(run, tests, signalopt)
runFeedbackLoop(run, tests, signal)
Analyses failures from the completed test run and auto-regenerates high-priority failing tests via AI. No-ops silently when:
- There are no failures
- The run was aborted
- No AI provider is configured
- The AI provider is degraded (rate-limited / circuit-broken)
The AI portion is wrapped in a timeout (FEEDBACK_TIMEOUT_MS, default 180s) so it can never block run completion indefinitely.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| run | object | | Mutable run record |
| tests | Array | | The test objects that were executed |
| signal | AbortSignal | <optional> | |
- Source:
runGeneratedCode(page, context, playwrightCode, expect, healingHintsopt, optsopt)
runGeneratedCode(page, context, playwrightCode, expect, healingHints, opts)
Dynamically executes the AI-generated test body against the live page. Returns { passed: true, healingEvents: [...], stepCaptures: [...] } or throws.
healingHints is an optional map of "action::label" → strategyIndex from previous runs, injected into the runtime helpers so the winning strategy is tried first (adaptive self-healing).
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| page | Object | | |
| context | Object | | |
| playwrightCode | string | | |
| expect | function | | |
| healingHints | Object | <optional> | |
| opts | Object | <optional> | |
- Source:
(async) runInSandbox(code, exposed, filenameopt) → {Promise.<*>}
Compile and execute code inside a vm sandbox with env stripping.
Parameters:
| Name | Type | Attributes | Default | Description |
|---|---|---|---|---|
| code | string | | | The full async IIFE source to execute |
| exposed | Object | | | Objects to inject into the sandbox context |
| filename | string | <optional> | generated-test.js | Virtual filename for stack traces |
- Source:
Returns:
The return value of the executed code
- Type
- Promise.<*>
runPostGenerationPipeline(rawTests, project, run, opts) → {Object}
Run the shared post-generation pipeline stages:
- Step 5: Deduplicate against batch + existing project tests
- Step 6: Enhance assertions
- Step 7: Validate (reject malformed / placeholder tests)
Parameters:
| Name | Type | Description |
|---|---|---|
| rawTests | Array.<object> | AI-generated test objects |
| project | object | Project record |
| run | object | Mutable run record |
| opts | object | |
- Source:
Returns:
- Type
- Object
(async) runWithStrippedEnv(fn) → {Promise.<*>}
Execute a function with destructive process methods blocked.
Blocks process.exit(), process.kill(), and process.abort() so that
sandbox-escaped code cannot crash the server. The vm sandbox already hides
process from the global scope; these guards are a defense-in-depth layer
for the .constructor.constructor('return process')() escape path.
NOTE: process.env is NOT stripped — doing so breaks concurrent server
operations (Express handlers, JWT verification, AI calls) that run on the
same event loop between await points. For env isolation, use worker_threads
with env: {}.
Concurrency-safe: uses a reference counter so parallel workers (poolMap in testRunner.js) all run with guards installed. The first entering test installs them, the last exiting test restores the originals.
Parameters:
| Name | Type | Description |
|---|---|---|
| fn | function | Async function to execute with process guards |
- Source:
Returns:
return value of fn
- Type
- Promise.<*>
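Example:
The reference-counting scheme described above can be sketched as follows. The second parameter (target) is a hypothetical addition so the sketch can be exercised against a fake object instead of the real process; the error message and helper names are also illustrative.

```javascript
const GUARDED_METHODS = ["exit", "kill", "abort"];
let activeCount = 0;   // how many callers are currently inside runWithStrippedEnv
let originals = null;  // saved methods, present only while guards are installed

function installGuards(target) {
  // Only the first entering caller swaps the methods out.
  if (activeCount++ === 0) {
    originals = {};
    for (const name of GUARDED_METHODS) {
      originals[name] = target[name];
      target[name] = () => {
        throw new Error(`process.${name}() is blocked while generated code runs`);
      };
    }
  }
}

function removeGuards(target) {
  // Only the last exiting caller restores the originals.
  if (--activeCount === 0) {
    for (const name of GUARDED_METHODS) target[name] = originals[name];
    originals = null;
  }
}

// Sketch: `target` defaults to the real process object; pass a fake in tests.
async function runWithStrippedEnv(fn, target = process) {
  installGuards(target);
  try {
    return await fn();
  } finally {
    removeGuards(target);
  }
}
```

Because the guards are installed synchronously before fn is awaited, overlapping calls all see the blocked methods until the last one finishes.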
sanitiseSteps()
sanitiseSteps(tests)
If a test's steps array contains Playwright code instead of human-readable descriptions (common with smaller LLMs like Mistral 7B), convert them to human-readable descriptions.
- Source:
sanitizeDomSnapshot(input, ctxOrOptsopt) → {Object}
Sanitize an input value (string, array, or plain object — walked recursively) by replacing PII matches with deterministic placeholders.
Two call forms:
- sanitizeDomSnapshot(input, ctx): caller-owned context (created via createPiiContext); no audit log is emitted here. The caller invokes finalizePiiContext once after the final call to log aggregate counts for the run. Use this form when multiple artifacts must share placeholder IDs.
- sanitizeDomSnapshot(input, { allowlist, runId }): convenience form that creates a fresh context internally and emits its own audit log. Counters and placeholders are NOT shared with later calls.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| input | * | | |
| ctxOrOpts | object | <optional> | Either a context object from |
- Source:
Returns:
- Type
- Object
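Example:
To illustrate why a shared context gives stable placeholder IDs across artifacts, here is a reduced sketch that only redacts email addresses. The context shape, the placeholder format ([EMAIL_n]), and the email regex are all assumptions; the options form, allowlist, and audit logging are omitted.

```javascript
// Hypothetical context shape: maps each PII value to the placeholder already issued for it.
function createPiiContext() {
  return { placeholders: new Map(), counts: { email: 0 } };
}

const EMAIL_RE = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

// Sketch: walk strings, arrays, and plain objects recursively; other values pass through.
function sanitizeDomSnapshot(input, ctx = createPiiContext()) {
  if (typeof input === "string") {
    return input.replace(EMAIL_RE, (match) => {
      if (!ctx.placeholders.has(match)) {
        ctx.counts.email += 1;
        ctx.placeholders.set(match, `[EMAIL_${ctx.counts.email}]`);
      }
      return ctx.placeholders.get(match);
    });
  }
  if (Array.isArray(input)) return input.map((value) => sanitizeDomSnapshot(value, ctx));
  if (input !== null && typeof input === "object") {
    return Object.fromEntries(
      Object.entries(input).map(([key, value]) => [key, sanitizeDomSnapshot(value, ctx)])
    );
  }
  return input;
}

// The same address gets the same placeholder in both artifacts because ctx is shared.
const ctx = createPiiContext();
console.log(sanitizeDomSnapshot({ heading: "Contact ann@example.com" }, ctx));
console.log(sanitizeDomSnapshot(["ann@example.com", "bob@example.org"], ctx));
```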
sanitizeRunInputs(project, run, inputs) → {Object}
SEC-006: PII firewall — single wiring point between the crawler/classify
stages and the AI prompt builder. Called once per run to redact PII from
the snapshots AND classified pages that feed generateAllTests and the
post-generation pipeline.
Honours per-project controls:
- project.strictPiiFirewall (default ON; explicit false disables)
- project.piiAllowlist (string[] passed through to the sanitizer)
Returns the same shape it was given so callers can transparently swap in the sanitized values. When the firewall is disabled, inputs are returned unchanged and no audit log is emitted.
Parameters:
| Name | Type | Description |
|---|---|---|
| project | object | |
| run | object | |
| inputs | object | |
- Source:
Returns:
- Type
- Object
scoreTest()
scoreTest(test) → number 0–100
Quality score used to pick the best test when duplicates are found. Higher = better quality test to keep.
Thin wrapper around scoreTestWithFactors — kept as a separate
export so existing call sites (deduplicateTests, testPersistence) and
unit tests don't need to know about the factor breakdown.
- Source:
scoreTestWithFactors(test) → {Object}
scoreTestWithFactors(test) → { score: number, factors: Array<{ id, label, delta, kind }> }
Companion to scoreTest that also returns the list of factors that
applied. Drives the Review Queue's "why was this drafted?" explainer so a
reviewer can see at a glance which rewards and penalties produced the score
— without inspecting the test code.
The numeric score is identical to scoreTest()'s output; the two functions
share the QUALITY_FACTORS rubric so they can never drift.
Parameters:
| Name | Type | Description |
|---|---|---|
| test | object | |
- Source:
Returns:
- Type
- Object
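Example:
The relationship between the two scoring functions can be sketched like this. The rubric entries, the base score of 50, and the playwrightCode/steps field names are invented for illustration; only the shared-rubric structure and the { score, factors } return shape come from the description above.

```javascript
// Hypothetical rubric: ids, labels, and deltas are illustrative, not the real QUALITY_FACTORS.
const QUALITY_FACTORS = [
  { id: "has-assertion", label: "Contains at least one expect()", delta: 30, kind: "reward",
    applies: (t) => /\bexpect\(/.test(t.playwrightCode ?? "") },
  { id: "has-steps", label: "Has human-readable steps", delta: 20, kind: "reward",
    applies: (t) => Array.isArray(t.steps) && t.steps.length > 0 },
  { id: "fixed-timeout", label: "Uses waitForTimeout", delta: -15, kind: "penalty",
    applies: (t) => /waitForTimeout\(/.test(t.playwrightCode ?? "") },
];
const BASE_SCORE = 50; // assumed starting point

function scoreTestWithFactors(test) {
  const factors = QUALITY_FACTORS
    .filter((factor) => factor.applies(test))
    .map(({ id, label, delta, kind }) => ({ id, label, delta, kind }));
  const raw = factors.reduce((sum, factor) => sum + factor.delta, BASE_SCORE);
  // Clamp to the documented 0-100 range.
  return { score: Math.max(0, Math.min(100, raw)), factors };
}

// Thin wrapper: the number always comes from the same rubric, so the two cannot drift.
function scoreTest(test) {
  return scoreTestWithFactors(test).score;
}
```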
semanticSimilarity(testA, testB) → {number}
semanticSimilarity(testA, testB) → number 0–1
Combines name, description, and steps into a single bag-of-words TF-IDF vector and returns cosine similarity. Resolves defect #1 (semantic duplicates with different wording) and defect #4 (description field previously ignored).
Parameters:
| Name | Type | Description |
|---|---|---|
| testA | object | |
| testB | object | |
- Source:
Returns:
- Type
- number
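Example:
A self-contained sketch of bag-of-words cosine similarity with TF-IDF weighting. It treats the two tests as the whole corpus and uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, so that shared terms keep a non-zero weight; the real corpus and weighting scheme are not specified here, so both are assumptions.

```javascript
// Combine name, description, and steps into one lowercase token list.
function tokenize(test) {
  const steps = Array.isArray(test.steps) ? test.steps.join(" ") : "";
  const text = [test.name, test.description, steps].filter(Boolean).join(" ");
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function termFrequencies(tokens) {
  const tf = new Map();
  for (const token of tokens) tf.set(token, (tf.get(token) ?? 0) + 1);
  return tf;
}

function semanticSimilarity(testA, testB) {
  const tfA = termFrequencies(tokenize(testA));
  const tfB = termFrequencies(tokenize(testB));
  const vocabulary = new Set([...tfA.keys(), ...tfB.keys()]);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const term of vocabulary) {
    const df = (tfA.has(term) ? 1 : 0) + (tfB.has(term) ? 1 : 0);
    const idf = Math.log((1 + 2) / (1 + df)) + 1; // smoothed IDF over a two-document corpus
    const a = (tfA.get(term) ?? 0) * idf;
    const b = (tfB.get(term) ?? 0) * idf;
    dot += a * b;
    normA += a * a;
    normB += b * b;
  }
  // Cosine similarity; an empty vector is defined as similarity 0.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

Because the description contributes tokens alongside the name and steps, two tests with different names but overlapping descriptions still score above zero.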
shouldSend()
Lifetime dedup for events in ONCE_PER_INSTALL. Returns true if the event should be sent. All other events are always sent.
- Source:
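Example:
The dedup rule can be sketched with an in-memory set. The event names are hypothetical, and a real "lifetime" dedup would need to persist the sent set across restarts; where that state lives is not described here.

```javascript
// Hypothetical event names; the real ONCE_PER_INSTALL contents are not listed in this section.
const ONCE_PER_INSTALL = new Set(["install_completed"]);
// In-memory stand-in for whatever persistent store the real implementation uses (assumption).
const alreadySent = new Set();

function shouldSend(eventName) {
  if (!ONCE_PER_INSTALL.has(eventName)) return true; // all other events are always sent
  if (alreadySent.has(eventName)) return false;      // already sent once for this install
  alreadySent.add(eventName);
  return true;
}
```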
startScreencast(page, runId, optionsopt) → {Promise.<({stop: function(): Promise.<void>, cdpSession: Object}|null)>}
startScreencast(page, runId)
Starts a CDP screencast session and begins streaming JPEG frames to any
SSE clients watching the given run. Frames are throttled via setImmediate
so bursts don't flood the SSE channel; emitRunEvent() no-ops when no
clients are connected so the only overhead is CDP JPEG encoding (~2-3% CPU).
Returns an object with both a stop cleanup function (used by
executeTest.js and the recorder during teardown) and the underlying
cdpSession (used by the recorder to dispatch input events back into
the page). Returns null if CDP is unavailable on the current
browser engine — Firefox / WebKit have no equivalent of Chrome's
Page.startScreencast, so cross-browser test runs gracefully degrade
to a no-screencast / no-input-forwarding mode.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| page | Object | | Playwright Page instance. |
| runId | string | | |
| options | Object | <optional> | |
- Source:
Returns:
{ stop, cdpSession } on success, or null if CDP is unavailable.
- Type
- Promise.<({stop: function(): Promise.<void>, cdpSession: Object}|null)>
stripHallucinatedPageAssertions()
stripHallucinatedPageAssertions(code)
Removes expect(page).toHaveURL(...) and similar page assertions that the
AI sometimes hallucinates at the end of API-only tests. These lines would
crash at runtime because page is undefined in the API execution context.
Only called for code that has already been classified as an API test by isApiTest(), so we know there are no real page interactions.
- Source:
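Example:
A line-based sketch of the stripping step. It assumes each hallucinated assertion sits on its own line and starts with expect(page); assertions spread over several lines are not handled by this simplified version.

```javascript
// Sketch: drop any line that is an assertion on the bare `page` object.
function stripHallucinatedPageAssertions(code) {
  return code
    .split("\n")
    .filter((line) => !/^\s*(?:await\s+)?expect\(\s*page\s*\)/.test(line))
    .join("\n");
}

const apiTest = [
  "const res = await request.get('/api/users');",
  "expect(res.status()).toBe(200);",
  "await expect(page).toHaveURL('/users');",
].join("\n");
console.log(stripHallucinatedPageAssertions(apiTest));
```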
stripNoiseParams(u)
Strip noise query parameters from a URL, preserving significant ones.
Shared utility for both crawlBrowser.js and stateExplorer.js so link normalisation is consistent across crawl modes (#52 defect #1).
Parameters:
| Name | Type | Description |
|---|---|---|
| u | URL | Mutable URL object (modified in place) |
- Source:
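Example:
A sketch of in-place noise stripping on a WHATWG URL object. Which parameters count as noise is an assumption here (the utm_ prefix plus a couple of common click IDs); the real list is not given in this section.

```javascript
// Hypothetical noise list for illustration.
const NOISE_PARAMS = new Set(["fbclid", "gclid"]);

// Sketch: mutates `u` in place and returns nothing, matching the "modified in place" contract.
function stripNoiseParams(u) {
  // Copy the keys first so deleting does not disturb the iteration.
  for (const key of [...u.searchParams.keys()]) {
    if (key.startsWith("utm_") || NOISE_PARAMS.has(key)) {
      u.searchParams.delete(key);
    }
  }
}

const link = new URL("https://example.com/products?utm_source=mail&page=2&fbclid=abc");
stripNoiseParams(link);
console.log(link.href);
```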
stripPlaywrightImports()
stripPlaywrightImports(code)
Remove lines like:
- import { test, expect } from '@playwright/test';
- const { test, expect } = require('@playwright/test');
so they don't cause parse errors when we eval the body.
- Source:
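Example:
One way to sketch this is a per-line filter with two regexes, one for the ESM import and one for the CommonJS require. The patterns are illustrative and assume each import sits on a single line.

```javascript
const IMPORT_RE = /^\s*import\s+.*\sfrom\s+['"]@playwright\/test['"];?\s*$/;
const REQUIRE_RE = /^\s*(?:const|let|var)\s+.*=\s*require\(\s*['"]@playwright\/test['"]\s*\);?\s*$/;

// Sketch: drop whole lines that import or require '@playwright/test'.
function stripPlaywrightImports(code) {
  return code
    .split("\n")
    .filter((line) => !IMPORT_RE.test(line) && !REQUIRE_RE.test(line))
    .join("\n");
}

console.log(stripPlaywrightImports("import { test, expect } from '@playwright/test';\nawait page.goto('/');"));
```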
validateActions(code) → {Array.<string>}
validateActions(code) → string[]
Scans all method calls on page, locator, frame, context, and
request and flags any that are not in VALID_PAGE_ACTIONS.
Resolves defect #2 — catches typos like .clicks(), .fillIn(), .toHavURL().
Parameters:
| Name | Type | Description |
|---|---|---|
| code | string | Playwright test code |
- Source:
Returns:
Array of issue strings (empty = all actions valid)
- Type
- Array.<string>
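Example:
The check can be sketched as a regex scan against an allowlist. The allowlist below is deliberately tiny and the regex (SIMPLE_CALL_RE) is a simplified stand-in, not the documented ACTION_CALL_RE; it only sees calls made directly on a named receiver, not on chained results.

```javascript
// Illustrative subset only; the real VALID_PAGE_ACTIONS is larger.
const VALID_PAGE_ACTIONS = new Set(["goto", "click", "fill", "locator", "getByRole", "waitForLoadState"]);
// Simplified stand-in: receiver.method( for the five receivers named above.
const SIMPLE_CALL_RE = /\b(page|locator|frame|context|request)\.(\w+)\s*\(/g;

function validateActions(code) {
  const issues = [];
  for (const [, receiver, method] of code.matchAll(SIMPLE_CALL_RE)) {
    if (!VALID_PAGE_ACTIONS.has(method)) {
      issues.push(`Unknown method ${receiver}.${method}()`);
    }
  }
  return issues;
}

console.log(validateActions("await page.goto('/'); await page.clicks('#save');"));
```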
validateLocators(code) → {Array.<string>}
validateLocators(code) → string[]
Extracts all CSS and XPath locator strings from the code and validates each. Resolves defect #1.
Parameters:
| Name | Type | Description |
|---|---|---|
| code | string | |
- Source:
Returns:
- Type
- Array.<string>
validateSafeHelperUsage(code) → {Array.<string>}
validateSafeHelperUsage(code) → string[]
Rejects expect(page.locator('<cssSelector>')).<visibilityMatcher>(...)
chains. These bypass the self-healing locator waterfall and so fail
silently when the class/id is renamed or the element only renders in a
subset of UI states (the TC-7 regression where .todo-count was missing
in TodoMVC's empty state).
The AI is expected to use one of:
- await safeExpect(page, expect, '<visible text>', '<role>')
- expect(page.getByRole('<role>', { name: '<text>' })).<matcher>(...)
- expect(page.getByText('<text>')).<matcher>(...)
Human-readable arguments (e.g. locator('Submit')) are a no-op for
page.locator() anyway — Playwright will simply fail to find them —
so they're left for the existing locator validator to flag.
Parameters:
| Name | Type | Description |
|---|---|---|
| code | string | |
- Source:
Returns:
- Type
- Array.<string>
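Example:
A rough sketch of the rejection rule. The set of visibility matchers and the "looks like CSS" heuristic are assumptions chosen for illustration; human-readable arguments such as 'Submit' are skipped, as described above.

```javascript
// Assumption: the docs do not list which matchers count as visibility matchers.
const VISIBILITY_MATCHERS = new Set(["toBeVisible", "toBeHidden"]);
// Rough heuristic: starts with "." or "#", or contains an [attribute] selector.
const CSS_LIKE_RE = /^[.#]|\[/;
// expect(page.locator('<selector>')) optionally followed by .not, then .<matcher>(
const EXPECT_LOCATOR_RE =
  /expect\(\s*page\.locator\(\s*(['"`])(.+?)\1\s*\)\s*\)(?:\.not)?\.(\w+)\(/g;

function validateSafeHelperUsage(code) {
  const issues = [];
  for (const [, , selector, matcher] of code.matchAll(EXPECT_LOCATOR_RE)) {
    if (VISIBILITY_MATCHERS.has(matcher) && CSS_LIKE_RE.test(selector)) {
      issues.push(`expect(page.locator('${selector}')).${matcher}() bypasses the self-healing helpers`);
    }
  }
  return issues;
}

console.log(validateSafeHelperUsage("await expect(page.locator('.todo-count')).toBeVisible();"));
```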
validateTest(test, projectUrl) → {Array.<string>}
Validate a single AI-generated test object. Returns an array of issue strings — empty means the test is valid.
Parameters:
| Name | Type | Description |
|---|---|---|
| test | object | AI-generated test object |
| projectUrl | string | The project's base URL (for placeholder detection) |
- Source:
Returns:
- Type
- Array.<string>
waitForStable(page, optsopt) → {Promise.<void>}
waitForStable(page, opts) → Promise
S3-02 — DOM stability wait using MutationObserver.
Modern SPAs (React, Vue, Angular, Next.js) and apps with streaming AI
responses, skeleton screens, or async data fetches settle at variable
times. Using a fixed waitForTimeout causes tests to assert on
partially-rendered pages, producing false failures.
This helper installs a MutationObserver on document.body that counts
every DOM mutation. It polls until stableSec consecutive seconds pass
with no new mutations (or timeoutSec is reached), then disconnects
cleanly. The observer and mutation counter are stored on window so
they survive across evaluate() calls and can be cleaned up reliably.
Based on the Assrt agent.ts pattern referenced in NEXT_STEPS S3-02.
Parameters:
| Name | Type | Attributes | Description |
|---|---|---|---|
| page | Object | | Playwright Page instance |
| opts | object | <optional> | |
- Source:
Returns:
- Type
- Promise.<void>
withDials()
Inject an optional dialsPrompt into a base AI prompt.
Accepts either:
- A plain string (legacy) → injects before STRICT RULES / Requirements
- A { system, user } object → appends dials to the end of the user message
Returns the same shape as the input (string or { system, user }).
Dials are injected into the USER message (not system) because they represent per-request configuration that varies with each generation run.
- Source:
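Example:
The two input shapes can be handled roughly as below. The parameter names (base, dialsPrompt), the blank-line separators, and the fallback of appending when no marker is found are assumptions; only the injection points come from the description above.

```javascript
// Sketch: returns the same shape it was given (string or { system, user }).
function withDials(base, dialsPrompt) {
  if (!dialsPrompt) return base; // dials are optional

  if (typeof base === "string") {
    // Legacy form: inject just before the first STRICT RULES / Requirements marker.
    const marker = /STRICT RULES|Requirements/.exec(base);
    if (!marker) return `${base}\n\n${dialsPrompt}`;
    return `${base.slice(0, marker.index)}${dialsPrompt}\n\n${base.slice(marker.index)}`;
  }

  // Structured form: dials go at the end of the user message, system is untouched.
  return { ...base, user: `${base.user}\n\n${dialsPrompt}` };
}

console.log(withDials({ system: "You write tests.", user: "Generate tests for /login." }, "Generate EXACTLY 1 test"));
```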