Module: pipeline/failureClusterer

Deterministic run-failure clustering (no DB, no LLM calls).

Matching heuristic

Two failed results merge into the same cluster when ALL of:

  1. Their normalised errorPattern strings are byte-equal, AND
  2. Any one of:
       (a) both carry the same non-null URL origin prefix, OR
       (b) both carry a non-empty selector and the Levenshtein edit distance between the two selectors is ≤ 4, OR
       (c) neither side carries a URL or a selector (fallback: error-pattern equality alone; see "Null-URL + null-selector" below).
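
The two conditions above can be read as a pairwise predicate. The following is a minimal sketch of that reading, not the module's actual source: the field names urlOrigin and selector, the helper names levenshtein and shouldMerge, and the interpretation of (c) as "each side has neither a URL nor a selector" are all assumptions made for illustration.

```javascript
// Sketch only. Field names (errorPattern, urlOrigin, selector) are assumed;
// errorPattern is taken to be already normalised.
const MAX_SELECTOR_DISTANCE = 4;

// Plain full-table Levenshtein distance, kept simple for illustration.
function levenshtein(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i;
  for (let j = 0; j < cols; j++) d[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
    }
  }
  return d[rows - 1][cols - 1];
}

function shouldMerge(x, y) {
  // Condition 1: exact equality of the normalised error patterns.
  if (x.errorPattern !== y.errorPattern) return false;

  // Condition 2(a): same non-null URL origin prefix.
  if (x.urlOrigin != null && x.urlOrigin === y.urlOrigin) return true;

  // Condition 2(b): both selectors non-empty and within edit distance 4.
  if (x.selector && y.selector &&
      levenshtein(x.selector, y.selector) <= MAX_SELECTOR_DISTANCE) {
    return true;
  }

  // Condition 2(c): neither side carries a URL or a selector.
  const isBare = (r) => r.urlOrigin == null && !r.selector;
  return isBare(x) && isBare(y);
}
```

Under this reading, a failure with no URL and no selector never merges with one that has either, even when the error patterns match.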

Known limitations (intentional trade-offs)

size vs affectedTestIds.length: size counts every failed result row contributing to the cluster (so a data-driven test with N failing iterations contributes N), while affectedTestIds is the deduplicated set of distinct test IDs. The Run Detail UI surfaces affectedTestIds.length for the "N affected test(s)" copy so per-test counts don't double-count iterations / retries.
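
To make the difference concrete, here is a small sketch that derives both numbers from a list of failed rows. The row shape ({ testId, iteration }), the helper name summariseCluster, and the label wording are hypothetical and are not taken from the module.

```javascript
// Sketch only: derives size and affectedTestIds from the rows of one cluster.
function summariseCluster(rows) {
  // A Set drops repeated test IDs while keeping first-seen order.
  const affectedTestIds = [...new Set(rows.map((r) => r.testId))];
  return { size: rows.length, affectedTestIds };
}

// One data-driven test failing three iterations, plus one ordinary failure.
const rows = [
  { testId: "checkout", iteration: 0 },
  { testId: "checkout", iteration: 1 },
  { testId: "checkout", iteration: 2 },
  { testId: "login", iteration: 0 },
];

const cluster = summariseCluster(rows);
const n = cluster.affectedTestIds.length;
const label = `${n} affected test${n === 1 ? "" : "s"}`;

console.log(cluster.size); // 4
console.log(label);        // "2 affected tests"
```

Here size is 4 because every failing row counts, while the UI-facing count is 2 because only two distinct tests are involved.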

Null-URL + null-selector merges — when neither side has a URL or a selector (e.g. a thrown error before the first navigation), the matcher falls back to error-pattern equality alone. Acceptable for "AI provider returned 503" style failures; documented here rather than gated because the alternative (singleton clusters per URL-less failure) is the worse UX.

Methods

(static) clusterFailures(input) → {Array}

Parameters:
Name    Type     Description
input   Object   (properties listed below)

Properties of input:
Name      Type    Attributes   Description
results   Array   <optional>   Test result rows; only entries with status === "failed" are clustered.

Returns:

Clusters sorted by descending size, each shaped: { fingerprint, affectedTestIds[], sharedUrl, sharedSelector, errorPattern, size }.

  • size — total failed-result rows in the cluster (includes data-driven iterations).
  • affectedTestIds — deduplicated set of distinct test IDs.
Type
Array

(inner) editDistance(a, b) → {number}

Levenshtein edit distance with O(min(n, m)) memory via a two-row rolling array. Selectors are typically short (< 100 chars), so the O(n·m) time bound is fine; the rolling array keeps memory linear in the shorter input, which avoids allocating a full n × m table if a stack trace ever leaks into the selector field.

Parameters:
Name Type Description
a string
b string
Returns:
Type
number
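
For reference, the two-row rolling-array technique described above can be sketched as follows. This is an illustrative implementation under the stated description, not the module's source; the name editDistanceSketch is hypothetical, and characters are compared as UTF-16 code units.

```javascript
// Sketch only: Levenshtein distance using two rows instead of a full table.
function editDistanceSketch(a, b) {
  // Put the shorter string along the row so each row has min(n, m) + 1 cells.
  if (a.length < b.length) [a, b] = [b, a];

  // prev[j] holds the distance between the first i - 1 chars of a
  // and the first j chars of b; row 0 is simply 0, 1, 2, ...
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  let curr = new Array(b.length + 1).fill(0);

  for (let i = 1; i <= a.length; i++) {
    curr[0] = i; // deleting i chars of a to reach the empty prefix of b
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    // Reuse the two buffers: the row just filled becomes the previous row.
    [prev, curr] = [curr, prev];
  }
  return prev[b.length];
}

console.log(editDistanceSketch("kitten", "sitting"));             // 3
console.log(editDistanceSketch("#submit-btn", "#submit-button")); // 3
```

Swapping the two buffers rather than allocating a new row per iteration keeps allocation constant after setup, whatever the length of the longer input.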