JSDoc: Module: database/repositories/runRepo

Run CRUD backed by SQLite.

JSON columns: tests, results, testQueue, generateInput, promptAudit, pipelineStats, feedbackLoop, videoSegments, qualityAnalytics, changedPages, removedPages, changedFiles, impactAnalysis, githubCheck.

Log lines are stored in the run_logs table (ENH-008) — not in a logs JSON column. getById hydrates run.logs from run_logs automatically so callers see no API change.

All read queries filter WHERE deletedAt IS NULL by default. Hard deletes are replaced with soft-deletes: deletedAt = datetime('now'). Use getDeletedByProjectId / restore for recycle-bin operations.

Pagination

getByProjectIdPaged returns { data: Run[], meta: { total, page, pageSize, hasMore } }.

Source:

database/repositories/runRepo.js, line 1

Methods

(static) appendRunResults(runId, newResults) → {number}

CAP-002 Phase 2 — Atomic append of one or more result objects to a run's results JSON column in a single SQL statement.

Why this exists: save(run) writes every column from an in-memory JS snapshot, so N concurrent shard workers calling save(run) would silently lose results to last-write-wins. This primitive performs a single-statement UPDATE that splices new elements into the serialized JSON in place — no read, no JS-side modify, no write-back. The SQL engine serializes concurrent UPDATEs against the same row (better-sqlite3 journal lock on SQLite, row-level lock on Postgres), so each statement is the linearization point — verified by backend/tests/run-storage-concurrency.test.js.

Cross-dialect strategy: rather than reach for SQLite's json_patch() or Postgres' jsonb_set (which the adapter's translateSql does not bridge — see backend/src/database/adapters/postgres-adapter.js), we operate on the column as raw JSON TEXT. substr (1-indexed, length-based) and || (string concat) are spelled identically in both dialects, so the statement runs unmodified through the adapter's translation layer. The column is only ever written by JSON.stringify(...) via runToRow, so the canonical format produces a closing ] as the final character — the rtrim(...) calls below are defensive insurance against a future writer (or direct DB migration) that pretty-prints or appends trailing whitespace, which would otherwise corrupt the splice into malformed JSON.

Strategy: build the new chunk client-side as JSON.stringify(newResults) (always shaped [...]), then in SQL:

if results is NULL or '[]' → overwrite with the new chunk
otherwise → trim trailing whitespace, then substr(rtrim(results), 1, length(rtrim(results)) - 1) || ',' || substr(newChunk, 2) i.e. drop the existing trailing ], append ,, then append the new chunk's interior …].

No-ops cleanly when newResults is empty or not an array.

Parameters:

Name	Type	Description
`runId`	string
`newResults`	Array.<Object>	— Result objects to append. Each element must be JSON-serialisable (no functions, no cycles). Empty array → no-op.

Source:

database/repositories/runRepo.js, line 592

Returns:

Number of elements actually appended.

Type: number

(static) countByProjectIds(projectIds) → {number}

Count non-deleted runs for a set of project IDs.

Parameters:

Name	Type	Description
`projectIds`	Array.<string>

Source:

database/repositories/runRepo.js, line 325

Returns:

Type: number

(static) create(run)

Create a run. Note: run.logs is intentionally not written to runs.logs — log lines are persisted via runLogRepo.appendLog in runLogger.js (ENH-008).

Parameters:

Name	Type	Description
`run`	Object

Source:

database/repositories/runRepo.js, line 507

(static) deleteByProjectId(projectId) → {Array.<string>}

Soft-delete all runs for a project.

Parameters:

Name	Type	Description
`projectId`	string

Source:

database/repositories/runRepo.js, line 1000

Returns:

IDs of newly soft-deleted runs.

Type: Array.<string>

(static) findActiveByProjectId(projectId, typesopt) → {Object|undefined}

Find an active (non-deleted, non-finished) run for a project.

Parameters:

Name	Type	Attributes	Description
`projectId`	string
`types`	Array.<string>	<optional>	— Run types to check (default: crawl, test_run, generate).

Source:

database/repositories/runRepo.js, line 984

Returns:

Type: Object | undefined

(static) findByGithubDeliveryId(projectId, deliveryId) → {Object|undefined}

Find a non-deleted run that was launched from a specific GitHub webhook delivery. Used to make duplicate PR webhooks idempotent (INT-002).

GitHub stamps every webhook delivery with a unique UUID in the X-GitHub-Delivery header and retries the same UUID with exponential backoff for up to 24h on non-2xx responses. The delivery ID — not the commit SHA — is the correct idempotency key:

Two deliveries for the same SHA but different delivery IDs are distinct events (e.g. PR opened, then check_suite.rerequested after a user clicks "Re-run"). Each deserves a fresh Check Run.
Two deliveries for the same delivery ID are GitHub retrying the same event. We must produce the same checkRunId and not launch a duplicate run.

The previous repo+SHA-based lookup conflated these two cases — a legitimate rerequested event for the same commit would silently reuse the prior check and overwrite its conclusion, which is surprising behaviour for an industry-standard QA platform.

Implementation note — cross-dialect lookup via LIKE: The natural query here is WHERE json_extract(githubCheck, '$.deliveryId') = ?, but json_extract() is SQLite-specific and the Postgres adapter (backend/src/database/adapters/postgres-adapter.js) has no translation rule for it — so the query would crash on every webhook retry on Postgres deployments. We instead use a LIKE pre-filter against the serialized JSON column (matching the established pattern in backend/src/database/repositories/activityRepo.js:206, which also avoids json_extract for the same reason) and verify the parsed deliveryId field in JS. The pre-filter narrows long-lived PRs to a tiny candidate set; the JS-side check is the source of truth and rejects accidental substring matches in unrelated JSON fields.

Parameters:

Name	Type	Description
`projectId`	string
`deliveryId`	string	— value of the `X-GitHub-Delivery` header.

Source:

database/repositories/runRepo.js, line 263

Returns:

Type: Object | undefined

(static) findLatestResultForTest(testId) → {Object|null}

Find the most recent non-deleted run result for a specific test ID.

Uses a LIKE pre-filter on the JSON results column to narrow down candidate rows, then parses and searches in JS. Only selects id, startedAt, results to avoid deserializing heavy columns.

Parameters:

Name	Type	Description
`testId`	string	— e.g. "TC-1"

Source:

database/repositories/runRepo.js, line 1039

Returns:

The matching result object with runId, or null.

Type: Object | null

(static) getAllWithResults() → {Array.<Object>}

Get all non-deleted runs with results + feedbackLoop columns (for failure/analytics). Prefer getWithResultsByProjectIds for workspace-scoped queries.

Source:

database/repositories/runRepo.js, line 175

Returns:

Type: Array.<Object>

(static) getById(id) → {Object|undefined}

Get a non-deleted run by ID. Hydrates run.logs from the run_logs table (ENH-008).

Parameters:

Name	Type	Description
`id`	string

Source:

database/repositories/runRepo.js, line 359

Returns:

Type: Object | undefined

(static) getByIdIncludeDeleted(id) → {Object|undefined}

Get a run by ID including soft-deleted (for restore and abort operations). Hydrates run.logs from the run_logs table (ENH-008).

Parameters:

Name	Type	Description
`id`	string

Source:

database/repositories/runRepo.js, line 382

Returns:

Type: Object | undefined

(static) getByProjectId(projectId) → {Array.<Object>}

Get non-deleted runs for a specific project, sorted by startedAt descending.

Parameters:

Name	Type	Description
`projectId`	string

Source:

database/repositories/runRepo.js, line 217

Returns:

Type: Array.<Object>

(static) getByProjectIdPaged(projectId, pageopt, pageSizeopt) → {Object}

Get non-deleted runs for a project with lean columns, paginated.

Parameters:

Name	Type	Attributes	Default
`projectId`	string
`page`	number \| string	<optional>	1
`pageSize`	number \| string	<optional>	DEFAULT_PAGE_SIZE

Source:

database/repositories/runRepo.js, line 341

Returns:

Type: Object

(static) getDeletedAll() → {Array.<Object>}

Get all soft-deleted runs.

Source:

database/repositories/runRepo.js, line 1090

Returns:

Type: Array.<Object>

(static) getDeletedByProjectId(projectId) → {Array.<Object>}

Get soft-deleted runs for a project.

Parameters:

Name	Type	Description
`projectId`	string

Source:

database/repositories/runRepo.js, line 1079

Returns:

Type: Array.<Object>

(static) getLatestCrawlWithChangedPages(projectId) → {Object|null}

AUTO-001: Lean accessor for the most recent crawl run that produced a non-empty changedPages[] summary. Used by the risk-scoring path in routes/runs.js and routes/trigger.js so a project with hundreds of historical runs doesn't have to deserialize every row's heavy JSON columns just to read the latest crawl's diff payload. Selects only id, startedAt, changedPages and bounds the SQL with LIMIT 1.

Returns the parsed { id, startedAt, changedPages } shape or null when no crawl has produced a non-empty changedPages array yet.

Parameters:

Name	Type	Description
`projectId`	string

Source:

database/repositories/runRepo.js, line 483

Returns:

Type: Object | null

(static) getLatestDiscoveredPageUrls(projectId) → {Array.<string>}

Lean accessor for the recorder's Start-URL dropdown (GET /api/v1/projects/:id/pages). Returns only the URLs persisted on the most recent crawl or recorder run that has a non-empty pages JSON array, deduplicated and oldest-first within the run.

Avoids the heavy SELECT * + per-row rowToRun() JSON parse fan-out of getByProjectId — the dropdown is fetched on every recorder modal open, so loading every run's results / tests / promptAudit / qualityAnalytics blob just to read one column would scale poorly on projects with long run history. We LIMIT 1 to the most recent qualifying run since the previous in-process implementation already only used the latest match.

Parameters:

Name	Type	Description
`projectId`	string

Source:

database/repositories/runRepo.js, line 304

Returns:

URLs from the latest crawl/record run, or [] when no run has discovered pages yet.

Type: Array.<string>

(static) getRecentCompletedWithResults(projectId, limitopt) → {Array.<Object>}

Get recent completed test runs with only the columns needed for flaky score computation. Returns at most limit rows, sorted newest-first, with only id, type, status, startedAt, and results — avoiding the heavy JSON blobs (testQueue, generateInput, promptAudit, qualityAnalytics).

Parameters:

Name	Type	Attributes	Default	Description
`projectId`	string
`limit`	number	<optional>	20

Source:

database/repositories/runRepo.js, line 407

Returns:

Type: Array.<Object>

(static) getRecentTestRunsForGithubBase(projectId, limitopt) → {Array.<Object>}

INT-002: Lean accessor for recent completed test runs that posted a GitHub Check Run. Used by concludeGithubCheck to find a green base run for regressed-test diff rendering without loading every project run's heavy JSON columns (testQueue, promptAudit, qualityAnalytics, …).

Selects only the columns findGreenBaseRun reads (backend/src/utils/runResultFormatters.js):

id, type, status, failed — eligibility filtering
githubCheck — repo/sha match
results — green-test set for the diff

The lookback is bounded by limit (default 25, matching BASE_LOOKBACK_RUNS in the formatter) so a project with thousands of historical runs doesn't trigger an O(n) deserialize on every check completion.

Parameters:

Name	Type	Attributes	Default	Description
`projectId`	string
`limit`	number	<optional>	25

Source:

database/repositories/runRepo.js, line 447

Returns:

Newest-first, parsed githubCheck + results.

Type: Array.<Object>

(static) getWithResultsByProjectIds(projectIds) → {Array.<Object>}

Get non-deleted runs with results + feedbackLoop for a set of project IDs. Workspace-scoped alternative to getAllWithResults — queries only the rows belonging to the given projects instead of loading the entire table.

Parameters:

Name	Type	Description
`projectIds`	Array.<string>

Source:

database/repositories/runRepo.js, line 188

Returns:

Type: Array.<Object>

(static) hardDeleteById(id)

Hard-delete a run by ID (permanent — use only for purge operations). Also purges all associated log rows from run_logs.

Parameters:

Name	Type	Description
`id`	string

Source:

database/repositories/runRepo.js, line 1102

(static) hardDeleteByProjectId(projectId) → {Array.<string>}

Hard-delete all runs for a project (permanent — for project purge). Also purges all associated log rows from run_logs.

Parameters:

Name	Type	Description
`projectId`	string

Source:

database/repositories/runRepo.js, line 1019

Returns:

IDs of all deleted runs.

Type: Array.<string>

(static) incrementRunStats(runId, deltas) → {number}

CAP-002 Phase 2 — Atomic accumulator for per-shard stats deltas. Composes sibling-shard contributions onto the same parent runs row without the read-modify-write race that runRepo.save(run) would lose under N concurrent workers. Sibling of appendRunResults / incrementShardsCompleted; same single-statement-UPDATE concurrency contract (better-sqlite3 journal lock on SQLite, row-level lock on Postgres).

Cross-dialect: COALESCE(x, 0) + ? is spelled identically on SQLite and Postgres — no adapter translation required. total is mutable because data-driven tests expand N input rows into K iterations at execution time; each shard's totalDelta = (executedIterations - originalSliceSize) so the parent run's total reflects what actually dispatched rather than what was originally queued.

Parameters:

Name Type Description

runId string

deltas

Object

Properties

Name	Type	Attributes
`passedDelta`	number	<optional>
`failedDelta`	number	<optional>
`totalDelta`	number	<optional>

Source:

database/repositories/runRepo.js, line 795

Returns:

1 if a row was updated, 0 otherwise.

Type: number

(static) incrementShardsCompleted(runId) → {number}

CAP-002 Phase 2 — Atomic increment of shardsCompleted, capped at shardCount so a buggy double-fire (e.g. coordinator + worker both marking a shard done) can't push the counter past the total. Sibling of appendRunResults; same row-lock-per-UPDATE concurrency contract.

Cross-dialect: CASE WHEN … THEN x + 1 ELSE x END works identically on SQLite and Postgres. Avoids MIN() (aggregate-only on Postgres) and LEAST() (not in SQLite). COALESCE handles pre-migration rows where shardsCompleted may be NULL.

Parameters:

Name	Type	Description
`runId`	string

Source:

database/repositories/runRepo.js, line 632

Returns:

1 if a row was updated, 0 otherwise (run not found or already at the cap).

Type: number

(static) markOrphansInterrupted() → {number}

Mark all "running" non-deleted runs as "interrupted" (orphan recovery on startup).

Source:

database/repositories/runRepo.js, line 1061

Returns:

Number of runs marked.

Type: number

(static) markRunCompletedFirstWriterWins(runId, fields) → {boolean}

CAP-002 Phase 2 — Mark a run completed with first-writer-wins semantics. The sibling primitive of markRunFailedFirstWriterWins: the WHERE id = ? AND status = 'running' predicate makes any caller arriving after the run has already transitioned terminal (e.g. the user clicked Abort between the finalizer's getById snapshot and its status-transition UPDATE) a clean no-op. Without this, the late finalizer would overwrite an aborted status with completed, silently masking the user's cancellation.

The race is real but tiny: shard N's incrementShardsCompleted returns 1 → getById reads status: "running" → user aborts (route writes status: "aborted") → finalizer continues, calls finalizeRunIfNotAborted which only looks at the in-memory run object (not the live DB) and proceeds to write status: "completed". The atomic predicate here is the DB-side enforcement that catches the mid-flight abort even when the in-memory snapshot is stale.

Sibling-shard concurrency contract: same single-statement-UPDATE row-lock guarantee as the rest of the CAP-002 primitives. Sets finishedAt and duration atomically with the status flip so a consumer reading the row never sees status: "completed" with a finishedAt: NULL interim state.

Parameters:

Name Type Description

runId string

fields

Object

Properties

Name	Type	Attributes	Description
`finishedAt`	string	<optional>	ISO timestamp; defaults to `datetime('now')` in SQL.
`duration`	number	<optional>	Wall-clock duration in ms.
`qualityAnalytics`	Object	<optional>	Feedback-loop output (JSON-stringified before persist).

Source:

database/repositories/runRepo.js, line 751

Returns:

true when this caller actually performed the update (i.e. it was the first writer); false when the row was already terminal (an abort beat the finalizer, or a sibling already finalized — defence-in-depth even though incrementShardsCompleted guarantees exactly one finalizer).

Type: boolean

(static) markRunFailedFirstWriterWins(runId, fields) → {boolean}

CAP-002 Phase 2 (Prerequisite #6) — Mark a run failed with first-writer- wins semantics. When N shard jobs crash near-simultaneously, only the first one to land its UPDATE actually writes the failure reason; the predicate WHERE id = ? AND status = 'running' makes every subsequent caller a no-op. Without this, a later shard's classified error would overwrite the first (and arguably most-informative) failure message, and the audit trail would lose track of which shard actually caused the cascade.

Sibling of appendRunResults / incrementShardsCompleted — same single-statement-UPDATE concurrency contract (better-sqlite3 journal lock on SQLite, row-level lock on Postgres). shardsCompleted is deliberately NOT touched here: a partial completion (< shardCount) is the correct surface for the failure badge, and any successful shards that landed before the crash have already bumped the counter via incrementShardsCompleted.

Parameters:

Name Type Description

runId string

fields

Object

Properties

Name	Type	Attributes	Description
`error`	string		Classified error message.
`errorCategory`	string	<optional>	Error category from `errorClassifier.js`.

Source:

database/repositories/runRepo.js, line 694

Returns:

true when this caller actually performed the update (i.e. it was the first writer); false when the row was already terminal (a sibling shard beat us to it, or the run was aborted).

Type: boolean

(static) purgeShardResults(runId, shardIndex, isNonExecutedSkip) → {Object}

CAP-002 Phase 2 — Atomic shard-scoped results purge for the BullMQ retry path. Reads the live results column under a dialect-appropriate lock, filters out rows belonging to shardIndex (keeping non-executed skips and sibling-shard rows verbatim), and writes the filtered array back — all inside a single transaction so concurrent sibling-shard appendRunResults calls cannot lose writes through this path.

Why this exists: the worker's non-final-attempt retry block previously called runRepo.save(run) after filtering the in-memory results, which was a full-column overwrite. Sibling shards atomically appending to the same column between this shard's job-start snapshot and the save would be silently truncated. This primitive replaces the read-modify-write on a stale snapshot with a transactional read-modify-write on the live column — guaranteed not to lose sibling-shard rows.

Shape contract — survivors are kept when ANY of:

isNonExecutedSkip(r) returns true (over_budget / skipped_no_impact)
r._shardIndex != null && r._shardIndex !== shardIndex (sibling row)

Returns the recomputed { passed, failed } aggregate from survivors so the caller can update its in-memory mirror and persist via the same transactional write. Single-shard / legacy callers (shardIndex == null) must NOT call this — they keep using runRepo.save(run) which is bit-for-bit unchanged.

Parameters:

Name	Type	Description
`runId`	string
`shardIndex`	number	0-based shard index whose rows to purge.
`isNonExecutedSkip`	function	Predicate from `utils/skipReasons.js` passed in to avoid a runtime import cycle (this module is loaded very early; `skipReasons.js` is a tiny utility but importing it here would pull every result-processing helper through the dependency graph).

Source:

database/repositories/runRepo.js, line 888

Returns:

re-derived from survivors.

Type: Object

(static) resetRunForRetryIfStillRunning(runId) → {boolean}

CAP-002 Phase 2 — Reset run-level transient fields for a BullMQ retry, guarded by status = 'running' so a sibling shard that already transitioned the row to a terminal state (failed via markRunFailedFirstWriterWins, aborted via the abort route, or completed via markRunCompletedFirstWriterWins) is not silently un-terminalised by the retry path.

Sibling of markRunFailedFirstWriterWins / markRunCompletedFirstWriterWins — same single-statement-UPDATE row-lock concurrency contract and same WHERE status = 'running' predicate so a TOCTOU race between check-and-write is impossible.

Returns the same boolean shape as the other first-writer-wins primitives so callers can use it as a gate: when it returns false, the run is already terminal and the caller MUST skip any further destructive cleanup (e.g. purgeShardResults, runLogRepo.deleteByRunId) — those would wipe rows belonging to a sibling shard's successful completion or to the audit trail of a user-initiated abort.

Note: status = 'running' is the expected current value (the retry keeps the row running while a fresh attempt executes), so this is an identity update on the status column. The other fields (error, errorCategory, finishedAt, pagesFound) are reset so the retry starts clean. results is NOT touched here — shard-mode callers handle that via purgeShardResults after this primitive returns true.

Parameters:

Name	Type	Description
`runId`	string

Source:

database/repositories/runRepo.js, line 944

Returns:

true when the row was still running and was reset; false when the row had already transitioned terminal (caller must abort the retry-prep sequence — see runWorker.js non-final-attempt block for the canonical use site).

Type: boolean

(static) restore(id) → {boolean}

Restore a soft-deleted run (clears deletedAt).

Parameters:

Name	Type	Description
`id`	string

Source:

database/repositories/runRepo.js, line 1113

Returns:

Whether the run was found and restored.

Type: boolean

(static) restoreByProjectIdAfter(projectId, deletedAfter) → {number}

Restore soft-deleted runs for a project that were deleted at or after a given timestamp. Used by project cascade-restore to avoid restoring items that were individually deleted before the project.

Parameters:

Name	Type	Description
`projectId`	string
`deletedAfter`	string	— ISO timestamp (inclusive lower bound).

Source:

database/repositories/runRepo.js, line 1127

Returns:

Number of runs restored.

Type: number

(static) save(run)

Save the entire run object (upsert-style update of all known columns). Used by pipeline code that mutates the run in-memory and then flushes.

Pipeline code accumulates non-column properties on the run object (e.g. snapshots, pages, testsGenerated). These are filtered out so the generated SQL only references actual table columns.

Parameters:

Name	Type	Description
`run`	Object	— Full run object with `id`.

Source:

database/repositories/runRepo.js, line 969

(static) setShardTracePath(runId, shardIndex, path)

CAP-002 Phase 2 — Atomically write a per-shard trace path into the sparse tracePaths JSON array. Each shard writes its own slot exactly once at trace-flush time, so contention is bounded to N writes total per run (vs the O(tests) contention of result writes). The transaction wrapper covers the read+write so a concurrent sibling shard's update to a different slot can't be lost.

Cross-dialect note: SQLite's json_set and Postgres' jsonb_set have incompatible signatures and the postgres-adapter doesn't bridge either, so we use a portable SELECT+UPDATE inside db.transaction(...). SQLite transaction() uses BEGIN IMMEDIATE which serializes against other write transactions; Postgres uses row-level locks via the implicit row lock on UPDATE within the transaction. Either way, two shards writing different indices on the same row are linearized.

Parameters:

Name	Type	Description
`runId`	string
`shardIndex`	number	0-based shard index.
`path`	string	Public artifact URL for this shard's trace.

Source:

database/repositories/runRepo.js, line 829

(static) update(id, fields)

Update specific fields on a run (full replacement of provided fields). Unknown properties (not in the runs table) are silently skipped.

Parameters:

Name	Type	Description
`id`	string
`fields`	Object

Source:

database/repositories/runRepo.js, line 537

(inner) parseLeanJson(row) → {Object}

Parse the lightweight JSON columns (feedbackLoop, pipelineStats) on a lean row in-place. Both are small objects — safe to include in listing queries.

Parameters:

Name	Type	Description
`row`	Object

Source:

database/repositories/runRepo.js, line 137

Returns:

The same row with JSON columns deserialized.

Type: Object

(inner) parseResultsAndLean(row) → {Object}

Parse results JSON + lean JSON columns on a row. Shared by getAllWithResults and getWithResultsByProjectIds.

Parameters:

Name	Type	Description
`row`	Object

Source:

database/repositories/runRepo.js, line 203

Returns:

Type: Object