Module: aiProvider

Multi-AI provider abstraction layer.

Supported providers

Provider          Key Env Variable     Model Env Variable    Default Model
Anthropic Claude  ANTHROPIC_API_KEY    ANTHROPIC_MODEL       claude-sonnet-4-20250514
OpenAI GPT        OPENAI_API_KEY       OPENAI_MODEL          gpt-4o-mini
Google Gemini     GOOGLE_API_KEY       GOOGLE_MODEL          gemini-2.5-flash
Ollama (local)    AI_PROVIDER=local    OLLAMA_MODEL          mistral:7b

Detection order: Runtime override (header dropdown) → AI_PROVIDER env var → auto-detect (Anthropic → OpenAI → Google → Ollama).
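
A rough sketch of that resolution order (illustrative only; runtimeOverride and the individual checks are assumptions about the internals of detectProvider()):

    // Simplified view of the detection order described above.
    function resolveProvider() {
      if (runtimeOverride) return runtimeOverride;                  // 1. header dropdown (setActiveProvider)
      if (process.env.AI_PROVIDER) return process.env.AI_PROVIDER;  // 2. explicit env override
      if (process.env.ANTHROPIC_API_KEY) return 'anthropic';        // 3. auto-detect, in order
      if (process.env.OPENAI_API_KEY) return 'openai';
      if (process.env.GOOGLE_API_KEY) return 'google';
      if (hasOllamaConfig()) return 'local';
      return null;
    }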

Exports

  • generateText — Single-shot text generation.
  • streamText — Token-streaming text generation (Anthropic/OpenAI; fallback for others).
  • parseJSON — Parse AI response text as JSON (strips markdown fences).
  • getProvider, hasProvider, isLocalProvider, isProviderDegraded, getProviderName, getProviderMeta — Provider detection.
  • setRuntimeKey, setRuntimeOllama, setActiveProvider — Runtime configuration (Settings page).
  • getConfiguredKeys — Masked key status for the Settings UI.
  • getSupportedProviders — All provider names/models for the UI (derived from runtime config).
  • checkOllamaConnection — Ollama connectivity check.
  • loadKeysFromDatabase — Restore all persisted keys from DB into the runtime cache (called at startup).
  • isRateLimitError, isTransientServerError — Error classification for retry and fallback decisions.

Members

(inner, constant) circuitBreakers :Object.<string, {failures: number, disabledUntil: number}>

Type:
  • Object.<string, {failures: number, disabledUntil: number}>

Methods

(static) checkOllamaConnection() → {Promise.<Object>}

Check Ollama server connectivity and verify the configured model is available.

Returns:
  • Promise.<Object>: Resolves to {ok: boolean, model?: string, baseUrl?: string, availableModels?: string[], error?: string}.
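
Example usage (the require path is an assumption; the result fields follow the shape above):

    const { checkOllamaConnection } = require('./aiProvider'); // path assumed

    async function verifyOllama() {
      const status = await checkOllamaConnection();
      if (status.ok) {
        console.log(`Ollama reachable at ${status.baseUrl}, model ${status.model}`);
      } else {
        console.error(`Ollama check failed: ${status.error}`);
        console.error('Available models:', status.availableModels || []);
      }
    }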

(static) generateText(prompt, [options]) → {Promise.<string>}

Generate text from an AI provider (single-shot, non-streaming). Automatically detects the active provider and routes the request.

FEA-003: On rate-limit errors, automatically falls back to the next configured provider in CLOUD_DETECT_ORDER before giving up. Each provider has a circuit breaker that disables it for 5 minutes after a rate-limit failure that survived all internal retries.

Parameters:
  • prompt (string | Object): Plain string or structured { system, user } messages.
  • options (Object, optional):
      • maxTokens (number, optional): Max output tokens (default 16384).
      • signal (AbortSignal, optional): Abort signal for cancellation.

Throws:
  • Error: If no AI provider is configured or all providers fail.

Returns:
  • Promise.<string>: The generated text response.
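
Example usage (illustrative; the require path and prompt contents are assumptions):

    const { generateText } = require('./aiProvider'); // path assumed

    async function summarise(text) {
      const controller = new AbortController();
      // Structured { system, user } form; a plain string prompt also works.
      return generateText(
        { system: 'Answer in one short paragraph.', user: text },
        { maxTokens: 512, signal: controller.signal }
      );
    }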

(static) getConfiguredKeys() → {Object}

Returns masked API keys and Ollama config for the Settings UI. Never returns full keys — only masked versions for display.

Returns:
  • Object

(static) getProvider() → {string|null}

Returns:
  • string | null: Current provider ID ("anthropic", "openai", "google", "local"), or null.

(static) getProviderMeta() → {Object|null}

Returns:
  • Object | null: Full provider metadata, or null.

(static) getProviderName() → {string}

Returns:
  • string: Human-readable provider name (e.g. "Claude Sonnet"), or "No provider configured".

(static) getSupportedProviders() → {Array.<{id: string, name: string, model: string, docsUrl: string}>}

Returns the list of all supported providers with current names/models. Derives from buildProviderMeta() so model names stay in sync with what's actually used in API calls. Consumed by GET /api/config.

Returns:
  • Array.<{id: string, name: string, model: string, docsUrl: string}>

(static) hasProvider() → {boolean}

Returns:
  • boolean: true if any AI provider is configured.

(static) isLocalProvider() → {boolean}

Returns:
  • boolean: true if the active provider is Ollama (local).

(static) isProviderDegraded() → {boolean}

true when the AI provider is operating in a degraded state — either a sticky fallback is active (primary was rate-limited) or the primary provider's circuit breaker is open. Used by the feedback loop to skip expensive AI calls that would block run completion.

Returns:
  • boolean

(static) isRateLimitError(err) → {boolean}

Detect whether an error is a rate limit / quota exhaustion from any AI provider. Used internally for retry decisions and exported for the pipeline to detect rate limits that survived all retries.

Parameters:
  • err (Error)

Returns:
  • boolean

(static) isTransientServerError(err) → {boolean}

Detect transient server-side failures that warrant retry + provider fallback but aren't rate limits. Common examples:

  • Google Gemini 503 "This model is currently experiencing high demand"
  • Anthropic 503 "overloaded_error" (already matches isRateLimitError via /overloaded/)
  • OpenAI 500/502/504 transient backend errors
  • Provider-specific HTTP 5xx with "high demand", "service unavailable", "try again later"

Distinct from isRateLimitError — these are not quota issues, they're temporary server outages. We retry with backoff and fall back to other providers, but don't trip the per-provider rate-limit circuit breaker (the provider's key is fine; the backend is struggling).

Parameters:
  • err (Error)

Returns:
  • boolean
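
Together the two predicates drive retry decisions. A sketch of how a caller might combine them (the module's internal retry/backoff loop is not shown in this doc and may differ):

    async function callWithRetry(fn, attempts = 3) {
      for (let i = 0; i < attempts; i += 1) {
        try {
          return await fn();
        } catch (err) {
          const retryable = isRateLimitError(err) || isTransientServerError(err);
          if (!retryable || i === attempts - 1) throw err;
          await new Promise((r) => setTimeout(r, 2 ** i * 1000)); // simple exponential backoff
        }
      }
    }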

(static) loadKeysFromDatabase() → {number}

Restore all persisted API keys and Ollama config from the database into the runtime cache. Called once at server startup after the DB is initialised.

Keys stored in the DB take precedence over the default detection logic only when no matching env var is already set — env vars remain the canonical override so Docker / K8s deployments are unaffected.

Returns:
  • number: The number of providers successfully loaded from the database.

(static) parseJSON(text) → {Object}

Parse AI response text as JSON. Strips markdown code fences if present.

Parameters:
  • text (string): Raw AI response text.

Throws:
  • SyntaxError: If the text is not valid JSON after cleanup.

Returns:
  • Object: Parsed JSON object.
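
Example usage (illustrative; assumes the model was asked to reply with JSON):

    const { generateText, parseJSON } = require('./aiProvider'); // path assumed

    async function extractTags(text) {
      const raw = await generateText('Reply with {"tags": [...]} as JSON for:\n' + text);
      // Strips any markdown fences around the payload; throws SyntaxError otherwise.
      return parseJSON(raw).tags;
    }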

(static) setActiveProvider(provider)

Override the active provider selection (used by the quick-switch dropdown). The provider must already have a valid key/config — this does not set any key.

Parameters:
  • provider (string | null): Provider ID to pin, or null to resume auto-detect.

(static) setRuntimeKey(provider, key)

Set an AI provider API key at runtime (via Settings page). Persists the key to the database so it survives server restarts. Pass an empty string to clear the key both in-memory and in the DB.

Parameters:
  • provider (string): "anthropic" | "openai" | "google".
  • key (string): The API key string, or "" to deactivate.

(static) setRuntimeOllama([opts])

Configure Ollama runtime settings (via Settings page). Persists the config to the database so it survives server restarts.

Parameters:
  • opts (Object, optional):
      • baseUrl (string, optional): Ollama server URL.
      • model (string, optional): Model name (e.g. "mistral:7b").
      • disabled (boolean, optional): Set true to deactivate Ollama.
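
How the Settings page might drive these calls (the key value and URL are placeholders):

    const { setRuntimeKey, setRuntimeOllama } = require('./aiProvider'); // path assumed

    // Store a cloud key; persisted to the DB so it survives restarts.
    setRuntimeKey('openai', 'sk-placeholder-key');

    // Point Ollama at a non-default server and model.
    setRuntimeOllama({ baseUrl: 'http://localhost:11434', model: 'mistral:7b' });

    // Clearing: an empty string removes a key; disabled: true deactivates Ollama.
    setRuntimeKey('openai', '');
    setRuntimeOllama({ disabled: true });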

(static) streamText(promptOrMessages, onToken, [options]) → {Promise.<string>}

Token-streaming variant of generateText. Calls onToken(string) for each token as it arrives. Returns the full accumulated text when the stream completes.

Error handling

If the streaming call fails with a retryable error (rate limit or transient 5xx) BEFORE any tokens are emitted, we transparently retry via generateText() — which applies the full FEA-003 retry + fallback chain and emits the full response as a single synthetic "token". Once tokens have started flowing we can't safely fall back (the user would see two partial responses), so mid-stream failures propagate as-is.

Google and Ollama providers never start a real stream — they always delegate to generateText() (their SDKs don't support incremental streaming from this codebase), so they get fallback for free.

Parameters:
  • promptOrMessages (string | Object): Plain string or structured messages.
  • onToken (function): Callback invoked for each token.
  • options (Object, optional):
      • maxTokens (number, optional): Max output tokens.
      • signal (AbortSignal, optional): Abort signal for cancellation.

Throws:
  • Error: If no AI provider is configured.

Returns:
  • Promise.<string>: The full accumulated response text.
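
Example usage (illustrative):

    const { streamText } = require('./aiProvider'); // path assumed

    async function streamToStdout(prompt) {
      const full = await streamText(prompt, (token) => process.stdout.write(token), {
        maxTokens: 1024,
      });
      // After a pre-stream fallback, the whole reply may arrive as one synthetic token.
      return full;
    }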

(inner) composeSignal(external, timeoutMs) → {Object}

Compose an AbortSignal that fires on EITHER the external signal (user abort) OR a per-call timeout — whichever comes first. Returns the composite signal and a cleanup function that MUST be called in a finally block to prevent the timeout from leaking if the call completes before the deadline.

Parameters:
  • external (AbortSignal | undefined): Signal from runWithAbort (user abort).
  • timeoutMs (number): Per-call deadline.

Returns:
  • Object: { signal: AbortSignal, cleanup: Function }
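
A sketch of the pattern this helper implements, using the standard AbortController API (the module's actual implementation may differ):

    function composeSignalSketch(external, timeoutMs) {
      const controller = new AbortController();
      const onAbort = () => controller.abort(external.reason);
      if (external) {
        if (external.aborted) controller.abort(external.reason);
        else external.addEventListener('abort', onAbort, { once: true });
      }
      const timer = setTimeout(() => controller.abort(new Error('timeout')), timeoutMs);
      const cleanup = () => {
        clearTimeout(timer); // call from a finally block, or the timer leaks
        if (external) external.removeEventListener('abort', onAbort);
      };
      return { signal: controller.signal, cleanup };
    }

Callers pass signal to the provider SDK call and invoke cleanup() in a finally block once the call settles.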

(inner) getFallbackProviders(primaryProvider) → {Array.<string>}

FEA-003: Get the ordered list of fallback providers to try when the primary provider hits a rate limit or transient error.

Same-tier only — cloud primary falls back to other cloud providers; local primary has no fallback. This prevents cross-tier mismatches where a prompt built for cloud (~1600 chars, 128K context assumed) gets delivered to Ollama (4K context, needs >120s to process) and hits the chat timeout. Ollama is never a cross-tier rescue — the prompt shape, context window, and response latency are too different.

To use Ollama as a primary, set AI_PROVIDER=local or pick it from the provider dropdown — detectProvider() will route all calls to Ollama with the correct tier-specific prompt.

Parameters:
  • primaryProvider (string): The provider that failed.

Returns:
  • Array.<string>: Ordered list of same-tier fallback provider IDs.
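
An illustrative reduction of the same-tier rule (CLOUD_DETECT_ORDER is named in the generateText notes; the filtering details are assumptions):

    const CLOUD_DETECT_ORDER = ['anthropic', 'openai', 'google'];

    function fallbacksFor(primaryProvider) {
      if (primaryProvider === 'local') return []; // local primary: no fallback
      return CLOUD_DETECT_ORDER.filter(
        (p) => p !== primaryProvider && isProviderUsable(p)
      );
    }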

(inner) getUserConfiguredKey(envName) → {string}

Get a user-configured key WITHOUT the demo fallback. Used by getConfiguredKeys() so BYOK detection is accurate.

Parameters:
  • envName (string)

Returns:
  • string

(inner) hasOllamaConfig()

True if Ollama has any config (runtime or env) hinting it should be auto-detected.


(inner) isCircuitBreakerOpen(provider) → {boolean}

Check whether a provider's circuit breaker is open (disabled).

Parameters:
  • provider (string)

Returns:
  • boolean: true if the provider is temporarily disabled.

(inner) isProviderUsable(provider) → {boolean}

Check whether a provider is usable right now (has a key or, for Ollama, is not disabled). Single source of truth — used by detectProvider, the quick-switch override, and the forced-env path.

Parameters:
  • provider (string)

Returns:
  • boolean

(inner) isRetryableError()

True if the error should be retried — either a rate limit (quota issue) or a transient server error (provider outage).


(inner) recordProviderFailure(provider)

Record a rate-limit failure for a provider. If the threshold is reached, the provider is disabled for CIRCUIT_BREAKER_COOLDOWN_MS.

Parameters:
  • provider (string)

(inner) recordProviderSuccess(provider)

Record a successful call — resets the failure counter.

Parameters:
  • provider (string)
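
Taken together with circuitBreakers, isCircuitBreakerOpen, and recordProviderSuccess, the breaker behaves roughly like this sketch (the failure threshold is an assumption; the 5-minute cooldown comes from the generateText notes):

    const CIRCUIT_BREAKER_COOLDOWN_MS = 5 * 60 * 1000; // per the generateText docs
    const FAILURE_THRESHOLD = 1; // assumption; the real threshold is not documented here

    const breakers = {}; // provider -> { failures, disabledUntil }

    function recordFailureSketch(provider) {
      const b = (breakers[provider] ??= { failures: 0, disabledUntil: 0 });
      b.failures += 1;
      if (b.failures >= FAILURE_THRESHOLD) {
        b.disabledUntil = Date.now() + CIRCUIT_BREAKER_COOLDOWN_MS;
      }
    }

    function recordSuccessSketch(provider) {
      if (breakers[provider]) breakers[provider].failures = 0;
    }

    function isOpenSketch(provider) {
      const b = breakers[provider];
      return !!b && Date.now() < b.disabledUntil;
    }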