Rate limits are per key, not per user or per workspace. Each key has its own counter, so minting two keys in the same workspace gives each its own budget. The goal: don’t let one runaway script slow everyone else down.

Default policy

Every key starts with these defaults:

| Setting | Value |
| --- | --- |
| Window | 1 minute |
| Max requests | 600 (10 rps sustained) |
| Refill | Counter resets when the window rolls |

All keys share the same 600/min default. Per-key overrides aren’t in the settings UI yet. Get in touch if your integration needs a higher cap.
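
To make the default policy concrete, here is a minimal sketch of a per-key fixed-window counter. It is a model for reasoning about the limits, not the service's actual implementation: the `FixedWindowLimiter` class and its `allow` method are hypothetical, and it assumes a key's window opens on that key's first request (the real window may be aligned differently).

```typescript
// Hypothetical model of a per-key fixed-window counter; not the service's code.
const WINDOW_MS = 60_000; // 1-minute window, from the defaults above
const MAX_REQUESTS = 600; // default cap per key

interface WindowState {
  windowStart: number; // ms timestamp at which the current window opened
  count: number; // requests counted in the current window
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();

  // Returns true if the request is allowed, false if it would receive a 429.
  allow(key: string, now: number): boolean {
    let state = this.windows.get(key);
    // Open a fresh window on first use, or once the previous one has rolled.
    if (state === undefined || now - state.windowStart >= WINDOW_MS) {
      state = { windowStart: now, count: 0 };
      this.windows.set(key, state);
    }
    if (state.count >= MAX_REQUESTS) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

Under this model a key can spend all 600 requests in a burst at the start of a window and is then blocked until the window rolls, while a second key in the same workspace is unaffected.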

When you hit the limit

The API responds with 429 Too Many Requests. The Retry-After header gives the wait in seconds, and the body looks like this:
{
  "error": "Too many requests",
  "code": "rate_limit_exceeded"
}
Wait the indicated number of seconds, then retry. Add a bit of jitter so a fleet of workers doesn’t thundering-herd the recovery window:
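// Retries on 429 up to 4 times, waiting for Retry-After plus random jitter.
async function callWithRetry(url: string, init: RequestInit, attempt = 0): Promise<Response> {
  const response = await fetch(url, init);
  // Return anything that is not a 429, or give up after 4 retries.
  if (response.status !== 429 || attempt >= 4) {
    return response;
  }
  // Retry-After is in seconds; fall back to 1 second if it is missing or unparseable.
  const parsed = Number(response.headers.get("retry-after") ?? "1");
  const retryAfter = Number.isFinite(parsed) && parsed >= 0 ? parsed : 1;
  // Up to 500 ms of jitter so workers don't all retry at the same instant.
  const jitter = Math.random() * 500;
  await new Promise((r) => setTimeout(r, retryAfter * 1000 + jitter));
  return callWithRetry(url, init, attempt + 1);
}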
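
If a proxy or gateway in front of the API can also return 429, it may help to check the body's `code` field before treating a response as a rate-limit error. The helper below is a small sketch: `isRateLimitError` and `RateLimitErrorBody` are hypothetical names, and the only fields it relies on are the two shown in the error body above.

```typescript
// Shape of the error body shown above.
interface RateLimitErrorBody {
  error: string;
  code: string;
}

// Hypothetical helper: true only for a 429 whose body carries the rate-limit code.
function isRateLimitError(status: number, body: unknown): body is RateLimitErrorBody {
  if (status !== 429 || typeof body !== "object" || body === null) {
    return false;
  }
  const candidate = body as Record<string, unknown>;
  return typeof candidate.error === "string" && candidate.code === "rate_limit_exceeded";
}
```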

Choosing a key budget

A few guidelines:
  • For an interactive integration like an internal tool or one-user script, leave the default alone. You won’t hit 600/min under normal use.
  • For background jobs and batch imports, pace the worker fleet so the combined rate of all workers sharing a key stays under 600/min. If you need a higher cap, get in touch.
  • For untrusted clients like browser extensions, public widgets, or embeds, mint a short-lived key per surface. Per-key isolation means a leaky client burns its own budget instead of starving the rest of the workspace.
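
For the batch-job case, one way to stay under the cap is to space requests out on the client instead of relying on 429s. The sketch below is one possible approach, not an official client: `Pacer`, `intervalForBudget`, and `paced` are hypothetical names. It only coordinates callers inside a single process, so a fleet spread across several processes would need to give each worker a share of the budget.

```typescript
// Minimum spacing in ms between requests for a given per-minute budget.
function intervalForBudget(requestsPerMinute: number): number {
  return Math.ceil(60_000 / requestsPerMinute);
}

// Hypothetical client-side pacer: hands out send slots at a fixed interval.
class Pacer {
  private readonly intervalMs: number;
  private nextSlot = 0; // earliest ms timestamp at which the next request may go out

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
  }

  // Reserves the next free slot and returns how long the caller must wait for it.
  reserve(now: number): number {
    const slot = Math.max(now, this.nextSlot);
    this.nextSlot = slot + this.intervalMs;
    return slot - now;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs a task once its reserved slot arrives.
async function paced<T>(pacer: Pacer, task: () => Promise<T>): Promise<T> {
  const waitMs = pacer.reserve(Date.now());
  if (waitMs > 0) {
    await sleep(waitMs);
  }
  return task();
}
```

A worker could create `new Pacer(intervalForBudget(540))` to leave some headroom under the 600/min default, then wrap each request as `paced(pacer, () => fetch(url, init))`.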