Budget-cap recovery
When your monthly token budget runs out, requests return 429 budget_exceeded. There is no automatic refill mid-month — the cap resets at the 1st (UTC).
curl
curl -i "$ACS_API_BASE/completions" \
-H "Authorization: Bearer $ACS_API_KEY" -H "Content-Type: application/json" \
-d '{"model": "llama-8b", "prompt": "Hi", "max_tokens": 8}'
# HTTP/1.1 429 Too Many Requests
# {"error": {"code": "budget_exceeded", "message": "Monthly token budget exhausted.", ...}}
Python
from openai import APIStatusError
try:
resp = client.completions.create(model="llama-8b", prompt="Hi", max_tokens=8)
except APIStatusError as e:
# The wrapper nests the error code under body["error"]["code"], not at
# the top level — so e.code (which the openai SDK reads from
# body["code"]) is None here. Dig into e.body instead.
err = (e.body or {}).get("error", {}) if isinstance(e.body, dict) else {}
if e.status_code == 429 and err.get("code") == "budget_exceeded":
# Don't retry — the cap is sticky until the 1st of next month UTC.
# Email base-models@acsresearch.org to raise the budget mid-month.
raise SystemExit("Budget exhausted — emailing the team for a raise.")
raise
Gotcha
429 budget_exceeded is not a transient rate-limit — exponential backoff just burns time. Inspect error.code and short-circuit non-retryable cases (concurrency limits, by contrast, simply queue).