
API error reference

Every Lido API call can fail in one of two layers: an HTTP-level error (the request itself didn't succeed) or a job-level error (the request succeeded, but the extraction job ran into trouble). This article is the canonical reference for both.



HTTP status codes


| Code | Meaning | Likely cause | What to do |
| --- | --- | --- | --- |
| 200 | OK | Request succeeded | Use the response body |
| 202 | Accepted | Workflow was triggered (Webhook Trigger, async mode) | No action — workflow is running |
| 400 | Bad Request | Malformed JSON, missing required field, invalid value | Inspect the response body for the validation message; fix the payload |
| 401 | Unauthorized | API key missing, invalid, or revoked | Check the Authorization: Bearer <key> header. Generate a new key if needed |
| 403 | Forbidden | API key valid but lacks permission for the resource | Check the key's scope; contact support if it should have access |
| 404 | Not Found | jobId doesn't exist or has expired (>24h) | If recent, double-check the ID. If older than 24h, results are gone — re-submit |
| 413 | Payload Too Large | File >50 MB sent via JSON+base64 | Switch to multipart upload (max 500 MB) |
| 422 | Unprocessable Entity | Configuration accepted but semantically invalid (e.g., column names with disallowed chars, malformed pageRange) | Read the error message; fix and retry |
| 429 | Too Many Requests | Rate limit exceeded (5 req / 30 s on submit) | Back off with exponential backoff; do not treat as fatal |
| 500 | Internal Server Error | Lido encountered an unexpected error | Retry with exponential backoff; if persistent, file a support ticket with the request details |
| 502 / 503 / 504 | Bad gateway / service unavailable / gateway timeout | Transient infrastructure issue | Retry with exponential backoff |
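A surprising share of 401s come down to the Authorization header being built without the Bearer prefix. A minimal sketch of constructing it correctly (the key value here is a placeholder, not a real Lido key format):

```python
def make_headers(api_key):
    """Return request headers for a Lido API call, with the Bearer prefix."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

# Placeholder key for illustration only.
headers = make_headers("your-api-key")
```

Reusing one helper like this keeps the prefix mistake from creeping into individual call sites.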



Job-level errors (returned by GET /job-result/{jobId})


Even when the HTTP request returns 200, the job itself can be in an error state. The response body's status field tells you which:


| Status | Meaning |
| --- | --- |
| running | Extraction in progress; poll again |
| complete | Done; data is in the data field |
| error | Extraction failed; see the error field |


When the status field is "error", the response includes an error object:


{
  "status": "error",
  "error": {
    "code": "EXTRACTION_FAILED",
    "message": "Could not parse the document. The file may be corrupted or password-protected.",
    "details": { ... }
  }
}
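A polling loop built on that status field might look like the sketch below. The fetch_result callable is a hypothetical wrapper around your own GET /job-result/{jobId} call, injected so the loop itself needs no network access and can be tested in isolation:

```python
import time

def poll_job(fetch_result, interval=2, max_wait=300):
    """Poll until the job completes or errors.

    fetch_result: zero-argument callable returning the parsed JSON body
    of GET /job-result/{jobId}. Raises on job error; returns the data on
    completion; gives up after max_wait seconds.
    """
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        body = fetch_result()
        if body["status"] == "complete":
            return body["data"]
        if body["status"] == "error":
            err = body["error"]
            raise RuntimeError(f'{err["code"]}: {err["message"]}')
        time.sleep(interval)  # status == "running": wait, then poll again
    raise TimeoutError("Job did not finish within max_wait seconds")
```

Capping the loop with max_wait also covers the "polling forever" mistake discussed further down.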


Common job-level error codes


| Code | Meaning | What to do |
| --- | --- | --- |
| EXTRACTION_FAILED | The AI couldn't process the document at all | Confirm the file isn't corrupted. Try opening it in a PDF viewer; if it's password-protected, decrypt it first |
| OCR_FAILED | OCR didn't return usable text from a scanned document | Check image quality; consider running OCR yourself first (e.g., via the OCRMYPDF formula) |
| INVALID_FILE_FORMAT | The file isn't a supported type | Supported: PDF, PNG, JPG, JPEG, HEIC, common Office formats. Convert if needed |
| FILE_TOO_LARGE | File exceeded the job-level processing size | Split the document, or process a page range |
| TIMEOUT | Extraction exceeded the maximum processing time | Try a smaller page range. Very long, dense documents (200+ pages) may need splitting |
| INVALID_CONFIG | Configuration was accepted at submit time but invalid for this document (e.g., page range outside the document) | Check pageRange against the actual page count |
| RATE_LIMITED_INTERNAL | Hit an upstream provider rate limit | Retry with exponential backoff |
| UNKNOWN_ERROR | Something else went wrong | File a support ticket with the jobId |


Failed extractions don't count toward your page allowance — only successful ones do.
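Going by the table above, only one of these codes is worth retrying automatically; the rest need a changed file or configuration first. A small lookup keeps that decision in one place (the set below is an inference from the table, not an official list published by the API):

```python
# Job-level error codes where re-submitting the same payload may succeed.
RETRYABLE_JOB_ERRORS = {"RATE_LIMITED_INTERNAL"}

def job_error_is_retryable(code):
    """True if an automatic retry of the same submission can help."""
    return code in RETRYABLE_JOB_ERRORS
```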



Validation errors (HTTP 400 / 422)


These come back with a structured response so you can fix the payload programmatically:


{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request payload",
    "details": [
      {"field": "columns", "message": "Required field missing"},
      {"field": "pageRange", "message": "Invalid range syntax: 'a-b'"}
    ]
  }
}


The details array tells you exactly which fields failed and why.
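One way to consume that array programmatically is to flatten it into a field-to-message mapping; a minimal sketch, assuming the response body is shaped as shown above:

```python
def validation_errors_by_field(body):
    """Flatten a 400/422 response body's details array into {field: message}."""
    return {d["field"]: d["message"] for d in body["error"]["details"]}
```

The resulting dict makes it easy to attach each message to the exact payload field that caused it.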



Rate-limit handling (HTTP 429)


The submit endpoint allows 5 requests per 30 seconds per API key.


When you hit the limit:


  1. Lido returns 429 Too Many Requests.
  2. Optionally, a Retry-After header tells you how long to wait (in seconds).
  3. Back off and retry. Don't retry immediately; you'll just hit the limit again.


Recommended retry logic:


import time
import requests

def submit_with_retry(payload, max_retries=5):
    # SUBMIT_URL and headers (including Authorization) are assumed to be
    # defined elsewhere.
    delay = 2
    for attempt in range(max_retries):
        r = requests.post(SUBMIT_URL, headers=headers, json=payload)
        if r.status_code == 429:
            # Honor Retry-After when the server sends it; otherwise fall
            # back to our own exponential delay.
            wait = int(r.headers.get("Retry-After", delay))
            time.sleep(wait)
            delay = min(delay * 2, 60)
            continue
        if 500 <= r.status_code < 600:
            time.sleep(delay)
            delay = min(delay * 2, 60)
            continue
        r.raise_for_status()  # surfaces non-retryable 4xx errors
        return r.json()
    raise RuntimeError("Exceeded max retries")



When NOT to retry


Don't retry these — they'll never succeed without a payload change:


  • 400 (bad request) — fix the payload
  • 401 (unauthorized) — fix the auth
  • 403 (forbidden) — fix permissions
  • 404 (not found) — the resource doesn't exist
  • 413 (file too large) — switch upload method or split the file
  • 422 (unprocessable) — fix the configuration


DO retry these (with backoff):


  • 429 (rate limited)
  • 500 / 502 / 503 / 504 (server errors)
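The two lists above collapse to a single predicate; keeping it in one function (rather than scattering status-code checks) makes the retry policy auditable. A minimal sketch:

```python
# HTTP statuses that can succeed on retry with backoff, per the lists above.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}

def is_retryable(status_code):
    """True only for statuses worth retrying; all 4xx except 429 are not."""
    return status_code in RETRYABLE_STATUSES
```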



Idempotency


Lido does not yet support an Idempotency-Key header. If you retry a submit because of a network error and don't know whether the original succeeded, you may end up with two extractions of the same file (counting toward your page allowance twice).


Mitigation:


  • On the client side, dedupe by content hash. Compute a SHA-256 of the file before submit; cache the resulting jobId. On retry, look up by hash before submitting again.
  • Don't retry within 30 seconds of a network failure — the original may still be in flight.
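The content-hash mitigation can be sketched as below. The submit callable is a hypothetical wrapper around the real submit endpoint, and the cache is in-memory for illustration; in production you would persist it so retries survive restarts:

```python
import hashlib

# Maps SHA-256 of file content -> jobId already obtained for that content.
_job_cache = {}

def file_fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of the file content."""
    return hashlib.sha256(data).hexdigest()

def submit_once(data: bytes, submit):
    """Submit each unique file content at most once.

    submit: callable taking the file bytes and returning a jobId.
    """
    key = file_fingerprint(data)
    if key not in _job_cache:
        _job_cache[key] = submit(data)
    return _job_cache[key]
```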



Logging recommendations


For every API call, log at minimum:


  • Timestamp
  • HTTP method, URL, status code
  • jobId (in the response body for submit; in the URL for poll)
  • Latency
  • For errors: full response body (redact API keys before logging)


This is the difference between "we have a problem somewhere" and "we have a problem with jobId xyz at 14:23:07 UTC."
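One way to standardize those fields is a single log-entry builder that also enforces the key-redaction rule from the Tips below; a sketch, with field names chosen here for illustration:

```python
import time

def log_entry(method, url, status, latency_ms, job_id=None, api_key=None):
    """Build a structured log record; never includes the full API key."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "method": method,
        "url": url,
        "status": status,
        "latency_ms": latency_ms,
        "jobId": job_id,
    }
    if api_key:
        entry["key_fingerprint"] = api_key[-4:]  # last 4 chars only
    return entry
```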



Tips


  • Capture jobId at submit time, even on apparent failure. Some 5xx responses still create a job.
  • Set HTTP timeouts of 30+ seconds. Submits with large files and slow networks can take a while.
  • Build retry logic once and reuse it. Don't sprinkle time.sleep calls across the codebase.
  • Alert on EXTRACTION_FAILED rate. If suddenly 20% of jobs fail, something upstream changed (file format, OCR quality, etc.).
  • Don't log API keys. Log a fingerprint (last 4 chars) if you must.



Common mistakes


  • Treating 429 as fatal. It's a "wait" signal.
  • Retrying 4xx errors. They won't succeed without a fix.
  • No backoff on 5xx. Hammering a struggling service makes it worse.
  • Polling forever. Stop after a reasonable cap (5 minutes is plenty for any single document) and surface the timeout to the caller.
  • Missing the Bearer prefix in the Authorization header. Lido returns 401 with no further hint; double-check.
  • Confusing HTTP errors with job errors. A 200 OK with status: "error" is a job error, not an HTTP error.




Related articles

  • Lido API: quickstart and authentication
  • Extract data via the API: deep dive
  • Webhooks and async processing
  • Improve extraction accuracy (when extractions complete but the data is wrong)

Updated on: 16/04/2026
