API error reference
Every Lido API call can fail in one of two layers: an HTTP-level error (the request itself didn't succeed) or a job-level error (the request succeeded, but the extraction job ran into trouble). This article is the canonical reference for both.
HTTP status codes
| Code | Meaning | Likely cause | What to do |
|---|---|---|---|
| 200 | OK | Request succeeded | Use the response body |
| 202 | Accepted | Workflow was triggered (Webhook Trigger, async mode) | No action — workflow is running |
| 401 | Unauthorized | API key missing, invalid, or revoked | Check the `Authorization: Bearer <key>` header and the key itself |
| 400 | Bad Request | Malformed JSON, missing required field, invalid value | Inspect the response body for the validation message; fix the payload |
| 403 | Forbidden | API key valid but lacks permission for the resource | Check the key's scope; contact support if it should have access |
| 404 | Not Found | The `jobId` doesn't exist, or the results have expired | If recent, double-check the ID. If older than 24h, results are gone — re-submit |
| 413 | Payload Too Large | File >50 MB sent via JSON+base64 | Switch to multipart upload (max 500 MB) |
| 422 | Unprocessable Entity | Configuration accepted but semantically invalid (e.g., column names with disallowed characters, malformed page ranges) | Read the error message; fix and retry |
| 429 | Too Many Requests | Rate limit exceeded (5 req / 30 s on submit) | Back off with exponential backoff; do not treat as fatal |
| 500 | Internal Server Error | Lido encountered an unexpected error | Retry with exponential backoff; if persistent, file a support ticket with the request details |
| 502 / 503 / 504 | Bad Gateway / Service Unavailable / Gateway Timeout | Transient infrastructure issue | Retry with exponential backoff |
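The retry guidance in the table can be enforced in one place rather than at every call site. A minimal sketch — the exception class names are illustrative, not part of any Lido SDK:

```python
class LidoAPIError(Exception):
    """Non-retryable client error (4xx): fix the payload, auth, or config."""

class LidoTransientError(Exception):
    """Retryable error (429 or 5xx): back off and try again."""

def check_response(status_code: int, body: str = "") -> None:
    # 2xx: success, nothing to raise
    if 200 <= status_code < 300:
        return
    # 429 and 5xx are transient: the caller should back off and retry
    if status_code == 429 or 500 <= status_code < 600:
        raise LidoTransientError(f"HTTP {status_code}: {body}")
    # Remaining 4xx need a fix, not a retry
    raise LidoAPIError(f"HTTP {status_code}: {body}")
```

Callers then catch `LidoTransientError` in their retry loop and let `LidoAPIError` propagate.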
Job-level errors (returned by GET /job-result/{jobId})
Even when the HTTP request returns 200, the job itself can be in an error state. The response body's status field tells you which:
| Status | Meaning |
|---|---|
| | Extraction in progress; poll again |
| | Done; the extracted data is in the response body |
| `error` | Extraction failed; see the `error` object in the response |
When status === "error", the response includes an error object:
```json
{
  "status": "error",
  "error": {
    "code": "EXTRACTION_FAILED",
    "message": "Could not parse the document. The file may be corrupted or password-protected.",
    "details": { ... }
  }
}
```
Common job-level error codes
| Code | Meaning | What to do |
|---|---|---|
| `EXTRACTION_FAILED` | The AI couldn't process the document at all | Confirm the file isn't corrupted. Try opening it in a PDF viewer; if it's password-protected, decrypt it first |
| | OCR didn't return usable text from a scanned document | Check image quality; consider running OCR yourself first (e.g., via the OCRMYPDF formula) |
| | The file isn't a supported type | Supported: PDF, PNG, JPG, JPEG, HEIC, common Office formats. Convert if needed |
| | File exceeded the job-level processing size | Split the document, or process a page range |
| | Extraction exceeded the maximum processing time | Try a smaller page range. Very long, dense documents (200+ pages) may need splitting |
| | Configuration was accepted at submit time but invalid for this document (e.g., page range outside the document) | Check the error message; fix the configuration and re-submit |
| | Hit an upstream provider rate limit | Retry with exponential backoff |
| | Something else went wrong | File a support ticket with the `jobId` and the full error response |
Failed extractions don't count toward your page allowance — only successful ones do.
Validation errors (HTTP 400 / 422)
These come back with a structured response so you can fix the payload programmatically:
```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request payload",
    "details": [
      {"field": "columns", "message": "Required field missing"},
      {"field": "pageRange", "message": "Invalid range syntax: 'a-b'"}
    ]
  }
}
```
The details array tells you exactly which fields failed and why.
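A few lines of Python can flatten that array into log-ready messages; the helper name is illustrative:

```python
def format_validation_errors(body: dict) -> list[str]:
    """Turn the structured details array into 'field: message' lines."""
    details = body.get("error", {}).get("details", [])
    return [f"{d['field']}: {d['message']}" for d in details]
```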
Rate-limit handling (HTTP 429)
The submit endpoint allows 5 requests per 30 seconds per API key.
When you hit the limit:
- Lido returns `429 Too Many Requests`.
- Optionally, a `Retry-After` header tells you how long to wait (in seconds).
- Back off and retry. Don't retry immediately; you'll just hit the limit again.
Recommended retry logic:
```python
import time

import requests

# SUBMIT_URL and headers are as defined in the quickstart article.
def submit_with_retry(payload, max_retries=5):
    delay = 2
    for attempt in range(max_retries):
        r = requests.post(SUBMIT_URL, headers=headers, json=payload)
        if r.status_code == 429:
            # Honor Retry-After when present; fall back to our own delay
            wait = int(r.headers.get("Retry-After", delay))
            time.sleep(wait)
            delay = min(delay * 2, 60)
            continue
        if 500 <= r.status_code < 600:
            # Transient server error: back off and try again
            time.sleep(delay)
            delay = min(delay * 2, 60)
            continue
        r.raise_for_status()  # surface non-retryable 4xx to the caller
        return r.json()
    raise RuntimeError("Exceeded max retries")
```
When NOT to retry
Don't retry these — they'll never succeed without a payload change:
- 400 (bad request) — fix the payload
- 401 (unauthorized) — fix the auth
- 403 (forbidden) — fix permissions
- 404 (not found) — the resource doesn't exist
- 413 (file too large) — switch upload method or split the file
- 422 (unprocessable) — fix the configuration
DO retry these (with backoff):
- 429 (rate limited)
- 500 / 502 / 503 / 504 (server errors)
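The two lists above boil down to a one-line predicate; a minimal sketch:

```python
RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}

def is_retryable(status_code: int) -> bool:
    """True for statuses worth retrying with backoff; False otherwise."""
    return status_code in RETRYABLE_STATUS_CODES
```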
Idempotency
Lido does not yet support an Idempotency-Key header. If you retry a submit because of a network error and don't know whether the original succeeded, you may end up with two extractions of the same file (counting toward your page allowance twice).
Mitigation:
- On the client side, dedupe by content hash. Compute a SHA-256 of the file before submit; cache the resulting `jobId`. On retry, look up by hash before submitting again.
- Don't retry within 30 seconds of a network failure — the original may still be in flight.
Logging recommendations
For every API call, log at minimum:
- Timestamp
- HTTP method, URL, status code
- `jobId` (in the response body for submit; in the URL for poll)
- Latency
- For errors: full response body (redact API keys before logging)
This is the difference between "we have a problem somewhere" and "we have a problem with jobId xyz at 14:23:07 UTC."
Tips
- Capture jobId at submit time, even on apparent failure. Some 5xx responses still create a job.
- Set HTTP timeouts of 30+ seconds. Submits with large files and slow networks can take a while.
- Build retry logic once and reuse it. Don't sprinkle `time.sleep` calls across the codebase.
- Alert on `EXTRACTION_FAILED` rate. If suddenly 20% of jobs fail, something upstream changed (file format, OCR quality, etc.).
- Don't log API keys. Log a fingerprint (last 4 chars) if you must.
Common mistakes
- Treating 429 as fatal. It's a "wait" signal.
- Retrying 4xx errors. They won't succeed without a fix.
- No backoff on 5xx. Hammering a struggling service makes it worse.
- Polling forever. Stop after a reasonable cap (5 minutes is plenty for any single document) and surface the timeout to the caller.
- Missing the Bearer prefix in the Authorization header. Lido returns 401 with no further hint; double-check.
- Confusing HTTP errors with job errors. A 200 OK with `status: "error"` is a job error, not an HTTP error.
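To avoid the polling-forever mistake above, cap the loop explicitly. In this sketch, `RESULT_URL`, `headers`, and the `"success"` status string are placeholder assumptions; substitute the real values from the quickstart:

```python
import time

import requests

RESULT_URL = "REPLACE_WITH_JOB_RESULT_ENDPOINT"   # from the quickstart article
headers = {"Authorization": "Bearer REPLACE_WITH_KEY"}

def poll_until_done(job_id: str, timeout_s: float = 300, interval_s: float = 5) -> dict:
    """Poll GET /job-result/{jobId} until a terminal status or the time cap."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        r = requests.get(f"{RESULT_URL}/{job_id}", headers=headers, timeout=30)
        r.raise_for_status()
        body = r.json()
        status = body.get("status")
        if status == "error":
            err = body.get("error", {})
            raise RuntimeError(f"{err.get('code')}: {err.get('message')}")
        if status == "success":  # assumption: substitute the actual "done" status string
            return body
        time.sleep(interval_s)  # still in progress
    raise TimeoutError(f"Job {job_id} did not finish within {timeout_s}s")
```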
Related articles
- Lido API: quickstart and authentication
- Extract data via the API: deep dive
- Webhooks and async processing
- Improve extraction accuracy (when extractions complete but the data is wrong)
Updated on: 16/04/2026