
API error reference

Every Lido API call can fail in one of two layers: an HTTP-level error (the request itself didn't succeed) or a job-level error (the request succeeded, but the extraction job ran into trouble). This article is the canonical reference for both.



HTTP status codes


| Code | Meaning | Likely cause | What to do |
| --- | --- | --- | --- |
| 200 | OK | Request succeeded | Use the response body |
| 202 | Accepted | Workflow was triggered (Webhook Trigger, async mode) | No action — workflow is running |
| 400 | Bad Request | Malformed JSON, missing required field, invalid value | Inspect the response body for the validation message; fix the payload |
| 401 | Unauthorized | API key missing, invalid, or revoked | Check the Authorization: Bearer <key> header. Generate a new key if needed |
| 403 | Forbidden | API key valid but lacks permission for the resource | Check the key's scope; contact support if it should have access |
| 404 | Not Found | jobId doesn't exist or has expired (>24h) | If recent, double-check the ID. If older than 24h, results are gone — re-submit |
| 413 | Payload Too Large | File >50 MB sent via JSON+base64 | Switch to multipart upload (max 500 MB) |
| 422 | Unprocessable Entity | Configuration accepted but semantically invalid (e.g., column names with disallowed chars, malformed pageRange) | Read the error message; fix and retry |
| 429 | Too Many Requests | Rate limit exceeded (5 req / 30 s on submit) | Back off with exponential backoff; do not treat as fatal |
| 500 | Internal Server Error | Lido encountered an unexpected error | Retry with exponential backoff; if persistent, file a support ticket with the request details |
| 502 / 503 / 504 | Bad gateway / service unavailable / gateway timeout | Transient infrastructure issue | Retry with exponential backoff |
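A surprising share of 401s come down to the Authorization header being built without the Bearer prefix. A minimal sketch of constructing it correctly (the key value here is a placeholder, not a real Lido key format):

```python
def make_headers(api_key):
    """Return request headers for a Lido API call, with the Bearer prefix."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

# Placeholder key for illustration only.
headers = make_headers("your-api-key")
```

Reusing one helper like this keeps the prefix mistake from creeping into individual call sites.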



Job-level errors (returned by GET /job-result/{jobId})


Even when the HTTP request returns 200, the job itself can be in an error state. The response body's status field tells you which:


| Status | Meaning |
| --- | --- |
| running | Extraction in progress; poll again |
| complete | Done; data is in the data field |
| error | Extraction failed; see the error field |


When the status field is "error", the response includes an error object:


{
  "status": "error",
  "error": {
    "code": "EXTRACTION_FAILED",
    "message": "Could not parse the document. The file may be corrupted or password-protected.",
    "details": { ... }
  }
}
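A polling loop built on that status field might look like the sketch below. The fetch_result callable is a hypothetical wrapper around your own GET /job-result/{jobId} call, injected so the loop itself needs no network access and can be tested in isolation:

```python
import time

def poll_job(fetch_result, interval=2, max_wait=300):
    """Poll until the job completes or errors.

    fetch_result: zero-argument callable returning the parsed JSON body
    of GET /job-result/{jobId}. Raises on job error; returns the data on
    completion; gives up after max_wait seconds.
    """
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        body = fetch_result()
        if body["status"] == "complete":
            return body["data"]
        if body["status"] == "error":
            err = body["error"]
            raise RuntimeError(f'{err["code"]}: {err["message"]}')
        time.sleep(interval)  # status == "running": wait, then poll again
    raise TimeoutError("Job did not finish within max_wait seconds")
```

Capping the loop with max_wait also covers the "polling forever" mistake discussed further down.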


Common job-level error codes


| Code | Meaning | What to do |
| --- | --- | --- |
| EXTRACTION_FAILED | The AI couldn't process the document at all | Confirm the file isn't corrupted. Try opening it in a PDF viewer; if it's password-protected, decrypt it first |
| OCR_FAILED | OCR didn't return usable text from a scanned document | Check image quality; consider running OCR yourself first (e.g., via the OCRMYPDF formula) |
| INVALID_FILE_FORMAT | The file isn't a supported type | Supported: PDF, PNG, JPG, JPEG, HEIC, common Office formats. Convert if needed |
| FILE_TOO_LARGE | File exceeded the job-level processing size | Split the document, or process a page range |
| TIMEOUT | Extraction exceeded the maximum processing time | Try a smaller page range. Very long, dense documents (200+ pages) may need splitting |
| INVALID_CONFIG | Configuration was accepted at submit time but invalid for this document (e.g., page range outside the document) | Check pageRange against the actual page count |
| RATE_LIMITED_INTERNAL | Hit an upstream provider rate limit | Retry with exponential backoff |
| UNKNOWN_ERROR | Something else went wrong | File a support ticket with the jobId |


Failed extractions don't count toward your page allowance — only successful ones do.
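Going by the table above, only one of these codes is worth retrying automatically; the rest need a changed file or configuration first. A small lookup keeps that decision in one place (the set below is an inference from the table, not an official list published by the API):

```python
# Job-level error codes where re-submitting the same payload may succeed.
RETRYABLE_JOB_ERRORS = {"RATE_LIMITED_INTERNAL"}

def job_error_is_retryable(code):
    """True if an automatic retry of the same submission can help."""
    return code in RETRYABLE_JOB_ERRORS
```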



Validation errors (HTTP 400 / 422)


These come back with a structured response so you can fix the payload programmatically:


{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request payload",
    "details": [
      {"field": "columns", "message": "Required field missing"},
      {"field": "pageRange", "message": "Invalid range syntax: 'a-b'"}
    ]
  }
}


The details array tells you exactly which fields failed and why.
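One way to consume that array programmatically is to flatten it into a field-to-message mapping; a minimal sketch, assuming the response body is shaped as shown above:

```python
def validation_errors_by_field(body):
    """Flatten a 400/422 response body's details array into {field: message}."""
    return {d["field"]: d["message"] for d in body["error"]["details"]}
```

The resulting dict makes it easy to attach each message to the exact payload field that caused it.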



Rate-limit handling (HTTP 429)


The submit endpoint allows 5 requests per 30 seconds per API key.


When you hit the limit:


  1. Lido returns 429 Too Many Requests.
  2. Optionally, a Retry-After header tells you how long to wait (in seconds).
  3. Back off and retry. Don't retry immediately; you'll just hit the limit again.


Recommended retry logic:


import time
import requests

def submit_with_retry(payload, max_retries=5):
    # SUBMIT_URL and headers (including Authorization) are assumed to be
    # defined elsewhere.
    delay = 2
    for attempt in range(max_retries):
        r = requests.post(SUBMIT_URL, headers=headers, json=payload)
        if r.status_code == 429:
            # Honor Retry-After when the server sends it; otherwise fall
            # back to our own exponential delay.
            wait = int(r.headers.get("Retry-After", delay))
            time.sleep(wait)
            delay = min(delay * 2, 60)
            continue
        if 500 <= r.status_code < 600:
            time.sleep(delay)
            delay = min(delay * 2, 60)
            continue
        r.raise_for_status()  # surfaces non-retryable 4xx errors
        return r.json()
    raise RuntimeError("Exceeded max retries")



When NOT to retry


Don't retry these — they'll never succeed without a payload change:


  • 400 (bad request) — fix the payload
  • 401 (unauthorized) — fix the auth
  • 403 (forbidden) — fix permissions
  • 404 (not found) — the resource doesn't exist
  • 413 (file too large) — switch upload method or split the file
  • 422 (unprocessable) — fix the configuration


DO retry these (with backoff):


  • 429 (rate limited)
  • 500 / 502 / 503 / 504 (server errors)
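The two lists above collapse to a single predicate; keeping it in one function (rather than scattering status-code checks) makes the retry policy auditable. A minimal sketch:

```python
# HTTP statuses that can succeed on retry with backoff, per the lists above.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}

def is_retryable(status_code):
    """True only for statuses worth retrying; all 4xx except 429 are not."""
    return status_code in RETRYABLE_STATUSES
```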



Idempotency


Lido does not yet support an Idempotency-Key header. If you retry a submit because of a network error and don't know whether the original succeeded, you may end up with two extractions of the same file (counting toward your page allowance twice).


Mitigation:


  • On the client side, dedupe by content hash. Compute a SHA-256 of the file before submit; cache the resulting jobId. On retry, look up by hash before submitting again.
  • Don't retry within 30 seconds of a network failure — the original may still be in flight.
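The content-hash mitigation can be sketched as below. The submit callable is a hypothetical wrapper around the real submit endpoint, and the cache is in-memory for illustration; in production you would persist it so retries survive restarts:

```python
import hashlib

# Maps SHA-256 of file content -> jobId already obtained for that content.
_job_cache = {}

def file_fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of the file content."""
    return hashlib.sha256(data).hexdigest()

def submit_once(data: bytes, submit):
    """Submit each unique file content at most once.

    submit: callable taking the file bytes and returning a jobId.
    """
    key = file_fingerprint(data)
    if key not in _job_cache:
        _job_cache[key] = submit(data)
    return _job_cache[key]
```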



Logging recommendations


For every API call, log at minimum:


  • Timestamp
  • HTTP method, URL, status code
  • jobId (in the response body for submit; in the URL for poll)
  • Latency
  • For errors: full response body (redact API keys before logging)


This is the difference between "we have a problem somewhere" and "we have a problem with jobId xyz at 14:23:07 UTC."
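One way to standardize those fields is a single log-entry builder that also enforces the key-redaction rule from the Tips below; a sketch, with field names chosen here for illustration:

```python
import time

def log_entry(method, url, status, latency_ms, job_id=None, api_key=None):
    """Build a structured log record; never includes the full API key."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "method": method,
        "url": url,
        "status": status,
        "latency_ms": latency_ms,
        "jobId": job_id,
    }
    if api_key:
        entry["key_fingerprint"] = api_key[-4:]  # last 4 chars only
    return entry
```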



Tips


  • Capture jobId at submit time, even on apparent failure. Some 5xx responses still create a job.
  • Set HTTP timeouts of 30+ seconds. Submits with large files and slow networks can take a while.
  • Build retry logic once and reuse it. Don't sprinkle time.sleep calls across the codebase.
  • Alert on EXTRACTION_FAILED rate. If suddenly 20% of jobs fail, something upstream changed (file format, OCR quality, etc.).
  • Don't log API keys. Log a fingerprint (last 4 chars) if you must.



Common mistakes


  • Treating 429 as fatal. It's a "wait" signal.
  • Retrying 4xx errors. They won't succeed without a fix.
  • No backoff on 5xx. Hammering a struggling service makes it worse.
  • Polling forever. Stop after a reasonable cap (5 minutes is plenty for any single document) and surface the timeout to the caller.
  • Missing the Bearer prefix in the Authorization header. Lido returns 401 with no further hint; double-check.
  • Confusing HTTP errors with job errors. A 200 OK with status: "error" is a job error, not an HTTP error.




Related articles

  • Lido API: quickstart and authentication
  • Extract data via the API: deep dive
  • Webhooks and async processing
  • Improve extraction accuracy (when extractions complete but the data is wrong)

Updated on: 16/04/2026
