Adding context files to the data extractor

You can upload external documents to give Lido more context about how your data should be extracted.


This is useful when your extraction rules already exist somewhere else, or when you want to show examples instead of re-typing instructions.


What Are Context Documents?


Context documents are reference files that help guide how Lido interprets and extracts data from your PDFs, emails, or other inputs.

Instead of writing long instructions in the prompt, you can simply attach:


  • An existing SOP or internal guide
  • A sample document with notes or highlights
  • A spec sheet or data dictionary
  • A filled-out example showing where key fields live


Lido will use these documents as additional context when performing the extraction.


When Should I Use This?


Use context documents when:


1. You already have extraction rules written down

If your team has an SOP that explains how data should be interpreted (e.g., “Always use the ‘Net Total’ field, not the subtotal”), just upload it instead of rewriting everything.


2. The layout is complex or inconsistent

If fields move around between documents, providing annotated examples helps Lido understand what to look for.


3. You want to show, not tell

A marked-up example PDF is often clearer than a long text description.


Supported Sources

You can add context documents from:


  • Public URLs
  • Google Drive
  • OneDrive
  • SharePoint


Choose the source, paste the link to your file, and upload it.


Note: You can add multiple documents, with a maximum of 50 total pages across all context files.
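
If you plan to attach several files, it can help to add up their page counts before uploading. The snippet below is a minimal sketch of that arithmetic, not part of Lido: the helper name `check_page_budget` and the file names are hypothetical, and you supply the page counts yourself. The only value taken from this article is the 50-page limit.

```python
# Hypothetical helper for planning uploads; it does not call Lido.
MAX_CONTEXT_PAGES = 50  # total page limit stated in the note above


def check_page_budget(page_counts, limit=MAX_CONTEXT_PAGES):
    """Return (total_pages, pages_remaining) for a set of context files.

    page_counts maps a file name to its page count.
    pages_remaining is negative when the set exceeds the limit.
    """
    total = sum(page_counts.values())
    return total, limit - total


# Example file names and page counts are made up for illustration.
docs = {
    "invoice_sop.pdf": 12,
    "annotated_sample.pdf": 3,
    "data_dictionary.pdf": 28,
}

total, remaining = check_page_budget(docs)
print(f"{total} pages used, {remaining} remaining")
```

If the remaining count is negative, trim the files to the pages that matter for extraction before uploading.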


How It Works


  1. Open your extractor
  2. In the Extra Instructions section of the data extractor, click + Context Doc
  3. Select a source (URL, Google Drive, etc.)
  4. Upload your reference file(s)
  5. Run your extraction as usual

Lido will automatically use the uploaded documents to guide the extraction logic.


Best Practices


To get the best results:


  • Use clear examples – Highlight or annotate where key fields appear
  • Be specific – If your SOP mentions edge cases, include them
  • Avoid outdated docs – Make sure your references reflect current rules
  • Keep it concise – Only include what’s relevant to extraction



Updated on: 09/01/2026
