How to use Lido's Data Extraction Solution: The Basics
🧾 How to Extract Data from PDFs into Spreadsheets with Lido
🎥 Watch the quick tutorial (2 min): How to Extract Data from PDFs Using Lido
If your team handles a high volume of PDFs — invoices, forms, receipts, or reports — you know how time-consuming and error-prone manual entry can be.
With Lido, you can automate the entire process. This guide shows you how to extract structured data (like invoice numbers, vendor names, and amounts) directly into spreadsheet rows — no manual typing required.
✅ What You’ll Need
- A Lido account (lido.app)
- One or more documents to extract data from
📁 Supported file types: PDFs, scanned images, JPGs, PNGs, TIFFs, Excel files, and Word documents can all be processed.
🧠 Step 1: Click Extract Data and Upload Your Documents
Click the green Extract Data button on the left side of your file.
Next, upload one or more documents you want to extract data from.
Once the upload finishes, Lido analyzes the files and suggests column headers based on the content it detects, such as Invoice Number, Date Due, Vendor Name, and Amount Due.
You can:
- Add more headers to capture extra fields not automatically suggested
- Remove headers for data you don’t need, or rename headers so they describe the field more clearly
- Reorder headers directly in the spreadsheet or inside the Extract Data interface
💡 Column headers are your main instructions for what to extract. Lido works from the meaning of a header, so it doesn’t need to match the field label in your PDFs exactly, and you can reuse one extractor across documents with different layouts.
⚙️ Step 2: Define Your Data Fields
You’ll be prompted to confirm or adjust the data fields Lido should pull from each document. Keep them in the same order as your spreadsheet headers so the output stays clean and consistent.
Example input:
- Invoice Number
- Date Due
- Vendor Name
- Amount Due
🧩 Step 3: Add Extra Instructions (Optional but Powerful)
Extra Instructions let you talk to the tool naturally — as if you were guiding a human. Use this field to tell Lido:
- Where to find data if it isn’t clearly labeled
(e.g., “The invoice number appears within a line item that follows a specific syntax.”)
- How to format outputs
(e.g., “Use MM/DD/YYYY for dates” or “numbers only for amounts.”)
- When to output or skip rows
(e.g., “Do not output a row for line items where the quantity sold is zero.”)
- Custom logic and rules
(e.g., “If vendor is XYZ, output 1A; if ABC, output 2B.”)
- Computations before outputting
- Or really anything you’d tell a human
These instructions give you full flexibility for complex document types, special formatting, and dynamic logic.
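To make the example rules above concrete, here is roughly what they would look like written as ordinary Python. This is only a sketch of the logic the instructions describe, not how Lido works internally. Every name in it (`apply_rules`, `VENDOR_CODES`, the dictionary keys) is hypothetical and chosen for illustration.

```python
import re
from datetime import date

# Hypothetical mapping for the rule
# "If vendor is XYZ, output 1A; if ABC, output 2B."
VENDOR_CODES = {"XYZ": "1A", "ABC": "2B"}


def format_date(d: date) -> str:
    """Rule: 'Use MM/DD/YYYY for dates'."""
    return d.strftime("%m/%d/%Y")


def numbers_only(amount: str) -> str:
    """Rule: 'numbers only for amounts' (drops currency symbols, commas, spaces)."""
    return re.sub(r"[^0-9.\-]", "", amount)


def apply_rules(line_items: list[dict]) -> list[dict]:
    """Turn raw line items into output rows, applying the example rules."""
    rows = []
    for item in line_items:
        # Rule: "Do not output a row for line items where the quantity sold is zero."
        if item["quantity"] == 0:
            continue
        rows.append({
            # Vendors without a mapping get an empty code (an assumption of this sketch).
            "Vendor Code": VENDOR_CODES.get(item["vendor"], ""),
            "Date Due": format_date(item["date_due"]),
            "Amount Due": numbers_only(item["amount"]),
        })
    return rows
```

In Lido you would not write any of this. You would state each rule in plain language in the Extra Instructions field, one rule per sentence, much like the comments in the sketch.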
📂 Step 4: Process All Files
When you’re ready, click Process All Files.
Lido will automatically extract the specified data from every uploaded document and output it neatly into your spreadsheet — one row per file.
🧠 “Set it and forget it.” Once your extractor is configured, it’s automatically saved. You don’t need to press any save button or adjust it between runs.
🔄 Example Output
| Invoice Number | Date Due | Vendor Name | Amount Due |
|---|---|---|---|
| 10023 | 2025-09-15 | ABC Supply Co. | 1 245.67 |
| 10024 | 2025-09-22 | Delta Freight | 3 210.45 |
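If you process the extracted rows further outside Lido, the values arrive as text. The sketch below converts the two sample rows above into typed values, assuming dates are in ISO format (YYYY-MM-DD) and amounts use a space as the thousands separator, as they appear in the example. The `rows` list and the helper names `parse_amount` and `parse_row` are hypothetical; how you get the rows out of your spreadsheet is not covered here.

```python
from datetime import date
from decimal import Decimal

# Sample rows copied from the example output above.
rows = [
    {"Invoice Number": "10023", "Date Due": "2025-09-15",
     "Vendor Name": "ABC Supply Co.", "Amount Due": "1 245.67"},
    {"Invoice Number": "10024", "Date Due": "2025-09-22",
     "Vendor Name": "Delta Freight", "Amount Due": "3 210.45"},
]


def parse_amount(text: str) -> Decimal:
    """Strip regular, non-breaking, and narrow no-break spaces, then parse."""
    for space in (" ", "\u00a0", "\u202f"):
        text = text.replace(space, "")
    return Decimal(text)


def parse_row(row: dict) -> dict:
    """Return a copy of the row with a real date and a Decimal amount."""
    return {
        "Invoice Number": row["Invoice Number"],
        "Date Due": date.fromisoformat(row["Date Due"]),
        "Vendor Name": row["Vendor Name"],
        "Amount Due": parse_amount(row["Amount Due"]),
    }


typed_rows = [parse_row(r) for r in rows]
total_due = sum(r["Amount Due"] for r in typed_rows)
```

`Decimal` is used instead of `float` so that currency amounts add up without rounding drift.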
🚀 Why Use Lido for Data Extraction
- No manual data entry: Extract hundreds of files in minutes
- Consistent structure: Keep data aligned across every document
- AI precision: Customize results using Extra Instructions
- Flexible: Works with PDFs, images, Word, Excel, and more
- Auto-saved: Your configurations persist automatically
Updated on: 08/10/2025