Manual data entry is the silent killer of productivity. Nowhere is this more apparent than in accounts payable, where teams spend countless hours manually keying in information from invoices. This process is not only slow and tedious but also notoriously prone to human error, leading to payment delays and costly mistakes.
Traditional automation attempts, like template-based parsers or complex regex, often fail because they're brittle. A single change in an invoice layout from a vendor can break the entire workflow, sending you back to square one.
But what if you could simply describe the data you need—like invoice number, due date, line items, and total amount—and have an intelligent system extract it for you, regardless of the invoice's format? With extract.do, this "Business-as-Code" approach is now a reality. Our AI-powered agent can turn any unstructured invoice into clean, structured JSON with a single API call.
The core problem with automating invoice processing is the lack of a standard format. Invoices arrive in many forms, such as PDF attachments, email bodies, and scanned images.
Each vendor has a unique layout. The "Total Amount" might be at the bottom right for one, and in a table halfway down for another. This variability makes it nearly impossible to build a one-size-fits-all parser.
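To make that brittleness concrete, here is a minimal sketch of a pattern-based parser. The function name `parseTotalWithRegex` and both sample layouts are hypothetical, invented purely for illustration; the point is only that a pattern tuned to one vendor's wording silently returns nothing for another.

```typescript
// Hypothetical regex-based parser, written against one vendor's layout.
function parseTotalWithRegex(text: string): number | null {
  // Expects a line that looks exactly like "TOTAL: $783.00".
  const match = text.match(/^TOTAL:\s*\$([\d,]+\.\d{2})$/m);
  const captured = match?.[1];
  return captured !== undefined ? Number(captured.replace(/,/g, "")) : null;
}

// Two made-up layouts carrying the same total.
const vendorA = "Subtotal: $725.00\nTOTAL: $783.00";
const vendorB = "Subtotal 725.00 USD\nAmount Due 783.00 USD";

const totalA = parseTotalWithRegex(vendorA); // matches the expected wording
const totalB = parseTotalWithRegex(vendorB); // null: the label and currency format differ

console.log(totalA, totalB);
```

Supporting the second layout means another pattern, and so does every later vendor or redesign.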
Instead of writing fragile code to find data at specific coordinates or by matching text patterns, extract.do uses an AI agent that understands the context of the document. You don't tell it how to find the data; you just tell it what data you want.
You define your desired output using a simple TypeScript interface or JSON schema. The AI agent then reads the source document (whether it's raw text, HTML, or text from an OCR'd image) and intelligently maps the information to your defined structure.
Let's see it in action. Imagine you've received an invoice as plain text and need to process it.
First, you define the structure of the data you want to extract. Notice how we can handle nested data like lineItems with ease.
// Define the desired data structure for our invoice
interface InvoiceData {
  invoiceNumber: string;
  vendorName: string;
  dueDate: string; // The AI can handle various date formats
  totalAmount: number;
  lineItems: {
    description: string;
    quantity: number;
    unitPrice: number;
  }[];
}
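For readers who prefer the JSON schema route, the same shape could be written roughly as follows. This is a sketch of an equivalent schema, not something taken from the extract.do documentation: how a schema is actually passed to the agent is not shown here, and the `format: "date"` hint is an assumption that ISO dates are the desired output.

```typescript
// JSON Schema counterpart of the InvoiceData interface (illustrative only).
const invoiceSchema = {
  type: "object",
  required: ["invoiceNumber", "vendorName", "dueDate", "totalAmount", "lineItems"],
  properties: {
    invoiceNumber: { type: "string" },
    vendorName: { type: "string" },
    // Assumption: we want ISO 8601 dates such as "2024-06-14".
    dueDate: { type: "string", format: "date" },
    totalAmount: { type: "number" },
    lineItems: {
      type: "array",
      items: {
        type: "object",
        required: ["description", "quantity", "unitPrice"],
        properties: {
          description: { type: "string" },
          quantity: { type: "number" },
          unitPrice: { type: "number" },
        },
      },
    },
  },
} as const;

console.log(Object.keys(invoiceSchema.properties));
```

The nested `lineItems` array maps directly onto the `items` keyword, so the schema and the interface describe the same structure.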
Next, you provide the source text from the invoice. This could come from a PDF-to-text library, an email parser, or directly from a user.
const invoiceText = `
INVOICE
From: Global Tech Supplies Inc.
Invoice #: INV-2024-9583
Date: May 15, 2024
Bill To:
Innovate Corp
Payment Due: June 14, 2024
---
Description              Qty  Unit Price  Total
---
15" 4K Portable Monitor  2    $250.00     $500.00
USB-C Hub 8-in-1         5    $45.00      $225.00
---
Subtotal: $725.00
Tax (8%): $58.00
TOTAL: $783.00
`;
Finally, you pass the source and your desired structure to the extract.do agent.
import { DO } from '@do-inc/sdk';

// Initialize the .do client
const secret = process.env.DO_SECRET;
const digo = new DO({ secret });

// Run the extraction agent
const extractedInvoice = await digo
  .agent<InvoiceData>('extract')
  .run({
    source: invoiceText,
    description: 'Extract key details and all line items from the invoice text.'
  });

console.log(JSON.stringify(extractedInvoice, null, 2));
The AI agent processes the text and returns the data in the exact format you requested.
Output:
{
  "invoiceNumber": "INV-2024-9583",
  "vendorName": "Global Tech Supplies Inc.",
  "dueDate": "2024-06-14",
  "totalAmount": 783,
  "lineItems": [
    {
      "description": "15\" 4K Portable Monitor",
      "quantity": 2,
      "unitPrice": 250
    },
    {
      "description": "USB-C Hub 8-in-1",
      "quantity": 5,
      "unitPrice": 45
    }
  ]
}
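Notice how the AI correctly normalized the due date from "June 14, 2024" to "2024-06-14", converted currency strings such as "$783.00" into plain numbers, picked up the vendor name from the "From:" line, and captured each row of the table as a nested line item.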
This AI-driven approach fundamentally changes how you handle document processing.
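Because the result is plain JSON, it can still be worth checking before it reaches a payment system. The sketch below is one way to do that, under stated assumptions: `isLineItem`, `isInvoiceData`, and `lineItemSubtotal` are hypothetical helpers written for this example, not part of the extract.do SDK, and the sample object simply mirrors the output shown earlier.

```typescript
// Mirrors the InvoiceData interface from earlier so the sketch is self-contained.
interface LineItem {
  description: string;
  quantity: number;
  unitPrice: number;
}

interface InvoiceData {
  invoiceNumber: string;
  vendorName: string;
  dueDate: string;
  totalAmount: number;
  lineItems: LineItem[];
}

// Hypothetical helper: checks a single line item at runtime.
function isLineItem(item: unknown): item is LineItem {
  if (typeof item !== "object" || item === null) return false;
  const v = item as Record<string, unknown>;
  return (
    typeof v.description === "string" &&
    typeof v.quantity === "number" &&
    typeof v.unitPrice === "number"
  );
}

// Hypothetical helper: narrows an unknown value to InvoiceData.
function isInvoiceData(value: unknown): value is InvoiceData {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.invoiceNumber === "string" &&
    typeof v.vendorName === "string" &&
    typeof v.dueDate === "string" &&
    typeof v.totalAmount === "number" &&
    Array.isArray(v.lineItems) &&
    v.lineItems.every(isLineItem)
  );
}

// Sum of quantity * unitPrice across all line items, rounded to cents.
function lineItemSubtotal(invoice: InvoiceData): number {
  const sum = invoice.lineItems.reduce(
    (acc, item) => acc + item.quantity * item.unitPrice,
    0
  );
  return Math.round(sum * 100) / 100;
}

// Sample value mirroring the extracted output shown above.
const sample: unknown = {
  invoiceNumber: "INV-2024-9583",
  vendorName: "Global Tech Supplies Inc.",
  dueDate: "2024-06-14",
  totalAmount: 783,
  lineItems: [
    { description: '15" 4K Portable Monitor', quantity: 2, unitPrice: 250 },
    { description: "USB-C Hub 8-in-1", quantity: 5, unitPrice: 45 },
  ],
};

if (isInvoiceData(sample)) {
  const subtotal = lineItemSubtotal(sample);
  // The total includes tax, so it should not be smaller than the line item subtotal.
  console.log(subtotal, sample.totalAmount >= subtotal);
}
```

A check like this catches a missing field or a total that disagrees with the line items, without going back to layout-specific parsing.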
Automating invoice processing is no longer a complex, multi-stage ETL project. With extract.do, it's a simple, intelligent API call. By adopting a "Data as Code" philosophy, you empower your developers to build robust, scalable, and maintainable data workflows in a fraction of the time.
Ready to put an end to manual data entry and build intelligent document processing into your applications?
Get started with extract.do today and transform your invoices into structured data.