One API for PDF purchase orders, order emails, EDI, and spreadsheets. Extract every line item, match it to your catalog, and return ERP-ready JSON. Built for B2B orders, not generic OCR.
Or try the extraction free, no signup: PO Extractor
From a document in your inbox to a matched, ERP-ready order object in one API call.
Send a PDF, email, EDI file, or spreadsheet to a single endpoint. No per-format parser, no template setup. The API detects the document type for you.
Vision and language models read the PO number, dates, addresses, and every line item from any layout, including tables, headers, and handwritten amendments.
Each line resolves to your SKU using customer-specific aliases and UPCs. Units of measure are normalized, prices are validated against master data, and a confidence score is attached to every line.
The response is a clean order object you can post straight to your ERP, with field-level provenance back to the source document and a review flag for low-confidence lines.
POST any order document. Get structured, catalog-matched JSON. The example below shows a PDF purchase order resolved to a SKU with a confidence score.
curl https://api.ordersync.io/v1/extract \
  -H "Authorization: Bearer $ORDERSYNC_API_KEY" \
  -F "file=@purchase-order.pdf" \
  -F "customer_id=acme-foods"

{
  "document_type": "purchase_order",
  "po_number": "PO-48821",
  "order_date": "2026-06-28",
  "requested_delivery": "2026-07-05",
  "ship_to": {
    "name": "Acme Foods - Tacoma DC",
    "address": "1200 Port Rd, Tacoma, WA 98421"
  },
  "line_items": [
    {
      "raw_description": "CHOC BAR DARK 70% 12CT",
      "matched_sku": "MV-DK70-12",
      "quantity": 40,
      "uom": "CASE",
      "unit_price": 28.50,
      "confidence": 0.98
    }
  ],
  "review_required": false,
  "source": { "page": 1, "format": "pdf" }
}

Illustrative contract. Endpoints, fields, and your catalog mapping are set during onboarding.
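For teams calling the API from application code rather than a shell, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the endpoint and the `file` and `customer_id` form fields come from the curl example, while the function name `build_extract_request` and the generic `application/octet-stream` part type are assumptions made for illustration.

```python
import uuid
import urllib.request

# Endpoint taken from the curl example; the real one is set during onboarding.
EXTRACT_URL = "https://api.ordersync.io/v1/extract"


def build_extract_request(file_bytes: bytes, filename: str,
                          customer_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST like the curl call.

    Hypothetical helper: the name and signature are illustrative only.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="customer_id"',
        b"",
        customer_id.encode(),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        # Assumption: a generic type is sent and the API detects the format itself.
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`. Because the file type is not encoded in the request, the same helper would cover PDF, email, EDI, and spreadsheet uploads.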
OCR and document-parsing APIs read documents. An order API understands orders. The difference is everything you would otherwise build yourself.
Generic extraction APIs hand back the text they read. OrderSync resolves each line to your catalog, normalizes the unit of measure, and checks the price. The output is an order you can post, not a transcript you reconcile by hand.
PDF, email, EDI X12, CSV, and Excel orders all hit the same endpoint and return the same JSON shape. You integrate once instead of stitching together an OCR vendor, an EDI translator, and an email parser.
Every line carries a confidence score. Clean machine-generated PDFs post automatically. Scanned, faxed, or ambiguous documents are flagged for human review before they reach your ERP, so bad data never syncs silently.
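On the integration side, that routing can be a few lines of code. The sketch below assumes the response fields shown in the example (`line_items`, `confidence`, `matched_sku`, `review_required`). The 0.90 cutoff and both function names are hypothetical, not documented OrderSync values, and the second sample line is made up to show the review path.

```python
from typing import Any

REVIEW_THRESHOLD = 0.90  # hypothetical cutoff; tune it to your own tolerance


def split_for_review(order: dict[str, Any], threshold: float = REVIEW_THRESHOLD):
    """Return (auto_post, needs_review) lists of line items."""
    auto_post, needs_review = [], []
    for line in order.get("line_items", []):
        # A missing score or a missing SKU match is treated as low confidence.
        confident = line.get("confidence", 0.0) >= threshold and line.get("matched_sku")
        (auto_post if confident else needs_review).append(line)
    return auto_post, needs_review


def should_hold(order: dict[str, Any], threshold: float = REVIEW_THRESHOLD) -> bool:
    """Hold the whole order if the API flagged it or any line needs review."""
    _, needs_review = split_for_review(order, threshold)
    return bool(order.get("review_required")) or bool(needs_review)


# Illustrative data: the first line mirrors the example response,
# the second is invented to exercise the review branch.
sample_order = {
    "review_required": False,
    "line_items": [
        {"raw_description": "CHOC BAR DARK 70% 12CT",
         "matched_sku": "MV-DK70-12", "confidence": 0.98},
        {"raw_description": "CHOC BAR MILK 12CT",
         "matched_sku": None, "confidence": 0.61},
    ],
}

auto_post, needs_review = split_for_review(sample_order)
print(len(auto_post), "line(s) ready,", len(needs_review), "line(s) for review")
```

Holding the whole order when any single line is uncertain is a conservative choice. A looser integration could post the confident lines and queue only the rest.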
Each extracted value traces back to where it appeared in the document. When a customer disputes a quantity, you show the highlighted region in the original PDF instead of arguing from memory.
Same endpoint, different document on the way in. Pick the order source you are drowning in.
A customer emails a PO as a PDF. POST it to the API and get back structured line items, addresses, and dates, with each SKU matched to your catalog and ready to drop into your order system.
Orders arrive as free-text email bodies or mixed attachments. The API reads the message and any attached PDFs or spreadsheets and returns one normalized order object.
Some customers send EDI 850; most send PDFs and email. Route all of them through the same endpoint so your integration does not care how the order arrived.
CSV and Excel order files vary by customer. The API reads them contextually and returns the same JSON as every other format, so you skip building a mapping per template.
It returns structured order JSON: PO number, order and delivery dates, ship-to and bill-to addresses, and line items with product codes, descriptions, quantities, units of measure, and prices. Unlike a generic OCR API, each line item is matched against your catalog and pricing, so you get a resolved SKU and a confidence score, not just raw text.
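As a rough illustration of consuming that JSON, the sketch below parses line items into typed objects and computes an extended price per line. It assumes the field names from the example response and whole-number quantities. `LineItem` and `parse_line_items` are hypothetical names, not part of any OrderSync SDK.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    raw_description: str
    matched_sku: str
    quantity: int
    uom: str
    unit_price: Decimal
    confidence: float

    @property
    def extended_price(self) -> Decimal:
        """Quantity times unit price, kept as an exact decimal."""
        return self.quantity * self.unit_price


def parse_line_items(response_text: str) -> list[LineItem]:
    # parse_float=Decimal reads prices exactly instead of as binary floats.
    payload = json.loads(response_text, parse_float=Decimal)
    return [
        LineItem(
            raw_description=item["raw_description"],
            matched_sku=item["matched_sku"],
            quantity=item["quantity"],
            uom=item["uom"],
            unit_price=Decimal(item["unit_price"]),
            confidence=float(item["confidence"]),
        )
        for item in payload["line_items"]
    ]


# The single line item from the example response.
sample_response = """
{
  "po_number": "PO-48821",
  "line_items": [
    {
      "raw_description": "CHOC BAR DARK 70% 12CT",
      "matched_sku": "MV-DK70-12",
      "quantity": 40,
      "uom": "CASE",
      "unit_price": 28.50,
      "confidence": 0.98
    }
  ]
}
"""

for item in parse_line_items(sample_response):
    print(item.matched_sku, item.quantity, item.uom, item.extended_price)
```

Decimal arithmetic avoids rounding drift when extended prices are summed into an order total before posting to an ERP.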
Generic OCR and document-parsing APIs (Mindee, Nanonets, Rossum, Google Document AI, Amazon Textract) return the fields they read off the page. They do not know your catalog, your customer-specific part numbers, or your pricing. The OrderSync API is purpose-built for B2B orders: it normalizes units of measure, resolves customer aliases to your SKUs, validates pricing against master data, and flags low-confidence lines for review. You get an order you can post, not a transcript you still have to reconcile.
PDF purchase orders (digital-native and scanned), order emails with bodies or attachments, EDI X12 (850, 855, 860), and CSV or Excel order files. One endpoint accepts all of them and returns the same normalized JSON shape, so you integrate once instead of building a parser per format.
Accuracy depends on document quality. On clean, machine-generated PDFs, line-item extraction is typically 95% or better on quantities and pricing. Scanned, faxed, or handwritten documents score lower on individual fields, which is why every line carries a confidence score and low-confidence orders can route to a human-review queue before they reach your ERP.
API access is granted through an onboarding call. We map your catalog, customer part-number aliases, and ERP target during setup so the API returns matched, post-ready orders from day one rather than raw fields. Book an intro call to request a key and see your own documents run through it.
Yes. The free PO Extractor, Invoice Extractor, and Email Order Parser tools run the same extraction engine in the browser with no signup. Upload a document and see the structured output the API would return.
Bring a real purchase order. On the call we map your catalog and ERP, then run your own document through the API so you see matched, post-ready JSON come back. 15 minutes, no commitment.
Request API Access
No credit card required. Prefer to poke first? The free tools run the same engine with no signup.