James Darby
June 30, 2026
Last reviewed June 30, 2026
Technology

Convert PDF Purchase Orders to JSON (or EDI) via API

How to turn PDF purchase orders into structured JSON or compliant EDI through an API, what the response looks like, and how to handle scanned and low-confidence documents.

To convert a PDF purchase order programmatically, POST it to an extraction API that returns structured JSON, then route that JSON either into your ERP or into a compliant EDI document, depending on who needs it next.

The PDF is the universal order format. Every buyer can produce one, which is exactly why more orders land in your inbox as PDFs than in any other format. The problem is that a PDF is a picture of an order, not the order itself. To do anything automated with it, you have to turn it into data.

There are two destinations worth converting to, and the right one depends on what sits downstream.

Destination one: PDF to JSON

JSON is the format your own systems speak. Convert the PDF to JSON when the order is going into your ERP, your order management system, or any internal pipeline.

A single API call does it. You send the file, the API extracts and matches, and you get back a normalized object:

{
  "po_number": "PO-48821",
  "order_date": "2026-06-28",
  "ship_to": { "name": "Acme Foods - Tacoma DC" },
  "line_items": [
    {
      "raw_description": "CHOC BAR DARK 70% 12CT",
      "matched_sku": "MV-DK70-12",
      "quantity": 40,
      "uom": "CASE",
      "unit_price": 28.50,
      "confidence": 0.98
    }
  ],
  "review_required": false
}

The value is in the matched_sku and confidence fields. A raw text dump of the PDF would still leave you mapping "CHOC BAR DARK 70% 12CT" to your catalog by hand. An order-aware extraction API does that resolution and tells you how sure it is, so high-confidence orders post automatically and the rest stop for review.
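
As a rough sketch of that routing, assuming the response shape shown above: the function name `route_order` and the 0.90 threshold are illustrative choices, not part of any documented API, and the right cutoff depends on your own review data.

```python
import json

# Hypothetical cutoff; tune it against your own review history.
CONFIDENCE_THRESHOLD = 0.90


def route_order(order: dict, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "auto_post" or "review" for one extracted order."""
    # Respect the API's own flag first.
    if order.get("review_required", False):
        return "review"
    for line in order.get("line_items", []):
        # An unmatched SKU or a low-confidence line stops the whole order.
        if not line.get("matched_sku"):
            return "review"
        if line.get("confidence", 0.0) < threshold:
            return "review"
    return "auto_post"


# Trimmed copy of the sample response from this article.
response_body = """
{
  "po_number": "PO-48821",
  "line_items": [
    {"raw_description": "CHOC BAR DARK 70% 12CT", "matched_sku": "MV-DK70-12",
     "quantity": 40, "uom": "CASE", "unit_price": 28.50, "confidence": 0.98}
  ],
  "review_required": false
}
"""
order = json.loads(response_body)
print(route_order(order))  # prints "auto_post"
```

Checking each line as well as the top-level flag is a deliberately conservative choice: one doubtful line holds the whole order rather than letting a partial order post.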

Destination two: PDF to EDI

Sometimes the order has to leave your building again as EDI, because a retailer or trading partner requires it. In that case you convert the PDF to a compliant X12 document rather than to internal JSON.

The mechanics are the same up front: extract and match. The difference is the output format. Instead of JSON, the system emits an EDI 850 purchase order that carries the partner's exact qualifiers, separators, and version. The X12 standard defines the transaction set, and each partner layers their own requirements on top. OrderSync handles this as PDF to EDI, and the same engine covers email to EDI when the order arrives as a message instead of an attachment.
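
To make the output difference concrete, here is a minimal sketch that renders the extracted JSON as the core segments of an 850. It is not how any particular product generates EDI. The function name `render_850`, the `UOM_TO_X12` mapping, and the choice of the `VP` (vendor part number) qualifier are assumptions for illustration, and the ISA/GS interchange envelope is left out. A real partner's implementation guide decides the qualifiers, separators, and version.

```python
from datetime import date

# Assumed mapping from the JSON "uom" values to X12 unit-of-measure codes.
UOM_TO_X12 = {"CASE": "CA", "EACH": "EA"}


def render_850(order: dict, control_number: str = "0001",
               element_sep: str = "*", segment_term: str = "~") -> str:
    """Render the ST..SE transaction set for one order (no ISA/GS envelope)."""
    po_date = date.fromisoformat(order["order_date"]).strftime("%Y%m%d")
    segments = [
        ["ST", "850", control_number],
        # BEG: 00 = original, SA = stand-alone order, PO number,
        # empty release number, then the order date as CCYYMMDD.
        ["BEG", "00", "SA", order["po_number"], "", po_date],
    ]
    for index, line in enumerate(order["line_items"], start=1):
        segments.append([
            "PO1", str(index), str(line["quantity"]),
            UOM_TO_X12[line["uom"]], f'{line["unit_price"]:.2f}',
            "",  # basis-of-unit-price code, unused in this sketch
            "VP", line["matched_sku"],  # qualifier is partner-specific
        ])
    segments.append(["CTT", str(len(order["line_items"]))])
    # SE01 counts every segment from ST through SE inclusive.
    segments.append(["SE", str(len(segments) + 1), control_number])
    return segment_term.join(element_sep.join(s) for s in segments) + segment_term


sample_order = {
    "po_number": "PO-48821",
    "order_date": "2026-06-28",
    "line_items": [{"matched_sku": "MV-DK70-12", "quantity": 40,
                    "uom": "CASE", "unit_price": 28.50}],
}
print(render_850(sample_order))
# ST*850*0001~BEG*00*SA*PO-48821**20260628~PO1*1*40*CA*28.50**VP*MV-DK70-12~CTT*1~SE*5*0001~
```

Keeping the separators as parameters reflects the point above: the segment content comes from the same extracted order, while the delimiters and qualifiers vary by partner.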

If you want to see what a finished 850 should contain before you generate one, the EDI 850 purchase order guide breaks it down segment by segment, and the free EDI Inspector parses a real one in the browser.

Handling scanned and low-confidence PDFs

Not every PDF is clean. Plenty are scans of printed pages, or faxes saved to PDF, or photos a rep took on a phone. Conversion still works, with a few caveats:

  • OCR runs first: The API reads the image, then extracts. Accuracy on clean machine-generated PDFs is typically 95% or better on quantities and pricing, and lower per field on scans and handwriting.
  • UPCs make matching reliable: When a UPC is on the document, the identifier resolves to your SKU unambiguously (GS1 US governs that standard); when only a description exists, the alias logic does the work.
  • Confidence routing matters more here: Because field accuracy drops on poor scans, the review flag protects your ERP, so a misread quantity is held for review before it can become a short shipment.
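
The UPC-first, alias-second order described above can be sketched as follows. This is an illustration under stated assumptions, not the matching logic of any specific API: the `upc` field on a line item, the two lookup dictionaries, and the function names are hypothetical. The check-digit test is the standard GS1 mod-10 calculation for a 12-digit UPC-A, which is useful on scans because a code with a single misread digit fails it instead of matching the wrong product.

```python
import re


def upc_check_digit_ok(upc: str) -> bool:
    """Validate a 12-digit UPC-A using the GS1 mod-10 check digit."""
    if not re.fullmatch(r"\d{12}", upc):
        return False
    digits = [int(c) for c in upc]
    # Positions 1, 3, ..., 11 are weighted 3; positions 2, 4, ..., 10 are weighted 1.
    total = 3 * sum(digits[0:11:2]) + sum(digits[1:11:2])
    return (10 - total % 10) % 10 == digits[11]


def normalize(description: str) -> str:
    """Uppercase and collapse whitespace so alias keys compare consistently."""
    return " ".join(description.upper().split())


def resolve_sku(line: dict, upc_index: dict, alias_index: dict):
    """Return (sku, method), where method is "upc", "alias", or "unmatched"."""
    upc = line.get("upc", "")  # hypothetical field name
    if upc_check_digit_ok(upc) and upc in upc_index:
        return upc_index[upc], "upc"
    key = normalize(line.get("raw_description", ""))
    if key in alias_index:
        return alias_index[key], "alias"
    return None, "unmatched"


# Illustrative catalog data only.
upc_index = {"036000291452": "MV-DK70-12"}
alias_index = {"CHOC BAR DARK 70% 12CT": "MV-DK70-12"}

print(resolve_sku({"upc": "036000291452"}, upc_index, alias_index))
print(resolve_sku({"raw_description": "choc bar  dark 70% 12ct"}, upc_index, alias_index))
```

Returning the method alongside the SKU lets downstream code treat an alias match with more caution than a UPC match, for example by requiring a higher confidence before auto-posting.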

JSON or EDI: which destination

| | Convert to JSON | Convert to EDI |
| --- | --- | --- |
| Use when | Order goes into your own ERP or systems | A trading partner requires an X12 document |
| Output | Normalized order object | Compliant EDI 850 with partner qualifiers |
| Consumer | Your create-order API | Retailer or partner mailbox (AS2, SFTP, VAN) |
| Extraction step | Identical | Identical |

The extraction and matching are the same for both. Only the output format changes, so you choose per destination rather than per document.

A practical sequence

  1. Receive the PDF, by email, upload, or webhook.
  2. POST it to the extraction endpoint.
  3. Read the JSON response and check review_required.
  4. If clean, post to your ERP or generate the EDI 850. If flagged, route to review.
  5. Keep the source document linked to the extracted data for audit and disputes.
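
The steps above can be sketched in Python using only the standard library. Everything provider-specific here is an assumption: the endpoint URL, the bearer-token header, and sending the PDF as a raw `application/pdf` body (some APIs expect multipart form data instead). `post_order` and `queue_review` are hypothetical callbacks standing in for your ERP client and review queue.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute your provider's documented URL.
EXTRACT_URL = "https://api.example.com/v1/extract"


def build_request(pdf_bytes: bytes, api_key: str,
                  url: str = EXTRACT_URL) -> urllib.request.Request:
    """Step 2: build the POST request carrying the PDF."""
    return urllib.request.Request(
        url, data=pdf_bytes, method="POST",
        headers={"Content-Type": "application/pdf",
                 "Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def process_pdf(pdf_bytes, api_key, source_ref, post_order, queue_review, send=send):
    """Steps 2 to 5: extract, check the flag, route, keep the source link."""
    order = send(build_request(pdf_bytes, api_key))
    # Step 5: keep a reference to the original file with the extracted data.
    order["source_document"] = source_ref
    # Step 3: a missing flag is treated as "needs review" to stay conservative.
    if order.get("review_required", True):
        queue_review(order)  # step 4, flagged path
        return "review"
    post_order(order)        # step 4, clean path
    return "posted"
```

Passing `send` as a parameter keeps the network call replaceable, so the routing can be exercised with a stub before an API key exists:

```python
posted, review = [], []
stub = lambda request: {"po_number": "PO-48821", "review_required": False}
process_pdf(b"%PDF-1.7", "test-key", "inbox/po-48821.pdf",
            posted.append, review.append, send=stub)
```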

Most teams that get orders in several formats run all of them through one multi-format pipeline, so step two is identical whether the order was a PDF, an email, or an EDI file. The downstream code only ever sees clean JSON.

Frequently asked questions

Can I really convert any PDF layout without setup?

Yes. An AI-based extraction API reads documents by structure and meaning, not fixed coordinates, so a new customer layout converts on first contact without a template project.

JSON or EDI, which should I convert to?

JSON if the order is going into your own systems. EDI if a trading partner requires an X12 document. The same extraction feeds both, so you decide per destination, not per document.

What about the data on a bad scan?

OCR handles the image and confidence scoring handles the uncertainty. Low-confidence lines flag for review rather than posting silently.

How do I get an API key?

Request access to the extraction API. Onboarding maps your catalog and ERP so the conversion returns matched, post-ready orders. You can try the conversion free first with the PO Extractor.
