PDF formats explained

Not all PDFs are the same. Learn the difference between native PDFs, scanned PDFs, PDF/A, and PDF/X — and what each means for conversion and editing.

Native PDF vs scanned PDF

The most important distinction for file conversion is whether a PDF is native (text-based) or scanned (image-based).

Native PDFs are created digitally, exported from Microsoft Word, Google Docs, InDesign, or any other software that can save as PDF. They contain actual text data: every character is stored as text that can be selected, searched, and extracted. Native PDFs usually convert cleanly to DOCX, EPUB, or plain text, although complex layouts may need some touch-up afterwards.

Scanned PDFs are photographs of paper documents: a scanner captures each page as an image and embeds it in the PDF. Unless the scan has already been run through OCR (Optical Character Recognition), there is no text layer, only images of text. You can view the content but not select, search, or copy it. Converting such a PDF to Word produces a DOCX with embedded page images, not editable text. To get editable text from a scanned PDF, run OCR software on it first.

How to tell which you have: try selecting text in your PDF reader. If you can highlight individual words, the PDF has a text layer, so it is native (or a scan that has already been through OCR). If you can only drag out rectangular regions, or nothing is selectable at all, it's scanned.
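
The same check can be automated. The sketch below is one possible approach, not a definitive test: it classifies a file from the text extracted on each page. The function names, the "native" / "scanned" / "mixed" labels, and the 20-character threshold are assumptions made for illustration, and the optional helper relies on the third-party pypdf library.

```python
from typing import Iterable, List


def classify_pdf_pages(page_texts: Iterable[str], min_chars: int = 20) -> str:
    """Classify a PDF from the text extracted on each page.

    Returns "native" if every page has a text layer, "scanned" if none
    does, and "mixed" otherwise. min_chars is an arbitrary threshold
    (an assumption) so that stray characters do not count as real text.
    """
    texts = list(page_texts)
    if not texts:
        raise ValueError("no pages to classify")
    pages_with_text = sum(1 for t in texts if len(t.strip()) >= min_chars)
    if pages_with_text == len(texts):
        return "native"
    if pages_with_text == 0:
        return "scanned"
    return "mixed"


def extract_page_texts(path: str) -> List[str]:
    """Read per-page text with pypdf (third-party: pip install pypdf)."""
    from pypdf import PdfReader  # imported here so the classifier has no dependency

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]
```

A typical call would be classify_pdf_pages(extract_page_texts("report.pdf")), where report.pdf is a placeholder file name. Keep in mind that a scan which has already been through OCR will be reported as "native", and a digitally created file with blank pages will be reported as "mixed".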

PDF subtypes

Type	Purpose	Key property
PDF 1.x–2.0	General purpose	Standard PDF — the format you create and receive in daily use
PDF/A	Long-term archiving	Self-contained: all fonts and color profiles embedded; no external references
PDF/X	Print production	Standardizes color and bleed specifications for professional printing presses

For everyday use, standard PDF is almost always the right choice. PDF/A is commonly required by government archives, legal systems, and long-term document management systems. PDF/X is for print shops and publishing workflows; you'll know if you need it because the print house will tell you.
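
If you are not sure whether a file is PDF/A, you can look at what it says about itself. PDF/A files declare their level in the embedded XMP metadata through the pdfaid:part and pdfaid:conformance properties. The sketch below is a rough illustration that searches the raw file bytes for those properties; the function name is hypothetical, and it assumes the metadata is stored uncompressed and UTF-8 encoded. It reports only what the file claims. Confirming real conformance needs a dedicated PDF/A validator.

```python
import re
from typing import Optional

# The properties may be written as XML attributes (pdfaid:part="2")
# or as elements (<pdfaid:part>2</pdfaid:part>); match both forms.
_PART = re.compile(rb'pdfaid:part(?:\s*=\s*["\'](\d)["\']|\s*>\s*(\d)\s*<)')
_CONF = re.compile(
    rb'pdfaid:conformance(?:\s*=\s*["\']([A-Za-z])["\']|\s*>\s*([A-Za-z])\s*<)'
)


def claimed_pdfa_level(data: bytes) -> Optional[str]:
    """Return the PDF/A level a file claims (e.g. "PDF/A-2b"), or None.

    This reads the claim only; it does not check that the file
    actually meets the standard.
    """
    part = _PART.search(data)
    if part is None:
        return None
    level = (part.group(1) or part.group(2)).decode("ascii")
    conformance = _CONF.search(data)
    if conformance is not None:
        letter = conformance.group(1) or conformance.group(2)
        level += letter.decode("ascii").lower()
    return "PDF/A-" + level
```

To use it on a file, read the bytes first, for example claimed_pdfa_level(open("archive.pdf", "rb").read()), where archive.pdf is a placeholder name. A result of None means no PDF/A claim was found, which is what you would expect for a standard PDF.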
