OCRmyPDF: adds an OCR text layer to scanned PDF files, allowing them to be searched (macOS App)

OCRmyPDF is a free, open-source command-line tool that adds an OCR text layer to scanned PDF files so that they can be searched and their text copied and pasted. It is already used to process and search millions of large PDF files.

Features

Its features include:

Generates a searchable PDF/A file from a regular PDF
Places OCR text accurately below the image to ease copy and paste
Keeps the exact resolution of the original embedded images
When possible, inserts OCR information as a "lossless" operation without disrupting any other content
Optimizes PDF images, often producing files smaller than the input file
If requested, deskews and/or cleans the image before performing OCR
Validates input and output files
Distributes work across all available CPU cores
Uses the Tesseract OCR engine to recognize more than 100 languages
Keeps your private data private
Scales properly to handle files with thousands of pages
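
Because OCRmyPDF is a command-line tool, several of the features listed above map onto command-line options. The following is a minimal sketch, not an official wrapper, of how a Python script might assemble and run such a command. The helper names build_ocrmypdf_command and run_ocr are hypothetical and exist only for this illustration. The sketch assumes the ocrmypdf executable is installed and on the PATH, and that the -l, --deskew, --clean and --jobs options behave as described in the OCRmyPDF documentation (cleaning typically also needs the separate unpaper program).

```python
import shutil
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, languages=("eng",),
                           deskew=False, clean=False, jobs=None):
    """Assemble an ocrmypdf argument list.

    Hypothetical helper for illustration; it is not part of OCRmyPDF.
    """
    # Multiple Tesseract languages are joined with "+", e.g. "eng+deu".
    cmd = ["ocrmypdf", "-l", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")   # straighten crooked pages before OCR
    if clean:
        cmd.append("--clean")    # clean pages before OCR
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # cap the number of worker processes
    cmd += [str(input_pdf), str(output_pdf)]
    return cmd


def run_ocr(input_pdf, output_pdf, **options):
    """Run ocrmypdf as a subprocess; raises if it is missing or fails."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf executable not found on PATH")
    cmd = build_ocrmypdf_command(input_pdf, output_pdf, **options)
    return subprocess.run(cmd, check=True)
```

For example, run_ocr("scan.pdf", "searchable.pdf", languages=("eng", "deu"), deskew=True) would deskew each page and recognize English and German text. The file names here are placeholders.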
