
ocrmypdf: generate a searchable PDF

May 1, 2025

What is ocrmypdf?

ocrmypdf is a command-line tool that generates a searchable PDF or PDF/A from a scanned PDF or an image of text. It does this by adding an OCR text layer to the scanned pages.

More information: https://ocrmypdf.readthedocs.io/en/latest/cookbook.html

Usage

Create a new searchable PDF/A file from a scanned PDF or image file:

ocrmypdf path/to/input_file path/to/output.pdf

Replace a scanned PDF file with a searchable PDF file in place (the input and output paths are the same):

ocrmypdf path/to/file.pdf path/to/file.pdf

Skip pages of a mixed-format input PDF file that already contain text:

ocrmypdf --skip-text path/to/input.pdf path/to/output.pdf
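
The same command can be applied to a whole folder of PDFs. The following is a minimal sketch, not part of ocrmypdf itself: the function name `ocr_dir` and the `OCRMYPDF` override variable are hypothetical names introduced here for illustration, and the loop only assumes a POSIX shell.

```sh
# Hypothetical helper: run ocrmypdf --skip-text on every PDF in a directory.
# OCRMYPDF is an assumed override variable (not an ocrmypdf feature); it lets
# you swap in another command, such as echo, for a dry run.
ocr_dir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for pdf in "$in_dir"/*.pdf; do
        # When nothing matches, the glob stays literal; skip it.
        [ -e "$pdf" ] || continue
        "${OCRMYPDF:-ocrmypdf}" --skip-text "$pdf" "$out_dir/$(basename "$pdf")"
    done
}
```

Calling `ocr_dir path/to/scans path/to/searchable` processes each file in turn. Running `OCRMYPDF=echo ocr_dir path/to/scans path/to/searchable` first prints the arguments that would be passed without touching any PDF.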

Clean, de-skew, and rotate pages of a poor scan:

ocrmypdf --clean --deskew --rotate-pages path/to/input_file path/to/output.pdf

Set the metadata of the searchable PDF file:

ocrmypdf --title "title" --author "author" --subject "subject" --keywords "keyword; key phrase; ..." path/to/input_file path/to/output.pdf
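
When many files need metadata, the title can be derived from the file name instead of typed by hand. The sketch below is one way to do that under stated assumptions: `ocr_with_title` and the `OCRMYPDF` override variable are hypothetical names used only for illustration, and the input is assumed to end in `.pdf`.

```sh
# Hypothetical helper: use the input file name (without .pdf) as the PDF title.
# OCRMYPDF is an assumed override variable for dry runs, not an ocrmypdf feature.
ocr_with_title() {
    input=$1
    output=$2
    # basename strips the directory and the .pdf suffix.
    title=$(basename "$input" .pdf)
    "${OCRMYPDF:-ocrmypdf}" --title "$title" "$input" "$output"
}
```

For example, `ocr_with_title "path/to/Annual Report.pdf" path/to/output.pdf` would pass `Annual Report` as the title.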

Display help:

ocrmypdf --help


Tags: Programming
