Optical Character Recognition (OCR) Shortcuts

The Shortcuts app does a good job of OCR, and I’ve included two shortcuts below. I have only tested these with documents that contain English text.

The first shortcut performs OCR on a file selected by the user:

OCR File.shortcut (23.0 KB)

The second shortcut prompts the user to select an area of the screen and performs OCR on the text within that area. This shortcut uses the multi-condition If statement introduced in macOS Sequoia, and I don’t know how it behaves on earlier versions of macOS. The following screenshot only shows a portion of the shortcut.

OCR Select.shortcut (22.9 KB)

Thank you for the great shortcuts. I use both of them after upgrading to macOS Tahoe. ‘OCR File’ works without any issues, but ‘OCR Select’ threw an error on the screencapture -ioxac line. I fixed it on my setup by replacing the zsh command with the ‘Take Screenshot’ and ‘Copy to Clipboard’ actions.


bolt. Thanks for the post. I’m always happy to hear that one of my shortcuts is of use.

I got a similar error with my OCR Select shortcut, and the cause appears to be a screen-recording permissions issue. After fiddling around a bit, I was able to get the shortcut to work when run from the Shortcuts app, a menu-bar icon, and a dock icon. I could not get it to work when run from Spotlight search.

I tested your shortcut and it worked well. Because the Extract Text from Image action can directly access the image returned by the Take Screenshot action, you may not need to involve the clipboard. For example:

OCR Select.shortcut (22.6 KB)

BTW, the reason I originally used the screencapture shell command was because the Take Screenshot action had a bug that caused it not to work reliably. That no longer appears to be the case, so I’ll use the new approach.
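For anyone curious about the screencapture -ioxac command mentioned above, here is a minimal Python sketch of what the flags do. The flag descriptions follow the screencapture man page as I understand it; the function names are hypothetical and not part of any shortcut in this thread, and the original shortcut ran the command from a zsh shell-script action rather than from Python.

```python
import platform
import subprocess

# Flag meanings as I read them in the screencapture man page (macOS only).
FLAGS = {
    "-i": "capture interactively (by selection or window)",
    "-o": "in window-capture mode, omit the window shadow",
    "-x": "do not play sounds",
    "-a": "do not capture attached windows",
    "-c": "send the capture to the clipboard instead of a file",
}


def build_command():
    """Return the argument list equivalent to `screencapture -ioxac`."""
    return ["screencapture", *FLAGS]


def capture_to_clipboard():
    """Run the interactive capture and return the command's exit code.

    A nonzero code may indicate a cancelled selection or, as discussed
    in this thread, possibly a screen-recording permissions problem.
    """
    if platform.system() != "Darwin":
        raise RuntimeError("screencapture is only available on macOS")
    return subprocess.run(build_command(), check=False).returncode
```

Calling capture_to_clipboard() on a Mac should bring up the selection crosshair; the result lands on the clipboard, which is why the original shortcut then had to read the clipboard before extracting text.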

It’s a much cleaner solution, thanks!

Though not a Shortcuts solution, Apple’s Preview application has offered OCR through its Export… command for PDFs since macOS Sonoma. Being Apple, they did not use the recognizable term “OCR” but instead offer an “Embed Text” option as shown below:

Screenshot 2025-09-25 at 9.11.05 AM

Opening a scanned PDF and exporting it with this option results in a searchable PDF. It has worked on scanned PDFs with English and German text, so it is probably language-independent.

For images viewed in Quick Look, if Live Text is enabled, one can select text in the image, copy it to the clipboard, and use it elsewhere.


The occasion might arise where the user is not able to select the app or window that will be the subject of OCR. An automation is one example.

In this instance, it’s a simple matter to use a Find Apps or Find Windows action followed by an Extract Text from Image action. The app has to be running, of course, but does not have to be frontmost. I don’t know whether either of the two actions has an advantage over the other. Both seem to work well.

Just for an interesting project, I wrote a shortcut that:

  • uses OCR to extract text from the frontmost window of the current app;

  • appends the extracted text to a text file in the folder set by the user at the beginning of the shortcut;

  • includes a header with the app name before the extracted text;

  • deletes lines that are blank, contain whitespace only, or have 3 or fewer characters; and

  • removes non-alphanumeric characters from the beginning of each line.
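For readers who want to see the cleanup steps outside of Shortcuts, here is a rough Python sketch of the same idea. It is not how the shortcut itself is built. The function names, the header format, and the "OCR Log.txt" file name are all hypothetical, and I am assuming that line length is measured after trimming whitespace and that short lines are dropped before the leading characters are stripped.

```python
import re
from pathlib import Path

# Matches one or more non-alphanumeric characters at the start of a line.
# [\W_] is Unicode-aware, so accented letters count as alphanumeric.
LEADING_JUNK = re.compile(r"^[\W_]+")


def clean_ocr_text(text):
    """Drop blank/short lines, then strip leading non-alphanumerics."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Blank, whitespace-only, or 3-or-fewer-character lines are dropped.
        if len(stripped) <= 3:
            continue
        cleaned = LEADING_JUNK.sub("", stripped)
        # Assumption: a line made up only of symbols is dropped as well.
        if cleaned:
            kept.append(cleaned)
    return "\n".join(kept)


def format_entry(app_name, text):
    """Build one log entry: a header with the app name, then the text."""
    # The header format is hypothetical, chosen only for illustration.
    return f"--- {app_name} ---\n{clean_ocr_text(text)}\n\n"


def append_entry(folder, app_name, text):
    """Append an entry to a text file in the user-chosen folder."""
    log_path = Path(folder) / "OCR Log.txt"  # hypothetical file name
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(app_name, text))
    return log_path
```

With this sketch, append_entry("/path/to/folder", "Safari", extracted_text) would add one cleaned-up block per run to the same file, which mirrors the append behaviour described in the list above.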

I ran the shortcut by way of Spotlight and FastScripts (using a shell script), and it worked without issue.

OCR App.shortcut (24.1 KB)

The shortcut did a good but far from perfect job with this 1506- by 2000-pixel photo.
