Image (PNG) to Text Through AppleScript

peavine · April 14, 2023, 1:11am

I reread the OP’s post, and I’m not sure the handler above will do what he wants. If he wants to get text from an image on the clipboard (which appears to be the case), the script below will do that. A few comments:

The script returns a list, which is easily coerced to text after setting the desired text item delimiters.
An error is thrown if the clipboard does not contain an image, so error handling needs to be added for that.
I have essentially no expertise with the Vision framework, and there may be a better way to do this. The constraints noted by chrillek should also be kept in mind.

use framework "AppKit"
use framework "Foundation"
use framework "Vision"
use scripting additions

set theText to getImageText()

on getImageText()
	set thePasteboard to current application's NSPasteboard's generalPasteboard()
	set imageData to thePasteboard's dataForType:(current application's NSPasteboardTypeTIFF)
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithData:imageData options:(current application's NSDictionary's alloc()'s init())
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	requestHandler's performRequests:(current application's NSArray's arrayWithObject:(theRequest)) |error|:(missing value)
	set theResults to theRequest's results()
	set theText to {}
	repeat with observation in theResults
		copy ((first item in (observation's topCandidates:1))'s |string|() as text) to end of theText
	end repeat
	return theText
end getImageText
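
The two comments above about error handling and list coercion could be addressed along the following lines. This is only a sketch, not a tested replacement: the handler name getClipboardImageText and the error messages are made up for illustration, and joining the recognized lines with linefeed is just one choice of delimiter.

```applescript
use framework "AppKit"
use framework "Foundation"
use framework "Vision"
use scripting additions

try
	set theLines to getClipboardImageText()
	-- Coerce the list to text, one recognized line per paragraph
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to linefeed
	set theText to theLines as text
	set AppleScript's text item delimiters to savedDelimiters
on error errorMessage
	display alert "OCR failed" message errorMessage
end try

-- Hypothetical handler name; returns a list of recognized lines
on getClipboardImageText()
	set thePasteboard to current application's NSPasteboard's generalPasteboard()
	set imageData to thePasteboard's dataForType:(current application's NSPasteboardTypeTIFF)
	-- dataForType: returns missing value when no TIFF data is available on the clipboard
	if imageData is missing value then error "The clipboard does not contain an image."
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithData:imageData options:(current application's NSDictionary's dictionary())
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	-- Passing (reference) makes the call return {success flag, NSError or missing value}
	set {theResult, theError} to requestHandler's performRequests:(current application's NSArray's arrayWithObject:theRequest) |error|:(reference)
	if theResult is false then error (theError's localizedDescription() as text)
	set theResults to theRequest's results()
	set theLines to {}
	repeat with observation in theResults
		set end of theLines to (((observation's topCandidates:1)'s firstObject())'s |string|()) as text
	end repeat
	return theLines
end getClipboardImageText
```

If the clipboard holds something other than an image, the alert reports the problem instead of the script failing inside the Vision call.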

Asuvathdhaman · April 14, 2023, 2:59pm

Thanks @peavine and @chrillek .

Sorry, I do not know Objective-C, so I am not able to understand the code.

My actual task is to count the total number of words used in the image and produce a report (either Excel or CSV).

My plan is to store the content in an array and then write the data from that array in Excel or CSV format.

I do not know how to store the data in an array in Objective-C.
I do not know how to create the CSV or Excel file from Objective-C.
We need to process multiple PNG files at a time.

I am also unable to display the result with the “display alert” command in AppleScript, although the script runs and I can see the result in the Description panel.

Can you please help me to resolve my issue?

Thanks
Asuvath

Asuvathdhaman · April 14, 2023, 4:19pm

Hi @Fredrik71 ,

Thank you so much for your response.

I think this is exactly what I want.

Thanks a lot.

Thanks
Asuvath

Asuvathdhaman · December 14, 2023, 12:24pm

Hi @Fredrik71 ,

Hope you are doing well.

After a long time, I have an enhancement request for this tool.

I am able to extract English content perfectly with this script.

Now the requirement is to handle non-English languages (Chinese, Korean, Japanese, Arabic, etc.).

Can you please help me to get it done?

Thanks
Asuvath

peavine · December 14, 2023, 9:53pm

Fredrik71 has answered the OP’s question, but I thought I would update my script FWIW. I tested Fredrik71’s script and my script with a simple image containing English-language text, and the scripts produce essentially identical results: both return a string, and both join text fragments with a space.

As written, my script automatically detects the language found in the image, and this generally worked as expected in limited testing with English, French, Japanese, and Chinese text. If automatic detection does not work for you, you can specify the desired languages, but you should be aware of the following, which is from the Vision documentation:

If not otherwise specified, Vision biases its results toward English. To alter its default behavior, provide an array of supported languages in the request’s recognitionLanguages property. The order in which you provide the languages dictates their relative importance. To recognize traditional and simplified Chinese, specify zh-Hant and zh-Hans as the first elements in the request’s recognitionLanguages property. English is the only other language that you can pair with Chinese.

use framework "Foundation"
use framework "Vision"
use scripting additions

set theFile to (choose file of type {"public.image"})
set theText to getText(theFile)

on getText(theFile)
	set theFile to current application's |NSURL|'s fileURLWithPath:(POSIX path of theFile)
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithURL:theFile options:(missing value)
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	theRequest's setAutomaticallyDetectsLanguage:true -- test this first
	-- theRequest's setRecognitionLanguages:{"en", "fr"} -- if the above doesn't work
	theRequest's setUsesLanguageCorrection:false -- language correction if desired but not Chinese
	requestHandler's performRequests:(current application's NSArray's arrayWithObject:(theRequest)) |error|:(missing value)
	set theResults to theRequest's results()
	set theArray to current application's NSMutableArray's new()
	repeat with aResult in theResults
		(theArray's addObject:(((aResult's topCandidates:1)'s objectAtIndex:0)'s |string|()))
	end repeat
	return (theArray's componentsJoinedByString:space) as text -- return a string
end getText
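
For Chinese specifically, the documentation quoted above suggests listing zh-Hant and zh-Hans first, optionally followed by English. A sketch of what that might look like is below. The handler name getChineseText is hypothetical, the language codes are taken from the quoted documentation plus "en-US", and I have not verified the result against a range of images.

```applescript
use framework "Foundation"
use framework "Vision"
use scripting additions

set theFile to (choose file of type {"public.image"})
set theText to getChineseText(theFile)

-- Hypothetical handler: same flow as getText, but with explicit languages
on getChineseText(theFile)
	set theURL to current application's |NSURL|'s fileURLWithPath:(POSIX path of theFile)
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithURL:theURL options:(current application's NSDictionary's dictionary())
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	-- Order matters: Chinese variants first, then English (the only permitted pairing)
	theRequest's setRecognitionLanguages:{"zh-Hant", "zh-Hans", "en-US"}
	theRequest's setUsesLanguageCorrection:false
	requestHandler's performRequests:(current application's NSArray's arrayWithObject:theRequest) |error|:(missing value)
	set theResults to theRequest's results()
	set theArray to current application's NSMutableArray's new()
	repeat with aResult in theResults
		(theArray's addObject:(((aResult's topCandidates:1)'s objectAtIndex:0)'s |string|()))
	end repeat
	return (theArray's componentsJoinedByString:space) as text
end getChineseText
```

Automatic language detection is left off here so that the explicit language list is what drives recognition.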

Asuvathdhaman · December 15, 2023, 6:26am

Thanks a lot @Fredrik71 .

Asuvathdhaman · December 15, 2023, 6:28am

Thank you very much @peavine . I hope this will solve my request. I will check and confirm.

Asuvathdhaman · December 15, 2023, 11:36am

Thank you so much @Fredrik71 and @peavine . It is working fine for Chinese and Japanese.

I tried this script with Arabic, but it does not seem to be supported. I am also not sure what the Arabic language code is. Can you please advise on this?

Thanks
Asuvath

peavine · December 15, 2023, 1:47pm

Asuvathdhaman. The Arabic language codes are “ar” and “ara”. However, when set to these codes, my script does not recognize text in Arabic-language examples. I checked (with the supportedRecognitionLanguagesAndReturnError method), and the following languages are supported on my Sonoma computer:

{“en-US”, “fr-FR”, “it-IT”, “de-DE”, “es-ES”, “pt-BR”, “zh-Hans”, “zh-Hant”, “yue-Hans”, “yue-Hant”, “ko-KR”, “ja-JP”, “ru-RU”, “uk-UA”, “th-TH”, “vi-VT”}

So, as far as I can ascertain, OCR of Arabic is not supported.

Asuvathdhaman · December 15, 2023, 2:39pm

Thanks @peavine . I have noted this information.

Thanks
Asuvath

Asuvathdhaman · December 15, 2023, 3:09pm

Hi @peavine and @Fredrik71 ,

Sometimes, for a few images, we are not able to extract the content completely. We get only a small part of the text.

Can you please advise on this?

I have attached a few images here for your reference.

Thanks
Asuvath
[Attached image: Spanish]

peavine · December 15, 2023, 4:05pm

Asuvathdhaman. The image is degraded and I suspect that’s why only small portions of the image could be read. I’ve included a similar paragraph below, and it was read as expected with my script set to automatic mode. The new image was also read with automatic mode disabled and language-specifier mode enabled with the Spanish language code “es-ES”.

Asuvathdhaman · December 16, 2023, 3:56am

Thanks @peavine . I am clear now. I will try this suggestion.

Thanks
Asuvath

Asuvathdhaman · December 16, 2023, 3:59am

Thanks @Fredrik71 . Yes, you are right, it may be due to a resolution issue.

Sorry if I rushed you for information.

Thanks

Asuvathdhaman · January 18, 2024, 10:32am

Hi @peavine and @Fredrik71 ,

Hope you are doing well.

I tried many languages (Japanese, Chinese, and Korean), and all are working well.

I am now trying Turkish. The characters below cannot be extracted, while all other characters are extracted correctly. Is there any way to get these characters too?

Ş – Latin capital letter S with cedilla
ş – Latin small letter s with cedilla
Ğ – Latin capital letter G with breve
ğ – Latin small letter g with breve
ç – Latin small letter c with cedilla
İ – Latin capital letter I with dot above

Thanks
Asuvath

peavine · January 18, 2024, 1:00pm

The languages supported on your computer can be determined by running the script included below. The Turkish language (codes tr and tur) is not a supported language on my computer. A few comments:

You may want to toggle the usesLanguageCorrection property (set with setUsesLanguageCorrection in my script) to see if this makes a difference.
The VNRecognizeTextRequest class does have a customWords property, but I don’t think that’s what you want.
I did a quick Google search and couldn’t find any way to add languages for use when performing OCR with VNRecognizeTextRequest. There are commercial apps that will OCR Turkish, though.

use framework "Foundation"
use framework "Vision"

set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
set supportedLanguageCodes to (theRequest's supportedRecognitionLanguagesAndReturnError:(missing value)) as list
--> {"en-US", "fr-FR", "it-IT", "de-DE", "es-ES", "pt-BR", "zh-Hans", "zh-Hant", "yue-Hans", "yue-Hant", "ko-KR", "ja-JP", "ru-RU", "uk-UA", "th-TH", "vi-VT"}
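
Building on the script above, a small helper can check a single language code before it is passed to setRecognitionLanguages. This is a sketch under my own assumptions: the handler name isLanguageSupported is hypothetical, and treating a bare code such as "tr" as matching any region-qualified entry such as "tr-TR" is my choice for illustration, not documented Vision behavior.

```applescript
use framework "Foundation"
use framework "Vision"
use scripting additions

set turkishSupported to isLanguageSupported("tr")

-- Hypothetical helper: true if languageCode, or a regional variant of it,
-- appears in the list Vision reports for this computer
on isLanguageSupported(languageCode)
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	set supportedCodes to theRequest's supportedRecognitionLanguagesAndReturnError:(missing value)
	if supportedCodes is missing value then return false
	repeat with aCode in (supportedCodes as list)
		set codeText to aCode as text
		if codeText is languageCode or codeText starts with (languageCode & "-") then return true
	end repeat
	return false
end isLanguageSupported
```

The result depends on the macOS version, so the same call may return different values on different computers.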

Asuvathdhaman · January 18, 2024, 3:14pm

Thanks @peavine .

use framework "Foundation"
use framework "Vision"
use scripting additions

set theFile to (choose file of type {"public.image"})
set theText to getText(theFile)

on getText(theFile)
	set theFile to current application's |NSURL|'s fileURLWithPath:(POSIX path of theFile)
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithURL:theFile options:(missing value)
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	--theRequest's setAutomaticallyDetectsLanguage:true -- test this first
	theRequest's setRecognitionLanguages:{"tr", "tur"} -- if the above doesn't work
	theRequest's setUsesLanguageCorrection:true -- language correction if desired but not Chinese
	requestHandler's performRequests:(current application's NSArray's arrayWithObject:(theRequest)) |error|:(missing value)
	set theResults to theRequest's results()
	set theArray to current application's NSMutableArray's new()
	repeat with aResult in theResults
		(theArray's addObject:(((aResult's topCandidates:1)'s objectAtIndex:0)'s |string|()))
	end repeat
	return (theArray's componentsJoinedByString:space) as text -- return a string
end getText

The above code (changed as you suggested) extracts the content, but we are still missing the special characters I mentioned.

I hope I have applied your suggestion correctly. Please correct me if anything is wrong.

Thanks

peavine · January 18, 2024, 3:59pm

Asuvathdhaman. Turkish is not a supported language and, as far as I know, setting the recognized languages to tr and tur does nothing. I got a sample of the Turkish language and the settings shown below returned the best results for Turkish language text, but the results are far from perfect.

use framework "Foundation"
use framework "Vision"
use scripting additions

set theFile to (choose file of type {"public.image"})
set theText to getText(theFile)

on getText(theFile)
	set theFile to current application's |NSURL|'s fileURLWithPath:(POSIX path of theFile)
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithURL:theFile options:(missing value)
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	theRequest's setAutomaticallyDetectsLanguage:true
	theRequest's setUsesLanguageCorrection:false
	requestHandler's performRequests:(current application's NSArray's arrayWithObject:(theRequest)) |error|:(missing value)
	set theResults to theRequest's results()
	set theArray to current application's NSMutableArray's new()
	repeat with aResult in theResults
		(theArray's addObject:(((aResult's topCandidates:1)'s objectAtIndex:0)'s |string|()))
	end repeat
	return (theArray's componentsJoinedByString:space) as text -- return a string
end getText
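
To see whether language correction makes a difference for a particular image, the same file can be run through both settings and the two strings compared side by side. The sketch below assumes a modified getText that takes a second, hypothetical parameter named useCorrection. The record labels are likewise made up for illustration.

```applescript
use framework "Foundation"
use framework "Vision"
use scripting additions

set theFile to (choose file of type {"public.image"})
set textWithCorrection to getText(theFile, true)
set textWithoutCorrection to getText(theFile, false)
return {corrected:textWithCorrection, uncorrected:textWithoutCorrection}

-- useCorrection is a hypothetical parameter added for this comparison
on getText(theFile, useCorrection)
	set theURL to current application's |NSURL|'s fileURLWithPath:(POSIX path of theFile)
	set requestHandler to current application's VNImageRequestHandler's alloc()'s initWithURL:theURL options:(current application's NSDictionary's dictionary())
	set theRequest to current application's VNRecognizeTextRequest's alloc()'s init()
	theRequest's setAutomaticallyDetectsLanguage:true
	theRequest's setUsesLanguageCorrection:useCorrection
	requestHandler's performRequests:(current application's NSArray's arrayWithObject:theRequest) |error|:(missing value)
	set theResults to theRequest's results()
	set theArray to current application's NSMutableArray's new()
	repeat with aResult in theResults
		(theArray's addObject:(((aResult's topCandidates:1)'s objectAtIndex:0)'s |string|()))
	end repeat
	return (theArray's componentsJoinedByString:space) as text
end getText
```

A new request handler is created for each pass, so the two runs do not share any state.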

peavine · January 18, 2024, 4:18pm

FWIW, I’ve included a screenshot below of a Turkish language image example (top) and the results returned by the above script (below). The characters returned are generally OK, but the special characters (especially the cedilla) are often missing. I don’t know if this is a limitation of the OCR code or because the Turkish language is not supported.

Asuvathdhaman · January 18, 2024, 5:18pm

Thanks @peavine .

Yes, I got the same result. I have the problem with the characters below, as mentioned earlier.

Ş – Latin capital letter S with cedilla
ş – Latin small letter s with cedilla
Ğ – Latin capital letter G with breve
ğ – Latin small letter g with breve
ç – Latin small letter c with cedilla
İ – Latin capital letter I with dot above

Thanks
Asuvath