Automate PDF Search via Safari Using AppleScript

I have about 1000 URLs pointing to PDF files, and I need to determine which of those PDFs are searchable and which are not.

The script needs to:

For each URL:

Tell Safari to

  1. Open a given URL (in this case a remote PDF)
  2. Search the PDF for the character “a” (the search that opens on a right-click of the document, not Command-F)
  3. Write the search result to a file

Can someone help me with steps 2 and 3?

set urlList to {"http://pricelist.list.com/pricelists/A/AEA_11-15-12.pdf", ¬
	"http://pricelist.list.com/pricelists/A/API_1608_04-05-13.pdf", ¬
	"http://pricelist.list.com/pricelists/A/Access_02-01-12.pdf", ¬
	"http://pricelist.list.com/pricelists/A/Allparts_Retail_01-01-11.pdf"}

set numURLs to (count urlList)

repeat with i from 1 to numURLs
	set theURL to (item i of urlList)
	tell application "Safari"
		open location theURL
		activate
	end tell
end repeat

Thanks

Model: iMac
AppleScript: 2.2.1
Browser: Safari 6.0.2
Operating System: Mac OS X (10.7)

This is slightly different from what you asked, but I think I found a fix to your issue. For step 2:

tell application "System Events" to tell process "Safari" to tell front window to set PDFtext to (value of static text 1 of scroll area 1 of splitter group 1 of group 1 of group 2) as string

That one line of code copies the entire text of a searchable PDF into the variable PDFtext. If the PDF is not searchable, the value will be “”. You can then search the variable for “a” if you like, or save the entire text somewhere if that’s more useful.
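For step 3, here is a rough sketch of how the result could be appended to a text file with the Standard Additions file commands. The handler name appendLine, the log file name pdf_results.txt, and the sample values of theURL and PDFtext are assumptions for illustration, not part of the original script.

```applescript
-- Hypothetical helper (not from the thread): append one line to a text file.
on appendLine(theLine, logPath)
	set fileRef to open for access (POSIX file logPath) with write permission
	try
		write (theLine & linefeed) to fileRef starting at eof
		close access fileRef
	on error errMsg number errNum
		close access fileRef -- always release the file before re-raising
		error errMsg number errNum
	end try
end appendLine

-- Assumed values for illustration; in the real loop these come from the script.
set logPath to (POSIX path of (path to desktop)) & "pdf_results.txt"
set theURL to "http://pricelist.list.com/pricelists/A/AEA_11-15-12.pdf"
set PDFtext to "sample text" -- stand-in for the value read from Safari

if PDFtext contains "a" then
	appendLine(theURL & tab & "searchable", logPath)
else
	appendLine(theURL & tab & "not searchable", logPath)
end if
```

Writing one tab-separated line per URL keeps the file easy to open in a spreadsheet once all the URLs have been processed.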

Alternatively, you can remove “as string” from the code. It will work the same for searchable PDFs, but will have a missing value if there is no searchable text. You can make use of that like this:

try
	get PDFtext
	display dialog "PDF # " & i & " is searchable."
on error
	display dialog "PDF # " & i & " is not searchable."
end try

This works consistently for me. It may need to be adjusted for different systems, but to the best of my knowledge Safari is Safari and the UI hierarchy is the same across OS X versions.
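Putting the pieces together, the loop might look something like the sketch below. It is untested against other Safari versions and rests on a few assumptions: the fixed loadDelay wait (a hypothetical value) is long enough for each PDF to load, the UI element path is the same as in the one-liner above, and UI scripting is enabled (on 10.6 and 10.7, "Enable access for assistive devices" in the Universal Access preferences). The isSearchable flag is reset on every pass so that a value left over from a previous PDF cannot leak into the next check.

```applescript
-- Sketch only: the fixed delay and the UI element path are assumptions.
set urlList to {"http://pricelist.list.com/pricelists/A/AEA_11-15-12.pdf"}
set loadDelay to 5 -- seconds; hypothetical value, adjust for your connection

repeat with i from 1 to (count urlList)
	set theURL to item i of urlList
	tell application "Safari"
		activate
		open location theURL
	end tell
	delay loadDelay -- crude wait for the PDF to finish loading
	
	set isSearchable to false -- reset for every URL
	try
		tell application "System Events"
			tell process "Safari"
				tell front window
					set PDFtext to value of static text 1 of scroll area 1 of splitter group 1 of group 1 of group 2
				end tell
			end tell
		end tell
		if PDFtext is not missing value and PDFtext is not "" then set isSearchable to true
	end try
	
	-- Shows the result in the editor's event log; swap in file writing as needed.
	log (theURL & tab & isSearchable)
end repeat
```

If the element path does not exist on your system, the try block simply leaves isSearchable as false, so it is worth checking a PDF you know to be searchable first.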

Model: Mac Mini
AppleScript: 2.1.2
Browser: Safari 534.59.8
Operating System: Mac OS X (10.6)