Unable to read the URL of a file

Hi,
I am trying to extract the text from a PDF file using commands I found on this site that use the PDFKit framework.

The command I am using requires the URL of the PDF. I have a working solution, but I wonder why I am unable to read the URL value from the file's property list.

Here is my code:

use framework "Foundation"
use framework "PDFKit"
use scripting additions

# Prompt for a folder of files
tell application "Finder"
	set file_list to entire contents of (choose folder with prompt "Please select directory.")
end tell


# Loop through each file extracting its text
repeat with i from 1 to (count of file_list)
	set FileKind to the kind of item i of file_list
	if FileKind is "PDF document" then
		# Next two lines do not work in this context
		--set posixPath to POSIX path of item i of file_list
		--set theURL to (current application's class "NSURL"'s fileURLWithPath:posixPath)
		
		# Try to get the URL via the properties - FAIL
		set thePDF to item i of file_list
		set tFileProperties to (properties of thePDF)
		set theURL to the URL of tFileProperties
		
		
		# This works but requires a call to the Finder
		tell application "Finder" to get URL of thePDF
		set theURL to the result
		
		# calls to PDFkit
		set thePDF to (current application's PDFDocument's alloc()'s initWithURL:theURL)
		set theText to thePDF's |string|() as text
		log "done"
	end if -- file is a pdf document
end repeat

Obviously I don’t understand how to work with properties. Any thoughts or guidance?

S

Well, I have solved the problem. The fix was to get the file reference into the correct form, remembering that an alias is not the same as a string containing the same characters.

Here is my revised code snippet. I’m more than happy to have someone improve on it and/or explain what is happening. I have looked at the documentation for the PDFKit framework and am none the wiser.

use framework "Foundation"
use framework "PDFKit"
use scripting additions

# Prompt for a folder of files
tell application "Finder"
	set file_list to entire contents of (choose folder with prompt "Please select directory.")
end tell


# Loop through each file extracting its text
repeat with i from 1 to (count of file_list)
	set FileKind to the kind of item i of file_list
	if FileKind is "PDF document" then
		
		# A Finder item reference cannot be coerced directly to a POSIX path, so coerce it to an alias first.
		set thePDFfilePath to item i of file_list as alias
		set thePDFfilePath to POSIX path of thePDFfilePath
		
		# calls to PDFkit
		# Initialise a pdf document
		set theURL to (current application's class "NSURL"'s fileURLWithPath:thePDFfilePath)
		set thePDF to (current application's PDFDocument's alloc()'s initWithURL:theURL)
		
		set theText to thePDF's |string|() as text
		log "done"
	end if -- file is a pdf document
end repeat

S

Simon. I’ve included my suggestion below with explanatory comments. For various reasons, I would not use the kind property to test for PDFs. You might instead use a whose clause with Finder to filter PDFs, but that doesn’t seem to work with macOS 26.3 (and the ASObjC solution is simpler and probably faster).

BTW, an error is reported if the chosen folder does not contain two or more PDFs. You can add error handling to suit your use case.

use framework "Foundation"
use framework "PDFKit"
use scripting additions

# Get all files in the chosen folder and its subfolders.
# Return a list of aliases which can be coerced to POSIX paths.
tell application "Finder"
	set file_list to every file of the entire contents of (choose folder with prompt "Please select directory.") as alias list
end tell

# For testing only.
set pdfContents to ""

# Loop through each file.
repeat with aFile in file_list
	
	# Create an NSURL, which in this case points to a local file.
	# An NSURL is required by PDFDocument.
	set theURL to (current application's |NSURL|'s fileURLWithPath:(POSIX path of aFile))
	
	# Get lowercase file extension.
	set fileExtension to (theURL's pathExtension())'s lowercaseString()
	
	# Test for file extension.
	if (fileExtension's isEqualToString:"pdf") is true then
		
		# Get the PDF document object.
		set thePDF to (current application's PDFDocument's alloc()'s initWithURL:theURL)
		
		# Get an NSString from the PDF document and coerce to text.
		set theText to thePDF's |string|() as text
		set pdfContents to pdfContents & theText # For testing only
		log "done"
	end if
	
end repeat

# Just to make sure everything is working as expected
return pdfContents # for testing only
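
The error handling mentioned above could look something like the sketch below. It is only a guess at what suits your use case: it wraps the Finder call in a try block, passes a cancelled dialog (error -128) through unchanged, and skips any file that PDFKit cannot open, since initWithURL: gives missing value in that case. The dialog wording is purely illustrative.

```applescript
use framework "Foundation"
use framework "PDFKit"
use scripting additions

# Build the file list, reporting any Finder error rather than stopping with a raw error.
try
	tell application "Finder"
		set file_list to every file of the entire contents of (choose folder with prompt "Please select directory.") as alias list
	end tell
on error errMsg number errNum
	# Error -128 means the user cancelled; pass it on unchanged.
	if errNum is -128 then error number -128
	display dialog "Could not build the file list: " & errMsg buttons {"OK"} default button 1
	return
end try

set pdfContents to ""
repeat with aFile in file_list
	set theURL to (current application's |NSURL|'s fileURLWithPath:(POSIX path of aFile))
	if (((theURL's pathExtension())'s lowercaseString()) as text) is "pdf" then
		set thePDF to (current application's PDFDocument's alloc()'s initWithURL:theURL)
		# Skip files that PDFKit could not open.
		if thePDF is not missing value then
			set theString to thePDF's |string|()
			# Skip documents for which no string is returned.
			if theString is not missing value then set pdfContents to pdfContents & (theString as text)
		end if
	end if
end repeat
return pdfContents
```

Whether to show a dialog, log the problem, or silently return an empty result is a design choice that depends on how the script is run.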

Many thanks, I shall incorporate your suggestions.

Here is a version I made. It doesn’t use the Finder.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
use framework "Foundation"
use framework "PDFKit"

local aFolder, myFiles, i
set aFolder to choose folder
set myFiles to list folder aFolder without invisibles
repeat with i from (count myFiles) to 1 by -1 -- this loop will remove all non-pdf files
	if item i of myFiles does not end with ".pdf" then
		set item i of myFiles to item 1 of myFiles
		set myFiles to rest of myFiles
	end if
end repeat
set aFolder to POSIX path of aFolder -- a folder's POSIX path ends with "/"
set pdfContents to ""
repeat with i in myFiles
	# Build a file URL from the POSIX path. fileURLWithPath: copes with spaces and
	# other characters in file names that would otherwise need percent-encoding.
	set theURL to (current application's |NSURL|'s fileURLWithPath:(aFolder & i))
	# Get the PDF document object.
	set thePDF to (current application's PDFDocument's alloc()'s initWithURL:theURL)
	# Get an NSString from the PDF document and coerce to text.
	set theText to thePDF's |string|() as text
	set pdfContents to pdfContents & theText # For testing only
end repeat

FWIW, my earlier statement that a whose clause doesn’t work with macOS 26.3 is incorrect. Both of the following worked on my macOS 26.3 computer. However, using ASObjC to filter PDF documents is probably much quicker.

tell application "Finder"
	set file_list to (every file of the entire contents of (choose folder with prompt "Please select directory.") whose kind is "PDF Document") as alias list
	# 	set file_list to (every file of the entire contents of (choose folder with prompt "Please select directory.") whose name extension is "pdf") as alias list
end tell
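
On the original question of why URL of tFileProperties failed: my understanding, which I have not confirmed against the Finder dictionary, is that it is a terminology issue. Outside a Finder tell block, URL compiles as AppleScript's own URL term rather than the Finder's URL property, so it does not match the key in the record the Finder returned. Terms such as kind happen to compile the same way in both contexts, which would explain why that line worked. A minimal sketch, assuming the chosen folder is not empty:

```applescript
use scripting additions

tell application "Finder"
	set file_list to entire contents of (choose folder with prompt "Please select directory.")
	set theItem to item 1 of file_list
	set tFileProperties to properties of theItem
	# Inside the tell block, URL is compiled with the Finder's terminology,
	# so it should match the URL key in the properties record.
	set theURL to URL of tFileProperties
end tell

log theURL
```

This still involves the Finder, so the alias and POSIX path route shown earlier remains the simpler way to build an NSURL.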