NSFileManager, enumeratorAtPath and pathExtension

Here are some examples using the instance method enumeratorAtPath.

This ASObjC script searches the specified path for every file with the specified extension.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set thePath to POSIX path of (path to desktop)

my searchByExtension:"pdf" inPath:thePath

on searchByExtension:theExt inPath:thePath
	set manager to current application's NSFileManager's defaultManager()
	set enumerator to manager's enumeratorAtPath:thePath
	set theItems to enumerator's allObjects()
	set theList to {}
	repeat with i from 1 to (count theItems)
		if (((item i of theItems)'s pathExtension)'s isEqualToString:theExt) then
			set theItem to thePath & (item i of theItems) as text
			set the end of theList to theItem
		end if
	end repeat
	return theList
end searchByExtension:inPath:

Here is another approach to the handler above that filters with NSPredicate instead. This version does not use a repeat loop and is faster. It uses the ENDSWITH filter. I have also included an if statement that changes the class of theExt to a list.

It also returns the total count of the enumerator's allObjects(), to give an idea of how many objects were filtered.

on searchByExtension:theExt inPath:thePath
	if (class of theExt is text) then set theExt to {theExt}
	set manager to current application's NSFileManager's defaultManager()
	set enumerator to manager's enumeratorAtPath:thePath
	set theItems to enumerator's allObjects()
	set totalObjects to (count theItems)
	set predicate to current application's NSPredicate's predicateWithFormat:"(self endswith %@)" argumentArray:theExt
	set anArray to theItems's filteredArrayUsingPredicate:predicate
	set theSortList to (anArray's sortedArrayUsingSelector:"compare:") as list
	return {objects:theSortList, countObjects:totalObjects} as record
end searchByExtension:inPath:
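
A minimal sketch of what ENDSWITH actually matches, compared with a test on pathExtension. The sample paths are hypothetical and are filtered in memory, so nothing on disk is touched. ENDSWITH is a plain string test, so it should also match a name that merely ends in the letters "pdf" and should miss an uppercase extension.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- hypothetical sample paths, for illustration only
set thePaths to current application's NSArray's arrayWithArray:{"docs/report.pdf", "docs/notespdf", "docs/Scan.PDF"}

-- plain string test: the path only has to end with the characters "pdf"
set endsPredicate to current application's NSPredicate's predicateWithFormat_("self ENDSWITH %@", "pdf")
set endsMatches to (thePaths's filteredArrayUsingPredicate:endsPredicate) as list

-- extension test: compares the lowercased path extension against a list
set extPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", {"pdf"})
set extMatches to (thePaths's filteredArrayUsingPredicate:extPredicate) as list

return {endsMatches, extMatches}
```

If the ENDSWITH form is kept, passing ".pdf" (with the dot) as the argument is one way to avoid the false match.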

Fredrik71. Thanks for the thread, which is a topic I’ve been working on. I tested your first script and it worked fine but took 0.496 second to run. This is with a test folder containing about 380 files including 370 PDF files and 100 folders.

I tried various alternatives and the following was the fastest at 0.034 second.

use framework "Foundation"
use scripting additions

set theFolder to POSIX path of (choose folder)
set theFiles to getFiles(theFolder, {"pdf"})
repeat with anItem in theFiles
	set contents of anItem to POSIX path of anItem
end repeat
theFiles

on getFiles(theFolder, fileExtensions)
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set fileManager to current application's NSFileManager's defaultManager()
	set folderContents to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:0 errorHandler:(missing value))'s allObjects()
	set extensionPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", fileExtensions)
	return (folderContents's filteredArrayUsingPredicate:extensionPredicate) as list
end getFiles

BTW, I tested the above script without the repeat loop, in which case it returned a list of files («class furl»), and it took 0.025 second.
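
For anyone who wants to run comparable timings, here is a minimal sketch. The thread does not say which timing method was used, so the use of NSDate here is an assumption, and the delay only stands in for the code being measured.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- record the starting moment
set startDate to current application's NSDate's |date|()

-- the code to be timed goes here; a short delay stands in for it
delay 0.1

-- timeIntervalSinceNow is negative for a date in the past, so negate it
set elapsedSeconds to -((startDate's timeIntervalSinceNow()) as real)
return elapsedSeconds
```

Results will vary from run to run, so averaging several runs gives a fairer comparison.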

@peavine
Thanks for sharing. I made some changes to your script to include sorting, and I removed your repeat loop that gets the POSIX paths. As you can see, I use folderContents's |path| to get the path strings and pass those to sortedArrayUsingSelector.

That answers your last comment about «class furl» and what you could do instead of a repeat loop.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set thePath to POSIX path of (choose folder)

getFiles(thePath, {"pdf"})

on getFiles(theFolder, fileExtensions)
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set fileManager to current application's NSFileManager's defaultManager()
	set folderContents to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:(current application's NSDirectoryEnumerationSkipsPackageDescendants) errorHandler:(missing value))'s allObjects()
	set totalObjects to count folderContents
	set extensionPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", fileExtensions)
	set anArray to (folderContents's |path|)'s filteredArrayUsingPredicate:extensionPredicate
	set theSortList to (anArray's sortedArrayUsingSelector:"compare:") as list
	return {objects:theSortList, countObjects:totalObjects} as record
end getFiles

I did a comparison test:
Compared with your script, the script above (with sorting, without the repeat loop, and returning POSIX paths) is faster, but not by much. I guess it all depends on how many files are filtered and later run through a repeat loop.

@Shane
Thanks, that is a good point. I have included NSDirectoryEnumerationSkipsPackageDescendants in the above script so that it does not search inside package contents. If the contents of the path are not affected by an option parameter, it will be faster without the extra checks.

Reference: https://developer.apple.com/documentation/foundation/nsdirectoryenumerationoptions/nsdirectoryenumerationskipspackagedescendants?language=objc

One of the big advantages of using the URL enumerator is that you can set options. Skipping files in bundles may save time, and sometimes it may also save embarrassment.

Thanks Fredrik71, that’s excellent.

I modified my script to eliminate the repeat loop and to use the path property instead. I then reran the timing tests. I also tested a script that used Finder.

Script returns POSIX paths using path - 0.014 second
Script returns file paths («class furl») - 0.024 second
Script returns POSIX paths using repeat loop - 0.030 second
Script returns aliases using Finder - 29.758 seconds

FWIW, my final version of the handler, which is included below, does not descend into packages, skips hidden files, and returns POSIX paths sorted by path. The timing result for this script was 0.021 second.

on getFiles(theFolder, theExtensions) -- theFolder has a POSIX path and theExtensions is a list
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set fileManager to current application's NSFileManager's defaultManager()
	-- combine the two option flags: do not descend into packages, skip hidden files
	set enumOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
	-- enumerate the folder recursively and collect every item as an NSURL
	set theFiles to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:enumOptions errorHandler:(missing value))'s allObjects()
	-- keep items whose lowercased extension is in theExtensions
	set filterPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", theExtensions)
	set theFiles to ((theFiles's |path|)'s filteredArrayUsingPredicate:filterPredicate)
	-- sort the path strings the way Finder does and return them as an AppleScript list
	return (theFiles's sortedArrayUsingSelector:"localizedStandardCompare:") as list
end getFiles

The following is the same script except it sorts by file name rather than path. The timing result for this script was also 0.021 second.

on getFiles(theFolder, theExtensions) -- theFolder has a POSIX path and theExtensions is a list
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set fileManager to current application's NSFileManager's defaultManager()
	set enumOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
	set theFiles to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:enumOptions errorHandler:(missing value))'s allObjects()
	set filterPredicate to current application's NSPredicate's predicateWithFormat_("pathExtension.lowercaseString IN %@", theExtensions)
	set theFiles to theFiles's filteredArrayUsingPredicate:filterPredicate
	set sortDescriptor to current application's NSArray's arrayWithObject:(current application's NSSortDescriptor's sortDescriptorWithKey:"lastPathComponent" ascending:true selector:"localizedStandardCompare:")
	return ((theFiles's sortedArrayUsingDescriptors:sortDescriptor)'s |path|) as list
end getFiles

@peavine

Here is another version of your last handler that adds a wildcard, for when we want to filter on specific characters in lastPathComponent.

LIKE performs wildcard matching: * matches zero or more characters and ? matches exactly one character.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

getFiles(POSIX path of (choose folder), {"py"}, "c*")

on getFiles(theFolder, theExtensions, theWildcard) -- theFolder has a POSIX path, theExtensions is a list, and theWildcard is a LIKE pattern
	set theFolder to current application's |NSURL|'s fileURLWithPath:theFolder
	set fileManager to current application's NSFileManager's defaultManager()
	set enumOptions to (current application's NSDirectoryEnumerationSkipsPackageDescendants as integer) + (current application's NSDirectoryEnumerationSkipsHiddenFiles as integer)
	set theFiles to (fileManager's enumeratorAtURL:theFolder includingPropertiesForKeys:{} options:enumOptions errorHandler:(missing value))'s allObjects()
	set filterPredicate to current application's NSPredicate's predicateWithFormat_("(pathExtension.lowercaseString IN %@) AND (lastPathComponent like %@)", theExtensions, theWildcard)
	set theFiles to theFiles's filteredArrayUsingPredicate:filterPredicate
	set sortDescriptor to current application's NSArray's arrayWithObject:(current application's NSSortDescriptor's sortDescriptorWithKey:"lastPathComponent" ascending:true selector:"localizedStandardCompare:")
	return ((theFiles's sortedArrayUsingDescriptors:sortDescriptor)'s |path|) as list
end getFiles

Fredrik71, I tested your script and it works well. FWIW, changing “like” to “like[cd]” will make it case and diacritic insensitive, if that’s desired.
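
A minimal sketch of the difference, run against a few hypothetical file names held in memory rather than real files. The names and the "c*" pattern are only for illustration.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- hypothetical sample file names, for illustration only
set theNames to current application's NSArray's arrayWithArray:{"calc.py", "Convert.py", "notes.py"}

-- case-sensitive: should match only names that start with a lowercase "c"
set strictPredicate to current application's NSPredicate's predicateWithFormat_("self LIKE %@", "c*")
set strictMatches to (theNames's filteredArrayUsingPredicate:strictPredicate) as list

-- [cd] makes the comparison case and diacritic insensitive
set loosePredicate to current application's NSPredicate's predicateWithFormat_("self LIKE[cd] %@", "c*")
set looseMatches to (theNames's filteredArrayUsingPredicate:loosePredicate) as list

return {strictMatches, looseMatches}
```

With these sample names, the first list should contain only "calc.py", while the second should also include "Convert.py".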

@peavine you are right, but I think it is more useful without it, especially if you search for tags in file names. We can name files with the same approach that Apple's guidelines use for naming methods and properties. Uppercase letters can carry meaning or separate words in a name when we decide what a file should be called.

Being able to search for every Python script matching NS* is very useful. I take the same approach with AppleScriptObjC in my library, where I name things following Apple's guidelines.

Handlers become methods in my world.

For example:
I keep every man page in text format (stored under the same parent directory, with subfolders) so that I can search any text content, and it is very fast. I have over 14,000 man page files and it takes less than 10 seconds to find any word in their contents. When we want to filter a large collection, we sometimes need to convert it to a format that a computer is very good at searching, so that it goes faster. Limiting the subdirectory tree and searching only in a specified path, rather than everywhere, makes it much faster.

I find NSPredicate to be very powerful, but it is not always straightforward to code in AppleScriptObjC. On the internet there is a great deal of information on how to do it in Objective-C and Swift. I have not done all the examples in:

Predicate Programming Guide
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Predicates/AdditionalChapters/Introduction.html

If you are interested in the Coding Guidelines for Cocoa
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/CodingGuidelines/CodingGuidelines.html

If you would like to learn about NSComparisonPredicate
https://developer.apple.com/documentation/foundation/nscomparisonpredicate?language=objc

If you would like to learn about NSExpression
https://developer.apple.com/documentation/foundation/nsexpression?language=objc

If you would like to learn about format strings
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Strings/Articles/FormatStrings.html

Predicate filters are a query language of their own, and in the last week I have learned so much.