SPEED OF do shell script VS tell application finder

Hello,

please forgive me if this topic has already been discussed. I looked and searched around for a while with no success…

Over time I have developed my own library of handlers that I use regularly and intensively. A lot of those handlers deal with files (copying, moving, creating, deleting, renaming…).

File management is mostly done by telling application “Finder” to do this and that…

I suspect that replacing the tell Finder blocks with equivalent do shell script calls would bring a big speed improvement, since there are no Apple events involved.

Before beginning a complete rewrite of a lot of handlers I would like to know if anyone can confirm this or, even better, give me some idea of the speed gain involved in the conversion.

Thanks in advance,
Paolo Bertani.

Hi, Paolo. Welcome to MacScripter.

Yes. If you’re managing large numbers of files, shell scripts will be much faster than the Finder. The exact improvement will obviously depend on how the old and new scripts are written. :wink:

Hi Nigel,
thank you.

While waiting for a reply I started making some experiments…

Every script that operates on Finder items at some point requires the list of the files and folders for a given path.

So the first handler I wrote, ItemsInFolder, does exactly that. I wrote it both in the tell Finder way and in the do shell script way.

For testing I wrote a third handler that uses ItemsInFolder to get the count of the items (files and folders) inside a given folder, scanning subfolders recursively.

I got the following results (in the form of {items count, seconds})

Finder way, via ItemsInFolderFinder
{14515, 242}

Shell way, via ItemsInFolderShell
{14517, 9}

The speed gain is about 27x

The script is as follows

set f to "MyUsbHD:Marketing - local:" -- this is the path I used for test
set d1 to current date
set n to CountItemsRecursive(f)
set d2 to current date

return {n, d2 - d1}

on CountItemsRecursive(f)
	set n to 0
	set itm to ItemsInFolderShell(f) -- change to ItemsInFolderFinder(f) to test "finder way"
	script ss
		property ritm : itm
	end script
	set L to length of ss's ritm
	set n to n + L
	repeat with i from 1 to L
		if last character of (item i of ss's ritm) is ":" then set n to n + CountItemsRecursive(item i of ss's ritm)
	end repeat
	return n
end CountItemsRecursive

on ItemsInFolderFinder(f)
	set res to {}
	try
		tell application "Finder" to set ilist to items of folder f
	on error
		return {}
	end try
	script ss
		property Rilist : ilist
	end script
	set L to length of ss's Rilist
	repeat with i from 1 to L
		set end of res to (item i of ss's Rilist) as string
	end repeat
	return res
end ItemsInFolderFinder

on ItemsInFolderShell(f)
	set pof to POSIX path of f
	try
		set ilist to paragraphs of (do shell script "ls -1 -p " & quoted form of pof)
	on error
		return {}
	end try
	script ss
		property Rilist : ilist
	end script
	set L to length of ss's Rilist
	repeat with i from 1 to L
		set item i of ss's Rilist to (POSIX file (pof & (item i of ss's Rilist))) as Unicode text
	end repeat
	return ss's Rilist
end ItemsInFolderShell

It is notable that ls ends up finding 2 more items.

In fact it turned out that

tell application "Finder" to set ilist to items of folder f

is quite difficult to translate accurately to a script that relies on shell’s ls

If you run ls on the root path you get some invisible items the Finder doesn’t show, for example Volumes

If Finder preferences are tweaked to show invisible items, then the Finder shows more items than ls -1 -p

If the above is changed to ls -1 -p -A then ls and the Finder return the same number of items, but some of them differ:

ls follows symbolic links to the referenced file (even if the -P option is added). Furthermore, the Finder shows packages as files (Safari.app), while ls shows them as folders (Safari.app:).

These are just some examples. I think more issues may surface when dealing with alias files or remote volumes…
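To see the hidden-item effect described above in isolation, here is a minimal shell sketch. The demo folder and its file names are made up for illustration. It only shows how -A changes what ls lists and that a package-like directory gets the trailing slash; it does not reproduce the Finder's own notion of invisible items or packages.

```shell
# Throwaway demo folder (hypothetical contents, not from the thread).
ls_demo=$(mktemp -d)
mkdir "$ls_demo/Sub" "$ls_demo/Fake.app"
touch "$ls_demo/visible.txt" "$ls_demo/.hidden"

# Plain listing: names starting with "." are skipped, directories get a trailing "/".
plain=$(ls -1 -p "$ls_demo")

# With -A: dot-files are included, but "." and ".." are still left out.
with_hidden=$(ls -1 -p -A "$ls_demo")

plain_count=$(printf '%s\n' "$plain" | wc -l | tr -d ' ')
hidden_count=$(printf '%s\n' "$with_hidden" | wc -l | tr -d ' ')

echo "without -A: $plain_count items, with -A: $hidden_count items"
printf '%s\n' "$plain"   # Fake.app/ is reported as a directory, unlike in the Finder

rm -rf "$ls_demo"
```

The single extra item with -A is the dot-file; on a real volume the difference depends on what the Finder itself chooses to hide.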

Thank you,
Paolo.

Well, Finder vs. shell is difficult to answer. The shell itself is always faster than the Finder, much faster, but both carry overhead.

Let’s say you have a folder containing 10,000 files and you want to move the files with particular extensions to another folder. If you get every file, check its file extension one by one in a repeat loop and invoke an action for each file, it will take a while with shell scripts, and I think the Finder with a filter would be much faster. But what if we embed the whole action in a single shell command? You would see the 10,000 files moved in a split second.

So Finder vs. shell depends entirely on how good you are at getting the best out of both. The CLI itself is without any doubt much, much faster than OSA.

Back to the moving example:
I have a folder ‘/All Downloads/’ and I want to move all files with a .doc extension to the folder ‘/All Documents/’.

Bad:

set allFiles to every paragraph of (do shell script "ls '/All Downloads'")
repeat with thisFile in allFiles
	if thisFile ends with ".doc" then
		-- one shell call per matching file; the destination must be quoted because of the space
		do shell script "mv '/All Downloads/'" & quoted form of thisFile & " '/All Documents/'" & quoted form of thisFile
	end if
end repeat

The method above works well for a folder with anything from a few to a hundred files in it. But every file runs through an if statement with a string comparison, and we all know that AppleScript’s string comparison is tremendously slow.

Better, but still bad (though the most commonly used):

-- the glob must stay outside the quotes so that the shell expands it
set allDocuments to every paragraph of (do shell script "cd '/All Downloads/' && ls -d *.doc")
repeat with thisDocument in allDocuments
	do shell script "mv '/All Downloads/'" & quoted form of thisDocument & " '/All Documents/'" & quoted form of thisDocument
end repeat

Now ls does the string comparison for us. The command is much more efficient, but we still invoke a shell command for every .doc file. That means another Apple event is created, a shell is opened, the command is sent to the CLI, the command is executed and its standard output is sent back to your script. That is still a lot of overhead, so what if we could tell the shell to handle the ls result itself?

The best (fastest) way to do this is:

do shell script "find '/All Downloads/' -type f -name '*.doc' -maxdepth 1 -exec mv {} '/All Documents/' \\;"

This is the fastest way to do this.

P.S. I’ve only tested the last example
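For anyone who wants to try the single-command approach safely before pointing it at real folders, here is a minimal shell sketch. The helper name move_by_ext and the temporary folders are hypothetical. Note that -exec … \; still starts one mv per file; the saving is that everything happens inside a single shell invocation from the script's point of view. -maxdepth is placed before the tests because some find implementations warn otherwise.

```shell
# move_by_ext SRC DEST EXT  (hypothetical helper, not from the thread)
# Moves regular files named *.EXT from the top level of SRC into DEST.
move_by_ext() {
    find "$1" -maxdepth 1 -type f -name "*.$3" -exec mv {} "$2"/ \;
}

# Throwaway folders with made-up file names, including one with a space.
mv_src=$(mktemp -d)
mv_dest=$(mktemp -d)
touch "$mv_src/a.doc" "$mv_src/b c.doc" "$mv_src/keep.txt"
mkdir "$mv_src/nested"
touch "$mv_src/nested/deep.doc"

move_by_ext "$mv_src" "$mv_dest" doc

moved=$(ls -1 "$mv_dest")                               # what arrived in DEST
left=$(find "$mv_src" -type f | wc -l | tr -d ' ')      # what stayed behind
printf 'moved:\n%s\nfiles left in source: %s\n' "$moved" "$left"

rm -rf "$mv_src" "$mv_dest"
```

Because of -maxdepth 1, the .doc file in the nested folder stays where it is, which matches the intent of moving only the files directly inside the downloads folder.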

Hi, Paolo.

In your timing script, which gets and counts HFS paths, both the Finder and shell handlers could be made faster, the Finder one considerably so. And in the counting handler, (item i of ss’s ritm) ends with “:” would be a little faster than last character of (item i of ss’s ritm) is “:”

set f to "MyUsbHD:Marketing - local:" -- this is the path I used for test
set d1 to current date
set n to CountItemsRecursive(f)
set d2 to current date

return {n, d2 - d1}

on CountItemsRecursive(f)
	set n to 0
	set itm to ItemsInFolderShell(f) -- change to ItemsInFolderFinder(f) to test "finder way"
	script ss
		property ritm : itm
	end script
	set L to length of ss's ritm
	set n to n + L
	repeat with i from 1 to L
		if (item i of ss's ritm) ends with ":" then set n to n + CountItemsRecursive(item i of ss's ritm)
	end repeat
	return n
end CountItemsRecursive

on ItemsInFolderFinder(f)
	try
		set astid to AppleScript's text item delimiters
		set AppleScript's text item delimiters to linefeed
		tell application "Finder" to set res to text items of ((items of folder f) as text)
		set AppleScript's text item delimiters to astid
		return res
	on error
		set AppleScript's text item delimiters to astid
		return {}
	end try
end ItemsInFolderFinder

on ItemsInFolderShell(f)
	set pof to POSIX path of f
	try
		return paragraphs of (do shell script "ls -1 -p " & quoted form of pof & " | sed -e 's|:|:::|g' -e 's|/|:|g' -e 's|:::|/|g' -e 's|\\(.*\\)|" & f & "\\1|'")
	on error
		return {}
	end try
end ItemsInFolderShell
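The sed stage in ItemsInFolderShell can be tried on its own in a terminal. The sketch below is a minimal illustration with made-up file names and a hypothetical function name; it leaves out the last expression, which only prepends the HFS folder path to every line. The three remaining expressions swap ":" and "/" in each name, using ":::" as a temporary placeholder so the two substitutions do not undo each other.

```shell
# swap_separators (hypothetical name): exchange ":" and "/" on every input line.
swap_separators() {
    sed -e 's|:|:::|g' -e 's|/|:|g' -e 's|:::|/|g'
}

# "Reports/" is how ls -p reports a directory; the second name contains a ":".
printf '%s\n' 'Reports/' 'notes 10:30.txt' | swap_separators
```

The directory marker "/" added by ls -p becomes the trailing ":" that the counting handler tests for, and a ":" inside a POSIX name becomes the "/" that an HFS path shows in its place.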

The point I was trying to make with my post is that the correct way to count the files in a folder recursively (including all files inside bundles) would be something like:


--use mdfind for handling bundles like files
(do shell script "find /Users/<me>/Desktop/ ! -name '.*' | wc -l") as integer

The scripts in the examples you’ve posted took 5 seconds to count the files on my desktop (and its subfolders). In your script there was still a lot of overhead, as I mentioned in my previous post. I could barely measure the speed of the example above because it is so fast: at 0.027 seconds it is roughly 200 times faster than your shell example.
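Two details of the find … | wc -l count are easy to miss, and the shell sketch below illustrates them on a throwaway folder with made-up contents. First, find also prints the starting folder itself, so the total is one higher than the number of items inside it. Second, ! -name '.*' drops hidden names but still descends into hidden folders. The function names and the -prune variant are illustrative assumptions, not something proposed in the thread.

```shell
# count_all: the thread's approach; counts the start folder and the contents of hidden folders.
count_all() {
    find "$1" ! -name '.*' | wc -l | tr -d ' '
}

# count_pruned (hypothetical variant): skips the start folder and does not enter hidden folders.
count_pruned() {
    find "$1" -mindepth 1 -name '.*' -prune -o -print | wc -l | tr -d ' '
}

# Throwaway folder: 2 visible files, 1 visible subfolder, 1 hidden file, 1 hidden folder.
cnt_demo=$(mktemp -d)
mkdir "$cnt_demo/sub" "$cnt_demo/.git"
touch "$cnt_demo/a.txt" "$cnt_demo/sub/b.txt" "$cnt_demo/.hidden" "$cnt_demo/.git/config"

all=$(count_all "$cnt_demo")
visible=$(count_pruned "$cnt_demo")
echo "count_all: $all, count_pruned: $visible"

rm -rf "$cnt_demo"
```

Which of the two numbers is the right one depends on what the count is for; tr -d ' ' is there only because some wc implementations pad their output with spaces.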

Hi, DJ. Were you up early or late? :wink:

We’re agreed that maximum speed depends on using the best tools for the job and using them in the best way they offer. Obviously there’s no need to convert POSIX paths to HFS if you only want to count them, nor indeed to reconstitute the paths to the files at all.

Just for interest, I applied the scripts to my own desktop this morning and timed them with the LapTime OSAX:

{Count result, count time in seconds}

{429, 0.97} – Paolo’s shell
{354, 8.014} – Paolo’s Finder

{426, 1.088} – My shell (obviously not as good as I thought!)
{354, 0.201} – My Finder

{430, 0.02} – Your shell

For my own curiosity, I shall now have to investigate the 76-item result range difference…

Finder (either accessed via AppleScript or used directly by the user) adds a layer of abstraction over the UNIX level.

The most noticeable case is packages (application bundles), which the Finder shows as files and the shell shows as directories. Unfortunately it is very complicated to determine whether a directory is a package without asking the Finder.

Then the Finder hides by default some directories and symlinks.

Finally (though this does not affect the item count), Finder aliases can’t be resolved, or are very difficult to resolve, without asking the Finder.

I agree, but my original plan was to translate a bunch of handlers in my own personal library from the “Finder way” to the “shell way” and substitute them in my scripts to gain speed without modifying the calling code. The first thing to do was “getting the list of items for a given path”.

Thank you for the optimization tips like

Regards,
Paolo.

It was late, Nigel :D… I was back from the pub, and before I went to bed I checked in on this interesting topic.

The shell command counts every file inside a bundle, so for a script saved as an applet my command counts more files (contents, resources, MacOS, applet, main.scpt, info.plist) than yours. In the comment I suggested using mdfind, but I didn’t use it because it’s not really fair: mdfind is a database search, not a real drive search, so its results (even with more than 10K files) will always come back in less than 0.1 seconds. Its advantage is that it handles bundles (applications, plugins etc.) as files.

That also answers (or sets aside) paol0’s conclusion(s).