Need help optimizing a script

Hi,

I’ve got a move-by-name script that some posters on here helped me develop a while back. It does exactly what I need, but it’s really slow. I’d love any suggestions on ways to optimize it and get it running a little faster.

set baseFolder to choose folder with prompt "choose source folder"
set printFolder to choose folder with prompt "choose Destination folder"
-- or hard-coded
-- set printFolder to alias "MacHD:Users:myuser:path:to:Print:"
tell application "Finder" to set fileList to files of baseFolder
repeat with aFile in fileList
	set prefix to getPrefix(name of aFile, name extension of aFile)
	tell application "Finder"
		try
			set destinationFolder to (1st folder of printFolder whose name begins with prefix & "_")
			move aFile to destinationFolder
		end try
	end tell
end repeat

on getPrefix(aName, anExtension)
	set {TID, text item delimiters} to {text item delimiters, "_"}
	if (count text items of aName) > 1 then
		set prefix to text item 1 of aName
	else
		set prefix to text 1 thru ((get offset of "." & anExtension in aName) - 1) of aName
	end if
	set text item delimiters to TID
	return prefix
end getPrefix

Hi.

This is theoretically faster if I haven’t made any errors, but I haven’t run the code. It basically gets all the names and extensions at once and makes all the decisions in vanilla AppleScript, rather than quizzing individual files and using a Finder ‘whose’ filter each time to decide the appropriate folder. I’ve put the main body of the script into a handler to keep all the variables local and thus non-persistent, i.e. their values don’t get saved back into the script file at the end.

There are other ideas one could pursue too, but this is a start.

on main()
	script o
		property fileNameList : missing value
		property extensionList : missing value
		property folderNameList : missing value
	end script
	
	set baseFolder to choose folder with prompt "choose source folder"
	set printFolder to choose folder with prompt "choose Destination folder"
	-- or hard-coded
	-- set printFolder to alias "MacHD:Users:myuser:path:to:Print:"
	tell application "Finder"
		set {o's fileNameList, o's extensionList} to {name, name extension} of files of baseFolder
		set o's folderNameList to name of folders of printFolder
	end tell
	
	set folderNameListLen to (count o's folderNameList)
	repeat with i from 1 to (count o's fileNameList)
		set thisFileName to item i of o's fileNameList
		set thisExtension to item i of o's extensionList
		set prefix to getPrefix(thisFileName, thisExtension) & "_"
		repeat with j from 1 to folderNameListLen
			set thisfoldername to item j of o's folderNameList
			if (thisfoldername begins with prefix) then
				tell application "Finder"
					try
						move file thisFileName of baseFolder to folder thisfoldername of printFolder
					end try
				end tell
				exit repeat
			end if
		end repeat
	end repeat
end main

on getPrefix(aName, anExtension)
	set {TID, text item delimiters} to {text item delimiters, "_"}
	if (count text items of aName) > 1 then
		set prefix to text item 1 of aName
	else
		set prefix to text 1 thru ((get offset of "." & anExtension in aName) - 1) of aName
	end if
	set text item delimiters to TID
	return prefix
end getPrefix

main()

Get rid of the Finder and use an alternative such as do shell script to move the files faster. The Finder isn’t designed for batch processing; it’s designed to be controlled at user level (read: mouse and keyboard). Listing folders, moving, renaming and other standard file handling are all slow when using the Finder.

Thanks Nigel,

That really sped things up. I tried it on a small batch of images and got about a 50% speed increase.

Thanks

In 10.9, System Events is able to move and duplicate files.
Replacing the Finder with it may speed up the script.
I’m afraid that triggering do shell script would not be efficient because of the added tasks:
– converting the source and destination paths into quoted POSIX paths
– calling the Standard Additions command do shell script

Yvan KOENIG (VALLAURIS, France) jeudi 14 novembre 2013 17:32:08

Depends how good you are at writing shell scripts :smiley: If you use bash’s for loop on a folder you don’t need to quote, and you only need to invoke one shell script. I’m sure it will outperform System Events as well as the Finder. But I’m suggesting it only if performance is more important than a vanilla AppleScript solution.
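
For anyone wondering what the single-shell-script route might look like, here is a minimal bash sketch. It is not taken from any post in this thread: move_by_prefix is a made-up helper name, and the sketch assumes every destination folder is named "prefix_something". It works from the destination folders rather than from the files, and for each folder it pulls in the files whose names start with that prefix followed by "_" or ".".

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): move files from a source folder
# into the destination subfolder that shares their prefix.
# Usage: move_by_prefix /path/to/source /path/to/destination
move_by_prefix() {
    local src="${1%/}" dest="${2%/}" dir name prefix file
    # The trailing slash makes the pattern match directories only.
    for dir in "$dest"/*_*/; do
        [ -d "$dir" ] || continue        # pattern matched nothing
        name=$(basename "$dir")
        prefix=${name%%_*}               # text before the first "_"
        [ -n "$prefix" ] || continue     # skip folders whose name starts with "_"
        # Candidates: prefix_anything and prefix.anything
        for file in "$src/$prefix"_* "$src/$prefix".*; do
            [ -f "$file" ] || continue   # skip unmatched patterns and subfolders
            mv -n "$file" "$dir"         # -n: never overwrite an existing file
        done
    done
}
```

Because every variable is double-quoted, spaces and apostrophes in the names need no extra escaping inside the loop. From AppleScript, the two folders would presumably still be passed in with quoted form of POSIX path of, in a single do shell script call.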

Here is a solution that is outperformed by Nigel’s by over 50%, but I still think it is interesting enough to post. I am not totally sure it still fits the bill as a solution, since I have turned the problem around: I get the folder list first, and then get the files that match the name of each folder.

The timings (in seconds) are: original 8.56, mine 5.92, Nigel’s 4.06.

on main()
	script o
		property folderAliasList : missing value
	end script
	
	set baseFolder to choose folder with prompt "choose source folder"
	set printFolder to choose folder with prompt "choose Destination folder"
	-- or hard-coded
	-- set printFolder to alias "MacHD:Users:myuser:path:to:Print:"
	tell application "Finder"
		set o's folderAliasList to folders of printFolder as alias list
	end tell
	
	repeat with i from 1 to (length of o's folderAliasList)
		tell application "Finder"
			try
				set dm to (name of item i of o's folderAliasList)
				set dm to text 1 thru ((offset of "_" in dm) - 1) of dm
				repeat with af in (get every file of folder baseFolder whose name begins with dm)
					move af to folder (item i of o's folderAliasList)
				end repeat
			end try
		end tell
		
	end repeat
end main
main()

Hello.

Believe it or not, this version actually runs not much worse than Nigel’s (4.08 seconds). That may be because I ran it twice, or not. The open question is whether the inverted logic works.



script main
	script o
		property folderAliasList : missing value
	end script
	set baseFolder to choose folder with prompt "choose source folder"
	set printFolder to choose folder with prompt "choose Destination folder"
	-- or hard-coded
	-- set printFolder to alias "MacHD:Users:myuser:path:to:Print:"
	tell application "Finder"
		set o's folderAliasList to reverse of (sort folders of printFolder by name)
		--	set o's folderAliasList to reverse of (sort o's folderAliasList by name)
		repeat with i from 1 to (length of o's folderAliasList)
			
			try
				set dm to (name of item i of o's folderAliasList)
				set dm to text 1 thru ((offset of "_" in dm) - 1) of dm
				repeat with af in (get every file of baseFolder whose name begins with dm)
					move af to (item i of o's folderAliasList)
				end repeat
			end try
		end repeat
	end tell
	
end script
run script main

That’s a potentially good approach, since presumably there are fewer folders than files. Do you get any increase (or decrease) in speed if each batch of files is moved with a single command? eg. instead of the inner repeat:

move (every file of folder baseFolder whose name begins with dm) to folder (item i of o's folderAliasList)

Communication time between applications and OSAXen is far less than that between applications. By delegating the execution of the code to the StandardAdditions’ ‘run script’ command, you save the script-running application having to communicate commands to, and receive acknowledgements from, the Finder directly. This is particularly effective if you have to send a lot of commands to other applications.

Hello.

I couldn’t make that work. I tried before I posted the first one, I can only move one at a time. But if it worked, it would surely have been faster.

To be fair, I have now updated the script above so that it sorts the folder list and works through it in reverse order, so that files hopefully land in the correct folder. The speed is now 4.6 seconds. :slight_smile:

The second thing is that if you want further speed improvements, I’d end up with your approach anyway and would have to move away from this one. But it sure doesn’t take much effort to improve things with the Finder.

But of course doing this in bash is somewhat faster. :slight_smile:

In a script like this that takes several seconds to run, that extra overhead is going to be negligible, especially compared with the time it could save.

OK. On my machine, with 255 files in baseFolder and five destination folders in printFolder but one of these not matched by any of the files, my previously posted script takes 30.443 seconds. Much of this is taken up by the stupid sound effects the Finder makes as it moves the files.

McUsrII’s ‘run script’ script (once the two bugs have been corrected :wink:) takes 97.017 seconds.

The one below is a bit crude, but takes just 0.189 seconds. It may run into trouble if there are any single-quotes in the file names.

All timings taken from after the second ‘choose folder’ line to the end of the moving.

set baseFolder to POSIX path of (choose folder with prompt "choose source folder")
set printFolder to quoted form of POSIX path of (choose folder with prompt "choose Destination folder")

do shell script (do shell script ("find -f " & printFolder & " \\( -type d -mindepth 1 -maxdepth 1 \\) | sed -En 's|'\\''|'\\''\\'\\\\\\'''\\''|;s|(.+//([^_.]+)[_.].+)|mv -n '\\''" & baseFolder & "'\\''\\2{_,.}*  '\\''\\1'\\''|;1 h;1 !H;$ {g;s|\\n|;|g;s|$| \\&>/dev/null \\&\\& echo true \\|\\| echo false|;p;}'"))

Hello.

I am sorry about any bugs. It worked perfectly with 8 files not matching any folder and 24 files that each matched one of three folders. It would be nice to know what the bugs are.

And a single do shell script is kind of unbeatable. :slight_smile:

They become apparent as soon as the script’s run. The two folder variables aren’t set. :slight_smile:

Last-second editing error, sorry about that. I have added the missing folder variables. :slight_smile:

Thanks for all the input on this. I’m a total beginner, so a lot of it is going over my head, but I appreciate it nonetheless.

I’m hitting some errors with the latest version of Nigel’s script.

If I have files named 1.jpg, 1_1.jpg, and 1_A.jpg, it moves 1_1.jpg and 1_A.jpg but leaves 1.jpg behind.

It also throws an error when I point it at a large directory. When I try to move 224 images into 137 folders I get the following error. I have only pasted the first couple of lines; it repeats for all the items:

error "sh: -c: line 0: syntax error near unexpected token (' sh: -c: line 0: mv -n '/Users/Owen/Desktop/Test/Images/'106BF_* '/Users/Owen/Desktop/Test/Folders//106BF_Friends 4 ever Scrapbook™';mv -n '/Users/Owen/Desktop/Test/Images/'106D_* '/Users/Owen/Desktop/Test/Folders//106D_My Sweet Scrapbook™';mv -n '/Users/Owen/Desktop/Test/Images/'106PN_* '/Users/Owen/Desktop/Test/Folders//106PN_Friends 4 ever Scrapbook™';mv -n '/Users/Owen/Desktop/Test/Images/'108C_* '/Users/Owen/Desktop/Test/Folders//108C_Color a Peace & Love Memo Board™';mv -n '/Users/Owen/Desktop/Test/Images/'108F_* '/Users/Owen/Desktop/Test/Folders//108F_Color a Funky Fashion Memo Board™';mv -n '/Users/Owen/Desktop/Test/Images/'109W_* '/Users/Owen/Desktop/Test/Folders//109W_My Flower Press™';mv -n '/Users/Owen/Desktop/Test/Images/'111W_* '/Users/Owen/Desktop/Test/Folders//111W_Flower Hour™';mv -n '

The weird part is if I use it to move a smaller subset of the images it seems to work just fine.

Sorry, Gleeptoid7. I was assuming from my analysis of your original script that all the file names had “_” in them too. I’ve now adjusted my script above to handle the possibility that they may not.
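
As an illustration of a prefix rule that copes with names like 1.jpg as well as 1_A.jpg, here is a small shell sketch. prefix_of is a hypothetical helper name, and the rule it implements (cut at the first "_" or ".", whichever comes first) is an assumption about what is wanted. It is not identical to the getPrefix handler in the AppleScript versions, which cuts at the "." before the extension when there is no "_", so names containing several dots would come out differently.

```shell
# Hypothetical helper: print the part of a file name that comes before
# its first "_" or ".", whichever comes first.
prefix_of() {
    # ${1%%[_.]*} strips the longest suffix that starts with "_" or "."
    printf '%s\n' "${1%%[_.]*}"
}

# Example calls:
#   prefix_of "1.jpg"
#   prefix_of "1_A.jpg"
#   prefix_of "106BF_Friends 4 ever.jpg"
```

With this rule, 1.jpg, 1_1.jpg and 1_A.jpg all share the prefix 1, so they would all be matched to a folder named 1_something.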

I’m not sure what the “large directory” problem is. The error suggests that one or more of your directories’ names may contain characters which trip the script. As I said in the post with the script, I would expect problems if any of the names contained single-quotes. I’ll think a bit more on it. (Edit: I’ve now further adjusted the script to be able to cope with single-quotes (apostrophes) in the folder names.)
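
For reference, the usual shell trick for an apostrophe inside a single-quoted string is to close the quote, add a backslash-escaped apostrophe, and reopen the quote. Here is a minimal sketch of that idea, with sh_quote as a hypothetical helper name. It is not the sed expression from the script above, just the same technique in isolation.

```shell
# Hypothetical helper: wrap a string in single quotes so the shell treats it
# as one literal word, turning each embedded ' into '\''
sh_quote() {
    # sed sees s/'/'\\''/g, and \\ in a sed replacement is one backslash
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Example call:
#   sh_quote "Kid's Scrapbook"
```

A name quoted this way can be pasted into a generated mv command without the apostrophe ending the quoted string early, which is the kind of failure the "syntax error near unexpected token" message points to.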

Another possibility, if there are 137 folders, is that 137 “mv” commands in one string may be too much for the outer ‘do shell script’.

I don’t think there are any single quotes. The fact that the script works just fine on any one of the folders and images, but errors when I try to do the whole thing at once, makes me think it’s not a file name issue.

I was wrong. I looked back through my data set and found a pesky single quote. Once I cleared that out it worked just fine. Thanks so much for all the help on this.