How to speed up AppleScript?

Hello EM,

Sorry for the long delay.

The AppleScript Language Guide PDF defines the “a reference to” operator on pages 203, 204 and 205. Here’s a link: http://developer.apple.com/documentation/applescript/conceptual/applescriptlangguide/AppleScriptLanguageGuide.pdf It says the operator has several uses, one of which is the one I mentioned: accessing items in a list efficiently. It also gives example scripts that you can run to see for yourself. I can assure you it does work, although loading the whole script as a script object, as Adam Bell suggested, may provide the same speed gain. (I will try it when I have time, to see how it compares.)

However, since you did NOT see any difference in speed, I wonder if your delay might be coming from somewhere else in your script. As with most complex problems, I think your best approach would be to “divide and conquer.” Since you suspect the list-checking part of the script, why not make a copy of your script and then delete or comment out everything but the list checking? Also, I couldn’t tell where the file name strings you were checking were coming from. If they are coming from another list, bear in mind that you might want to use the “a reference to” operator for both lists. Can you load the entire list of 24,000 file name strings into a list and then check your other list of name strings without doing any logging, data checking or other work? If so, and it turns out to run without bogging down, work outward from there.

Here are some wild guesses:

How are you building your lists? I’m pretty sure that 24,000 cycles of:

set my_list to my_list & this_string

will take a lot longer than:

copy this_string to the end of my_list

and it should be even faster with:

copy this_string to the end of my_list_ref (where my_list_ref is a reference to my_list)
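
To make the three variants concrete, here is a minimal sketch that builds the same list each way. The handler names and the "file n" sample strings are made up for illustration, and the third handler keeps the list in a script object property because a plain local variable inside a handler can't be referenced.

```applescript
-- Hypothetical handlers: each builds a list of n strings a different way.

on build_by_concatenation(n)
	set my_list to {}
	repeat with i from 1 to n
		-- Concatenation creates a brand-new list on every pass.
		set my_list to my_list & {"file " & i}
	end repeat
	return my_list
end build_by_concatenation

on build_by_appending(n)
	set my_list to {}
	repeat with i from 1 to n
		-- Appends to the existing list.
		copy ("file " & i) to the end of my_list
	end repeat
	return my_list
end build_by_appending

on build_through_reference(n)
	-- The list lives in a script object property so it can be referenced.
	script holder
		property my_list : missing value
	end script
	set holder's my_list to {}
	set my_list_ref to a reference to holder's my_list
	repeat with i from 1 to n
		copy ("file " & i) to the end of my_list_ref
	end repeat
	return holder's my_list
end build_through_reference
```

All three return the same list, so you can call each with the same n and compare only the running times.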

How about the data checking? Could your script be comparing the data to every item in the list? Can you remove the data checking and then run the script to see how it compares?

How about timing every piece of work so you can really see where the slowdown occurs?
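
One rough way to do the timing is to subtract dates, as in the sketch below. The helper name seconds_since and the two placeholder stages are assumptions for illustration. Date arithmetic only resolves to whole seconds, so this is meant for finding the big delays, not for fine measurement.

```applescript
-- Hypothetical helper: whole seconds elapsed since start_date.
on seconds_since(start_date)
	return (current date) - start_date
end seconds_since

set t to current date
-- ... first piece of work goes here, e.g. building the list ...
set build_seconds to seconds_since(t)

set t to current date
-- ... second piece of work goes here, e.g. checking the list ...
set check_seconds to seconds_since(t)

log "build: " & build_seconds & " s, check: " & check_seconds & " s"
```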

Well, I’m sure you know all this. My guess is that you were hoping someone could spot the problem without you having to slog through a major rebuild. Sometimes the hard way is the only way. Be encouraged, I’m sure you can get it.

Wishing you complete success,
icta

Hi.

I haven’t studied the big script in the middle of this thread yet, so excuse me for that.

The a reference to technique works on the same principle as the script object idea mentioned near the top of the thread, except that the script object solution is wrongly implemented. The idea is to use a reference to the list variable while accessing the list’s items.

The a reference to operator sets up a reference in another variable:

set my_list to {lots of items} 
set my_list_ref to a reference to my_list
--> my_list of «script»

my_list_ref refers to my_list as something belonging to the script. Through a quirk of AppleScript, accessing the list’s items using the reference is very much faster than accessing them using my_list directly. The process can be made slightly faster still by simply putting my in front of the original variable:

item i of my my_list
-- as opposed to:
item i of my_list_ref

The reference is the same in both cases, but it has to be retrieved and interpreted from my_list_ref, whereas it’s written directly into the script with my my_list.
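
As a rough sketch of how this looks in a complete loop (run at the top level of a script, where my refers to the script itself; the sample data is made up):

```applescript
-- Sample data standing in for the real list (hypothetical values).
set my_list to {}
repeat with i from 1 to 5000
	set end of my my_list to "name " & i
end repeat

set total_length to 0
repeat with i from 1 to (count my_list)
	-- "my" makes this an access through the script, which is the fast form.
	set total_length to total_length + (length of item i of my my_list)
end repeat
total_length
```

Deleting the two uses of my inside the loops gives the slow form to compare against.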

The script object technique is another way of implementing a “directly into the script” reference. It comes into its own within handlers, where variables are local and temporary and can’t be referenced. Assigning a list to a property of a script object assigns it to a variable that belongs to something, so a reference can be set up in that context:

local my_list
set my_list to {lots of items}

script V
	property l : missing value
end script

set V's l to my_list

my_list is a local variable. V’s l is a reference. The list is the same physical object in both cases.
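
Inside a handler, the whole pattern might look like the sketch below. The handler name count_jpegs and the ".jpg" test are invented for illustration; the part that matters is reading items through V's l rather than through the parameter.

```applescript
-- Hypothetical handler: counts the items of name_list that end with ".jpg".
on count_jpegs(name_list)
	script V
		property l : missing value
	end script
	set V's l to name_list
	
	set hits to 0
	repeat with i from 1 to (count name_list)
		-- Item access goes through the script object's property.
		if (item i of V's l) ends with ".jpg" then set hits to hits + 1
	end repeat
	return hits
end count_jpegs

count_jpegs({"a.jpg", "b.txt", "c.jpg"}) --> 2
```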

To get the best out of these techniques, you have to use a numbered repeat index, not the repeat with theFile in ... construction. And, ideally, commands like is in, which operate on the list itself rather than on specified items, should not use a reference.

Since you’re comparing string paths, and the cases of matching paths are likely to be the same, you could get a further speed increase by enclosing the process in a considering case block. This allows AppleScript simply to compare the strings to see if they’re the same, instead of checking each character to see if it has an equivalent in another case that has to be taken into account.

Since ABort_Copy has a boolean value, you could use it directly after if instead of comparing it with true first and using the boolean result of that. The difference is minimal, but real.

-- way up at the beginning of your script:
script V
	property CapLocList : missing value
end script
-- read the data in, then
set V's CapLocList to theCaptureLocationFileList
-- set things up as necessary, then

considering case
	repeat with i from 1 to (count theCaptureLocationFileList)
		set theFile to item i of V's CapLocList
		---at this point theFile is a string
		---at this point theCaptureLocationFileList is a list of strings
		---at this point theVerifiedList_in_Capture_LOG is a list of strings (about 25 000 items to process)
		if (ABort_Copy) then exit repeat
		if (theFile is in theVerifiedList_in_Capture_LOG) then
			my Info_Display("Verified: " & (theFile as string)) --- very slow.
		else --not verified
			set theFile to theFile as alias
			set FileinCaptureLog to GetFileinCaptureLog(theFile)
		end if
	end repeat
end considering

It also helps if the texts being compared are of the same class, i.e. both string or both Unicode text. And if theVerifiedList_in_Capture_LOG were a single, linefeed-delimited text rather than a list, that would be an advantage too. (This might not be desirable if you’re constantly updating it, though.)
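
A minimal sketch of the linefeed-delimited idea, assuming a version of AppleScript that has the linefeed constant (2.0 or later). Both handler names are made up. The text gets a linefeed at each end so that every entry is bracketed and a partial name can't match by accident.

```applescript
-- Hypothetical handler: joins a list of strings into one text,
-- with a linefeed before the first entry and after the last.
on join_lines(the_list)
	set old_delims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to linefeed
	set joined to linefeed & (the_list as text) & linefeed
	set AppleScript's text item delimiters to old_delims
	return joined
end join_lines

-- Hypothetical handler: true if the_line is one of the entries in joined.
on text_has_line(joined, the_line)
	considering case
		return joined contains (linefeed & the_line & linefeed)
	end considering
end text_has_line
```

You would call join_lines once, after the verified list has been read in, and then call text_has_line for each file in the loop instead of using is in.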

That’s about all I can say, AppleScriptwise, on the given information. As if_confused_then_ask says, what goes on in your other handlers could be relevant. If one of them adds items to theVerifiedList_in_Capture_LOG, that could slow things down when the list gets very long. AppleScript can only append a certain number of items to a list before it has to replace it with a new list in a larger block of reserved memory. This process is invisible to the script, but takes longer the more items there are in the list already.

Hello EM,

I thought of one more thing that could slow your script down. Please do not be insulted if you have known this for eons. It seems like nearly everyone would know this but I’m new to the list and don’t know if it has been reported or even if it is widely known:

Are you watching Script Editor’s Event Log History window for feedback as your script runs? If you are, it will certainly slow your script down. I have included a short script to illustrate the difference. If you didn’t know about this behavior (or for anyone who might not), run the script once with the Event Log History window open and then run it again with the window closed and compare the times. I ran it on my machine and chose a folder containing 3251 items. Here are the results:

Event Log History window open: 317 seconds
Event Log History window closed: 64 seconds

Since your script is doing a lot more work, it could amount to a much greater difference.

With a list of 24,000 items you will surely want some feedback. You could insert a counter into your repeat loop and count up to, say, 100 and then report to a TextEdit file (report once every 100 files). Or, for the most minimal feedback, launch Activity Monitor and watch that just to make sure the script is still running.
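
Here is one way the counter idea could be sketched. It appends to a plain text file (which you can open in TextEdit) using the Standard Additions file commands rather than scripting TextEdit directly. The handler names, the log file and the wording of the progress line are all assumptions.

```applescript
-- Hypothetical helper: appends one progress line to a plain text file.
-- log_path is a colon-delimited (HFS) path chosen by the caller.
on report_progress(log_path, done_count, total_count)
	set f to open for access file log_path with write permission
	try
		write ((done_count as text) & " of " & total_count & " checked" & linefeed) to f starting at eof
	end try
	close access f
end report_progress

-- Hypothetical driver loop: reports once every report_every items
-- and returns how many reports were written.
on check_all(log_path, total_count, report_every)
	set reports_made to 0
	repeat with i from 1 to total_count
		-- ... check item i here ...
		if i mod report_every is 0 then
			report_progress(log_path, i, total_count)
			set reports_made to reports_made + 1
		end if
	end repeat
	return reports_made
end check_all
```

For example, check_all((path to desktop as text) & "progress.txt", 24000, 100) would write a line after every 100 files.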

Best wishes,
icta


set my_files_list to {}
set item_count to 0

set start_time to (time of (current date))

tell application "Finder"
	
	set source_folder to choose folder with prompt "Select a folder containing LOTS of files"
	
	set entire_contents to entire contents of source_folder
	set item_count to count of entire_contents
	
	repeat with this_item in entire_contents
		if class of this_item is not folder then -- ignore folders (kind is display text, so it never equals the class file)
			set this_file_path_string to this_item as string
			copy this_file_path_string to end of my_files_list
		end if
	end repeat
	
	set end_time to (time of (current date))
	set elapsed_time to end_time - start_time
	display dialog "elapsed_time is: " & elapsed_time & " seconds." & return & return & "item_count is: " & item_count
	
end tell

Thanks for the replies folks.

if_confused_then_ask and Nigel

Thanks for the ideas. I am aware of the slowdown with Script Editor ;). Also, I implemented the a reference to operator after the script object idea. I never tried timing them exactly and as a result didn’t notice any significant increase in speed.

However, there was an increase when I implemented the script object (it was VERY obvious), and if the results are the same for both the a reference to operator and the script object then I’m sure it does work!

The slowdown does not occur when creating the list, since this all happens way at the top. The list is created by reading an SQLite db file. This process is pretty quick. I will however look at that section again, even if only to improve it by a second or two :stuck_out_tongue:

The script actually slows down as it loops through each item, after about 3000-4000 items to be more precise. I suppose it could have something to do with the memory limitation in AppleScript.

I will try the tips Nigel gave and report back.

Thanks again folks
EM

As I understand it, you are reading an SQLite table of 25000 records into an AppleScript variable, then looping through that list to test if something exists. Is that right? No wonder it’s slow :wink:

Why don’t you use the SQLite database for the searching, since it’s much better at it than AppleScript lists? Your pseudo code would be something like:

SQLSelect("select exists (select 1 from MyHugeTableOfFiles where FileName = '" & myFileNameToMatch & "');")

which will return 1 if it exists, 0 if it doesn’t.

If your search is more complex, then you are still much better off doing the actual search in SQL rather than transferring a pile of data into AppleScript and searching there.
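
If you don’t already have a handler like SQLSelect, a rough sketch using the sqlite3 command line tool through do shell script could look like this. The handler name and the db_path parameter (a POSIX path to the database file) are assumptions, and the table and column names are just the placeholders from the pseudo code.

```applescript
-- Hypothetical handler: true if file_name is in the FileName column.
on file_is_in_database(db_path, file_name)
	-- Double any single quotes so the name is a valid SQL string literal.
	set old_delims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "'"
	set name_parts to text items of file_name
	set AppleScript's text item delimiters to "''"
	set safe_name to name_parts as text
	set AppleScript's text item delimiters to old_delims
	
	set sql to "select exists (select 1 from MyHugeTableOfFiles where FileName = '" & safe_name & "');"
	-- sqlite3 prints 1 or 0 for this query.
	set answer to do shell script "/usr/bin/sqlite3 " & quoted form of db_path & " " & quoted form of sql
	return answer is "1"
end file_is_in_database
```

Each call starts a new sqlite3 process, so for thousands of lookups it may pay to do the whole comparison in a single SQL query instead.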

Tom
BareFeet