I am tweaking an old script to speed things up and would like to verify that, by referencing a list as a script property, I would shave time off each of the statements below.
repeat with aDate in my datelist
if not (my dateTest contains short date string of aDate) then
set end of my dateTest to short date string of aDate
if isoImageDate is not in my projList then make new project with properties {name:isoImageDate}
I would love to hear any other ideas on how to speed things up!
Here is the script in its entirety:
property delimiter : "-"
property datelist : {}
property dateTest : {}
property projList : {}
tell application "Aperture"
    -- Get the selected project ID
    tell item 1 of (selection as list) to set projId to parent's id
    -- Get date of every image in the selected project
    tell project id projId to set datelist to every image version's (value of EXIF tag "ImageDate")
    tell library 1
        -- Create "Imported by Date" folder if it does not exist
        if not (exists folder "Imported by Date") then make new folder with properties {name:"Imported by Date"}
        -- Get name of every project in "Imported by Date" folder.
        -- This is time consuming and should not be included in the loop below.
        tell folder "Imported by Date" to set projList to name of every project
        repeat with aDate in my datelist
            -- Test each date to avoid processing duplicates
            if not (my dateTest contains short date string of aDate) then
                set end of my dateTest to short date string of aDate
                -- Convert the image date to YYYY-MM-DD format
                set projectYear to year of aDate
                set projectMonth to (month of aDate as integer) as string
                if length of projectMonth is 1 then set projectMonth to "0" & projectMonth
                set projectDay to (day of aDate as integer) as string
                if length of projectDay is 1 then set projectDay to "0" & projectDay
                set isoImageDate to projectYear & delimiter & projectMonth & delimiter & projectDay as string
                tell folder "Imported by Date"
                    -- Create the project if it does not exist
                    if isoImageDate is not in my projList then make new project with properties {name:isoImageDate}
                    -- Move the images into the project
                    move (every image version of project id projId whose value of EXIF tag "CaptureYear" is year of aDate and value of EXIF tag "CaptureMonthOfYear" is month of aDate as integer and value of EXIF tag "CaptureDayOfMonth" is day of aDate) to project isoImageDate
                end tell
            end if
        end repeat
        -- Move the initial project to the Trash if no images remain
        if (count of image versions of project id projId) is 0 then delete project id projId
    end tell
end tell
A “repeat with var in list” loop doesn’t make a script much faster per se. It only does so when the list is accessed through a reference.
A small example
set bigList to {}
--this only takes a second now
repeat with x from 1 to 12500
    set end of bigList to x
end repeat
set sum to 0
--this takes very long
repeat with theNumber in bigList
    set sum to sum + theNumber
end repeat
This took around 10 seconds on my machine (i7 quad core) because “repeat with var in list” is very slow when used this way. But what if I make it a reference?
set a to {}
set bigList to a reference to a
repeat with x from 1 to 12500
    set end of bigList to x
end repeat
set sum to 0
repeat with theNumber in bigList
    set sum to sum + theNumber
end repeat
Now it is more than 10 times faster. A small change with huge improvement!
set bigList to {}
--this only takes a second now
repeat with x from 1 to 12500
set end of bigList to x
end repeat
set sum to 0
--this takes very long
repeat with theNumber in my bigList
set sum to sum + theNumber
end repeat
I am referring to the list as a script property by including “my” in the repeat statement.
Here’s a comparison of some different techniques to speed up access to large lists. Each technique is in a “repeat 50 times” loop so we can get some time measurements in seconds. The times are from my MacBook Pro 2 GHz Core i7 with 4 GB RAM.
Here are the scripts.
Base Script
set inTime to current date
repeat 50 times
    set a to {}
    repeat with x from 1 to 12500
        set end of a to x
    end repeat
    set sum to 0
    repeat with theNumber in a
        set sum to sum + theNumber
    end repeat
end repeat
set totalTime to (current date) - inTime
return {totalTime, sum}
--> {300, 78131250}
Using “a reference to”
set inTime to current date
repeat 50 times
    set a to {}
    set bigList to a reference to a
    repeat with x from 1 to 12500
        set end of bigList to x
    end repeat
    set sum to 0
    repeat with theNumber in bigList
        set sum to sum + theNumber
    end repeat
end repeat
set totalTime to (current date) - inTime
return {totalTime, sum}
--> {7, 78131250}
Using “my” to make a script object
set inTime to current date
repeat 50 times
    set a to {}
    repeat with x from 1 to 12500
        set end of my a to x
    end repeat
    set sum to 0
    repeat with theNumber in my a
        set sum to sum + theNumber
    end repeat
end repeat
set totalTime to (current date) - inTime
return {totalTime, sum}
--> {3, 78131250}
Using a distinct script object
set inTime to current date
repeat 50 times
    script s
        property bigList : {}
    end script
    set s's bigList to {} -- needed to make sure the list is emptied during each loop
    repeat with x from 1 to 12500
        set end of s's bigList to x
    end repeat
    set sum to 0
    repeat with theNumber in s's bigList
        set sum to sum + theNumber
    end repeat
end repeat
set totalTime to (current date) - inTime
return {totalTime, sum}
--> {3, 78131250}
The “my” example and the distinct script object example are testing the same thing, because the script itself is a top-level script object, right? The reason to declare a script object for this purpose would be if the variable existed inside a handler (that is, it was local), in which case you could not use “my”.
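To illustrate the handler case described above, here is a minimal sketch. The handler name sumList, the script object name wrapper and the variable names are hypothetical, chosen only for this example. It assumes the list arrives as a handler parameter, so it is local and “my” cannot be used on it:

```applescript
-- Hypothetical handler: sums a list that exists only as a local variable.
on sumList(theList)
    -- theList is local to this handler, so "my theList" is not available.
    -- Wrapping it in a small script object gives a property to refer to.
    script wrapper
        property bigList : theList
    end script
    set runningTotal to 0
    repeat with theNumber in wrapper's bigList
        set runningTotal to runningTotal + theNumber
    end repeat
    return runningTotal
end sumList

-- Build a small sample list and sum it.
set sampleList to {}
repeat with x from 1 to 100
    set end of sampleList to x
end repeat
sumList(sampleList)
--> 5050
```

At the top level of a script the “my” form is shorter. The wrapper is only needed once the list lives inside a handler.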
Actually I have never thought of using “my” to create a script object. I always used the “distinct” script object approach. That’s actually what interested me about this thread. I set up the tests to see if “my” did the same thing as the distinct script object. It seems both approaches work equally well. Your explanation sounds like a reasonable account of why you might use one option versus the other. I now have another tool in my arsenal, so good thread.
Interesting. Either things have changed recently or I missed this particular variation when testing for myself a few years ago. I didn’t think using a reference to the list variable made any difference in a ‘repeat with (variable) in (list)’ context. But the above does indeed seem to be slightly faster than:
repeat with i from 1 to (count dateList)
set aDate to item i of my dateList
However, one important caveat is that the ‘repeat with (variable) in my (list)’ form can only be used in a getting context. Changing the loop variable’s contents is a complete non-starter:
repeat with aDate in my dateList
    set aDate's contents to aDate + days
    --> error "Can't set item 1 of dateList of «script» to date \"Sunday 27 May 2012 09:22:28\"." number -10006 from item 1 of dateList
end repeat
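If the items do need changing, one possible way around this is the index-based loop shown earlier, setting each item by position. The following is only a sketch: the baseDate variable and the two sample dates are assumptions made up for the example.

```applescript
property dateList : {}

-- Sample data: two dates one day apart (baseDate is a hypothetical name).
set baseDate to current date
set dateList to {baseDate, baseDate + days}

-- Set each item by index instead of through the loop variable.
repeat with i from 1 to (count dateList)
    set item i of my dateList to (item i of my dateList) + days
end repeat

-- Each date in the list is now one day later than it was.
dateList
```

This keeps the referenced ‘my dateList’ access for both getting and setting, at the cost of an explicit index.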
With commands affecting the list itself (‘count’, ‘contains’, ‘is in’), using a reference to the list variable used to be a slight handicap rather than an advantage. Testing today in Snow Leopard, I find that ‘my dateTest contains …’ is slightly faster than plain ‘dateTest contains …’ only if the list is quite short or the item occurs near the beginning of it. Otherwise, the non-referenced form is faster.
An obvious minor efficiency in the code immediately above would be to extract each aDate’s ‘short date string’ once only:
set sdString to short date string of aDate
if not (my dateTest contains sdString) then
set end of my dateTest to sdString
Or use a value that’s faster to obtain and faster to compare:
set dayNum to (adate - (date "Saturday 1 January 1583 00:00:00")) div days -- The date format should be adjusted, if necessary, to suit the compiling machine.
if not (my dateTest contains dayNum) then
set end of my dateTest to dayNum
But most efficiencies in your script (including referencing the list variables if the lists aren’t very long) will be insignificant against the time it takes to communicate with the application.
You’re not the only one :D. I’m also sure that these tests by Hank would have had a different result in AppleScript 1.x. But anyway, I’m glad this topic was started, otherwise I still wouldn’t have known about it.
Just FYI: The second loop’s “my” is the only one that’s responsible for the performance improvement. I tested this on my machine (a G4 running 10.4.11) and actually had a 2 second performance increase by removing the “my” from the first loop (“set end of my a”).
Good catch, Marc. I tested this with the 2 scripts below. I get the same results as before, though, not an improvement (granted, my method of measuring the time isn’t the most accurate and I’m running 10.7). I guess “set end of bigList…” is a very optimized command, so there is no performance gain. We only see the gain when iterating the big list, not when filling it. Here are the scripts I ran:
set inTime to current date
repeat 50 times
    set a to {}
    repeat with x from 1 to 12500
        set end of a to x
    end repeat
    set sum to 0
    repeat with theNumber in my a
        set sum to sum + theNumber
    end repeat
end repeat
set totalTime to (current date) - inTime
return {totalTime, sum}
--> {3, 78131250}
set inTime to current date
repeat 50 times
    set a to {}
    repeat with x from 1 to 12500
        set end of a to x
    end repeat
    script s
        property bigList : {}
    end script
    set s's bigList to a
    set sum to 0
    repeat with theNumber in s's bigList
        set sum to sum + theNumber
    end repeat
    set s's bigList to {} -- needed to make sure the list is emptied during each loop
end repeat
set totalTime to (current date) - inTime
return {totalTime, sum}
Using a script object instead of “my” can be faster. As “my” (the top-level script) gets bigger, resolving a variable through it seems to take more time. When I added 25 handlers and 10 properties to the top-level script, the time went from 10 ms to 25 ms on a clean run. In that case a script object (which is very small) seems to be faster. By a clean run I mean recompiling between runs, because “my” gets cached and the cache is used on the second run. With a cached “my” it took only 5 ms. Those numbers are all relative: I’m running an ‘old’ MBP i7, and it should be faster on my newer MBP or my newest iMac. But I think the differences will be the same percentage-wise on every machine.
So my conclusion after all this is that with large scripts (or script objects), “my” is much less reliable performance-wise than references or an extra script object.
p.s. I subtracted 2 ms from my actual measured time because that’s the time it takes to execute the perl script.