Stack Overflow

Nigel_Garvey · April 7, 2010, 10:50am

It’s an AppleScript quirk that getting or setting items in a list is greatly speeded up if the list variable is referenced rather than being used directly. The AppleScript Language Guide gives examples of this used when building a list, which I’ve paraphrased below. However, running these two particular scripts in AppleScript Editor on my Snow Leopard machine produces almost exactly the same timings (5 or 6 seconds). But I’m sure there was a vast difference on earlier systems:

set bigList to {}

set numItems to 100000
set t to (current date)
repeat with n from 1 to numItems
	set end of bigList to n -- Using 'bigList' directly.
end repeat
set total to (current date) - t

set bigList to {}
set bigListRef to a reference to bigList --> bigList of «script»

set numItems to 100000
set t to (current date)
repeat with n from 1 to numItems
	set end of bigListRef to n -- Referencing 'bigList' as a property of the current script.
end repeat
set total to (current date) - t

But once the list has been built, the old speed dichotomy still applies when accessing its items:

set bigList to {}

set numItems to 100000
repeat with n from 1 to numItems
	set end of bigList to n
end repeat

set t to (current date)
repeat with n from 1 to numItems
	get item n of bigList -- Direct use of 'bigList'.
end repeat
set total to (current date) - t --> About 592 seconds to get all the items in turn.

set bigList to {}
set bigListRef to a reference to bigList --> bigList of «script»

set numItems to 100000
repeat with n from 1 to numItems
	set end of bigListRef to n
end repeat

set t to (current date)
repeat with n from 1 to numItems
	get item n of bigListRef -- 'bigList' referenced.
end repeat
set total to (current date) - t --> About 1 second to get all the items in turn.

The value of bigListRef is bigList of «script» ” that is, the variable bigList belonging to the current script. So instead of using the bigListRef, you could write my bigList, which means exactly the same thing. Because it’s written directly into the source code rather than being stored in a variable, it’s actually very slightly faster still.

Only properties, globals, and run-handler variables can be referenced. Local variables can’t. The trick with them, say inside a normal handler, is to set up a temporary script object with a property, set that property to the list, then reference the property through the script object:

set bigList to {}
set bigListRef to a reference to bigList --> bigList of «script»

set numItems to 100000
repeat with n from 1 to numItems
	set end of bigListRef to n
end repeat

timeGettingItems(bigList) --> About 0 seconds to get all the items in turn.

on timeGettingItems(bigList)
	script o -- Temporary script object .
		property l : bigList -- . with a property set to the passed list.
	end script
	
	set numItems to (count bigList)
	set t to (current date)
	repeat with n from 1 to numItems
		get item n of o's l -- 'o's l' references the script object's property.
	end repeat
	
	return (current date) - t
end timeGettingItems

The run-handler variable bigList, the variable bigList inside the handler, and the script object’s property l are different variables, but they “contain” exactly the same physical list. Changes to the items of o’s l inside the handler will show up as changes in bigList outside it.

The referencing trick is only an advantage when accessing items (and properties, if I remember correctly) of a list. Actions involving the list itself are actually slightly faster without a reference. That’s why I wrote (count bigList) inside the handler rather than (count o’s l), although it doesn’t really make a lot of difference.

Richard · April 7, 2010, 10:44pm

Could this be made more efficient?

set primenumbers to {}
repeat with n from 249900 to 300000 -- current 2 half values, equivalent to 499800 to 600000
	set m to (2 * n - 1) --equation for odd numbers, the 2* being why above is half
	set y to 0
	repeat with x from 1 to n --only first half needed to be scanned for factor pairs
		if m / x = (m / x as integer) then set y to y + 1
	end repeat
	if y < 2 then set primenumbers to primenumbers & m --if only one factor pair (caught at 1) is found y will be 1
	log m
	if (m + 1) / 1000 = ((m + 1) / 1000 as integer) then log primenumbers --log result every 1000
end repeat
primenumbers

Nigel_Garvey · April 7, 2010, 11:41pm

Yep.

on getPrimeNumbers from a to b
	-- Have the list variable in a script object for referencing. 
	script o
		property primenumbers : {}
	end script
	
	-- Get the first and last valid odd numbers in the specified range, special-casing 2.
	if (a < 2) then set a to 2 -- Discard any part of the range < 2.
	if (a is 2) and (b > 1) then set end of o's primenumbers to 2
	if (a mod 2) is 0 then set a to a + 1
	if (b mod 2) is 0 then set b to b - 1
	repeat with m from a to b by 2 -- odd numbers from a to b.
		set prime to true -- This flag will be changed to false if a factor is found below.
		repeat with x from 3 to (m ^ 0.5) div 1 by 2 -- only need to scan odd numbers from 3 to the square root.
			if ((m / x) mod 1 is 0) then
				-- If a factor's found, flag prime as false and end the scan.
				set prime to false
				exit repeat
			end if
		end repeat
		if (prime) then set end of o's primenumbers to m -- Append, don't concatenate!
		-- Logging huge lists slows down the script.
		-- log m
		-- if (m + 1) mod 1000 = 0 then log o's primenumbers --log result every 1000
	end repeat
	
	return o's primenumbers
end getPrimeNumbers

getPrimeNumbers from 499798 to 600000

Richard · April 8, 2010, 12:24am

Oh gosh I just woke up and realised I haven’t scripted for 5 months, out classed!

Dylan_Weber · April 8, 2010, 12:36am

Then I don’t know what scripting is.

Richard · April 8, 2010, 12:52am

??? Be kind with your language, I’m extremely confused at the moment :lol:

I was about to restart my Dialog Maker app (1200+ lines), I think I need some revision… and more sleep than 4 hours a night :lol:

Richard · April 8, 2010, 12:55am

Which is quicker div 1 or as integer?

Nigel_Garvey · April 8, 2010, 8:43am

They don’t quite do the same thing. div 1 rounds down (ie. divides by 1 and discards the remainder), while as integer rounds to nearest, .5 rounding to the nearest even.

Sorry. I woke up this morning realising that only the odd numbers from 3 to the square root need to be scanned. Also that 2 has to be special-cased. I’ve changed the script above. It now also starts at 2 if the beginning of the specified range is less than that.

Richard · April 9, 2010, 11:39am

set t to seconds of (current date)
repeat with n from 1 to 10000000
	(1 / n) div 1
end repeat
log "div: " & (seconds of (current date)) - t

set t to seconds of (current date)
repeat with n from 1 to 10000000
	1 div n
end repeat
log "div alone: " & (seconds of (current date)) - t

set t to seconds of (current date)
repeat with n from 1 to 10000000
	(1 / n) as integer
end repeat
log "as int: " & (seconds of (current date)) - t

Sometimes as integer gets 7

But interesting…

Richard · April 9, 2010, 12:16pm

Current script:

 set primenumbers to {}
repeat with n from 499999 to 999999 by 2
	set y to 0
	repeat with x from 3 to n ^ 0.5 as integer by 2 --only first half needed to be scanned for factor pairs
		if n mod x = 0 then
			set y to 1
			exit repeat
		end if
	end repeat
	if y = 0 then set the end of primenumbers to n
	log n
	if (n + 1) / 1000 = ((n + 1) div 1000) then log primenumbers --log result every 1000
end repeat
primenumbers

It did hours work in a few minutes, thanks for helping Will the ref trick help much on my dual core SL iMac?

Nigel_Garvey · April 9, 2010, 2:09pm

Yes. Even using more interesting numbers, I get broadly similar results myself.

When I first started AppleScripting back in the days of OS 8.0, as integer could only be used if the original value equated to a whole number anyway:

8.0 as integer --> 8
"8.0" as integer --> 8
-- But:
8.1 as integer --> Error
"8.1" as integer --> Error

It’s different now, but I never got into the habit of using it for rounding purposes. Maybe I should!

But, as I said in my previous post, as integer and div 1 don’t round in the same way, which can affect the bigger picture. In your script, as integer will probably always be slightly faster than div 1 at rounding the square root of the number being tested for factors. On the other hand, div 1 always rounds down (with positive numbers) whereas as integer rounds to nearest (.5 to nearest even), which is sometimes up and sometimes down. On occasions when the square root’s rounded up to an odd number and n has no factors, the inner repeat will do one more iteration than if the square root had been rounded down, which takes a lot more time than was saved by using as integer rather than div 1 when setting the target. The effect on the script’s overall performance will depend on how many speed-advantaged roundings it cancels out and how often it happens in relation to the total number of roundings done. I must admit I’ve no idea what the ratio’s likely to be ” and the overall effect’s probably tiny anyway ” but I just wanted to point out that a faster instruction at one point doesn’t necessarily mean faster overall.

MeStinkBAD · April 9, 2010, 10:12pm

There is no limit on the size of a list. Try this.

property aList : {}


repeat 50000 times
	copy (time string of (current date)) to end of aList
end repeat

count items of aList

works…

Richard · April 10, 2010, 2:30am

About the memory thing, when I left it to do only 1000000 to 1650000 it quickly gobbled up 1.4GB ish of memory, the list couldn’t have been over 50000 long, probably closer to 20000, this needs 1400000000 bytes??? 70kB per item??? It’s crippling the machine.

Nigel_Garvey · April 10, 2010, 10:31am

Hi, Richard.

Ditch the two log lines!

Richard · April 10, 2010, 12:00pm

No, actually it takes the GB from opening the file :S… which is helped by removing the commented list of prime numbers (yes I’m that stupid) but when I run it the problem is as I said. Removing the log n fixed the memory problem, but now it is faster than the memory it hits about 100 cpu in Activity Monitor. I might add a percentage completion log.

Richard · April 11, 2010, 10:53am

set primenumbers to {}
set a to 3999999
set b to 9999999
repeat with n from a to b by 2
	set y to 0
	repeat with x from 3 to n ^ 0.5 as integer by 2
		if n mod x = 0 then
			set y to 1
			exit repeat
		end if
	end repeat
	if y = 0 then set the end of primenumbers to n
	--if (n + 1) mod 100000 = 0 then log primenumbers
	if (n + 1) mod 10000 = 0 then
		log 100 * ((n - a) / (b - a)) & "% -" & n as string
	end if
end repeat
primenumbers

The rate of logs just goes from 2 or 3 per second to 1 every 10 seconds, wherever you start, it’s not because of the size of the numbers. Why, I don’t understand at all.

Nigel_Garvey · April 11, 2010, 1:05pm

I don’t know how AppleScript Editor’s log text is augmented internally. In vanilla AppleScript, it would be by concatentation, which involves more and more memory each time:

1st entry:
“(0.0% -3999999)” – (17 characters + line ending) * 2 = 36 bytes

2nd entry:
“(0.166666666667% -4009999)” – (28 characters + line ending) * 2 = 58 bytes

Concatenate 1st & 2nd entries = create a new text containing text of 1st & 2nd entries (94 bytes) and discard 1st & 2nd entries (release 94 bytes). Memory involved in adding this entry: 188 bytes.

3rd entry:
“(0.333333333333% -4019999)” – (28 characters + line ending) * 2 = 58 bytes

Concatenate text so far & 3rd entry = create a new text (94 + 58 = 152 bytes) and discard text so far & 3rd entry (release 152 bytes). Memory involved in adding this entry: 304 bytes.

4th entry:
“(0.5% -4029999)” – (17 characters + line ending) * 2 = 36 bytes

Concatenate text so far & 4th entry = create a new text (152 + 36 = 188 bytes) and discard text so far & 4th entry (release 188 bytes). Memory involved in adding this entry: 376 bytes.

. and so on.

The figures above are no doubt less than the amounts actually allocated and released.

It reminds me of a story I was told at school of a cunning blacksmith who conned an unsuspecting punter into having his horse reshoed (four shoes, seven nails per shoe) at only a penny for the first nail, tuppence for the second, fourpence for the third, eightpence for the fourth, and so on. The shoes themselves would be free!

set nailPrice to 1
set totalPence to 0
repeat with shoe from 1 to 4
	repeat with nail from 1 to 7
		set totalPence to totalPence + nailPrice
		set nailPrice to nailPrice * 2
	end repeat
end repeat

set l to totalPence div 240 -- In old money!
set s to totalPence mod 240 div 12
set d to totalPence mod 12 div 1

set totalBill to "£" & l & "-" & s & "s-" & d & "d"

Richard · April 11, 2010, 3:08pm

1+2+3… triangle number= (n^2+n)/2 e.g. T3 = (3^2+3)/2 = 12/2 = 6 = 1+2+3
T4*7 = T28 = (28^2+28)/2 = 812/2 = 406 was that the answer?

Edit: Oops just saw it was doubling :lol:… quite simple =2^28-1

I tried long tests for the percentage logs, the results showed a margin under 100th for me, the only reason I allowed the logs in, memory is not strained at all thanks to removing the other logs but it allows the script to run at full speed putting the process at 100 cpu (and it would like more I think). Could logs really affect the speed? I’ll try testing again. I was thinking of this becoming an ASOC app with the percentages linking to a progress bar, not a log. Not really to know all these primes but mostly to get me back into AS and ASOC before I mess around with my precious old work.