Wednesday, July 23, 2014

#1 2006-05-20 12:51:51 pm

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4631

Embedded Script Objects for Speed in Processing

I have seen it stated several times in several places that the manipulation of large lists is faster if the lists are properties of embedded script objects within the script than if they are simply part of the script. For example this simple ASCII sort should run faster with the internal script than it does without:

Applescript:

script BL
   property Big_List : { a list of 1500 words }
   property index_list : {}
   property sorted_list : {}
end script

set t1 to GetMilliSec
repeat (count of BL's Big_List) times
   set the low_item to ""
   repeat with i from 1 to (count of BL's Big_List)
       if i is not in the BL's index_list then
           set this_item to item i of BL's Big_List as text
           if the low_item is "" then
               set the low_item to this_item
               set the low_item_index to i
           else if this_item comes before the low_item then
               set the low_item to this_item
               set the low_item_index to i
           end if
       end if
   end repeat
   set the end of BL's sorted_list to the low_item
   set the end of BL's index_list to the low_item_index
end repeat
set t2 to GetMilliSec
t2 - t1

After testing this a bit with a 1500-word list, I haven't found that to be the case for every run; and have found the results to be quite variable, in fact.

Am I doing something wrong, or is the original assertion incorrect?

Adam Bell


iMac & MBP both running OS X 10.9.1

Offline

 

#2 2006-05-20 06:04:36 pm

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-19
Posts: 3495

Re: Embedded Script Objects for Speed in Processing

Hi, Adam.

Access to the items or properties of a list is faster if you use a reference to identify the list rather than a simple variable. Assigning the list to a property of a script object allows you to use a reference expression like 'BL's Big_List' rather than just, say, 'Big_List'. The script object itself doesn't make access faster, it simply allows a reference expression to be written into the script. This is particularly useful inside a handler, where the list variable may be local and thus unreferenceable. At the top level of a script, though, properties and variables that aren't explicitly declared local belong to the script itself and so you can use the keyword 'my' to set up the reference:

Applescript:

set Big_List to { 1500-item list } -- A list assigned to a top-level variable.

repeat with i from 1 to 1500
   item i of my Big_List -- 'my Big_List' is a reference.
end repeat

The speed advantage with a reference only comes when accessing things belonging to the list. Commands acting on the list itself — such as 'count', 'contains', 'is in', etc. — are often a trifle slower if the list is referenced.

In your script, there are other things too that can be done to improve the speed — principally, counting Big_List only once and assigning the result to a variable before the repeats. (Otherwise it'll be counted 1501 times, always with the same result!)

In extreme cases, you can sometimes squeeze out an extra fraction of a second or so by doing positive tests instead of negative ones.  smile

Applescript:

-- Negative test:
if (i is not in the BL's index_list) then
   -- Blah blah blah
end if

-- Positive test:
if (i is in the BL's index_list) then
else
   -- Blah blah blah
end if

I would think you could leave out the 'as text' coercion too.

If you know that all your text is the same case, or that the characters involved only have one case, putting the whole thing in a 'considering case' block can also give quite a boost. It saves the internal comparison routines having to check and allow for case and casedness.

Your central 'if … else if ' block is a bit perplexing as the action is the same in both parts. As the script works correctly, I supposed you're looking for an 'or' condition. It shouldn't make any difference to the speed, but it'll make the script a little shorter.  wink

The above is all quoted from past experience. I haven't tested it today to see what actual improvement it makes to your script's speed!

Applescript:

set Big_List to { a list of 1500 words }
set index_list to {}
set sorted_list to {}

set t1 to GetMilliSec
set BL_count to (count Big_List) -- Count Big_List just once, before the repeats.
repeat BL_count times
   set the low_item to ""
   repeat with i from 1 to BL_count
       -- Positive test, and no reference when accessing the list itself.
       if (i is in the index_list) then
       else
            -- A reference when accessing an item in the list.
           set this_item to item i of my Big_List
           if (the low_item is "") or (this_item comes before the low_item) then
               set the low_item to this_item
               set the low_item_index to i
           end if
       end if
   end repeat
   -- References while setting the lists' ends.
   set the end of my sorted_list to the low_item
   set the end of my index_list to the low_item_index
end repeat
set t2 to GetMilliSec
t2 - t1


NG

Offline

 

#3 2006-05-20 06:26:33 pm

Bruce Phillips
Administrator
Registered: 2004-07-15
Posts: 2649

Re: Embedded Script Objects for Speed in Processing

Nigel Garvey wrote:

counting Big_List only once and assigning the result to a variable before the repeats. (Otherwise it'll be counted 1501 times, always with the same result!)

This would work, right?

Applescript:

set Big_List to {"a list of 1500 words"}

count Big_List
repeat with i from 1 to result
   -- whatever
end

Offline

 

#4 2006-05-20 06:38:11 pm

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4631

Re: Embedded Script Objects for Speed in Processing

Nigel;

Thanks for the reply and cleaned up code. I note that it runs at exactly the same speed with or without the internal script object and refererences to it. I wasn't trying to be elegant with the sort, btw - I was primarily intererested in whether the script object made a difference, and it is still my observation that it doesn't - the timing with "bl's" in place is the same as with the "my" you used.

In my dual-core G5/2.3, the unreferenced forms take 39 seconds and the referenced forms takes 26 (a ratio of 1.444). Adding a 'my' here and there is obviously the way to go in problems like this one.

AB


iMac & MBP both running OS X 10.9.1

Offline

 

#5 2006-05-20 08:43:11 pm

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-19
Posts: 3495

Re: Embedded Script Objects for Speed in Processing

Bruce Phillips wrote:

Nigel Garvey wrote:

counting Big_List only once and assigning the result to a variable before the repeats. (Otherwise it'll be counted 1501 times, always with the same result!)

This would work, right?

Applescript:

set Big_List to {"a list of 1500 words"}

count Big_List
repeat with i from 1 to result
   -- whatever
end

Hi, Bruce.

Yeah. That's fine if there's just one repeat block. The result goes straight into the target for the repeat. Adam's script, though, has an inner repeat that's set up and cycled through in each of the 1500 iterations of the outer repeat. The value of 'result' is different every time the inner repeat's set up, so a variable's the only way to preserve the target value for that — apart from recounting the list every time as in Adam's original.


NG

Offline

 

#6 2006-05-22 07:41:50 pm

Bruce Phillips
Administrator
Registered: 2004-07-15
Posts: 2649

Re: Embedded Script Objects for Speed in Processing

Would something like this get ‘entire contents’ every time?

Applescript:

set auto_backup to "path:to:auto backup:" (* modify as required *)
tell application "Finder" to repeat with f in (get files of entire contents ¬
   of folder auto_backup whose name does not start with "bu_")
   tell f to set name to "bu_" & name
end repeat


Filed under: Finder

Offline

 

#7 2006-05-23 09:03:52 am

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4631

Re: Embedded Script Objects for Speed in Processing

Did you mean this response to go here, Bruce, or in this thread: http://bbs.applescript.net/viewtopic.php?id=17272?


iMac & MBP both running OS X 10.9.1

Offline

 

#8 2006-05-23 09:45:42 am

Bruce Phillips
Administrator
Registered: 2004-07-15
Posts: 2649

Re: Embedded Script Objects for Speed in Processing

Here.

Offline

 

#9 2006-05-31 08:04:09 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Embedded Script Objects for Speed in Processing

Bruce Phillips wrote:

Would something like this get ‘entire contents’ every time?

Applescript:

set auto_backup to "path:to:auto backup:" (* modify as required *)
tell application "Finder" to repeat with f in (get files of entire contents ¬
   of folder auto_backup whose name does not start with "bu_")
   tell f to set name to "bu_" & name
end repeat

No, that wouldn't be very efficient, Bruce. The 'get' evaluates the target and creates a list of Finder file references - much the same as assigning a value to a variable using the 'set' command. (BTW, that code sure looks familiar... wink)

Incidentally, if anyone still has any lingering doubts about the efficacy of using references to access lists, perhaps I can just add my 2¢ to the discussion.

If you have the latest super-duper whizzy machine, don't wish to distribute scripts or advice for more general use, process only relatively short lists, aren't dealing with repeat-intensive situations and/or haven't otherwise optimised your script for performance - then don't worry about it. Otherwise, referencing might (among other optimisation techniques) be worth considering.

The following demo script, which perhaps illustrates the point better than anything I might say, could take some time (up to a few minutes) to execute on some machines.

Requires GetMilliSec:

Applescript:

property rtf1 : "{\\rtf1\\mac\\ansicpg10000\\cocoartf824\\cocoasubrtf330
\\readonlydoc1{\\fonttbl\\f0\\fswiss\\fcharset77 Helvetica;\\f1\\fswiss\\fcharset77 Helvetica-Bold;\\f2\\fswiss\\fcharset77 Helvetica-Oblique;
}
{\\colortbl;\\red255\\green255\\blue255;\\red0\\green51\\blue204;\\red204\\green0\\blue0;\\red204\\green204\\blue204;
}
\\paperw11900\\paperh16840\\margl1440\\margr1440\\vieww12740\\viewh"


property rtf2 : "\\viewkind0
\\pard\\tx280\\tqr\\tx2260\\tqr\\tx4540\\tqr\\tx6240\\tqr\\tx7940\\tqr\\tx9640\\tqr\\tx12480\\ql\\qnatural\\pardirnatural

\\f0\\fs24 \\cf0 \\up4 \\
\\pard\\tqc\\tx1280\\tqc\\tx5220\\tqc\\tx8600\\tqr\\tx12480\\ql\\qnatural\\pardirnatural
\\cf0 \\up0 \\super    
\\f1\\b List Length    \\cf2 Direct Access\\cf0    \\cf3 Script Object\\cf0    Speed Comparison
\\f0\\b0 \\
\\pard\\tx280\\tqr\\tx2260\\tqr\\tx4540\\tqr\\tx6240\\tqr\\tx7940\\tqr\\tx9640\\tqr\\tx12480\\ql\\qnatural\\pardirnatural

\\f2\\i \\cf0 \\dn4 \\ul \\ulc0    items    growth    \\cf2 \\ulc2 ms    growth\\cf0 \\ulc0    \\cf3 \\ulc3 ms    growth\\cf0 \\ulc0    factor
\\f0\\i0 \\
"

property base0 : missing value
property base1 : missing value
property base2 : missing value

on access_times from current_list for list_count against curr_test
   set m1 to GetMilliSec
   repeat with i from 1 to list_count
       set current_list's item i to current_list's item i
   end repeat
   set m2 to GetMilliSec
   script o
       property l : current_list
   end script
   repeat with i from 1 to list_count
       set o's l's item i to o's l's item i
   end repeat
   set m3 to GetMilliSec
   set t1 to m2 - m1
   set t2 to m3 - m2
   if t1 is 0 or t2 is 0 then return "\\f2\\i \\cf4 results too small for comparison" & return & "\\f0\\i0 \\cf0 \\'ca    \\'ca    \\'ca    \\'ca    \\'ca    \\"
   if base0 is missing value then
       set base0 to curr_test
       set base1 to t1
       set base2 to t2
   end if
   list_count & tab & (2 ^ (curr_test - base0) div 1) & tab & "\\cf2 \\ulc2 " & t1 div 1 & tab & (round (t1 / base1)) & "\\cf0 \\ulc0 " & tab & "\\cf3 \\ulc3 " & t2 div 1 & tab & (round (t2 / base2)) & "\\cf0 \\ulc0 " & tab & "x " & (round (t1 / t2)) & "\\"
end access_times

to compare_access_times at result_file from list_count for test_count
   set current_list to {"test"}
   repeat while (count current_list) < list_count
       set current_list to current_list & current_list
   end repeat
   set current_list to current_list's items 1 thru list_count
   set file_ref to open for access result_file with write permission
   try
       set eof file_ref to 0
       write rtf1 & 2500 - 1.2 mod (application "TextEdit"'s version) div 1.2 * 1120 + 360 * test_count & rtf2 to file_ref
       access_times from current_list for list_count against 1 (* a first run can be less reliable - so don't record initial results *)
       set base0 to missing value
       repeat with curr_test from 1 to test_count
           write return & tab & (access_times from current_list for list_count against curr_test) to file_ref
           set list_count to list_count * 2
           set current_list to current_list & current_list
       end repeat
       set eof file_ref to (get eof file_ref) - 1
       write "}" to file_ref
   end try
   close access file_ref
end compare_access_times

launch application "TextEdit"
set result_file to ((path to desktop as Unicode text) & "List Access Times.rtf") as file specification

compare_access_times at result_file from 250 for 6
(* 'from' [initial number of listed items] 'for' [number of rows] (in which list length is doubled each time) *)

tell application "TextEdit"
   activate
   open result_file
end tell

Edit notes (see discussion below):
• view height calculation modified to take account of TextEdit's behaviour in version 1.2 (Mac OS X Jaguar)
as integer expressions replaced by div or round - also to allow testing in Jaguar

Last edited by kai (2006-06-11 12:38:41 pm)


kai

Filed under: Finder, TextEdit

Offline

 

#10 2006-05-31 08:08:04 pm

Bruce Phillips
Administrator
Registered: 2004-07-15
Posts: 2649

Re: Embedded Script Objects for Speed in Processing

kai wrote:

No, that wouldn't be very efficient, Bruce. … (BTW, that code sure looks familiar... wink)

Just wanted to make sure. wink

Offline

 

#11 2006-06-01 07:39:20 am

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-19
Posts: 3495

Re: Embedded Script Objects for Speed in Processing

kai wrote:

The following demo script ….

That's a lovely bit of RTF work, kai.  cool


NG

Offline

 

#12 2006-06-09 09:50:35 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Embedded Script Objects for Speed in Processing

Nigel Garvey wrote:

That's a lovely bit of RTF work, kai.  cool

Ooh - thanks, Nigel. (Missed that most generous compliment earlier.) big_smile

I'm sure the little feature that changes the window height didn't escape your attention - but, just in case anyone missed it, I'll mention it anyway. The following statement in the script should produce a list in TextEdit with six rows of data - which the window height should accommodate:

Code:

compare_access_times at result_file from 250 for 6

If the 'for' parameter is then amended to, say, 4 - the window height should be automatically adjusted to fit the resulting four rows of data:

Code:

compare_access_times at result_file from 250 for 4

(The 'at' parameter may also be adjusted. Just don't set the value of either parameter too high - unless you have a very fast machine and/or a great deal of patience! wink)


kai

Offline

 

#13 2006-06-10 11:04:17 am

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4631

Re: Embedded Script Objects for Speed in Processing

kai wrote:

I'm sure the little feature that changes the window height didn't escape your attention - but, just in case anyone missed it, I'll mention it anyway. The following statement in the script should produce a list in TextEdit with six rows of data - which the window height should accommodate:

Code:

compare_access_times at result_file from 250 for 6

If the 'for' parameter is then amended to, say, 4 - the window height should be automatically adjusted to fit the resulting four rows of data:

Code:

compare_access_times at result_file from 250 for 4

(The 'at' parameter may also be adjusted. Just don't set the value of either parameter too high - unless you have a very fast machine and/or a great deal of patience! wink)

I not only missed it, but I'm still missing it, Kai. Too fiendishly clever for this plodder.


iMac & MBP both running OS X 10.9.1

Offline

 

#14 2006-06-11 04:50:57 am

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-19
Posts: 3495

Re: Embedded Script Objects for Speed in Processing

kai wrote:

The following statement in the script should produce a list in TextEdit with six rows of data - which the window height should accommodate:

Hi, kai.

As a matter of historical interest, TextEdit 1.2 (which came with Jaguar) doesn't hide its ruler and tool bar when displaying the document produced by your script, so the window isn't quite big enough to display all the results without scrolling. The 'viewh' setting appears to apply to the whole window rather than just the viewing area. Changing the base figure in 'compare_access_times' from 1380 to 2500 produces a satisfactory result on my Jaguar machine. (Any Jaguar users thinking of trying this will also need to change every instance of '(expression as integer)' in the last line of 'access_times' to '(round (expression))'.)


NG

Offline

 

#15 2006-06-11 12:41:17 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Embedded Script Objects for Speed in Processing

Nigel Garvey wrote:

... TextEdit 1.2 (which came with Jaguar) doesn't hide its ruler and tool bar when displaying the document...

Thanks for the heads-up on that, Nigel.

I'm not sure where that leaves us in terms of TextEdit version 1.3 (presumably released with Panther) but, for the moment, I'll assume that (like version 1.4 in Tiger) it opens read-only documents without showing the ruler/tool bar. (If any obliging Panther user knows different, perhaps they could chip in.)

I've modified the above script to take account of this information. (No doubt you'll let us know if the adjustment doesn't work in Jaguar. wink)

To complement the change (and thus to cover your other point), the real-to-integer coercions (introduced in AppleScript 1.9.2/Mac OS X 10.3) have been replaced by div or round functions as appropriate. I'm sure Jaguar users will appreciate your continued efforts to represent their interests. smile


kai

Offline

 

#16 2006-06-11 12:46:23 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Embedded Script Objects for Speed in Processing

Adam Bell wrote:

I not only missed it, but I'm still missing it, Kai.

You mean the mechanics of it, Adam?

As you probably know already, an RTF header holds various formatting data, including stuff like font and color tables. Apple has also extended Microsoft's standard RTF specification with custom commands - to support formatting constructs available in the Cocoa text system. These include document attribute extensions to specify (amongst other things) the size of a document's display area (measured in twips).

So I basically split the RTF header into two parts (represented by the properties rtf1 and rtf2) - just at the point where the value for the document's view height would normally appear. (Note that the property rtf1 ends with the RTF extension label \viewh.)

As mentioned earlier, the number of rows in the document is set near the beginning of the (implicit) run handler - by amending the compare_access_times handler's for parameter. This value, when introduced to the subroutine, is represented by the variable test_count.

While the width of the final RTF document is fixed, I calculated that the view height would need to be 69 points (1380 twips) - to accommodate the table headers - with a further 18 points (360 twips) for each row. So the document's overall view height can be derived by multiplying the row height (360) by the number of rows (test_count) - and adding the product to the initial height (1380).

This calculation is performed in the second line of the subroutine's try statement:

Code:

write rtf1 & 1380 + 360 * test_count & rtf2 to file_ref

As you'll see, the view height calculation is placed between the properties rtf1 and rtf2. So the expression evaluates the math calculation, coerces it to text (implicitly, through concatenation with the value of rtf1), concatenates it with rtf2 - and then writes the result to file.

That describes the statement in the original script - which has just been modified. So I suppose I should add this:

To take account of Nigel's info about TextEdit version 1.2, the calculation referred to above has now been changed to:

Code:

write rtf1 & 2500 - 1.2 mod (application "TextEdit"'s version) div 1.2 * 1120 + 360 * test_count & rtf2 to file_ref

This replaces the initial height from a fixed value of 1380 to a calculated one:

Code:

2500 - 1.2 mod (application "TextEdit"'s version) div 1.2 * 1120

The expressions should evaluate like this:

expression                               version ≤ 1.2      version > 1.2
                                                                                               
application "TextEdit"'s version                    "1.2"                   "1.4"
1.2 mod result                                             0.0                       1.2
result div 1.2                                                  0                         1
result * 1120                                                  0                    1120
2500 - result                                             2500                    1380

So to sum up briefly, the document's view height is calculated at runtime and then inserted, at the appropriate position, in the RTF header. Programmatically (considering the mix of AppleScript and RTF code), perhaps somewhat abstruse - though conceptually, quite simple. smile


kai

Offline

 

#17 2006-06-11 01:35:04 pm

Adam Bell
Administrator
From: Nova Scotia, Canada
Registered: 2005-10-04
Posts: 4631

Re: Embedded Script Objects for Speed in Processing

As you point out, "conceptually quite simple"; but mighty sophisticated and very elegant just the same. Thanks for the explanation, Kai. It must have been a major project to discover and assemble all of that.


iMac & MBP both running OS X 10.9.1

Offline

 

#18 2006-06-11 04:26:41 pm

Nigel Garvey
Moderator
From: Warwickshire, England
Registered: 2002-11-19
Posts: 3495

Re: Embedded Script Objects for Speed in Processing

kai wrote:

As you probably know already….

That's a lovely bit of BBCode work, kai.  cool


NG

Offline

 

#19 2006-06-11 05:37:02 pm

kai
Member
From: Brighton, UK
Registered: 2005-05-28
Posts: 912

Re: Embedded Script Objects for Speed in Processing

lollollol Why do I feel this overpowering sense of Déjà Vu? smile


kai

Offline

 

Board footer

Powered by FluxBB

[ Generated in 0.046 seconds, 10 queries executed ]

RSS (new topics) RSS (active topics)