I looked in the FAQ but couldn't find anything. I need some code that will take:
abccdefgghijk
and return
{"c", "g"}
Anybody know? Thanks.
Well, until Nigel comes up with a better algorithm, here is what I use:
set theList to {{1, 2, 3}, {a:1}, «data utxt0032», 0, 4, 5, 6, 9, 8, 7, 4, {1, 2, 3}, {a:1}, «data utxt0032»}
findDuplicates(theList) --> {4, {1, 2, 3}, {a:1}, «data utxt0032»}
to findDuplicates(x)
	script a
		property theList : x
		property originals : {}
		property duplicates : {}
	end script
	repeat with i from 1 to count a's theList
		set theItem to a's theList's item i
		if {theItem} is not in a's originals then
			set a's originals's end to theItem
		else
			if {theItem} is not in a's duplicates then set a's duplicates's end to theItem
		end if
	end repeat
	a's duplicates
end findDuplicates
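To tie this back to the original question: the handler takes a list, so the text has to be split into characters first. Here's a minimal sketch, assuming that's what was wanted. The wrapper name duplicateCharacters is my own invention for illustration, and findDuplicates is repeated only so the snippet runs on its own.

```applescript
-- Hypothetical wrapper (not from the posts above): split the text into
-- single characters and hand the resulting list to findDuplicates.
on duplicateCharacters(theText)
	return findDuplicates(characters of theText)
end duplicateCharacters

-- jj's handler, repeated so that this snippet is self-contained.
to findDuplicates(x)
	script a
		property theList : x
		property originals : {}
		property duplicates : {}
	end script
	repeat with i from 1 to count a's theList
		set theItem to a's theList's item i
		if {theItem} is not in a's originals then
			-- First time this item has been seen.
			set a's originals's end to theItem
		else
			-- Seen before: record it once as a duplicate.
			if {theItem} is not in a's duplicates then set a's duplicates's end to theItem
		end if
	end repeat
	a's duplicates
end findDuplicates

duplicateCharacters("abccdefgghijk") --> {"c", "g"}
```

Note that text comparisons ignore case by default, so "C" and "c" would count as the same character unless the call is wrapped in a considering case block.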
[quote="jj"]Well, until Nigel comes up with a better algorithm, here is what I use:[/quote]
;-)
Thanks for the flattery (I think!), jj. Your algorithm's probably the best that's possible and your implementation of it is eminently sensible.
If you're prepared to get a little silly with it - as a Sunday afternoon diversion - it's possible to speed it up very slightly with a couple of pieces of arcane knowledge....
1. It takes slightly longer to test that something's [i]not[/i] so than to test that it [i]is[/i] so. (Presumably, the 'is so' test is done anyway and the result is then 'notted'.) This is normally (as here) hardly significant, but if the test is repeated thousands of times, say, it's worth considering making it positive:
if {theItem} is in a's originals then
	-- Do that.
else
	-- Do this.
end if
-- Rather than:
if {theItem} is not in a's originals then
	-- Do this.
else
	-- Do that.
end if
Even when there's no alternative action to perform: :-)
if {theItem} is in a's duplicates then
else
	-- Do this.
end if
-- Rather than:
if {theItem} is not in a's duplicates then do this
2. References to lists are in fact [i]slightly slower[/i] with the 'is in' or 'contains' operators. For the optimum speed effect, you'd need to reference the lists via your script object properties when iterating through them or setting their ends, but access them via unreferenced variables for 'is in'.
set theList to {{1, 2, 3}, {a:1}, «data utxt0032», 0, 4, 5, 6, 9, 8, 7, 4, {1, 2, 3}, {a:1}, «data utxt0032»}
findDuplicates(theList)
to findDuplicates(x)
	-- Ordinary local variables.
	set origs to {}
	set dups to {}
	script a
		-- The same lists, but assigned to this script object's properties too for referencing.
		property theList : x
		property originals : origs
		property duplicates : dups
	end script
	repeat with i from 1 to count x
		set theItem to a's theList's item i
		if {theItem} is in origs then
			if {theItem} is in dups then
			else
				set a's duplicates's end to theItem
			end if
		else
			set a's originals's end to theItem
		end if
	end repeat
	dups
end findDuplicates
So, because each un-referenced local variable points to the very same list as the matching script object property, both are updated automatically; you can then use the local variable for the "is in" test ('cause it's faster) and append data through the script object's property ('cause it's a bit-bit-bit faster).
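For anyone who wants to see that sharing in isolation, here's a minimal sketch. The handler name demoSharedList and the value "x" are made up for illustration. The script object sits inside a handler, as in Nigel's version, so that its property is initialised from the local variable when the handler runs.

```applescript
-- Hypothetical demo handler: shows that the local variable and the
-- script object's property refer to one and the same list.
on demoSharedList()
	set origs to {} -- Ordinary local variable.
	script a
		property originals : origs -- Same list, now reachable via the property.
	end script
	-- Append through the script object's property...
	set a's originals's end to "x"
	-- ...and read the result back through the local variable.
	return origs
end demoSharedList

demoSharedList() --> {"x"}
```

This only holds while nothing re-assigns either name: a line such as set origs to origs & {"y"} would build a new list and break the link.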
That way, I save 76 ticks over 10000 iterations on the sample above.
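The post doesn't say how the ticks were counted, so the following is only a rough sketch of a timing loop, assuming nothing beyond plain AppleScript. The handler name timeIt and its parameters are my own. It uses current date, which only resolves whole seconds, so it's far coarser than ticks and the loop count may need raising to see any difference.

```applescript
-- Hypothetical timing harness (names are illustrative, not from the thread).
on timeIt(theList, loopCount)
	set startTime to current date
	repeat loopCount times
		findDuplicates(theList)
	end repeat
	-- Elapsed time in whole seconds.
	return (current date) - startTime
end timeIt

-- The version under test: Nigel's variant from above. Swap in jj's
-- original handler here to compare the two.
to findDuplicates(x)
	set origs to {}
	set dups to {}
	script a
		property theList : x
		property originals : origs
		property duplicates : dups
	end script
	repeat with i from 1 to count x
		set theItem to a's theList's item i
		if {theItem} is in origs then
			if {theItem} is in dups then
			else
				set a's duplicates's end to theItem
			end if
		else
			set a's originals's end to theItem
		end if
	end repeat
	dups
end findDuplicates

set theList to {{1, 2, 3}, {a:1}, «data utxt0032», 0, 4, 5, 6, 9, 8, 7, 4, {1, 2, 3}, {a:1}, «data utxt0032»}
timeIt(theList, 10000)
```

Run it once for each version of the handler and compare the two results; the numbers will depend on the machine.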
I know you've spent a bunch of years learning to optimize speed in AS tasks, along with other AS-Lords, such as Arthur Knapp or Kai Edwards (obviously, you didn't find these tips in a weekly magazine). :rolleyes:
Why don't you write a little article with some rules on the speed topic? Perhaps we could publish it in unScripted (if it's not soooooo long), keep it here as a sticky note, or create a PDF and host it here for public access… And this would help others!