I like to perform a search/compare on two list items and result the same existing list elements alone.
Please see my code below, and correct me how to get the desired result without using repeat loop.
set xx to {"a", "b", "f"}
set yy to {"a", "b", "c", "d", "e"}
set zz to every item of xx that is in yy
-- result would be a is {"a", "b"}
set xx to {"a", "b", "f"}
set yy to {"a", "b", "c", "d", "e"}
set zz to CommonElements(xx, yy)
on CommonElements(aList, bList)
set commonList to {}
try
set commonList to CommonElements(rest of aList, bList)
if (item 1 of aList) is in bList then
copy item 1 of aList to beginning of commonList
end if
end try
return commonList
end CommonElements
That’s clever mickerson. I was curious to see if it was actually faster too, so I ran a speed test of your code versus this code…
on CommonElements(aList, bList)
set commonList to {}
repeat with a in aList
if a is in bList then set end of commonList to contents of a
end repeat
return commonList
end CommonElements
It was 7 times slower. I ran 5000 tests on each and here’s the results. Note, your code was test #2 and the test times are in seconds.
“First Test Time: 0.358
Second Test Time: 2.459
Times Faster: 6.9”
Apparently, it’s faster to avoid an error than to trap it.
on CommonElementsBeta(aList, bList)
set commonList to {}
try
if 0 < (count of aList) then
set commonList to CommonElementsBeta(rest of aList, bList)
if (item 1 of aList) is in bList then
copy item 1 of aList to beginning of commonList
end if
end if
end try
return commonList
end CommonElementsBeta
On 80,000 repetitions, this took 6 seconds. The method in Post #2 took 15 and the repeat loop took 4.
Hey, mikerickson (Mike?). I still have questions about this approach/your construction, but how would this variant fare in your speed test? I suspect that purging the code of “rest” may increase the speed.
set xx to {"a", "b", "f"}
set yy to {"a", "b", "c", "d", "e"}
CE3(xx, yy, 0)
on CE3(aList, bList, counter)
set commonList to {}
set counter to counter + 1
if counter < (count aList) then try
set commonList to CE3(aList, bList, counter)
tell aList's item counter to if it is in bList then set beginning of commonList to it
end try
commonList
end CE3
**I edited this to include a conditional consideration on my counter. The reason that I thought this might be faster is that I seem to remember reading that “rest” retains its iterations in memory. Can anyone verify this one way or the other?
I’m at work (Windows) and can’t test your version, but I don’t see how it will terminate.
In my version, the method does not call CommonElements when aList = {}
When it does call CommonElements, it passes an list that is shorter than the aList that was passed to it.
Every time CommonElements is called, the aList argument gets shorter until it is empty and the recursion stops.
I see no shortening feature of the method you posted, hence the stack error.
This is a (perhaps) useful excersize in recursive functions, but to return the list of elements common to two lists, the a repeat loop is faster. Best Practices rules like “don’t loop” are useful guides, but when there are always exceptions to those rules.
I ran my own tests on three of these arrangements, and I’ve come up with significantly different results. In my testing, the recursive method performed 19 times faster. I’d be interested to learn the testing environment and parameters of the other testers. The details of my test were as follows: I created two text files containing numeric lists of 15,000 integers; list one was 1-15,000, and list two was 1 to 29,999, in multiples of 2. Before putting the respective code snippets through their paces, each text file was read into memory and assigned to the variables {xx, yy}.
I just put the three methods into a script and ran this.
set testSize to 90000
set startTime to current date
repeat with i from 1 to testSize
set zz to CommonElementsAlpha(xx, yy)
end repeat
set alphatime to (current date) - startTime
set startTime to current date
repeat with i from 1 to testSize
set zz to CommonElementsBeta(xx, yy)
end repeat
set BetaTime to (current date) - startTime
set startTime to current date
repeat with i from 1 to testSize
set zz to CELoop(xx, yy)
end repeat
set LoopTime to (current date) - startTime
return {alphatime, BetaTime, LoopTime} -- returns {18, 6, 4}
This technique isn’t to find out how fast (in an absolute sense) a particular method is, but it will compare different routines to find which is faster.
Your technique found that looping is 7 times faster than recusion on CommaonElementsAlpha.
My routine found a factor of 18/4.
For a quick and dirty “which is faster”, I’d say our results match.
If I was looking for a more absolute number, I’d vary the length and degree that the two lists match.
I just ran Mike’s supplied test with the 90000 iterations, using the small list of original values supplied by the OP. xx = {“a”, “b”, “f”} yy = {“a”, “b”, “c”, “d”, “e”}
I wasn’t sure if CELoop or CommonElementsAlpha referred to my code, so I relabeled the routines to the original names. This test took notably more time, as it was performed on my home machine, a G4 Mac Mini with OS 10.4.11; RAM was also limited to 512MB, where I have 4GB on my work setup. Comparing this against the other’s results, I’m not sure it proves anything definitively about the performance of the routines other than “it depends”.
Are you asking how fast it will go through dual lists of 15K over 90,000 iterations?
I don’t know. If I were doing that much processing I would use C or Objective-C.
Hello again, Craig. If you’d be willing to do the test, one iteration using that size list would work. I was just curious to see how the performance scales. For my initial test, I generated a list of integers 1-15,000 and another that was 1-30,000 by 2, for parity. Thanks for your time.