search a very long list for an item

Hello.

I just want to add to this thread, that using binary search, for searching for values in list of lists, is useful too. Well, it is not that easy to ( or you can’t really) find the index of the list that is contained by the list.
So, in such cases where you want to search for items that are contained in a list of lists, you are better off constructing a list with the index values, and use binarysearch on that one. :slight_smile:

I ran the Nigel’s script on a list of 4326 datestrings.
It took less than one second to return the list of 21 values matching the searched key.

So, I immediately added the handler to my scripts library.

KOENIG Yvan (VALLAURIS, France) lundi 30 septembre 2013 16:56:07

I wonder if that’s what you really meant to say. Seems to me they’re certainly effective in other languages, although the potential gain – and hence tipping point from advantageous to necessary – is obviously greater in high-level languages like AS.

And it saves programmer time too! -In using binarysearch, spares you from a lot l of steps to iterate over while debugging your code, compared to looking up elements one after the other. :stuck_out_tongue:

Hi Nigel,

Great routine!!! Very essential and fastest.

Many thanks for your contribution

Ame

If you look closely to Nigel’s code you’ll see that it’s not a binary search and he takes fully advantage of the fast underlying linear search from AS (contains) using some of the binary search technique. A binary search only works on ordered lists while Nigel’s approach doesn’t need a an ordered list which makes his code great. Instead of using the greater than and less than operators he’s using the contains and not contains operators instead.

It was six and a half years ago. I don’t even remember writing the handler. :wink:

But I broadly agree with the statement the context in which it was written ” that is, with regard to a search of a randomly ordered list. In this case, a binary search has repeatedly to ascertain if a value exists in one of two subranges of the list and home in on it there if it does or home in on it in the other subrange if it doesn’t. At the lowest level, the ascertaining can only be done with linear searches of the tested subranges, so a hit result may as well be returned directly from one of these. It just so happens that the several range extractions, subrange re-extractions, and sometimes overlapping low-level linear searches involved in repetitions of the AppleScript ‘. is in items . thru . of .’ are statistically faster than a linear search of the list at the AppleScript level. Obviously the actual time difference in any one case depends on whereabouts in the list the value occurs.

However, when dealing with a list known to be ordered, the criterion for each subrange selection is simply whether the sought value is less than or greater than the value at the current index. This kind of binary search, I agree, is possible and potentially effective in low-level languages. And by “low-level”, I’m thinking “assembler” rather than any intermediate language between this and AppleScript.

Yes, I should have done you the courtesy of at least looking at the code :rolleyes: I saw “binary search” and assumed the classical binary search on sorted data.