Sort handlers for use with small lists

I thought I would do a quick review of sort handlers which are intended for use with relatively small lists. The main criteria in this review were speed and code brevity, and my test list contained 100 items:

set theList to {"mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa"}

The following ASObjC script performed the sort in 1 millisecond (with the Foundation framework in memory) and only required two lines of code. I include this script primarily to set a benchmark, because my main interest is sort routines that employ basic AppleScript.

use framework "Foundation"
use scripting additions

set sortedList to getSortedList(theList)

on getSortedList(theList)
	set theArray to current application's NSArray's arrayWithArray:theList
	return (theArray's sortedArrayUsingSelector:"localizedStandardCompare:") as list
end getSortedList

Second place went to the shell, which took 5 milliseconds and required 5 lines of code. This approach is also very fast when used with extremely large lists and is competitive with ASObjC in this regard. However, it returns an unexpected result in rare cases when a case-insensitive sort is done (the -f option).

set sortedList to getSortedList(theList)

on getSortedList(theList)
	set {TID, text item delimiters} to {text item delimiters, linefeed}
	set theText to theList as text
	set sortedList to paragraphs of (do shell script "echo " & quoted form of theText & " | sort -f")
	set text item delimiters to TID
	return sortedList
end getSortedList

Third place went to a bubble sort enhanced with a script object. It took 9 milliseconds and required 9 lines of code:

set sortedList to getSortedList(theList)

on getSortedList(theList)
	script o
		property a : theList
	end script
	repeat with i from (count o's a) to 2 by -1
		repeat with j from 1 to i - 1
			if item j of o's a > item (j + 1) of o's a then set {item j of o's a, item (j + 1) of o's a} to {item (j + 1) of o's a, item j of o's a}
		end repeat
	end repeat
	return o's a
end getSortedList

Next came QSort which only took 5 milliseconds but contained 17 lines of code. I’m not certain of the author of this script (Nigel perhaps):

set sortedList to Qsort(theList, 1, (count theList))

on Qsort(theList, leftEnd, rightEnd)
	set {i, j} to {leftEnd, rightEnd}
	set v to item ((leftEnd + rightEnd) div 2) of theList
	repeat while (j > i)
		repeat while (item i of theList < v)
			set i to i + 1
		end repeat
		repeat while (item j of theList > v)
			set j to j - 1
		end repeat
		if (not i > j) then
			set {item i of theList, item j of theList} to {item j of theList, item i of theList}
			set {i, j} to {i + 1, j - 1}
		end if
	end repeat
	if (leftEnd < j) then Qsort(theList, leftEnd, j)
	if (rightEnd > i) then Qsort(theList, i, rightEnd)
	return theList
end Qsort

Last was bubble sort, which took 31 milliseconds and contained 6 lines of code. This approach should not be rejected out of hand, because it only required one-half of one millisecond to sort a list of 20 items.

set sortedList to getSortedList(theList)

on getSortedList(a)
	repeat with i from (count a) to 2 by -1
		repeat with j from 1 to i - 1
			if item j of a > item (j + 1) of a then set {item j of a, item (j + 1) of a} to {item (j + 1) of a, item j of a}
		end repeat
	end repeat
	return a
end getSortedList

All of the timing tests were done on my 2023 Mac mini with Script Geek.

1 Like

Here is a version of comb-sort.


set theList to {"mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa", "mm", "cc", "xx", "oo", "aa"}

combsort for theList

on combsort for alist given shrinkfactor:sf : 1.7
	local i, j, cc, ns, js, gap, pgap, sw -- ns means No Swap
	script m
		property nList : alist
	end script
	set cc to count m's nList
	set gap to cc div sf
	repeat until gap = 0 --with j from (lc - 1) to 1 by -1
		repeat with i from 1 to gap
			set js to cc - gap
			repeat until js < 1 -- do each gap till nor more swaps
				set ns to gap
				repeat with j from i to js by gap
					if (item j of m's nList) > (item (j + gap) of m's nList) then
						set sw to (item j of m's nList)
						set (item j of m's nList) to (item (j + gap) of m's nList)
						set (item (j + gap) of m's nList) to sw
						set ns to j
					end if
				end repeat
				set js to ns - gap
			end repeat
		end repeat
		set pgap to gap
		set gap to gap div sf
		if gap = 0 then
			if pgap ≠ 1 then set gap to 1
		end if
	end repeat
end combsort

and here is my version of Quick-sort that is non-recursive.

on quickSort for blist -- Non-Recursive FASTEST
	local px, stack, lo, hi, L, H, sw -- px means 'Pivot Index'
	script mL
		property alist : blist
		property stack : {{1, count blist}}
	end script
	repeat until (count mL's stack) = 0
		set lo to item 1 of item 1 of mL's stack
		set hi to item 2 of item 1 of mL's stack
		set px to item ((hi + lo) div 2) of mL's alist -- start partitionHoare
		set L to lo
		set H to hi
		repeat
			repeat while item L of mL's alist < px
				set L to L + 1
			end repeat
			repeat while item H of mL's alist > px
				set H to H - 1
			end repeat
			if L ≥ H then exit repeat
			set sw to item L of mL's alist
			set item L of mL's alist to item H of mL's alist
			set item H of mL's alist to sw
			set L to L + 1
			set H to H - 1
		end repeat
		set px to H -- end of partitionHoare
		set mL's stack to rest of mL's stack --items 2 thru -1 of stack
		if px + 1 < hi then set beginning of mL's stack to {px + 1, hi}
		if lo < px then set beginning of mL's stack to {lo, px}
	end repeat
end quickSort

The reason i use non-recursive is because AppleScript recursion can only go around 520 recursion levels deep. If the list is big enough you will hit that limit

Just goes to show you that looking at ALL posts, even though the post subject may not grab your interest, can be beneficial. What caught my eye was the "All of the timing tests were done on my 2023 Mac mini with Script Geek. Google search, and up comes Shane Stanley’s web site. Quick look through a few pages, and bookmarked. Great site for information for those of us in the “learning” stage. Thanks @peavine

1 Like

I occasionally need to sort a list containing no more than 20 items, and I reran my tests with a list of this size. I also changed the individual items of the list to longer words to see if this would make a difference:

set theList to {"this is line one", "another list", "peavine", "Nigel", "Shane Stanley", "scriptable app", "Safari", "Google Chrome", "Ventura", "monterey", "Sonoma", "Mac mini", "Prescott Arizona", "script", "app", "Late Night Software", "ok", "works", "an item of list", "last item of list"}

The results were:

ASObjC - less than 1 millisecond
Shell sort command - 5 milliseconds
Enhanced bubble sort - less than 1 millisecond
QSort - less than 1 millisecond
Bubble sort - less than 1 millisecond

Since the list to sort is so small, use Bubble-sort. It has less memory overhead. The others, tho faster, don’t have enough items to sort to overcome the overhead

How are you running your scripts?

I can get each script to run but neither generates any output.

I’m trying this, which matches the combsort command, FWIW:

quickSort for theList

Thanks

The Wikipedia article on sorting algorithms identifies three sorting algorithms that have “tiny code”, and, in addition to bubble sort, they are exchange sort and gnome sort. I wondered if anyone has written an AppleScript of either of these?

What output. The list is sorted in place, no output needed

I see. That was too subtle for me :slight_smile:

I’m gonna have to look at the script more closely. Thanks.

This is my first attempt at a gnome sort. It’s a few milliseconds slower than a bubble sort and contains 10 lines of code, so it appears to be out of the running.

set sortedList to getSortedList(theList)

on getSortedList(a)
	set i to 1
	repeat while i ≤ (count a)
		if i = 1 or (item i of a ≥ item (i - 1) of a) then
			set i to i + 1
		else
			set {item i of a, item (i - 1) of a} to {item (i - 1) of a, item i of a}
			set i to (i - 1)
		end if
	end repeat
	return a
end getSortedList

Thanks Fredrik71 for the suggestions.

Unfortunately, neither of them worked correctly. The gnome sort appears fixable by changing line 1 below to line 2 below. Exchange sort doesn’t work at all, but it’s a starting point. Thanks again.

repeat while pos < n
repeat while pos ≤ n

My apologies–exchange sort works as written and gnome sort works with the one correction. The timing results for both are equal to, or a few milliseconds greater than bubble sort, so bubble sort seems a better choice.

Fredrik71. FWIW, I retested your gnomeSort handler with theList from post 1 and the result doesn’t appear to be correct (the last item in theList is not sorted):

{"aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "aa", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "cc", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "mm", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "oo", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "xx", "aa"}

Perhaps an interesting solution might involve a smart sort handler that selects from multiple sort methodologies based on list length, data size, and/or sample list content. Rarely would it make a difference and it might take years to recover the dev time a few milliseconds at a time, but optimization!