I find myself in need of a handler that will find the most common item in a list. The list happens to be pages in an InDesign document, but I would think that the best approach would be to write something generic that will handle any kind of list.
Here’s the only way my brain can break down the steps:
Get the unique items of the list
Count the instances of each item in the list. Store this information in record form {item:x,frequency:1}
Sort the records according to frequency. Return the item of the last (most frequent) record.
That seems overly complicated, so before I waded in I thought I check here first to see if someone already has a nifty, three-line solution.
This will handle a list of numbers or strings. If you need support for a list of lists or records, it could be adaptated. It will return the last item more frequent in the list. For example, if it finds 4 “x” and 4 “y”, it will return “y”. It could be also adaptated to return the first more-frequent, or a list of more-frequents (modify “greaterInteger()”).
set theList to {}
repeat 50 times
set theList to theList & items of "This is a simple test and I'll guess which is the most popular character in this sentence."
end repeat
--> theList contains now 4500 items
mostPopularItemInList(theList)
to mostPopularItemInList(x)
script accellerator
property theList : x
property originalItems : {}
property originalItemsCount : {}
on indexof(theItem, theList) -- credits Emmanuel Levy
set text item delimiters to return
set theList to return & theList & return
set text item delimiters to {""}
try
-1 + (count (paragraphs of (text 1 thru (offset of (return & theItem & return) in theList) of theList)))
on error
0
end try
end indexof
to greaterInteger()
set largerItem to 0
set largerItemIndex to 0
repeat with i from 1 to count originalItemsCount
if originalItemsCount's item i > largerItem then
set largerItem to originalItemsCount's item i
set largerItemIndex to i
end if
end repeat
originalItems's item largerItemIndex
end greaterInteger
end script
repeat with i from 1 to count accellerator's theList
considering case
if accellerator's theList's item i is not in accellerator's originalItems then
set accellerator's originalItems's end to accellerator's theList's item i
set accellerator's originalItemsCount's end to 1
else
set ind to accellerator's indexof(accellerator's theList's item i, accellerator's originalItems)
set accellerator's originalItemsCount's item ind to ((accellerator's originalItemsCount's item ind) + 1)
end if
end considering
end repeat
return accellerator's greaterInteger()
end mostPopularItemInList
on mostCommonItemIn(theList)
if theList is {} then return -- make your own arrangements for empty lists
script o
property mainList : theList
property remainderList : {}
end script
set highestScore to 0
repeat until (o's mainList's length < highestScore)
set currentItem to item 1 of o's mainList
set currentScore to 1
repeat with i from 2 to (count o's mainList)
if (item i of o's mainList is currentItem) then
set currentScore to currentScore + 1
else
set end of o's remainderList to item i of o's mainList
end if
end repeat
if (currentScore < highestScore) then
else
set highestScore to currentScore
set mostCommonItem to currentItem
end if
set o's mainList to o's remainderList
set o's remainderList to {}
end repeat
return mostCommonItem
end mostCommonItemIn
set theList to {1, 2, {a:"Hello", b:2}, {"I am a list"}, "Hello", {b:2, a:"Hello"}, Wednesday, true}
mostCommonItemIn(theList)
--> {a:"Hello", b:2}
That assumed, of course, that there would would always be more of one item in the list than of any of the others. Here’s an improved version that allows for the fact that there may be two or more “most common” items:
on mostCommonItemsIn(theList)
if theList is {} then return {}
script o
property mainList : theList
property remainderList : {}
end script
set highestScore to 0
set listLen to o's mainList's length
repeat until (listLen < highestScore)
set currentItem to item 1 of o's mainList
set currentScore to 1
repeat with i from 2 to listLen
set x to item i of o's mainList
if (x is currentItem) then
set currentScore to currentScore + 1
else
set end of o's remainderList to x
end if
end repeat
if (currentScore > highestScore) then
set highestScore to currentScore
set mostCommonItems to {currentItem}
else if (currentScore = highestScore) then
set end of mostCommonItems to currentItem
end if
set o's mainList to o's remainderList
set o's remainderList to {}
set listLen to listLen - currentScore
end repeat
return mostCommonItems
end mostCommonItemsIn
set theList to {false, true, 1, Wednesday, 2, {a:"Hello", b:2}, {"I am a list"}, "Hello", {b:2, a:"Hello"}, Wednesday, true, false}
mostCommonItemsIn(theList)
--> {false, true, Wednesday, {a:"Hello", b:2}}
Frequency count is a standard CS101 problem (i.e. Google is your friend). e.g. A typical solution using Python:
alist = ['a', 'b', 'c', 'd', 'b', 'c', 'a', 'c', 'c', 'd']
# count number of times an item appears (this implementation assumes items are immutable, e.g. strings)
freqs = {}
for item in alist:
# item = item.lower() # uncomment this line to do case-insensitive word counts
freqs[item] = freqs.get(item, 0) + 1
# find item(s) with highest frequency
high = max(freqs.values())
found = [item for item, freq in freqs.items() if freq == high]
print high, found # -> 4 ['c']
You can do it in AppleScript, but it’ll be deadly slow compared to languages like Perl and Python so probably best avoided if you can help it.
thanks, all. you’ve saved me much time and aggravation.
i’ve read about the trick of using a reference to a script’s list, as a way of speeding things up, but it never stuck in my brain before. maybe this time.
Here’s another one, using do shell script, which I’ve modified from a file that was posted on a FileMaker list by Bruce Robertson:
(the 1 can be changed to however many you want to see, 5, etc.)
set theData to {“red”, “blue”, “purple”, “green”, “red”, “red”, “red”, “yellow”}
my getTop(theData)
on getTop(theData)
set AppleScript’s text item delimiters to ASCII character 10
try
set shellStuff to " | sort | uniq -c | sort -rn | head -1"
set theTop to do shell script "echo " & quoted form of (theData as text) & shellStuff
–set theTop to do shell script "echo " & (theData as text) & shellStuff
–just gets the top thing, not the count
on error errMsg number errNum
set AppleScript’s text item delimiters to “”
end try
set AppleScript’s text item delimiters to “”
return theTop
end getTop