Filter Array

I’m trying to figure out the best way to filter a list of lists using AppleScriptObjC.

I have an array of 2600+ lists and want to filter out the ones with “I” in item 31 of each.

I tried using this Nigel Garvey snippet, and it is returning incorrect results. I’m sure it’s meant for a string and not a list. Repeat loops are just too slow in standard AppleScript.



on itemsWithout:theVal inList:theListofLists
	set theArray to current application's NSArray's arrayWithArray:theListofLists
	set thePred to current application's NSPredicate's predicateWithFormat:"NOT (self CONTAINS %@)" argumentArray:{theVal}
	return (theArray's filteredArrayUsingPredicate:thePred) as list
end itemsWithout:inList:


On a separate note, does anyone know of a good location that documents the options for predicateWithFormat?

Thanks

It’s meant to filter a list of lists for lists not containing a certain value:

set lst to {{1, 2, 3}, {4, 5, 6}, {9, 7, 5}}
my itemsWithout:4 inList:lst
--> {{1, 2, 3}, {9, 7, 5}}

But it does work for filtering a list of texts for those not containing a certain substring.
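
As a small illustration of the text case, here is a sketch that applies the same predicate to a plain list of strings. The sample words and the variable names theTexts and theFiltered are only assumptions for the example:

```applescript
use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

-- Sample data (assumed for illustration): a flat list of texts.
set theTexts to {"aardvark", "plunge", "parallel", "banana"}
set theArray to current application's NSArray's arrayWithArray:theTexts
-- Here each "self" is a string, so CONTAINS is a substring test.
set thePred to current application's NSPredicate's predicateWithFormat:"NOT (self CONTAINS %@)" argumentArray:{"l"}
set theFiltered to (theArray's filteredArrayUsingPredicate:thePred) as list
--> {"aardvark", "banana"}
```

When "self" is a list instead, CONTAINS tests for membership of the whole value, which is why the snippet behaves differently on a list of lists.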

I can’t think how you’d filter for what you want without using a repeat. It’s not actually as slow with that many lists as you might think:

on listsWithout:theSubstring inItem:idx inList:listOfLists
	script o
		property mainList : listOfLists
		property filteredList : {}
	end script
	
	repeat with i from 1 to (count listOfLists)
		set thisItem to item i of o's mainList
		if (item idx of thisItem does not contain theSubstring) then set end of o's filteredList to thisItem
	end repeat
	return o's filteredList
end listsWithout:inItem:inList:

-- Set up a list containing 2700 35-item lists, the 31st item in each
-- being a string which may or may not contain "l".
set template to {}
repeat with i from 1 to 35
	set end of template to missing value
end repeat

set aList to {}
repeat 2700 times
	set tmplt to my template's items
	set item 31 of my tmplt to some item of {"aardvark", "plunge", "parallel", "banana"}
	set end of my aList to tmplt
end repeat

-- Announce the beginning and end of the filtering operation.
say "Filtering…"
set filteredList to my listsWithout:"l" inItem:31 inList:aList
say "Done. The filtered list contains " & (count filteredList) & " items out of the original 2700."

https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Predicates/AdditionalChapters/Introduction.html#//apple_ref/doc/uid/TP40001789

Thank you!
I appreciate your response. I thought someone would be pushing me toward a predicateWithFormat inside a predicate or something. Predicates seem like voodoo witchcraft, but not necessarily in a bad way.

You’re, as always, “The Man”, or you may prefer “The Dude”.

This should do what you want:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

set theListOfLists to {{"a", "b", "c"}, {"g", "l", "k"}, {"a", "b", "d"}, {"a", "l", "c"}}
set theArray to current application's NSArray's arrayWithArray:theListOfLists
set thePred to current application's NSPredicate's predicateWithFormat:"NOT (self[1] CONTAINS %@)" argumentArray:{"l"}
return (theArray's filteredArrayUsingPredicate:thePred) as list

Just remember the index is zero-based, so you’d use self[30].
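
To keep the off-by-one adjustment in one place, the snippet could be wrapped in a handler that takes an ordinary 1-based AppleScript index and subtracts 1 when building the format string. This is only a sketch: the handler name listsWithout:inItem:inList: and its parameter names are assumptions for the example, not something from the snippet above.

```applescript
use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

-- Hypothetical handler: theIndex is a 1-based AppleScript index.
on listsWithout:theVal inItem:theIndex inList:theListOfLists
	set theArray to current application's NSArray's arrayWithArray:theListOfLists
	-- Predicate subscripts are zero-based, so item 31 becomes self[30].
	set theFormat to "NOT (self[" & (theIndex - 1) & "] CONTAINS %@)"
	set thePred to current application's NSPredicate's predicateWithFormat:theFormat argumentArray:{theVal}
	return (theArray's filteredArrayUsingPredicate:thePred) as list
end listsWithout:inItem:inList:

set theListOfLists to {{"a", "b", "c"}, {"g", "l", "k"}, {"a", "b", "d"}, {"a", "l", "c"}}
set theResult to my listsWithout:"l" inItem:2 inList:theListOfLists
--> {{"a", "b", "c"}, {"a", "b", "d"}}
```

For the original question the call would be along the lines of my listsWithout:"l" inItem:31 inList:yourList.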

Nigel,
You were right, it wasn’t that much of a slowdown. It’s a lot faster in your subroutine than in the main script as I had it: 8.35 seconds. Mine was like 30 seconds.

Just curious: when testing your code here and the code you previously posted in your sorting library, both give me table overflow errors in debugging mode. I think it has something to do with a property being used in the subroutine. Any thoughts?

Shane: 7.36 seconds. Thanks for clarifying… always close but never obvious on the index. I tried at some point but never would have tried starting with 0.

Debugging mode is a peculiarity of Script Debugger. I take no responsibility for it. 😉

Shane,

What if item 2 (index 1) of each list in the LOL to be filtered is itself a list?
(1) I can think of using a repeat loop.
(2) For a combination of “is” and “is not” conditions, e.g. contains “a” AND does not contain “C”, I can think of using two repeat loops in sequence: the first uses ‘predicateWithFormat:“(self[1] CONTAINS %@)”’ and the second uses ‘predicateWithFormat:“NOT (self[1] CONTAINS %@)”’.

I can generate the expected outcome for (1) and (2) using the repeat loops below, but is there a better way to do it? The LOL I am dealing with contains no more than 1000 lists, and the inner list to be filtered holds at most 20 strings, each of less than 20 characters. The filters are exact string matches only.

Thank you very much.



-- for case (1), does not contain "a" OR "c"
set theListOfLists to {{"a", {"b", "a", "C"}, "c"}, {"g", {"l", "b", "c"}, "k"}, {"a", {"b", "a", "l"}, "d"}, {"a", {"l", "b", "b"}, "c"}}

set theFilter to {"a", "c"}  -- does not contain either "a" or "c"

repeat with each in theFilter
	set theArray to (current application's NSArray's arrayWithArray:theListOfLists)
	set thePred to (current application's NSPredicate's predicateWithFormat:"NOT (self[1] CONTAINS %@)" argumentArray:{each})
	set theListOfLists to (theArray's filteredArrayUsingPredicate:thePred) as list
end repeat
-- returns {{"a", {"l", "b", "b"}, "c"}}


-- for case (2) contains "a" AND does not contain "C"
set theListOfLists to {{"a", {"b", "a", "C"}, "c"}, {"g", {"l", "b", "c"}, "k"}, {"a", {"b", "a", "l"}, "d"}, {"a", {"l", "b", "b"}, "c"}}

set theIsFilter to {"a"}

repeat with each in theIsFilter
	set theArray to (current application's NSArray's arrayWithArray:theListOfLists)
	set thePred to (current application's NSPredicate's predicateWithFormat:"(self[1] CONTAINS %@)" argumentArray:{each})
	set theListOfLists to (theArray's filteredArrayUsingPredicate:thePred) as list
end repeat


set theNotFilter to {"C"}

repeat with each in theNotFilter
	set theArray to (current application's NSArray's arrayWithArray:theListOfLists)
	set thePred to (current application's NSPredicate's predicateWithFormat:"NOT (self[1] CONTAINS %@)" argumentArray:{each})
	set theListOfLists to (theArray's filteredArrayUsingPredicate:thePred) as list
end repeat
-- returns {{"a", {"b", "a", "l"}, "d"}}


Just use a compound predicate:

-- for case (1), does not contain "a" OR "c"
set theListOfLists to {{"a", {"b", "a", "C"}, "c"}, {"g", {"l", "b", "c"}, "k"}, {"a", {"b", "a", "l"}, "d"}, {"a", {"l", "b", "b"}, "c"}}

set theFilter to {"a", "c"} -- does not contain either "a" or "c"

set thePreds to {}
repeat with each in theFilter
	set end of thePreds to (current application's NSPredicate's predicateWithFormat:"NOT (self[1] CONTAINS %@)" argumentArray:{each})
end repeat
set theArray to (current application's NSArray's arrayWithArray:theListOfLists)
set theListOfLists to (theArray's filteredArrayUsingPredicate:(current application's NSCompoundPredicate's andPredicateWithSubpredicates:thePreds)) as list
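
The same idea should carry over to case (2). The following is a sketch rather than a tested recipe: it collects the “contains” and “does not contain” predicates into one list and ANDs them together. The variable names aValue, theCompound and theFiltered are assumptions for the example, and contents of is used so that the actual string, not a reference to the list item, goes into the argument array.

```applescript
use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

-- for case (2), contains "a" AND does not contain "C"
set theListOfLists to {{"a", {"b", "a", "C"}, "c"}, {"g", {"l", "b", "c"}, "k"}, {"a", {"b", "a", "l"}, "d"}, {"a", {"l", "b", "b"}, "c"}}

set theIsFilter to {"a"}
set theNotFilter to {"C"}

set thePreds to {}
-- One "contains" predicate per required value.
repeat with aValue in theIsFilter
	set end of thePreds to (current application's NSPredicate's predicateWithFormat:"self[1] CONTAINS %@" argumentArray:{contents of aValue})
end repeat
-- One negated predicate per excluded value.
repeat with aValue in theNotFilter
	set end of thePreds to (current application's NSPredicate's predicateWithFormat:"NOT (self[1] CONTAINS %@)" argumentArray:{contents of aValue})
end repeat

set theArray to current application's NSArray's arrayWithArray:theListOfLists
set theCompound to current application's NSCompoundPredicate's andPredicateWithSubpredicates:thePreds
set theFiltered to (theArray's filteredArrayUsingPredicate:theCompound) as list
--> {{"a", {"b", "a", "l"}, "d"}}
```

Because self[1] is a list here, CONTAINS is an exact, case-sensitive membership test, so "C" and "c" are treated as different values.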

Dear Shane

Your code is more efficient because it doesn’t have to filter the LOL once for every value in the filter, correct?

Thank you very much!

Compound predicates are generally more efficient than filtering multiple times.

Thanks for the explanation.