Find Array Within Array

I am trying to find all of the records within an array that meet certain criteria using Foundation’s predicate methods, but have been unsuccessful.
I have an array of records:


set PDFRecord to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"2", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

I want to group all of the records by the last three digits of their PDF name so that, for example, all of the records whose PDFFile property ends with 001 are grouped together, and all of the records whose PDFFile property ends with 002 are grouped together. The script should build each group in turn as it increments the last three digits by 1, that is to say to 003, 004 and so on.

I then want to analyze each group to compare the data of its records in regard to their EORPageNumber and EORTotalPageCount.

My attempt with the following approach woefully failed.


set thePredicate to current application's NSPredicate's predicateWithFormat:"PDFFile =[c] '001'"
set theFiles to (folderContentsArray's filteredArrayUsingPredicate:thePredicate)'s valueForKey:"path"

I have also tried Shane Stanley’s example from https://macscripter.net/viewtopic.php?pid=203093 but also without success.


set PDFRecord to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"2", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

set theFilter to {"001"}
set thePreds to {}
repeat with each in theFilter
    set end of thePreds to (current application's NSPredicate's predicateWithFormat:" (self[1] CONTAINS %@)" argumentArray:{each})
end repeat
set theArray to (current application's NSArray's arrayWithArray:PDFRecord)
set theListOfLists to (theArray's filteredArrayUsingPredicate:(current application's NSCompoundPredicate's andPredicateWithSubpredicates:thePreds)) as list

How can I write an AppleScript to group the records by their final three digits, analyze the data, and then move on to the next set of records, whose PDFFile’s path name is incremented by 1?

Hello,

What you call “Array Within Array” is really “Record Within Array”. Your script is very close to correct, but contains one mistake: you can’t refer to item 1 of a record (as in “self[1]”), because a record’s values are accessed only by their keys. Records have no indexing. So:

Example applying a compound predicate (ASObjC):


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"2", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

set theFilter to {"001.pdf"}
set thePreds to {}
repeat with each in theFilter
	set end of thePreds to (current application's NSPredicate's predicateWithFormat:" (PDFFile CONTAINS %@)" argumentArray:{each})
end repeat
set theArray to (current application's NSArray's arrayWithArray:PDFRecordsList)
set theListOfRecords to (theArray's filteredArrayUsingPredicate:(current application's NSCompoundPredicate's andPredicateWithSubpredicates:thePreds)) as list


A simple plain AppleScript solution:


set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"2", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

set filteredRecordList to {}
repeat with aRecord in PDFRecordsList
	if PDFFile of aRecord ends with "001.pdf" then set end of filteredRecordList to (contents of aRecord)
end repeat
return filteredRecordList


Example applying a simple predicate (ASObjC), ignoring case:


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.PDF", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"2", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

set thePredicate to current application's NSPredicate's predicateWithFormat:"PDFFile ENDSWITH[c] '001.pdf'"
set theArray to current application's NSArray's arrayWithArray:PDFRecordsList
set filteredrecordList to (theArray's filteredArrayUsingPredicate:thePredicate) as list

Hi akim.

KniazidisR has provided three different methods for extracting a group of records based on a particular path ending. But it’s not clear from your query whether or not you want to end up with a list of the groups or what you mean by “analyze each group”. Are the paths always identical except for the last three digits? Are records occasionally incomplete, like the last one in your example?

Building on KniazidisR’s third solution, and assuming you won’t know the last three digits in advance and that the paths may not be otherwise identical, the following produces a list of lists of records. Each list of records is a group with the same path ending and is sorted by EORPageNumber and subsorted by EORTotalPageCount. Hopefully this is useful:

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"1", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

-- Derive an array of dictionaries from the list of records.
set theArray to current application's NSMutableArray's arrayWithArray:PDFRecordsList
-- Sort the array on the dictionaries' EORPageNumbers, subsorting on their EORTotalPageCounts.
set onEORPageNumber to current application's NSSortDescriptor's sortDescriptorWithKey:"EORPageNumber" ascending:true
set onEORTotalPageCount to current application's NSSortDescriptor's sortDescriptorWithKey:"EORTotalPageCount" ascending:true
theArray's sortUsingDescriptors:{onEORPageNumber, onEORTotalPageCount}

-- Get a sorted array containing one copy of each path ending in the records.
set pathEndings to {}
repeat with thisRecord in PDFRecordsList
	set end of pathEndings to text -7 thru -1 of thisRecord's PDFFile
end repeat
set onSelf to current application's NSSortDescriptor's sortDescriptorWithKey:"self" ascending:true
set pathEndings to (current application's NSSet's setWithArray:pathEndings)'s sortedArrayUsingDescriptors:{onSelf}

-- Extract (as a list of records) each group containing the same path ending and add it to a group list.
set groupList to {}
repeat with thisEnding in pathEndings
	set filter to current application's NSPredicate's predicateWithFormat_("self.PDFFile ENDSWITH %@", thisEnding)
	set end of groupList to (theArray's filteredArrayUsingPredicate:filter) as list
end repeat
return groupList

Things are slightly simpler if the files are all in the same folder and their names are all in the format “PDFDirectoryYYYY-MM-DD-xxx.PDF”:

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"1", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

-- Derive an array of dictionaries from the list of records.
set theArray to current application's NSMutableArray's arrayWithArray:PDFRecordsList
-- Sort the array on the dictionaries' PDFFiles, subsorting on their EORPageNumbers, then their EORTotalPageCounts.
set onPDFFile to current application's NSSortDescriptor's sortDescriptorWithKey:"PDFFile" ascending:true
set onEORPageNumber to current application's NSSortDescriptor's sortDescriptorWithKey:"EORPageNumber" ascending:true
set onEORTotalPageCount to current application's NSSortDescriptor's sortDescriptorWithKey:"EORTotalPageCount" ascending:true
theArray's sortUsingDescriptors:{onPDFFile, onEORPageNumber, onEORTotalPageCount}

-- Get an array containing one copy of each path. (The NSOrderedSet itself may be enough.)
set filePaths to (current application's NSOrderedSet's orderedSetWithArray:(theArray's valueForKey:"PDFFile"))'s array()

-- Extract (as a list of records) each group containing the same path and add it to a group list.
set groupList to {}
repeat with thisName in filePaths
	set filter to current application's NSPredicate's predicateWithFormat_("self.PDFFile == %@", thisName)
	set end of groupList to (theArray's filteredArrayUsingPredicate:filter) as list
end repeat
return groupList
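
For the “analyze each group” step, which the question doesn’t spell out, something along these lines might be a starting point. It’s only a minimal sketch in plain AppleScript: the sample groupList is hypothetical, the names theReport, pagesSeen, expectedTotal and isComplete are placeholders, and the rule that a group is complete when it has as many distinct EORPageNumber values as its EORTotalPageCount is purely an assumption about what’s wanted.

```applescript
-- Hypothetical groupList in the same shape as the result above: a list of groups, each a list of records.
set groupList to {{{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", EORTotalPageCount:"2"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"2", EORTotalPageCount:"2"}}, {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", EORTotalPageCount:"2"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}}

set theReport to {}
repeat with thisGroup in groupList
	set pagesSeen to {}
	set expectedTotal to missing value
	repeat with thisRecord in thisGroup
		try -- Records lacking either property raise an error here and are skipped.
			set thisPage to (EORPageNumber of thisRecord) as integer
			set expectedTotal to (EORTotalPageCount of thisRecord) as integer
			-- Count each page number once, even if several records report it.
			if pagesSeen does not contain thisPage then set end of pagesSeen to thisPage
		end try
	end repeat
	-- Assumed rule: complete means one distinct page number per expected page.
	set isComplete to (expectedTotal is not missing value) and ((count of pagesSeen) = expectedTotal)
	set end of theReport to {pageNumbers:pagesSeen, totalPages:expectedTotal, isComplete:isComplete}
end repeat
-- theReport now holds one summary record per group, in the same order as groupList.
```

Whatever comparison is actually needed would replace the isComplete line; the try block is just there so that incomplete records like the last one in the example data don’t stop the script.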

I do this kind of grouping quite often. What you need to do is get an array of distinct values using:

set theUniqueValues to (theRecords's valueForKeyPath:"@distinctUnionOfObjects.PDFFile")

I noticed that you’re looking to group by the 00x.pdf ending.

Thankfully, grouping on the whole string returned by this gives the same result for your data. Otherwise you’d have to use some RegEx methods and create another array with unique values like 001.pdf, 002.pdf, etc.

Once you have theUniqueValues, loop through them and create a predicate based on each aUniqueValue. Then filter theRecords with that predicate.
You now have a “grouped” array.
You can then add that subarray to a masterGroupsArray or assign it as the value for aGroupKey in an NSMutableDictionary.

If you want to pull, say, the “00x” from aUniqueValue, you can use a RegEx to get that.

If you created theUniqueValues as {"001.pdf", "002.pdf", etc.}, you’ll have to create a predicate that uses either

  • contains or
  • matches (which uses RegEx pattern matching)

When I’m back at my computer I can post some code examples.
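
In the meantime, the steps described above can be sketched roughly like this. It’s a minimal sketch only: the sample data is shortened and hypothetical, theList and masterGroups are placeholder names, and each group is keyed by its whole path rather than by a shorter aGroupKey.

```applescript
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

-- Hypothetical sample data, shortened from the question.
set theList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", OCRMethod:"ap"}}
set theRecords to current application's NSArray's arrayWithArray:theList

-- One copy of each distinct PDFFile value (the order is not guaranteed).
set theUniqueValues to theRecords's valueForKeyPath:"@distinctUnionOfObjects.PDFFile"

-- Filter theRecords once per unique value and store each group under that value.
set masterGroups to current application's NSMutableDictionary's new()
repeat with aUniqueValue in theUniqueValues
	set aPredicate to current application's NSPredicate's predicateWithFormat_("PDFFile == %@", aUniqueValue)
	set aGroup to (theRecords's filteredArrayUsingPredicate:aPredicate)
	(masterGroups's setObject:aGroup forKey:aUniqueValue)
end repeat
-- masterGroups now maps each distinct path to an array of the records that share it.
```

Keying on the whole path only groups records whose paths are identical; grouping on just the “00x” part needs the extra RegEx step mentioned above.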

Here’s code that uses a RegEx to dig deeper and find the “xxx” before “.pdf”, and also uses a RegEx pattern in the matching predicate.

I added more records, changing the user names, trimming paths, etc.

Grouping on theUniqueValues alone fails to group by the ending.
Grouping on theUniqueSubValues works:

aGroupedArray1 has 8 items
aGroupedDict1 holds only one group per key, because later groups overwrite earlier ones with the same key (3 keys in total)

aGroupedArray2 has 3 items (subarrays)
aGroupedDict2 has 3 keys with subitems (4, 2, 4)


use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

-- classes, constants, and enums used
property NSMutableDictionary : a reference to current application's NSMutableDictionary
property NSPredicate : a reference to current application's NSPredicate
property NSArray : a reference to current application's NSArray
property NSRegularExpressionSearch : a reference to 1024
property NSString : a reference to current application's NSString
property NSMutableArray : a reference to current application's NSMutableArray

property theArray : {}
property theUniqueValues : {}
property theUniqueSubValues : {}

set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"1", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}, {PDFFile:"/Users/bob/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/rick/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2021-10-13-002.pdf", EORPageNumber:"1", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectoy003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/mickey/PDFDirectory2022-10-13-003.pdf"}}

-- Derive an array of dictionaries from the list of records.
set theArray to NSArray's arrayWithArray:PDFRecordsList
set theUniqueValues to (theArray's valueForKeyPath:"@distinctUnionOfObjects.PDFFile")

set theUniqueSubValues to NSMutableArray's array()
repeat with aUniqueValue in theUniqueValues
	set aUniqueSubValue to (my subStringFromString:aUniqueValue matchingPattern:"(\\d\\d\\d)(?=\\.pdf)")
	if not (theUniqueSubValues's containsObject:aUniqueSubValue) then
		(theUniqueSubValues's addObject:aUniqueSubValue)
	end if
end repeat

set aGroupedArray1 to my arrayOfGroupedItemsUsingUniqueValues:theUniqueValues
set aGroupedDict1 to my dictionaryOfGroupedItemsUsingUniqueValues:theUniqueValues

set aGroupedArray2 to my arrayOfGroupedItemsUsingUniqueSubValues:theUniqueSubValues
set aGroupedDict2 to my dictionaryOfGroupedItemsUsingUniqueSubValues:theUniqueSubValues

on arrayOfGroupedItemsUsingUniqueValues:uniqueValues
	set aArrayOfGroups to NSMutableArray's array()
	repeat with aUniqueValue in uniqueValues
		set aPredicate to NSPredicate's predicateWithFormat_("self.PDFFile == %@", aUniqueValue)
		set aGroupMatch to (theArray's filteredArrayUsingPredicate:aPredicate)
		(aArrayOfGroups's addObject:aGroupMatch)
	end repeat
	return aArrayOfGroups
end arrayOfGroupedItemsUsingUniqueValues:

on dictionaryOfGroupedItemsUsingUniqueValues:uniqueValues
	set aDictOfGroups to NSMutableDictionary's |dictionary|()
	repeat with aUniqueValue in uniqueValues
		set aPredicate to NSPredicate's predicateWithFormat_("self.PDFFile == %@", aUniqueValue)
		set aGroupMatch to (theArray's filteredArrayUsingPredicate:aPredicate)
		set aGroupKey to (my subStringFromString:aUniqueValue matchingPattern:"(\\d\\d\\d)(?=\\.pdf)")
		(aDictOfGroups's setValue:aGroupMatch forKey:aGroupKey)
	end repeat
	return aDictOfGroups
end dictionaryOfGroupedItemsUsingUniqueValues:

on arrayOfGroupedItemsUsingUniqueSubValues:uniqueSubValues
	set aArrayOfGroups to NSMutableArray's array()
	repeat with aUniqueSubValue in uniqueSubValues
		set aPattern to NSString's stringWithFormat_(".*(%@)\\.pdf", aUniqueSubValue) -- escape the dot so it only matches a literal "."
		set aPredicate to NSPredicate's predicateWithFormat_("self.PDFFile MATCHES[cd] %@", aPattern)
		set aGroupMatch to (theArray's filteredArrayUsingPredicate:aPredicate)
		(aArrayOfGroups's addObject:aGroupMatch)
	end repeat
	return aArrayOfGroups
end arrayOfGroupedItemsUsingUniqueSubValues:

on dictionaryOfGroupedItemsUsingUniqueSubValues:uniqueSubValues
	set aDictOfGroups to NSMutableDictionary's new()
	repeat with aUniqueSubValue in uniqueSubValues
		set aPattern to NSString's stringWithFormat_(".*(%@)\\.pdf", aUniqueSubValue) -- escape the dot so it only matches a literal "."
		set aPredicate to NSPredicate's predicateWithFormat_("self.PDFFile MATCHES[cd] %@", aPattern)
		set aGroupMatch to (theArray's filteredArrayUsingPredicate:aPredicate)
		set aGroupKey to aUniqueSubValue
		(aDictOfGroups's setValue:aGroupMatch forKey:aGroupKey)
	end repeat
	return aDictOfGroups
end dictionaryOfGroupedItemsUsingUniqueSubValues:

on subStringFromString:aString matchingPattern:aPattern
	set aSubString to ""
	set aSource to NSString's stringWithString:aString
	set aRegExPattern to NSString's stringWithString:aPattern
	set aRange to (aSource's rangeOfString:aRegExPattern options:NSRegularExpressionSearch)
	(*
		log {"aSource is:", aSource}
		log {"aRegExPattern is:", aRegExPattern}
		log {"aRange is:", aRange}
	*)
	if (aRange's |length|() > 0) then
		set aSubString to (aSource's substringWithRange:aRange)
	end if
	return aSubString
end subStringFromString:matchingPattern:


Akim has many fine suggestions to choose from and doesn’t need another one, so my suggestion is FWIW. The script returns an array of arrays and each subarray contains one or more dictionaries. The subarrays are sorted by file counter.

Edit: December 4, 2022
After writing and posting my script, I studied the other suggestions, and Nigel’s script does exactly what my script does but is simpler and probably faster. So, no reason to consider my suggestion.

use framework "Foundation"
use scripting additions

set theRecords to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"2", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}
set theDictionaries to current application's NSMutableArray's arrayWithArray:theRecords

-- add file ID to each dictionary
set thePattern to "(?im)^.*(\\d{3})\\.pdf$"
repeat with i from 1 to (theDictionaries's |count|())
	set aDictionary to (theDictionaries's objectAtIndex:(i - 1))
	set theFile to (aDictionary's valueForKey:"PDFFile")
	set fileID to (theFile's stringByReplacingOccurrencesOfString:thePattern withString:"$1" options:1024 range:{0, theFile's |length|()})
	try -- skip files without a counter
		(aDictionary's setObject:(fileID as integer) forKey:"theID")
		(theDictionaries's replaceObjectAtIndex:(i - 1) withObject:aDictionary)
	end try
end repeat

-- get sorted array of unique file IDs and create array of arrays
set theIDs to (theDictionaries's valueForKeyPath:"@distinctUnionOfObjects.theID")'s sortedArrayUsingSelector:"compare:"
set theData to current application's NSMutableArray's new()
repeat with anID in theIDs
	set thePredicate to current application's NSPredicate's predicateWithFormat_("(theID == %@)", anID)
	set matchingDictionaries to (theDictionaries's filteredArrayUsingPredicate:thePredicate)
	(theData's addObject:matchingDictionaries)
end repeat

return theData -- an array of arrays with each subarray containing 1 or more dictionaries

Just for learning purposes, I spent some additional time with Nigel’s first suggestion and thought of two possible enhancements. The first notifies the user if a file does not have a counter, and the second makes PDF file extensions case insensitive.

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set PDFRecordsList to {{PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.PDF", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"1", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

-- Derive an array of dictionaries from the list of records.
set theArray to current application's NSMutableArray's arrayWithArray:PDFRecordsList
-- Sort the array on the dictionaries' EORPageNumbers, subsorting on their EORTotalPageCounts.
set onEORPageNumber to current application's NSSortDescriptor's sortDescriptorWithKey:"EORPageNumber" ascending:true
set onEORTotalPageCount to current application's NSSortDescriptor's sortDescriptorWithKey:"EORTotalPageCount" ascending:true
theArray's sortUsingDescriptors:{onEORPageNumber, onEORTotalPageCount}

-- Get a sorted array containing one copy of each path ending in the records.
set thePattern to "(?im)^.*(\\d{3}\\.pdf)$"
set pathEndings to current application's NSMutableArray's new()
repeat with aDictionary in theArray
	set theFile to (aDictionary's valueForKeyPath:"PDFFile.lowercaseString")'s mutableCopy()
	set matchCount to (theFile's replaceOccurrencesOfString:thePattern withString:"$1" options:1024 range:{0, theFile's |length|()})
	if matchCount > 0 then
		(pathEndings's addObject:theFile)
	else
		set fileName to theFile's lastPathComponent()
		display alert "A file did not contain a counter and will be skipped" message (fileName as text)
	end if
end repeat
set theDescriptor to current application's NSSortDescriptor's sortDescriptorWithKey:"self" ascending:true
set pathEndings to (current application's NSSet's setWithArray:pathEndings)'s sortedArrayUsingDescriptors:{theDescriptor}

-- Extract (as a list of records) each group containing the same path ending and add it to a group list.
set groupList to {}
repeat with thisEnding in pathEndings
	set filter to current application's NSPredicate's predicateWithFormat_("self.PDFFile ENDSWITH[c] %@", thisEnding)
	set end of groupList to (theArray's filteredArrayUsingPredicate:filter) as list
end repeat
return groupList

Thanks, peavine. There are some good ideas in all the solutions offered so far, although which are best or most relevant will depend on feedback from akim.

Looking through your contribution immediately above, I’d be tempted to filter dictionaries with non-conforming PDFFile values from theArray immediately, not checking in the repeat or bothering to display dialogs. But again, it would depend on what was actually required. Since the repeat builds pathEndings as a mutable array and then derives a set from it, it might as well build a mutable set anyway.

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set PDFRecordsList to {{PDFFile:"I am a banana"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.PDF", EORPageNumber:"1", PrepositionWord:"el", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-001.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ts"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-002.pdf", EORPageNumber:"1", PrepositionWord:"ot", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf", EORPageNumber:"1", PrepositionWord:"of", EORTotalPageCount:"2", OCRMethod:"ap"}, {PDFFile:"/Users/alan/Desktop/PDFDirectory2022-10-13-003.pdf"}}

-- Derive an array of dictionaries from the list of records.
set theArray to current application's NSMutableArray's arrayWithArray:PDFRecordsList
-- Lose any dictionaries with nonconforming PDFFile values.
set thePattern to "(?im)^.*(\\d{3}\\.pdf)$"
set filter to current application's NSPredicate's predicateWithFormat_("PDFFile MATCHES %@", thePattern)
theArray's filterUsingPredicate:filter
-- Sort the array on the dictionaries' EORPageNumbers, subsorting on their EORTotalPageCounts.
set onEORPageNumber to current application's NSSortDescriptor's sortDescriptorWithKey:"EORPageNumber" ascending:true
set onEORTotalPageCount to current application's NSSortDescriptor's sortDescriptorWithKey:"EORTotalPageCount" ascending:true
theArray's sortUsingDescriptors:{onEORPageNumber, onEORTotalPageCount}

-- Get a sorted array containing one copy of each path ending in the records.
set pathEndings to current application's NSMutableSet's new()
repeat with theFile in (theArray's valueForKeyPath:"PDFFile.lowercaseString")
	(pathEndings's addObject:(theFile's stringByReplacingOccurrencesOfString:thePattern withString:"$1" options:1024 range:{0, theFile's |length|()}))
end repeat
set theDescriptor to current application's NSSortDescriptor's sortDescriptorWithKey:"self" ascending:true
set pathEndings to pathEndings's sortedArrayUsingDescriptors:{theDescriptor}

-- Extract (as a list of records) each group containing the same path ending and add it to a group list.
set groupList to {}
repeat with thisEnding in pathEndings
	set filter to current application's NSPredicate's predicateWithFormat_("self.PDFFile ENDSWITH[c] %@", thisEnding)
	set end of groupList to (theArray's filteredArrayUsingPredicate:filter) as list
end repeat
return groupList

Thanks, Nigel. That’s much cleaner and, I suspect, faster.