Here’s an expansion of Shane’s second script which ignores intermediate words not beginning with capitals and deals with “Mc” and “St” surnames in the BT Phone Book manner. The assumption is that SunnyFrinton hasn’t included people’s middle names in his file, so the second space-preceded capital initial in each line marks the beginning of the sequence on which to sort.
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions
set peeps to "Adam de Collins,27 The Mill Apartments Colchester CO1 2QT,01206 863601,
Alan Armstrong,Ruby Dene Clacton Road Weeley Heath CO16 9DN,01255 830942,
Alan le Irwin,53 Greenacres Clacton on Sea Essex CO15 6LZ, ,
Alison Lightly,2 Upper Second Avenue Frinton on Sea CO13 9LL,01255 677407,
Alison O'Reilly,Small World Coggeshall Road Dedham CO7 6ET,01206 323363,
Amanda Elliot,6 Manor Road Great Holland Essex CO13 0JT,01255 674057,
Andrea Poulter,89 Rainham Way Frinton on Sea CO13 9NT,01255 673293,
Andrew Theobald,7 St.Andrews Place Brightlingsea CO7 0RH,01206 303000,
Angela Evans,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Paul Loup Sulitzer,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Paul-Loup Sulitzer,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Jean Paul Sartre,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Jean-Paul Sartre,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Bret Easton Ellis,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Linda Joy Singleton,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Frank Lloyd Wright,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Charles de Gaulle,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Antoine de Saint-Exupéry,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885,
Daphne du Maurier,7 Regency Lodge Clacton on Sea CO15 2AN,07540 433885
Maurice Béjart,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885,
Maève Ravenwood,8 Regency Lodge Clacton on Sea CO11 2AN,07540 433885,
Maëlys Moongoddess,9 Regency Lodge Clacton on Sea CO12 2AN,07540 433885,
Brièle Ironwood,10 Regency Lodge Clacton on Sea CO13 2AN,07540 433885,
Maurice Bôjart,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Angus MacTavish,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Doctor McCoy,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Norman St. John-Stevas,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Valeska Saab,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Sheila Staefel,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Charles-Augustin Sainte-Beuve,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Brett D'Oliveira,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885
Fred Dexter,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885"
set theString to current application's NSMutableString's stringWithString:peeps
-- get line break string used
set lineBreakRange to theString's rangeOfString:"[\\r\\n]+" options:(current application's NSRegularExpressionSearch)
set lineBreakString to theString's substringWithRange:lineBreakRange
-- Find the first name(s) and the rest of the name in each line, and insert them in reverse order at the beginning of the line, followed by two commas.
theString's replaceOccurrencesOfString:"(?m)(^.(?:[^[:upper:]]|(?<! )[[:upper:]])++)([^,]+).++$" withString:"$2 $1,,$0" options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}
-- Expand any "Mc" now at the beginning of a line to "Mac" if followed by a capital letter.
theString's replaceOccurrencesOfString:"(?m)^Mc(?=[[:upper:]])" withString:"Mac" options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}
-- Similarly expand "St" or "St." to "Saint".
theString's replaceOccurrencesOfString:"(?m)^St(e)?[. ]? " withString:"Saint$1 " options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}
-- Now replace any hyphens in the sorting names with spaces and apostrophes with nothing.
repeat
set wholeRange to {0, theString's |length|()}
set changesMade to (theString's replaceOccurrencesOfString:"(?m)^([^,-]+)-" withString:"$1 " options:(current application's NSRegularExpressionSearch) range:(wholeRange)) -- No change to string length.
set changesMade to changesMade + (theString's replaceOccurrencesOfString:"(?m)^([^,'']+)['']" withString:"$1" options:(current application's NSRegularExpressionSearch) range:(wholeRange)) -- Possible shortening.
if changesMade = 0 then exit repeat
end repeat
-- split into paragraphs and sort them
set theArray to theString's componentsSeparatedByString:lineBreakString
set theArray to theArray's sortedArrayUsingSelector:"localizedCaseInsensitiveCompare:"
-- rejoin sorted paragraphs
set theString to theArray's componentsJoinedByString:lineBreakString
-- remove sorting strings
set theString to (theString's stringByReplacingOccurrencesOfString:"(?m)^[^,]+,," withString:"" options:(current application's NSRegularExpressionSearch) range:{0, theString's |length|()}) as text
Edits: Handling of “St” improved. French “Ste.”s keep their gender, but I don’t know if that’s right for sorting purposes.
Hyphens in the doctored names are now replaced with spaces to eliminate their influence on the sort.
Dehyphenisation code replaced with the improvement suggested by Shane below (post #32). Apostrophes are now removed as well to exclude them from the sort.
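To see what sorting string the script derives for a particular entry, something like the following sketch may help. The handler name sortKeyForLine is hypothetical and not part of the script above. It reuses the same three patterns, but keeps only the rearranged name rather than prefixing it to the line, and it does the hyphen and apostrophe replacements as plain literal replacements, which is a simplification that is only safe here because the address has already been discarded. It assumes each line contains at least one comma after the name, as in the sample data.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- Hypothetical helper for experimenting: returns the sorting string for one line of the file.
-- Assumes the line has the form "Name,address,phone" with at least one comma after the name.
on sortKeyForLine(aLine)
	set regexSearch to current application's NSRegularExpressionSearch
	set literalSearch to current application's NSLiteralSearch
	set s to current application's NSMutableString's stringWithString:aLine
	-- Same pattern as in the main script, but the whole line is replaced by "rest-of-name first-name(s)".
	s's replaceOccurrencesOfString:"(?m)(^.(?:[^[:upper:]]|(?<! )[[:upper:]])++)([^,]+).++$" withString:"$2 $1" options:regexSearch range:{0, s's |length|()}
	-- Expand a leading "Mc" to "Mac" if followed by a capital letter.
	s's replaceOccurrencesOfString:"(?m)^Mc(?=[[:upper:]])" withString:"Mac" options:regexSearch range:{0, s's |length|()}
	-- Expand a leading "St", "St." or "Ste" to "Saint" or "Sainte".
	s's replaceOccurrencesOfString:"(?m)^St(e)?[. ]? " withString:"Saint$1 " options:regexSearch range:{0, s's |length|()}
	-- Only the name is left at this point, so hyphens and straight apostrophes can be replaced literally.
	s's replaceOccurrencesOfString:"-" withString:" " options:literalSearch range:{0, s's |length|()}
	s's replaceOccurrencesOfString:"'" withString:"" options:literalSearch range:{0, s's |length|()}
	return s as text
end sortKeyForLine

sortKeyForLine("Doctor McCoy,7 Regency Lodge Clacton on Sea CO10 2AN,07540 433885")
```

By my reading of the patterns, the call above should return "MacCoy Doctor " (with a trailing space), and a line beginning "Norman St. John-Stevas," should give "Saint John Stevas Norman ". Trying a few names this way is a quick check on whether the second-capital assumption suits a given file before sorting the whole thing.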