If you decide to use “my” version, here is a newer version that has fewer bugs:
on listGetter(folder_to_scan, scan_level, folder_exceptions)
    --exceptions formatted for shell find
    copy folder_exceptions to folder_exceptions
    repeat with fe_ref in folder_exceptions
        set contents of fe_ref to quoted form of contents of fe_ref
    end repeat
    set ASTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to " -or -name "
    set exclude_code to text 6 thru -1 of ("" & ({""} & folder_exceptions))
    set AppleScript's text item delimiters to ASTID
    --do shell find with exceptions
    do shell script "/usr/bin/find " & (quoted form of POSIX path of folder_to_scan) & " ! \\( \\( " & exclude_code & " \\) -prune \\) -maxdepth " & scan_level & " -mindepth " & scan_level & " | sed -e 's|//*|/|g; y|:/|/:|; /^:Volumes:/ ! s/^/'\"$(ls -F /Volumes | sed -ne 'y|:/|/:|; s|[/&\\]|\\\\&|g; s|@$||p;')\"'/; s|^:Volumes:||'"
    paragraphs of result
end listGetter
Here is that do shell script command again, with some embedded comments:
-- Embedded newlines below...
do shell script "/usr/bin/find " & (quoted form of POSIX path of folder_to_scan) & " ! \\( \\( " & exclude_code & " \\) -prune \\) -maxdepth " & scan_level & " -mindepth " & scan_level & " | sed -e ' # Convert POSIX to Mac, prepending name of startup disk or stripping /Volumes/
s,//*,/,g; # delete multiple adjacent slashes, they are ignored in POSIX paths, but adjacent colons can cause problems in Mac paths
y|:/|/:|; # swap all colons and slashes
/^:Volumes:/ ! s/^/'\"$(ls -F /Volumes | sed -ne '# Convert POSIX to Mac and print out only symlinks from ls -F
y|:/|/:|; # swap all colons and slashes
s|[/&\\]|\\\\&|g; # the output will be used as the replacement for a slash delimited s command; escape slash, ampersand and backslash in the name of the startup disk
s|@$||p; # only print out lines that end with the at sign (which means that the named directory entry is a symlink)
')\"'/; # if it did NOT start with /Volumes/, prepend the name of the startup disk
s/^:Volumes://; # if it started with /Volumes/, then just strip it, the next part should be the volume name
'"
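To try the path conversion on its own, outside of do shell script, here is a rough shell sketch of the same sed steps. The function name posix_to_mac is made up for this illustration, and it takes the startup disk's name (as it appears under /Volumes) as an argument instead of discovering it with ls -F, so treat it as an approximation of the command above rather than a drop-in replacement.

```shell
# posix_to_mac DISK_NAME (hypothetical helper, not part of the handler)
# Reads POSIX paths on stdin and writes colon-delimited Mac paths.
# DISK_NAME is assumed to be the startup disk's name as seen in /Volumes.
posix_to_mac() {
    # Swap colons and slashes in the disk name, then escape slash,
    # ampersand and backslash so it is safe as an s/// replacement.
    _disk=$(printf '%s\n' "$1" | sed -e 'y|:/|/:|' -e 's|[/&\\]|\\&|g')
    sed -e 's|//*|/|g' \
        -e 'y|:/|/:|' \
        -e "/^:Volumes:/!s/^/$_disk/" \
        -e 's|^:Volumes:||'
}

# Repeated slashes collapse, and the startup disk name is prepended:
printf '%s\n' '/Users/me//Docs' | posix_to_mac 'Macintosh HD'
# prints: Macintosh HD:Users:me:Docs

# Paths under /Volumes just lose the prefix; a POSIX colon becomes a Mac slash:
printf '%s\n' '/Volumes/Backup/a:b' | posix_to_mac 'Macintosh HD'
# prints: Backup:a/b
```

Passing the disk name in as an argument makes the escaping step easy to test with awkward names without renaming a real volume.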
Changes from previous versions:
• included paragraphs of … so that the handler produces a list (I presume you were always doing this with the output of this handler, so I pulled it into the handler; the output is now a list instead of one long newline-delimited string, which also makes it comparable to the output of the last version in this post)
• regularized the folder exception building code a bit
• added -prune to the find commands to abort descending into trees that will be excluded from the output (this may not be as important since you are using -mindepth and -maxdepth (which seem to always be evaluated, even in places where I would expect otherwise), but it is a common time saver in most uses of find where directories are skipped)
• use an alternate delimiter in some sed commands to avoid having to escape actual slash characters
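For anyone who has not used -prune before, here is a small sketch of what it does when no depth limits are involved. It builds a throwaway tree with mktemp; the helper names make_demo_tree and list_without and the folder name "skip" are invented for the demonstration. It uses the same "! \( \( -name … \) -prune \)" shape as the handler, but leaves out -mindepth and -maxdepth so the pruning itself is visible.

```shell
# Build a throwaway tree: keep/a, skip/b, keep/skip/c (hypothetical names).
make_demo_tree() {
    root=$(mktemp -d) || return 1
    mkdir -p "$root/keep/skip" "$root/skip"
    : > "$root/keep/a"
    : > "$root/skip/b"
    : > "$root/keep/skip/c"
    printf '%s\n' "$root"
}

# list_without NAME ROOT
# Print everything under ROOT except entries called NAME; find does not
# descend into a pruned directory, so its contents are never visited.
list_without() {
    (cd "$2" && find . ! \( \( -name "$1" \) -prune \) -print | sort)
}

tree=$(make_demo_tree)
list_without skip "$tree"
# prints:
# .
# ./keep
# ./keep/a
rm -rf "$tree"
```

Neither skip/b nor keep/skip/c shows up, because find never walks into either "skip" folder.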
Bugs fixed from my most recently posted version:
• folder exceptions quoted properly (a double quote, dollar sign, backquote, or backslash in one of the exceptional names could have caused problems in the previous code that just wrapped them in double quotes)
• repeated slashes in the POSIX pathname are collapsed into a single slash (a different fix for the double colon problem that Bruce Phillips noted)
• tr invocation subsumed by equivalent sed functionality (the sed y-command has all of the functionality of the tr program that we need here; I must have had tunnel vision for sed s-commands at the time I proposed using tr); this is not really a bug, but using tr was less efficient
• colons in the POSIX startup disk name are translated to slashes in the Mac startup disk name
• slashes, ampersands and backslashes in the startup disk’s name no longer cause syntax errors for sed
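To show what the quoting fix buys, here is a shell-only sketch of roughly what "quoted form of" plus the " -or -name " join produce. The function names sh_quote and build_exclude_code are hypothetical; the handler does this work in AppleScript, so this is only an approximation for experimenting in a terminal.

```shell
# sh_quote STRING (hypothetical): wrap STRING in single quotes, turning each
# embedded single quote into '\'' so the shell reads it back unchanged.
sh_quote() {
    _escaped=$(printf '%s' "$1" | sed "s/'/'\\\\''/g")
    printf "'%s'" "$_escaped"
}

# build_exclude_code NAME... (hypothetical): join the quoted names with
# " -or -name ", similar to the exclude_code string built in the handler.
build_exclude_code() {
    _code=
    for _name in "$@"; do
        _code="$_code${_code:+ -or }-name $(sh_quote "$_name")"
    done
    printf '%s\n' "$_code"
}

build_exclude_code 'a b' "it's"
# prints: -name 'a b' -or -name 'it'\''s'
```

Inside single quotes the shell leaves double quotes, dollar signs, backquotes and backslashes alone, which is why this style is safer than wrapping the names in double quotes.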
Known bugs:
• file/folder names with embedded newlines will not be properly handled (impossible to fix with the newline-delimited data format that find and sed are using here)
Fixing the newline bug with any version of sed is going to be impossible or pretty unreliable (depending on how the implementation of sed chooses to behave). One way to fix the newline bug would be to use the -print0 primary of find to produce null-terminated pathnames and combine that with a language that can handle embedded null characters and that can also do the required string manipulation. Various UNIXy languages could fill this role, but so can AppleScript.
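Here is a quick way to see the newline problem and the -print0 remedy from a terminal. This is only a sketch: the function names count_by_lines and count_by_nulls are made up, and it merely counts entries rather than converting paths. It assumes a filesystem that allows newlines in file names.

```shell
# count_by_lines DIR (hypothetical): count entries by counting output lines.
count_by_lines() {
    echo $(( $(find "$1" -mindepth 1 | wc -l) ))
}

# count_by_nulls DIR (hypothetical): count entries by counting the null
# terminators that -print0 writes after every pathname.
count_by_nulls() {
    echo $(( $(find "$1" -mindepth 1 -print0 | tr -cd '\000' | wc -c) ))
}

dir=$(mktemp -d)
nl='
'
: > "$dir/one${nl}name"    # a single file whose name contains a newline

count_by_lines "$dir"      # prints 2: the name is split across two lines
count_by_nulls "$dir"      # prints 1: one terminator per pathname
rm -rf "$dir"
```

A newline can appear inside a file name, but a null character cannot, so the null-terminated count always matches the real number of entries.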
Here is my take on teaming up “find -print0” with AppleScript (maybe this is something like what Bruce Phillips had in mind when he mentioned using POSIX file). It is (much) slower than the sed versions, but the conversion it does is probably about as reliable as possible:
on listGetter(folder_to_scan, scan_level, folder_exceptions)
    --exceptions formatted for shell find
    copy folder_exceptions to folder_exceptions
    repeat with fe_ref in folder_exceptions
        set contents of fe_ref to quoted form of contents of fe_ref
    end repeat
    set ASTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to " -or -name "
    set exclude_code to text 6 thru -1 of ("" & ({""} & folder_exceptions))
    set AppleScript's text item delimiters to ASTID
    --do shell find with exceptions
    do shell script "/usr/bin/find " & (quoted form of POSIX path of folder_to_scan) & " ! \\( \\( " & exclude_code & " \\) -prune \\) -maxdepth " & scan_level & " -mindepth " & scan_level & " -print0 ; true" without altering line endings
    set find0 to result
    if find0 is "" then return {} -- no matches; avoids slicing an empty string below
    set {ASTID, text item delimiters} to {text item delimiters, {ASCII character 0}}
    try
        set POSIX_pathnames to text items 1 through -2 of find0 -- Drop the last text item because it is always empty (find -print0 always prints a trailing null).
        set text item delimiters to ASTID
    on error m number n from o partial result r to t
        set text item delimiters to ASTID
        error m number n from o partial result r to t
    end try
    script speedHack
        property Mac_pathnames : {}
    end script
    repeat with P_pn in POSIX_pathnames
        set end of speedHack's Mac_pathnames to (POSIX file (contents of P_pn)) as Unicode text
    end repeat
    speedHack's Mac_pathnames
end listGetter
All the bugs I have found (as of this moment) in the sed versions are handled in this “find -print0”/AppleScript version (including the newline bug). Like the first handler in this post, this one yields a list of Mac pathnames.
Symbolic links show up a bit differently in the outputs of the first and last versions in this post. The sed versions yield the Mac path to the symlink itself, while this “find -print0” and “(POSIX file …) as Unicode text” version yields the Mac path to the target of the symlink. Since the target of the symlink may be at a different depth, you may not be expecting it to be included in the output. For those familiar with typical UNIX program options, this is similar to the “follow symlinks” behavior that many UNIX programs offer. If you want to exclude symlinks, you could add "-type l -or " to the front of exclude_code.
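Here is a sketch of the symlink exclusion at the find level, so the effect of "-type l -or " can be checked before it goes into exclude_code. The function name list_level1_no_symlinks and the file names are invented for this demonstration. Unlike the handler, it puts -mindepth and -maxdepth first; that ordering is my assumption to keep it quiet on other find implementations, and it is not meant to change the result.

```shell
# list_level1_no_symlinks DIR NAME (hypothetical)
# List the depth-1 entries of DIR, leaving out symlinks and anything
# called NAME, with the same "! \( \( ... \) -prune \)" shape as the handler.
list_level1_no_symlinks() {
    (cd "$1" && find . -mindepth 1 -maxdepth 1 \
        ! \( \( -type l -or -name "$2" \) -prune \) | sort)
}

dir=$(mktemp -d)
: > "$dir/real.txt"
ln -s real.txt "$dir/link"      # a symlink at the scanned depth
mkdir "$dir/skip" "$dir/other"

list_level1_no_symlinks "$dir" skip
# prints:
# ./other
# ./real.txt
rm -rf "$dir"
```

Without the "-type l -or " part, ./link would be listed as well.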