Loading file list from a folder

Hi.

It’s to preserve the contents of the list set at the top of the script, in case it’s needed again somewhere else in the script.

The list passed to the handler is the actual one set at the top of the script. Although the extensionList variables inside and outside the handler are different variables with the same name, the list they contain is initially the same one. copy extensionList to extensionList makes a copy of the list and resets the local extensionList variable to the copy, so that any changes made during the repeat which follows happen in the copy and not in the original.

It may be a bit confusing that I’ve used the same variable label throughout. But it’s nothing to do with the variables, it’s to do with the list itself. A copy would still have to be made (if you wanted to preserve the original) even if the variables all had different labels. It’s something of which you have to be aware when passing lists to handlers or assigning them to other variables.

set extensionList to {"pdf", "zip", "dmg", "jpg"}
set sourceFolder to (path to downloads folder)

set myFileList to its listFilesWithGivenExtensions(sourceFolder, extensionList)

on listFilesWithGivenExtensions(theSourceFolder, theExtensionList)
	-- Blah blah blah
	
	copy theExtensionList to extensionListCopy -- Make a copy of the list.
	repeat with thisExtension in extensionListCopy
		-- Possibly modify the copy.
	end repeat
	
	-- etc.
end listFilesWithGivenExtensions
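
The sharing behaviour described above can be seen in a minimal sketch. The variable names here (originalList, sharedList, copiedList) are made up for illustration and aren't part of the script under discussion:

```applescript
set originalList to {"pdf", "zip"}

set sharedList to originalList -- 'set' makes both variables refer to the same list.
copy originalList to copiedList -- 'copy' creates an independent duplicate.

set end of sharedList to "dmg" -- Modifies the one list shared by both variables.

originalList --> {"pdf", "zip", "dmg"}
copiedList --> {"pdf", "zip"}
```

The same applies when a list is passed to a handler: the parameter variable receives the same list, not a duplicate of it.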

Only if the user wishes to work with HFS paths or AppleScript aliases rather than POSIX paths, and only marginally so because of the repeat loop that would be needed for the conversions.

While the shell solution presented above is robust owing to the find command’s extraordinary capabilities (of which my example, of course, barely scratches the surface), to my knowledge, the shell offers no easy way to convert from POSIX paths to HFS paths other than to “cheat” via an osascript command. Still, one can do amazing things (and amazing damage if used improperly) with the find command, all the more so if options such as -exec or -delete are incorporated. I would think it should be somewhere in one’s toolbox. :slight_smile:
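
For reference, the repeat loop mentioned above might look something like the following. It's only a sketch: the folder and the '*.pdf' filter are assumptions chosen for illustration, and it would break on file names containing line breaks, since the shell output is split into paragraphs.

```applescript
set sourceFolder to (path to downloads folder)

-- Let 'find' return the POSIX paths of the matching files, one per line.
set posixPaths to paragraphs of (do shell script "find " & sourceFolder's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf'")

-- Convert each POSIX path to an HFS path in AppleScript itself.
set hfsPaths to {}
repeat with thisPath in posixPaths
	set end of hfsPaths to (POSIX file (thisPath's contents)) as text
end repeat

return hfsPaths
```

Coercing to alias instead of text in the loop should give aliases rather than HFS paths.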

Thanks for the tip. I wasn’t aware of that potential problem.

Thanks for the nice handler that adds filtering to the System Events solution.

What a clever way to subselect items from an input list!

I was thinking that the text item delimiters property might be a particularly efficient way of filtering paths. I did a quick execution speed test comparing a variation of your method vs Nigel’s technique of nested repeat loops. I must admit that I fully expected the former to be faster, but lo and behold, the nested repeat loops turned out to be about 1.7 x faster:


tell application "System Events" to set hfsPaths to path of files of folder [HFS path to parent folder containing 600 files with various file endings, including "pdf", "txt", and others]

set extensionList to {".pdf", ".txt"}

on getPathsViaTID(hfsPaths, extensionList)
	set tid to AppleScript's text item delimiters
	try
		set AppleScript's text item delimiters to extensionList
		set targetPaths to {}
		repeat with thisPath in hfsPaths
			tell thisPath's text items to if (length > 1) and (last item = "") then set end of targetPaths to thisPath's contents
		end repeat
	end try
	set AppleScript's text item delimiters to tid
	return targetPaths
end getPathsViaTID

on getPathsViaRepeatLoops(hfsPaths, extensionList)
	set targetPaths to {}
	repeat with thisPath in hfsPaths
		tell thisPath's contents
			repeat with thisExtension in extensionList
				if (it ends with thisExtension's contents) then
					set end of targetPaths to it
					exit repeat
				end if
			end repeat
		end tell
	end repeat
	return targetPaths
end getPathsViaRepeatLoops

-- Result: For 100 repetitions of each handler, getPathsViaRepeatLoops was about 1.7 x faster than getPathsViaTID!

That’s not what the text items are for in my example. The AST list folder command returns the contents of a folder using CoreFoundation’s URL enumerator. I use the text items to create a regular expression to filter out names, as you can with the AST copy list command.

No, it’s not. It’s not even a superset. It’s a bridge, a hack into the runtime of another environment whose paradigm is completely different. It’s as pure as PyObjC is pure Python, which it is not. ASObjC is more limited than PyObjC and basically one-directional. Even if the AppleScript engineers have updated AppleScript many times to make ASObjC less like an alien in the AS language, that doesn’t make it pure AppleScript. In principle it’s the same as the call method command back in AS-Studio (which was limited to classes and didn’t work with instances).

Not so much clever as not particularly well known, nor, I suppose, very often useful. If a list contains AppleScript objects, they can be referenced by class in the same way as, say, words or paragraphs in text, or files or folders in a folder belonging to the Finder or System Events.

set aList to {path to desktop, "aardvark", 17, {1, 2, 3}, 4, "hello", {a:"apple", b:"banana"}, 5.0, 7, "world"}

aList's records --> {{a:"apple", b:"banana"}}
aList's third integer --> 7
count aList --> 10
count aList's text --> 3
count aList's first text --> 8
-- etc.

DJ’s already pointed out that the delimiters in his script are only used to put together the regex. Everything else is done by his OSAX. On my machine, the speed’s about the same as the ASObjC solution further up this thread — that is, about ten times as fast as my vanilla script.

‘my extensionList’ refers to the extensionList variable in the run handler, not the local parameter variable of the same name. They contain the same list, so it’s not a problem. Strictly speaking, though, either the ‘my’ should be omitted or the extension list shouldn’t be passed as a parameter.

A basic idea behind AppleScript is its “plug-in” architecture. There’s the core language, which is actually quite small but can do a lot. Then there’s the ability to add commands supplied by separate OSAXen and commands belonging to applications whose authors have included suitable scripting interfaces. Over the years, the abilities to run shell scripts, simulate user actions in the GUI, and, recently, access some of the system’s Objective-C frameworks have been added. On the one hand, it’s a bewildering array of things to learn. On the other, it offers a vast choice of solutions from which an expert can select what he/she feels is the most appropriate. On the third hand ( :slight_smile: ), it can be approached from a number of different directions to suit people coming from different programming backgrounds. Complete beginners (English speakers, at least) should find the core language fairly easy to grasp. People familiar with Unix or languages like Python or Ruby can go straight to ‘do shell script’ and achieve a lot of what they want to do straight away in a way that they already know. Hard-core Objective-C programmers should be able to adapt to ASObjC without too much trouble if they need to. Once a start’s been made, you can “add on” any additional knowledge you need much as the language adds on extensions.

So “pure AppleScript” is a term rather like “thoroughbred mongrel”. In as far as it means anything, I’d personally regard “pure AppleScript” as being the core language and (when I’m in a flexible mood!) the StandardAdditions OSAX.

I could probably live with your inflexible definition. But I struggle with the need to Balkanize in the first place, especially with loaded terms like “pure”. The core language is not very useful by itself – that’s why the hack that is scripting additions was added before it was even released – and in many ways it’s stuck in a time warp. It’s the “impure” bits that have helped keep it alive.

Sorry, that was a typo left behind from when I first put the code together and had extensionList coded as a property rather than a local variable. I corrected the entry.

Of course. My mistake. I had been thinking of using text item delimiters as in my example and noticed its presence in yours, but didn’t look closely enough to see the difference in usage.

It’s clear from the posts above that there are multiple ways of getting the HFS paths or AppleScript aliases of the files in a folder. But as I mentioned earlier, I am not aware of any shell solutions for getting that information, other than to “cheat” with the osascript command. So for the fun of it and just so that it’s out there, I put together the following shell solution that is not quite pure since it requires an AppleScript run script command, but the heavy lifting is done by the shell.

To get the HFS paths or AppleScript aliases of all pdf files in a parent folder:


set hfsPaths to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

--or--

set applescriptAliases to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

And to get the HFS paths or AppleScript aliases of all pdf, txt, and jpg files in a parent folder:


set hfsPaths to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o  -iname '*.txt' -o  -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

--or--

set applescriptAliases to run script "{" & (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o  -iname '*.txt' -o  -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$//'") & "}"

Notes: 1) The -name primaries have been changed to -iname so that file extension searching will be case-insensitive. 2) Since this post was first submitted, the curly braces have been transferred from the do shell script command to the run script command so that an empty list will be returned in the case of no matching files. 3) This approach will fail if an HFS path of an item in the parent folder has a double-quote character in its name.

Here are the same solutions but with the curly braces incorporated into the do shell script command, and with additional examples in which all files are returned (i.e., no filtering is performed based on file name extension):

To get the HFS paths or AppleScript aliases of all files in a parent folder:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

To get the HFS paths or AppleScript aliases of all pdf files in a parent folder:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

And to get the HFS paths or AppleScript aliases of all pdf, txt, and jpg files in a parent folder:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o -iname '*.txt' -o -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as text' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 \\( -iname '*.pdf' -o -iname '*.txt' -o -iname '*.jpg' \\) -exec echo '\"'{}'\" as POSIX file as alias' \\; | tr '\\n' ',' | sed -E 's/,$// ; s/(.+)/\\1/')}\"")

Hi bmose.

‘run script’ isn’t “cheating”, of course. :wink:

I find these to be faster and slightly more thorough:

set parent_folder_hfs_path to (path to downloads folder)

set hfsPaths to run script (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as text,¬/; 1 s/^/{/; $ s/,¬$/}/'")

--or--

set hfsPaths to run script (do shell script "find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as alias,¬/; 1 s/^/{/; $ s/,¬$/}/'")

‘find’ simply returns the relevant files’ POSIX paths. The ‘sed’ code double-escapes any quotes or backslashes in them, enquotes them and adds the AppleScript code, inserts an opening brace at the beginning of the first line, and edits a closing brace onto the end of the last.

Nice tweaks! It’s more streamlined. I made one slight adjustment: I pulled your curly braces out of the sed command and put them in a wrapping echo command so that in the case where no matching files are found, an empty list rather than no result is returned. (Also, I used applescriptAliases for the second statement’s variable name :).)


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as text,¬/; $s/,¬$//;')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\\\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as alias,¬/; $s/,¬$//;')}\"")

Ah. Right! I hadn’t realised sed wouldn’t be triggered in such cases.

Oops! :rolleyes:

I thought I might submit this to Code Exchange, given that there is pretty much nothing out there about using the shell to get HFS paths and AppleScript aliases. One question: Is it really necessary to “doubly” escape the double-quote character in the first sed command? This seems to work just as well:


set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\|\"/\\\\&/g; s/^.*$/\"&\" as POSIX file as text,¬/; $s/,¬$//;')}\"")

Under your ‘echo’ scheme, the ‘sed’ code’s a string embedded in an ‘echo’ string in a shell script represented by an AppleScript string. The ‘sed’ code edits text returned by ‘find’ which may contain quote or backslash characters. These characters have to be doubly escaped in the AppleScript text to be received correctly by ‘sed’, which then has to add enough escapage to any matches so that, after everything’s gone through ‘echo’, there’s enough escapage left to doubly escape the characters in the path string(s) represented within the AppleScript text returned by the shell script. Simple really. :wink:

set parent_folder_hfs_path to (path to downloads folder)

set hfsPaths to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\/\\\\\\\\\\\\\\\\/g; s/\\\"/\\\\\\\\\"/g; s/^.*$/\"&\" as POSIX file as text,¬/; $ s/,¬$//')}\"")

--or--

set applescriptAliases to run script (do shell script "echo \"{$(find " & parent_folder_hfs_path's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 -iname '*.pdf' | sed -E 's/\\\\/\\\\\\\\\\\\\\\\/g; s/\\\"/\\\\\\\\\"/g; s/^.*$/\"&\" as POSIX file as alias,¬/; $ s/,¬$//')}\"")

Edit: Explanation rewritten.

Or try using the ‘satimage.osax’ with ‘list files’:
http://www.satimage.fr/software/en/dictionaries/dict_satimage.html#SatimageFileAdditions.listfiles

Another good one! An equivalent script using Satimage would look like this:

set extensionList to {"pdf", "zip", "dmg", "jpg"}
set sourceFolder to (path to downloads folder)

list files sourceFolder of extension extensionList without recursively

Here’s an extension of bmose’s idea which allows — er — extensions to be passed:

set extensionList to {"pdf", "zip", "dmg", "jpg"}
set sourceFolder to (path to downloads folder)

listFilesWithGivenExtensions(sourceFolder, extensionList)

on listFilesWithGivenExtensions(sourceFolder, extensionList)
	if (extensionList is {}) then
		set filterBlock to "-not -name '.*'"
	else
		set astid to AppleScript's text item delimiters
		set AppleScript's text item delimiters to "' -o -iname '*."
		set filterBlock to "\\( -iname '*." & extensionList & "' \\)"
		set AppleScript's text item delimiters to astid
	end if
	
	return (run script (do shell script "echo \"{$(find " & sourceFolder's POSIX path's quoted form & " -mindepth 1 -maxdepth 1 " & filterBlock & " | LC_ALL='en_GB' sed -E 's/\\\\/\\\\\\\\\\\\\\\\/g; s/\\\"/\\\\\\\\\"/g; s/^.*$/\"&\" as POSIX file as alias,¬/; $ s/,¬$//')}\""))
end listFilesWithGivenExtensions

I submitted to Code Exchange the run script/do shell script solution along with a discussion of other methods of batch-retrieving HFS paths and AppleScript aliases. I tried to beat into partial submission the backslashes in the run script/do shell script expression with a more liberal use of the & special character :slight_smile:

Hey Nigel, can the AppleScript objects also be referenced as, say:

set aList to {path to desktop, "aardvark", 17, {1, 2, 3}, 4, {{10,11,12}, {20,30,40}, {300, 400, 500}}, "hello", {a:"apple", b:"banana"}, 5.0, 7, "world", {a:"orange", b:"grapefruit"}, {c:{csub1:"strawberry", csub2:"raspberry"}, d:{dsub1:"tomato", dsub2:"potato"}}}

aList's second record --->? {a:"orange", b:"grapefruit"}
aList's third record ---->? {c:{csub1:"strawberry", csub2:"raspberry"}, d:{dsub1:"tomato", dsub2:"potato"}}

aList's second record of (aList's third record) ---->? d:{dsub1:"tomato", dsub2:"potato"}
aList's second list ---->? {{10,11,12}, {20,30,40}, {300, 400, 500}}
aList's third list of (aList's second list) ----->?  {300, 400, 500}

I guess for that last question, if that is true, then it’s probably going to help clear up the “List of List” or “Lists of Lists” that I find in some dictionaries. That has never been clear to me.

thanks

Hi technomorph.

Your first, second, and fourth fetch lines are correct.

The fifth should just be:

third list of (aList's second list) -- With or without the parentheses.

… or …

aList's second list's third list

… or …

list 3 of list 2 of aList

… or whatever mixture of styles you prefer.

You can’t get an indexed item of a record because the point of a record is that the values in it are labelled rather than being in a particular order. So your third fetch line would have to be something like this:

d of (aList's third record) ----> {dsub1:"tomato", dsub2:"potato"}

“List of list” simply means a list containing lists, the implication being that it either only contains lists or is empty. What any lists in it might contain isn’t specified.
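
As a small sketch of both points, here are a record accessed by label and a list of lists accessed by class and index. The variable names aRecord and listOfLists are invented for the example; the values are taken from the list in the question.

```applescript
set aRecord to {c:{csub1:"strawberry", csub2:"raspberry"}, d:{dsub1:"tomato", dsub2:"potato"}}
set listOfLists to {{10, 11, 12}, {20, 30, 40}, {300, 400, 500}}

-- Values in a record are fetched by label, one level at a time.
d of aRecord --> {dsub1:"tomato", dsub2:"potato"}
dsub1 of d of aRecord --> "tomato"

-- Lists in a list can be fetched by class and index.
list 3 of listOfLists --> {300, 400, 500}
item 2 of list 3 of listOfLists --> 400
```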

item 1 = {INFO:{BITRATE:"128000", GENRE:"Alternative", COMMENT:"missing 20180401", RATING:"Search In Playlists", PLAYTIME:"353", IMPORT_DATE:"2018/4/29", FLAGS:"10", FILESIZE:"5582", |color|:"1"}, PRIMARYKEY:{|key|:"Tekno/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:All My Friends (Franz Ferdinand Version).m4p"}, LOCATION:{DIR:"/:Users/:kerry/:Music/:iTunes/:iTunes Media/:Music/:LCD Soundsystem/:All My Friends - EP/:", |file|:"All My Friends (Franz Ferdinand Version).m4p", VOLUME:"Tekno", VOLUMEID:"Tekno"}, MAININFO:{MODIFIED_DATE:"2018/5/9", MODIFIED_TIME:"47417", AUDIO_ID:"AQoAABEBERERIhEQEiERERFERlVCIjVnQxRWZCIiNVZ1QhETZlUiRVIRNENoQUVVImdlQgJWZiRTEQAWd4UVZ2QQJnZhRUIRERIiEREiEREQABJmRTMhERVYUhFFVUERJERERCJHVFVSE0VCEUZkiDFFZRJmZVITVWUTQyIRFXd1E1Z2MRWGcyVEIiM0M0MyIUdlQQFGZSFGMzEAATITZldBJWVlM0dVd2dnQyIREyIzIiMiISERERA2ZUEmZlUgNWZhRUERAEd4ckZmYxFnZyJFQ0RmdUIRIiIhBHYlVVQzIhEjMjIjMzIiMzIhERERFDERElQxERNDMxACMyERAA==", title:"Test Title 02", ARTIST:"Test Artist 02"}, ENTRYNO:1}

which breaks down to
item 1 having these records:
ENTRYNO: record of 1 item
INFO: record of 9 items
-----BITRATE
-----GENRE
-----COMMENT
-----RATING
-----PLAYTIME
-----IMPORT_DATE
-----FLAGS
-----FILE_SIZE
-----COLOR
PRIMARYKEY : record of … etc
-----subrecord1
-----subrecord2
LOCATION : …etc
-----subrecord3
-----subrecord4
-----subrecord5
…etc

All of the subrecords have different user-defined labels.

  1. Am I able to access the subrecords directly IE:
    from previous example:
    1a)
    d of aList
    1b)
    dsub of aList
    1c)
    dsub of (aList's third record)
    1d) or do I have to drill right down into the subrecords like:
    dsub of d of (aList's third record)

  2. I’m guessing I’m probably not able to ask for a recordkey name that’s deep in a list.
    But in my ITEM 1 it just contains records. Can I

2a)
BITRATE of item 1

2b) or do I have to
BITRATE of INFO of item 1

  3. If I have to drill down, would I be better off setting up 3b rather than 3a?
    3a)
    BITRATE of INFO of item 1
    GENRE of INFO of item 1
    COMMENT of INFO of item 1
    RATING of INFO of item 1
    FLAGS of INFO of item 1
    3b)
    set ITEMINFO to INFO of item 1
    BITRATE of ITEMINFO
    GENRE of ITEMINFO
    COMMENT of ITEMINFO
    RATING of ITEMINFO
    FLAGS of ITEMINFO

  4. I’m guessing there are probably other functions/libraries out there that will “filter” an
    array for me by “keys”.

I don’t want all of the INFO from item 1, just certain subrecords,
selected by providing keys, say {"BITRATE", "GENRE", "COMMENT", "RATING", "FLAGS"}.
In the key case, is it possible to use a list of lists?
{"INFO", {"BITRATE", "GENRE", "COMMENT", "RATING", "FLAGS"}, "LOCATION", {"VOLUME", "DIR", "|file|"}}

thanks