I have a folder with somewhere between 6000 and 7000 jpgs in it. They are named sequentially like this:
1-1.jpg
1-2.jpg
1-3.jpg
2-1.jpg
3-1.jpg
3-2.jpg
…
What I want to do is get all the unique initial numbers (before the dash), create folders for them and then move the appropriate files into the newly created folders.
Experience tells me this is too big a task for AS to handle alone. I’m not well versed in shell scripting, though I am pretty sure it’s the way I should go, using something like awk. Can anyone lend a hand and get me started down the right path on this?
With my specially created test folder, this hybrid vanilla/Finder/mkdir script takes a nominal 1.5 seconds to create the relevant folders and about 35 seconds to move the files. (2 GHz G5 DP, Tiger.) Assuming it’s not something you’re going to want to do every few moments, that might be adequate.
My test folder contains 6592 JPEGs with names randomly generated in the prescribed manner. 444 folders are needed to match the names.
Caveats:
When the folder containing the thousands of files is first selected using ‘choose folder’, the system starts updating the ‘choose’ information for that folder, which can increase the running time of the rest of the script to between 80 and 90 seconds.
The single loop in the moveFiles() handler relies on the observed phenomenon that the Finder returns the names of the items in folders in lexical order. I haven’t checked recently, but going by past behaviour, I believe this is more likely to happen if the folders are closed at the time.
on getFolderNames(fileNameList)
script o
property fileNames : fileNameList
property folderNames : {}
end script
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "-" as Unicode text
set folderNamesLocal to o's folderNames -- for the 'is in' commands.
considering case
repeat with i from 1 to (count o's fileNames)
set thisPrefix to text item 1 of item i of o's fileNames
if (thisPrefix is in folderNamesLocal) then
else
set end of o's folderNames to thisPrefix
end if
end repeat
end considering
set AppleScript's text item delimiters to "," as Unicode text
set nameRun to folderNamesLocal as Unicode text
set AppleScript's text item delimiters to astid
return nameRun
end getFolderNames
on moveFiles(fileNameList, fileContainer, folderContainer)
script o
property fileNames : fileNameList
property folderNames : missing value
property part2 : missing value
end script
tell application "Finder" to set o's folderNames to name of folderContainer's folders
set part1 to "tell application \"Finder\" to move {file \"" as Unicode text
set part3 to "\"} of folder \"" & fileContainer & "\" to folder \"" as Unicode text
set part5 to "\" of folder \"" & folderContainer & "\"" as Unicode text
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "\", file \"" as Unicode text
set fileNameCount to (count fileNameList)
set i to 1
set j to 1
considering case
set o's part2 to {}
set part4 to item i of o's folderNames
set thisPrefix to part4 & "-"
repeat until (j > fileNameCount)
set thisFileName to item j of o's fileNames
if (thisFileName begins with thisPrefix) then
set end of o's part2 to thisFileName
set j to j + 1
else if (j > fileNameCount) then
else
run script (part1 & o's part2 & part3 & part4 & part5)
set o's part2 to {}
set i to i + 1
set part4 to item i of o's folderNames
set thisPrefix to part4 & "-"
end if
end repeat
run script (part1 & o's part2 & part3 & part4 & part5)
end considering
set AppleScript's text item delimiters to astid
end moveFiles
on main()
set fileContainer to (choose folder with prompt "Choose the folder containing the JPEGs.")
set folderContainer to (choose folder with prompt "Choose the folder where the new folders are to be created.")
tell application "Finder" to set fileNames to name of fileContainer's files -- *Much* faster than 'list folder without invisibles' with 6592 files!
set mkdirStr to ("mkdir -p " as Unicode text) & quoted form of POSIX path of folderContainer & "{" & getFolderNames(fileNames) & "}"
do shell script mkdirStr
moveFiles(fileNames, fileContainer, folderContainer)
end main
main()
Here’s another, faster (27 seconds overall) and possibly more dependable version that uses a shell script to move the files. It still uses the Finder to get the initial list of file names, as this seems to be faster than both the ‘list folder without invisibles’ and shell script alternatives, at least with that many files. The same caveat applies about the system taking time to “update”, but the dependency on lexical order is no longer an issue.
on getFolderNames(fileNameList)
script o
property fileNames : fileNameList
property folderNames : {}
end script
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "-" as Unicode text
set folderNamesLocal to o's folderNames -- for the 'is in' commands.
considering case
repeat with i from 1 to (count o's fileNames)
set thisPrefix to text item 1 of item i of o's fileNames
if (thisPrefix is in folderNamesLocal) then
else
set end of o's folderNames to thisPrefix
end if
end repeat
end considering
set AppleScript's text item delimiters to astid
return o's folderNames
end getFolderNames
on moveFiles(folderNameList, fileContainer, folderContainer)
script o
property folderNames : folderNameList
property shellStringParts : {}
end script
set fileContainerPath to quoted form of POSIX path of fileContainer
set folderContainerPath to quoted form of POSIX path of folderContainer
set thisPrefix to beginning of o's folderNames
set beginning of o's shellStringParts to ("mv " as Unicode text) & fileContainerPath & "/" & thisPrefix & "-*.jpg " & folderContainerPath & "/" & thisPrefix
repeat with i from 2 to (count o's folderNames)
set thisPrefix to item i of o's folderNames
set end of o's shellStringParts to thisPrefix & "-*.jpg " & folderContainerPath & "/" & thisPrefix
end repeat
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to (" ; mv " as Unicode text) & fileContainerPath & "/"
set mvStr to o's shellStringParts as Unicode text
set AppleScript's text item delimiters to astid
do shell script mvStr
end moveFiles
on main()
set fileContainer to (choose folder with prompt "Choose the folder containing the JPEGs.")
set folderContainer to (choose folder with prompt "Choose the folder where the new folders are to be created.")
tell application "Finder" to set fileNames to name of fileContainer's files
set folderNames to getFolderNames(fileNames)
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "," as Unicode text
set mkdirStr to ("mkdir -p " as Unicode text) & quoted form of POSIX path of folderContainer & "{" & folderNames & "}"
set AppleScript's text item delimiters to astid
do shell script mkdirStr
moveFiles(folderNames, fileContainer, folderContainer)
end main
main()
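For anyone who'd rather see the idea without the AppleScript string-building, here's a minimal shell sketch of the same one-"mv"-per-prefix approach. It isn't a translation of the script above: the function name move_by_prefix and the src/dest arguments are made up for illustration, and it assumes every JPEG name follows the "number-number.jpg" pattern.

```shell
#!/bin/sh
# Sketch only: move every "<prefix>-*.jpg" from src into dest/<prefix>.
# move_by_prefix, src and dest are illustrative names, not from the scripts above.
# Assumes every .jpg name contains a dash after its leading number.
move_by_prefix() {
    src=$1
    dest=$2
    # Unique prefixes: the text before the first dash of each .jpg file name.
    # ls -p marks folders with a trailing "/", so the grep skips them.
    ls -p "$src" | grep '\.jpg$' | cut -d '-' -f 1 | sort -u |
    while IFS= read -r prefix; do
        mkdir -p "$dest/$prefix"
        # One mv per prefix; the shell expands the wildcard.
        # The dash after the prefix keeps "1" from matching "10-1.jpg".
        mv "$src/$prefix"-*.jpg "$dest/$prefix/"
    done
}

# Example call (paths are placeholders):
# move_by_prefix "/path/to/jpegs" "/path/to/sorted"
```

Because the wildcard includes the dash, the prefix "1" doesn't pick up files that belong to "10" or "11".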
Out of interest, here are my timings for the three ways to get a list of the 6592 file names in my test folder, timed after the system had updated:
-- fileContainer is an alias to a folder.
tell application "Finder" to set fileNames to name of fileContainer's files
--> 0.366 seconds
on getParagraphs(txt)
set paras to {}
set paraCount to (count txt each paragraph)
set i to 1
repeat until (i > paraCount)
set j to i + 3900
if (j > paraCount) then set j to paraCount
set paras to paras & paragraphs i thru j of txt
set i to j + 1
end repeat
return paras
end getParagraphs
set fileNames to getParagraphs(do shell script "ls -p " & quoted form of POSIX path of fileContainer & " | grep -v '/'") -- Files differentiated from folders.
--> 1.073 seconds
set fileNames to (list folder fileContainer without invisibles) -- Files not differentiated from folders.
--> 47.785 seconds!
-- However:
set fileNames to (list folder fileContainer) -- Files not differentiated from folders.
--> 0.147 seconds
choose folder with prompt "Sort JPG files in this folder:"
set sourceFolder to POSIX path of result
do shell script "cd " & quoted form of sourceFolder & "; /bin/ls *.jpg -p | /usr/bin/grep -v '/$' | /usr/bin/cut -d '-' -f 1 | /usr/bin/sort -u | /usr/bin/xargs /bin/mkdir -v"
set folderNames to paragraphs of result
Wow, Bruce! RIRedinPA said he wanted a shell approach and that certainly is one!
With my computer settled down after updating the test folder, your shell script’s just a tad slower than the equivalent section of my Finder/vanilla/shell script. (1.72 seconds, compared with 1.51 seconds). However, it’s not nearly as susceptible to the delaying effects of the update taking place while the script’s running, so in practice, it could be up to three quarters of a minute faster!
I think I understand generally how it works. “cd” sets the “current directory” to the source folder; “ls” lists all the files in it whose names end with “.jpg” and appends a slash to any folder names; “grep” excludes any names that end with a slash; “cut” reduces each surviving name to its first text item (delimiter “-”); “sort” with “-u” sorts the results and omits duplicates; and “xargs” passes the remaining fragments to “mkdir”, which makes new folders in the “current directory”.
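To check that reading, the stages can be tried on a throwaway folder. This is only a sketch: make_prefix_folders is a made-up name, I've left out the "*.jpg" wildcard, and I've used "mkdir -p" instead of "-v" so that a second run doesn't complain about folders that already exist.

```shell
#!/bin/sh
# Sketch only: the folder-making pipeline, stage by stage.
# make_prefix_folders is an illustrative name.
make_prefix_folders() (
    # The function body is a subshell, so this cd does not affect the caller.
    cd "$1" || exit 1
    # ls -p    : one name per line, "/" appended to folders
    # grep -v  : drop the names that end in "/"
    # cut      : keep the text before the first "-"
    # sort -u  : sort the prefixes and drop duplicates
    # xargs    : hand the surviving prefixes to mkdir as arguments
    ls -p | grep -v '/$' | cut -d '-' -f 1 | sort -u | xargs mkdir -p
)

demo=$(mktemp -d)
touch "$demo/1-1.jpg" "$demo/1-2.jpg" "$demo/2-1.jpg" "$demo/3-1.jpg"
make_prefix_folders "$demo"
ls -p "$demo" | grep '/$'    # lists the new folders: 1/ 2/ 3/
rm -rf "$demo"
```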
Is there any way, within the same single call, to specify a different destination from the source folder?
The “ls” part of the string has a “*.jpg” wildcard and is followed immediately by a “grep” that excludes names marked with trailing slashes. Presumably that’s to exclude any folders in the source folder that might have “.jpg” extensions. I’ve tried planting such a folder in the source folder this morning. With the “*.jpg” wildcard, this folder is listed, after an empty line, at the end of the other matching names and has a colon tagged onto it, not a slash. Without the wildcard, it’s listed alphabetically with the other names and has a slash, not a colon. I got round this by omitting the wildcard and inserting another “grep” to catch the “.jpg”s:
Is there a neater way of doing this, or am I missing a point somewhere?
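One possibility, offered as a sketch and not as the definitive answer: when a wildcard matches a folder, "ls" lists that folder's contents under a "name:" header, which would explain the colon. The "-d" option tells "ls" to list the folder itself as an ordinary entry, so "-p" can mark it with a slash again and the "grep -v" works as intended. The function name list_jpg_files is made up for illustration.

```shell
#!/bin/sh
# Sketch only: list the *.jpg files in a folder, skipping any folder that
# happens to have a ".jpg" name. list_jpg_files is an illustrative name.
list_jpg_files() (
    cd "$1" || exit 1
    # -d : list a matching folder as an entry instead of listing its contents
    # -p : append "/" to folder names, which grep -v then filters out
    ls -dp *.jpg | grep -v '/$'
)

demo=$(mktemp -d)
touch "$demo/1-1.jpg" "$demo/2-1.jpg"
mkdir "$demo/9-1.jpg"                 # a folder with a misleading name
touch "$demo/9-1.jpg/inner.txt"
list_jpg_files "$demo"                # prints 1-1.jpg and 2-1.jpg only
rm -rf "$demo"
```

With several thousand files the expanded wildcard makes for a long argument list, so piping a plain "ls -p" through "grep" may still be the safer choice.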
choose folder with prompt "Sort JPG files in this folder:"
set sourceFolder to POSIX path of result
(* do shell script
---------------
1. Change current directory to sourceFolder
2. List contents of sourceFolder
3. Filter out everything besides JPG files
4. Cut up the filenames to get the leading number
5. Sort those numbers into a unique list
6. Make directories for each of those numbers
*)
do shell script "cd " & quoted form of sourceFolder & ¬
"; /bin/ls -p | " & ¬
"/usr/bin/grep '.jpg$' | " & ¬
"/usr/bin/cut -d '-' -f 1 | " & ¬
"/usr/bin/sort -u | " & ¬
"/usr/bin/xargs /bin/mkdir -v"
set folderNames to paragraphs of result
OK. Incorporating Bruce’s idea reduces my second script to this:
on moveFiles(folderNameList, fileContainerPath, folderContainerPath)
script o
property folderNames : folderNameList
property shellStringParts : {}
end script
set thisPrefix to beginning of o's folderNames
set beginning of o's shellStringParts to ("mv " as Unicode text) & fileContainerPath & "/" & thisPrefix & "-*.jpg " & folderContainerPath & "/" & thisPrefix
repeat with i from 2 to (count o's folderNames)
set thisPrefix to item i of o's folderNames
set end of o's shellStringParts to thisPrefix & "-*.jpg " & folderContainerPath & "/" & thisPrefix
end repeat
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to (" ; mv " as Unicode text) & fileContainerPath & "/"
set mvStr to o's shellStringParts as Unicode text
set AppleScript's text item delimiters to astid
do shell script mvStr
end moveFiles
on main()
set fileContainerPath to quoted form of POSIX path of (choose folder with prompt "Choose the folder containing the JPEGs.")
set folderContainerPath to quoted form of POSIX path of (choose folder with prompt "Choose the folder where the new folders are to be created.")
do shell script ("ls -p " & fileContainerPath & " | grep '.jpg$' | cut -d '-' -f 1 | sort -u")
set folderNames to paragraphs of result
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "," as Unicode text
set mkdirStr to ("mkdir -p " as Unicode text) & folderContainerPath & "{" & folderNames & "}"
set AppleScript's text item delimiters to astid
do shell script mkdirStr
moveFiles(folderNames, fileContainerPath, folderContainerPath)
end main
main()
But if the idea is to create the new folders in the original source folder, as per Bruce’s code (which doesn’t sound like a bad idea), the main() handler above could be changed to this:
on main()
set fileContainerPath to quoted form of POSIX path of (choose folder with prompt "Choose the folder containing the JPEGs.")
set folderContainerPath to fileContainerPath
do shell script ("cd " & fileContainerPath & " ; ls -p | grep '.jpg$' | cut -d '-' -f 1 | sort -u | xargs mkdir -pv")
set folderNames to paragraphs of result
moveFiles(folderNames, fileContainerPath, folderContainerPath)
end main
I’ve omitted the paths to the various shell commands for clarity. They don’t appear to make any difference. Is there any particular advantage to using them?
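As far as I understand it, the usual argument for full paths is that a bare name is looked up through whatever PATH the shell happens to have, and "do shell script" doesn't necessarily get the same PATH as a Terminal window. A quick way to see what each bare name resolves to is something like the sketch below (show_tool_paths is a made-up name); running it both in Terminal and via "do shell script" would show whether the two agree.

```shell
#!/bin/sh
# Sketch only: report which executable each bare command name resolves to
# under the current PATH. show_tool_paths is an illustrative name.
show_tool_paths() {
    for tool in ls grep cut sort xargs mkdir; do
        printf '%s -> %s\n' "$tool" "$(command -v "$tool")"
    done
}

show_tool_paths
```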
choose folder with prompt "Sort JPG files found in this folder:"
set sourceFolder to POSIX path of result
choose folder with prompt "Put sorted files in folder:"
set outputFolder to POSIX path of result
-- Escape double quotes for ruby
set ASTID to AppleScript's text item delimiters
set AppleScript's text item delimiters to "\""
set outputFolder to text items of outputFolder
set AppleScript's text item delimiters to "\\\""
set outputFolder to "" & outputFolder
set AppleScript's text item delimiters to ASTID
do shell script "cd " & quoted form of sourceFolder & ¬
"; /bin/ls -p | " & ¬
"/usr/bin/grep '.jpg$' | " & ¬
"/usr/bin/cut -d '-' -f 1 | " & ¬
"/usr/bin/sort -u | " & ¬
"/usr/bin/ruby -e '$stdin.readlines.each {|line| print \"" & outputFolder & "\"+line[0..-2].concat(0) }' | " & ¬
"/usr/bin/xargs -0 /bin/mkdir -v"
set folderNames to paragraphs of result
For the ruby code:
outputFolder is added to the front of the line
The newline is removed from the end of the line
A null byte is added to the end of the line
The line is printed as is (no newline at the end, as compared to puts)
The last two items allow you to use the -0 option with xargs. This will save you from having to escape spaces in outputFolder. (Seeing that outputFolder is defined before the shell script, you could just escape spaces after escaping double quotes instead of using null bytes [I didn’t think of that until I was finished].)
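The same four steps can also be sketched in plain shell, for anyone who'd rather not involve ruby. This is an illustration under assumptions, not a drop-in replacement for the line above: make_folders_in is a made-up name, and it relies on an xargs that supports -0.

```shell
#!/bin/sh
# Sketch only: prefix each incoming name with an output folder and pass the
# results to mkdir as NUL-terminated arguments, so spaces in the output path
# need no escaping. make_folders_in and out are illustrative names.
make_folders_in() {
    out=$1
    # stdin: one folder name per line (for example, the output of sort -u)
    while IFS= read -r name; do
        # read strips the newline; printf adds the path and a NUL terminator
        printf '%s/%s\0' "$out" "$name"
    done | xargs -0 mkdir -p
}

demo=$(mktemp -d)
printf '1\n2\n3\n' | make_folders_in "$demo/Sorted JPEGs"
ls "$demo/Sorted JPEGs"      # shows folders 1, 2 and 3
rm -rf "$demo"
```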
Hi, Bruce. Sorry I’ve been so long getting back about this.
The shell script with the ruby code doesn’t do anything on my machine, apart from returning an empty Unicode text. It doesn’t create any new folders. (None that I can find, anyway!) I’m not at all familiar with ruby, so I can’t tell why not.
Am I right in thinking that the idea’s to produce a text that looks something like this?
It seems to me that we could have a shorter text, dispense with xargs, and speed up the folder creation (by about 0.004 seconds!) by building a single line in the form:
... and feed that directly into mkdir, as in the early versions of my script. Is that easy?
Ah yes. I should have seen that myself. (In fact I did see it, but it didn’t register. :rolleyes: ) The code’s working brilliantly now.
Messing and guessing around with the ruby part of it this morning, I could get it to output the path format I was asking about, but this wasn’t interpreted as I wanted in the shell script’s internal flow. I now think that the system would normally expand the compressed format to the format produced by your script anyway, so it’s probably pointless to pursue that line any further.
The list of paragraphs returned this time contains full POSIX paths, not just folder names, which is easily accommodated with a few changes in the moveFiles() handler.
Thanks for your suggestions. I think the overall speed improvement over my original script isn’t very much, because most of the script’s running time is taken up with the actual physical process of moving the files. But it’s been an interesting exercise for me personally.
on moveFiles(fileContainerPath, folderPathList)
script o
property folderPaths : folderPathList
property shellStringParts : {}
end script
set wildcard to "-*.jpg " as Unicode text
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "/" as Unicode text
tell beginning of o's folderPaths to set beginning of o's shellStringParts to ("mv " as Unicode text) & fileContainerPath & text item -1 & wildcard & quoted form
repeat with i from 2 to (count folderPathList)
tell item i of o's folderPaths to set end of o's shellStringParts to text item -1 & wildcard & quoted form
end repeat
set AppleScript's text item delimiters to (" ; mv " as Unicode text) & fileContainerPath
set mvStr to o's shellStringParts as Unicode text
set AppleScript's text item delimiters to astid
do shell script mvStr
end moveFiles
on main()
set fileContainerPath to quoted form of POSIX path of (choose folder with prompt "Choose the folder containing the JPEGs.")
set folderContainerPath to POSIX path of (choose folder with prompt "Choose the folder where the new folders are to be created.") -- Not quoted.
-- Escape double quotes for ruby
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "\""
set folderContainerPath to text items of folderContainerPath
set AppleScript's text item delimiters to "\\\""
set folderContainerPath to folderContainerPath as Unicode text
set AppleScript's text item delimiters to astid
do shell script ("ls -p " & fileContainerPath & " | grep '.jpg$' | cut -d '-' -f 1 | sort -u | ruby -e '$stdin.readlines.each {|line| print \"" & folderContainerPath & "\"+line[0..-2].concat(0) }' | xargs -0 mkdir -v")
set folderPaths to paragraphs of result
moveFiles(fileContainerPath, folderPaths)
end main
main()
Piping this into mkdir causes a usage error, but pasting the same line into Terminal manually works. (That last comma causes a problem, but that can be avoided by using the -p option for mkdir.)