Hello:
I need to write a script that sorts PDFs dropped into a single trigger (or “activate”) folder into separate folders based on a filename suffix. That part I’ve been able to piece together and it works just fine, but I’d like to add a wrinkle.
Each of my PDFs might be named something like “005_blue_jan08.pdf”, with the 005 signifying page 5.
If someone drops a new PDF named “005_red_jan08.pdf” I would like it to replace the existing page 5 pdf, even if the middle of the older PDF was already named blue or red.
So I’d like the Finder to use the jan08 or feb08 or texas08 suffix to pick the final folder, and also look inside that folder to see if there’s already an existing page 5. If there is, my current script will replace that file only if the name is an exact match. I’d like it to replace the file based on the suffix and the page prefix, regardless of whether the middle of the name matches.
If that’s too complicated, I’d like it to at least determine there is already a page 5 in the folder and move the newer PDF to a WARNING folder and let an employee do the replacing by hand.
With what we’re doing the last example would be better than having two page 5s occupy the final folder, so it’s kind of like a bad sci fi time travel principle where no two page 5s can occupy the same space at the same time.
Any points in the right direction would be appreciated.
Here’s what I have so far:
property jan08 : "laptop2:Users:laptop2:Desktop:sort test:jan08" as alias
property feb08 : "laptop2:Users:laptop2:Desktop:sort test:feb08" as alias
property texas08 : "laptop2:Users:laptop2:Desktop:sort test:texas08" as alias
property badnames : "laptop2:Users:laptop2:Desktop:sort test:badnames" as alias
property trigger : "laptop2:Users:laptop2:Desktop:sort test:trigger" as alias
on idle
	tell application "Finder"
		try
			set theFiles to files of entire contents of trigger
		on error tError
			display dialog tError giving up after 10
			return 30
		end try
		repeat with indvFile in theFiles
			set fileName to name of indvFile as text
			if fileName contains "_jan08.pdf" then
				move indvFile to jan08 with replacing
			else if fileName contains "_feb08.pdf" then
				move indvFile to feb08 with replacing
			else if fileName contains "_texas08.pdf" then
				move indvFile to texas08 with replacing
			else
				move indvFile to badnames with replacing
			end if
		end repeat
	end tell
	return 30
end idle
property basicFolder : ""
property badnames : ""
property trigger : ""
on run
	set basicFolder to ((path to desktop as Unicode text) & "sort test:")
	set badnames to (basicFolder & "badnames:")
	set trigger to (basicFolder & "trigger:")
end run
on idle
	try
		tell application "Finder" to set theFiles to files of entire contents of folder trigger
	on error tError
		display dialog tError giving up after 10
		return 30
	end try
	repeat with indvFile in theFiles
		set {name:fileName, name extension:Ex} to (info for indvFile as alias)
		if Ex is missing value then set Ex to ""
		if Ex is not "" then
			set fileName to text 1 thru ((count fileName) - (count Ex) - 1) of fileName
			set Ex to "." & Ex
		end if
		set {TID, text item delimiters} to {text item delimiters, "_"}
		try
			set {Num, col, suf} to text items of fileName
			set text item delimiters to TID
			tell application "Finder"
				set currentfiles to name of files of folder (basicFolder & suf)
				move indvFile to folder (basicFolder & suf) with replacing
			end tell
			if (fileName & Ex) is not in currentfiles then
				repeat with i in currentfiles
					if contents of i starts with Num and contents of i ends with (suf & Ex) then
						do shell script "rm " & quoted form of ((POSIX path of (basicFolder & suf & ":")) & contents of i) -- this deletes the matching page with the other color!
					end if
				end repeat
			end if
		on error
			set text item delimiters to TID
			tell application "Finder" to move indvFile to folder badnames with replacing
		end try
	end repeat
	return 30
end idle
Hi Curtis,
Stefan has beaten me to the post (literally) and I’m sure his code is more efficient, but I thought I might as well post mine!
Hopefully I’ve understood what you want!
property jan08 : (path to desktop as string) & "sort test:jan08" as alias
property feb08 : (path to desktop as string) & "sort test:feb08" as alias
property texas08 : (path to desktop as string) & "sort test:texas08" as alias
property badnames : (path to desktop as string) & "sort test:badnames" as alias
property trigger : (path to desktop as string) & "sort test:trigger" as alias
tell application "Finder"
	set theFiles to every file of trigger
	repeat with this_file in theFiles
		set fileName to the name of this_file
		set page_no to characters 1 thru 3 of fileName
		if fileName contains "_jan08.pdf" then
			set thepath to jan08
			my page_search(thepath, fileName, page_no)
			move this_file to jan08 with replacing
		else if fileName contains "_feb08.pdf" then
			set thepath to feb08
			my page_search(thepath, fileName, page_no)
			move this_file to feb08 with replacing
		else if fileName contains "_texas08.pdf" then
			set thepath to texas08
			my page_search(thepath, fileName, page_no)
			move this_file to texas08 with replacing
		else
			move this_file to badnames with replacing
		end if
	end repeat
end tell
on page_search(thepath, fileName, page_no)
	tell application "Finder"
		set folder_list to every item of thepath
		repeat with current_file in folder_list
			set current_file_name to the name of current_file
			set current_page_no to characters 1 thru 3 of current_file_name
			if current_page_no is equal to page_no then
				delete current_file
			end if
		end repeat
	end tell
end page_search
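If deleting the old page automatically feels too risky, the fallback from the original question (park the newcomer in a WARNING folder and let someone replace it by hand) could be sketched roughly like this. The WARNING folder location and the handler names page_exists and file_or_warn are my own assumptions, and it uses the same "first three characters are the page number" convention as page_search above, so names shorter than three characters would raise an error.

```applescript
-- Assumption: a "WARNING" folder exists inside "sort test" on the desktop.
property warningFolder : (path to desktop as string) & "sort test:WARNING:"

-- Returns true if destFolder already holds a file whose name starts with
-- the same three characters (the page prefix) as fileName.
on page_exists(destFolder, fileName)
	set page_no to text 1 thru 3 of fileName
	tell application "Finder"
		set matches to (every file of folder (destFolder as text) whose name starts with page_no)
	end tell
	return (count of matches) > 0
end page_exists

-- Moves this_file into destFolder only when its page slot is free;
-- otherwise moves it to the WARNING folder for manual review.
on file_or_warn(this_file, destFolder)
	tell application "Finder" to set fileName to name of this_file
	if page_exists(destFolder, fileName) then
		tell application "Finder" to move this_file to folder warningFolder with replacing
	else
		tell application "Finder" to move this_file to destFolder
	end if
end file_or_warn
```

Inside the repeat loop you would then call something like my file_or_warn(this_file, jan08) in place of the page_search and move pair.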
Thank you very much for the responses.
I’m probably on too little sleep, so most of this went over my head.
I couldn’t get StefanK’s to run, but I did get Blend3’s to do so.
However I can’t figure where I can reinsert the little looping part to get the script to keep trying to kick on every 30 seconds or so.
However, on run, the script did replace several PDFs that had different middles in their names.
I assume I would throw the loop in front of and after the “on page_search” and “end page_search” commands.
Also, how about this. I really need one more step and that is when files are placed into an earlier trigger they need to copy to a remotely mounted hard drive and then go into the trigger above for sorting in our office. All the PDFs on the remote hard drive go to one huge folder with no sorting.
So, if I wanted to copy and then do the move above, should I try to include that in one long script, or should I have one script running to do the copy from the pre-trigger to the trigger and then a second script to move from the trigger to the end folders?
Very interesting to see two methods that should achieve the same result and yet they look so different to me.
property jan08 : (path to desktop as string) & "sort test:jan08" as alias
property feb08 : (path to desktop as string) & "sort test:feb08" as alias
property texas08 : (path to desktop as string) & "sort test:texas08" as alias
property badnames : (path to desktop as string) & "sort test:badnames" as alias
property trigger : (path to desktop as string) & "sort test:trigger" as alias
property remote_drive : "YourRemoteHardDriveName" -->as alias
on idle
	tell application "Finder"
		set theFiles to every file of trigger
		repeat with this_file in theFiles
			duplicate this_file to remote_drive --> copy the file to remote drive
			set fileName to the name of this_file
			set page_no to characters 1 thru 3 of fileName
			if fileName contains "_jan08.pdf" then
				set thepath to jan08
				my page_search(thepath, fileName, page_no)
				move this_file to jan08 with replacing
			else if fileName contains "_feb08.pdf" then
				set thepath to feb08
				my page_search(thepath, fileName, page_no)
				move this_file to feb08 with replacing
			else if fileName contains "_texas08.pdf" then
				set thepath to texas08
				my page_search(thepath, fileName, page_no)
				move this_file to texas08 with replacing
			else
				move this_file to badnames with replacing
			end if
		end repeat
	end tell
end idle
on page_search(thepath, fileName, page_no)
	tell application "Finder"
		set folder_list to every item of thepath
		repeat with current_file in folder_list
			set current_file_name to the name of current_file
			set current_page_no to characters 1 thru 3 of current_file_name
			if current_page_no is equal to page_no then
				delete current_file
			end if
		end repeat
	end tell
end page_search
Well everything has to be one step closer.
If I fill in the path to my remote drive where you indicated I get nothing.
So I was wondering: is that because, where it sits in the script, it’s also trying to check for a Jan08 folder and a Feb08 folder and so on?
On the remote drive there is no folder hierarchy like that - just one jumbo folder.
Also, where would my repeat things pop in so it doesn’t just run on the push of a run button?
Hi Curtis,
Is your remote drive mounted as a volume on your desktop? If so, what is it called? Can you confirm that you’ve saved the script as an application with the Stay Open checkbox checked? The way it should work is that when you launch the app it checks the trigger folder for files. If it finds some, it first copies each file to your remote drive, then works out which folder to move it to and whether it needs to replace an existing page with the same number. It then repeats this procedure every 30 seconds.
Nik
Can’t set “Volumes/Blah/Input” to item 1 of {<> “01_cover_Feb.pdf” of <> “trigger” of <> “sort test” of <> “desktop” of <> “curtis” of <> “users” of <> of appl
Wow.
Well, I’m not at work so I can’t test the remote drive right now, but I just changed the duplicate task to a different folder on my drive and it works great. So I’m sure I can fix this tomorrow.
Can you comment whether I have the following correct.
The structure for a folder on a local drive is “laptop2:Users:curtis:Desktop:sort test:trigger” with laptop2 being the name of my drive and the trigger folder inside a sort test folder on my desktop
The structure for a locally mounted server drive is “:volumes:disk 1:sort test:trigger” with disk 1 being the name of the drive and a trigger folder inside a sort test folder.
And the structure for a remote drive is “/Volumes/Blah_Mags/Input” with Blah_Mags being the name of the drive and Input being the name of the folder on it, correct? Use the slashes in front of the Volumes and before and after the name of the drive?
Or am I making that up?
What controls the timing of the delay on yours? I keep looking for a
end tell
return 30
end idle
Many thanks, Nik. Now I want to study StefanK’s and learn the difference between the two, since they look so different. I like seeing two ways to approach the same challenge.
Hi Curtis,
Your path to the folder on your desktop is correct, but you can shorten it by building it from the path to the desktop:
(path to desktop as string) & "sort test:trigger" as alias
As far as a volume that’s mounted on your desktop you can just reference this using the volume name:
set mounted_volume to "nameofyourmountedvolume" as alias
so if your mounted drive is called disk 1 then specify the path to the folder structures of that volume like this:
property mounted_volume : "disk 1:sort test:trigger" as alias
Try this line on its own in a new Script Editor document once you have the drive connected. If the path is correct you won’t get an error. I believe this should work for any volume that you can see mounted on your desktop.
By default an on idle handler runs every 30 seconds, but if you’d like it to run more or less frequently, just return a different number of seconds right before the end idle line.
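On the slash question: the Finder wants colon-separated (HFS) paths or aliases, while paths such as /Volumes/Blah_Mags/Input are POSIX paths and have to be converted first. A minimal sketch of the conversions, assuming a volume called Blah_Mags with an Input folder is actually mounted (otherwise the alias coercion will error):

```applescript
-- Assumption: a volume named "Blah_Mags" containing a folder "Input" is mounted.
set posixPath to "/Volumes/Blah_Mags/Input"

-- POSIX (slash) path -> alias that the Finder accepts as a destination
set remoteFolder to (POSIX file posixPath) as alias

-- alias -> HFS (colon) path, beginning with the volume name
set hfsPath to remoteFolder as text

-- alias -> POSIX path again, e.g. for use with do shell script
set backToPosix to POSIX path of remoteFolder
```

The alias in remoteFolder is the form you would hand to duplicate or move.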
Wow. I’m running it right now on some dupes of back issues and it has sorted about 6000 of 10,000 total PDFs with almost no problems. So this is amazing.
I do have one tiny problem. The remote server on the other end takes the PDFs as they come in and moves them to a series of folders for off-site archiving, for redundancy. It basically has a script running that grabs each file and starts its process. I can probably contact the author and get them to lengthen the wait loop on its end, but so far the script has stopped and said “unable to copy file ‘blah blah blah’”. Looking at it, the PDF might have been so big that the script on the other end started to grab and move it while it was still copying, and the script you set up knew the copy hadn’t finished and got confused.
If that were to happen, is there a place where some code could be inserted to move the PDF into the badnames folder and keep going, rather than just sitting there with the task stopped?
Thanks a bunch. Let me know if you want some bottled sunshine from Texas sent to the UK.
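One way to keep the script going when a copy fails is to wrap the duplicate in a try block. This is only a sketch: the handler name copy_or_set_aside and its parameters are made up for illustration, and it assumes both folders are passed in as aliases, as in the properties above.

```applescript
-- Tries to copy this_file to remoteFolder. If the copy fails, the file is
-- moved to errorFolder instead and the handler returns false, so the
-- calling loop can carry on with the next file.
on copy_or_set_aside(this_file, remoteFolder, errorFolder)
	tell application "Finder"
		try
			duplicate this_file to remoteFolder
			return true
		on error errMsg
			log errMsg -- visible in Script Editor's log when testing
			try
				move this_file to errorFolder with replacing
			end try
			return false
		end try
	end tell
end copy_or_set_aside
```

In the repeat loop, the bare duplicate line would become if my copy_or_set_aside(this_file, remote_drive, badnames) then, with the existing sorting code inside that if block so that a failed file is skipped rather than sorted.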
On the remote server, there is an input folder that starts grabbing files after a 20 second wait or something like that and there is a temp input folder that has no action. If we currently have trouble uploading files on some days we send to the temp input and then manually drag them from there when they’re done into the input folder.
Would much rather go straight to the hot folder input and cut out the manual work. I might be able to slow down the loop on the other end, but could the script screen the files and if they were say 10MB or smaller go straight to the input and if they were over that size get sent to the temp input? If that’s a possibility I could find a million other uses for that part of a script.
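Screening by size looks doable. A rough sketch, assuming a 10 MB cut-off and two destination folders passed in by the caller (the handler name pick_remote_folder and the sizeLimit property are hypothetical):

```applescript
-- Assumed threshold: 10 MB expressed in bytes.
property sizeLimit : 10 * 1024 * 1024

-- Returns inputFolder for files at or below the limit,
-- tempInputFolder for anything larger.
on pick_remote_folder(theFile, inputFolder, tempInputFolder)
	set fileSize to size of (info for (theFile as alias))
	if fileSize is less than or equal to sizeLimit then
		return inputFolder
	else
		return tempInputFolder
	end if
end pick_remote_folder
```

The duplicate line could then read duplicate this_file to (my pick_remote_folder(this_file, hotInput, tempInput)), where hotInput and tempInput stand for aliases to the two remote folders.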
Hi Curtis
If I understand correctly, the problem is that the script is trying to copy files from one volume to another, and occasionally the file it’s trying to copy hasn’t yet finished being copied from some other volume?
If so, we should be able to add a handler that checks the size of the file and then checks it again a few seconds later. If the size hasn’t changed, it’s safe to copy the file. If it has changed, the handler waits a few more seconds and compares the sizes again.
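A minimal sketch of such a size check, with the handler name and the two timing parameters chosen purely for illustration:

```applescript
-- Waits until the size of a file stops changing.
-- Returns true once two consecutive readings match,
-- false if the size is still changing after maxChecks comparisons.
-- checkDelay is the number of seconds between readings.
on wait_until_stable(theFile, checkDelay, maxChecks)
	set lastSize to size of (info for (theFile as alias))
	repeat maxChecks times
		delay checkDelay
		set newSize to size of (info for (theFile as alias))
		if newSize is equal to lastSize then return true
		set lastSize to newSize
	end repeat
	return false
end wait_until_stable
```

In the idle loop you might call if my wait_until_stable(this_file, 3, 10) then and only copy when it returns true, leaving the file in the trigger folder for the next idle pass otherwise.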
Nik
Yes. Except I can’t control the software running on the other end, and that’s what is eating things. I could probably get them to lengthen the wait loop by another 10 seconds or so, and that would clear it up. I was just wondering if I could do something to this script so that it wouldn’t stop when the other machine ate a file. I’d be fine with just moving the file to some other error folder or into the badnames folder and trying it again later. I’d just hate for this script to stop because of something that went wrong and wasn’t its fault.
Hi Curtis,
Would it be possible to set up a temp directory on the volume that your other program is polling? We could copy the files to this directory and, once they’ve finished copying, move them out of the temp directory into the directory your other program is polling. Hope this makes sense?
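Roughly what I have in mind, as a sketch only: the handler name stage_then_release is made up, both folders are assumed to be aliases on the same remote volume (so the final move is a quick rename rather than a second copy), and the 10 minute timeout is a guess for large PDFs.

```applescript
-- Copies this_file into stagingFolder first, then moves the finished copy
-- into pollingFolder, so the polling program only ever sees complete files.
on stage_then_release(this_file, stagingFolder, pollingFolder)
	with timeout of 600 seconds
		tell application "Finder"
			set stagedCopy to (duplicate this_file to stagingFolder with replacing)
			move stagedCopy to pollingFolder with replacing
		end tell
	end timeout
end stage_then_release
```

The call would replace the duplicate line in the idle loop, e.g. my stage_then_release(this_file, tempInput, hotInput), where tempInput and hotInput stand for aliases to the two remote folders.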
Nik