name screening

Hello:
I need to write a script that will allow me to sort PDFs after dropping them into a single trigger or activate folder into separate folders based on a suffix. That part I’ve been able to piece together and it works just fine, but I’d like to add a wrinkle.

Each of my PDFs might be named something like “005_blue_jan08.pdf”, with the 005 signifying page 5.
If someone drops a new PDF named “005_red_jan08.pdf” I would like it to replace the existing page 5 pdf, even if the middle of the older PDF was already named blue or red.

So I’d like the Finder to be able to use the jan08 or feb08 or texas08 suffix to point to the final folder and also have it look inside the folder to see if there’s already an existing page 5. If there is, my current script will replace that file if the name is an exact match, but again, I’d like it to replace it based on the suffix and the page prefix, regardless of whether the middle of the name is an exact match.

If that’s too complicated, I’d like it to at least determine there is already a page 5 in the folder and move the newer PDF to a WARNING folder and let an employee do the replacing by hand.

With what we’re doing the last example would be better than having two page 5s occupy the final folder, so it’s kind of like a bad sci fi time travel principle where no two page 5s can occupy the same space at the same time.

Any points in the right direction would be appreciated.

Here’s what I have so far:

property jan08 : "laptop2:Users:laptop2:Desktop:sort test:jan08" as alias
property feb08 : "laptop2:Users:laptop2:Desktop:sort test:feb08" as alias
property texas08 : "laptop2:Users:laptop2:Desktop:sort test:texas08" as alias
property badnames : "laptop2:Users:laptop2:Desktop:sort test:badnames" as alias
property trigger : "laptop2:Users:laptop2:Desktop:sort test:trigger" as alias

on idle
	tell application "Finder"
		try
			set theFiles to files of entire contents of trigger
		on error tError
			display dialog tError giving up after 10
			return 30
		end try
		repeat with indvFile in theFiles
			set fileName to name of indvFile as text
			if fileName contains "_jan08.pdf" then
				move indvFile to jan08 with replacing
			else if fileName contains "_feb08.pdf" then
				move indvFile to feb08 with replacing
			else if fileName contains "_texas08.pdf" then
				move indvFile to texas08 with replacing
			else
				move indvFile to badnames with replacing
			end if
		end repeat
	end tell
	return 30
end idle

Hi curtis,

I hope I understood your “algorithm”.
Try this:


property basicFolder : ""
property badnames : ""
property trigger : ""

on run
	set basicFolder to ((path to desktop as Unicode text) & "sort test:")
	set badnames to (basicFolder & "badnames:")
	set trigger to (basicFolder & "trigger:")
end run

on idle
	try
		tell application "Finder" to set theFiles to files of entire contents of folder trigger
	on error tError
		display dialog tError giving up after 10
		return 30
	end try
	repeat with indvFile in theFiles
		set {name:fileName, name extension:Ex} to (info for indvFile as alias)
		if Ex is missing value then set Ex to ""
		if Ex is not "" then
			set fileName to text 1 thru ((count fileName) - (count Ex) - 1) of fileName
			set Ex to "." & Ex
		end if
		set {TID, text item delimiters} to {text item delimiters, "_"}
		try
			set {Num, col, suf} to text items of fileName
			set text item delimiters to TID
			tell application "Finder"
				set currentfiles to name of files of folder (basicFolder & suf)
				move indvFile to folder (basicFolder & suf) with replacing
			end tell
			if (fileName & Ex) is not in currentfiles then
				repeat with i in currentfiles
					if contents of i starts with (Num & "_") and contents of i ends with ("_" & suf & Ex) then -- include the underscores so page "005" cannot match a name like "0051_..."
						do shell script "rm " & quoted form of ((POSIX path of (basicFolder & suf & ":")) & contents of i) -- this deletes the matching page with the other color!
					end if
				end repeat
			end if
		on error
			set text item delimiters to TID
			tell application "Finder" to move indvFile to folder badnames with replacing
		end try
	end repeat
	return 30
end idle
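
For anyone following along, here is a minimal sketch of the name-matching idea on its own, with no Finder calls, so it can be tried in Script Editor without touching any files. The handler names split_pdf_name and same_slot are made up for this illustration and do not appear in the scripts in this thread. Unlike the script above, this version assumes the middle of the name may itself contain underscores, so it takes the first field as the page and the last field as the suffix.

```applescript
-- Sketch only: split a name such as "005_blue_jan08.pdf" into its page prefix,
-- suffix and extension. Returns missing value when the name does not fit the pattern.
on split_pdf_name(file_name)
	set saved_delims to AppleScript's text item delimiters
	try
		-- strip the extension first
		set AppleScript's text item delimiters to "."
		set name_parts to text items of file_name
		if (count name_parts) < 2 then error "no extension"
		set file_ext to last item of name_parts
		set base_name to (items 1 thru -2 of name_parts) as text
		-- then split the rest on underscores
		set AppleScript's text item delimiters to "_"
		set fields to text items of base_name
		set AppleScript's text item delimiters to saved_delims
	on error
		set AppleScript's text item delimiters to saved_delims
		return missing value
	end try
	if (count fields) < 3 then return missing value
	return {page_no:item 1 of fields, suffix:last item of fields, file_ext:file_ext}
end split_pdf_name

-- Two names compete for the same slot when page prefix and suffix both match,
-- whatever the middle of the name says.
on same_slot(name_a, name_b)
	set parts_a to split_pdf_name(name_a)
	set parts_b to split_pdf_name(name_b)
	if parts_a is missing value or parts_b is missing value then return false
	return (page_no of parts_a is page_no of parts_b) and (suffix of parts_a is suffix of parts_b)
end same_slot

-- Example call using the two names from the first post
same_slot("005_blue_jan08.pdf", "005_red_jan08.pdf")
```

A check like same_slot could drive either behaviour asked for in the first post: delete the older file, or move the newcomer to a WARNING folder instead.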

Hi Curtis,
Stefan has beaten me to the post (literally) and I’m sure his code is more efficient, but I thought I might as well post mine!!
Hopefully I’ve understood what you want!

property jan08 : (path to desktop as string) & "sort test:jan08" as alias
property feb08 : (path to desktop as string) & "sort test:feb08" as alias
property texas08 : (path to desktop as string) & "sort test:texas08" as alias
property badnames : (path to desktop as string) & "sort test:badnames" as alias
property trigger : (path to desktop as string) & "sort test:trigger" as alias

tell application "Finder"
	set theFiles to every file of trigger
	
	repeat with this_file in theFiles
		set fileName to the name of this_file
		set page_no to characters 1 thru 3 of fileName
		if fileName contains "_jan08.pdf" then
			set thepath to jan08
			my page_search(thepath, fileName, page_no)
			move this_file to jan08 with replacing
		else if fileName contains "_feb08.pdf" then
			set thepath to feb08
			my page_search(thepath, fileName, page_no)
			move this_file to feb08 with replacing
		else if fileName contains "_texas08.pdf" then
			set thepath to texas08
			my page_search(thepath, fileName, page_no)
			move this_file to texas08 with replacing
		else
			move this_file to badnames with replacing
		end if
	end repeat
end tell

on page_search(thepath, fileName, page_no)
	tell application "Finder"
		set folder_list to every item of thepath
		repeat with current_file in folder_list
			set current_file_name to the name of current_file
			set current_page_no to characters 1 thru 3 of current_file_name
			if current_page_no is equal to page_no then
				delete current_file
			end if
		end repeat
	end tell
end page_search

Thanks,
Nik

I’m sorry

So do I, Nik :wink:

Thank you very much for the responses.
I’m probably on too little sleep, so most of this went over my head.
I couldn’t get StefanK’s to run, but I did get Blend3’s to do so.
However I can’t figure where I can reinsert the little looping part to get the script to keep trying to kick on every 30 seconds or so.
However, on run, the script did replace several PDFs that had different middles in their names.

I assume I would throw the loop in front of and after the “on page_search” and “end page_search” commands.

Also, how about this. I really need one more step and that is when files are placed into an earlier trigger they need to copy to a remotely mounted hard drive and then go into the trigger above for sorting in our office. All the PDFs on the remote hard drive go to one huge folder with no sorting.

So, if I wanted to copy and then do this move thing above, should I try to include that in one long script, or should I have one script running to do the copy from the pre-trigger to the trigger and then run a second script to move from the trigger to the end folders?

Very interesting to see two methods that should achieve the same result and yet they look so different to me.

Thanks again.

/curtis

Hi Curtis,
Give this code a try:

property jan08 : (path to desktop as string) & "sort test:jan08" as alias
property feb08 : (path to desktop as string) & "sort test:feb08" as alias
property texas08 : (path to desktop as string) & "sort test:texas08" as alias
property badnames : (path to desktop as string) & "sort test:badnames" as alias
property trigger : (path to desktop as string) & "sort test:trigger" as alias
property remote_drive : "YourRemoteHardDriveName" -->as alias

on idle
	tell application "Finder"
		set theFiles to every file of trigger
		repeat with this_file in theFiles
			duplicate this_file to remote_drive --> copy the file to remote drive
			set fileName to the name of this_file
			set page_no to characters 1 thru 3 of fileName
			if fileName contains "_jan08.pdf" then
				set thepath to jan08
				my page_search(thepath, fileName, page_no)
				move this_file to jan08 with replacing
			else if fileName contains "_feb08.pdf" then
				set thepath to feb08
				my page_search(thepath, fileName, page_no)
				move this_file to feb08 with replacing
			else if fileName contains "_texas08.pdf" then
				set thepath to texas08
				my page_search(thepath, fileName, page_no)
				move this_file to texas08 with replacing
			else
				move this_file to badnames with replacing
			end if
		end repeat
	end tell
end idle

on page_search(thepath, fileName, page_no)
	tell application "Finder"
		set folder_list to every item of thepath
		repeat with current_file in folder_list
			set current_file_name to the name of current_file
			set current_page_no to characters 1 thru 3 of current_file_name
			if current_page_no is equal to page_no then
				delete current_file
			end if
		end repeat
	end tell
end page_search

Thanks,
Nik

Well everything has to be one step closer.
If I fill in the path to my remote drive where you indicated I get nothing.

So I was wondering is that because where it is in the script it’s also trying to check for a Jan08 folder and Feb08 folder and so on.
On the remote drive there is no folder hierarchy like that - just one jumbo folder.

Also, where would my repeat things pop in so it doesn’t just run on the push of a run button?

/curtis

Hi Curtis,
Is your remote drive mounted as a volume on your desktop? If so, what is it called? Can you confirm that you’ve saved the script as an application with the Stay Open checkbox checked? The way it should work is that when you launch the app it checks the trigger folder for files. If it finds some, it first copies each file to your remote drive, then works out which folder to move it to and whether it needs to replace an existing page with the same number. It then repeats this procedure every 30 seconds.
Nik

Well, I get an error immediately that says:

Can’t set “Volumes/Blah/Input” to item 1 of {<> “01_cover_Feb.pdf” of <> “trigger” of <> “sort test” of <> “desktop” of <> “curtis” of <> “users” of <> of appl

Is “Volumes/Blah/Input” a path you’re specifying? Maybe you could try replacing

property remote_drive : "YourRemoteHardDriveName"

with

set remote_drive to choose folder

just to see if the remote_drive is the problem?
Nik

Wow.
Well, I’m not at work so I can’t test the remote drive right now, but I just changed the duplicate task to a different folder on my drive and it works great. So I’m sure I can fix this tomorrow.

Can you comment whether I have the following correct.

  • The structure for a folder on a local drive is “laptop2:Users:curtis:Desktop:sort test:trigger” with laptop2 being the name of my drive and the trigger folder inside a sort test folder on my desktop
  • The structure for a locally mounted server drive is “:volumes:disk 1:sort test:trigger” with disk 1 being the name of the drive and a trigger folder inside a sort test folder.
  • And the structure for a remote drive is “/Volumes/Blah_Mags/Input” with Blah_Mags being the name of the drive and Input being the name of the folder on it, correct? Use the slashes in front of the Volumes and before and after the name of the drive?

Or am I making that up?
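
A small sketch may help with the path question. It assumes nothing beyond standard AppleScript: colon-delimited (HFS-style) paths are what alias and the Finder expect, slash-delimited (POSIX) paths are what the shell expects, and one form can be converted to the other. The desktop folder is used here only because it exists on every Mac. The commented-out last line uses the remote folder name from the post above and is untested against that drive.

```applescript
-- HFS-style path (colons), as used by alias and the Finder
set hfs_path to (path to desktop as text) -- e.g. "laptop2:Users:curtis:Desktop:"

-- The same location as a POSIX path (slashes), as used by the shell
set posix_path to POSIX path of hfs_path -- e.g. "/Users/curtis/Desktop/"

-- A POSIX path string has to be converted before the Finder can use it
set desktop_alias to (POSIX file posix_path) as alias

-- Assumption: the same conversion applied to the remote folder.
-- This line errors unless the volume is actually mounted.
-- set remote_drive to (POSIX file "/Volumes/Blah_Mags/Input") as alias
```

A plain string containing slashes is just text to the Finder, which would be consistent with the “Can’t set” error reported earlier when the path was given as an unconverted string.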

What controls the timing of the delay on yours? I keep looking for a

end tell
	return 30
end idle

Many thanks, Nik. Now I want to study StefanK’s and learn the difference between the two, since they look so different. I like to see two ways to approach the same challenge.

/curtis

Hi Curtis,
Your path to a folder structure on your desktop is complete but you can shorten this by specifying the path to desktop as:

(path to desktop as string) & "sort test:trigger" as alias

As far as a volume that’s mounted on your desktop you can just reference this using the volume name:

set mounted_volume to "nameofyourmountedvolume" as alias

so if your mounted drive is called disk 1 then specify the path to the folder structures of that volume like this:

property mounted_volume : "disk 1:sort test:trigger" as alias

Try this line on its own in a new Script Editor document once you have the drive connected; if the path is correct you won’t get an error. I believe this should work for any volume that you can see mounted on your desktop.
By default an on idle handler runs every 30 seconds, but if you’d like it to run more or less frequently, just add or change the return value just before the end idle line.
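
To show where that return value sits, here is a bare-bones skeleton of a stay-open script. The 60 is only an illustrative value, not something taken from the scripts above.

```applescript
on idle
	-- periodic work goes here (for example, the trigger-folder loop)
	return 60 -- ask to be called again in about 60 seconds
end idle
```

The script has to be saved as a stay-open application for the idle handler to be called repeatedly.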

Wow. I’m running it right now on some dupes of back issues and it has sorted about 6000 of 10,000 total PDFs with almost no problems. So this is amazing.

I do have one tiny problem. The remote server on the other end does take the PDFs as they come in and move them to a series of folders for archiving off site for redundancy. It basically has a script running that grabs the file and starts its process. I can probably contact the author and get them to slow down the wait loop on its end, but so far the script has stopped and said "unable to copy file “blah blah blah”. In looking at it, the PDF might have been so big that the script on the other end started to grab it and move while it was still copying and the script you set up knew it hadn’t finished and got confused.

If that were to happen, is there a place where some type of code could be inserted to move the PDF into the badnames folder and just keep going, rather than sitting there and stopping its task?

Thanks a bunch. Let me know if you want some bottled sunshine from Texas sent to the UK.

/curtis
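
One way to keep the script going when a copy fails is to wrap the duplicate in a try block. The following is only a sketch under stated assumptions: this_file is a Finder file reference from the loop, remote_drive and badnames are aliases as in the script above, and the handler name copy_or_quarantine is made up for this illustration.

```applescript
-- Sketch only: try the copy, and if it fails park the file in badnames
-- instead of letting the error stop the whole script.
on copy_or_quarantine(this_file, remote_drive, badnames)
	tell application "Finder"
		try
			duplicate this_file to remote_drive with replacing
			return true -- copy succeeded, caller can go on to sort the file
		on error
			-- the copy failed (for example, the far end grabbed the file early)
			move this_file to badnames with replacing
			return false -- caller should skip sorting this file
		end try
	end tell
end copy_or_quarantine
```

Inside the repeat loop the call might look like `if my copy_or_quarantine(this_file, remote_drive, badnames) then` followed by the existing sorting code, so a failed copy only skips that one file.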

Or here’s another idea.

On the remote server, there is an input folder that starts grabbing files after a 20 second wait or something like that and there is a temp input folder that has no action. If we currently have trouble uploading files on some days we send to the temp input and then manually drag them from there when they’re done into the input folder.

Would much rather go straight to the hot folder input and cut out the manual work. I might be able to slow down the loop on the other end, but could the script screen the files and if they were say 10MB or smaller go straight to the input and if they were over that size get sent to the temp input? If that’s a possibility I could find a million other uses for that part of a script.

Just curious.

/curtis
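
The size screening could be sketched roughly like this. It assumes this_file is a Finder file reference and that input_folder and temp_input_folder are aliases to the two remote folders; those two names, the handler name pick_destination, and the property size_limit are placeholders for this illustration. The Finder reports size in bytes.

```applescript
property size_limit : 10 * 1024 * 1024 -- 10 MB expressed in bytes

-- Sketch only: choose the destination folder from the file's size.
on pick_destination(this_file, input_folder, temp_input_folder)
	tell application "Finder" to set byte_count to size of this_file
	if byte_count ≤ size_limit then
		return input_folder -- small enough to go straight to the hot folder
	else
		return temp_input_folder -- large file, leave it for manual handling
	end if
end pick_destination
```

The result can then be used as the target of the duplicate, for example `duplicate this_file to my pick_destination(this_file, input_folder, temp_input_folder)` inside the Finder tell block.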

Hi Curtis
If I understand correctly, the problem is that the script is trying to copy files from one volume to another and occasionally the file it’s trying to copy hasn’t finished being copied from some other volume?
If so, we should be able to put a handler into the script that checks the size of the file and then checks it again a few seconds later. If the size hasn’t changed, it’s safe to copy the file; if it has changed, the handler waits a few more seconds and compares the sizes again.
Nik
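
A rough sketch of the size-comparison handler described above, assuming this_file is a Finder file reference. The handler name wait_until_stable and its two tuning parameters are made up for this illustration.

```applescript
-- Sketch only: returns true once two consecutive size readings agree,
-- or false if the size is still changing after max_checks attempts.
on wait_until_stable(this_file, pause_seconds, max_checks)
	tell application "Finder" to set last_size to size of this_file
	repeat max_checks times
		delay pause_seconds
		tell application "Finder"
			update this_file -- ask the Finder to refresh its view of the file
			set new_size to size of this_file
		end tell
		if new_size = last_size then return true
		set last_size to new_size
	end repeat
	return false
end wait_until_stable
```

For example, `my wait_until_stable(this_file, 5, 6)` would wait up to about 30 seconds. An unchanged size is only a heuristic: a stalled transfer looks the same as a finished one.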

Yes. Except the software running on the other end can’t be controlled by me, and that’s the one eating things. I could probably get them to lengthen the wait loop by another 10 seconds or so and that would clear it up. I was just wondering if I could do something to this script so that it wouldn’t stop when the other machine ate a file. I’d be fine with just moving it to some other error folder or into the badnames folder and trying it again later. I’d just hate for this script to stop because of something that went wrong and wasn’t its fault.

Hi Curtis,
Would it be possible to set up a temp directory on the volume that your other program is polling? We could then copy the files to this directory and, once they’ve finished copying, move them out of the temp directory into the directory that your other program is polling. Hope this makes sense?
Nik
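
The temp-directory idea might look something like the sketch below. It assumes staging_folder and input_folder are aliases to two folders on the same remote volume and that this_file is a Finder file reference; all three names and the handler name stage_then_release are placeholders for this illustration.

```applescript
-- Sketch only: copy into a folder nobody is polling, then hand the finished
-- copy over to the polled folder.
on stage_then_release(this_file, staging_folder, input_folder)
	tell application "Finder"
		-- the slow network copy lands where nothing is watching
		set staged_copy to (duplicate this_file to staging_folder with replacing)
		-- the duplicate command has returned, so the copy is complete
		move staged_copy to input_folder with replacing
	end tell
end stage_then_release
```

The reasoning is that a move between two folders on the same volume should not need to recopy the data, so the polling program would be much less likely to catch a half-written file. That has not been tested against the remote setup in this thread.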