Speed Issue

I have some scripts which automate our process of getting files into a Cumulus database and making them selectable from a browser in pdf and jpg format. Everything works, but since I launched the scripts 5 weeks ago I already have 6,000+ files in a folder named “pdfs” and another 6,000+ in a folder named “jpgs”.

Because of the process, the pdf files are moved into the pdf folder without an extension and the jpgs are saved as “filename copy.jpg”. I have a sorter/cleaner script which goes through these two folders and corrects the file names so they end up being filename.jpg or filename.pdf. It looks for duplicates: if it finds one then, once again because of our process, it moves the older file to the trash and renames the newer one; if there are no duplicates it just renames the file. But the script takes a very long time to run.

Our naming convention is as such: AA022403_001.jpg/pdf
AA is a magazine code
022403 is the date (US style)
the last digits are ad codes

When I archive material I am basically taking all the ads from an issue of the magazine, converting them from pdf format (we go to print with pdf files) to jpg in PS (6.1) and then renaming them so they can all be accessed through HTML pages and Cumulus Web Publisher.

Here’s the script I’m using. I thought it would speed the process up by isolating the files from the issue I am working on instead of trawling through the whole folder. For the jpgs, I know these are the correct files to rename because they begin with a 2-letter mag code that matches the same code in the folder hierarchy the file came from, and they end with " copy.jpg". The pdfs are the files without the .pdf extension that also have a matching mag code. OK, that makes sense to me; hopefully it will to everyone else.

Here’s the script (just the jpg portion; the pdf one is similar):

copy (name of every item of theJPGPath whose name starts with thisMag and name ends with "copy.jpg") to jpgFilesList

repeat with i from 1 to count of every text item of jpgFilesList
	set thisFile to text item i of jpgFilesList
	set thisFilePath to (theJPGPath as string) & thisFile as alias
	duplicate file thisFilePath to theFolderPath with replacing
	set fileLength to length of thisFile
	set fileName to text 1 thru (fileLength - 9) of thisFile & ".jpg"
	
	if the file fileName of theJPGPath exists then
		set theoldFilePath to (theJPGPath as string) & fileName as alias
		tell application "Finder"
			move theoldFilePath to trash
			set name of thisFilePath to fileName
		end tell
	else
		tell application "Finder"
			set name of thisFilePath to fileName
		end tell
	end if
end repeat

Like I said, I thought this would be faster but it just chugs along slow as molasses. Any tips, hints, tricks, advice, clues, admonishments, etc. etc. will be happily accepted.

Thanks

Steve

Ideally, fix those scripts so they name files correctly in the first place; that way the problem simply goes away. But if that isn’t an option, read on…

Trashing old files is extra work. Any reason you don’t just replace them with the new files?

Isolating just the files for the issue you are working on is sensible.

Any reason you duplicate instead of move? That’s gotta suck down performance. Probably more so when you duplicate files one at a time rather than all at once.

[snip rest of code]

Shouldn’t require all that mashing of file paths as strings/aliases. (Not sure what difference getting rid of it will make to speed, but it’ll certainly make the code easier to read.)

The code below is as clean as I can get, removing the old file-trashing bit and replacing the repeated duplicate command with a single move. Files are moved to a temporary folder for the renaming bit, before moving to the destination folder, replacing any old files there.

set magCode to "AA"
set oldSuffix to " copy.jpg"
set newSuffix to ".jpg"

set sourceFolder to alias "Macintosh HD:Users:has:TEST:in:"
set tempFolder to alias "Macintosh HD:Users:has:TEST:temp:"
set destinationFolder to alias "Macintosh HD:Users:has:TEST:out:"

tell application "Finder"
   --move files to be processed to temp folder
   set filesToProcess to every file of sourceFolder whose name starts with magCode and name ends with oldSuffix
   move filesToProcess to tempFolder with replacing returning filesToRename
   --rename files
   set trimBy to -(length of oldSuffix) - 1
   repeat with aFile in filesToRename
      set name of aFile to (text 1 thru trimBy of (get name of aFile)) & newSuffix
   end repeat
   --move files to destination folder
   move files of tempFolder to destinationFolder with replacing
   return
end tell

If it’s still too slow, you’ll need to resort to Perl or another fast language that has built-in filesystem features.
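For anyone who does go that route, here is a rough sketch of the same jpg clean-up in Python rather than Perl (my substitution, not something suggested above). The function name rename_copies and its parameters are made up for illustration, and it assumes the " copy.jpg" file is always the newer one, as described in the original post. Note that it overwrites the older file outright instead of moving it to the Trash.

```python
import os


def rename_copies(folder, mag_code, old_suffix=" copy.jpg", new_suffix=".jpg"):
    """Rename '<name> copy.jpg' to '<name>.jpg' inside folder.

    Hypothetical helper: an existing '<name>.jpg' is overwritten,
    on the assumption that the ' copy' file is the newer one.
    Returns the sorted list of new file names.
    """
    # Read the directory once and snapshot the names, so the renames
    # below cannot disturb the directory scan.
    with os.scandir(folder) as entries:
        names = [entry.name for entry in entries if entry.is_file()]

    renamed = []
    for name in names:
        if name.startswith(mag_code) and name.endswith(old_suffix):
            # Drop the old suffix, then append the new one.
            new_name = name[:len(name) - len(old_suffix)] + new_suffix
            # os.replace overwrites the target if it already exists.
            os.replace(os.path.join(folder, name),
                       os.path.join(folder, new_name))
            renamed.append(new_name)
    return sorted(renamed)
```

You would call it with something like rename_copies("/path/to/jpgs", "AA"), where the path is a placeholder for your own jpgs folder. Because the folder is listed once and each rename is a single filesystem call, there is no per-file Finder round trip, which is where I would expect most of the time to be going in the AppleScript versions.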

I did have an optimisation of your script to offer, but HAS’s temporary-folder idea’s too good. Here then is an optimisation (hopefully!) of his script. It gets the existing names of the files en masse (an effort to reduce the disk-access time), then builds all the new names at once with a few text item delimiter swaps instead of trimming each name individually. A loop is still required to rename the files, but this should now go slightly faster. Untested…

set magCode to "AA"
set oldSuffix to " copy.jpg"
set newSuffix to ".jpg"
set unlikelyTID to ";+@"

set sourceFolder to alias "Macintosh HD:Users:has:TEST:in:" 
set tempFolder to alias "Macintosh HD:Users:has:TEST:temp:" 
set destinationFolder to alias "Macintosh HD:Users:has:TEST:out:" 

tell application "Finder"
  --move files to be processed to temp folder 
  set filesToProcess to every file of sourceFolder whose name starts with magCode and name ends with oldSuffix
  move filesToProcess to tempFolder with replacing
  
  set astid to AppleScript's text item delimiters
  try -- in case anything goes wrong before the TIDs are reset
    set AppleScript's text item delimiters to {unlikelyTID}
    set oldNames to ((name of every file of tempFolder) as string) & unlikelyTID
    set AppleScript's text item delimiters to {oldSuffix & unlikelyTID}
    set newNames to text items of oldNames
    set AppleScript's text item delimiters to {newSuffix & unlikelyTID}
    set newNames to newNames as string
    set AppleScript's text item delimiters to {unlikelyTID}
    set oldNames to text items 1 thru -2 of oldNames
    set newNames to text items 1 thru -2 of newNames
  end try
  set AppleScript's text item delimiters to astid
  
  --rename files 
  repeat with i from 1 to (count oldNames)
    set name of file (item i of oldNames) of tempFolder to item i of newNames
  end repeat
  --move files to destination folder 
  move files of tempFolder to destinationFolder with replacing
  return
end tell