I’m trying to migrate a TON of MS Word files into some format we can display on our website. That part is EASY. But the Word files are spread across approximately 300 old CD backups, with no discernible pattern to their individual folder hierarchies.
I tried to make an Automator workflow in OS X to search a CD for all Word files, and I kinda made it work… The problem is, I can’t say “search for files on [whatever CD is in the drive]”; I have to specify a CD name. Blah!
Second problem: these CDs go back 5 years. Our policy of requiring dot extensions on filenames goes back 3, tops. AND the older files are all Word 5.1 docs. Which means that while the OS X Finder can recognize these as Word files, I can’t use Spotlight to search by extension, type, or creator code… It seems that this info doesn’t get correctly parsed into a format the OS X Finder understands.
So, I’d love to learn how to do this myself, but perhaps it’s just easier/faster to pay someone to make me a script that (a) finds all the Word files (any version) on a CD, (b) copies all the files to a folder on the local hard drive named whatever the CD was named, (c) spits out the disc when it’s done, and (d) repeats a, b, and c until I tell it to stop. Did I mention that the older CDs have illegal characters (slashes, mostly) in some of the file names?
Is there a good, reliable place to hire scripter people? Is there a tool that does all this already? Or any hints on how to overcome the no dot extensions/problematic file typing issue?
This is a start. It depends on the files having a .doc extension.
property knownVolumes : {((name of (info for (path to startup disk)))), "Network", "Halo"}

on run
    repeat
        copyFromCD()
        display dialog "Script finished." buttons {"Run again", "OK"} default button 2
        if (button returned of result) is not "Run again" then exit repeat
    end repeat
end run

on copyFromCD()
    -- Collect every mounted volume that is not in the known list
    set otherVolumes to {}
    repeat with thisDisk in (list disks)
        if thisDisk is not in knownVolumes then set otherVolumes's end to (contents of thisDisk)
    end repeat
    -- "Cancel" raises error -128, which stops the script here
    if (count otherVolumes) is not 1 then ¬
        display dialog "This script only works with one volume." buttons {"Cancel"} default button 1
    try
        set thisVolume to item 1 of otherVolumes
        tell application "Finder" to make new folder at desktop with properties {name:thisVolume}
        set copyTo to quoted form of POSIX path of (result as alias)
        -- The pattern is quoted so the shell passes it to find unexpanded.
        -- Note: cp overwrites files that share a name.
        do shell script "find " & quoted form of ("/Volumes/" & thisVolume) & " -iname '*.doc' -exec cp {} " & copyTo & " \\;"
        --tell application "Finder" to eject thisVolume
    on error errorMsg number errorNum
        display dialog "Error: (" & errorNum & "):" & return & return & errorMsg buttons "Cancel" default button 1 with icon caution
    end try
end copyFromCD
I’m not sure what to do about files without a .doc extension. When I have some more time I’ll see if the Unix command “file” can help out.
--Destination for folders
set thePath to path to desktop from user domain
repeat -- Big loop per disk
    set folderList to "" as list
    -- get disk name and make destination folder
    repeat
        repeat
            try
                tell application "System Events" to set myRoot to name of item 1 of (disks whose ejectable is true)
                set item 1 of folderList to myRoot
                exit repeat
            end try
        end repeat
        try
            tell application "Finder" to set myFolder to make new folder at thePath with properties {name:myRoot}
            exit repeat
        on error theERR
            display dialog theERR
            tell application "Finder" to eject myRoot
            delay 1
        end try
    end repeat
    set testThen to (current date) -- time tester
    repeat -- folderList loop
        tell application "System Events"
            set FolderRef to (item 1 of folderList) as string
            set FolderContents to path of files of folder FolderRef whose kind is "Microsoft Word document"
            set folderList to folderList & (path of folders of folder FolderRef)
        end tell
        --copy files
        repeat with TheFile in FolderContents
            -- Fails for multiple files with same name
            try
                tell application "Finder" to duplicate TheFile to myFolder
            on error theERR
                --display dialog theERR
            end try
        end repeat
        -- Check for more folders to process
        if length of folderList > 1 then
            set folderList to items 2 thru ¬
                (length of folderList) of folderList as list
        else
            exit repeat
        end if
    end repeat -- folderList loop
    -- Eject disk
    tell application "Finder" to eject myRoot
    display dialog "Done in " & ((current date) - testThen) & " seconds. Next..." --time tester
end repeat -- End Big loop
This does not handle duplicated file names: when two files on the same disc share a name, only the first one is copied and the rest are silently skipped. It uses the file kind “Microsoft Word document” to choose documents, which hopefully will overcome the file extension problem.
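For anyone who needs the same-named files as well, here is a minimal shell sketch of a copy that renames instead of skipping or overwriting. It is not part of either script above; the function name copy_unique and the " 2", " 3" suffix scheme are my own assumptions, and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# copy_unique SRC DESTDIR  (hypothetical helper, not from the scripts above)
# Copies SRC into DESTDIR. If a file with the same name is already there,
# " 2", " 3", ... is inserted before the extension instead of overwriting.
copy_unique() {
    src=$1
    destdir=$2
    name=${src##*/}                 # strip the directory part
    case $name in
        ?*.*) stem=${name%.*}; ext=.${name##*.} ;;   # "Report.doc" -> "Report" + ".doc"
        *)    stem=$name; ext= ;;                     # no extension (or a dotfile)
    esac
    target=$destdir/$name
    n=2
    while [ -e "$target" ]; do
        target="$destdir/$stem $n$ext"
        n=$((n + 1))
    done
    cp -p "$src" "$target"
}

# Possible usage (paths are placeholders):
#   find "/Volumes/Some CD" -type f -iname '*.doc' |
#   while IFS= read -r f; do copy_unique "$f" "$HOME/Desktop/Some CD"; done
```

From AppleScript this could be called through do shell script, the same way the first script calls find and cp.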
Thanks! These two worked well, the second one especially… Still, it doesn’t find older Word 5 for Mac documents… But I figured out that the kind of these files isn’t “Microsoft Word document” but “Microsoft Word 1.x-5.x document”. So, all I should have to do is change the line above from this:
set FolderContents to path of files of folder FolderRef whose kind is "Microsoft Word document"
to this:
set FolderContents to path of files of folder FolderRef whose kind is "Microsoft Word document" or kind is "Microsoft Word 1.x-5.x document"
Yes, Kai is right! I ended up pummeling myself to get the answer, but glad to know you had my back.
There were some other little tidbits that were annoying me that I hacked (changing the destination folder) that would be cool to automate or have selectable, but I’ve already cranked through 50 discs in an hour as-is, so you guys have turned maybe 4-5 full days of work into maybe 4-5 hours. THANKS SO MUCH!!!
vizit seems to be taken care of, but I’m still interested in using a Unix command.
I came up with this much, which works with what I threw at it (I know it’s not perfect). I’ll leave it here to give people something else to look at.
property copyBase : path to desktop from user domain
property knownVolumes : {"Halo"}
property findThis : "msword"

on run
    repeat
        copyFromCD()
        display dialog "Script finished." buttons {"Run again", "OK"} default button 2
        if (button returned of result) is not "Run again" then exit repeat
    end repeat
end run

on copyFromCD()
    set otherVolumes to {}
    tell application "System Events" to get name of every disk whose ejectable is true
    repeat with thisDisk in (result)
        if thisDisk is not in knownVolumes then set otherVolumes's end to (contents of thisDisk)
    end repeat
    set thisVolume to item 1 of otherVolumes
    try
        tell application "Finder" to make new folder at copyBase with properties {name:thisVolume}
        set copyTo to quoted form of POSIX path of (result as alias)
        set thisVolume to quoted form of ("/Volumes/" & thisVolume)
        -- Run "file" on every regular file; "://" separates the path from the reported type
        set foundFiles to do shell script "find " & thisVolume & " -type f -exec file -iF :// {} \\; | grep " & findThis
        -- do shell script returns lines separated by carriage returns (ASCII 13)
        set ASTID to AppleScript's text item delimiters
        set AppleScript's text item delimiters to ASCII character 13
        set foundFiles to every text item of foundFiles
        set AppleScript's text item delimiters to ASTID
        repeat with thisFile in foundFiles
            -- Keep only the path, i.e. everything before the "://" separator
            set ASTID to AppleScript's text item delimiters
            set AppleScript's text item delimiters to "://"
            set thisFile to first text item of thisFile
            set AppleScript's text item delimiters to ASTID
            try
                do shell script "cp " & quoted form of thisFile & " " & copyTo
            on error errorMsg number errorNum
                display dialog "Error: (" & errorNum & "):" & return & return & errorMsg buttons "Cancel" default button 1 with icon caution
            end try
        end repeat
        --tell application "Finder" to eject thisVolume
    on error errorMsg number errorNum
        display dialog "Error: (" & errorNum & "):" & return & return & errorMsg buttons "Cancel" default button 1 with icon caution
    end try
end copyFromCD
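One thing to watch in the script above: grep looks at the whole line, so a file whose path happens to contain "msword" would match whatever its type is. Here is a rough shell sketch that splits each line at the "://" separator first and tests only the type part. The function name filter_by_type is made up for illustration, and the sample paths in the usage comment are placeholders.

```shell
#!/bin/sh
# filter_by_type PATTERN  (hypothetical helper, not from the script above)
# Reads lines shaped like "path://description" on stdin, the shape produced
# by `file -F ://`, and prints only the path of each line whose description
# contains PATTERN (fixed string, case-insensitive).
filter_by_type() {
    pattern=$1
    while IFS= read -r line; do
        path=${line%%://*}      # everything before the first "://"
        desc=${line#*://}       # everything after the first "://"
        if printf '%s\n' "$desc" | grep -qiF -- "$pattern"; then
            printf '%s\n' "$path"
        fi
    done
}

# Possible usage (volume name is a placeholder):
#   find "/Volumes/Some CD" -type f -exec file -iF :// {} \; | filter_by_type msword
```

Depending on the version of file, MIME-style output is requested with -i or with -I, so it is worth checking man file on the machine doing the work before relying on the "msword" match.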