Search PDF files

Hello.

I’ve got a folder tree that I need to search for PDF files…

The folders are set up like this:
root
–year
–month-year
–day-month-year

What I need is to get ALL the .pdf files into one folder… how would I do something like that?
There are over 1600 files in the folder tree… I’ve searched a bit, but what I found didn’t
help me…

Test this on smaller folder structures before moving on to the 1600 files. :wink:

-- destination is global so the handlers below can use it
global destination
on run
	set theFolder to choose folder with prompt "Choose Source Folder "
	set destFolder to choose folder with prompt "Choose Destination Folder "
	set destination to destFolder as alias
	tell application "Finder"
		set all_files to every file in folder theFolder
		set all_folders to every folder in folder theFolder
	end tell
	my movefiles(all_files)
	my searchfolders(all_folders)
end run
-- Move every file in the list whose kind starts with "pdf" to the destination folder
on movefiles(flenames)
	set n to count of items in flenames
	repeat with j from 1 to n
		tell application "Finder"
			set nwitem to item j of flenames as alias
			set k to kind of nwitem
			if k starts with "pdf" then
				move nwitem to destination
			end if
		end tell
	end repeat
end movefiles
-- Walk each folder in the list: move its PDFs, then recurse into its subfolders
on searchfolders(foldernames)
	tell application "Finder" to set n to count of items in foldernames
	repeat with j from 1 to n
		tell application "Finder"
			set nwitem to item j of foldernames as alias
			set subfiles to every file in nwitem
			set subfolders to every folder in nwitem
		end tell
		my movefiles(subfiles)
		my searchfolders(subfolders)
	end repeat
end searchfolders

Oh yeah, that worked like I hoped… Thanks a lot for the fast reply.

Hm, okay, this is working but taking a h*** of a time to finish. What I’m trying to do is create a search application for a company, which lets the administration search for orders. When a new order is placed, they get a mail; this mail is then printed out so people know what to send in the next delivery. All this is working great. Maybe I should save the name of the order in a database file and let the user sync that file, then show the PDF in a WebView from a mounted server; that might speed things up a bit… If somebody has another idea for this, please let me know. There are
about 40 new orders a day, so the database file will grow too. I don’t know if there is a limit for something like an sqlite3 database, but I guess Google can tell me that :slight_smile:

(Script Editor is still mucking around as I write this :slight_smile: )

/mkh

I don’t know the best way to do what you want, but it occurred to me that there might be files with duplicate names, so I made a few changes. This script will end with a list of the paths to files that it couldn’t move because of an error.

global destination, pdflist
on run
	set pdflist to {} -- paths of PDFs that could not be moved
	set theFolder to choose folder with prompt "Choose Source Folder "
	set destFolder to choose folder with prompt "Choose Destination Folder "
	set destination to destFolder as alias
	tell application "Finder"
		set all_files to every file in folder theFolder
		set all_folders to every folder in folder theFolder
	end tell
	my movefiles(all_files)
	my searchfolders(all_folders)
	pdflist
end run
on movefiles(flenames)
	set n to count of items in flenames
	repeat with j from 1 to n
		tell application "Finder"
			set nwitem to item j of flenames as alias
			set k to kind of nwitem
			if k starts with "pdf" then
				try
					move nwitem to destination
				on error
					set nw to nwitem as text
					set end of pdflist to nw
				end try
			end if
		end tell
	end repeat
end movefiles
on searchfolders(foldernames)
	tell application "Finder" to set n to count of items in foldernames
	repeat with j from 1 to n
		tell application "Finder"
			set nwitem to item j of foldernames as alias
			set subfiles to every file in nwitem
			set subfolders to every folder in nwitem
		end tell
		my movefiles(subfiles)
		my searchfolders(subfolders)
	end repeat
end searchfolders

This might be a little faster.
If you want to copy the files instead of moving them, change
mv to cp.

Cheers,

Craig


-- pathToSearch should be the topmost folder in the hierarchy.
set pathToSearch to quoted form of POSIX path of ("path:to:PDF_folder:")
set destinationPath to quoted form of POSIX path of ("path:to:Destination_folder:")

do shell script "find " & pathToSearch & " -iname \"*.pdf\" -exec mv {} " & destinationPath & " \\;"

FYI,

I just tested this on a multi-level folder structure with 3,677 pdf files and it took 65 seconds.

Craig
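
One caveat with the one-liner, tying in with the duplicate-name concern raised earlier: mv replaces a file of the same name in the destination without asking, so two order PDFs with the same name in different day folders would end up as one. Below is a minimal sketch of a variant that skips such files and prints their paths instead. collect_pdfs is a made-up name, and the sketch assumes the destination folder lies outside the tree being searched.

```shell
# collect_pdfs SRC DEST   (hypothetical helper, not from the posts above)
# Moves every *.pdf (any letter case) found under SRC into DEST.
# A file whose name already exists in DEST is left where it is and its
# path is printed, so nothing in DEST gets overwritten.
# Assumption: DEST is not inside SRC.
collect_pdfs() {
    find "$1" -type f -iname '*.pdf' -exec sh -c '
        dest=$1; shift
        for f in "$@"; do
            name=${f##*/}              # file name without its folder path
            if [ -e "$dest/$name" ]; then
                printf "skipped (name exists): %s\n" "$f"
            else
                mv "$f" "$dest/"
            fi
        done
    ' sh "$2" {} +
}
```

Called as collect_pdfs /path/to/root /path/to/destination, the printed lines play roughly the same role as pdflist in cwtnospam’s script: a list of files that still need attention.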

Thanks to both of you. Craig, you are the man, OMG that was AWESOME. Now I just need to figure out the shell syntax to add the file name to sqlite via that shell script too, then it will be fixed in one line, damn nice… thank you SO much…

As for you, cwtnospam, your script is great and I actually think I can use it for one of my other projects, so all in all, super thx…

/mkh
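
For the sqlite part, here is a rough sketch of one way it could be done, not a tested recipe for the setup described above: a shell function that prints one INSERT statement per PDF, which can then be piped into the sqlite3 command-line tool. The function name pdf_insert_sql, the table name orders, its filename column, and the database file orders.db are all assumptions for illustration only.

```shell
# pdf_insert_sql DIR   (hypothetical helper)
# Prints one INSERT statement per PDF found under DIR.
# Assumptions: a table created with
#     CREATE TABLE orders (filename TEXT);
# and file names that contain no newline characters.
pdf_insert_sql() {
    find "$1" -type f -iname '*.pdf' | while IFS= read -r path; do
        name=${path##*/}               # file name without its folder path
        # In an SQL string literal a single quote is escaped by doubling it.
        escaped=$(printf '%s' "$name" | sed "s/'/''/g")
        printf "INSERT INTO orders (filename) VALUES ('%s');\n" "$escaped"
    done
}

# Usage (hypothetical database file):
#   pdf_insert_sql /path/to/root | sqlite3 orders.db
```

Printing the SQL first rather than calling sqlite3 once per file keeps it to a single database invocation and makes it easy to inspect the statements before running them.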

Okay wait a second… we have someone using a find statement and neither Adam nor Stefan came in with their mdfind versions? The world must be ending :stuck_out_tongue:

May I remind you of your own mdfind version:

http://bbs.macscripter.net/viewtopic.php?pid=97912#p97912

In case anyone’s still interested in a pure ( :wink: ) AppleScript method, I’ve made it a little faster:

global destination, pdflist
on run
	set pdflist to {} -- paths of PDFs that could not be moved
	set theFolder to choose folder with prompt "Choose Source Folder "
	set destFolder to choose folder with prompt "Choose Destination Folder "
	set destination to destFolder as alias
	tell application "Finder"
		set all_files to every file in folder theFolder whose name extension contains "pdf"
		set all_folders to every folder in folder theFolder
	end tell
	my movefiles(all_files)
	my searchfolders(all_folders)
	pdflist
end run
on movefiles(flenames)
	set n to count of items in flenames
	repeat with j from 1 to n
		tell application "Finder"
			set nwitem to item j of flenames as alias
			try
				move nwitem to destination
			on error
				set nw to nwitem as text
				set end of pdflist to nw
			end try
		end tell
	end repeat
end movefiles
on searchfolders(foldernames)
	tell application "Finder" to set n to count of items in foldernames
	repeat with j from 1 to n
		tell application "Finder"
			set nwitem to item j of foldernames as alias
			set subfiles to (every file in nwitem whose name extension contains "pdf")
			set subfolders to every folder in nwitem
		end tell
		my movefiles(subfiles)
		my searchfolders(subfolders)
	end repeat
end searchfolders

Okay okay you got me there (well played btw), but at least I used the regular find in the first incarnation of that script! =)

Thanks a lot to all. I think I’ll go with Craig’s way to handle what I’m trying to do here. I ended up using this in a Mail rule, where all information is added to sqlite. ALL incoming mails for the account will always have a PDF file
attached, so I used pdftotext http://www.bluem.net/downloads/pdftotext_en/ to convert them to plain text files, which are then imported into sqlite. That gave me the option to only “sync/push” the new db file to the user’s machine every time
they get online on the local net, and then use cURL to download the PDF (locally) to view in the application…

cwtnospam, thx for the new update of that routine, it sure helped me in one of my other projects…

Got one more question though: currently I’m using a WebView to show the PDF, which is working fine, but should I be using
the actual PDFView for this, or is that only needed if I need some options that come with that object?
I haven’t checked anything about the PDFView yet, since the WebView was working fine…

I appreciate all the help and the things you pointed out…

/mkh