Reading multiple files then parsing

Hello everyone,

First-time poster here. I’ve dabbled in programming for a number of years, but I’m definitely no pro!

Using tutorials from this excellent site and my own experience, I’ve managed to put together a little AppleScript that reads a particular file from our advertising system and parses the ad numbers out of it. It then duplicates a PDF file once per ad number and renames each copy to that ad number. This is the method we use to get our newspaper advertisements ready.

However, I’d like to do the same thing with multiple files, and I just can’t seem to get the code to work. I haven’t been able to find any good tutorials (but perhaps I just haven’t been using the right search terms). Anyway, this is what I have so far, and it works OK for my purposes:

(it looks for the term “cust” in the data file, and returns the next 8 characters which are the ad number I need parsed out)

property serialBeginning : "cust"
property serialEnd : ASCII character 22 -- not used below; the ad number is read as a fixed 8 characters

-- Read a chosen file and prepare to search for serialBeginning
set theContents to read (choose file)
set originalDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to {serialBeginning}

-- Split the file into a list of strings that start right after serialBeginning
-- Ignore the first item, which is just the text before the first occurrence
set theItems to text items 2 thru (count of text items of theContents) of theContents

-- Restore the delimiters now that the split is done
set AppleScript's text item delimiters to originalDelimiters

-- Trim off excess
set serialArray to {}
repeat with nextItem in theItems
	-- Add the first 8 characters (the ad number) to serialArray
	set end of serialArray to text 1 thru 8 of nextItem
end repeat

-- serialArray now contains a list of the serial numbers

set theFolder to choose folder with prompt "Select destination folder" without invisibles
tell application "Finder"
	repeat with nextItem in serialArray
		duplicate file "Ad Number" of folder theFolder
		set the name of file "Ad Number copy" of folder theFolder to (contents of nextItem)
	end repeat
end tell

-- set the finalOutput to "Processed ads " and item 1 of serialArray and " to " and the last item of serialArray

display dialog "Processed ads " & item 1 of serialArray & " to " & the last item of serialArray buttons {"OK"} default button 1

My attempts to work with multiple files have all ended up miserably and I haven’t saved any script attempts from them. I’d love a nudge in the right direction, if anyone can help!

Thanks for your time & consideration =)

I’m not sure I understand what you want to do. But if your script works then you can just put it in a repeat loop to run it multiple times like so…

repeat
	set theFile to choose file
	
	-- do your script
	
	display dialog "Do you want to continue with another file?" buttons {"Quit", "Continue"} default button 2
	set theButton to button returned of result
	if theButton is "Quit" then exit repeat
end repeat

Thanks for the reply!

Perhaps I should have gone into a bit more detail. We have a binary file that contains a list of all the advertisements running in our newspaper each day (the data file includes the advertiser, ad size, ad ID number, etc.). I wanted to read the file, parse the ad ID numbers out, and create a dummy file that is used by our pagination software. This I was able to do just fine. However, I wanted the script to parse ALL the files in a folder, instead of my pointing to each one. I don’t need a separate output for each file in the folder; I just want to be able to read them all in. I wasn’t having any luck reading all the files of the directory in at once, and I kept getting various errors when I’d try to pass the file list to AppleScript and then read each file in turn.

Basically, what I’m hoping to find is a way to read all the files in a folder into one chunk of data and then parse the numbers out of that chunk. I just can’t get it working with more than one file at a time.

This will set the variable “allContents” to the contents of every file in a folder.

Note the line “set aFile to aFile as string”. Before I added it, each file was specified as a Finder reference such as “document file file.txt of folder desktop of folder username of folder Users etc.” That kind of file specification is difficult to work with, so I coerced it to a string, which turned it into “startupdisk:Users:username:Desktop:file.txt” and made it much easier to work with. Maybe this was your problem. But note that now that it is a string, I have to tell any command that the string is actually a “file” path, as in the “read” statement, where I had to write “read file”.

Anyway, I hope this helps.

set theFolder to choose folder
tell application "Finder" to set theFiles to every file of theFolder

set allContents to ""
repeat with aFile in theFiles
	set aFile to aFile as string
	set allContents to allContents & (read file aFile) & return
end repeat

I see that regulus6633 has already posted a similar solution while I was composing this one. Since mine differs a bit in the fine details, I will go ahead and post it anyway.

Let me see if I understand.

  1. The file that you manually pick via choose file near the beginning of the script is the binary file that contains advertisement information.
  2. There are actually multiple instances of these binary advertisement info files (#1).
  3. All of these binary advertisement info files (#2) are stored in the same directory. Or if not all of them, at least all of them that you would like to process at one time.
  4. Because of the format of these binary advertisement info files (#1) and the way you parse them, you can concatenate the contents of all of the files (#2) in this directory (#3) together and meaningfully parse out the info you need for all the files in one go.

If that is the case, then try changing these lines:

--Read a chosen file and prepare to search for serialBeginning
set theContents to read (choose file)

to these lines:

--Read the files from a chosen folder
set theFolder to choose folder with prompt "Choose the directory containing the Advertisement Information file to process"
tell application "Finder" to set theFiles to every file of folder theFolder as alias list
set theContents to ""
repeat with theFile in theFiles
	set theContents to theContents & (read theFile)
end repeat
--prepare to search for serialBeginning

In comparison to regulus6633’s solution, I have Finder coerce the list of file objects to a list of (generic AppleScript) aliases. That way I do not have to coerce them to a string and file later to read them (read accepts a generic AppleScript alias just fine). Also, since you said the files were binary, I do not put in a return like regulus6633’s solution does. Other than that, the solutions take the same approach.

Based on the contents of the display dialog command at the end of your script, it looks like you are assuming that you can tell which entries have been processed from the first and last entries in the file. When combining files, this assumption may no longer hold, as it will depend on the ordering of the files as returned by every file of folder theFolder as well as the relative order of the entries between the files.

Say the first file in that list of files contains 00000312, 00000313, 00000314. Furthermore, the second file in that list contains 00000319, 00000320. And the last file in the list contains 00000315, 00000316, 00000317, 00000318. The last dialog would show “Processed ads 00000312 to 00000318”, even though one would normally think of the entire list as 00000312 to 00000320. This may not be what you expect. If you want to keep this summary report and have it be meaningful, you will probably have to sort the entries to find the first and last one. I suppose if you just need the first and last entry for the report at the end, then you do not need a full sort, just one pass through the list to pick out the smallest and largest. You would pretty much need a full sort, though, if you wanted to make sure there are no gaps or duplicates in the list. The reason I bring this up is that, to me, the phrasing of the report implies that there are no gaps or duplicates. If you need a sorting handler, you might want to check out this unScripted tutorial on sorting.
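
The single pass mentioned above might look something like the following. This is only a sketch: the handler name firstAndLastSerial is made up for illustration, and it assumes every ad number is a zero-padded string of the same length (as in the 8-character examples), so that plain text comparison orders them the same way numeric comparison would.

```applescript
-- Hypothetical helper (not part of the original script): returns
-- {smallest, largest} from a list of equal-length, zero-padded
-- serial strings in one pass, without sorting the list.
on firstAndLastSerial(serialList)
	if (count of serialList) is 0 then return {missing value, missing value}
	set smallest to item 1 of serialList
	set largest to item 1 of serialList
	repeat with nextSerial in serialList
		-- Dereference the loop variable so we compare plain text
		set thisSerial to contents of nextSerial
		if thisSerial < smallest then set smallest to thisSerial
		if thisSerial > largest then set largest to thisSerial
	end repeat
	return {smallest, largest}
end firstAndLastSerial

-- Sample data taken from the three-file example above
set serialArray to {"00000312", "00000313", "00000314", "00000319", "00000320", "00000315", "00000316", "00000317", "00000318"}
set {firstSerial, lastSerial} to firstAndLastSerial(serialArray)
set summaryText to "Processed ads " & firstSerial & " to " & lastSerial
```

The resulting summaryText could then be handed to display dialog in place of the item 1 / last item expression. Note that this only finds the two ends of the range; it says nothing about gaps or duplicates in between.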

One other thing, if the combined size of the advertisement info files is very large, it might be better to process the files one by one instead of reading them all in at once.
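
As a rough sketch of the one-file-at-a-time alternative, the parsing step can be moved into a handler that is called once per file, so only one file's contents is held in memory at a time. The handler names serialsFromText and serialsFromFiles and their parameters are hypothetical, chosen for illustration; the "cust" marker and the 8-character length come from the original script.

```applescript
-- Hypothetical handler: returns the serialLength characters that
-- follow each occurrence of marker in theText.
on serialsFromText(theText, marker, serialLength)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {marker}
	set theChunks to text items of theText
	set AppleScript's text item delimiters to savedDelimiters
	
	set foundSerials to {}
	-- Chunk 1 is the text before the first marker, so start at chunk 2
	repeat with i from 2 to (count of theChunks)
		set thisChunk to item i of theChunks
		-- Skip chunks too short to hold a complete serial number
		if (length of thisChunk) >= serialLength then
			set end of foundSerials to text 1 thru serialLength of thisChunk
		end if
	end repeat
	return foundSerials
end serialsFromText

-- Hypothetical handler: reads and parses each file in fileList
-- (a list of aliases) separately and collects the results.
on serialsFromFiles(fileList, marker, serialLength)
	set allSerials to {}
	repeat with theFile in fileList
		set allSerials to allSerials & serialsFromText((read theFile), marker, serialLength)
	end repeat
	return allSerials
end serialsFromFiles

-- Possible usage with the folder-reading lines suggested earlier:
-- set theFolder to choose folder
-- tell application "Finder" to set theFiles to every file of folder theFolder as alias list
-- set serialArray to serialsFromFiles(theFiles, "cust", 8)
```

A side effect of parsing per file is that a marker can never be matched across the boundary between two concatenated files.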

Model: iBook G4 933
AppleScript: 1.10.7
Browser: Safari 419.3
Operating System: Mac OS X (10.4)

Excellent, it worked! Thank you both very much for the feedback, suggestions and code. I ended up using the latter code and it worked quite well. I was close in my attempts, but I see where I went wrong (I too was trying to use a Finder tell, but I didn’t have the wording quite right).

This’ll be great, it’ll save a good hour of typing a week and will eliminate errors. Now I just have to rewrite a Basic parse for our classified ads and I’ll be all set!

Thanks again for all the help!

I’m glad it worked… good luck!