Can a script solve this OE5 problem?

I’m a script dunce who’s wondering if Applescript can solve a specific problem I have - I’m running OE5.06 in Classic on OS X 10.2.6, and I’m trying to extract data from incoming emails generated by a formmail.cgi. The fields include name, address 1/2, email etc.

I’m aiming to output the data in tab-delimited text format.

Am I stupid or is this possible? If so is there someone who likes a challenge willing to write it for me?
javascript:emoticon(‘:D’)

You certainly can do it, I think everyone should have a script that processes incoming emails.

Processing each line kind of depends on how your CGI formats the email though. I have one that uses the common structure of:

fieldOne=some data typed in a field
fieldTwo=some data selected from a menu
fildThree=some data from a radio button

So that is what I used in the following script. You can change the way it processes each line to fit how your CGI formats your emails.

--this is the file we'll write the extracted data to.
property logFile : alias "Macintosh HD:Desktop Folder:text file.txt"

--set up a variable for the carraige return
property carraigeReturn : (ASCII character 13)

tell application "Outlook Express"
	--get a list of the current messages	
	--will return messages currently manually selected, or all messages being received
	--when the RULE is triggered. (see below)
	set lisCurrentMsgs to get the current messages
	
	--repeat with every current message
	--(in case more than one match the RULE on receipt)
	repeat with alsThisMsg in lisCurrentMsgs
		set strBody to the content of alsThisMsg --get the body of this message
		--store the current TID's
		set oldDelims to AppleScript's text item delimiters
		--set the TID's to return
		set AppleScript's text item delimiters to return
		--get the text items (every item delimited by return) of the body of this email
		set submittedItems to the text items of strBody
		--reset the TID's
		set AppleScript's text item delimiters to oldDelims
		repeat with thisItem in submittedItems
			--repeat with every text item of the body of this email (we split it up with return as a delimiter)
			if thisItem contains "Who:=" then --then this line contains the data for who
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
				set whoField to characters 6 thru charCount of thisItem as string
			else if thisItem contains "Email:=" then --this this line contains the email address
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line
				set emailField to characters 8 thru charCount of thisItem as string
				--add as many else if's as you need and edit the appendText line below to accomodate
			end if
		end repeat
		--now create a tab delim'ed record from the harvested data and write to a file
		set appendText to whoField & tab & emailField as string --concatenate the data to form the record
		set getEOF to (get eof logFile) + 1 --get the end of the log file and add one (this is where we'll start writing)
		write appendText & carriageReturn to logFile starting at getEOF --write to the file
	end repeat
end tell

Save the script in OE’s Script Menu Items folder and set up a rule to match some criteria from the emails your CGI send. If the email address is unique, use that, or use the subject line. Make sure the rule action performs the script you just saved, and maybe moves the email or deletes it.

Hope this gets you on your way.

That is VERY helpful. Thanks a million, I think I am on my way…

Hi again,

I’ve done my newbie best to compile this from what you supplied and below is where I got to. However I’m getting an execution error: "the variable ‘nameField’ is not defined’.
Can you diagnose this and any other glaring errors? Much appreciated.

property logFile : “Macintosh HD/Desktop/mailoutput.txt”

property carriageReturn : (ASCII character 13)

on run
tell application “Outlook Express”

	set lisCurrentMsgs to get the current messages
	
	repeat with alsThisMsg in lisCurrentMsgs
		
		set strBody to the content of alsThisMsg
		
		set oldDelims to AppleScript's text item delimiters
		
		set AppleScript's text item delimiters to return
		
		set submittedItems to the text items of strBody
		
		set AppleScript's text item delimiters to oldDelims
		
		repeat with thisItem in submittedItems
			
			if thisItem contains "Name:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set nameField to characters 6 thru charCount of thisItem as string
				
			else if thisItem contains "Address1:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set address1Field to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "Suburb:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set suburbField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "Town/City:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set towncityField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "Country:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set countryField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "Email:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set emailField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "Phone:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set phoneField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "findout method:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set findoutField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "Other:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set otherField to characters 8 thru charCount of thisItem as string
				
			else if thisItem contains "PRODUCT CHOICE:=" then
				
				set charCount to the (count of characters in (thisItem as string))
				
				set productchoiceField to characters 8 thru charCount of thisItem as string
				
			end if
			
		end repeat
		
		set appendText to nameField & tab & address1Field & tab & suburbField & tab & towncityField & tab & countryField & tab & emailField & tab & phoneField & tab & findoutField & tab & otherField & tab & productchoiceField as string
		
		set getEOF to (get eof logFile) + 1
		
		write appendText & carriageReturn to logFile starting at getEOF
	end repeat
	
end tell

end run

The error appears to be generated when you start setting up the text to append. And it is due to the script never entering the if thisItem contains “Name:=” then block. Apparently the email that is generated by the CGI does not supply a line that contains “Name:=” so it never sets the variable “nameField”. I used that format only as an example, (Field Name:=). You will have to use what the CGI actually supplies as the data heading, and adjust the character count to trim it down.

So, if the CGI sending the email formats it like:

Your code would be:

if thisItem contains "Name=" then
set charCount to the (count of characters in (thisItem as string))
set nameField to characters 5 thru charCount of thisItem as string

You will have to open an email sent by the CGI to figure out how it is being formatted.

I’m not entirely sure on your property either. I’ve just never seen the path to a Mac file portrayed with “/”. Usually you would use a colon “:” but maybe you have a reason for that. Otherwise your code looks good.

If you are still having problems, post the actual email the CGI is sending. Feel free to substitute any sensitive information but leave the text the CGI sends as the data headers.

Best Regards

Ok, I think I’m close to getting this - but now I’m stumped again. I’ve copied a CGI email here as suggested:

Here is where I got to with the code, if I understood you correctly, but when I run it in ScriptEditor I get 'OE got an error: can’t make ‘MacintoshHD:Desktop:mailoutput.txt’ into a file. The filename looks ok to me - any ideas?



property logFile : "MacintoshHD:Desktop:mailoutput.txt"

property carriageReturn : (ASCII character 13)

on run
	tell application "Outlook Express"
		
		set lisCurrentMsgs to get the current messages
		
		repeat with alsThisMsg in lisCurrentMsgs
			
			set strBody to the content of alsThisMsg
			
			set oldDelims to AppleScript's text item delimiters
			
			set AppleScript's text item delimiters to return
			
			set submittedItems to the text items of strBody
			
			set AppleScript's text item delimiters to oldDelims
			
			repeat with thisItem in submittedItems
				
				if thisItem contains "Name:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set nameField to characters 5 thru charCount of thisItem as string
					
				else if thisItem contains "Address1:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set address1Field to characters 9 thru charCount of thisItem as string
					
				else if thisItem contains "Suburb:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set suburbField to characters 7 thru charCount of thisItem as string
					
				else if thisItem contains "Town/City:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set towncityField to characters 10 thru charCount of thisItem as string
					
				else if thisItem contains "Country:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set countryField to characters 8 thru charCount of thisItem as string
					
				else if thisItem contains "Email:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set emailField to characters 6 thru charCount of thisItem as string
					
				else if thisItem contains "Phone:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set phoneField to characters 6 thru charCount of thisItem as string
					
				else if thisItem contains "findout method:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set findoutField to characters 15 thru charCount of thisItem as string
					
				else if thisItem contains "Other:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set otherField to characters 6 thru charCount of thisItem as string
					
				else if thisItem contains "PRODUCT CHOICE:" then
					
					set charCount to the (count of characters in (thisItem as string))
					
					set productchoiceField to characters 15 thru charCount of thisItem as string
					
				end if
				
			end repeat
			
			set appendText to nameField & tab & address1Field & tab & suburbField & tab & towncityField & tab & countryField & tab & emailField & tab & phoneField & tab & findoutField & tab & otherField & tab & productchoiceField as string
			
			set getEOF to (get eof logFile) + 1
			
			write appendText & carriageReturn to logFile starting at getEOF
		end repeat
		
	end tell
end run


Thanks for all the help.

Try changing the property of the logfile to:

Are the fields supplied by the CGI always the same and in the same order? If so, you could simplify your code significantly.

Jon

Assuming all the emails look like the sample you supplied, you could try this code:

Jon

property logFile : "Macintosh HD:Desktop Folder:mailoutput.txt"

The logFile property in the script you posted had the word “Folder” missing after “Desktop” And, unless your drive name is MacintoshHD, (without a space between Macintosh & HD) then your script would have a problem finding the file to write to it. Make sure the property correctly points to the file. Jonn’s suggestion for coding the property is a good one, you can use either. I see no reason your script wouldn’t work after fixing that property.

Thanks for clarifying the logfile point, I’ve addressed that - but it’s not quite resolved yet and I think it might be to do with the CGI headers? Forgive my ignorance, but you may have guessed I’m not a CGI expert either! The first set of sample code I posted didn’t show all the headers - e.g, on examining, I note in each case there’s a unique ‘id’ number. Is this relevant? I’ve pasted some complete code as an example here:



Return-Path: <xxxxxxxx@phosphor.host4u.net>
Received: from phosphor.host4u.net ([64.33.14.39])
	by pollux.host4u.net (8.11.6/8.11.6) with ESMTP id h6T0rZ715368
	for <info@xxxxxxxx.com>; Mon, 28 Jul 2003 19:53:35 -0500
Received: (from xxxxxxxx@localhost)
	by phosphor.host4u.net (8.11.6/8.11.6) id h6T0rUs06314;
	Mon, 28 Jul 2003 19:53:30 -0500
Date: Mon, 28 Jul 2003 19:53:30 -0500
Message-Id: <200307290053.h6T0rUs06314@phosphor.host4u.net>
To: info@xxxxxxxx.com
From: ()
Subject: Sample Request From
Status: RO

Below is the result of your feedback form.  It was submitted by
 () on Monday, July 28, 2003 at 19:53:30
---------------------------------------------------------------------------

Name: xxxxx xxxxxxxx

Address1: xxxxx xxxx xxxxxx

Suburb: xxxxxxx

Town/City: Auckland

Country: New Zealand

Email: xxxxxxxxxxxxx

Phone: xxxxxxxxxxxxx

findout method: Word of mouth

Other: (specify)

PRODUCT CHOICE: xxxxxxxx


I hope this is constructive.

The headers shouldn’t make any difference. When you say it “isn’t resolved yet”, what, exactly, do you mean? What is not working? What is the output you are getting? When you get the content of the message, does it include all of the headers?

Jon

Sorry Jon, ignore my last post, this is the situation.

I’ve saved the script into OE Script Menu items. I tried selecting three Sample Request emails from the relevant folder, and ran the script, but got ‘Script could not be run. End of file was reached.’

However, I also tried highlighting these same scripts, and then running the script from Script Editor. This generated a mailoutput.txt file on the desktop, content of which is:

Very close, but not quite what I need :slight_smile: I’m after the data but not the field names if poss. The name field is missing, ‘address1’, ‘Town/City’ and ‘Phone’ shows ok, but the rest just show fieldnames, not content.

I’m hoping to achieve a tab delimited format that will just open up in Excel in columns so I can edit it to use as a mailing list.

I hope this makes sense, and apologise about the ‘handholding’…

Did you try the script I posted above?

Jon

Yes, as supplied. If it needs customising I’m not sure how.

Hi,

Jon and I are approaching this differently, I just want you to be aware of that. Either way will work fine but I don’t think the time savings is all that significant. Here is my version again, which is different from Jon’s slightly, just so you know. I’ve updated it to work with what your CGI emails send you - you should be able to just copy and paste this into a new script - compile and save. I copied your text and emailed it to myself and ran this script. The result is a tab delimited text file with the cgi form components and without headers.

Warmest regards~

--this is the file we'll write the extracted data to. 
property logFile : ((path to desktop) as string) & "mailoutput.txt"

--set up a variable for the carraige return 
property carraigeReturn : (ASCII character 13)
tell application "Outlook Express"
	--get a list of the current messages    
	--will return messages currently manually selected, or all messages being received 
	--when the RULE is triggered.
	set lisCurrentMsgs to get the current messages
	
	--repeat with every current message 
	--(in case more than one match the RULE on receipt) 
	repeat with alsThisMsg in lisCurrentMsgs
		set strBody to the content of alsThisMsg --get the body of this message 
		--store the current TID's 
		set oldDelims to AppleScript's text item delimiters
		--set the TID's to return 
		set AppleScript's text item delimiters to return
		--get the text items (every item delimited by return) of the body of this email 
		set submittedItems to the text items of strBody
		--reset the TID's 
		set AppleScript's text item delimiters to oldDelims
		repeat with thisItem in submittedItems
			set thisItem to thisItem as string
			--repeat with every text item of the body of this email (we split it up with return as a delimiter) 
			if thisItem contains "Name: " then --then this line contains the data for who 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set whoField to characters 6 thru charCount of thisItem as string
			else if thisItem contains "Address1: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set addressField to characters 10 thru charCount of thisItem as string
			else if thisItem contains "Suburb: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set suburbField to characters 8 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "Town/City: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set townCityField to characters 11 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "Country: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set countryField to characters 9 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "Email: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set emailField to characters 7 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "Phone: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set phoneField to characters 7 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "findout method: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set findoutField to characters 16 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "Other: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set otherfield to characters 7 thru charCount of thisItem as string --now get the characters after the hader
			else if thisItem contains "PRODUCT CHOICE: " then --this this line contains the email address 
				set charCount to the (count of characters in (thisItem as string)) --get the character count for this line 
				set productField to characters 16 thru charCount of thisItem as string --now get the characters after the hader
				--add as many else if's as you need and edit the appendText line below to accomodate 
			end if
		end repeat
		--now create a tab delim'ed record from the harvested data and write to a file 
		set appendText to whoField & tab & addressField & tab & suburbField & tab & addressField & tab & townCityField & tab & countryField & tab & emailField & tab & phoneField & tab & findoutField & tab & otherfield & tab & productField as string --concatenate the data to form the record 
		try
			set getEOF to (get eof logFile) + 1 --get the end of the log file and add one (this is where we'll start writing) 
			write appendText & carraigeReturn to logFile starting at getEOF --write to the file 
		on error --if the file does not exist yet, the last couple of lines will fail
			set logFile to open for access logFile with write permission --make the file with write permission
			write appendText & carraigeReturn to logFile starting at 0 --write to the file 
			close access logFile --close access
		end try
	end repeat
end tell

And to be even more clear, Mytzlscript’s solution is better if there might be changes in the format of the email (although there’s still room for optimization :wink: ). So, its up to you: speed or accuracy if the format changes.

Jon

We have liftoff. Thanks for all the help, you have no idea how much work you’ve saved me:-)

Jon,

One more newbie question - not sure if it’s a scriptable solution or not.

Now that I’ve got my huge text file of data, I need to merge with it another one to create a master list. I can do this ok in either Excel or Filemaker - but either way I’m getting loads of duplicate records showing up.

Surely there’s a simple way to sort/filter by duplicates in either app - or do I need to run a script? I just want to delete the duplicates as easily as possible.

Any input welcome.

To sort, just use the sort command in Excel, that’ll be faster than anything else. In terms of deleting duplicates, after you’ve sorted it in Excel, save the file as a text file and then you can use this script:

Jon