bundling up email messages into a PDF attachment

What I would like to do:

  • every day at a certain time, look for all items which have been added to a particular mailbox that day (ie where date received is newer than yesterday)
  • combine all these messages together into a digest
  • print the digest into a PDF file
  • forward said file as an attachment

Is this possible?

Thanks

I assume you are using Automator.

When you save these files, do you get odd characters in the text?

Hi,
Here is my first draft,
I have tested this script and I think it does what you want.

The reason I asked about odd characters in the text is that no matter what method I tried (Automator, AppleScript…),
I got lots of garbage from the badly formed HTML emails.

I had the basis of the PDF conversion and emailer pretty much already. I could have posted a script sooner but was not really happy with the garbage in the end result.

So I spent some time trying to find a solution.

This works. Having said that, I am sure someone will do it better :)

The script gathers the selected emails. (You will need to change the get selection to whatever method you want to use to gather the emails.)

It then gets each email's details, combines them in a format similar to the one used by the Automator action, and cleans out any garbage characters by writing out two helper files.
It then writes out the final PDF file and deletes the two helper files.
Finally it formats the email and sends it. (Actually I have set it here to SAVE in Drafts. Change it to SEND.)

That's it. I wrote this on the fly, so it can most likely be slimmed down ;)


(*	CHANGE EMAIL ADDRESSES *)
property theAddressTo : "TheirEmailAddress@mac.com"
property theAddressFrom : "YourEmailAddress@mac.com"

(*	Digest format lines  *)
set sendr to "FROM: "
set dte to "DATE: "
set sbj to "SUBJECT: "
set rcpt to "RECIPIENT: "
set body to ""
set demark to "-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*"
set theMessageString to ""
set theBody to ""
(*Path and file names for helper conversion files

	CHANGE PATH TO WHERE YOU WANT *)
set usr to POSIX path of (path to current user folder from user domain)

set textTidyFile to usr & "out-file.html"
set newFileName to ""
set textOutFile to usr & "out"

global textTidyFile, textOutFile, newFileName


tell application "Mail"
	set added_items to get selection
	repeat with i from 1 to number of items in added_items
		set this_item to item i of added_items
		set theMessage to {dm:demark & return & return, theDate:dte & date sent of this_item & return, theFrom:sendr & sender of this_item & return, theSubject:sbj & subject of this_item & return, theRecipient:rcpt & address of recipient of this_item & return & return, theContent:body & content of this_item & return & return}
		set theMessageString to theMessageString & (items of theMessage as string)
	end repeat
	(*	Cleanup any garbage from html emails *)
	set theMessageString to my textUtilStuff(theMessageString)
	
	(*	write out helper file *)
	my writeFile(theMessageString)
	(* convert text from helper file to pdf and then remove helper file (-D option in the convert command *)
	my processFile()
	
	(* mail out pdf file 
	Change SAVE to SEND in the _mailer() handler*)
	my _mailer()
end tell



on textUtilStuff(theBody)
	(* using -inputencoding utf-8, seems to stop some bits of garbled text from the html emails *)
	(* using -utf8 in the tidy solved the rest of the   garbled text from the html emails 
	
	The echo "d" >&2  stops the warnings *)
	
	set theBody to (do shell script "echo " & quoted form of theBody & "|textutil   -inputencoding utf-8 -convert html -stdin -stdout | tidy -iq -utf8 -output " & quoted form of textTidyFile & ";echo \"d\" >&2 ")
	set theMessageString to (do shell script "textutil   -inputencoding utf-8 -convert txt  " & quoted form of textTidyFile & " -stdout ")
	
	return theMessageString
end textUtilStuff

on writeFile(theMessageString)
	set t_message to (do shell script "echo " & quoted form of theMessageString & "> " & quoted form of textOutFile)
end writeFile



on processFile()
	set thePOSIXFileName to textOutFile
	set terminalCommand to ""
	set convertCommand to "/System/Library/Printers/Libraries/./convert "
	set newFileName to thePOSIXFileName & ".pdf"
	set terminalCommand to convertCommand & "-i \"text/plain\" -D -f " & "\"" & thePOSIXFileName & "\"" & " -o " & "\"" & newFileName & "\"" & " -j \"application/pdf\""
	
	do shell script terminalCommand
	tell application "Finder" to delete (POSIX file textTidyFile as alias)
	
end processFile

on _mailer()
	
	set theDigest to (POSIX file newFileName) as alias
	tell application "Mail"
		set theSubject to "Combined pdf digest "
		set newMessage to make new outgoing message with properties {subject:theSubject, content:"New Digest" & return & return}
		tell newMessage
			
			set visible to false
			set sender to theAddressFrom
			make new to recipient at end of to recipients with properties {address:theAddressTo}
			(* change save to send to send*)
			tell content
				make new attachment with properties {file name:theDigest} at after the last paragraph
			end tell
			save --<<<<---------------- change save to send to send
			(* change save to send to send*)
		end tell
		
	end tell
	
end _mailer


Hey Mark - many many thanks for this.

On the selection, I’m trying to find something that will run as a scheduled job, so rather than get the selection, it would look for the number of new items since the last time it ran. Is this something like:

set added_items to messages of mailbox “Local/my_mailbox” where x

and x is date received = after yesterday

I can’t work out how to express the date.

My thinking behind this is that I subscribe to a bunch of mailing lists but rarely get round to reading them at my desk … if I can digest the day’s input together and send as a PDF to the kindle, I could catch up with them on the train …

As a side issue, is there a way of cutting the Yahoo mailing list boilerplate from each message before appending it? This is something like:

“to read your list go here
to look at new files go there” etc … it’s always the same for each message.

Again, many many thanks. It’s a great sample for working out Applescript as well.
Mark

Hi Mark,

I just saw your last post when I came back here to post an update to the script which does away with the helper files.
Now the only file written out is the PDF.

I will read through your last post and see if I can help with your questions.

But note, the easiest way for me to test a script like this is to run it from a selection before I go into trying to get rules and schedules working.
Can you test the script (select some emails, run the script, then look in Drafts) to see how it works so far?

Many thanks

(*	CHANGE EMAIL ADDRESSES *)
property theAddressTo : "TheirEmailAddress@mac.com"
property theAddressFrom : "YourEmailAddress@mac.com"

(*	Digest format lines  *)
set sendr to "FROM: "
set dte to "DATE: "
set sbj to "SUBJECT: "
set rcpt to "RECIPIENT: "
set body to ""
set demark to "-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*"
set theMessageString to ""
(* CHANGE PATH TO WHERE YOU WANT *)
set usr to POSIX path of (path to current user folder from user domain)
(* will be used for the new pdf file *)
set textOutFile to usr & "out"
set newFileName to ""
global textOutFile, newFileName

tell application "Mail"
	set added_items to get selection
	set counter to (count of added_items)
	log counter
	if counter ≥ 1 then
		repeat with i from 1 to number of items in added_items
			set this_item to item i of added_items
			set theMessage to {dm:demark & return & return, theDate:dte & date sent of this_item & return, theFrom:sendr & sender of this_item & return, theSubject:sbj & subject of this_item & return, theRecipient:rcpt & address of recipient of this_item & return & return, theContent:body & content of this_item & return & return}
			set theMessageString to theMessageString & (items of theMessage as string)
		end repeat
		(*	Cleanup any garbage from html emails *)
		set theMessageString to my textUtilStuff(theMessageString)
		
		(* convert text to pdf  *)
		my processFile(theMessageString)
		
		(* mail out pdf file 
	Change SAVE to SEND in the _mailer() handler*)
		my _mailer()
	end if
end tell



on textUtilStuff(theMessageString)
	(* using -inputencoding utf-8, seems to stop some bits of garbled text from the html emails *)
	(* using -utf8 in the tidy solved the rest of the   garbled text from the html emails 
	
	Now using "exit 0" instead of echo "d" >&2 to stop the warnings, as I realised I can force a successful exit (0) *)
	
	set theMessageString to (do shell script "echo " & quoted form of theMessageString & "|textutil   -inputencoding utf-8 -convert html -stdin -stdout | tidy -iq -utf8 ; exit 0 ")
	set theMessageString to (do shell script "echo " & quoted form of theMessageString & "|textutil   -inputencoding utf-8 -convert txt  -stdin  -stdout ")
	
	return theMessageString
end textUtilStuff


on processFile(theMessageString)
	set terminalCommand to ""
	set convertCommand to "/System/Library/Printers/Libraries/./convert "
	set newFileName to textOutFile & ".pdf"
	set terminalCommand to convertCommand & "-i \"text/plain\" -o " & "\"" & newFileName & "\"" & " -j \"application/pdf\""
	do shell script "echo " & quoted form of theMessageString & "|" & terminalCommand
	return newFileName
end processFile

on _mailer()
	
	set theDigest to (POSIX file newFileName) as alias
	tell application "Mail"
		set theSubject to "Combined pdf digest "
		set newMessage to make new outgoing message with properties {subject:theSubject, content:"New Digest" & return & return}
		tell newMessage
			
			set visible to false
			set sender to theAddressFrom
			make new to recipient at end of to recipients with properties {address:theAddressTo}
			(* change save to send to send*)
			tell content
				make new attachment with properties {file name:theDigest} at after the last paragraph
			end tell
			save --<<<<---------------- change save to send to send
			(* change save to send to send*)
		end tell
		
	end tell
	
end _mailer

Ok update of the script.

Now looks at a mailbox instead of selection.
Looks for unread messages in the mailbox.
Only processes the unread messages.
Once they are processed and the email is sent, it marks the unread messages in its list as read.

Doing it that way, with the unread/read status, saves messing around with dates.

To schedule, just use iCal

(*    CHANGE EMAIL ADDRESSES *)
property theAddressTo : "TheirEmailAddress@mac.com"
property theAddressFrom : "YourEmailAddress@mac.com"

(*	Digest format lines  *)
set sendr to "FROM: "
set dte to "DATE: "
set sbj to "SUBJECT: "
set rcpt to "RECIPIENT: "
set body to ""
set demark to "-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*"
set theMessageString to ""
(* CHANGE PATH TO WHERE YOU WANT *)
set usr to POSIX path of (path to current user folder from user domain)
(* will be used for the new pdf file *)
set textOutFile to usr & "out"
set newFileName to ""

set theMailbox to "yourMailBox"
(* Note: if the mailbox is inside another mailbox, then write the name like a posix path 
example, I have a mailbox called "old games" inside a mailbox called "Chess". I want to point to the "old games" mailbox. I will use:

set theMailbox to "Chess/Old Games"
  *)
global textOutFile, newFileName

tell application "Mail"
	set added_items to messages of mailbox theMailbox whose read status is false
	set counter to (count of added_items)
	log counter
	if counter ≥ 1 then
		repeat with i from 1 to number of items in added_items
			set this_item to item i of added_items
			set theMessage to {dm:demark & return & return, theDate:dte & date sent of this_item & return, theFrom:sendr & sender of this_item & return, theSubject:sbj & subject of this_item & return, theRecipient:rcpt & address of recipient of this_item & return & return, theContent:body & content of this_item & return & return}
			set theMessageString to theMessageString & (items of theMessage as string)
		end repeat
		(*	Cleanup any garbage from html emails *)
		set theMessageString to my textUtilStuff(theMessageString)
		
		(* convert text to pdf  *)
		my processFile(theMessageString)
		
		(* mail out pdf file 
	Change SAVE to SEND in the _mailer() handler*)
		my _mailer(added_items)
	end if
end tell



on textUtilStuff(theMessageString)
	(* using -inputencoding utf-8, seems to stop some bits of garbled text from the html emails *)
	(* using -utf8 in the tidy solved the rest of the   garbled text from the html emails 
	
	Now using "exit 0" instead of echo "d" >&2 to stop the warnings, as I realised I can force a successful exit (0) *)
	
	set theMessageString to (do shell script "echo " & quoted form of theMessageString & "|textutil   -inputencoding utf-8 -convert html -stdin -stdout | tidy -iq -utf8 ; exit 0 ")
	set theMessageString to (do shell script "echo " & quoted form of theMessageString & "|textutil   -inputencoding utf-8 -convert txt  -stdin  -stdout ")
	
	return theMessageString
end textUtilStuff


on processFile(theMessageString)
	set terminalCommand to ""
	set convertCommand to "/System/Library/Printers/Libraries/./convert "
	set newFileName to textOutFile & ".pdf"
	set terminalCommand to convertCommand & "-i \"text/plain\" -o " & "\"" & newFileName & "\"" & " -j \"application/pdf\""
	do shell script "echo " & quoted form of theMessageString & "|" & terminalCommand
	return newFileName
end processFile

on _mailer(added_items)
	
	set theDigest to (POSIX file newFileName) as alias
	tell application "Mail"
		set theSubject to "Combined pdf digest "
		set newMessage to make new outgoing message with properties {subject:theSubject, content:"New Digest" & return & return}
		tell newMessage
			
			set visible to false
			set sender to theAddressFrom
			make new to recipient at end of to recipients with properties {address:theAddressTo}
			(* change save to send to send*)
			tell content
				make new attachment with properties {file name:theDigest} at after the last paragraph
			end tell
			save --<<<<---------------- change save to send to send
			(* change save to send to send*)
		end tell
		repeat with a from 1 to number of items in added_items
			set this_itema to item a of added_items
			
			set read status of this_itema to true
		end repeat
	end tell
	
end _mailer

You will need to be a bit clearer on what you want with the Yahoo stuff. I read it as you want some text stripped?

If so, you will need to post me a copy of one of the emails and explain the parts you do not want.
Cheers

Hi Mark

The script fails at:

do shell script "echo " & quoted form of theMessageString & "|textutil -inputencoding utf-8 -convert html -stdin -stdout | tidy -iq -utf8 ; exit 0 "

“error “The command exited with a non-zero status.” number 255” … looks like the exit isn’t forceable (or an earlier command exited before it had a chance to force the exit).

Mark

Most likely, I will have another look at it.

Do you get the same thing if you change the line to

do shell script "echo " & quoted form of theMessageString & "|textutil   -inputencoding utf-8 -convert html -stdin -stdout | tidy -iq -utf8 ; echo \"d\" >&2 "

Most annoyingly, I cannot now get that error for love nor money.
The script just works!

This may solve it.
The echo "d" >&2 part definitely suppresses the warnings tidy throws up about the changes it has made.
And the 2>/dev/null, I think, should redirect error messages to null (elsewhere, the land of nod, another universe).
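A quick way to see what that redirect does and does not do: sending stderr to /dev/null hides the messages, but the command's exit status is left alone, so the redirect by itself would not stop do shell script from reporting a non-zero status. The sketch below uses a made-up noisy_tool function as a stand-in for tidy (an assumption, so that it runs without tidy installed).

```shell
#!/bin/sh
# noisy_tool is a hypothetical stand-in for tidy: it prints a
# warning on stderr, writes its result to stdout, and returns 1.
noisy_tool() {
    echo "warning: fixed something" >&2
    echo "<p>cleaned output</p>"
    return 1
}

# Redirecting stderr hides the warning text, but the status is still 1.
status=0
out=$(noisy_tool 2>/dev/null) || status=$?

# Ending the command list with "exit 0" is what forces a zero status.
forced_status=0
forced=$(noisy_tool 2>/dev/null; exit 0) || forced_status=$?

echo "status=$status forced_status=$forced_status"
```

So the two pieces do different jobs: the redirect silences the text, and the trailing command decides the status that gets reported back.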

set theMessageString to (do shell script "echo " & quoted form of theMessageString & "|textutil   -inputencoding utf-8 -convert html -stdin -stdout | tidy  -iq -utf8 2>/dev/null ;echo \"d\" >&2 ")

Hi Mark - still getting the same error - have tried putting exit 0 into both commands in the pipeline …

A bit more - I tried encasing the offending do shell script (and in fact the following one) in a try/end block, causing it to ignore the error code - this worked. However, a subsequent shell script also ends non-zero (and encasing that just hides the error - it fails to write the output file):

'|/System/Library/Printers/Libraries/./convert -i "text/plain" -o "/Users/mark/out.pdf" -j "application/pdf""
→ error “The command exited with a non-zero status.” number 255
Result:
error “Can’t make file "Mac 750GB:Users:mark:out.pdf" into type alias.” number -1700 from file “Mac 750GB:Users:mark:out.pdf” to alias

This could be a size issue - I’m running against a mailbox with a large number of unread entries. Running it with a mailbox with only 2 unread entries, it works.

Ok just testing it on a mailbox with over 5 thousand unread…
Oops time out…
Ok let me try 1000
Oops time out…
Ok let me try 500

By the way, I did think about a try block, but I wanted to see if I could avoid it.
I would need to process each message in a repeat loop rather than the whole digest as one string.

The mailbox which is failing has about 750.

One issue (I suspect) on the scripts being caught is that the tidy command has an unusual error code - 0 for OK, 1 for warnings, 2 for errors, rather than the usual 0/1 for OK/error.
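One way to act on that, sketched under assumptions: wrap the command in a small function that maps status 1 (warnings only) to success and passes anything else through. The function name run_tolerating_warnings is made up for illustration, and sh -c 'exit N' stands in for tidy so the sketch runs anywhere.

```shell
#!/bin/sh
# Run a command and treat exit status 1 ("warnings only" in tidy's
# scheme) as success; pass every other status through unchanged.
run_tolerating_warnings() {
    rc=0
    "$@" || rc=$?
    if [ "$rc" -eq 1 ]; then
        return 0
    fi
    return "$rc"
}

# Stand-ins for tidy finishing with warnings (1) and with errors (2).
rc_warn=0
run_tolerating_warnings sh -c 'exit 1' || rc_warn=$?
rc_err=0
run_tolerating_warnings sh -c 'exit 2' || rc_err=$?

echo "warnings -> $rc_warn, errors -> $rc_err"
```

Unlike a blanket exit 0, this would still let a genuine tidy error (status 2) surface. In the pipeline it would sit where tidy is called, for example run_tolerating_warnings tidy -iq -utf8.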

I love tidy, but ma…n,
This should not be so bloody hard.

I’m actually starting to think it’s down to the character length of the final theMessageString.

I think you’re right. I’ve got it working by serialising out to files rather than passing around as a string.
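That would fit the operating system's limit on how much text can be handed to a new process on its command line, which is the likely reason a very large echo of the whole digest fails while a two-message digest works. A minimal sketch of the file-based approach follows; the file name pattern and sample text are made up, and wc stands in for textutil and tidy.

```shell
#!/bin/sh
# Upper bound (in bytes) on the arguments plus environment that can
# be passed to a newly started process on this system.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"

# Stage the text in a temporary file instead of on the command line.
digest_file=$(mktemp "${TMPDIR:-/tmp}/digest.XXXXXX")
trap 'rm -f "$digest_file"' EXIT

# Sample content only; the real digest would be written here.
printf '%s\n' "FROM: someone@example.com" "SUBJECT: test" > "$digest_file"

# Later stages read the file on stdin, so its size no longer matters
# to the command line. wc is a stand-in for the real converters.
line_count=$(wc -l < "$digest_file" | tr -d ' ')
echo "staged $line_count lines in $digest_file"
```

On the AppleScript side, the digest text would need to be written to the file with AppleScript's own file commands rather than with echo, otherwise the long string still travels through the command line.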

So the first do shell script ends up as the following (and textutil should accept an input file rather than having to be catted to, but doesn’t want to play). I’ve removed the second reformat to text as I’m trying to work out whether I can get a prettier format by sticking with HTML. Note in some cases it barfs on utf8 also, so I took that out.

do shell script "cat " & the_file & "| textutil -convert html -stdout -stdin| (tidy -iq -utf8 -f /dev/null; exit 0) >" & out_file

… and the second one is below. I'm still playing with HTML, which does take a ton of processing power, but for my purposes it feels like I can just send the Kindle HTML anyhow. I plan to put a sed script before the textutil above, to strip out the boilerplate and possibly to pretty up the HTML some more.

set terminalCommand to convertCommand & "-f \"" & out_file & "\" -i \"text/html\" -o " & "\"" & newFileName & "\""
do shell script terminalCommand
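The sed step mentioned above could look something like the sketch below. Both patterns are placeholders taken from the paraphrased boilerplate earlier in the thread, not the real Yahoo footer text, and strip_boilerplate is a made-up name.

```shell
#!/bin/sh
# Delete every line that starts with a known boilerplate phrase.
# Both patterns are placeholders; substitute the actual footer text.
strip_boilerplate() {
    sed -e '/^to read your list go here/d' \
        -e '/^to look at new files go there/d'
}

# Made-up sample message with the boilerplate at the end.
sample='Hello list,
here is the real message.
to read your list go here
to look at new files go there'

cleaned=$(printf '%s\n' "$sample" | strip_boilerplate)
printf '%s\n' "$cleaned"
```

Because the footer is the same on every message, fixed patterns like these should be enough. The filter would go between cat and textutil in the first pipeline.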

Will tidy up and post my changes. Many thanks for the persistence, Mark …