Remove duplicate lines from AppleScript variable

I have a long-working AppleScript that saves certain email messages to Notes (so that I can action at a later time… Notes is my data inbox).

Some emails come into Notes with many many (many) extra line breaks which makes them less pleasant to read. I got the idea that I could trim out the extra lines, but I’m getting an error.

Ideally, I never want more than one blank line between text.

Here’s the sed line that fails; the error says it doesn’t understand the Message_Body input:

set Message_Body to do shell script "sed '$!N;/^\\n$/{$q;D;};P;D;' " & Message_Body

How should this line be written?

And, for context, here is the entire script that works fine except for the line where I try to “do shell script”.

tell application "Mail"
    set theMessages to every message of mailbox "INBOX" of account "Gmail" whose flag index is 1
    repeat with i from 1 to number of items in theMessages
        
        set thisMessage to item i of theMessages
        set Message_Subject to subject of thisMessage
        set Message_Sender to sender of thisMessage
        set Message_Body to content of thisMessage
        set Message_Body to do shell script "sed '$!N;/^\\n$/{$q;D;};P;D;' " & Message_Body
        set Message_ID to message id of thisMessage
        set Message_Date to date received of thisMessage
        set flag index of thisMessage to 5
        
        tell application "Notes"
            tell account "iCloud"
                make new note at folder "Notes" with properties {creation date:Message_Date, body:"<pre style=\"font-family:System Font;font-size:14px;\">" & ¬
                    Message_Date & linefeed & ¬
                    "Subject: " & Message_Subject & linefeed & ¬
                    "Sender: " & Message_Sender & linefeed & linefeed & ¬
                    Message_Body & linefeed & ¬
                    "message://%3c" & Message_ID & "%3e" & ¬
                    "</pre>"}
                delay 2
            end tell
        end tell
        
    end repeat
end tell

Hi. Shell calls shouldn’t be placed inside a tell block, and ideally a shell call is made once to edit a string rather than repeated on every pass through a loop. I would also use tr for this, rather than sed; it’s considerably easier.

set theText to "extra 





returns between 
lines

must 

go"



do shell script "echo " & theText's quoted form & " | tr -s " & linefeed's quoted form
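
One thing to note about tr -s: it squeezes every run of linefeeds down to a single linefeed, so no blank lines survive at all. If the goal is at most one blank line between paragraphs, as in the original question, cat -s may be closer, since it squeezes runs of blank lines to one blank line. A minimal shell sketch comparing the two (the sample text and variable names are only illustrative):

```shell
# Sample text with runs of blank lines (illustrative only).
sample=$(printf 'extra\n\n\n\nreturns between\nlines\n\nmust\n\ngo')

# tr -s squeezes each run of newlines to one newline: no blank lines remain.
no_blanks=$(printf '%s\n' "$sample" | tr -s '\n')

# cat -s squeezes each run of blank lines to a single blank line.
one_blank=$(printf '%s\n' "$sample" | cat -s)

printf '%s\n---\n%s\n' "$no_blanks" "$one_blank"
```

Assuming that is the behaviour you want, the pipeline in the script above would end in " | cat -s" instead of the tr call.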

Alternatively, a text item delimiter solution could be used.

repeat until theText does not contain linefeed & linefeed
	set text item delimiters to linefeed & linefeed
	set theText to theText's text items
	set text item delimiters to linefeed
	set theText to theText as text
end repeat
theText

–Edited to correct the delimiter option

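In my tests, Mail.app doesn’t conflict with the do shell script command. I think the problem with the OP’s original script is that the sed command itself is written incorrectly: the message body is appended to the command unquoted, so the shell parses it as further arguments instead of passing it to sed as input. For example, this runs inside a Mail tell block: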


set Message_Body to "extra 





returns between 
lines

must 

go"

tell application "Mail"
	set Message_Body to do shell script "echo " & Message_Body's quoted form & " | tr -s " & linefeed's quoted form
end tell

Some other applications do conflict with the do shell script command, but you can still redirect the command inside the tell statement:


tell application "Mail"
	tell scripting additions to set Message_Body to do shell script "echo " & Message_Body's quoted form & " | tr -s " & linefeed's quoted form
end tell

NOTE: The text item delimiters method is always preferable, because it is the fastest way to achieve the same result. But Marc Anthony’s script doesn’t work properly, so here is a corrected script:


set Message_Body to "extra 





returns between 
lines

must 

go"

set cleanedText to {}
repeat with oItem in (Message_Body's paragraphs)
	if oItem as text ≠ "" then set end of cleanedText to oItem as text
end repeat

set {ATID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, linefeed}
set Message_Body to cleanedText as text
set AppleScript's text item delimiters to ATID
return Message_Body

No shell command should ever be scoped inside another application, even if it may work in this particular instance.

For what it’s worth, here’s a sed solution. There have to be line breaks after the ‘here’ lines, but the rest can be semicolon-separated if preferred.

set messageBody to "

extra 





linefeeds between, before, and after
lines

must 

go

"

set messageBody to (do shell script "sed '/^$/ {
	$! {
		:here
		N
		s/\\n$//
		$! t here
	}
}' <<<" & messageBody's quoted form)
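
To try the sed script outside AppleScript, it can be run directly in a shell. The sketch below wraps it in a function (squeeze_blank_lines is a made-up name, and the sample input is illustrative). Note that the \\n in the AppleScript string is written \n here, because no AppleScript string escaping sits in between.

```shell
# Hypothetical wrapper around the sed script above: collapses each run of
# blank lines (leading, inner, or trailing) into a single blank line.
squeeze_blank_lines() {
	sed '/^$/ {
	$! {
		:here
		N
		s/\n$//
		$! t here
	}
}'
}

# Illustrative input with blank lines before, between, and after the text.
# Command substitution drops the trailing newlines from the captured result.
result=$(printf '\n\nextra\n\n\n\nbetween\nlines\n\nmust\n\ngo\n\n\n' | squeeze_blank_lines)

printf '%s\n' "$result"
```

Unlike tr -s, this keeps one blank line wherever the input had one or more.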

Edit: Nicer context address at beginning of sed code.

Thank you all, for your help.

@KniazidisR - I was able to successfully drop your script into mine. Cheers.

I occasionally need to remove duplicate blank lines from a document and decided to run timing tests on Marc Anthony’s and KniazidisR’s text-item-delimiter scripts. Because the text I tested contained 4097 lines, I modified KniazidisR’s script to use script objects. The test results were 0.009 seconds for Marc Anthony’s script and 0.016 seconds for KniazidisR’s script, which are both excellent results. I’ve included my test of Marc Anthony’s script below.

use framework "Foundation"
use scripting additions

-- untimed code
set theText to "A line" & linefeed & linefeed & linefeed & linefeed
repeat 10 times
	set theText to theText & theText
end repeat

-- start time
set startTime to current application's CFAbsoluteTimeGetCurrent()

-- timed code
set cleanedText to removeBlankLines(theText)

on removeBlankLines(theText)
	set ATID to AppleScript's text item delimiters
	repeat until theText does not contain linefeed & linefeed
		set text item delimiters to linefeed & linefeed
		set theText to theText's text items
		set text item delimiters to linefeed
		set theText to theText as text
	end repeat
	set AppleScript's text item delimiters to ATID
	return theText
end removeBlankLines

-- elapsed time
set elapsedTime to (current application's CFAbsoluteTimeGetCurrent()) - startTime
set nf to current application's NSNumberFormatter's new()
nf's setFormat:("0.000")
set elapsedTime to ((nf's stringFromNumber:elapsedTime) as text) & " seconds"

-- result
elapsedTime --> 0.009 seconds

I also tested Nigel’s sed solution, and his script took 0.041 seconds.