I’d like to share some recent observations on a subtle aspect of how the clipboard handles literal line ending characters. I couldn’t find a previous discussion of it either on MacScripter or elsewhere on the internet. To recap the two characters involved:
linefeed character:
Unicode code point = 10, AppleScript constant = linefeed, representation in text strings for this discussion = [LF]
carriage return character:
Unicode code point = 13, AppleScript constant = return, representation in text strings for this discussion = [CR]
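Since the two characters are invisible in most result panes, it can help to make them visible in the [LF]/[CR] notation used here. The following is only a minimal sketch: showLineEndings is a hypothetical handler name of my own, not a built-in command, and it relies on nothing more than AppleScript's text item delimiters.

```applescript
-- Hypothetical helper (not a built-in): replace each literal line ending
-- character with a visible marker so the two kinds can be told apart.
on showLineEndings(theText)
	set savedDelimiters to AppleScript's text item delimiters
	-- Replace linefeeds (code point 10) first.
	set AppleScript's text item delimiters to linefeed
	set theParts to text items of theText
	set AppleScript's text item delimiters to "[LF]"
	set theText to theParts as text
	-- Then replace carriage returns (code point 13).
	set AppleScript's text item delimiters to return
	set theParts to text items of theText
	set AppleScript's text item delimiters to "[CR]"
	set theText to theParts as text
	-- Restore the delimiters so the rest of the script is unaffected.
	set AppleScript's text item delimiters to savedDelimiters
	return theText
end showLineEndings

showLineEndings("v" & linefeed & "w" & return & "x")
--> "v[LF]w[CR]x"
```

Another quick check is id of someText, which returns the list of code points, so a 10 or a 13 shows up directly.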
First, let’s start with the expected behavior. When text containing literal line ending characters is saved to the clipboard with the Standard Additions command set the clipboard to, and then retrieved with the command the clipboard, the line ending characters are preserved in each of the encodings tested:
set the clipboard to "v" & linefeed & "w" & return & "x" & return & "y" & linefeed & "z"
the clipboard --> "v[LF]w[CR]x[CR]y[LF]z"
the clipboard as «class ut16» --> "v[LF]w[CR]x[CR]y[LF]z"
the clipboard as «class utf8» --> "v[LF]w[CR]x[CR]y[LF]z"
Now, the unexpected behavior. When the same text is selected by the user and then copied to the clipboard by a Command-C key press or by clicking on an application menu’s Copy menu item (typically in the Edit menu), the following results are observed:
the text selected and copied by the user:
v[LF]w[CR]x[CR]y[LF]z
the clipboard --> "v[CR]w[CR]x[CR]y[CR]z"
the clipboard as «class ut16» --> "v[LF]w[LF]x[LF]y[LF]z"
the clipboard as «class utf8» --> "v[LF]w[CR]x[CR]y[LF]z"
Thus, for text copied to the clipboard via a Command-C key press or a Copy menu item click, the plain form of the command, the clipboard, converts all line endings to carriage return characters, and the UTF-16 form, the clipboard as «class ut16», converts all line endings to linefeed characters. Only the UTF-8 form, the clipboard as «class utf8», preserves the original line ending characters. Incidentally, the ASObjC approach to retrieving clipboard contents via the NSPasteboard class also preserves the original line endings:
use framework "Foundation"
use framework "AppKit" -- NSPasteboard is defined in AppKit
use scripting additions
(current application's NSPasteboard's generalPasteboard()'s stringForType:(current application's NSPasteboardTypeString)) as text
--> "v[LF]w[CR]x[CR]y[LF]z"
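For anyone who wants to repeat the comparison on their own machine, here is a rough sketch that counts the linefeeds and carriage returns returned by each of the three Standard Additions forms. It assumes some text has just been copied with Command-C and that the clipboard holds all three text flavors (the «class ut16» and «class utf8» coercions will raise an error otherwise). The handler name countOf and the record labels are my own inventions for illustration.

```applescript
-- Hypothetical helper: count how many times a single character occurs in a text.
on countOf(theChar, theText)
	if theText is "" then return 0
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theChar
	-- n occurrences of the delimiter split the text into n + 1 items.
	set n to (count of text items of theText) - 1
	set AppleScript's text item delimiters to savedDelimiters
	return n
end countOf

-- Retrieve the same clipboard contents in the three forms discussed above.
set plainText to the clipboard
set utf16Text to the clipboard as «class ut16»
set utf8Text to the clipboard as «class utf8»

-- One record per form, with the number of each line ending character found.
{{retrieval:"the clipboard", lfCount:countOf(linefeed, plainText), crCount:countOf(return, plainText)}, ¬
	{retrieval:"ut16", lfCount:countOf(linefeed, utf16Text), crCount:countOf(return, utf16Text)}, ¬
	{retrieval:"utf8", lfCount:countOf(linefeed, utf8Text), crCount:countOf(return, utf8Text)}}
```

If the behavior described above holds on your system, the first record should show only carriage returns, the second only linefeeds, and the third the original mix.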
While the particular form of line ending character will have no impact in most scripting situations, one situation where it may be critically important is in the processing of text strings by Unix shell commands such as echo, sed, etc. NSRegularExpression pattern matching could also be affected depending on the specifics of the search pattern. For these applications where line ending type may be important, one should consider using the UTF-8 form, the clipboard as «class utf8», or ASObjC’s NSPasteboard class, for retrieving text copied to the clipboard by the user.
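As one possible way to apply this when handing user-copied text to a shell command, the sketch below reads the UTF-8 form and then normalizes every line ending to a linefeed before calling do shell script. Whether to normalize at all is a judgment call that depends on the task; replaceText is a hypothetical handler of my own, and the sed command is just an arbitrary stand-in for whatever shell processing is actually needed.

```applescript
-- Hypothetical helper: replace every occurrence of findText in sourceText.
on replaceText(findText, newText, sourceText)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to findText
	set theParts to text items of sourceText
	set AppleScript's text item delimiters to newText
	set resultText to theParts as text
	set AppleScript's text item delimiters to savedDelimiters
	return resultText
end replaceText

-- Read the UTF-8 form, which in the tests above kept the original line endings.
set rawText to the clipboard as «class utf8»

-- Normalize to linefeeds: CR+LF pairs first, then any remaining lone CRs.
set unixText to replaceText(return & linefeed, linefeed, rawText)
set unixText to replaceText(return, linefeed, unixText)

-- Example pipeline (an arbitrary choice): prefix each line with "> " using sed.
do shell script "printf '%s' " & quoted form of unixText & " | sed 's/^/> /'" without altering line endings
```

Note that do shell script by default converts the line endings of the command's output to carriage returns; the without altering line endings parameter used here leaves the output as the shell produced it.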