How to get HTML data from the clipboard into an AppleScript variable

This is more a tip than a question, but I had to struggle with it, and didn’t find anything online about how to do it, so here goes:

When you copy text from a browser, the clipboard contains a record with the text you copied as “«class HTML»”, “«class utf8»”, “«class ut16»”, string, and Unicode text. You can view the record by running the code:

return the clipboard as record

The problem is that the HTML data (the data that contains the HTML tags as well as the text that was copied) is binary data. To create an AppleScript variable with actual HTML in it, you will need to write the clipboard data to a text file and then read it back into a variable.


set thex to «class HTML» of (the clipboard as record)
-- This will trigger an error if you've copied something other than HTML data
set f to (path to "temp" from user domain as text) & "temp.txt"
-- Writes to a file named "temp.txt" in "Macintosh HD:Users:[userid]:Library:Caches:TemporaryItems:temp.txt"
set newFile to open for access file f with write permission
set eof of newFile to 0
write thex to newFile
close access newFile
set newFile to open for access file f
set theHTML to read newFile
-- Close the file again so the file reference is not left open
close access newFile

Hi,

It looks like this is specific to Firefox. Safari does not have the “«class HTML»” key.

In Safari you can do something like this:

tell application "Safari"
	source of document of front window
end tell

Best wishes

John M

It doesn’t; it has «class weba» instead. You can substitute «class weba» for «class HTML» in the script above and get a variable with code. The data returned from «class weba» has styles included in it, whereas «class HTML» data is just HTML.

Firefox HTML («class HTML»)
<meta http-equiv="content-type" content="text/html; charset=utf-8"><a onclick='s_objectID="http://www.macworld.com/article/143970/2009/11/core15_imac.html?lsrc=top_1_2";return this.s_oc?this.s_oc(e):true' href="http://www.macworld.com/article/143970/2009/11/core15_imac.html?lsrc=top_1" class="more">Learn more

Safari HTML («class weba»)
“bplist00”_WebMainResource’
_WebResourceTextEncodingName_WebResourceData_WebResourceFrameName_WebResourceMIMEType^WebResourceURLUUTF-8Oä<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Times; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><span class="Apple-style-span" style="color: rgb(51, 51, 51); font-family: 'Lucida Grande', 'Lucida Sans Unicode', Arial, Verdana, sans-serif; font-size: 12px; line-height: 16px; "><a class="more" href="http://www.macworld.com/article/143970/2009/11/core15_imac.html?lsrc=top_1" onclick="s_objectID="http://www.macworld.com/article/143970/2009/11/core15_imac.html?lsrc=top_1_2";return this.s_oc?this.s_oc(e):true" style="color: rgb(0, 136, 204); text-align: right; text-decoration: underline; white-space: nowrap; padding-right: 10px; background-image: url(http://images.apple.com/global/elements/arrows/morearrow_08c.gif); background-repeat: no-repeat; background-attachment: initial; -webkit-background-clip: initial; -webkit-background-origin: initial; background-color: initial; zoom: 1; background-position: 100% 50%; ">Learn morePYtext/html_http://www.apple.com/startpage/

Anyone have any idea how to do the OPPOSITE?

I want to create an HTML link in AppleScript that will paste into Postbox (Thunderbird).

I tried

set the clipboard to {«class HTML»:"<a href=\"http://URL.com\">TEXT</a>"}

But when I paste, it just returns plain text.

I’ve used this for some time. I don’t know about Postbox, but this code will produce a link you can paste into any RTF-aware application.

-- http://macscripter.net/viewtopic.php?id=26498 post #13
on createRTFlink(linkURL, linkTitle)
	set HTML to "<font size=4><a href=\"" & linkURL & "\">" & linkTitle & "</a></font>"
	do shell script "echo " & quoted form of HTML & " | textutil -format html -convert rtf -stdin -stdout | pbcopy -Prefer rtf"
end createRTFlink
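
If RTF is not enough and the target application wants HTML on the clipboard, one thing that may be worth trying is handing AppleScript raw data rather than a string, since «class HTML» holds binary data, as noted at the top of the thread. A data literal has the form «data HTML» followed by the bytes written as hex digits. Below is a minimal sketch in Python that only builds the text of such a literal. The function name is hypothetical, UTF-8 is assumed, and this is untested against Postbox.

```python
def html_to_data_literal(html: str) -> str:
    """Build the text of an AppleScript «data HTML…» literal.

    Hypothetical helper for illustration; assumes the HTML should be
    stored as UTF-8 bytes.
    """
    # Each byte becomes two uppercase hex digits, e.g. "<" -> "3C".
    hex_bytes = html.encode("utf-8").hex().upper()
    return "«data HTML" + hex_bytes + "»"


if __name__ == "__main__":
    print(html_to_data_literal('<a href="http://URL.com">TEXT</a>'))
```

The resulting text would still have to be compiled as part of a script (for example, one passed to osascript) before it could be set as the clipboard contents.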

Thanks for sharing. It is the only solution I have found so far to get the actual HTML code from the clipboard.

Have you, by chance, found a way to do this without writing to a file?

Hey Michael,

You’ve figured out by now that it’s not necessary to write to a file, yes?

Yes. Thanks for prompting me to update on this.

I have learned how to get HTML code from the clipboard into an AppleScript variable without writing to a file, thanks to this post on Stack Overflow:
How to get HTML data out of of the OS X pasteboard / clipboard?

Here’s the code, which I adapted for AppleScript:


set strCMD to "osascript -e 'the clipboard as «class HTML»' |   perl -ne 'print chr foreach unpack(\"C*\",pack(\"H*\",substr($_,11,-3)))'"
set strHTML to (do shell script strCMD)
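
For anyone wondering what the perl one-liner is doing: osascript prints the clipboard data as a hex literal of the form «data HTML3C6D…», and the pipeline strips that wrapper (the substr call) and turns each pair of hex digits back into one byte. Here is a minimal sketch of the same decoding step in Python, purely as an illustration. The function name is hypothetical, and it assumes the payload is UTF-8 text.

```python
import re


def decode_clipboard_literal(raw: str) -> str:
    """Decode an osascript «data HTML…» hex literal into text.

    Hypothetical helper for illustration; assumes the bytes are UTF-8.
    """
    # Capture only the hex digits between the "«data HTML" prefix and "»".
    match = re.search(r"«data HTML([0-9A-Fa-f]*)»", raw)
    if match is None:
        raise ValueError("no «data HTML…» literal found")
    # Two hex digits per byte, then interpret the bytes as UTF-8.
    return bytes.fromhex(match.group(1)).decode("utf-8")


if __name__ == "__main__":
    sample = "«data HTML3C623E68693C2F623E»"
    print(decode_clipboard_literal(sample))
```

Matching on the wrapper text rather than on fixed offsets means the sketch does not depend on how many bytes the « and » characters occupy in the output.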