Custom URL Offset

How can the script below be modified to handle any TLD (.com, .net, .org, etc.)?

if pageURL contains "instapaper.com" then
	set siteURL to text ((offset of "%3A%2F%2F" in pageURL) + 9) thru ((offset of ".com%2F" in pageURL) + 3) of pageURL
	set s to s & "via <a href=\"" & pageURL & "\" target=\"_blank\">" & siteURL & "</a>"
else
	set siteURL to text ((offset of "http://" in pageURL) + 7) thru ((offset of ".com/" in pageURL) + 3) of pageURL
	set s to s & "via <a href=\"" & pageURL & "\" target=\"_blank\">" & siteURL & "</a>"
end if

The end result should be a value in string s like this:

via <a href …>sitename.yyy

Here yyy is the TLD, and this should work whether the URL comes from an Instapaper link or a standard link. Examples below:

Instapaper Link
http://www.instapaper.com/text?u=http%3A%2F%2Fmacscripter.net%2Fviewtopic.php%3Fid%3D14134

Standard Link
http://macscripter.net/viewtopic.php?id=14134

Hi,

Try this. It unescapes the URL and uses text item delimiters to extract the host:


set pageURL to "http://www.instapaper.com/text?u=http%3A%2F%2Fmacscripter.net%2Fviewtopic.php%3Fid%3D14134"

set unescapedURL to do shell script "perl -e 'use URI::Escape; print uri_unescape(\"" & pageURL & "\")';"
set oldDelims to text item delimiters
set text item delimiters to "http://"
set theSite to text item 3 of unescapedURL
set text item delimiters to "/"
set theSite to text item 1 of theSite
set AppleScript's text item delimiters to oldDelims
set s to "via <a href=\"" & pageURL & "\" target=\"_blank\">" & theSite & "</a>"


I'm getting an error on the line below: “can't set delimiters on…”

set oldDelims to text item delimiters

Also, does this handle both Instapaper and non-Instapaper URLs?

Are you using the code within an application tell block?
If so, write AppleScript's text item delimiters instead of text item delimiters. Inside a tell block the bare term is sent to the target application, which has no such property.
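
As a rough sketch of what that looks like inside a tell block (Finder is only a stand-in here, since the thread does not say which application the script targets, and theHost is an illustrative name):

```applescript
-- Assumption: "Finder" stands in for whatever application the real script talks to.
tell application "Finder"
	-- Address the delimiters through AppleScript itself, not the target application.
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	-- Splitting on "/" gives {"http:", "", "macscripter.net", ...}, so the host is item 3.
	set theHost to text item 3 of "http://macscripter.net/viewtopic.php?id=14134"
	-- Always restore the previous delimiters.
	set AppleScript's text item delimiters to oldDelims
end tell
```

Outside a tell block, the shorter text item delimiters form works as in the original script.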

The script takes the text between the second occurrence of http:// and the next slash.
If there is only one occurrence of http:// in the string, write

set theSite to text item 2 of unescapedURL
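
If the same script has to cope with both kinds of link, one possible approach is to take the last text item instead of a fixed index, so the choice between item 2 and item 3 goes away. The following is only a sketch: hostOfURL is a made-up handler name, it splits on "://" rather than "http://" so that https links are covered too, and it assumes the Perl URI::Escape module used above is available and that the target URL is the last "://" in the string.

```applescript
-- Hypothetical handler; returns the host of a plain or Instapaper-wrapped URL.
on hostOfURL(pageURL)
	-- Unescape with Perl's URI::Escape, passing the URL as a quoted shell argument.
	set unescapedURL to do shell script "perl -MURI::Escape -e 'print uri_unescape($ARGV[0])' " & quoted form of pageURL
	set oldDelims to AppleScript's text item delimiters
	-- Everything after the last "://" belongs to the target URL.
	set AppleScript's text item delimiters to "://"
	set theSite to last text item of unescapedURL
	-- The host is whatever comes before the first slash.
	set AppleScript's text item delimiters to "/"
	set theSite to text item 1 of theSite
	set AppleScript's text item delimiters to oldDelims
	return theSite
end hostOfURL

set s to "via <a href=\"" & "http://macscripter.net/viewtopic.php?id=14134" & "\" target=\"_blank\">" & hostOfURL("http://macscripter.net/viewtopic.php?id=14134") & "</a>"
```

With either of the two example links from the question, the handler should return macscripter.net. A standard link that carries another unescaped "://" in its query string would defeat this shortcut.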

Perfect, it works, thank you!!!