How can I get a link from some text?

Hi everyone,

I’m trying to extract a link from some plain text. For instance:

set theText to "Check out this awesome site. http://macscripter.net/ AppleScript rocks!"

I want the result to be:

http://macscripter.net/

I’m hoping this is simple, but I’ve tried every way I can find and am still having no luck.

Thanks for your help,
PeterVk.

Hi PeterVk

try this

tell application "Finder"
	set TheText to "Check out this awesome site. http://macscripter.net/ AppleScript rocks!"
	set _Result to text 30 thru -20 of TheText
	display dialog _Result
end tell

Hi Budgie,

Thanks for the reply. It strikes me now that my initial question wasn’t phrased very clearly.

I need the script to work on any text. In other words, ‘theText’ is a variable (it’s coming into the script from a different application).

If I try:

set theText to "Check out this awesome site. http://macscripter.net/ AppleScript rocks!"
set theWords to every word of theText

The result is:

{"Check", "out", "this", "awesome", "site", "http", "macscripter", "net", "AppleScript", "rocks"}

If the result wasn’t broken down this way I could run a repeat like:

repeat with theWord in theWords
	if theWord starts with "http" then
		tell application "Safari" to open location theWord
	end if
end repeat

Obviously that won’t work because of how the text is broken down. I’m still searching for a solution.

Cheers.

You want this to work for any sentence that contains a URL? Will the URL always start with “http://” and end with a “/”? Otherwise, checking that a URL is valid is not a trivial pursuit.

Hi Adam,

I can assume for now that every URL will begin with http but not that it will end with ‘/’.

I’m thinking that it will be easier to work with the characters of theText rather than the words. Still, I’m having trouble conceptualizing how to extract the links using the characters.

Thanks for the response.

You could just check for expected characters after the URL text and split with text item delimiters.

Something like:

-- John Maisey -- 29 Jan 2009 -- http://www.nhoj.co.uk --

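-- The text you're looking at.
set myText to "Check out this awesome site. http://macscripter.net/ AppleScript rocks!"
-- Holder for the result.
set myResult to {}
-- Characters that you expect to follow the end of the URL. You may want to adjust this list.
-- "." and "," are left out because splitting on them would cut the host name short;
-- trailing punctuation is trimmed separately below.
set terminatorList to {" ", ">", ")", return, tab}
-- Split by "http://" and only get the second to the last parts.
set myList to my textItemDelimiterSplit(myText, "http://", 2, -1)
-- Loop through the list returned.
repeat with myItem in myList
	-- Work on plain text rather than on a reference into the list.
	set myItem to contents of myItem
	-- Loop through the possible ending characters.
	repeat with myTerminator in terminatorList
		-- Keep only what comes before the ending character.
		-- The handler returns a list, so coerce it back to text for the next pass.
		set myItem to (my textItemDelimiterSplit(myItem, contents of myTerminator, 1, 1)) as text
	end repeat
	-- Trim sentence punctuation left on the end of the URL.
	repeat while (myItem ends with ".") or (myItem ends with ",")
		if (count of myItem) is 1 then
			set myItem to ""
		else
			set myItem to text 1 thru -2 of myItem
		end if
	end repeat
	-- Add the result to the list.
	set end of myResult to "http://" & myItem
end repeat
-- Return the list.
return myResult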

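--TID splitter
-- Returns text items theFrom thru theTo of theText as a list, or {} if that range does not exist.
on textItemDelimiterSplit(theText, theTID, theFrom, theTo)
	try
		set my text item delimiters to theTID
		set myReturn to text items theFrom thru theTo of theText
		set my text item delimiters to {""}
		return myReturn
	on error
		-- Restore the delimiters before giving up so the failure does not affect later code.
		set my text item delimiters to {""}
		return {}
	end try
end textItemDelimiterSplit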

Best wishes

John M

John:

I stand corrected - this was not such a simple task. I just tested your script and it works beautifully. It’s exactly what I was after.

Thanks so much.

Incidentally, I also received a reply to my question from a friend on Twitter (http://twitter.com/golbin) which also works and is somewhat different. If you’re interested you can see it here: http://golbin.net/test/extractURL.applescript.txt. It doesn’t use text item delimiters.

Thanks again.

A word of warning. Your friend’s script fails for the example below and for any sentence that ends with a URL. John Maisey’s version gets them successfully no matter where they fall in the sentence. In the example below, add something after the last URL (beginning with a space) and the script works; e.g. “Check out this awesome site. http://macscripter.net/ http://studio.golbin.net AppleScript rocks! Also see http://www.journler.com and http://forums.macosxhints.com/. Got that?” It’s the " Got that" ending that saves it.

set theText to "Check out this awesome site. http://macscripter.net/ http://studio.golbin.net AppleScript rocks! Also see http://www.journler.com and http://forums.macosxhints.com/"
set startWord to "http"

set urlList to {}
repeat
	set startOffset to the offset of startWord in theText
	if startOffset is 0 then exit repeat
	
	set offsetCount to startOffset
	set theURL4list to ""
	repeat
		set theURL to item offsetCount of theText
		if theURL is " " then
			set urlList to urlList & theURL4list
			exit repeat
		else
			set theURL4list to theURL4list & theURL
		end if
		set offsetCount to offsetCount + 1
	end repeat
	set theText to items offsetCount thru -1 of theText as string
end repeat

urlList
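
For anyone who prefers the character-scanning approach, here is a rough sketch of one way the loop above could be made to cope with a URL at the very end of the text. It is not the friend's script, just an illustration of the same idea. The handler name extractURLs and the whitespaceList variable are made up for this example, and it assumes that every URL starts with "http" and runs until the next whitespace character or the end of the text.

```applescript
-- Hypothetical handler (the name is an assumption for this sketch).
-- Assumes a URL starts with "http" and ends at whitespace or at the end of the text.
on extractURLs(theText)
	set urlList to {}
	set whitespaceList to {space, tab, return, linefeed}
	repeat
		set startOffset to offset of "http" in theText
		if startOffset is 0 then exit repeat
		-- Walk forward until the next character is whitespace or the text runs out.
		set endOffset to startOffset
		repeat while (endOffset < (count of theText)) and ((character (endOffset + 1) of theText) is not in whitespaceList)
			set endOffset to endOffset + 1
		end repeat
		set end of urlList to text startOffset thru endOffset of theText
		-- A URL that reaches the end of the text is the last one; otherwise keep scanning after it.
		if endOffset = (count of theText) then exit repeat
		set theText to text (endOffset + 1) thru -1 of theText
	end repeat
	return urlList
end extractURLs

extractURLs("Check out this awesome site. http://macscripter.net/ http://studio.golbin.net AppleScript rocks! Also see http://www.journler.com and http://forums.macosxhints.com/")
```

The difference from the loop above is that the end of the text is checked before each character is read, so the scan stops cleanly instead of running past the last character. Punctuation directly after a URL (such as a closing full stop) is still kept as part of the result and would need trimming separately.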