Focusing Safari tabs and waiting for page to load completely

Ok so… I’m always having issues with Safari reporting that a webpage has fully loaded when it hasn’t. I’ve searched everywhere for a tried-and-true method of detecting when a page has really finished loading. Most answers I’ve found on the internet suggest using something like:

tell application "Safari"
	tell the front document
		set the_state to (do JavaScript "document.readyState")
		repeat until the_state is not missing value
		end repeat
	end tell
end tell

Or some similar variation. But this simply doesn’t work anymore.
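
For comparison, here is a minimal sketch of the readyState approach with the query moved inside the loop, so the value is re-read on every pass instead of once. It assumes Safari has a front document and that "Allow JavaScript from Apple Events" is enabled in Safari's Develop menu. The name maxAttempts and the 0.2 second interval are illustration values, not something from the answers mentioned above. Whether "complete" is reported reliably on very slow pages is not something this sketch settles.

```applescript
-- Sketch only: poll document.readyState until it reports "complete".
-- maxAttempts is a hypothetical cap; with the 0.2 s delay it is roughly 30 seconds.
set maxAttempts to 150

tell application "Safari"
	repeat with attempt from 1 to maxAttempts
		-- Run the JavaScript again on every pass so the state can change
		set the_state to (do JavaScript "document.readyState" in front document)
		if the_state is "complete" then exit repeat
		delay 0.2
	end repeat
end tell
```

If the loop runs out of attempts it simply falls through, so a caller that needs certainty would have to check the_state afterwards.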

The following method works for Safari… usually. Sometimes I get a weird “Connection is invalid” error, but I think I’ve worked around it so that (hopefully) it shouldn’t be too much of a problem. I haven’t seen the error for a while now, but I can’t really say it’s been fixed, because I don’t know what was causing it in the first place.

So anyway, this is what I’ve got. Basically, this opens Safari to a specific URL. If Safari is closed, the script launches it and sets the first (and only) window/tab to the URL. If Safari is running but no window is present, a new window is created and the URL is loaded in that window. Then the script checks for duplicate tabs and selects one to focus. Any additional tabs are left alone.

The last part of the script waits for the source of the page to be loaded before continuing.

repeat until(length of source as text) is not 0

This seems to work better than anything else I’ve ever tried. I tested it on a site with a verrrry slow-loading page. To confirm the page is fully loaded, I added one last JavaScript call at the bottom to get the innerText of the page. If a page is not fully loaded, this should return “”. Otherwise, once the page loads, the text content of the page is displayed. This last JavaScript command is not needed for anything but testing purposes.

Some of this is real hacky but it seems to work. If anyone has any suggestions or improvements, I’d love to hear your thoughts on this. And if anyone knows what causes that weird “Connection is invalid” message and how to stop it, that would be awesome. Thanks!

set theURL to "https://translate.google.com"

if application "Safari" is not running then
	launch application "Safari"
	tell application "Safari"
		open location theURL
	end tell
end if

if not (exists window of application "Safari") then
	tell application "Safari"
		open location theURL
	end tell
end if

set text item delimiters to "\n"
tell application "Safari" to tell window 1
	set tab_index to get the index of tabs whose URL contains theURL
	if tab_index is {} then
		open location theURL
	else
		set my_tab to tab_index
		set this_tab to do shell script "echo " & my_tab & " | head -n1 " as string
		set the_tab to this_tab as list
		set the current tab to tab the_tab
	end if
end tell

delay 1

tell application "Safari"
	tell front document
		activate
		get the source as text
		set the_source to the result
		repeat until the length of (get the source as text) is not 0
		end repeat
		delay 0.5
		set theText to do JavaScript "document.documentElement.innerText"
	end tell
end tell

It seems that a lot of code is useless.

I got the same behavior with:

set theURL to "https://translate.google.com"

tell application "Safari"
	activate
	open location theURL
	--end tell
	
	--set text item delimiters to linefeed
	
	--tell application "Safari"
	(*
	tell window 1
		set tab_index to get the index of tabs whose URL contains theURL
		if tab_index is {} then
			open location theURL
		else
			set my_tab to tab_index
			set this_tab to do shell script "echo " & my_tab & " | head -n1 " as string
			set the_tab to this_tab as list
			set the current tab to tab the_tab
		end if
	end tell
	*)
	delay 1
	
	--tell application "Safari"
	tell front document
		-- activate
		(*
		set the_source to source as text
		set theLength to -1
		repeat
			if (count the_source) = theLength then exit repeat
			set theLength to count the_source
		end repeat
		*)
		delay 0.5
		set theText to do JavaScript "document.documentElement.innerText"
	end tell
end tell

Yvan KOENIG running Sierra 10.12.5 in French (VALLAURIS, France) Friday, 19 May 2017 20:49:59

But have you tried with a slow loading page?

The code is not useless. And you do not get the same behavior with your version. Go to any page that takes a long time to load. I cannot post the page I used in my tests because it would not be appropriate in a public forum. But without:

repeat until the length of (get the source as text) is not 0

or some other method of determining if the page is fully loaded, the final test:

set theText to do JavaScript "document.documentElement.innerText"

will fail on any slow loading page.

Also the version you posted will reload the webpage even if it’s already open in the browser. The version I posted checks for multiple tabs of the same page and then, if the page is already open in the browser, focuses one of the tabs on that page. If multiple tabs exist of the same page, the tab closest to the beginning of tabs is selected. If the page does not exist in the browser, then a new tab is created with that page.
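
The tab selection described here can also be sketched without the shell call, by taking the first index straight from the list. This is an untested variation, not the script posted in this thread, and it assumes Safari is running with at least one window open. The variable names follow the original script.

```applescript
-- Sketch only: focus the first tab whose URL contains theURL, or open the URL.
-- Assumes Safari already has a browser window (window 1) open.
set theURL to "https://translate.google.com"

tell application "Safari"
	tell window 1
		-- Returns a list of tab indexes, e.g. {2, 5}, or {} when nothing matches
		set tab_index to get the index of tabs whose URL contains theURL
		if tab_index is {} then
			open location theURL
		else
			-- item 1 is the match closest to the beginning of the tab bar
			set the current tab to tab (item 1 of tab_index)
		end if
	end tell
end tell
```

Using item 1 of the list avoids changing text item delimiters and the round trip through do shell script.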

So…

Not quite.

If I read it correctly, the code you posted is the same as mine once the disabled instructions are removed.
If you read what I posted carefully, you will see that it contains a piece of code which, if you enable it, waits for the load process to complete.

Yvan KOENIG running Sierra 10.12.5 in French (VALLAURIS, France) Saturday, 20 May 2017 09:02:09

Oh, my apologies. I only saw the commented-out text and didn’t realize you had put updated code in there. Color me embarrassed. However, the final test, set theText to do JavaScript "document.documentElement.innerText", still fails on a slow-loading webpage using this bit of code:

set the_source to source as text
set theLength to -1
repeat
   if (count the_source) = theLength then exit repeat
   set theLength to count the_source
end repeat

But this seems to work every time:

tell application "Safari"
	activate
	tell front document
		repeat until the length of (get the source as text) is not 0
		end repeat
		delay 0.5
		set theText to do JavaScript "document.documentElement.innerText"
	end tell
end tell

I wish I could post the url of the page I’m using to test this because it is probably one of THE slowest webpages in the history of the internet. :lol:

I was getting some “Connection is invalid” errors previously but those seem to have been resolved. I think the placement of activate has something to do with it?
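
A note on the length-comparison loop quoted above: it reads source once, before the repeat, so the_source never changes and the loop exits on its second pass whether or not the page has finished loading. A rough sketch of the same idea with the source re-read on every pass follows. It assumes the source length can change while the page loads, which this thread does not confirm. The names maxAttempts and stablePasses and the 0.5 second interval are made up for illustration.

```applescript
-- Sketch only: wait until the source is non-empty and its length stops changing.
-- maxAttempts and stablePasses are hypothetical illustration values.
set maxAttempts to 120
set stablePasses to 3

tell application "Safari"
	tell front document
		set lastLength to -1
		set stableCount to 0
		repeat with attempt from 1 to maxAttempts
			-- Re-read the source each pass; a single read before the loop never changes
			set currentLength to length of (get the source as text)
			if currentLength > 0 and currentLength = lastLength then
				set stableCount to stableCount + 1
			else
				set stableCount to 0
			end if
			if stableCount >= stablePasses then exit repeat
			set lastLength to currentLength
			delay 0.5
		end repeat
	end tell
end tell
```

The cap on attempts keeps the script from spinning forever if the window is closed or the page never loads.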

If you don’t want to make it visible to everybody, maybe you could send me the link in a private message.

Yvan KOENIG running Sierra 10.12.5 in French (VALLAURIS, France) Saturday, 20 May 2017 10:12:38

I didn’t even know we could do that! :lol:

Thanks.
I have never come across a site behaving this way.
My loop gave good results with every site I knew.

It fails with yours.

Yvan KOENIG running Sierra 10.12.5 in French (VALLAURIS, France) Saturday, 20 May 2017 10:46:10

After playing around a little more I think I figured out what causes the Connection is invalid error. It seems that running the script without a delay present will cause the error to occur. But it’s weird because I was about to write that the error only occurs when Safari is running but no windows are open. Because that is the behavior I’ve been observing. But then I decided to check again and suddenly the error occurs if Safari is running, if it is not running, and if windows are present or not present. So maybe tomorrow the error will occur if the wind blows east… who knows?

But in any case, it seems that having a delay, even delay 0.1, prevents the error. I’ve taken the safe route, though, and made the delay longer at 0.2 seconds. :lol: As for why this error occurs without a delay, I’m really not sure.

So in case anyone tries to use this and runs into the same problem, the delay (as far as I can tell) is critical.

Also there is no need to set a variable from the source as I had in my original post. From what I can tell, this can actually make the script hang if the window is closed before the variable is set.

Here is an updated version of the script with some of the unnecessary parts removed. I’ve left comments throughout for anyone new to AppleScript. I know firsthand how confusing it can be.

Any and all critiques are welcome.

set theURL to "https://translate.google.com/"

-- If Safari is not running.. obviously :)
if application "Safari" is not running then
	launch application "Safari"
	tell application "Safari"
		open location theURL
	end tell
end if

-- If Safari is running but no windows are present
if not (exists window of application "Safari") then
	tell application "Safari"
		open location theURL
	end tell
end if

-- This next part seems a little hacky to me but it works
-- There is probably a better way of doing this
-- But as I said it works so.. ¯\_(ツ)_/¯ 

set text item delimiters to "\n"

-- Our URL being present in one or more of the tabs will result in a list of numbers,
-- with each number representing an individual tab, in the form of {1,2,3}
tell application "Safari" to tell window 1
	set tab_index to get the index of tabs whose URL contains theURL
	
	-- Our URL not being present in any of the tabs
	-- will result in an empty list or "{}" 
	-- So if our list is empty, open a new tab with our chosen URL
	if tab_index is {} then
		open location theURL
	else
		
		-- If the same URL is present in more than one tab
		-- the "tab_index" variable will be set to a list of numbers 
		-- We only need one of these numbers to focus a tab
		-- so to avoid any errors occurring we remove all but one number from our list
		set a_tab to do shell script "echo " & tab_index & " | head -n1 " as string
		
		-- AppleScript seems to like lists more than strings
		-- So we set our tab number back to the form of a list 
		set the_tab to a_tab as list
		
		-- And finally we focus the tab
		set the current tab to tab the_tab
	end if
end tell

-- A short delay is important here. 
-- No delay will result in an error
delay 0.2

tell application "Safari"
	tell front document
		activate
		
		-- Make certain the webpage has finished loading before continuing with the script
		repeat until the length of (get the source as text) is not 0
		end repeat
		
		-- Test if the page has fully loaded
		-- If successful, the source of the page will be displayed
		-- If failure, the result will be ""
		source
	end tell
end tell

Hey Guys,

I posted a couple of things to the AppleScript Users List that are somewhat relevant to this discussion:

Getting Tab Index and Window ID in Safari

Safari - Bring a tab containing a given URL to the front

-Chris

set text item delimiters to "\n"
set theURL to "http://www.angelplayground.com/"
tell application "Safari"
	if not (exists front document) then reopen
	tell window 1
		if miniaturized is true then set miniaturized to false
		set tab_index to get the index of tabs whose URL contains theURL
		if tab_index is {} then
			open location theURL
		else
			set a_tab to do shell script "echo " & tab_index & " | head -n1 " as string
			set the_tab to a_tab as list
			set the current tab to tab the_tab
		end if
		activate
		delay 0.5
		tell current tab
			repeat until the length of (get the source as text) is not 0
			end repeat
			source
		end tell
	end tell
end tell

I think this version cuts out some of the unnecessary pieces in my previous posts. If anyone sees anything that could be better optimized or just better in general please let me know.

Also the url in the script now points to a family friendly slow loading webpage. Apparently they sell playground equipment. I got this page from a list of the slowest pages on the internet. This list can be found at http://internetsupervision.com/scripts/urlcheck/report.aspx?reportid=slowest

One thing I still don’t understand is the need for a short delay. It’s the only delay in the script now. The behavior I’ve observed is this:

Window exists
Safari is running normally with at least one browser window open:

  • No need for a delay.

Window does not exist
Safari does not have a browser window open:

  • A delay is necessary to avoid a “Connection is invalid” error.

I use a 0.5 second delay. But I’ve tested it successfully many times with a 0.1 second delay. I don’t understand what is happening that causes this error to occur, or what magic happens in those 0.1 seconds to avoid the error. At first I thought it had something to do with the window needing time to be created. But the window should have already been created by this point, right?

I really have no idea so if someone can shed some light on that, I would really appreciate it.
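
One defensive option, offered only as a sketch since the cause of the error is not established in this thread: wrap the first call to Safari in a try block and retry when error -609 comes back. -609 is the error number AppleScript reports for "Connection is invalid". The names maxAttempts and pageSource and the 0.2 second pause are arbitrary illustration values, and the sketch assumes a page has already been opened in Safari.

```applescript
-- Sketch only: retry when Safari answers with "Connection is invalid" (-609).
-- maxAttempts and pageSource are hypothetical names chosen for this example.
set maxAttempts to 25
set pageSource to ""

repeat with attempt from 1 to maxAttempts
	try
		tell application "Safari" to set pageSource to (source of front document) as text
		-- A non-empty source is the same "loaded" test used in the script above
		if (length of pageSource) is not 0 then exit repeat
	on error errMsg number errNum
		-- Swallow only the connection error; anything else is passed on unchanged
		if errNum is not -609 then error errMsg number errNum
	end try
	delay 0.2
end repeat
```

If every attempt fails, pageSource is still "" after the loop, so the caller can tell that the page never reported a source.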

Model: iMac late 2012
AppleScript: 2.5
Browser: Safari Version 10.1.1 (11603.2.5)
Operating System: Mac OS X (10.11.6)