Safari's source property doesn't update when more content is loaded

If I load the Facebook start page in Safari and copy-paste the source code into a text editor, it is 3 020 305 characters long (according to BBEdit). If I then scroll down to the bottom ten times to load more content and copy-paste the source again, it is STILL 3 020 305, despite much more content having been loaded.

Can someone explain this?

I am writing a script that needs to know when it has reached the end of a feed such as the Facebook feed. One way to detect that would have been to compare the length of the source property before and after scrolling, but the source doesn’t seem to update when I load more content, neither in the browser nor in the AppleScript source property. Is there some way around this?

A simplified version of my code:

tell application "Safari"
	set oldSource to ""
	set newSource to ""
	open location myURL
	tell document 1 to repeat 15 times
		set newSource to source
		if newSource = oldSource then
			exit repeat
		else
			set oldSource to newSource
		end if
		do JavaScript JSJumpToEnd
		delay 0.5
	end repeat
end tell

Perhaps the content is already loaded but you don’t see it because you haven’t scrolled to it yet. Or Safari’s source property is simply not updated when the DOM changes. In that case, I’d use (or rather, try to use) a JS event handler for DOMContentLoaded. Or read the content with JS after scrolling and waiting a bit.
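
A minimal sketch of that last idea, reading the length of the live DOM instead of Safari’s source property. The function name liveSourceLength is made up for illustration, and whether the length keeps growing on a given site is an assumption: a feed that removes off-screen posts from the DOM could keep it roughly constant.

```javascript
// Length of the live DOM serialized as HTML, in characters.
// `doc` is anything shaped like the browser's `document`; it is a
// parameter so the function can also be tried outside Safari.
function liveSourceLength(doc) {
  return doc.documentElement.outerHTML.length;
}

// In Safari this would be called as liveSourceLength(document),
// for example from the string handed to `do JavaScript`.
```

Comparing this number before and after each scroll would replace the comparison of source in the original loop.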

Your simplified code is too simplified to explain the problem.

What is it your code is doing/not doing that you don’t expect/expect?

If you’re waiting to ensure the page has finished loading, the standard method for this is via JavaScript:

tell application "Safari"
	tell document 1 to repeat
		-- poll until the page reports that loading has finished
		do JavaScript "document.readyState"
		if the result = "complete" then exit repeat
		delay 0.5
	end repeat
end tell

It’s good to know this technique, as long as you also know that it’s not especially reliable.


This is my solution for this problem:

tell application "Safari"
	set noLoops to 0
	set oldHeight to 0
	set newHeight to 0
	set loopDelay to 1
	activate
	open location myURL
	try
		tell document 1 to repeat
			set noLoops to noLoops + 1
			set newHeight to do JavaScript JSDocumentHeight
			if newHeight = oldHeight then
				tell application "System Events"
					set frontmostApplicationName to name of 1st process whose frontmost is true
				end tell
				activate
				tell application frontmostApplicationName to activate
				delay loopDelay * 5
				
				set newHeight to do JavaScript JSDocumentHeight
				if newHeight ≠ oldHeight then
					set loopDelay to loopDelay + 1
				else
					exit repeat
				end if
			else
				set oldHeight to newHeight
			end if
			do JavaScript JSJumpToEnd
			delay 1
			do JavaScript JSJumpToEnd
			delay loopDelay
		end repeat
	on error errMsg
		log "Safari - errMsg: " & errMsg
		do shell script "kill " & caffeinatePid
	end try
end tell

I use this to scroll all the way down in Facebook feeds.

Basically, I jump to the end of the document, measure the height of the page, wait a few seconds, jump to the end again, measure the height again and compare the result. If the height has changed, I run the loop again.
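
For reference, JSDocumentHeight and JSJumpToEnd in the script above are variables holding JavaScript text, and their contents are not shown in this thread. A minimal sketch of what such helpers might look like follows. The function names documentHeight and jumpToEnd are hypothetical, and taking the larger of the two scrollHeight values is just one common way to measure a page.

```javascript
// Hypothetical stand-ins for the JSDocumentHeight and JSJumpToEnd strings.
// `doc` and `win` are parameters shaped like the browser's `document`
// and `window`, so the helpers can be exercised outside Safari.

// Total scrollable height of the page in pixels.
function documentHeight(doc) {
  return Math.max(doc.body.scrollHeight, doc.documentElement.scrollHeight);
}

// Scroll the window to the current bottom of the page.
function jumpToEnd(win, doc) {
  win.scrollTo(0, documentHeight(doc));
}
```

In the page itself these would be called as documentHeight(document) and jumpToEnd(window, document).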

If the height hasn’t changed, which is common because the network or Safari regularly stalls when the page is very long (20 000+ px is common), I make a long pause and try again. If the height has changed after that pause, I increase the delay the script waits for the page to load and loop again; if not, I stop.
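
The decision taken after the long pause can be written as a small pure function. This is only a sketch of the logic described here, not the script itself; the name afterLongPause and the shape of the state object are made up for illustration.

```javascript
// One decision step, taken after the long pause.
// state.oldHeight is the last known page height in pixels,
// state.delay is the current per-loop wait in seconds.
function afterLongPause(state, heightAfterPause) {
  if (heightAfterPause === state.oldHeight) {
    // Still unchanged after waiting: treat this as the end of the feed.
    return { done: true, oldHeight: state.oldHeight, delay: state.delay };
  }
  // The page was only slow: keep going, but wait longer from now on.
  return { done: false, oldHeight: heightAfterPause, delay: state.delay + 1 };
}
```

For example, a state of { oldHeight: 20000, delay: 1 } with a new measurement of 23000 gives done: false and a delay of 2.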

I also make sure that Safari is in front, because otherwise it is deprioritized and fails to load pages.

So far this is pretty stable, but it can easily take an hour to jump down 250 times.


You can loop the shell command

curl -k your_url

Depending on the size of the web page it may display the result after some delay. The source property of Safari has long been botched so much that it fails to refresh the new source code immediately upon opening a new page.
I still have old macOS systems with correspondingly old Safari versions (Safari 9 and even Safari 5). There, using source is buttery smooth and blazingly fast. It’s not the case with Safari 14.