Looking at your page, I saw that at the bottom there is the word “Previous”, then boxes with the page indexes, then the word “Next”.
That was enough to make me search for the word “Previous”, which appears only once.
Read the embedded comments to see the other operations used to extract the number of pages.
Looking at the address displayed when I opened page 2, I was able to understand how the site builds the addresses of its pages.
This made it possible to read all the pages and store their concatenated contents in the variable XMLbrut.
In front of every message there is a “Reply” button, so I searched for “Reply” in the grabbed data.
The first occurrence was:
[format] <a class="gmbutton2 reply" href="http://comments.deviantart.com/4/9074072/3953600988#reply\" onclick=" if (window.GMI) return GMI.up(this, ‘CComment’).replyClick() ">Reply [/format]
I hesitated for two seconds between counting the occurrences of “gmbutton2 reply” and counting those of “replyClick”.
As both counts gave the same result, I was quite sure that it was the wanted value.
Every page except the last one displays 5 messages.
The last one displays 2 of them.
The page count was 16.
15 * 5 + 2 = 77, which is what I got, so it seems that I am right.
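The check above can be written as a short sketch. The names perPage, onLastPage and expectedCount are only illustrative, they are not used in the script, and the figures are those observed on this profile.

```applescript
# Sanity check of the message count, using the figures observed on this profile
set nbPages to 16 -- page count read from the pager
set perPage to 5 -- messages on every full page (observed, not guaranteed by the site)
set onLastPage to 2 -- messages on the last page
set expectedCount to (nbPages - 1) * perPage + onLastPage
--> 77
```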
I apologize, but I can’t guess what the first page looks like when it is the only one, so I don’t guarantee the behaviour in that case.
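For that case, here is a minimal sketch of a possible guard. It ASSUMES that a profile with a single page of comments shows no pager and so does not contain the word “Previous”. That is untested against a real one-page profile. The handler name hasPager and the sample text are hypothetical.

```applescript
# Hypothetical guard: ASSUMES a one-page profile contains no "Previous" link
on hasPager(pageSource)
    considering case
        # Case-sensitive test so that a lowercase "previous" elsewhere does not match
        return pageSource contains "Previous"
    end considering
end hasPager

# Sample text standing in for the downloaded source of a one-page profile
set XMLbrut to "<a class=\"gmbutton2 reply\">Reply</a>"
set pagerFound to my hasPager(XMLbrut)
if not pagerFound then
    # No pager found: treat the profile as a single page
    set nbPages to 1
end if
```

With nbPages set to 1, the test "if nbPages > 1" in the script skips the loop and only the first page is counted. When a pager is found, the extraction of nbPages would stay as it is in the script.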
# Defines the URL
set theURL to "http://gothicage.deviantart.com"
# Based upon the URL, build the name of the text file used to store the count of messages
# Replace every colon by -
set fichier_num to my remplace(theURL, ":", "-")
# Replace every / by _
set fichier_num to "_" & my remplace(fichier_num, "/", "_") & ".txt"
set prefsFolder to path to preferences folder # The User's one
set chemin_num to (prefsFolder as text) & fichier_num
tell application "System Events"
    if not (exists disk item chemin_num) then
        make new file at end of prefsFolder with properties {name:fichier_num}
    end if
end tell -- System Events
try
    set oldCount to (read file chemin_num) as integer
on error
    set oldCount to 0
end try
set XMLbrut to do shell script "curl -Ls -A 'Opera/9.70 (Linux ppc64 ; U; en) Presto/2.2.1' " & quoted form of theURL
# the word Previous appears before the indexes of pages so extract the first paragraph after it
set maybe to paragraph 1 of item 2 of my decoupe(XMLbrut, "Previous")
# Erase the strings "show-gotobox" so that we will be able to identify easily the meaningful occurrence of "gotobox"
set maybe to my remplace(maybe, "show-gotobox-", "")
# Extracts what is just after "gotobox-"
set maybe to item 2 of my decoupe(maybe, "gotobox-")
--> 573f129d950fb-293\\', \\'1\\', \\'16\\', \\'offset\\', .
set maybe to item 1 of my decoupe(maybe, "\\', \\'offset")
--> 573f129d950fb-293\\', \\'1\\', \\'16
set maybe to items -2 thru -1 of my decoupe(maybe, "\\', \\'")
--> {"1", "16"}
set nbPages to maybe's item 2 as integer
if nbPages > 1 then
    repeat with nbp from 2 to nbPages
        set newURL to theURL & "/?offset=" & (nbp - 1) * 5 & "#comments"
        set moreXML to do shell script "curl -Ls -A 'Opera/9.70 (Linux ppc64 ; U; en) Presto/2.2.1' " & quoted form of newURL
        set XMLbrut to XMLbrut & moreXML
    end repeat
end if
set nbMessages to (count my decoupe(XMLbrut, "replyClick")) - 1
if nbMessages > oldCount then
    write (nbMessages as text) to file chemin_num starting at 0 # Stores the count of messages
    display dialog "There is a new message"
end if
#=====
on decoupe(t, d)
    local oTIDs, l
    set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d}
    set l to text items of t
    set AppleScript's text item delimiters to oTIDs
    return l
end decoupe
#=====
(*
replaces every occurrence of d1 with d2 in the text t
*)
on remplace(t, d1, d2)
    local oTIDs, l
    set {oTIDs, AppleScript's text item delimiters} to {AppleScript's text item delimiters, d1}
    set l to text items of t
    set AppleScript's text item delimiters to d2
    set t to l as text
    set AppleScript's text item delimiters to oTIDs
    return t
end remplace
#=====
Yvan KOENIG running El Capitan 10.11.5 in French (VALLAURIS, France) vendredi 20 mai 2016 16:18:59