"Previous" Text Item Delimiters

Hi Everyone,

I’m stuck on the text item delimiters of AppleScript. What I am trying to do is best explained by an example. I have the following code (HTML/UTF8):

<div class="..." id="...">
<p class="...">...</p>
<div class="CLASS" id="..." style="...">KEYWORD</div>
</div>

The KEYWORD serves as a handle for my script to get hold of, so that it can change the CLASS of that div. So what I did was search for the KEYWORD and subtract X (some number) from that position to move the start of the text range left to the CLASS. Something like this:


set w to text (x) thru (y - 30) of z

The thing is, my HTML has changed, so the distance from the KEYWORD back to the CLASS has changed too, and I can’t use that script anymore. Now I simply want something that lets me search for the closest previous class=" and then, from that point, read up to the next " to get the class name (the text between the two quotation marks). Then I want to replace that CLASS (the text between the two quotation marks) with anything I like. Not a global search/replace over the whole text, because other divs might be using the same class but shouldn’t get the replacement class.

I hope I got everyone confused. :)

Huge thanks for all help.

Regards,

Max

Basically you want to find everything between two key points after first using the keyword to extract the portion of the text you’re interested in.

The beginning point is marked by this text: <div class="
The ending point is marked by this text: " id=

There are two ways we can easily get the class. If you only need to know the value of the class then we can just use text item delimiters like so…

set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"

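-- keep only the text in front of the keyword
set AppleScript's text item delimiters to "KEYWORD"
set a to text item 1 of theText
-- the last piece after <div class=" starts with the class we want
set AppleScript's text item delimiters to "<div class=\""
set b to text item -1 of a
-- everything in front of " id= is the class value
set AppleScript's text item delimiters to "\" id="
set theClass to text item 1 of b
set AppleScript's text item delimiters to "" -- reset the delimiters
theClass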
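
If the attribute after the class is not always id, the same idea works with class=" as the start marker and the next " as the end marker. Here is a minimal sketch of that; the handler name classBefore is made up for illustration, and it assumes the class value is wrapped in plain double quotes.

```applescript
-- Hypothetical handler: returns the class value nearest before theKeyword,
-- or missing value if the keyword or a class attribute is not found.
on classBefore(theText, theKeyword)
	set oldTIDs to AppleScript's text item delimiters
	set theClass to missing value
	try
		-- keep only the text in front of the keyword
		set AppleScript's text item delimiters to theKeyword
		set pieces to text items of theText
		if (count of pieces) > 1 then
			-- the last piece after class=" starts with the class value
			set AppleScript's text item delimiters to "class=\""
			set parts to text items of (item 1 of pieces)
			if (count of parts) > 1 then
				-- cut that piece off at the closing quote
				set AppleScript's text item delimiters to "\""
				set theClass to text item 1 of (item -1 of parts)
			end if
		end if
	end try
	-- always put the delimiters back the way we found them
	set AppleScript's text item delimiters to oldTIDs
	return theClass
end classBefore

set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"
classBefore(theText, "KEYWORD") -- "CLASS"
```

Saving the old delimiters and restoring them at the end keeps the handler from changing the delimiters for the rest of a larger script.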

But if you also want to know the exact position of the class within the original text (for example characters 100 thru 105 or whatever) then we can use a mixture of text item delimiters and a repeat loop moving backwards through the text looking for those two key points.

set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"

set text item delimiters to "KEYWORD"
set a to item 1 of (text items of theText)
set text item delimiters to ""

set beginNumber to missing value
set endNumber to missing value
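repeat with i from (count of a) to 1 by -1 -- count backwards
	try
		-- "\" id=" is 5 characters, so the class ends on the character before it
		if text (i - 4) thru i of a is "\" id=" then set endNumber to i - 5
		-- "<div class=\"" is 12 characters, so the class starts right after it
		if text (i - 11) thru i of a = "<div class=\"" then
			set beginNumber to i + 1
			exit repeat
		end if
	end try -- ignore range errors near the start of the text
end repeat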

if beginNumber is missing value or endNumber is missing value then
	set theClass to "I couldn't find it"
else
	set theClass to text beginNumber thru endNumber of theText
end if
{beginNumber, endNumber, theClass}

Anyway, I hope one of those is what you were actually trying to do! :)

I just noticed you want to replace the class with another value. So in that case we would use the begin and end numbers from the second script I posted above to do that…

-- continuing from the second script
set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"
set beginNumber to 63 -- we got this from the above script
set endNumber to 67 -- we got this from the above script too

-- replace the old class with the new class
set newClass to "aNewClassName"
set newText to text 1 thru (beginNumber - 1) of theText & newClass & text (endNumber + 1) thru -1 of theText
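
The one-liner above assumes there is text on both sides of the range. As a sketch, the same replacement can be wrapped in a handler that also copes with a range touching the very start or end of the text; replaceRange is a made-up name, not a built-in command.

```applescript
-- Hypothetical handler: replaces characters beginNumber thru endNumber
-- of theText with newValue and returns the new text.
on replaceRange(theText, beginNumber, endNumber, newValue)
	set textLength to length of theText
	if beginNumber < 1 or endNumber > textLength or beginNumber > endNumber then error "range out of bounds"
	set head to ""
	set tail to ""
	-- only take the leading part if the range does not start at character 1
	if beginNumber > 1 then set head to text 1 thru (beginNumber - 1) of theText
	-- only take the trailing part if the range does not end at the last character
	if endNumber < textLength then set tail to text (endNumber + 1) thru -1 of theText
	return head & newValue & tail
end replaceRange

set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"
replaceRange(theText, 63, 67, "aNewClassName")
```

Only the characters inside the range are touched, so any other div that uses the same class name stays as it is.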

Hank, that worked perfectly. Thank you so much. I ended up using the second script. :) Thanks again.

Great!

The important part is understanding how it works so you can use that knowledge. With text item delimiters and repeat loops we can find anything we want within some text. So if you have questions about how it works let me know… otherwise good luck!

To be honest, I made my code example up. So I sat for about 15 minutes staring at your code and figuring out the ± positions in the text. Then I went on counting the darn positions of the classes and stuff … but it eventually worked perfectly. The thing is, it’s all pure logic. I find it amazing.

Yep, and we can use a little more logic to make it even better if you want to get a little fancier. The thing is, inside the repeat loop we check two if statements on every pass. But we don’t need to check both every time. Because we move backwards, we reach the end marker first, so we don’t need to look for the beginNumber until after the endNumber has been found. And once the endNumber is found we no longer need to look for it.

So we can test for those 2 conditions and thus we only evaluate one if statement on each loop. Try this code. Now technically we added another if statement to check for the endNumber, but that’s a simple evaluation… much simpler than the other if statements… so I would say that this logic is even better than the original logic.

set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"

set text item delimiters to "KEYWORD"
set a to item 1 of (text items of theText)
set text item delimiters to ""

set beginNumber to missing value
set endNumber to missing value
repeat with i from (count of a) to 1 by -1 -- count backwards
	try
		if endNumber is missing value then
			if text (i - 4) thru i of a is "\" id=" then set endNumber to i - 5
		else
			if text (i - 11) thru i of a = "<div class=\"" then
				set beginNumber to i + 1
				exit repeat
			end if
		end if
	end try
end repeat

if beginNumber is missing value or endNumber is missing value then
	set theClass to "I couldn't find it"
else
	set theClass to text beginNumber thru endNumber of theText
end if
{beginNumber, endNumber, theClass}
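
For what it’s worth, the two positions can probably also be worked out without a loop, just from the lengths of the pieces that text item delimiters hand back. This is only a sketch: classRangeBefore is a made-up handler name, and it uses class=" and the next " as the markers rather than the exact markers from the scripts above.

```applescript
-- Hypothetical handler: returns {beginNumber, endNumber, theClass} for the
-- class value nearest before theKeyword, or missing value if it is not found.
on classRangeBefore(theText, theKeyword)
	set oldTIDs to AppleScript's text item delimiters
	set theResult to missing value
	try
		-- keep only the text in front of the keyword
		set AppleScript's text item delimiters to theKeyword
		set pieces to text items of theText
		if (count of pieces) < 2 then error "keyword not found"
		set beforeKeyword to item 1 of pieces
		-- the last piece after class=" starts with the class value
		set AppleScript's text item delimiters to "class=\""
		set parts to text items of beforeKeyword
		if (count of parts) < 2 then error "class attribute not found"
		set lastPart to item -1 of parts
		-- position of the closing quote inside that last piece
		set closePos to offset of "\"" in lastPart
		if closePos < 2 then error "empty or unterminated class value"
		-- lastPart is the tail of beforeKeyword, so the lengths give the start position
		set beginNumber to (length of beforeKeyword) - (length of lastPart) + 1
		set endNumber to beginNumber + closePos - 2
		set theResult to {beginNumber, endNumber, text beginNumber thru endNumber of theText}
	end try
	set AppleScript's text item delimiters to oldTIDs
	return theResult
end classRangeBefore

set theText to "<div class=\"...\" id=\"...\">
<p class=\"...\">...</p>
<div class=\"CLASS\" id=\"...\" style=\"...\">KEYWORD</div>
</div>"
classRangeBefore(theText, "KEYWORD") -- {63, 67, "CLASS"}
```

The begin and end numbers match the ones the loop finds for this sample text, so either approach can feed the replacement step.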

That’s even better. Thanks again.