Text item delimiters defaulting to ""

NB: The following has been tested on Mac OS X 10.4.7 only.

I recently made a simple handler to remove all letters from a string except those contained in the inputted alphabet:

on removeCharactersNotInAlphabet(inputText, alphabet)
	set text item delimiters to ""
	set negativeText to inputText
	repeat with thisChar in alphabet
		set text item delimiters to thisChar
		tell text items of negativeText to set negativeText to beginning & ({""} & rest)
	end repeat
	repeat with thisChar in negativeText
		set text item delimiters to thisChar
		tell text items of negativeText to set negativeText to beginning & ({""} & rest)
		tell text items of inputText to set inputText to beginning & ({""} & rest)
	end repeat
	set text item delimiters to ""
	return inputText
end removeCharactersNotInAlphabet

As can be seen, the text item delimiters are set to a particular character (or, more correctly, to a reference to a particular character in a string, as created by the repeat with # in # loop), and then the text items of the inputted text are coerced back into their original type (string, Unicode text) by means of the ‘kai method’.
My question about this simple script is why the text item delimiters are seemingly reset temporarily to “” when they are set to a textual reference.

Here are some examples illustrating this behaviour:
This works as expected:

set text item delimiters to space
text items of "a string of words" as string
--> "a string of words"

set text item delimiters to a reference to space -- Don't worry, I know this is the same as the above
text items of "a string of words" as string
--> "a string of words"

This, however, doesn’t:

set a to space
set text item delimiters to a reference to a
text items of "a string of words" as string
--> "astringofwords"

Is this really the expected behaviour, or am I simply mistaken?

Hi, Qwerty.

A reference to a value is the same as the value, simply because there’s no way of getting a reference to it. A reference to a variable is a reference:

set valueRef to a reference to space

set a to space
set variableRef to a reference to a

{valueRef, variableRef}
--> {" ", a of «script»}

Similarly, the reference in each of your repeats is to a character (or item) in the string (‘item # of #’), not to the value of the character (“z”, say).

It seems that AppleScript’s text item delimiters are happy to be references when dividing text, but have to be literal string or Unicode text values when joining text items, otherwise it’s as if there’s no delimiter at all. That’s an interesting discovery. (Checking quickly, it’s the same in 10.2.8. I don’t know how far it goes back.) But, unless it’s documented somewhere, it’s probably not safe to rely on it! :slight_smile:
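For anyone who would rather not depend on that quirk, here is a minimal sketch of removing one character without relying on how a reference delimiter behaves when joining. The handler name removeCharacter and its parameter names are made up for illustration. It assumes that 'contents of' returns the plain value whether it is handed a reference or an ordinary string, and it joins with an explicit "" delimiter instead of hoping the delimiter is ignored.

```applescript
-- Hypothetical helper, not part of the original handler.
on removeCharacter(inputText, thisChar)
	set tid to text item delimiters -- remember the current delimiters
	set text item delimiters to contents of thisChar -- split on the plain value, not the reference
	set parts to text items of inputText
	set text item delimiters to "" -- join with an explicit empty delimiter
	set outputText to parts as string
	set text item delimiters to tid -- restore the previous delimiters
	return outputText
end removeCharacter

removeCharacter("a string of words", space)
--> "astringofwords"
```

Called from inside a 'repeat with thisChar in alphabet' loop, the loop variable can be passed straight in, since the handler dereferences it itself.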

Hi guys.

I believe it has long (and possibly always) been thus. It was certainly the behaviour back in Mac OS 9.

In a ‘repeat with x in y’ form of loop, the technique I’ve always used (as with other references that aren’t automatically evaluated/coerced to a meaningful value) is to explicitly get the contents:

set results to {}
set tid to text item delimiters
repeat with |ref| in {space}
	set text item delimiters to |ref|
	set end of results to words of "a string of words" as string
	set text item delimiters to |ref|'s contents
	set end of results to words of "a string of words" as string
end repeat
set text item delimiters to tid

results --> {"astringofwords", "a string of words"}
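The same technique can be wrapped in a small handler, so that callers don't have to remember whether they are holding a reference or a value. This is only a sketch: the name joinItems and its parameters are invented here, and it assumes that 'contents of' leaves an ordinary (non-reference) value unchanged.

```applescript
-- Hypothetical helper: join a list with a delimiter that may be a reference.
on joinItems(theItems, delim)
	set tid to text item delimiters -- remember the current delimiters
	set text item delimiters to contents of delim -- dereference if needed
	set joined to theItems as string
	set text item delimiters to tid -- restore the previous delimiters
	return joined
end joinItems

joinItems({"a", "string", "of", "words"}, space)
--> "a string of words"
```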

:slight_smile:

Hi, kai.

Yes indeed. But the thing that was baffling Qwerty wasn’t so much that the reference didn’t work as a delimiter, but that it worked one way but not the other: separating texts, but not joining them. If you change words to text items in your script, you get exactly the same result!

Yeah, I really just jumped in to add my note about the history of this behaviour, although the fact that the splitting/joining algorithms consider/ignore non-text values respectively does suggest a certain bugginess.

I’d speculate that, to split text into its constituent text items, some implicit, internal coercion to text is performed as necessary on the current value of AppleScript’s text item delimiters. While this is evidently not the case when joining text items (when non-text values are apparently ignored altogether), perhaps it should be.

This might help to demonstrate the one-way coercion effect further:


set total_nonsense to "
It's my basket that the scriptwriter's wordplays about falsehoods anoint the coyest of character actors.

Maypoles apart, they are stringlike databases (though there are 13.14159265359 or 23600 listless postscripts).
	
Announces, surreally as ever, casebooks on a white spaceship.

preferences: the chaffinches watch yardsticks."

set edit_list to {ask, case, character, class, data, false, hours, inches, list, May, ounces, no, pi, real, reference, string, white space, word, yards, yes}

set tid to text item delimiters
repeat with i in edit_list
	set text item delimiters to i (* or: i's contents *)
	set total_nonsense to text items of total_nonsense as string
end repeat
set text item delimiters to tid

display alert "Total nonsense:" message total_nonsense

Result:

My goodness, kai! No wonder you’re usually up most of the night! :lol:

:lol: Very clever, kai! (and funny)

Thank you both for your responses. :slight_smile:

I guess it deserves to be simply put to rest with all those other nuances that make AppleScript all the more mysterious (yet more interesting!). :wink: