Coercing text from Unicode to Roman

I’m having what I would guess is a problem with Unicode-to-text conversion.

I’ve seen some posts here suggesting the use of «class utf8» to convert Unicode, but I think what I’m really looking for is something to convert to Roman. Someone else suggested class ktxt, but that gives an error when it runs.

Below is the code I want to run. It works fine on a U.S. system, but if I switch my language to Japanese (and log out and back in for that change to take effect), the script no longer works correctly.

The “contains” command erroneously reports that NOTHING in list_a is in list_b.

My guess is that when I ask the Finder for the list of processes, that list is returned as Unicode and therefore can’t be compared with elements that are not.

If anyone has any ideas about fixes or workarounds, it would be greatly appreciated! (I’ve thought about ‘echoing’ the text with a shell script and using the result, or writing it to a file and reading it back, since this seems to make the text usable, but I’m hoping someone knows of a simple coercion.)

Thanks,
Colin.


tell application "Finder" to set list_a to ((name of every process) as list)
set list_b to {"Dock", "Dummy", "Finder", "System Events", "wsxd"}

set common_list to {}
set unique_list to {}

repeat with i in list_a -- for every item on list_a, see if it's in list_b
	
	set list_item to (i as text) -- text of the item
	
	if (list_b contains list_item) then -- if the item is on list_b
		set common_list to common_list & list_item -- add its name to the common list
		
	else if list_item is not equal to "" then -- else, if the item is not there and is not empty,
		set unique_list to unique_list & list_item -- put it on the unique list
		
	end if
	
end repeat

choose from list common_list

What if you use this?

set list_b to {"Dock" as Unicode text, "Dummy" as Unicode text, "Finder" as Unicode text, "System Events" as Unicode text, "wsxd" as Unicode text}

When you switch to Japanese, what do you see in “list_a”? (e.g., add a “return list_a” right after calling the Finder and see the result)

Yep. That compares properly now. Thanks!

But… the other list I use in the program comes from a file, and I’ll have to parse the list of items and convert it every time the program runs, which will likely be a little slow for the users. Is there any way to tell whether a block of text is Unicode or regular 8-bit text?

If I can say “as Unicode text” to coerce regular text, is there a phrase I can use to coerce Unicode into regular text? (I tried “plain text” but that just changes the word to “string.”)

Is there any way to set the text handling mode of a script? (I’m thinking something similar to “AppleScript’s Text Item Delimiters.”)

Thanks,
Colin.

You know the ktxt trick:

set x to «class ktxt» of ("unicode stuff" as unicode text as record)

There are very fast routines to iterate over lists which you could use if the ktxt stuff doesn’t work for you…
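
One commonly used way to speed up list iteration is to reach the list through a script object property, so each item is accessed by reference. Here is a minimal sketch of that idea applied to converting a list to Unicode text. The handler name unicodeList and the names inside it are made up for illustration, and it has not been tested on a Japanese system:

```applescript
-- Hypothetical handler: returns a copy of plainList with every item as Unicode text.
on unicodeList(plainList)
	-- Reaching the lists through script object properties is the
	-- usual trick for fast item access on long lists.
	script fastList
		property source : plainList
		property converted : {}
	end script
	repeat with i from 1 to (count fastList's source)
		set end of fastList's converted to ((item i of fastList's source) as Unicode text)
	end repeat
	return fastList's converted
end unicodeList

set list_b to unicodeList({"Dock", "Dummy", "Finder", "System Events", "wsxd"})
```

With list_b built this way, the “contains” test in the original script compares Unicode text against Unicode text.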

Thanks JJ! That worked great! I had tried ktxt before but it wouldn’t work. Little did I know that the esoterics of AppleScript required I coerce the item into a ‘record’ first.

Off-Topic:

This does bring up my biggest peeve with AppleScript: I’ve been coding AppleScript for 10 years or so, and have built some pretty complex scripts for clients. Yet I still get caught by some bizarro syntax required to get at a property.

I have a well worn copy of “The Tao of AppleScript” which has been a fine reference book, but I have yet to completely get my head around what is going on in AppleScript when, say, it requires a conversion to ‘record’ before it can be converted to text.

Is there a reason I should have known to convert to ‘record’ first?

If you can explain that part, or recommend a book that can explain what Apple was thinking when they designed AppleScript’s syntax and grammar structure, that would be great!

Thanks again for your insight!
-Colin.

As far as I know, you (or I) had no way of knowing anything about the “as record” stuff. Fortunately, there are some folks out there who always attempt to coerce anything to everything, which is fine for us :lol:

As you see, Unicode text contains two things: a string (aka «class ktxt») and style information (aka «class ksty», contains bolds, font size, etc.). I don’t know if this is stated somewhere in some obscure place… :?:
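
If the record coercion is needed in several places, it could be wrapped in a small handler. This is only a sketch: the name plainTextOf is made up, the coercion itself is the undocumented trick from this thread, and the fallback assumes that a value which cannot be coerced to a record is already usable as it is:

```applescript
-- Hypothetical helper around the undocumented "as record" trick.
on plainTextOf(someText)
	try
		-- Keep only the string part and drop the style information.
		return «class ktxt» of (someText as record)
	on error
		-- The coercion failed, so hand the value back unchanged.
		return someText
	end try
end plainTextOf

tell application "Finder" to set list_a to name of every process
set first_name to plainTextOf(item 1 of list_a)
```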

Okay, I rewrote the program so it now works entirely in Unicode. It works fine on Japanese systems except that process names with Japanese characters in them do not display correctly in the ‘choose from list’ command.

If I try to display Kanji characters in Display dialog, it seems to work fine, but choose from list fails. I would guess this is a limitation of choose from list, but I thought I’d ask again to see if you had any magical ways around this too!

choose from list {""}

Thanks again for all your help!
-Colin.

P.S. This is for Quit Anything, btw: http://frozenheads.com/quit_anything_x, in case that helps give the problem some context.

Hi again,

I just thought I’d share some of the strangeness of Unicode support in AppleScript.

(NOTE: In the example below I’m going to use the word “Kanji” for “a word with Japanese characters in its name.”)

If I type:

tell application "Kanji" to activate

or

set process_name to "Kanji"
tell application process_name to activate

everything works fine. The process correctly comes to the foreground.

However, if I type this:

set process_name to "Kanji" as Unicode text
tell application process_name to activate

it fails (a dialog pops up asking: Where is process ?).

I thought «class ktxt» returned 8-bit text, but it really just returns unstyled text. If you ask for «class ktxt» of “Kanji” you still get “Kanji”, but you will now be able to correctly tell that process something. So, this works:

set process_name to "Kanji" as Unicode text
set process_name to «class ktxt» of (process_name as record)
tell application process_name to activate

Strange but true (at least: true in 10.3.2).
-Colin.

That was what I thought… I haven’t tested it, but I thought that “class ktxt” was really 16-bit text (Unicode, two-byte, etc.)

I don’t really understand very well what is going on with all this Unicode stuff… Theoretically, tell app “x” doesn’t require “Unicode text” for the call. Some apps require Unicode text for their parameters, but will accept ASCII as input… Also, AS’s class “Unicode text” includes style information… :?:

Yes, there seems to be style information in with the Unicode text too, and that has to be removed by the aforementioned process before trying to ‘tell’ the application anything. I don’t want to use Unicode text, but the process by which I get the process names forces Unicode text upon me. So I have to get the ‘class ktxt’ of it before I can do anything else with it.

Also (!!):

One of my Japanese users supplied me with this unusual (I think, anyway) fix for the problems I was having with ‘choose from list.’

http://bbs.applescript.net/viewtopic.php?p=24246

-Colin