Help -- Need to Convert Unicode text into Format Suitable for Terminal

I am trying to write an AppleScript that will take the CD audio tracks reported by iTunes and encode them using a UNIX-based encoder of my choice (e.g. LAME, OGG, FLAC etc). My program works great with regular non-Unicode tracks. But if the CD audio tracks have names in Unicode, the script fails.

It looks like the Terminal does not support Unicode. Instead you must somehow convert the Unicode into a format that the Terminal needs.

For example, my script fails with the following CD album:

Name of Album: Dvořák: Symphony #9, Slavonic Dances
Name of Track: Dvořák: Slavonic Dance #2 In E, Op. 46/2

Here is a track reported in Finder or iTunes:
(Note: Mac OS X replaces : in the name with /)

Dvořák/ Symphony #9, Slavonic Dances:7 Dvořák/ Slavonic Dance #7 In C, Op. 46/7.aiff

But in Terminal, if you use “ls” you get:

/Volumes/Dvor\314\214a\314\201k:\ Symphony\ #9,\ Slavonic\ Dances/7\ Dvor\314\214a\314\201k:\ Slavonic\ Dance\ #7\ In\ C,\ Op.\ 46:7.aiff

Has anyone written a handler (or subroutine) that will convert Unicode text into a format suitable for the Terminal (with all the escape sequences)?

In my case, since iTunes can give me the album name and track name in Unicode, I would prefer to convert the actual Unicode name (not the name reported in the Finder) into a format suitable for the Terminal.

Any help would be greatly appreciated.

RobK
P.S. I am running Tiger 10.4.10

AppleScript: 1.10.7
Browser: Safari 419.3
Operating System: Mac OS X (10.4)

Moving to OS X…

Edit: Can you post what you have started?

Unix dates from the days when ASCII was the norm and most Unix tools aren’t Unicode aware, so any Unicode text should be encoded as UTF8 for compatibility when dealing with the Unix shell (UTF8 is a superset of ASCII).

When passing shell commands via ‘do shell script’ or Terminal’s ‘do script’ command, the command string will be encoded as UTF8 automatically. If writing text to a file that will be read by a shell script, use write txt to fileRef as «class utf8». There might or might not also be decomposition issues relating to your accented characters; I dunno offhand how those are handled by the filesystem and by iTunes. If you’re not familiar with Unicode and Unicode encodings, I’d recommend reading up on the basics, e.g. start here.
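
To make the decomposition point concrete, here is a rough Python sketch (not AppleScript, and not anything from the original script) that seems to reproduce the escapes in the "ls" listing above. It assumes the filesystem stores the name in decomposed form (plain letter followed by a combining accent) and that "ls" prints each non-ASCII UTF8 byte as a three-digit octal escape. The function name ls_style is made up for illustration, and it does not add the backslashes before spaces.

```python
import unicodedata


def ls_style(name: str) -> str:
    """Approximate how `ls` displayed the name in the post (assumption:
    decomposed text, UTF-8 bytes, non-ASCII bytes shown as octal escapes)."""
    # Split accented letters into base letter + combining mark (NFD).
    decomposed = unicodedata.normalize("NFD", name)
    pieces = []
    for byte in decomposed.encode("utf-8"):
        if byte < 0x80:
            pieces.append(chr(byte))          # plain ASCII passes through
        else:
            pieces.append("\\%03o" % byte)    # e.g. 0xCC -> \314
    return "".join(pieces)


print(ls_style("Dvořák"))
```

If that reading is right, the \314\214 after the "r" is just the two UTF8 bytes of the combining caron, and \314\201 after the "a" is the combining acute accent, so the name on disk is already UTF8 and needs no extra conversion.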

If you want more specific help then you’ll need to do as Bruce says and post your source code.

HTH

hhas,

MANY THANKS!! You must have ESP or other special powers. My AppleScript writes Bash commands to a file that is later executed using a “do shell script” command. All I needed to do was add “as «class utf8»” to my write statements, and now my script can read the CD audio files, rip them, write the encoded MP3 file using a Unicode filename, import the MP3s into iTunes and even tag them in iTunes with Unicode data.
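
For anyone doing the same thing outside AppleScript, the idea can be sketched in Python roughly as follows. This is only an illustration of "write the command file as UTF8", not the actual script: the helper name write_command_file, the file name encode.sh and the output name track7.mp3 are all hypothetical, and the quoting is done with Python's shlex module rather than anything AppleScript provides.

```python
import shlex
import tempfile
from pathlib import Path


def write_command_file(script_path: Path, command_parts: list) -> str:
    """Write one shell command to script_path as UTF-8 and return the line."""
    # Quote each argument so spaces, '#', ':' and accents survive the shell.
    line = " ".join(shlex.quote(part) for part in command_parts)
    content = "#!/bin/bash\n" + line + "\n"
    # Encode explicitly as UTF-8 (the analogue of "as «class utf8»").
    script_path.write_bytes(content.encode("utf-8"))
    return line


# Usage sketch: build the command only; nothing is executed here.
with tempfile.TemporaryDirectory() as tmp:
    source = ("/Volumes/Dvořák: Symphony #9, Slavonic Dances/"
              "7 Dvořák: Slavonic Dance #7 In C, Op. 46:7.aiff")
    print(write_command_file(Path(tmp) / "encode.sh",
                             ["lame", source, "track7.mp3"]))
```

The point is that the file on disk contains UTF8 bytes, so Bash sees the same name the filesystem uses, with ordinary quoting taking care of the spaces and punctuation.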

My script actually writes two sets of tags. The first set of tags is stored in the MP3 file using LAME. The second set of tags is written into iTunes. iTunes stores the metadata separately from the MP3 file since this provides more flexibility (e.g. LAME only supports certain genres; with iTunes, you can use any genre).

When I send iTunes Unicode tags, iTunes handles them flawlessly. But when I send LAME Unicode tags, the tags get messed up. (Not a big deal, since the tags in iTunes are what people actually see and use.) Maybe I will have to convert the Unicode tags into a plain string before I send them to LAME.
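
One possible way to do that conversion, sketched in Python rather than AppleScript and assuming a plain-ASCII fallback is acceptable for the LAME tags: decompose the text, drop the combining accents, and replace anything that still is not ASCII with "?". The function name ascii_fallback is made up, and whether LAME needs this at all depends on the LAME version and tag format, which I have not checked.

```python
import unicodedata


def ascii_fallback(text: str) -> str:
    """Return an ASCII-only approximation of text (accents stripped)."""
    # Decompose so "ř" becomes "r" + combining caron, and so on.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    # Anything still outside ASCII (e.g. CJK) becomes "?".
    return stripped.encode("ascii", "replace").decode("ascii")


print(ascii_fallback("Dvořák: Slavonic Dance #2 In E, Op. 46/2"))
```

This loses information, so it only makes sense for the copy of the tags that goes to LAME; the Unicode tags written to iTunes can stay as they are.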

I thought about posting the code but it is already 8 pages of code. It is too much code to post in a forum. But with your ESP, you gave me what I needed to know – Add “as «class utf8»” to the write statements!

Thanks again!

RobK