Scandinavian letters å ä ö don't work in shell script path

Is there a way to have Scandinavian letters “å”, “ä”, and “ö” work in a shell script path?

The following script works just fine if you replace the “ö”.

set SourceFile to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.ptf"
set DestinationFile to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.ptf.zip"
set SourceFile to quoted form of SourceFile
set DestinationFile to quoted form of DestinationFile

do shell script "zip -r -j " & DestinationFile & " " & SourceFile & "* -x *.DS_Store"

Hello.

On Snow Leopard there is the property list file ~/.MacOSX/environment.plist, in which I have specified no_NO.UTF-8 as the value of the LC_ALL variable so that it works at all times. From Lion onwards this has been changed into a settings file that is managed by launchd. You’ll have to google it.

To make this work for you in a shell script, you should prefix the command inside do shell script with LC_ALL=sv_SE.UTF-8.

See man locale; the locale files reside under /usr/share/locale if you are curious. :slight_smile:

Hi McUsrII,
and thanks for your answer!
Could you show me what my script would look like with your addition inserted?

Hei Søta bror! :slight_smile:

Here you are. The syntax isn’t fully intuitive, but it works like this: when you set a variable in front of a command, it takes effect for that command and its subcommands. For instance:

CFLAGS="-arch x86_64" make

If it should fail for subcommands, that is, commands that are called by the initial command, then it is because variables aren’t being flagged for export the way they are on my machine. If you discover this, you can put “export LC_ALL=sv_SE.UTF-8;” in front of your command instead. (Observe the semicolon!)
:slight_smile:
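A minimal sketch of the difference between the prefix form and the export form, runnable in any POSIX shell. GREETING is a hypothetical variable used only for illustration, so that the demonstration does not touch LC_ALL itself:

```shell
#!/bin/sh
# GREETING is a hypothetical variable, used here instead of LC_ALL.
unset GREETING

# Prefix form: the assignment applies only to this one command
# (and to anything that command starts).
prefixed=$(GREETING=hej sh -c 'printf %s "$GREETING"')

# Afterwards the variable is still unset in the calling shell.
after_prefix=${GREETING-unset}

# Export form: the variable stays set for every later command in this shell.
export GREETING=hej
exported=$(sh -c 'printf %s "$GREETING"')

echo "prefixed=$prefixed after_prefix=$after_prefix exported=$exported"
```

Inside do shell script the whole string is a single shell invocation, so either form should only affect that one call.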

set SourceFile to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.ptf"
set DestinationFile to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.ptf.zip"
set SourceFile to quoted form of SourceFile
set DestinationFile to quoted form of DestinationFile

do shell script "LC_ALL=sv_SE.UTF-8 zip -r -j " & DestinationFile & " " & SourceFile & "* -x *.DS_Store"

holy räksmörgås!

I can’t get it to work anyway.
This script:

set SourceFile to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.ptf"
set DestinationFile to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/zip-filer/210 - Picassokören.ptf.zip"
set SourceFile to quoted form of SourceFile
set DestinationFile to quoted form of DestinationFile

do shell script "LC_ALL=sv_SE.UTF-8 zip -r -j " & DestinationFile & " " & SourceFile & "* -x *.DS_Store"

generates this error:
error " zip warning: name not matched: /Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.ptf*

zip error: Nothing to do! (/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/zip-filer/210 - Picassokören.ptf.zip)" number 12

Hello.

I believe the issue is that the file names were encoded differently when you first wrote your script with those names. If you are using ISO 8859-1 encoding, the names will be different in UTF-8 encoding, because the accented characters have a different encoding (numeric value), although they will work perfectly in Finder and AppleScript, since there the values are interpreted as UTF-16 (either decomposed or precomposed, that is, with the accents encoded as separate code points, or combined with the base letter into a single code point).
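To illustrate how the same visible letter can be different bytes, here is a small sketch that prints the bytes of “ö” in three forms. The helper name hex_bytes is made up for this example, and the bytes are written as octal escapes so that the result does not depend on how the script file itself is saved:

```shell
#!/bin/sh
# hex_bytes (hypothetical helper): print the bytes of its argument
# as space-separated hexadecimal values.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

latin1=$(printf '\366')          # ö in ISO 8859-1: one byte
utf8_nfc=$(printf '\303\266')    # ö in UTF-8 as the single code point U+00F6
utf8_nfd=$(printf 'o\314\210')   # o followed by the combining diaeresis U+0308

echo "ISO 8859-1:         $(hex_bytes "$latin1")"
echo "UTF-8, precomposed: $(hex_bytes "$utf8_nfc")"
echo "UTF-8, decomposed:  $(hex_bytes "$utf8_nfd")"
```

The three lines show f6, c3 b6 and 6f cc 88 respectively, so a tool that compares names byte by byte will not treat them as the same file name.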

I think you should open a Terminal window for starters, and check what you get when you enter echo $LC_ALL
You should also check your System Preferences, and verify that you are using Swedish as your locale setting.
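Since a plain echo prints only an empty line both when the variable is unset and when it is empty, a slightly more telling check might look like the sketch below. The function name lc_all_state is an assumption made for this example:

```shell
#!/bin/sh
# lc_all_state (hypothetical helper): report whether LC_ALL is unset,
# empty, or set to a value.
lc_all_state() {
    if [ -z "${LC_ALL+x}" ]; then
        echo "unset"
    elif [ -z "$LC_ALL" ]; then
        echo "empty"
    else
        echo "set to $LC_ALL"
    fi
}

echo "LC_ALL is $(lc_all_state)"

# Show the effective value of every locale category, if the tool is present.
if command -v locale >/dev/null 2>&1; then
    locale
fi
```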

I also wonder what version of Mac OS X you have.

I don’t know exactly how to solve those issues for you, but I’d start with the System Preferences, and then I’d figure out how to set the locale settings globally; that works perfectly for me. Once I have entered the values into the system property list files, all shells and do shell script commands adhere to them, and there are really no issues.

It works like magic! :slight_smile:

Edit
It may also very well be that it is the zip tool you are using that is causing the trouble. You should really try using do shell script with the copy command (cp) on some files with accented names to see if that works OK.

And you’ll always have the Automator file compression service at hand if everything else fails.
Though I think the labor spent figuring it out and making it work is well worth the effort: because of the nature of your language, you’ll end up in situations like this all the time if you don’t deal with it once and for all.

McUsrII,

I’m afraid I don’t get much of what you’re talking about here, sorry.
I’m on Mountain Lion (10.8.2). System Preferences > Language & Text > Region: Region is set to “Sweden (Swedish)”. Is that the pref that you’re talking about?

What do you mean by “how to set the locale settings globally”?

BTW, all my AppleScripts are fired off from FileMaker Pro. You’re talking about encoding; do you mean the text encoding of the script text in AppleScript Editor? I know FileMaker exports in UTF-16, but I don’t know if that has anything to do with executing the AppleScript from FM…

When I write “echo $LC_ALL” in Terminal, nothing happens.

Hello.

I think the first thing to do is to try simple copy (cp) operations on files with accented characters in a do shell script, and see if that succeeds. If it does, then it is the zip utility that uses ISO 8859-1 or ISO 8859-15, which renders the characters wrongly on your machine/OS.

This works just fine:

do shell script "cp -p " & "/Users/Tobias/Desktop/tést.pdf" & " " & "/Users/Tobias/Desktop/tést_copy.pdf"

Is that the right test?

Honestly, not really. :slight_smile:

We may not like it, but é has a more prominent place among the world’s languages, and is therefore more likely to be encoded correctly. Now, please retry with a filename like åskasnsräksmørbrød or something, and see if that turns out correctly!

:smiley:

This works as well!:

do shell script "cp -p " & "/Users/Tobias/Desktop/åskasnsräksmørbrød.pdf" & " " & "/Users/Tobias/Desktop/åskasnsräksmørbrød_copy.pdf"

Hello.

Well done. :slight_smile:

Then I think it is the zip utility that is at fault. Please either read the manual, or find an alternative to zip your files with. I think there are tons of utilities to zip with out there. The one I really recommend is ditto, which creates zip files and ships with your Mac. Please read man ditto. (You can also pass it parameters so that it doesn’t create any resource files, so that the zip can easily be opened on other machines.) There are some examples here that Stefan has written.

Edit
If you then get bad results, the reason is that every character you see is a byte sequence, and the archive utility interprets those byte sequences according to the locale. Characters outside the ASCII range (å, for instance, in Norwegian) need several bytes in UTF-8, and if the locale isn’t set, those bytes are likely to be interpreted incorrectly from our point of view. The fix, then, is to figure out where the environment.plist file is located on your machine (google Environment.plist Mac os x snow lion), set the variable LC_ALL there to sv_SE.UTF-8, and restart your machine. Then all programs should work correctly, as the underlying libraries that interpret bytes as code points/characters then know how to do so correctly.

Thanks again for your reply!

Based on Stefan’s script, this is what I came up with, and it works just fine:

set source to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/210 - Picassokören.disc"
set destination to "/Volumes/HD2/sjungarna_inspelningar/210 - Picassokören/zip-filer/210 - Picassokören.disc.zip"

do shell script "/usr/bin/ditto -c -k -rsrc " & quoted form of source & " " & quoted form of destination

Happy räksmörgås to you McUsrII !
:smiley:

Ahh.

Glad we figured out that zip doesn’t play totally well with UTF-8.