How to replace non-ASCII chars?

Hi there,
I am trying to replace special German characters in a string with ordinary ASCII chars.

After some digging I borrowed some code to do the needed work.
The problem is that it doesn’t replace chars like ‘ü’.

Does anyone have an idea how to solve this problem with AppleScript?

Thanks in advance and greetings from Germany,
Thomas

I tried to simplify my code as much as I could:


set myFile to choose file
tell application "Finder" to set fileName to name of myFile
set fileName to snr(fileName, "ü", "ue")
display dialog fileName

on snr(the_string, search_string, replace_string)
	return my list_to_string((my string_to_list(the_string, search_string)), replace_string)
end snr

on list_to_string(the_list, the_delim)
	my atid(the_delim)
	set the_string to (every text item of the_list) as string
	my atid("")
	return the_string
end list_to_string

on string_to_list(the_string, the_delim)
	my atid(the_delim)
	set the_list to (every text item of the_string) as list
	my atid("")
	return the_list
end string_to_list

on atid(the_delim)
	set AppleScript's text item delimiters to the_delim
end atid
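
For what it's worth, the helper handlers above can be collapsed into a single one. This is only a sketch: the handler name replaceText is made up for illustration, and it assumes the text and the search string use the same Unicode form. It restores the previous text item delimiters instead of resetting them to "", and it wraps the work in considering case so that "Ü" and "ü" are treated as different characters.

```applescript
-- Hypothetical helper (name chosen for illustration): replace every
-- occurrence of searchString in theText with replaceString.
on replaceText(theText, searchString, replaceString)
	-- Remember the current delimiters so they can be restored afterwards
	set oldDelims to AppleScript's text item delimiters
	considering case
		-- Split the text on the search string...
		set AppleScript's text item delimiters to searchString
		set theParts to text items of theText
		-- ...and join the pieces again with the replacement
		set AppleScript's text item delimiters to replaceString
		set newText to theParts as text
	end considering
	set AppleScript's text item delimiters to oldDelims
	return newText
end replaceText

replaceText("Müller über", "ü", "ue")
```

Inside a tell block the call would need to be written as my replaceText(...).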

Is this what you are looking for?


set searchItems to {{"ü", "ue"}, {"é", "e"}}

set stringToSearch to "hoügiv anédMü"

repeat with i from 1 to count of searchItems
	set searchItem to item i of searchItems
	set stringToSearch to do shell script "echo " & quoted form of stringToSearch & " | sed 's/" & item 1 of searchItem & "/" & item 2 of searchItem & "/g'"
end repeat

Hi Craig,
Thank you for your post. Your solution works just fine, thanks! :slight_smile:

I just found out that the script does not work if I pick a file name from my disk. :frowning:

How can that be? Is the code page different?


set myFile to choose file
tell application "Finder" to set fileName to name of myFile

set searchItems to {{"ü", "ue"}, {"Ü", "Ue"}, {"ö", "oe"}, {"Ö", "Oe"}, {"ä", "ae"}, {"Ä", "Ae"}, {"é", "e"}, {" ", "_"}}

set stringToSearch to fileName

repeat with i from 1 to count of searchItems
	set searchItem to item i of searchItems
	set stringToSearch to do shell script "echo " & quoted form of stringToSearch & " | sed 's/" & item 1 of searchItem & "/" & item 2 of searchItem & "/g'"
end repeat
display dialog stringToSearch

I am perplexed. I will keep looking into it but hopefully
someone will pipe in whose knowledge goes a bit deeper
than mine. :stuck_out_tongue:

Regards,

Craig

See if this works.


set myFile to choose file

tell application "Finder" to set stringToSearch to (name of myFile)
set stringToSearch to characters of stringToSearch

set searchItems to {"ü", "Ü", "ö", "Ö", "ä", "Ä", "é", ","}
set replaceItems to {"ue", "Ue", "oe", "Oe", "ae", "Ae", "e", "_"}


set cleanString to ""

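repeat with i from 1 to count of stringToSearch
	set thisC to item i of stringToSearch
	-- Keep the character unless it appears in searchItems
	set replacement to thisC
	repeat with j from 1 to count of searchItems
		if thisC is item j of searchItems then
			-- Use the position in searchItems, not the position in the name
			set replacement to item j of replaceItems
			exit repeat
		end if
	end repeat
	set cleanString to cleanString & replacement
end repeat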

cleanString

Thanks again.
Your second solution also works with file names; I just tried it :slight_smile:

For readers who might benefit from this script as well:
Since AppleScript ignores case in text comparisons unless told otherwise,
it may be necessary to wrap the code in a ‘considering case’ block.

cheers


considering case
	
	set myFile to choose file
	
	tell application "Finder" to set stringToSearch to (name of myFile)
	set stringToSearch to characters of stringToSearch
	
	set searchItems to {"Ü", "ü", "Ö", "ö", "Ä", "ä", "é", ","}
	set replaceItems to {"Ue", "ue", "Oe", "oe", "Ae", "ae", "e", "_"}
	
	set cleanString to ""
	
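	repeat with i from 1 to count of stringToSearch
		set thisC to item i of stringToSearch
		-- Keep the character unless it appears in searchItems
		set replacement to thisC
		repeat with j from 1 to count of searchItems
			if thisC is item j of searchItems then
				-- Use the position in searchItems, not the position in the name
				set replacement to item j of replaceItems
				exit repeat
			end if
		end repeat
		set cleanString to cleanString & replacement
	end repeat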
end considering
cleanString
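
The effect of ‘considering case’ can be checked in isolation. A minimal sketch, with variable names chosen only for illustration: it compares "Ü" with "ü" once with the default settings and once inside a considering case block.

```applescript
-- With the default settings AppleScript ignores case when comparing text
set defaultResult to ("Ü" is equal to "ü")

-- Inside a considering case block the two characters are compared strictly
considering case
	set strictResult to ("Ü" is equal to "ü")
end considering

{defaultResult, strictResult}
```

If the two results differ on your machine, a search list that contains both "Ü" and "ü" needs the considering case block to pick the right entry.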

The sed method also works if you read the file name as string:

tell application "Finder" to set stringToSearch to name of myFile as string

I could not explain why, however.

That only makes a difference if your machine runs Tiger.
In Leopard there is only one text class, so there is no difference between string, Unicode text and text.
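
A possible explanation for the file-name case, which nobody in this thread confirmed: HFS+ stores file names in decomposed Unicode form, so an ‘ü’ coming from the Finder may be a ‘u’ followed by a combining diaeresis rather than the single precomposed character typed into the script. A byte-wise sed pattern would then not match. The sketch below only shows how to inspect the code points on Leopard or later (it relies on the id property of text, which Tiger does not have); the variable names are made up for illustration.

```applescript
-- Precomposed "ü": a single code point (U+00FC, decimal 252)
set composedChar to character id 252
-- Decomposed "ü": "u" (117) plus a combining diaeresis (U+0308, decimal 776)
set decomposedChar to string id {117, 776}

-- The id property returns the code point(s) that make up a text value
set composedIds to id of composedChar
set decomposedIds to id of decomposedChar

{composedIds, decomposedIds}
```

Asking for the id of a name returned by the Finder shows which of the two forms you actually got.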