Convert to lower case | error message

Hi, back to Original Poster’s issue…
I don’t know the problem, but if you modify your code as follows then post the error result,
it will be easier for us all to pinpoint the problem rather than speculate…


try
	set the_list to (do shell script "echo " & quoted form of (get the clipboard) & " | tr [:upper:] [:lower:]")
on error
	tell application "TextEdit"
		make new document
		set text of document 1 to "echo " & quoted form of (get the clipboard) & " | tr [:upper:] [:lower:]"
	end tell
end try

That way we’ll all know what string pattern (** as seen by the shell **) has tripped a wire.

Model: PPC G4 Dual-800Mhz
AppleScript: 1.10.7
Browser: Safari 533.19.4
Operating System: Mac OS X (10.4)

Hello.

The erratic behaviour may very well have been caused by quotes in the text on the clipboard. Please do try this: :slight_smile:

It may be overkill, but this parsing of the clipboard should work with any do shell script command line you can imagine. (Please don’t prove me wrong. :wink: )

# set AppleScript's text item delimiters to ""
# set the clipboard to "Nik's \"BLOG\""
# set theText to the clipboard

set textWithEscapedQuotes to escapeQuotesInDoShellScriptInput for theText

set the_list to (do shell script "echo " & quoted form of textWithEscapedQuotes & " | tr [:upper:] [:lower:]")

set textWithUnescapedQuotes to unescapeQuotesFromDoShellScriptOutput for the_list

to unescapeQuotesFromDoShellScriptOutput for someText
	tell (a reference to text item delimiters)
		local originalTIDs, textWithUnescapedQuotes, temporaryVar
		set {originalTIDs, contents of it} to {contents of it, "\\\""}
		set temporaryVar to text items of someText
		set contents of it to "\""
		set textWithUnescapedQuotes to temporaryVar as text
		set contents of it to "\\'"
		set temporaryVar to text items of textWithUnescapedQuotes
		set contents of it to "'"
		set textWithUnescapedQuotes to temporaryVar as text
		set contents of it to originalTIDs
		# We reset the text item delimiters as we found them.
		return textWithUnescapedQuotes
	end tell
	
end unescapeQuotesFromDoShellScriptOutput

to escapeQuotesInDoShellScriptInput for someText
	tell (a reference to text item delimiters)
		local originalTIDs, textWithEscapedQuotes, temporaryVar
		# we save the original text item delimiters, and set them to '
		set {originalTIDs, contents of it} to {contents of it, "'"}
		set temporaryVar to text items of someText
		# we break down the text, to items between '
		set contents of it to "\\'"
		# we set new text item delimiters to \\'
		set textWithEscapedQuotes to temporaryVar as text
		# we reassemble the text with the new quotes.
		set contents of it to "\""
		# we repeat the steps for "
		set temporaryVar to text items of textWithEscapedQuotes
		set contents of it to "\\\""
		set textWithEscapedQuotes to temporaryVar as text
		# we reassemble the text
		set contents of it to originalTIDs
		# We reset the text item delimiters as we found them.
		return textWithEscapedQuotes
		# finally we return the text with escaped quotation marks.
	end tell
end escapeQuotesInDoShellScriptInput

I made two handlers from the original code above and updated the post with them, since that makes the code more reusable when you deal with clipboard text whose quotes may break the do shell script. This is mostly necessary if you echo the contents of a variable into a pipe. I think a here-string (<<< followed by quoted form of the input) should be a safe approach, but the circumstances aren’t always such that this approach is feasible, hence the handlers.
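
For readers who want to see what shell single-quoting does to a string like the test value above, here is a minimal Python sketch. It assumes that AppleScript's quoted form wraps the text in single quotes and rewrites each embedded ' as '\'' ; the helper name shell_single_quote and the sample text are hypothetical, and shlex.split only approximates how a POSIX shell would parse the command.

```python
import shlex

def shell_single_quote(text: str) -> str:
    # Assumed to mirror AppleScript's `quoted form of`: wrap in single quotes
    # and turn each embedded ' into '\'' (close quote, escaped quote, reopen).
    return "'" + text.replace("'", "'\\''") + "'"

clipboard = 'Nik\'s "BLOG"'  # hypothetical clipboard contents
command = "echo " + shell_single_quote(clipboard) + " | tr '[:upper:]' '[:lower:]'"
print(command)

# shlex.split applies POSIX shell quoting rules, so the second word is
# what echo would receive as its argument.
words = shlex.split(command)
print(words[1] == clipboard)
```

If the assumption holds, both kinds of quote reach echo unchanged without any extra escaping step, which is why the quoted-form approach is usually enough on its own.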

I recommend ASObjC Runner or Satimage OSAX, both of which take diacritics into account.

I agree with StefanK

But here is a quick and dirty way to roll your own, when you know which diacriticals you need to be able to handle.
As you can see, I have added æøå to the end of the 7-bit ASCII alphabet in the sed “translate” commands.

to makeUpper for sometext
	tell (do shell script "echo " & quoted form of sometext & "  | sed 'y/abcdefghijklmnopqrstuvwxyzæøå/ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ/'")
		return it
	end tell
end makeUpper

to makeLower for sometext
	tell (do shell script "echo " & quoted form of sometext & "  | sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ/abcdefghijklmnopqrstuvwxyzæøå/'")
		return it
	end tell
end makeLower

For that, you don’t need sed, tr will do just fine:

do shell script "tr '[a-zöüä]' '[A-ZÖÜÄ]' <<<'erhältlich'"

Why build a script with a short subset of non-ASCII characters when we can use existing tools able to treat every Unicode character?

I named ASObjC Runner and Python. StefanK added Satimage OSAX.

All of them do an excellent job.

KOENIG Yvan (VALLAURIS, France) Saturday 31 August 2013 22:56:42

First of all, because it’s possible, and we show in many topics that there are multiple ways to skin the cat; you can do this with PHP as well. Second, some scriptwriters, including me, avoid extra third-party software, which leaves only Python for me. Third, the tr command does support multibyte characters; the problem is that the character classes [:upper:] and [:lower:] are, at least in the default POSIX locale, simply equal to A-Z and a-z. So it’s not that tr doesn’t support Unicode; it’s that, according to the standards, those classes don’t contain extended characters. Finally, tr is a lot (more than 10 times) faster than Python.

First, let me say that I have no trouble using third-party tools, and that I started out by just showing a quick and dirty alternative, for which DJ Bazzie Wazzie had a more elegant solution.

Now, the problem with Python, at least in the past, was broken installations, so that is the first and foremost reason I wouldn’t use it. The second thing about Python is that it is more typing, and therefore more error-prone. So, frankly, I think the tr command is a better alternative here than Python. :slight_smile:

The problem with tr is that we must define the list of extended characters to translate in the command.
When I receive documents, I don’t know which extended characters are embedded, so the tr scheme is perfectly useless. I will not build a list of every possible extended character.
I know that it’s not a problem with texts written in English but English is not ruling the world.
So for a non English user, only a scheme treating the Unicode encoding(s) is relevant.
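
The difference between an explicit character list and a Unicode-aware conversion can be illustrated with a short Python sketch. This is not the handler mentioned elsewhere in the thread; the function names and the sample text are hypothetical, and the table simply mirrors the æøå lists used in the sed example.

```python
# Explicit translation table, analogous to the sed y/.../.../ and tr lists:
# only the characters named in the table are converted.
LOWER = "abcdefghijklmnopqrstuvwxyzæøå"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"
to_upper_table = str.maketrans(LOWER, UPPER)

def upper_by_table(text: str) -> str:
    # Characters missing from the table pass through unchanged.
    return text.translate(to_upper_table)

def upper_by_unicode(text: str) -> str:
    # str.upper() applies the Unicode case mappings to every character.
    return text.upper()

sample = "blåbær één"  # hypothetical input mixing listed and unlisted letters
print(upper_by_table(sample))    # the é characters are not in the table
print(upper_by_unicode(sample))
```

The table version is fine when the set of letters is known in advance; the second version needs no list at all, which is the point being argued here.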

I have never used PHP. In fact, I try to stay away from Unix tools unless no other scheme is available.

If someone is kind enough to post the PHP case-conversion incantations, I will be glad to try them.

For me there is no problem with Python’s convertcase commands.
A tested handler is available in a library file, from which I extract it when needed.
I posted it here some days ago.

KOENIG Yvan (VALLAURIS, France) Sunday 1 September 2013 16:01:00

In PHP you could use something like:

do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"

Python is about 2 times faster than PHP.

I’m not trying to say that I’m right, just explaining why I’m using tr, not why someone else should. In my language we only have a few more characters than English, so the list is very small. I understand why someone else would use third-party software or Python. The reason I don’t use third-party software is that I need to write scripts for hundreds of Macs running all kinds of different OS versions. All my scripts need to run on an out-of-the-box Mac.

My problem is that I receive files sent by users from numerous countries so I must have tools able to treat all of them.

When I write a script for an asker, I use the python handler.
When I write for my own use, I use the ASObjC Runner handler because this application is always opened during the boot process on my Mac.

Thanks for the PHP code.

KOENIG Yvan (VALLAURIS, France) Sunday 1 September 2013 16:45:21

Searching on the Net I found the other instructions to convert case with PHP.
So I wrote :

do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"
quoted form of result
"php -r 'echo mb_strtolower(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result
quoted form of result
"php -r 'echo mb_ucfirst(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result
quoted form of result
"php -r 'echo mb_ucwords(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result

The conversion to lower works well but the two others fail.

tell current application
do shell script “php -r ‘echo mb_strtoupper(fgets(STDIN), "UTF-8");’ <<<‘größe’”
→ “GRÖßE”
do shell script “php -r ‘echo mb_strtolower(fgets(STDIN), "UTF-8");’ <<<‘GRÖßE’”
→ “größe”
do shell script “php -r ‘echo mb_ucfirst(fgets(STDIN), "UTF-8");’ <<<‘größe’”
→ error "
Fatal error: Call to undefined function mb_ucfirst() in Command line code on line 1" number 255

If I don’t prefix the command name with mb_ I get :

tell current application
do shell script “php -r ‘echo mb_strtoupper(fgets(STDIN), "UTF-8");’ <<<‘größe’”
→ “GRÖßE”
do shell script “php -r ‘echo mb_strtolower(fgets(STDIN), "UTF-8");’ <<<‘GRÖßE’”
→ “größe”
do shell script “php -r ‘echo ucfirst(fgets(STDIN), "UTF-8");’ <<<‘größe’”
→ "
Warning: ucfirst() expects exactly 1 parameter, 2 given in Command line code on line 1"
do shell script “php -r ‘echo ucwords(fgets(STDIN), "UTF-8");’ <<<’
Warning: ucfirst() expects exactly 1 parameter, 2 given in Command line code on line 1’”
→ "
Warning: ucwords() expects exactly 1 parameter, 2 given in Command line code on line 1"
end tell

Puzzling.

KOENIG Yvan (VALLAURIS, France) Sunday 1 September 2013 17:17:16

Perhaps your list is small, but that’s exactly the kind of logic that so many programmers used to avoid offering proper Unicode support for a long time. I’m used to that sort of thing justified by English-speakers, but it troubles me more to hear it from others who have seen the problems it can cause.

You seem adamant about this, so I guess you never have to script non-Apple applications. Nonetheless, it seems a bit odd that in your quest to support all Macs, you’re happy to use language-specific hacks that, by definition, won’t run properly on lots of out-of-the-box Macs.

That’s the whole point of my post: you don’t have to convince me that you’re right. For me (and my clients) the tr solution can be perfect when I’m programming in AppleScript. For anyone who needs a global solution and isn’t restricted to plain AppleScript, there are many solutions out there. If it’s any consolation, in C I use the standard Unicode functions. :cool:

To me what is good is not starting too many processes. So, that’s always the objective.

For my part, I like to learn, so at this point I want to know what I am doing wrong when trying to convert only the first character of a sentence, or the first character of every word, using PHP.

Also, it would be nice if one day Unix became Unicode-aware.

KOENIG Yvan (VALLAURIS, France) Monday 2 September 2013 15:39:48

Aren’t you looking for this?

do shell script "php -r 'echo mb_strtoupper(fgets(STDIN), \"UTF-8\");' <<<'größe'"
quoted form of result
"php -r 'echo mb_strtolower(fgets(STDIN), \"UTF-8\");' <<<" & result
do shell script result
quoted form of result
"php -r 'echo ucfirst(fgets(STDIN));' <<<" & result
do shell script result
quoted form of result
"php -r 'echo ucwords(fgets(STDIN));' <<<" & result
do shell script result

If you want multibyte support, I suggest using the multibyte substring function (mb_substr) to get the first character, upper-casing that one-character string, and concatenating it with the rest of the original string. Unfortunately, PHP doesn’t have a built-in solution today. It’s a bit cumbersome in PHP, but it’s the only method to upper-case words that begin with extended characters, like the Dutch word for 1 (één).
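
The approach described above (take the first character, upper-case it, re-attach the rest) can be sketched in Python rather than PHP, since Python strings are indexed by character rather than by byte. The names ucfirst and ucwords are borrowed from the PHP functions for illustration only, and splitting words on single spaces is a simplifying assumption.

```python
def ucfirst(text: str) -> str:
    # Upper-case only the first character and keep the rest as it is.
    # Slicing works on characters, so a leading "é" is treated like any letter.
    return text[:1].upper() + text[1:]

def ucwords(text: str) -> str:
    # Upper-case the first character of each space-separated word.
    return " ".join(ucfirst(word) for word in text.split(" "))

print(ucfirst("één"))
print(ucwords("één größe"))
```

From AppleScript, something like this would be run through do shell script with the input passed as quoted text, in the same way as the PHP one-liners in this thread.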

Thanks.

Now I understand what was wrong in my attempts and I know that PHP isn’t the tool which I will use.

KOENIG Yvan (VALLAURIS, France) Tuesday 3 September 2013 11:05:02

Here’s one I wrote a while back. Still gets the magic done.

--Upper / Lower 1.0.3
--Panah Neshati

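on uppercase(a)
	-- expects a single character; only ASCII a-z is converted
	set b to ASCII number of a
	if b >= 97 and b <= 122 then set b to b - 32
	return ASCII character b
end uppercase

on lowercase(a)
	-- expects a single character; only ASCII A-Z is converted
	set b to ASCII number of a
	if b >= 65 and b <= 90 then set b to b + 32
	return ASCII character b
end lowercase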

lowercase("A")

uppercase("a")

Sorry to hear yours is broken, that sucks. Hope mine works for you. By the way, it uses no Python or PHP.

Panah

Model: MacBook Pro
AppleScript: 2.2.4
Browser: Firefox 23.0
Operating System: Mac OS X (10.8)