To lose underscores

alastor933 · July 21, 2012, 9:59am

I like my variables camelCased: underscores, for me, reduce legibility.
When a copied or downloaded script has just a handful of lines, changing them by hand is not that bad.
But every now and then I encounter a longer script I want to unravel, and then it becomes quite unpleasant.
So I did something about it.

A complication: there’s a convention that uses a prefixed underscore to label private handlers
(handlers which the main script should not call), and those should be kept.

After a bit of tinkering I came up with this:

(*
	When doing an entire script this can be somewhat speeded up
	by first checking if a line contains underscores at all
*)
set theText to "		property _private_handler_ : {}"

stripUnderscores(theText)

----------------------------------------------------------------------
-- Convert variable names with underscores to camelCase convention
-- Prefixed underscores are retained - they designate private handlers
-- parameter:	<text> a line of code containing underscores
-- returns:		<text> processed line
----------------------------------------------------------------------
on stripUnderscores(theCode)
	
	set doneList to {} -- collect parsed text
	
	set i to 1 -- must control loop counter within loop
	
	repeat
		
		try
			set x to text item i of theCode
			set y to text item (i + 1) of theCode
			set z to text item (i + 2) of theCode
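		on error -- fewer than three characters left
			-- x and y may be stale after a 3-step skip, so append the real remainder
			if i ≤ (count theCode) then set end of doneList to text i thru -1 of theCode
			exit repeat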
		end try
		
		-- we want to keep prefixed underscores
		if (x is not space) and (y is "_") then
			-- skip the underscore
			-- uppercase the next character
			set end of doneList to {x, uppercase z} -- using Satimage osax
			-- skip to next triplet
			set i to i + 3
		else
			-- nothing going on, just continue
			set end of doneList to x
			set i to i + 1
		end if
		
	end repeat
	
	return doneList as text
	
end stripUnderscores

It’s simple, and obviously not very fast, having to work its way through the entire script text one text item at a time.
Not a problem, really, as you wouldn’t use this every 10 minutes. Yet, there may be a more efficient way of doing this.
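A rough sketch of the whole-script loop hinted at in the opening comment, assuming the script source is already in a text variable (for example, the output of osadecompile). The handler name stripUnderscoresFromScript is made up for this illustration. Lines without underscores are passed through untouched, so the slow per-character loop only runs where it is needed.

```applescript
-- Hypothetical wrapper: run stripUnderscores() only on lines that need it
on stripUnderscoresFromScript(scriptText)
	set doneLines to {}
	repeat with thisLine in paragraphs of scriptText
		set lineText to contents of thisLine
		-- cheap test first: most lines contain no underscore at all
		if lineText contains "_" then set lineText to stripUnderscores(lineText)
		set end of doneLines to lineText
	end repeat
	
	-- Rejoin with linefeeds (this normalises the original line endings).
	-- The delimiters are changed only after all lines have been processed,
	-- because stripUnderscores() above relies on the default delimiters.
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to linefeed
	set newText to doneLines as text
	set AppleScript's text item delimiters to astid
	return newText
end stripUnderscoresFromScript
```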

Also, there might be some pitfall I missed: something that gets lost when using “text items of”.
Note that this would be used on a compiled script (using osadecompile).

I welcome your thoughts.

McUsr · July 21, 2012, 11:15am

Oh do you!

It is a nice script, elegant and simple, but maybe slow. It is not for me though.

It is funny how different we are!

I tend to use underscores when I do serious stuff, I mean when I really DO something: then the underscores suddenly pop up in the variable names! No offence, but I don’t see camelCase as any different from any other variable name, just harder to type!

By the way, when I suspect a variable name is so general that it may conflict with a dictionary word, I tend to put an underscore in front of it. I find that more readable: _var compared to |var|.

It is a shame, though, that there isn’t any decent editor on OS X, at least that I know of, that supports camel casing. I tend to like Dromedar casing better :D. That is, just one uppercase letter in the variable name.

You asked for it!

Edit:
There is actually code in which I find underscores misplaced, but I haven’t seen any in a long time, and it is even longer since I copied such code. But never say never: I may put your script to good use one day!

alastor933 · July 21, 2012, 12:20pm

Yes, those piped vars are totally dreadful. They tend to show up a lot when reading plists with System Events.

While on the subject of how we write scripts: are there any tools (in the conceptual sense, not the actual application) you use before actually writing the script? You have posted some very long ones - I could not write those without first “organising my thoughts” in some diagram or other (the comments in my script above are the polished-up pseudocode I started with).

Ha! So they are in your muscle memory, not your head!

Nigel_Garvey · July 21, 2012, 12:52pm

Hi, Alastor933.

You’re using ‘text items’ with no reference at all to ‘text item delimiters’, which is never a good idea.
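To illustrate the point with a minimal sketch (the variable names are invented for the example): what ‘text item i’ returns depends on whatever the delimiters happen to be when the line runs, whereas ‘character i’ does not.

```applescript
set sample to "my_var"

-- Suppose earlier code left the delimiters set to an underscore
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "_"

set viaTextItem to text item 2 of sample -- "var", not the second character
set viaCharacter to character 2 of sample -- "y", whatever the delimiters are

-- Always put the previous delimiters back
set AppleScript's text item delimiters to astid
```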

It would probably be faster actually to get the text items of the code using an underscore as the delimiter. You’d then only have to look at what’s either side of the gaps:

set theText to "		property _private_handler_ : {}"

stripUnderscores(theText)

on stripUnderscores(theCode)
	if (theCode contains "_") then
		set astid to AppleScript's text item delimiters
		set AppleScript's text item delimiters to "_"
		set textItems to theCode's text items
		repeat with i from 2 to (count textItems)
			set previousTI to item (i - 1) of textItems
			if (((count previousTI) > 0) and (character -1 of previousTI is in " ({")) then
				set item i of textItems to "_" & item i of textItems
			else
				set idList to (id of item i of textItems)
				if (idList is not {}) then
					set initial to beginning of idList
					if (initial > 96 and initial < 123) then set item 1 of idList to initial - 32
					set item i of textItems to string id idList
				end if
			end if
		end repeat
		set AppleScript's text item delimiters to ""
		set theCode to textItems as text
		set AppleScript's text item delimiters to astid
	end if
	
	return theCode
end stripUnderscores

DJ_Bazzie_Wazzie · July 21, 2012, 2:48pm

Or a single Perl command. I use Perl because the sed that ships with OS X is not GNU sed, so the \u instruction won’t work.

set x to "tell application \"Finder\"
set window_name to name of window 1
end tell

on example_function()

end

on _another_example()

end
"

do shell script "echo " & quoted form of x & " | perl -pe 's/([^ ])_([a-z])/\\1\\u\\2/g'"
--or remove the first _ 
do shell script "echo " & quoted form of x & " | perl -pe 's/([^ ])_([a-z])/\\1\\u\\2/g ; s/([ ])_([a-z])/ \\2/g'"

Probably the fastest solution.
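The pattern above treats only a space as the marker of a prefixed underscore. A possible variant, offered as a sketch rather than a tested replacement, also leaves underscores alone after tabs and opening brackets, as the AppleScript version earlier in the thread does. It uses a lookbehind so that back-to-back matches such as a_b_c are both converted. The handler name camelCaseWithPerl is hypothetical.

```applescript
-- Hypothetical wrapper around a Perl one-liner
on camelCaseWithPerl(theCode)
	-- (?<=[^\s({]) : the underscore must follow a character that is not
	--                whitespace, "(" or "{", so prefixed underscores are kept
	-- _([a-z])     : the underscore and the lowercase letter after it
	-- \u$1         : that letter, uppercased; the underscore itself is dropped
	set perlCommand to "perl -pe 's/(?<=[^\\s({])_([a-z])/\\u$1/g'"
	return do shell script "echo " & quoted form of theCode & " | " & perlCommand
end camelCaseWithPerl

camelCaseWithPerl("set window_name to _my_helper(the_list)")
-- "set windowName to _myHelper(theList)"
```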

alastor933 · July 21, 2012, 2:51pm

Hi, Nigel.
So true. There’s me just diving in again…

Ahh… (more diving in).

I added some comments to your core code to see if I can follow its workings. Take a look, if you will, and explain what I didn’t get:

set theText to "set this to {_that, _andmore}"
set moreText to "on test(_that)"

-- the 2nd condition tests against a list to allow code like the samples above
if ((count previousTI) > 0) and (character -1 of previousTI is in " ({") then
	-- (skip empty string)
	-- when the preceding item ends with a space (or a few others) we want the underscore
	-- so we add it back in to the front of the current item
	set item i of textItems to "_" & item i of textItems
else
	-- when not a space we had an in-word underscore
	-- between preceding and current item
	
	-- turn current item into list of ASCII codes (deprecated jargon ;)
	set idList to (id of item i of textItems)
	
	-- does this skip case where underscore was ultimate char?
	if (idList is not {}) then
		-- I understand this uppercases the 1st char
		-- just for fun, or is there a reason not to use Satimage? Speed?
		set initial to beginning of idList
		if (initial > 96 and initial < 123) then set item 1 of idList to initial - 32
		-- and back to (now titlecased) string
		set item i of textItems to string id idList
	end if
end if

Nigel_Garvey · July 21, 2012, 5:08pm

Hi. Yes. The text item’s empty if the underscore’s the last character in the text. You also get empty text items when there are two or more underscores together, which I’ve seen once or twice!

I did the capitalisation with id codes just because it occurred to me to do it that way. I didn’t use Satimage because not everyone has it.

The script still needs some adjustments to prevent the zapping of underscores in literal strings and (possibly) comments!
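One possible adjustment for the literal-string case, as a minimal sketch only: split the line at double quotes and convert just the odd-numbered pieces, which lie outside string literals. It assumes the delimiter-safe stripUnderscores handler posted above (the version that saves and restores the delimiters), and it assumes the line contains no escaped quotes (\") and no quotes inside comments; either would throw the odd/even count off. The handler name stripOutsideStrings is made up for the example, and comments are not handled at all.

```applescript
-- Hypothetical helper: leave underscores inside "quoted strings" alone
on stripOutsideStrings(theLine)
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to quote
	set parts to text items of theLine
	-- items 1, 3, 5, … are code; items 2, 4, 6, … are string contents
	repeat with i from 1 to (count parts) by 2
		set item i of parts to stripUnderscores(item i of parts)
	end repeat
	-- stripUnderscores restores the delimiters it found, so this joins with quotes
	set newLine to parts as text
	set AppleScript's text item delimiters to astid
	return newLine
end stripOutsideStrings

stripOutsideStrings("set my_var to \"keep_this\" & other_var")
-- "set myVar to \"keep_this\" & otherVar"
```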

McUsr · July 21, 2012, 5:47pm

Hello!

It happens that I plan a script, but I never spend a whole lot of time planning up front, as things may change even when I plan for them. I see it as more important to get something up and running, so prototyping is my way of doing things. Of course, when I recognize something as complicated and complex, I plan that part with diagrams, but there is no formal flowchart to be written, just telling you!

The underscores, I believe, are not in my muscle memory. It is just that when I perceive what I am doing as difficult and complex, and the stuff is buried deep, deep down, I guess some associations to the Standard C library turn up.

alastor933 · July 22, 2012, 8:18am

Nigel, muito obrigado for your different take on my problem. I’m tempted to time both methods.

McUsr: well, I guess it must be some nonstandard wiring in my brain. I get lost quickly in a long script, so when it’s not mine I have to spend quite a bit of time to form a mental image of what it does, and how.

McUsr · July 22, 2012, 10:41am

Something that may work if you want to understand it is to insert log statements, for starters, and watch which handlers are called…

It is laborious enough to write large scripts without also supplying them with diagrams, other than the ones that go into the notebook.

You know, you don’t have to understand Word in order to use it, and it should be the same way with scripts, as long as they don’t do anything malicious and do what they are supposed to do.

I feel it is still important to avoid the most finicky constructs, so that at least I understand what the code does after having been parted from it for a while.

I may use different diagramming approaches. One I like is the UML state diagram; there are, as a matter of fact, two different state diagrams I like, and I like SDL and normal flowcharting.

Though I seldom take the time to make full models, only covering what I find to be problem areas.

I am a “bottom-up man”: I feel I have a better chance of making a good solution by working from the bottom up and then iterating on the solution. I don’t believe in planning everything, as with the waterfall method. I frankly see such approaches as a waste of time on small projects, as reality will change while you develop and learn your problem anyway.

It is kind of an art form, having a hunch for what to plan and what not.