Converting half of a paragraph to lower case

With some context, I use TextEdit quite a bit to edit source code from old Assemblers. A lot of the times the source is in all upper case. Usually a paragraph will have a comment and the comment always starts with a semicolon.

In each paragraph then, I locate the semi-colon, then convert all the text after the semi-colon to lower case with the paragraph obviously ending with a return mark.

I can put paragraphs into a variable and find the semi-colon, but I don’t know how to continue after the semi-colon is found.

  set j to count of para

  repeat with i in para
    set var to (ASCii number of i)
    set str to str & (ASCii character var)
    if var = 58 or var = return then exit repeat
  end repeat

  repeat with k in para (from i to j)   // ***
    set var to (ASCii number of k)
    if var = return then exit repeat
    if var > 63 and var <90 then set var to var + 32
    set str to str & (ASCii character var)
  end repeat

Where the 3 stars are, I would like variable k to start where variable i left off. But k and i are characters, not positions in the paragraph.

How do I do this?

Here is a simple way to convert each paragraph’s ‘comment’ to lower case.

I took the first three paragraphs of this page as sample text. I used linefeeds to reform each paragraph but you can change that to return if you like.

set tx to "With some context, I use TextEdit quite a bit to edit source code from old Assemblers. A lot of the times the source is in all upper case; USUALLY A PARAGRAPH WILL HAVE A COMMENT AND THE COMMENT ALWAYS STARTS WITH A SEMICOLON.
In each paragraph then, I locate the semi-colon, then convert all the text after the semi-colon; TO LOWER CASE WITH THE PARAGRAPH OBVIOUSLY ENDING WITH A RETURN MARK.
I can put paragraphs into a variable and find the semi-colon, but I don’t know how; TO CONTINUE AFTER THE SEMI-COLON IS FOUND."

set ptx to paragraphs of tx

set lowerTx to {}
set text item delimiters to ";"
repeat with para in ptx -- each paragraph 
	set regCase to text item 1 of para -- regular case
	set upCase to text item 2 of para -- upper case
	
	-- convert from ';' 
	set qupCase to quoted form of upCase
	set lowCase to do shell script "printf %s " & qupCase & " | tr '[:upper:]' '[:lower:]'"
	
	-- recombine each paragraph	
	set para to {regCase, lowCase} as text
	set end of lowerTx to para -- add to list of modified paragraphs
end repeat

set text item delimiters to linefeed
lowerTx as text

Basically, the script breaks the text first into paragraphs, and then each paragraph into two segments, split on the semicolon. Of course, if a paragraph contains more than one semicolon, the output is affected: only text items 1 and 2 are used, so anything after the second semicolon is dropped.

It then cycles through the paragraphs, using tr to convert each second segment to lower case.

It then rejoins the segments and reassembles the paragraphs.
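The split, lower-case, and rejoin steps can also be sketched in plain shell for a single line. This is only an illustration: the function name lowercase_comment and the sample assembler line are made up, and unlike the AppleScript above it keeps everything after the first semicolon, including any later semicolons.

```shell
# Hypothetical helper: lowercase everything after the first semicolon in one line.
lowercase_comment() {
  line=$1
  case $line in
    *\;*)
      code=${line%%;*}      # text before the first semicolon
      comment=${line#*;}    # text after the first semicolon
      lower=$(printf '%s' "$comment" | tr '[:upper:]' '[:lower:]')
      printf '%s;%s\n' "$code" "$lower"
      ;;
    *)
      printf '%s\n' "$line" # no semicolon: leave the line unchanged
      ;;
  esac
}

# Sample call with a made-up assembler line
lowercase_comment 'LDA #$00 ; CLEAR THE ACCUMULATOR'
```

Passing the text as an argument to printf '%s' rather than as the format string keeps any % characters in the source from being interpreted.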

Thanks, you gave me enough to figure it out. I am on an older OS and AppleScript, which does not support a shell script or ObjC.

Two errors I was getting: “item 1” gives an error on an empty line, and “item 2” gives an error if the paragraph contains no semi-colon. This is my revised AppleScript that works for me. I have only one minor problem: it leaves a semi-colon at the end of every paragraph that does not contain one.

set newdoc to ""
set cr to (ASCII character 13)

tell application "TextEdit"
	set txt to text of document 1
end tell

set allpara to paragraphs of txt
set text item delimiters to ";"

repeat with eachpara in allpara
	set postpara to ""
	
	repeat 1 times
		if (count of eachpara) = 0 then exit repeat
		set prepara to text item 1 of eachpara
		
		repeat 1 times
			if (count of prepara) = (count of eachpara) then exit repeat
			set tempstr to text item 2 of eachpara
			
			repeat with k in tempstr
				set var to (ASCII number of k)
				if var > 63 and var < 90 then set var to var + 32
				set postpara to postpara & (ASCII character var)
			end repeat
		end repeat
		
		set newdoc to newdoc & {prepara, postpara}
	end repeat
	
	set newdoc to newdoc & cr as text
	--	display dialog {prepara, postpara} as text
end repeat

set the clipboard to newdoc as text

The extra semicolons appear because the text item delimiters are set to ‘;’: whenever a list is coerced to text, the delimiter is inserted between its items. That includes the following line, which coerces the list to text implicitly because its left operand is text. Even though postpara might be empty, the delimiter is still inserted.

set newdoc to newdoc & {prepara, postpara}

A solution would be to set the delimiters at the beginning of the second loop and again at the end. Set them to ‘;’ at the beginning; at the end, set them to ‘""’ if the paragraph contains no semicolon, or leave them alone if it does. There are probably other approaches to take but this one is simple enough.

if length of postpara is 0 then
	set text item delimiters to ""
end if

So here would be the entire script. Hopefully it works as intended.

set newdoc to ""
set cr to (ASCII character 13)

tell application "TextEdit"
	set txt to text of document 1
end tell

set allpara to paragraphs of txt

repeat with eachpara in allpara
	set postpara to ""
	repeat 1 times
		set text item delimiters to ";"
		if (count of eachpara) = 0 then exit repeat
		set prepara to text item 1 of eachpara
		
		repeat 1 times
			if (count of prepara) = (count of eachpara) then exit repeat
			set tempstr to text item 2 of eachpara
			repeat with k in tempstr
				set var to (ASCII number of k)
				-- display dialog k & tab & var
				if var > 64 and var < 91 then set var to var + 32
				set postpara to postpara & (ASCII character var)
			end repeat
		end repeat
		if length of postpara is 0 then
			set text item delimiters to ""
		end if
		set newdoc to newdoc & {prepara, postpara}
	end repeat
	
	set newdoc to newdoc & cr as text
end repeat
-- set the clipboard to newdoc as text
newdoc

Unrelated notes:

What OS are you running? Apple introduced shell scripts to AppleScript in 10.1 with AppleScript version 1.8 (in 2001). And there is no ASObjC in the above script.

Note that if you specify an AppleScript version at the top of your script, e.g. use AppleScript version "1.8", then you must follow that with use scripting additions or else the script will generate a syntax error (Expected end of line, etc. but found identifier.). FWIW, by including the use scripting additions line, I was able to run a shell script regardless of the specified version of AppleScript.

Regarding the ASCII commands, I’m sure that you are aware that they have been deprecated for some time and that id and character id are recommended instead. That said, the letters A-Z occupy ASCII 65 through ASCII 90, so I modified the numbers accordingly in the script above.

Thanks for all your help.
Your shell script works now as well as the delimiters. I didn’t have to use “use scripting additions”.

With regards to “id” and “character id”, “id” works but “character id” does not. It works if I change “character id” to “ascii character”. Is this correct?

repeat with k in tempstr
	set var to id of k
	if var > 64 and var < 91 then set var to var + 32
	set postpara to postpara & (character id of var)
end repeat

IamRob has a solution (or close to one) and his computer doesn’t support ASObjC. However, just for practice, I wrote an ASObjC solution. The timing result with 1025 paragraphs of text was 125 milliseconds.

use framework "Foundation"
use scripting additions

set theString to "This is a line 1 ; THIS IS A COMMENT ; THIS IS ANOTHER COMMENT
This is line 2

This is line 4 ; THIS IS A COMMENT"

set newString to lowercaseComments(theString)

on lowercaseComments(theString)
	set theString to current application's NSString's stringWithString:theString
	set theParagraphs to (theString's componentsSeparatedByString:linefeed)
	set newParagraphs to current application's NSMutableArray's new()
	repeat with aParagraph in theParagraphs
		if (aParagraph's containsString:";") is true then
			set firstSubstring to (aParagraph's stringByReplacingOccurrencesOfString:"(.*?);.*" withString:"$1" options:1024 range:{0, aParagraph's |length|()})
			set secondSubstring to (aParagraph's stringByReplacingOccurrencesOfString:".*?;(.*)" withString:"$1" options:1024 range:{0, aParagraph's |length|()})'s lowercaseString()
			set aParagraph to current application's NSString's stringWithFormat_("%@;%@", firstSubstring, secondSubstring)
		end if
		(newParagraphs's addObject:aParagraph)
	end repeat
	return (newParagraphs's componentsJoinedByString:linefeed) as text
end lowercaseComments

Not exactly. Remove the ‘of’ when using character id but keep it when using id.

set var to 95
character id var
--> "_"

id of "_"
--> 95

Or in the context of the script:

set postpara to ""
set tempstr to "USUALLY"

repeat with k in tempstr
	set var to (id of k)
	if var is greater than 64 and var is less than 91 then set var to var + 32
	set postpara to postpara & (character id var)
end repeat

postpara
--> "usually"

I’m glad that you got the shell script version working. You don’t always need the use scripting additions line, but the do shell script command is part of the standard additions. I think I read somewhere that there are a couple of scenarios in which you may see different results when the use… command is absent. Offhand, I do not recall the details.

I thought I had tried “character id var”. But all works now. Here is the final result. Using length instead of count seems to be a little faster as well. The shell script seems about the same speed as this one. 6 secs for 1800 lines. Using AS 2.2.1 on OSX 10.7.

set newdoc to ""
set cr to (character id 13)

tell application "TextEdit"
	set txt to text of document 1
end tell

set allpara to paragraphs of txt

repeat with eachpara in allpara
	set postpara to ""

	repeat 1 times
		set text item delimiters to ";"
		if (length of eachpara) = 0 then exit repeat
		set prepara to text item 1 of eachpara
		
		repeat 1 times
			if (length of prepara) = (length of eachpara) then exit repeat
			set tempstr to text item 2 of eachpara

-- replace these next 5 lines with the commented shell script to do the same thing

			repeat with k in tempstr
				set var to (id of k)

				if var > 64 and var < 91 then set var to var + 32
				set postpara to postpara & (character id var)
			end repeat

-- convert from ';' 
--	set qupCase to quoted form of tempstr
--	set postpara to do shell script "printf %s " & qupCase & " | tr '[:upper:]' '[:lower:]'"
	
		end repeat

		if length of postpara is 0 then
			set text item delimiters to ""
		end if

		set newdoc to newdoc & {prepara, postpara}
	end repeat

	set newdoc to newdoc & cr as text
end repeat

-- set the clipboard to newdoc as text
newdoc

Looks good and works well.

The problem with the shell version is that each time it’s called, there is time lost to it launching. So basically, each paragraph causes it to be re-launched. Someone competent at writing shell scripts could likely integrate it into a single call that handles all the looping in one go. I would expect such a script to be faster.
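As a rough sketch of that idea, a single awk invocation can handle every line in one pass. The function name lowercase_comments is made up for illustration, and the sketch assumes the text arrives on standard input with linefeed line endings (text that uses carriage returns would need converting first, for example with tr '\r' '\n'). It lowercases everything after the first semicolon of each line and leaves lines without a semicolon untouched.

```shell
# Hypothetical function: reads text on stdin, writes the converted text to stdout.
lowercase_comments() {
  awk '{
    i = index($0, ";")   # position of the first semicolon, 0 if there is none
    if (i > 0)
      $0 = substr($0, 1, i) tolower(substr($0, i + 1))
    print
  }'
}

# Sample run on two of the test lines
printf '%s\n' \
  'This is a LINE without a semicolon' \
  'This is a LINE; with TWO semicolons; A COMMENT' | lowercase_comments
```

From AppleScript, the whole document text could presumably be piped into such a command with one do shell script call instead of one call per paragraph; I have not timed that on a large file.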

This uses AppleScript and JavaScript:

tell application "TextEdit"
	set texto to text of document 1
end tell

set nutext to ""
set prgrphs to paragraphs of texto
set AppleScript's text item delimiters to {";"}
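repeat with aprgrph in prgrphs
	set aprgrph to contents of aprgrph
	if aprgrph contains ";" then
		set ti2js to text item 2 of aprgrph
		-- note: this breaks if the comment contains an apostrophe or a backslash
		set ti2 to (run script "t='" & ti2js & "'; t.toLowerCase();" in "JavaScript")
		-- put the semicolon back between the code and the lowercased comment
		set nutext to nutext & (text item 1 of aprgrph) & ";" & ti2 & return
	else
		-- no semicolon (or an empty line): keep the paragraph unchanged
		set nutext to nutext & aprgrph & return
	end if
end repeat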

set AppleScript's text item delimiters to {""}

tell application "TextEdit"
	set text of document 1 to nutext
end tell

I’m glad to hear that the revised AppleScript is working for you. Regarding the issue with leaving a semi-colon after paragraphs that don’t include a semi-colon within them, you can add a condition to check if the paragraph already ends with a semi-colon before adding one. Here’s how you can modify the script to address this:

set theText to "Text to process here"

-- Split the text into paragraphs
set paragraphList to paragraphs of theText

-- Process each paragraph
repeat with currentParagraph in paragraphList
    set theLength to length of currentParagraph
    
    -- Check if the paragraph ends with a semi-colon
    if theLength > 0 and last character of currentParagraph is not ";" then
        -- Add a semi-colon to the end of the paragraph
        set currentParagraph to currentParagraph & ";"
    end if
    
    -- Display the modified paragraph
    display dialog currentParagraph
end repeat

This modification checks if the last character of each paragraph is a semi-colon before adding one. If the paragraph already ends with a semi-colon, it won’t add another one. This should prevent unnecessary semi-colons from being added to paragraphs that already contain them.

It’s rather off topic, but as there is a recent topic about Swift, this shows the power of Swift.

Unlike the common regular expression syntax, it’s possible to modify the captured groups and even to modify the string in place.

var text = "code1; COMMENT1\ncode2; COMMENT2\ncode3; COMMENT3"
text.replace(/;[^\r\n]+/) { $0.output.lowercased() }
print(text) // "code1; comment1\ncode2; comment2\ncode3; comment3" 

FWIW, a shortcut suggestion for anyone on macOS Monterey or newer. An advantage of this solution is that it is run by way of the app’s Services menu on selected text. A disadvantage is that it is abysmally slow with a large document.

Lowercase Comments.shortcut (22.9 KB)

My test text:

This is a LINE without a semicolon
This is a LINE with a semicolon but no comment;
This is a LINE with one semicolon; A Comment
This is a LINE; with TWO semicolons; A COMMENT

The above LINE is blank; END