Confusing String Character Replacement

mmac · April 20, 2013, 3:38pm

Ok I’ll try to make this somewhat intelligible. A lot of the time I have extremely long random strings like:
09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud
Only they’re much longer.

My problem is that I need to replace all of the spaces with something like:
:1/
However, with each consecutive space, I want the number to increase by one, so that the string will look like:
09:1/qh:2/sf:3/kb09:4/gz:5/85wx85:6/2bs9yj:7/09:8/cts9yje7:9/ud:10/

I have a working script that uses a lot of text item delimiters and lists, but it’s incredibly long, slow, and inefficient. If it’s possible, I’m trying to get my script to only use strings.

I think I’m almost there. I know which line is causing the problem in my new script, but I don’t know how to fix it.

set thePrompt to display dialog "Enter the String here." default answer ""
set returnedInput to text returned of thePrompt
set newVar to returnedInput

considering case
	set NU to 1
	set CountString to count newVar
	set theInputString to text of newVar
	repeat CountString times
		set currentLetter to item NU of theInputString
		if currentLetter is equal to " " then
			set ReplaceLetter to ":" & NU & "/"
			set currentLetter to ReplaceLetter --??? This is the problem. :NU/ won't replace " ".
		end if
		set NU to NU + 1
	end repeat
	return theInputString
end considering

Any help is appreciated.

Model: iMac
Operating System: Mac OS X (10.8)

Marc_Anthony · April 20, 2013, 4:43pm

Hi. TIDs are usually fairly efficient, and I’m not sure why you’d need more than one.

set textList to {}
set suppliedText to (display dialog "Enter the String here." default answer "09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud")'s text returned
set AppleScript's text item delimiters to space

repeat with counter from 1 to (count suppliedText's text items)
	set textList's end to suppliedText's text item counter & ":" & counter & "/"
end repeat

set AppleScript's text item delimiters to ""

textList as text
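
As for why the loop in the original question leaves the string unchanged: `set currentLetter to ReplaceLetter` only gives the variable a new value and never touches theInputString, and NU counts character positions rather than spaces. A minimal string-only sketch is below, assuming a hypothetical handler name (numberSpaces) and a separate spaceCount variable, neither of which is in the original script. It builds a new string instead of editing the old one:

```applescript
-- Sketch only: build a new string rather than trying to edit the input in place.
on numberSpaces(theInputString)
	set outputString to ""
	set spaceCount to 0 -- spaces seen so far (not the character position)
	repeat with i from 1 to (count theInputString)
		set currentLetter to character i of theInputString
		if currentLetter is equal to " " then
			set spaceCount to spaceCount + 1
			set outputString to outputString & ":" & spaceCount & "/"
		else
			set outputString to outputString & currentLetter
		end if
	end repeat
	-- The example output in the question also tags the last entry.
	return outputString & ":" & (spaceCount + 1) & "/"
end numberSpaces

numberSpaces("09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud")
--> "09:1/qh:2/sf:3/kb09:4/gz:5/85wx85:6/2bs9yj:7/09:8/cts9yje7:9/ud:10/"
```

Character-by-character concatenation like this is likely to be slower than the text item delimiter version on very long strings, so treat it as an illustration rather than a replacement.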

mmac · April 20, 2013, 8:37pm

Well, thanks for making me feel like a complete idiot. Truthfully, I don’t really remember why I needed multiple TIDs. I made the script a long time ago when I first started using AppleScript. I had it doing something like putting in “,/” and then changing the comma to the colon and the number. For some reason it would freeze and lag a lot. I don’t know… it was too long ago. But thank you so much! This is so fast and simple!

While I have you… do you have any idea how to get this string to look something like this: 09:1,8/qh:2/sf:3/kb09:4/gz:5/85wx85:6/2bs9yj:7/cts9yje7:9/ud:10/

Notice that the repeated 09 between the :7 and :9 was deleted from the list and a ,8 was added at the 09:1. I would need the script to do this for any of the random repeated entries.

I doubt this is possible, but if you can think of anything to get me on the right track I’d really appreciate it.

Marc_Anthony · April 21, 2013, 12:16am

Hi. I’m not sure I understand the reasoning for the “,8” being added after the “09:1”, but you can always use an if consideration, where consistent logic exists. In the case of the repetitive 09, you could use a negative consideration to ensure that suppliedText’s text item counter is not in textList; it will likely be necessary for you to split the statement in the repeat block into two parts, rather than concatenate them as I did. Good luck.
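
A rough sketch of that negative test follows, assuming a hypothetical handler name (tagFirstOccurrences) and an extra seenList that holds the bare entries. It only skips the repeats; it does not add the “,8” style annotations:

```applescript
-- Sketch only: tag the first occurrence of each entry and skip later repeats.
on tagFirstOccurrences(suppliedText)
	set seenList to {} -- bare entries that have already been emitted
	set textList to {}
	set oldTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to space
	set theItems to suppliedText's text items
	set AppleScript's text item delimiters to ""
	considering case
		repeat with counter from 1 to (count theItems)
			set thisItem to item counter of theItems
			-- The negative test: only handle entries not seen before.
			if thisItem is not in seenList then
				set seenList's end to thisItem
				set textList's end to thisItem & ":" & counter & "/"
			end if
		end repeat
	end considering
	set resultText to textList as text
	set AppleScript's text item delimiters to oldTIDs
	return resultText
end tagFirstOccurrences

tagFirstOccurrences("09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud")
--> "09:1/qh:2/sf:3/kb09:4/gz:5/85wx85:6/2bs9yj:7/cts9yje7:9/ud:10/"
```

The test is made against seenList rather than textList because the items in textList already carry their “:n/” suffix and would never match a bare entry.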

kel1 · April 21, 2013, 12:35am

Hi,

I did this on the fly, so it is probably slow. It works, even if the last entry was used before.


set t to "09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud 09"
set new_text to ReplaceText(t, space, ":" & space & "/")
-- add dummies at beginning and end
set new_text to "/" & new_text & ":"
set new_text to ReplaceText(new_text, space, return)
set text_list to (paragraphs of new_text)
set c to count text_list
repeat with i from 1 to (c - 1)
	set search_item to item i of text_list
	if search_item is not "" then
		-- get matching indices
		set index_string to (i as string)
		repeat with j from (i + 1) to c
			considering case
				if item j of text_list is search_item then
					set index_string to (index_string & "," & j)
					-- clear list_item
					set item j of text_list to ""
				end if
			end considering
		end repeat
		-- append index_string to search_item
		set item i of text_list to search_item & index_string
	end if
end repeat

-- remove empty items
set new_text to text_list as string
if character -1 of new_text is ":" then -- append the count
	set append_text to (c as string) & "/"
else
	set append_text to "/"
end if
set new_text to (text 2 thru -1 of new_text) & append_text

on ReplaceText(t, s, r)
	set tids to AppleScript's text item delimiters
	set AppleScript's text item delimiters to s
	set temp_list to text items of t
	set AppleScript's text item delimiters to r
	set new_string to temp_list as string
	set AppleScript's text item delimiters to tids
	return new_string
end ReplaceText

Needs more testing, I think, but my brain is starting to ache.

Edited: added ‘considering case’

gl,

Model: MBP
AppleScript: 2.2.3
Browser: Safari 536.26.17
Operating System: Mac OS X (10.8)

DJ_Bazzie_Wazzie · April 21, 2013, 10:21am

Or using awk, it’s still pretty fast.


set theString to "09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud"
set bashList to replaceSubstring(theString, space, linefeed)
set bashList to do shell script "awk '{print $0\":\"NR\"/\"}' <<< " & quoted form of bashList
return replaceSubstring(bashList, return, "")

on replaceSubstring(str, src, trg)
	set {oldTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, src}
	set l to text items of str
	set AppleScript's text item delimiters to trg
	set str to l as string
	set AppleScript's text item delimiters to oldTID
	return str
end replaceSubstring

EDIT: There is only one problem: when the string contains linefeeds or returns. In that case you should split the text into a list of text items and perform this action on each text item.

DJ_Bazzie_Wazzie · April 21, 2013, 10:38am

Solution to my problem in previous post, a bit more code but still fast:

set theString to "09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud
09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud
09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud"
set lst to explode(theString, linefeed)
set spaceCounter to 0
repeat with p in lst
	set bashList to replaceSubstring(p, space, linefeed)
	set bashList to do shell script "awk 'BEGIN{i=" & spaceCounter & "}; {print $0\":\"++i\"/\"}' <<< " & quoted form of bashList without altering line endings
	set spaceCounter to spaceCounter + (number of paragraphs of bashList) - 1
	set contents of p to replaceSubstring(bashList, linefeed, "")
end repeat
return implode(lst, linefeed)

on replaceSubstring(str, src, trg)
	set {oldTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, src}
	set lst to text items of str
	set AppleScript's text item delimiters to trg
	set str to lst as string
	set AppleScript's text item delimiters to oldTID
	return str
end replaceSubstring

on explode(str, sep)
	set {oldTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, sep}
	set lst to text items of str
	set AppleScript's text item delimiters to oldTID
	return lst
end explode

on implode(lst, sep)
	set {oldTID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, sep}
	set str to lst as string
	set AppleScript's text item delimiters to oldTID
	return str
end implode

EDIT: You might wonder why I’m not setting ORS and RS to space in awk and just printing the right value after each record. The reason is that the end of the line then misses its space counter. This is also a bit confusing, because the TS adds that final counter in his example output but not in his example code. And awk still ends up with a newline at the end, which is treated as a space. I don’t know exactly why, but print and printf both show this behaviour. Almost exactly the same script in awk as the TS example code would be just ‘BEGIN{RS=" ";ORS=""}; {printf $0":"NR"/"}’ for the entire string including newlines, which is fast and small.
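
One probable source of that trailing newline is the `<<<` here-string itself, since the shell appends a newline to whatever it feeds in that way. The sketch below is a hedged variant that pipes the text in through `printf %s` instead, so the last record should arrive without a newline attached. The handler name (numberSpacesWithAwk) is made up for illustration, and it assumes the text has single spaces between entries:

```applescript
-- Sketch only: pipe the text to awk via printf so no newline is appended to the input.
on numberSpacesWithAwk(theString)
	-- RS is set to a space so that each entry becomes one awk record;
	-- an explicit format string keeps any % characters in the data from being interpreted.
	set awkProgram to "BEGIN { RS = \" \" } { printf \"%s:%d/\", $0, NR }"
	return do shell script "printf %s " & quoted form of theString & " | awk " & quoted form of awkProgram
end numberSpacesWithAwk

numberSpacesWithAwk("09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud")
```

With the sample string this is intended to give the same result as the TS’s example output, including the counter after the final entry.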

kel1 · April 21, 2013, 11:28am

Hi,

I was also trying this with the line-numbering command (‘=’) from the sed tutorial. I don’t know if that would end up being quicker though.

gl,

DJ_Bazzie_Wazzie · April 21, 2013, 2:26pm

In this particular situation there wouldn’t be much of a difference in using sed or awk. If you’re thinking in complete solutions, awk would be much better than sed in this case.

mmac · April 21, 2013, 6:13pm

Thanks, everybody, for your help. I really appreciate it a lot! I wish I had asked for help earlier rather than spending days trying to figure this out on my own. kel1, thank you. Your script does exactly what I need it to, so I’m going to stick with yours, but I really do appreciate everyone’s help. Now I just have to get to work making a script to reverse this.

Nigel_Garvey · April 22, 2013, 12:26pm

A bit late, but here’s a method very similar to kel1’s. Just a little more compact and very slightly faster:

set t to "09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud 09 ud sf 09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud 09 ud sf 09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud 09 ud sf 09 qh sf kb09 gz 85wx85 2bs9yj 09 cts9yje7 ud 09 ud sf"

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to space
set text_list to t's text items

set done_list to {}
repeat with i from 1 to (count text_list)
	set this_text to item i of text_list
	if (this_text is in done_list) then
		repeat with j from 1 to i
			if (item j of done_list is this_text) then
				set item j of text_list to item j of text_list & "," & i
				set item i of text_list to missing value
				exit repeat
			end if
		end repeat
	else
		set item i of text_list to this_text & ":" & i
	end if
	set end of done_list to this_text
end repeat

set AppleScript's text item delimiters to "/"
set new_text to (text_list's text as text) & "/"
set AppleScript's text item delimiters to astid

return new_text

mmac · April 23, 2013, 1:14am

Thanks Nigel, I can understand this really well. I’m horrible at text manipulation, but this, for the most part, is easy for me to follow. And with the size of the strings I sometimes deal with, every fraction of a second makes a difference. Thanks again.

kel1 · April 23, 2013, 3:13am

Hi,

Thanks Nigel for optimizing. I didn’t have the time today.

Hi mmac,

Concerning reversing the process, you could get your list of indices with a regular expression and Unix tools, or with vanilla AppleScript. But maybe you should start a new thread for that separate question. Quickly, suppose you extract the index list somehow from this:

“09:1,8,12/qh:2/sf:3/kb09:4/gz:5/85wx85:6/2bs9yj:7/cts9yje7:9/ud:10/uD:11/”

and end up with this:

“:1,8,12/”

You can eventually change this to:

“{1,8,12}” – uses curly braces

to get your AppleScript list, something like this:


set the_text to "{1,8,12}"
set index_list to run script the_text
--> {1, 8, 12} -- AppleScript list
item 3 of index_list
--> 12

gl,

Nigel_Garvey · April 23, 2013, 8:39am

Here’s one attempt:

set new_text to "09:1,8,11,14,21,24,27,34,37,40,47,50/qh:2,15,28,41/sf:3,13,16,26,29,39,42,52/kb09:4,17,30,43/gz:5,18,31,44/85wx85:6,19,32,45/2bs9yj:7,20,33,46/cts9yje7:9,22,35,48/ud:10,12,23,25,36,38,49,51/"

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "/"
set discretes_list to new_text's text items
set discretes_count to (count discretes_list) - 1

set AppleScript's text item delimiters to {":", ","}
set text_list to {}
repeat with i from 1 to discretes_count
	set item i of discretes_list to text items of item i of discretes_list
	-- This just builds text_list to the required length. The correct values will go in below.
	set text_list to text_list & rest of result
end repeat

repeat with i from 1 to discretes_count
	set this_discrete to item i of discretes_list
	set this_text to beginning of this_discrete
	repeat with j from 2 to (count this_discrete)
		set item (item j of this_discrete) of text_list to this_text
	end repeat
end repeat

set AppleScript's text item delimiters to space
set old_text to text_list as text
set AppleScript's text item delimiters to astid

return old_text

Edit: And here’s a slightly shorter, minutely faster variant:

set new_text to "09:1,8,11,14,21,24,27,34,37,40,47,50/qh:2,15,28,41/sf:3,13,16,26,29,39,42,52/kb09:4,17,30,43/gz:5,18,31,44/85wx85:6,19,32,45/2bs9yj:7,20,33,46/cts9yje7:9,22,35,48/ud:10,12,23,25,36,38,49,51/"

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"/", ","}
-- This just gets a list of the required length. The correct values will go in below.
set text_list to text items 1 thru -2 of new_text

set AppleScript's text item delimiters to "/"
set discretes_list to new_text's text items
set discretes_count to (count discretes_list) - 1

set AppleScript's text item delimiters to {":", ","}
repeat with i from 1 to discretes_count
	set this_discrete to text items of item i of discretes_list
	set this_text to beginning of this_discrete
	repeat with j from 2 to (count this_discrete)
		set item (item j of this_discrete) of text_list to this_text
	end repeat
end repeat

set AppleScript's text item delimiters to space
set old_text to text_list as text
set AppleScript's text item delimiters to astid

return old_text

kel1 · April 23, 2013, 8:00pm

Hi,

One thing I was thinking was that commas might be special characters only within the tags. Mainly, it separates the integers. But, it looks like commas won’t be used in the entries anyway, in this case, I think.

kel,

kel1 · April 23, 2013, 10:33pm

I was thinking that we could get a list of indices like this:


set new_text to "09:1,8,11,14,21,24,27,34,37,40,47,50/qh:2,15,28,41/sf:3,13,16,26,29,39,42,52/kb09:4,17,30,43/gz:5,18,31,44/85wx85:6,19,32,45/2bs9yj:7,20,33,46/cts9yje7:9,22,35,48/ud:10,12,23,25,36,38,49,51/"

set t1 to (offset of ":" in new_text) + 1
set t2 to (offset of "/" in new_text) - 1
set index_list to run script ("{" & (text t1 thru t2 of new_text) & "}")

assuming that “:” and “/” are special characters.

Man, I remember now doing this problem before. ‘offset’ wasn’t that much faster than the other way, but I can’t remember the other way.

Thinking…

Nigel_Garvey · April 24, 2013, 12:09am

The entries all seem to consist of alphanumeric characters only, but commas could be accommodated thus:

set new_text to "09:1,8,11,14,21,24,27,34,37,40,47,50/qh:2,15,28,41/sf:3,13,16,26,29,39,42,52/kb09:4,17,30,43/gz:5,18,31,44/85w,,,x85:6,19,32,45/2bs9,yj:7,20,33,46/cts9yje7:9,22,35,48/ud:10,12,23,25,36,38,49,51/"

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "/"
set discretes_list to new_text's text items 1 thru -2
set discretes_count to (count discretes_list)

set text_list to {}
repeat with i from 1 to discretes_count
	set AppleScript's text item delimiters to ":"
	set {this_text, these_positions} to text items of item i of discretes_list
	set AppleScript's text item delimiters to ","
	set these_positions to text items of these_positions
	set item i of discretes_list to {this_text, these_positions}
	-- This just builds text_list to the required length. The correct values will go in below.
	set text_list to text_list & these_positions
end repeat

repeat with i from 1 to discretes_count
	set {this_text, these_positions} to item i of discretes_list
	repeat with j from 1 to (count these_positions)
		set item (item j of these_positions) of text_list to this_text
	end repeat
end repeat

set AppleScript's text item delimiters to space
set old_text to text_list as text
set AppleScript's text item delimiters to astid

return old_text

kel1 · April 24, 2013, 12:41am

Alright, thanks. That was bothering me a little.

kel1 · April 24, 2013, 12:48am

Hi Nigel,

Nice script.

Edited: wow, that script is elegant.

kel1 · April 24, 2013, 1:34am

That’s what I should have done instead of nesting the repeat loop: first get the list of indices.