'end of list' or concatenation of strings

kel1 · October 10, 2014, 2:57am

Hi,

One thing that was nagging at my mind was whether it is better to concatenate two strings or to set the end of a list, as in my last answer to a poster. I thought someone had said that concatenating two strings takes longer than something else, but I can’t remember what the other thing was. In this situation:
http://macscripter.net/viewtopic.php?id=43156
I thought that using end of list was faster. Note that AppleScriptLanguageGuide says that it is fast and it used to be, but nowadays it doesn’t seem to matter in my opinion. On the other hand, someone said that concatenating two lists takes time on another post.

Short of me timing it for now, is there a logical reason why one is better than the other?

Edited: also, are these two situations not comparable? i.e. the difference is so small that it doesn’t matter?

Edited: actually, I should say that it can’t be compared, instead of not comparable.

Edited: I think that ‘end of list’ is better because it doesn’t use any operators. What do you think?

Thanks,
kel

kel1 · October 10, 2014, 3:20am

Here’s an article on this site about which is better:
http://macscripter.net/viewtopic.php?id=12562
But it is a different situation. They’re comparing concatenating two lists with adding to the end of a list. That’s not what I’m writing about. See the post.

Thanks,
kel

DJ_Bazzie_Wazzie · October 10, 2014, 10:26am

My advice: don’t take posts from 2004 too seriously. We’re not using AppleScript version 1.x anymore. AppleScript 2 (Leopard) brought many updates, among them more type checking at runtime, but also more memory from the system for better performance.

Between adding an item to an existing list and concatenating two lists into a third list, the version where only one item needs to be added is faster. This is because the same list is used and an item is added to its end. Concatenating two lists will take longer because a third list will be created and all items from the first and second lists will be added (copied) to it.
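
A minimal sketch of the difference described above, using made-up variable names (a, b, c, d are only for illustration). It relies on the fact that `set b to a` makes both variables refer to the same list, so an in-place append is visible through both, while concatenation builds a separate list:

```applescript
-- Appending: the existing list is modified in place.
set a to {1, 2}
set b to a -- a and b now refer to the same list
set end of a to 3
-- b also shows the new item, because no new list was created

-- Concatenating: a new list is built and assigned to c.
set c to {1, 2}
set d to c -- c and d refer to the same list for now
set c to c & {3}
-- d still refers to the original two-item list

{b, d} --> {{1, 2, 3}, {1, 2}}
```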

Nigel_Garvey · October 10, 2014, 12:28pm

Hi Kel.

Deciding what’s faster than something else in any particular situation is usually a matter of understanding (or having a feel for) what the code actually does and how many times it’s going to be doing it.

Concatenation involves taking two strings (or lists, or records) and creating a third containing all the elements from the other two. If you only want to append one piece of text to another, concatenation’s the most efficient way to do it, since only three pieces of data are involved: the two original texts and the result.

If you want to build a text from several smaller pieces, concatenation’s still possibly a good option. It depends on how many pieces there are and how long they are. In a sequence of concatenations, each ‘&’ is an individual concatenation and produces an intermediate result, which is fed into the next concatenation and then discarded:

"Now is the time " & "for all good men " & "to come to the aid " & "of the party."

-- Process:
"Now is the time " & "for all good men " --> "Now is the time for all good men " (intermediate string, 33 characters)
"Now is the time for all good men " & "to come to the aid " --> "Now is the time for all good men to come to the aid " (another intermediate string, 52 characters)
"Now is the time for all good men to come to the aid " & "of the party." --> "Now is the time for all good men to come to the aid of the party." (final result)

As you can see, this simple example creates two intermediate strings totalling 85 characters. Not a lot in this case, but the more concatenations there are in the chain, the more and longer are the waste strings created.

It’s sometimes feasible to parenthesise parts of a concatenation chain to reduce the amount of work done and waste produced:

"Now is the time " & "for all good men " & ("to come to the aid " & "of the party.")

-- Process:
"Now is the time " & "for all good men " --> "Now is the time for all good men " (intermediate string, 33 characters)
"to come to the aid " & "of the party." --> "to come to the aid of the party." (another intermediate string, 32 characters)
"Now is the time for all good men " & "to come to the aid of the party." --> "Now is the time for all good men to come to the aid of the party." (final result)

Same number of steps and strings, but only 65 characters in the intermediate strings.
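
The character counts above can be checked with a short script. This is only a sketch: the variable names (p1 to p4, chainWaste, groupedWaste) are invented for the illustration, and it measures the lengths of the intermediate strings rather than what AppleScript actually allocates internally:

```applescript
set p1 to "Now is the time "
set p2 to "for all good men "
set p3 to "to come to the aid "
set p4 to "of the party."

-- Left-to-right chain: the intermediates are (p1 & p2) and (p1 & p2 & p3).
set chainWaste to (length of (p1 & p2)) + (length of (p1 & p2 & p3))

-- Parenthesised version: the intermediates are (p1 & p2) and (p3 & p4).
set groupedWaste to (length of (p1 & p2)) + (length of (p3 & p4))

{chainWaste, groupedWaste} --> {85, 65}
```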

If you have to build a text from a very large number of smaller strings, it may indeed be worth gathering them in a list and coercing this to text at the end. The assumption (unproven, but likely) is that internally, the single coercion will build a single result from everything in the list instead of creating lots of intermediate results like the individual concatenations. The time and memory thus saved, along with the comparatively short time it takes to append pointers to lists, may make this approach worthwhile. However, the list itself will occupy memory and take a small amount of time to create. It may also still be true that a growing list is occasionally replaced (behind the scenes) with another when it exceeds the memory initially allocated to it. And the longer a list gets, the longer it takes to access its far items.
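
As a rough sketch of the gather-then-coerce approach, the handler below (joinPieces is a hypothetical name, not something from this thread) appends pieces to a list and coerces the list to text once at the end. It sets the text item delimiters to "" explicitly before the coercion, on the assumption that they may have been changed elsewhere in the script, and restores them afterwards:

```applescript
-- joinPieces is a hypothetical helper for illustration.
on joinPieces(pieceList)
	set tids to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "" -- nothing inserted between the pieces
	set joined to pieceList as text -- one coercion instead of many concatenations
	set AppleScript's text item delimiters to tids
	return joined
end joinPieces

set pieces to {}
repeat with i from 1 to 5
	set end of pieces to "item" & i & " "
end repeat
joinPieces(pieces) --> "item1 item2 item3 item4 item5 "
```

Whether this beats plain concatenation depends on how many pieces there are, as discussed above; for a handful of strings the difference is unlikely to be noticeable.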

But the above’s only useful if you know exactly what the code’s going to be handling. Often you don’t. In these cases, you just have to make your best judgement. As long as a script works at all, it’s obeying the First Law of Scripting. And given the speed of today’s computers, the actual method used often makes no discernible difference at all.

With regard to the scripts in the topic to which you linked in post #1, I’d say it wasn’t actually necessary to create the list of text items:

kel1:

set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set temp_list to text items 1 through -2 of file_name
set end of temp_list to "txt"
set new_name to temp_list as text
set AppleScript's text item delimiters to tids

You can extract the base name directly from the original and concatenate the new extension to it:

set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set new_name to text 1 through text item -2 of file_name & ".txt"
set AppleScript's text item delimiters to tids

kel1 · October 11, 2014, 5:02pm

Interesting info as usual.

Also, Nigel wrote:

set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set new_name to text 1 through text item -2 of file_name & ".txt"
set AppleScript's text item delimiters to tids

I’ve never seen:

set new_name to text 1 through text item -2 of file_name & ".txt"

You’re using two types of references in the same line!

Edited: I see it now in AppleScriptLanguageGuide.pdf:

Edited: I can’t get it to work now, using text and text item?

Thanks a lot,

Nigel_Garvey · October 11, 2014, 5:51pm

Yeah. Basically, a “range reference” consists of the class of the element you want to extract and two range boundary specifiers. Specifiers which are just numbers refer to the same class as the extracted element (although they mean characters when extracting ‘text’ from text), but you can specify other classes if required. eg.:

set txt to "There was a young man from Strathclyde
Who fell down a sewer and died.
The next day his brother
Fell into another
And now they're interred side by side."

set AppleScript's text item delimiters to "fell"

text 1 thru paragraph 2 of txt
words 2 thru paragraph 2 of txt
words from paragraph 3 to paragraph 4 of txt -- or words (paragraph 3) thru (paragraph 4) of txt
paragraphs from text item 2 to character -7 of txt -- paragraphs (text item 2) thru (character -7) of txt
characters from word 7 to -2 of txt -- characters (word 7) thru -2 of txt
text items from word 7 to -2 of txt -- text items (word 7) thru -2 of txt
-- or whatever.

You can do similar things with lists.

I’ve written about range references rather long-windedly here.
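
A smaller illustration of the same idea, as a sketch with a made-up sample string (s, r1 and r2 are names chosen only for this example). The boundary elements themselves are part of what is returned:

```applescript
set s to "The quick brown fox"

-- Both boundaries are words; the extracted class is text.
set r1 to text from word 2 to word 3 of s --> "quick brown"

-- A plain number as the first boundary means a character when extracting text.
set r2 to text 1 thru word 2 of s --> "The quick"

{r1, r2}
```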

kel1 · October 11, 2014, 5:59pm

Got it working with:

set t to "abc.def.ghi.jkl"
set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set new_t to text from text 1 to text item -2 of t
set AppleScript's text item delimiters to tids
new_t

Thanks,
kel

kel1 · October 11, 2014, 6:21pm

There’s one thing wrong with this, though it hardly ever happens: if you’re trying to change text that begins with “.”

--set t to "abc.def.ghi.jkl"
set t to ".jkl"
set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set new_t to text from beginning to text item -2 of t
set AppleScript's text item delimiters to tids
new_t

→ “.”

After concatenation with “.mno” you get “..mno”, i.e. the result wasn’t the empty string “”. But this would only fail with invisible folders like “.somefoldername”, e.g.

set old_folder to ".somefoldername"
set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set new_folder to text from beginning to text item -2 of old_folder & ".newfoldername"
set AppleScript's text item delimiters to tids
new_folder

→ “..newfoldername”

Nigel_Garvey · October 11, 2014, 7:26pm

Yeah. You have to remember that the boundary elements are included in the result. The code in this case should be:

set old_folder to ".somefoldername"
set tids to AppleScript's text item delimiters
set AppleScript's text item delimiters to "."
set new_folder to text from text item 1 to text item -2 of old_folder & ".newfoldername"
set AppleScript's text item delimiters to tids
new_folder
--> ".newfoldername"
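
The same idea could be wrapped in a handler, roughly as below. This is only a sketch: replaceExtension is a hypothetical name, and the `try` block is an added assumption to cope with names that contain no "." at all (where there is no text item -2 to refer to). It also makes sure the delimiters are restored in either case:

```applescript
-- replaceExtension is a hypothetical helper, not from the posts above.
on replaceExtension(fileName, newExtension)
	set tids to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "."
	try
		-- Both boundaries are text items, so a leading "." is not pulled into the base name.
		set baseName to text from text item 1 to text item -2 of fileName
	on error
		-- Assumption: no "." in the name means no text item -2, so keep the whole name.
		set baseName to fileName
	end try
	set AppleScript's text item delimiters to tids
	return baseName & "." & newExtension
end replaceExtension

replaceExtension("abc.def.ghi.jkl", "txt") --> "abc.def.ghi.txt"
```

Called with ".somefoldername" and "newfoldername", it should give the same ".newfoldername" result as the script above.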

ccstone · October 18, 2014, 12:35pm

Hey Kel1,

As Nigel mentioned it’s very much a matter of scale.

Using Shane Stanley’s handy script-timer utility Script Geek:


# String Concatenation
set _str to "A_Long_Word"
set newStr to ""
repeat with i from 1 to 10000
	set newStr to newStr & _str
end repeat

Avg of 3 Run sample: 0.462 sec.


# End of List
set _str to "A_Long_Word"
set _list to {}
repeat with i from 1 to 10000
	set end of _list to _str
end repeat
set _list to _list as text

Avg of 3 Run sample: 0.022 sec.

A little time here, and a little time there adds up in a complicated script, so I go for speed when I reasonably can without getting too obsessive about it.