Processing folder items in order.

hi,

I’m processing files in a folder. The files are displayed in OS X in the proper natural file order. The command “list folder”, however, returns a folder list in strict alphanumeric order. For example, files displayed in the Finder as

file1A
file2A
file10A

are captured by “list folder” as

file1A
file10A
file2A

I need to be able to capture the files in the order displayed.

is there a way to do this?

thanks

Hi,

I think you’re using a bad naming convention. You should name the files with leading zeros. For instance:

file01a
file02a
file10a

or add more leading zeros depending on how many files you got.

gl,

I understand the naming convention issue, but I’m not the originator of the files, and they are not a sequence like video stills. Either way, if the Finder can display them in a specific order, why can’t I process the files in the order displayed?

Clearly there is a natural-order option for the display of file lists in the Finder, so I’m wondering what parameters must be set to access that capability within AppleScript when running Finder operations.

thanks… :slight_smile:

There is no correlation between how a folder is viewed in the Finder and how AppleScript returns that list of files via ‘list folder’ (which isn’t even a Finder command).

The Finder’s ‘natural order’ is a special case handled by the Finder itself.

In any case, there is no guarantee as to how the files will be listed. If you want the files in a specific order, your only safe bet is to sort them yourself.
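
As a rough illustration of "sorting them yourself", here is a minimal sketch rather than anything posted in this thread. It assumes an AppleScript version that supports the `numeric strings` consideration attribute, which compares runs of digits by value. The handler name `naturalSort` is hypothetical, and the result only approximates the Finder's order (it does nothing special with underscores, for instance).

```applescript
-- Hypothetical helper: a simple insertion sort that compares names with
-- digit runs treated as numbers.
-- ASSUMPTION: 'considering numeric strings' is available in this AppleScript version.
on naturalSort(theNames)
	set sortedNames to {}
	repeat with nameRef in theNames
		set nameText to contents of nameRef
		-- Default position: after everything already sorted
		set insertAt to (count sortedNames) + 1
		considering numeric strings
			repeat with i from 1 to (count sortedNames)
				if nameText < (item i of sortedNames) then
					set insertAt to i
					exit repeat
				end if
			end repeat
		end considering
		-- Rebuild the list with the new name in its place
		if insertAt = 1 then
			set sortedNames to {nameText} & sortedNames
		else if insertAt > (count sortedNames) then
			set end of sortedNames to nameText
		else
			set sortedNames to (items 1 thru (insertAt - 1) of sortedNames) & {nameText} & (items insertAt thru -1 of sortedNames)
		end if
	end repeat
	return sortedNames
end naturalSort

naturalSort({"file1A", "file10A", "file2A"})
```

The insertion sort is chosen only for brevity; for long lists, a faster sort would be the better choice.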

It just seemed like there should be an option to handle this sort of operation. Sorting them myself manually defeats the whole point of scripting a process, but I suppose there are alternative methods for handling the list outside of AppleScript.

Thanks for the input though.

regards,

pg

Hi,

If you don’t want to rename the files, then sorting into Finder order may become a big job. I think the Finder sorts alphanumerically, with the integer parts of an item’s name taken as numbers. If this is true, then you can split the name up into different parts, then sort by coercing the integer strings to integers. For instance, I quickly wrote the following script to split a string (the name of an item). Thinking about it, this method might also make it easier to rename the items with leading zeros. Anyway, here’s what I’ve got so far:

set the_text to "abc123de45.pdf"
(*
set the_folder to choose folder
set item_list to (list folder the_folder without invisibles)
set item_count to (count item_list)
set sorted_list to {}
set the_index to 0
-- stop once the last item has been fetched (avoids reading past the end of the list)
repeat until the_index is not less than item_count
  set the_index to the_index + 1
  set this_item to (item the_index of item_list)
end repeat
*)
SplitString(the_text)
-- returns a list of integer strings and text not containing digits
-- empty string returns empty list
on SplitString(the_text)
  set the_digits to "0123456789"
  set the_list to {}
  try
    set this_string to (first character of the_text)
    repeat with this_char in rest of text items of the_text
      -- this string is not empty so check last char
      set last_char to (last character of this_string)
      if (last_char is in the_digits and this_char is in the_digits) or ¬
        not (last_char is in the_digits or this_char is in the_digits) then
        set this_string to this_string & this_char
      else -- new string
        set end of the_list to this_string
        set this_string to this_char
      end if
    end repeat
    set end of the_list to this_string
  end try
  return the_list
end SplitString

In the commented part at the beginning I was just starting on the sorting, but I need to think about it a little. The subroutine probably needs work. I was thinking of making a script object because I hate to keep setting the variable like that.

Anyway, I need some sleep.

gl,

Here’s a Finder-emulating sort routine I wrote last year in collaboration with Rick Bargerhuff. It’ll error if any of the strings has more than 16 consecutive numeric characters, but it should be OK for your purposes. (I don’t know how it compares with Kel’s effort as I’ve only just seen that and haven’t had time to look at it properly yet.)

(* QuickSort handler adapted from Arthur Knapp *)
on QuickSortOnItem(a, l, r, n) -- sorts in-place
  -- Quicksorts a list of lists, sorting on item n of each list
  
  local i, j, v
  
  -- Much thanks to both Serge Belleudy-d'Espinose and Victor Yee
  -- for the script-referencing techniques that they helped to
  -- refine.
  --
  script o
    property p : a
  end script
  
  set i to l
  set j to r
  set v to o's p's item ((l + r) div 2)'s item n
  
  repeat while (j > i)
    repeat while (o's p's item i's item n < v)
      set i to i + 1
    end repeat
    
    repeat while (o's p's item j's item n > v)
      set j to j - 1
    end repeat
    
    if j is not less than i then
      tell o's p's item i
        set o's p's item i to o's p's item j
        set o's p's item j to it
      end tell
      set i to i + 1
      set j to j - 1
    end if
  end repeat
  
  if (l < j) then QuickSortOnItem(o's p, l, j, n)
  if (r > i) then QuickSortOnItem(o's p, i, r, n)
end QuickSortOnItem

on FinderDisplaySort(itemNames)
  -- Replace each name in the list with a sublist containing a
  -- (possibly) doctored name for sorting and the original name
  considering case
    repeat with thisName in itemNames
      if thisName contains "1" or thisName contains "2" or thisName contains "3" or thisName contains "4" or thisName contains "5" or thisName contains "6" or thisName contains "7" or thisName contains "8" or thisName contains "9" or thisName contains "0" then
        -- If this name contains a number, get a version bracketed with "X"'s
        -- (Knowing that it doesn't begin or end with a number saves a few checks)
        set Xname to "X" & thisName & "X"
        -- Make up a sorting name in which the original name's numeric
        -- portions have "10000000000000000" added (OK for "integers" up to 16 digits)
        set sortName to ""
        set l to 1
        set nondigits to true
        repeat with r from 2 to (count Xname)
          if nondigits then
            if character r of Xname is in "1234567890" then
              set sortName to sortName & text l thru (r - 1) of Xname
              set l to r
              set nondigits to false
            end if
          else if character r of Xname is not in "1234567890" then
            tell (text l thru (r - 1) of Xname) as number
              set sortName to sortName & (it div 100000000 + 100000000) & text 2 thru 9 of ((it mod 100000000 div 1 + 100000000) as string)
            end tell
            set l to r
            set nondigits to true
          end if
        end repeat
        set sortName to text 2 thru -1 of sortName
        if r > l then set sortName to sortName & text l thru (r - 1) of Xname
        -- Replace the original name in the list with a sublist containing the
        -- sorting name and the original name
        set thisName's contents to {sortName, thisName's contents}
      else
        -- If the name doesn't contain a number, the sorting name's the same
        -- as the original
        set thisName's contents to {thisName's contents, thisName's contents}
      end if
      -- The Finder sorts "_" as " ", so modify the sorting name if necessary
      if thisName's item 1 contains "_" then
        set astid to AppleScript's text item delimiters
        set AppleScript's text item delimiters to "_"
        set theBits to thisName's item 1's text items
        set AppleScript's text item delimiters to " "
        set thisName's item 1 to theBits as string
        set AppleScript's text item delimiters to astid
      end if
    end repeat
  end considering
  -- Sort the sublists on the first (sorting) name in each
  QuickSortOnItem(itemNames, 1, count itemNames, 1)
  -- Substitute the original names for the sorted sublists
  repeat with thisName in itemNames
    set thisName's contents to item 2 of thisName
  end repeat
  
  return itemNames
end FinderDisplaySort


set alphaSortedNames to list folder (choose folder) without invisibles
set FinderSortedNames to FinderDisplaySort(alphaSortedNames)

thanks for the input.

I appreciate the responses, I’ll give them a try.

regards

pg

Hi,

I finally figured out what was wrong with the script. Yours is probably better, but have a look at this if you want, and maybe fix it up. It was a good puzzle. I still have the overflow error for integers, too. Maybe in those cases every digit could be compared.

--set string1 to "4abc123de45.pdf" -- test
--set string2 to "ghj567.pdf" -- test
--SplitString(string1) -- test
--{string1,string2) -- test
--fixed bug in FinderCompare()
---------- main
set the_folder to choose folder
set item_list to (list folder the_folder without invisibles)
if item_list is {} then return item_list
set sorted_list to Finder_Sort(item_list)
---------- subroutines
-- modified Apple linear sort
-- sorts according to Finder sort
-- digits are compared by their numerical value
-- returns sorted list of strings
-- needs subroutines FinderCompare(), SplitString()
on Finder_Sort(my_list)
  set the index_list to {}
  set the sorted_list to {}
  set time_count to 0 -- for test
  repeat (the number of items in my_list) times
    set the low_item to ""
    repeat with i from 1 to (number of items in my_list)
      if i is not in the index_list then
        set this_item to item i of my_list as text
        if the low_item is "" then
          set the low_item to this_item
          set the low_item_index to i
        else if FinderCompare(this_item, low_item) then
          set the low_item to this_item
          set the low_item_index to i
        end if
      end if
    end repeat
    set the end of sorted_list to the low_item
    set the end of the index_list to the low_item_index
  end repeat
  return the sorted_list
end Finder_Sort

-- returns true if first string comes before second string
-- needs subroutine SplitString()
on FinderCompare(string1, string2)
  set the_flag to false
  set list1 to SplitString(string1)
  set list2 to SplitString(string2)
  set list1_count to (count list1)
  repeat with i from 1 to list1_count
    -- get the items i from each list
    try
      set list1_item to item i of list1
      set list2_item to item i of list2
    on error -- could not get item i of list2, too short
      -- this means that list2 comes before list1, so leave the_flag as false
      exit repeat
    end try
    -- compare items i of both lists
    -- check if both are integers
    -- yes, coerce to integer
    -- no, just compare alphabetically
    try
      set list1_int to list1_item as integer
      set list2_int to list2_item as integer
      if (list1_int < list2_int) or (list1_int > list2_int) then
        set the_flag to (list1_int < list2_int)
        exit repeat
      end if
    on error -- at least one was a string, Finder sorts this alphabetically
      if (list1_item < list2_item) or (list1_item > list2_item) then
        set the_flag to (list1_item < list2_item)
        exit repeat
      end if
    end try
  end repeat
  return the_flag
end FinderCompare

-- returns a list of integer strings and text not containing digits
-- empty string returns empty list
on SplitString(the_text)
  set the_digits to "0123456789"
  set the_list to {}
  try
    set this_string to (first character of the_text)
    repeat with this_char in rest of (text items of the_text)
      -- this string is not empty so check last char
      set last_char to (last character of this_string)
      if (last_char is in the_digits and this_char is in the_digits) or ¬
        not (last_char is in the_digits or this_char is in the_digits) then
        set this_string to this_string & this_char
      else -- new string
        set end of the_list to this_string
        set this_string to this_char
      end if
    end repeat
    set end of the_list to this_string
  end try
  return the_list
end SplitString

Goodnight, :slight_smile:

Hi,

I see the obstacle with the underline character now. Still might have a way to go and gotta speed it up.

Thanks Nigel,

Hi, Kel. Happy Easter! :slight_smile:

Thanks for posting your “Finder sort” code. It certainly seems to get the required results for cgrafx’s purposes, though it’s a little on the slow side.

The main trick in keeping a script’s running time to a minimum is to see that it carries out as few incidental actions — and produces as few incidental strings and lists — as possible. This is often achieved through the way that instructions are phrased. For instance, you could replace ‘number of items in my_list’ (ie. “identify all the items of class ‘item’ in my_list and then get the ‘number’ of the result”) with ‘number of my_list’. Better still, since ‘number of’ is far slower than either ‘length of’ or ‘count’, use ‘count my_list’. However, it won’t make much difference here as you only use ‘number of’ in the outer loops of your process.

One of the busiest points in your script is the SplitString() handler, which is called twice from FinderCompare(), which in turn is called from within a nested repeat. The speed of the script nearly doubles on my machines if I change ‘repeat with this_char in rest of (text items of the_text)’ to:

repeat with i from 2 to (count the_text)
  set this_char to item i of the_text

… because the list of ‘text items’ (you mean ‘characters’) is not created and neither is its ‘rest’.


Actually, the script I posted a couple of days ago wasn’t the version I thought it was. It was the original from before my correspondence with Rick. Here’s a bottled version of our joint effort. It’s longer than my original but faster, and can handle an unreasonably large number of file names. Like the original, it sorts lists in place rather than returning sorted copies of them, but this can be changed by making the property declaration for listCopy ‘property listCopy : theList’s items’.

Its basic approach is pretty gross, but benefits from “economies of scale”:

1. Coerce the input list to a single string, using an unlikely character as a delimiter.
2. Traverse this string, identifying the "numeric" and "non-numeric" sections. Assemble the identified sections in another list, padding the numeric ones with leading zeros in the process.
3. Coerce the assembly list to a single string, using "" as a delimiter.
4. Using the unlikely delimiter, get the text items of this string. The result is a list equivalent to the original input list, but with the numeric portions of the strings padded out to the same length.
5. Quicksort the doctored list, mimicking each move in the original list.
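
The padding idea behind steps 2 to 4 can be sketched on a single name at a time. This is a simplified illustration, not the batch approach described above (which works on one long joined string for speed). The handler names `padDigitRuns` and `padRun` and the `padWidth` parameter are hypothetical, and the sketch assumes `padWidth` is no larger than the 20 characters in `zeros`.

```applescript
-- Hypothetical illustration: pad every run of digits in one name to a fixed
-- width so that a plain text comparison orders the numbers by value.
-- ASSUMPTION: padWidth is between 1 and 20 (the length of 'zeros').
on padDigitRuns(theName, padWidth)
	set theDigits to "0123456789"
	set zeros to "00000000000000000000"
	set paddedName to ""
	set thisRun to "" -- the digit run currently being collected
	repeat with i from 1 to (count theName)
		set thisChar to character i of theName
		if thisChar is in theDigits then
			set thisRun to thisRun & thisChar
		else
			-- A non-digit ends any pending run: emit the padded run, then the character
			set paddedName to paddedName & my padRun(thisRun, padWidth, zeros) & thisChar
			set thisRun to ""
		end if
	end repeat
	-- Emit a run that reaches the end of the name
	return paddedName & my padRun(thisRun, padWidth, zeros)
end padDigitRuns

-- Prepend leading zeros to a non-empty digit run that is shorter than padWidth
on padRun(thisRun, padWidth, zeros)
	set pad to padWidth - (count thisRun)
	if thisRun is "" or pad < 1 then return thisRun
	return (text 1 thru pad of zeros) & thisRun
end padRun

padDigitRuns("file2A", 4) --> "file0002A"
```

With the same width, "file10A" becomes "file0010A", so the doctored names sort in the order file1A, file2A, file10A under an ordinary string comparison.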

Assumptions:
1. All the items in the original list are strings.
2. Mixed numeric/non-numeric strings are unlikely to contain runs of more than 16 numeric characters. However, the script can easily be adapted to accommodate more. (It may be possible to get away with 10, which would make the sort faster.)

It occurred to me while looking at your code that my own effort might be better off with an insertion sort rather than a Quicksort. Since the input list is most likely to be provided either by ‘list folder’ or by the Finder, it’ll be nearly in the right order anyway - a situation where an insertion sort would be more efficient. But my tests only give the insertion sort a fraction of a second’s advantage here, compared with several seconds disadvantage when the input list is not nearly in order!

(* Sorts a list of strings, treating numerics numerically rather than lexicographically. *)
on FinderDisplaySort(theList)
  
  script o
    -- The input list -- or a copy of it
    property listCopy : theList --'s items
    -- A list for doctored strings that will sort in the required order
    property doctoredList : {}
    -- A temporary list for assembling parts of doctored strings
    property partList : {}
    -- A delimiter character that's unlikely to be in any string in the list
    property unlikelyDelim : ASCII character 1
    -- The numeric characters
    property theDigits : "1234567890"
    -- A string of "zero" characters. Use as many as required to pad numerics.
    property zeros : "00000000000000000000"
    
    -- Prepare a version of the list with doctored strings.
    on getDoctoredList()
      -- Process the input list 3900 items (or less) at a time.
      -- (3900 is near the maximum number of text items that can be extracted in one go.)
      set listLen to (count listCopy)
      set listL to 1
      repeat until listL > listLen
        set listR to listL + 3899 -- 3900 items...
        if listR > listLen then set listR to listLen -- ... or less at a time
        
        -- Coerce a batch of items to a single string using the unlikely delimiter
        set listStr to my ListToString(items listL thru listR of my listCopy, unlikelyDelim)
        
        considering case
          -- The X Finder sorts underscores as spaces: replace the one with the other here
          if (listStr contains "_") then set listStr to my ListToString(StringToList(listStr, "_"), space)
          
          -- If the string contains any numerics, go through and modify them
          if (listStr contains "1") or (listStr contains "2") or (listStr contains "3") or (listStr contains "4") or (listStr contains "5") or (listStr contains "6") or (listStr contains "7") or (listStr contains "8") or (listStr contains "9") or (listStr contains "0") then
            set my partList to {}
            set l to 1
            set numeric to (character l of listStr is in theDigits) -- true or false
            repeat with r from 2 to (count listStr)
              if (numeric) then
                if (character r of listStr is not in theDigits) then
                  set end of my partList to my padNumeric(text l thru (r - 1) of listStr)
                  set l to r
                  set numeric to false
                end if
              else if (character r of listStr is in theDigits) then
                set end of my partList to text l thru (r - 1) of listStr
                set l to r
                set numeric to true
              end if
            end repeat
            if (numeric) then
              set end of my partList to my padNumeric(text l thru r of listStr)
            else
              set end of my partList to text l thru r of listStr
            end if
            set listStr to ListToString(partList, "")
          end if
        end considering
        
        -- Break up the string again using the unlikely delimiter
        -- and concatenate the list result to doctoredList
        set doctoredList to doctoredList & StringToList(listStr, unlikelyDelim)
        -- Advance the left input-list pointer for the next batch of strings
        set listL to listR + 1
      end repeat
      
      return doctoredList
    end getDoctoredList
    
    -- Prepend enough leading zeros to a numeric string to make it 16 digits wide.
    -- 16 is a compromise: fewer digits sort more quickly; more digits are less likely.
    -- Names of cache files should sort lexicographically.
    on padNumeric(theString)
      set pad to 16 - (count theString)
      if pad > 0 then
        return (text 1 thru pad of zeros) & theString
      else
        return theString
      end if
    end padNumeric
    
    -- turn string into list
    on StringToList(theString, theDelim)
      set astid to AppleScript's text item delimiters
      set AppleScript's text item delimiters to theDelim
      set theStringList to every text item of theString
      set AppleScript's text item delimiters to astid
      return theStringList
    end StringToList
    
    -- turn list into string
    on ListToString(theStringList, theDelim)
      set astid to AppleScript's text item delimiters
      set AppleScript's text item delimiters to theDelim
      set theString to theStringList as string
      set AppleScript's text item delimiters to astid
      return theString
    end ListToString
    
    (* QuickSort by Arthur J Knapp *)
    -- This variation is hardwired to sort listCopy by copying the moves required to sort doctoredList.
    on qMimicSort(l, r) -- sorts in-place
      
      -- Much thanks to both Serge Belleudy-d'Espinose and Victor Yee
      -- for the script-referencing techniques that they helped to
      -- refine.
      --
      set i to l
      set j to r
      set v to my doctoredList's item ((l + r) div 2)
      
      repeat while (j > i)
        
        repeat while (my doctoredList's item i < v)
          set i to i + 1
        end repeat
        
        repeat while (my doctoredList's item j > v)
          set j to j - 1
        end repeat
        
        if (i is not greater than j) then
          tell my doctoredList's item i
            set my doctoredList's item i to my doctoredList's item j
            set my doctoredList's item j to it
          end tell
          
          tell my listCopy's item i
            set my listCopy's item i to my listCopy's item j
            set my listCopy's item j to it
          end tell
          
          set i to i + 1
          set j to j - 1
        end if
        
      end repeat
      
      if (l < j) then qMimicSort(l, j)
      if (r > i) then qMimicSort(i, r)
      
    end qMimicSort
  end script
  
  tell o
    getDoctoredList()
    qMimicSort(1, count its listCopy)
    return its listCopy
  end tell
  
end FinderDisplaySort

set FinderSortedNames to FinderDisplaySort(list folder (choose folder) without invisibles)

Hmm… as per our quicksort discussions, did you try using a sentinel in the insertion sort? You could just add an empty string to the head of the list, since nothing will sort before it.
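
For readers unfamiliar with the trick, here is a minimal sketch of an insertion sort with a sentinel. It is not code from either poster; the handler name `sentinelInsertionSort` is hypothetical, and it assumes every item is a string, so that the empty string at the head never sorts after any of them.

```applescript
-- Hypothetical sketch: insertion sort with "" as a sentinel at the head of
-- the list, so the inner loop needs no "have we reached item 1?" test.
-- ASSUMPTION: all items are strings, so nothing sorts before "".
on sentinelInsertionSort(theList)
	-- Work on a copy that has the sentinel as item 1
	set workList to {""} & theList
	repeat with i from 3 to (count workList)
		set v to item i of workList
		set j to i - 1
		-- Shift larger items right; the sentinel stops the loop at item 1
		repeat while ((item j of workList) > v)
			set item (j + 1) of workList to item j of workList
			set j to j - 1
		end repeat
		set item (j + 1) of workList to v
	end repeat
	-- Drop the sentinel before returning
	return rest of workList
end sentinelInsertionSort

sentinelInsertionSort({"b", "a", "c"}) --> {"a", "b", "c"}
```

The saving is one comparison per inner-loop pass, which does not change the sort's poor behaviour on lists that are far from ordered.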

P.S. I love this thread. Nigel and Kai, you guys are just too brilliant!!! :slight_smile:

I never thought of that! I just scanned the list for the “lowest” string and swapped it with the string in the first position. Your suggestion sounds as though it may be even faster, but not by enough to defeat the insertion sort’s severe disadvantage when working with not-nearly-ordered lists. :frowning:

Kel’s not Kai, is he!!!

I’ve had a couple more thoughts about Kel’s SplitString() handler:

-- returns a list of integer strings and text not containing digits
-- empty string returns empty list
on SplitString(the_text)
  set the_digits to "0123456789"
  set the_list to {}
  try
    set this_string to (first character of the_text)
    repeat with this_char in rest of (text items of the_text)
      -- this string is not empty so check last char
      set last_char to (last character of this_string)
      if (last_char is in the_digits and this_char is in the_digits) or ¬
        not (last_char is in the_digits or this_char is in the_digits) then
        set this_string to this_string & this_char
      else -- new string
        set end of the_list to this_string
        set this_string to this_char
      end if
    end repeat
    set end of the_list to this_string
  end try
  return the_list
end SplitString

I mentioned changing the form of the repeat in my previous post. In the long ‘if’ line, the test boils down to “if (both characters are in the_digits) or (both characters are not in the_digits).” This can be expressed as:

if (last_char is in the_digits) = (this_char is in the_digits) then -- both are true or both are false

It so happens that the variable ‘last_char’ is always set to a character that was previously held by another variable. It therefore makes sense to set last_char to that variable rather than going through the process of extracting the character from ‘this_string’ again. So:

-- returns a list of integer strings and text not containing digits
-- empty string returns empty list
on SplitString(the_text)
  set the_digits to "0123456789"
  set the_list to {}
  try
    set this_string to (first character of the_text)
    set last_char to this_string
    repeat with i from 2 to (count the_text)
      set this_char to character i of the_text
      if (last_char is in the_digits) = (this_char is in the_digits) then
        set this_string to this_string & this_char
      else -- new string
        set end of the_list to this_string
        set this_string to this_char
      end if
      set last_char to this_char
    end repeat
    set end of the_list to this_string
  end try
  return the_list
end SplitString

(* Timings with a small mixture of alphanumeric and alphabetical names:
  As originally written:             17.48 seconds
  With changed repeat format:         8.325 seconds
    and compressed logic:             6.3 seconds
    and revised 'last_char' setting:  5.29 seconds
    *)

… which shows what a large difference a few minor code details can make! :slight_smile:

Hi Nigel,

Thanks for the feedback!

I didn’t know that “rest of (some_list)” in the repeat line added so much time. I’ve only just started doing this (using non-list parameters for the repeat).

About the calls to the SplitString() handler, I was thinking that maybe making a list of lists might solve this. Each sublist would hold two items: the string and the split string. Then I would only need to call the handler once per string. You make a lot of sense.

About the last_char bit, I originally did it another way. I changed it to getting the last char to shorten the script. This shows that making it short is not necessarily better as your test shows some impressive results.

I dusted off my old ics books and looked up insertion sort. My fear of dead brain cells subsided when it didn’t look that hard. The last time I tried to see how quicksort went, I had a very hard time with it.

The efficiency part is starting to come back. I’ll continue with this script after my next Heineken.

Still need to deal with the underscore character and something else. What was I thinking about? Oh well cheers and Happy Easter to you also. Thank you Jesus!

Thanks a lot,

Sorry, a little slip in my old age. :wink:

Sigh… I take back everything I said before. You two are rank amateurs. :wink:

property kcAsc0 : ASCII character 0

on BreakAtNumbers(s)
	
	set astids to AppleScript's text item delimiters -- save tids
	try
		
		--	Nigel Garvey's observation that string comparisons are
		--	faster when case is considered, (even when case isn't
		--	"involved").
		--
		considering case
			
			repeat with i from 0 to 9
				
				set i to i as string
				
				if (i is in s) then
					
					set AppleScript's text item delimiters to i
					set s to s's text items
					set AppleScript's text item delimiters to kcAsc0 & i & kcAsc0
					set s to s as string
					
				end if
			end repeat
		end considering
		
		set AppleScript's text item delimiters to kcAsc0 & kcAsc0
		set s to s's text items
		set AppleScript's text item delimiters to ""
		set s to s as string
		
		set AppleScript's text item delimiters to kcAsc0
		set s to s's text items
		
	on error m number n from f to t partial result p
		set AppleScript's text item delimiters to astids -- restore tids
		error m number n from f to t partial result p
	end try
	set AppleScript's text item delimiters to astids -- restore tids
	
	return s
	
end BreakAtNumbers

Oh tut, Arthur! Call this professional? :wink:

BreakAtNumbers("10.jpg")
--> {"", "10", ".jpg"}

And how about using a property klPreconcatenatedDelims?

Other than that, it’s totally brilliant! :slight_smile:

Like their old adverts: “Heineken. Refreshes the parts other beers can’t reach.” :wink:

Enjoy. As for me, “City of Truro” is running on the Gloucestershire & Warwickshire Railway this weekend, so I’m off to do some trainspotting!

Of course, I consider the above to be “correct,” since this ensures that the data returned is always consistent, ie: numerical strings at every even index. Otherwise, one would have to test for the situation in which the string had started with or ended with a run of digits.
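
To illustrate why the consistent layout is convenient, here is a small sketch that assumes a list shaped like the `{"", "10", ".jpg"}` result shown above, with digit runs at every even index. The variable names are hypothetical, and the coercion assumes the runs are short enough to fit in an integer.

```applescript
-- ASSUMPTION: theParts has digit runs at every even index, as in the
-- {"", "10", ".jpg"} result shown above.
set theParts to {"", "10", ".jpg"}
set theNumbers to {}
-- Visit only the even indices; no per-item "is this numeric?" test is needed
repeat with i from 2 to (count theParts) by 2
	set end of theNumbers to (item i of theParts) as integer
end repeat
theNumbers --> {10}
```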

It has recently occurred to me that I don’t use enough prefixes in the namespace of my scripts. One way in which I plan to remedy this is to start prefixing all of my handlers with ‘gh’; ‘g’ for global and ‘h’ for handler. Also, just so my code can always be properly understood, I’m going to start posting in zeros and ones exclusively.

01000010 01110010 01101001 01100100 01100111 01100101 01110100 01110100 01100101.

01001111 01001011?

My friend Eliot once said to a friend of his, “I have Unix running on an Amiga 500.” The friend replied, “Is that like Flying Toasters?”