I am importing two text files into two lists that I’m comparing. One text file has a list of updates to install and the other has the history of updates already installed; the entries in each are separated by returns. There will be some duplicates between the two lists, since some of the updates may already have been installed. I want to compare the two lists and create a third list containing only the unique, non-duplicate values (the updates yet to be installed). This is what I have so far, but it’s not working.
set listofMtnLion to {{"osx1"},{"osx2"},{"osx3"}}
set listofMtnLionHistory to {{"osx1"},{"osx2"}}
set Uniquelist to {}
repeat with a from 1 to count of listOfMtnLion
set Updatelist to (item a of listOfMtnLion)
set Historylist to (item a of listofMtnLionHistory)
if {Updatelist} is not equal to {Historylist} then
set Uniquelist's end to {Updatelist}
end if
end repeat
set listofMtnLion to {"osx1", "osx2", "osx3"}
set listofMtnLionHistory to {"osx1", "osx2"}
set Uniquelist to {}
repeat with a from 1 to (count listofMtnLion)
set thisUpdate to item a of listofMtnLion
if (thisUpdate is not in listofMtnLionHistory) then set end of Uniquelist to thisUpdate
end repeat
Uniquelist
When I use your updated code with preset lists, it works. When I use the lists created automatically from my imported text documents, it lists all the entries instead of just the unique one.
I have a feeling I’m importing the data into the lists incorrectly.
This is what I have for importing from the text documents.
--// Import OSX Update file for 10.8
set listOfMtnLion to {}
set OSUpdates2 to paragraphs of (read POSIX file "/Scentsy/Updates/OSX10.8.txt")
repeat with nextLine2 in OSUpdates2
if length of nextLine2 is greater than 0 then
copy nextLine2 to the end of listOfMtnLion
end if
end repeat
--// Create OSX History file
tell application "Finder" to if exists "/Scentsy/Updates/UpdateHistory10.8.txt" as POSIX file then
else
do shell script ("sudo touch /Scentsy/Updates/UpdateHistory10.8.txt") user name (account of theCredentials) password (password of theCredentials) with administrator privileges
do shell script ("sudo chmod 777 /Scentsy/Updates/UpdateHistory10.8.txt") user name (account of theCredentials) password (password of theCredentials) with administrator privileges
end if
--// Test the OSX History file with a try block to make sure it's not empty
set listofMtnLionHistory to {}
try
set OSUpdatesHist2 to paragraphs of (read POSIX file "/Scentsy/Updates/UpdateHistory10.8.txt")
set MtnLionFileNotEmpty to "yes"
on error Error1
set MtnLionFileNotEmpty to "no"
end try
--// If History file not empty this imports it into a list
if MtnLionFileNotEmpty is equal to "yes" then
set OSUpdatesHist2 to paragraphs of (read POSIX file "/Scentsy/Updates/UpdateHistory10.8.txt")
repeat with nextLine2 in OSUpdatesHist2
if length of nextLine2 is greater than 0 then
copy nextLine2 to the end of listofMtnLionHistory
end if
end repeat
end if
Because of the kind of repeat you’re using to build the two lists, the items you’re putting in them aren’t the actual paragraph texts but references to them in the lists being looped through (the values of OSUpdates2 and OSUpdatesHist2), e.g. not the text “osx1” but the reference ‘item 1 of {“osx1”, “osx2”, “osx3”}’. It’s these references which are being compared rather than the values at the end of them. There are a couple of ways round this, the slightly easier being to change the two instances of:
copy nextLine2 to the end of listOfMtnLion -- or of listofMtnLionHistory
to:
copy contents of nextLine2 to the end of listOfMtnLion -- or of listofMtnLionHistory
Then the lists will contain actual text and the comparisons should work properly.
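As a rough sketch of that change, the import loop could be wrapped in a handler so that both files go through the same code. The handler name nonEmptyParagraphs and the inline sample text are made up for illustration; in the real script the text would come from the read commands shown earlier.

```applescript
-- Hypothetical helper: return the non-empty paragraphs of theText as plain text values.
on nonEmptyParagraphs(theText)
	set theLines to {}
	set allLines to paragraphs of theText
	repeat with nextLine in allLines
		-- nextLine is a reference such as 'item 1 of allLines', not the text itself.
		if (length of nextLine) is greater than 0 then
			-- 'contents of' stores the text the reference points to.
			set end of theLines to contents of nextLine
		end if
	end repeat
	return theLines
end nonEmptyParagraphs

-- Sample text standing in for the contents of an update file.
set sampleText to "osx1" & return & "osx2" & return & return & "osx3"
set listOfMtnLion to nonEmptyParagraphs(sampleText)
-- > {"osx1", "osx2", "osx3"}
```

Building both lists this way means the later “is not in” test compares text with text.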
Hello
If listofMtnLionHistory contains an item which is not in listofMtnLion, my understanding is that it must also be added to Uniquelist, which the script doesn’t do.
Yvan KOENIG (VALLAURIS, France) Saturday 9 November 2013 10:32:35
The way I read it, Gokussi only wanted “updates yet to be installed” in the third list. I’m presuming that listOfMtnLionHistory is the list with the “history of updates already installed.”
I was thinking a little about this, and I wondered whether it could be solved by using text item delimiters as a set-theoretic tool. I am disappointed, as I had figured the problem to involve three lists and not just two.
Now, to get the unique elements out of two sets A and B, you would combine the two differences, (A - B) + (B - A). (A set can by definition contain only unique elements.)
Although this uses text item delimiters, I don’t think it is any faster, but it would be fun to time it.
set listofMtnLion to {"osx1", "osx2", "osx3"}
set listofMtnLionHistory to {"osx1", "osx2"}
set tmp1 to difference(listofMtnLion, listofMtnLionHistory)
set tmp2 to difference(listofMtnLionHistory, listofMtnLion)
set uniqueValues to tmp1 & tmp2
-- > {"osx3"}
on difference(setA, setB)
tell (a reference to AppleScript's text item delimiters)
local tids
set {tids, contents of it} to {contents of it, return}
set setB to setB as text
set contents of it to setA
set setB to text items of setB
set contents of it to return
set setB to text items of (setB as text)
set contents of it to missing value
set setB to text items of (setB as text)
set contents of it to tids
end tell
return setB
end difference
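For comparison, the A-B + B-A combination described above can also be sketched with the plain “is not in” test instead of text item delimiters. The handler name itemsMissingFrom and the sample lists are made up for illustration, and the lists are assumed to hold simple text values.

```applescript
-- Hypothetical helper: return the items of listA which aren't in listB (A - B).
on itemsMissingFrom(listA, listB)
	set missingItems to {}
	repeat with i from 1 to (count listA)
		set thisItem to item i of listA
		if thisItem is not in listB then set end of missingItems to thisItem
	end repeat
	return missingItems
end itemsMissingFrom

set setA to {"osx1", "osx2", "osx3"}
set setB to {"osx1", "osx2", "osx4"}
-- (A - B) followed by (B - A)
set symmetricDifference to itemsMissingFrom(setA, setB) & itemsMissingFrom(setB, setA)
-- > {"osx3", "osx4"}
```

For the original question only the first half, itemsMissingFrom(listofMtnLion, listofMtnLionHistory), would be needed.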
Nice idea. But the handler needs to be a little more robust.
set listofMtnLion to {"osx1", "osx2", "osx3", "osx4"}
set listofMtnLionHistory to {"osx13", "osx4", "osx2"}
set Uniquelist to difference(listofMtnLion, listofMtnLionHistory)
-- Return a list of the items in setA which aren't in setB.
on difference(setA, setB)
set astid to text item delimiters
set text item delimiters to return & return
set setA to return & setA & return
set text item delimiters to return & linefeed & return
set setB to return & setB & return
set text item delimiters to linefeed
set text item delimiters to setB's text items
set setA to setA's text items
set text item delimiters to ""
set setA to setA as text
if ((count setA) > 0) then
set text item delimiters to return & return
set setA to text items of text 2 thru -2 of setA
else
set setA to {}
end if
set text item delimiters to astid
return setA
end difference
I had assumed the list to be as presented, that is, with no linefeeds or returns in the items and no empty items. Given this, and the constraint below, I can’t see any reason for it not to work.
There is one major constraint/deficiency with this approach: no element can be a substring of another element.
Example:
If “osx” were an element in the list, then everything would be ruined. We would probably end up with a list of just the numbers, which could be all right if we allowed for that. I think that is a valid constraint in set theory, but it is rather awkward in real life; too many constraints for my taste.
I’m not sure this is going to be faster than anything else, given all the constraints here. The one place it can be faster is when there are only two lists to be sifted for duplicates; three lists lead to 10 difference operations.
So, the traditional way of doing things still rules.
Edit
You removed the constraint by embedding everything in returns! Nice!!!
I am sure it is more usable now, but I am still very unsure whether it is faster; the “in” operator is a fast one too…
The problem with your example does not lie in linefeeds or empty items. It arises when the name of one file overlaps another. When you have an update for iTunes and iTunes Helper is already installed, your example fails. Nigel’s example fixes that.
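A small sketch of that overlap, using the two names from the post as sample data. It only illustrates the matching behaviour of a bare delimiter versus a return-bounded one; it is not either poster’s handler, and the variable names are made up.

```applescript
set savedDelimiters to AppleScript's text item delimiters

-- Bare delimiter: "iTunes" also matches inside the longer name.
set AppleScript's text item delimiters to {"iTunes"}
set barePieces to text items of "iTunes Helper"
-- > {"", " Helper"}

-- Boxed delimiter: only a whole, return-bounded entry can match.
set AppleScript's text item delimiters to {return & "iTunes" & return}
set boxedPieces to text items of (return & "iTunes Helper" & return)
-- > a single untouched item, still containing "iTunes Helper"

set AppleScript's text item delimiters to savedDelimiters
{barePieces, boxedPieces}
```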
Hello.
Yes, I figured out several hours ago that he boxes the text item delimiters with returns, so that there is no way a text item delimiter can match a substring. Really nice. And even nicer how he uses the linefeed for safety in the process!
I figured it out a couple of hours before your post (see the timestamp).
The handler is also a great idea for set manipulation in general: getting the intersection of, and the difference between, sets.
The techniques have been around for years and aren’t mine. I was only posting from experience. But if McUsrII was previously unaware of the idea of coercing a list of text to text and using TIDs, instead of repeating through the original list, his invention of it here for the current purpose shows some original thinking.
The linefeed isn’t for safety, but is a delimiter for separating the return-bounded sections of setB, which are then used as delimiters themselves.
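The linefeed step can be looked at in isolation. This is a minimal sketch of just that part of the handler, using the setB values from the example above; the variable names joinedB and boxedDelimiters are made up, and the list-to-text coercion is written out explicitly.

```applescript
set setB to {"osx13", "osx4", "osx2"}
set savedDelimiters to AppleScript's text item delimiters

-- Join the items with return & linefeed & return, then add the outer returns.
set AppleScript's text item delimiters to return & linefeed & return
set joinedB to return & (setB as text) & return

-- Splitting on the linefeeds leaves each item bounded by a return on both sides.
set AppleScript's text item delimiters to linefeed
set boxedDelimiters to text items of joinedB

set AppleScript's text item delimiters to savedDelimiters
boxedDelimiters
-- > three items: return & "osx13" & return, return & "osx4" & return, return & "osx2" & return
```

These three return-bounded texts are what the handler then installs as the delimiter list.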
Frankly, no original thinking on my part. I only took advantage of the fact that the delimiters nowadays can be a list.
I have seen Nigel Garvey do “the robustness” before, with lines I think. What I wasn’t aware of is that it would also protect words from being broken up into pieces: the new text item delimiters have the form return & osx & return, which finds no match inside a longer item, so they only match when the word boundaries are the same. Which is quite smart. I didn’t really contribute anything new here, I think, except maybe for using missing value as a text item delimiter, and Apple has provided for the lists of text item delimiters. Everything else, except for the idea of looking at the uniqueness problem as a set operation (which can hardly be called original), is stolen as usual. Ranting along: whether an idea is original or not is of lesser importance than whether the idea works, not just technically but also as a solution to some problem; that it feels good to use.
It is Nigel’s effort that made the handler usable, which I think is a practical one for finding differences between sets.
There is no problem in using the repeat loop really, but this at least feels more correct, though the version with the loop may actually outperform it, at least when it has been given a script object to keep the reference to the list in.
I got the point about the linefeed, but it is mind-boggling altogether. (As usual with text item delimiters, which is what makes them so fascinating, apart from the speed they offer.)
I am adding this complementary operation to this thread, since this is where it is most relevant, and it thereby completes the set handlers.
Add and remove member, isMemberOf, and UnionSet should be fairly easy to implement.
It should also be fairly easy to implement sets with a counter for how many times each item has been added, for some “histogrammic” purpose, or whatnot.
-- Return true if every item of aSet is also in Universe, otherwise false.
on isSubset of Universe for aSet
-- Stolen from Nigel Garvey
if aSet = {} then return true
-- the empty set is a subset of every set.
set astid to text item delimiters
set text item delimiters to return & return
set aSet to return & aSet & return
set text item delimiters to return & linefeed & return
set Universe to return & Universe & return
set text item delimiters to linefeed
set text item delimiters to Universe's text items
set aSet to aSet's text items
set text item delimiters to ""
set aSet to aSet as text
set text item delimiters to astid
return ((length of aSet) = 0)
end isSubset
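The remaining operations mentioned above could be sketched along the following lines. This is only one possible shape, using plain list operations rather than text item delimiters; the handler names follow the post, the sample values are made up, and the inputs are assumed to be lists of simple text values without duplicates.

```applescript
-- Hypothetical helpers, not part of the handlers posted in this thread.
on isMemberOf(anItem, aSet)
	return (aSet contains {anItem})
end isMemberOf

on addMember(anItem, aSet)
	-- Leave the set as it is when the item is already present.
	if isMemberOf(anItem, aSet) then return aSet
	return aSet & {anItem}
end addMember

on unionSet(setA, setB)
	set theUnion to setA
	repeat with i from 1 to (count setB)
		set theUnion to addMember(item i of setB, theUnion)
	end repeat
	return theUnion
end unionSet

unionSet({"osx1", "osx2"}, {"osx2", "osx3"})
-- > {"osx1", "osx2", "osx3"}
```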
Edit
If you are a lazy typist, then there is of course nothing stopping you from using the handler above to figure out whether two lists of simple values are equal (call it in both directions, since a single call only shows that one list is contained in the other).
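A sketch of that equality test, assuming “equal” means the two lists hold the same values regardless of order or repeats. The isSubset handler is repeated here, with the list-to-text coercions written out explicitly, so that the example stands on its own; the wrapper name setsAreEqual is made up.

```applescript
on isSubset of universe for aSet
	if aSet = {} then return true
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to return & return
	set aSet to return & (aSet as text) & return
	set AppleScript's text item delimiters to return & linefeed & return
	set universe to return & (universe as text) & return
	set AppleScript's text item delimiters to linefeed
	set AppleScript's text item delimiters to universe's text items
	set aSet to aSet's text items
	set AppleScript's text item delimiters to ""
	set aSet to aSet as text
	set AppleScript's text item delimiters to astid
	return ((length of aSet) = 0)
end isSubset

-- Hypothetical wrapper: equal as sets when each list is a subset of the other.
on setsAreEqual(listA, listB)
	return (isSubset of listA for listB) and (isSubset of listB for listA)
end setsAreEqual

setsAreEqual({"osx1", "osx2"}, {"osx2", "osx1"})
-- > true
```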