Compare Excel columns

I am new to AppleScript and would really appreciate any help with the following:

I have two spreadsheets (with only one sheet each) which I would like to compare in a new spreadsheet.

In the new comparison spreadsheet, I would like the first column to be the first column of the first spreadsheet, the second column to be the first column of the second spreadsheet, and the third column to be an IF formula comparing the two: =IF(A1=B1,"True","False"). This pattern should continue for the remaining columns (column i of spreadsheet 1, column i of spreadsheet 2, comparison, column i+1 of spreadsheet 1, column i+1 of spreadsheet 2, comparison, and so on) for all rows and columns that contain text.

For example:

Spreadsheet 1:
S1ColA S1ColB S1ColC
Item1 1 2
Item2 3 4
Item3 5 6

Spreadsheet 2:
S2ColA S2ColB S2ColC
Item1 1 20
Item20 3 40
Item3 50 6

Comparison spreadsheet:
S1ColA S2ColA CompareColA S1ColB S2ColB CompareColB S1ColC S2ColC CompareColC
Item1 Item1 TRUE 1 1 TRUE 2 20 FALSE
Item2 Item20 FALSE 3 3 TRUE 4 40 FALSE
Item3 Item3 TRUE 5 50 FALSE 6 6 TRUE

Can someone give any advice on how to proceed with the comparison?

So far I can open the two spreadsheets with prompts, but I am stuck on creating a loop that gets the data from the two spreadsheets and pastes it into the new spreadsheet accordingly.

set aFile to choose file with prompt "Please select the first Excel file:"
tell application "Microsoft Excel"
	set aWBook to open workbook workbook file name (aFile as string)
	set aName to name of aWBook
end tell

set bFile to choose file with prompt "Please select the second Excel file:"
tell application "Microsoft Excel"
	set bWBook to open workbook workbook file name (bFile as string)
	set bName to name of bWBook
end tell


tell application "Microsoft Excel"
	activate
	set myWorkbook to make new workbook
end tell

set EndofRow to 4
set input_data to ""

tell application "Microsoft Excel"
	tell sheet "Sheet1" of workbook 1
		activate
		repeat with i from 1 to EndofRow
			set InputCell to the value of (cell i of range ("A1:A" & i)) of aName of aWBook
			set input_data to InputCell
			tell sheet "Sheet1" of workbook 1
				activate
				set newCell to ("A" & i)
				set value of cell newCell of sheet "Sheet1" of workbook 1 to InputCell
			end tell
		end repeat
	end tell
end tell

This should do what you ask. It would need a bit of tweaking if your spreadsheets have more rows and columns. As for your variables… I changed them a little bit. Also, to avoid repetitive choosing and making, I used saved workbooks. It should be relatively straightforward to change those back once you’ve confirmed that the process works.

Basically, the script cycles through the columns and rows of the source1 workbook and collects each value in a list. It then switches to the destination workbook and deposits those values in the appropriate cells, and then does the same for the source2 workbook. Finally, it inserts an ‘=EXACT()’ function in the comparison cells in the destination workbook to make the comparison.

tell application "Microsoft Excel"
	with timeout of 10 seconds
		set wa to "sheep1.xlsx"
		set wb to "lamb2.xlsx"
		set wc to "wolf.xlsx"
		
		set x to 3 -- columns
		set y to 4 -- rows
		
		-- collect values from workbook 1
		activate object workbook wa
		set acol to {} -- values as list
		repeat with xx from 1 to x -- cycle through columns and rows to collect values
			repeat with yz from 1 to y
				set end of acol to value of cell yz of column xx
			end repeat
		end repeat
		
		-- deposit values in workbook 3		
		activate object workbook wc
		repeat with xx from 1 to x -- 3
			repeat with yz from 1 to y -- 4
				set i12 to (y * (xx - 1) + yz)
				set value of cell yz of column (3 * (xx - 1) + 1) to item i12 of acol -- columns A,D,G (step of 3: source 1, source 2, comparison)
			end repeat
		end repeat
		
		-- collect values from workbook 2
		activate object workbook wb
		set bcol to {} -- values as list
		repeat with xx from 1 to x -- cycle through columns and rows to collect values
			repeat with yz from 1 to y
				set end of bcol to value of cell yz of column xx
			end repeat
		end repeat
		
		-- deposit values in workbook 3
		activate object workbook wc
		repeat with xx from 1 to x -- 3
			repeat with yz from 1 to y -- 4
				set i12 to (y * (xx - 1) + yz)
				set value of cell yz of column (3 * (xx - 1) + 2) to item i12 of bcol -- columns B,E,H (step of 3: source 1, source 2, comparison)
			end repeat
		end repeat
		
		-- proceed with workbook 3
		-- enter EXACT formulae (NB excel automatically corrects function's range)
		set ubc to union range1 range "C2:C4" range2 range "F2:F4" range3 range "I2:I4" -- columns C,F,I
		set formula of ubc to "=EXACT(A2,B2)"
		
	end timeout
end tell

NB regarding the EXACT arguments… when you dump a formula into a range of cells, Excel will automatically adjust the relative cell references in the function's arguments. Of course, if you include ‘$’ in your cell references then you can control that behaviour (e.g. =EXACT($A2,$B2)).
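
For sheets of a different size, the hard-coded x and y could probably be read from the sheet instead. This is only a rough sketch: it assumes the data starts in cell A1 with no blank leading rows or columns, and it relies on the 'used range' property of the sheet, which I have not tested against every Excel version. Note that the destination column step stays at 3 (source 1, source 2, comparison), independent of x.

```applescript
tell application "Microsoft Excel"
	activate object workbook "sheep1.xlsx"
	-- assumption: the used range begins at A1, so its size equals the data size
	set ur to used range of active sheet
	set x to count of columns of ur -- number of columns that contain data
	set y to count of rows of ur -- number of rows that contain data, header row included
end tell
```

The range addresses used for the formulae ("C2:C4" and so on) would then need to be built from y as well, rather than typed in by hand.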


Awesome work Mockman! You are my legend!

I loved your “wolf” and “sheep” naming and I will adopt it!

As for the comparison, yes it works as expected for small tables. And there was no need for the $ in the EXACT formula as you mentioned.

However, I noticed that it times out when comparing bigger spreadsheets, e.g. the sales of our department for the past two years, where there are hundreds or even thousands of rows (our list of products) and many additional columns (prices per month, sales per month, and other details).

I’m glad that it works at least for smaller files. I used those filenames so I wouldn’t have to guess at whether I was referring to a source or destination file.

Can you identify where in the script it is bogging down? You might have to break the script into pieces… for example, if it occurs while inserting the ‘exact’ formulae, then you could perhaps build a loop and cycle through every third column, or perhaps, insert the formula into row 2 in every third column and then cycle through each, select and fill down. While this may not be the cause, multiple ranges in large spreadsheets will often create performance issues.
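
As a rough sketch of the "loop through every third column" idea, something like the following might work in place of the union range. The columnLetters handler is a hypothetical helper of mine (not part of Excel's dictionary), x and lastRow are assumed to be known in advance, and the longer timeout is just a guess at a workable value. It still relies on Excel adjusting the relative references as it fills each column.

```applescript
-- hypothetical helper: convert a 1-based column number to letters (1 -> "A", 27 -> "AA")
on columnLetters(n)
	set letters to ""
	repeat while n > 0
		set r to (n - 1) mod 26
		set letters to (character id (65 + r)) & letters
		set n to (n - 1 - r) div 26
	end repeat
	return letters
end columnLetters

set x to 3 -- number of source columns (assumed known)
set lastRow to 4 -- last row that contains data (assumed known)

tell application "Microsoft Excel"
	with timeout of 600 seconds -- assumed value; each command gets this long before timing out
		activate object workbook "wolf.xlsx"
		repeat with xx from 1 to x
			set c to 3 * xx -- comparison columns: 3 (C), 6 (F), 9 (I), ...
			set colA to my columnLetters(c - 2) -- column holding the source 1 values
			set colB to my columnLetters(c - 1) -- column holding the source 2 values
			set colC to my columnLetters(c) -- column that receives the comparison
			-- one single-column range per pass instead of one large union range
			set formula of range (colC & "2:" & colC & lastRow) to ("=EXACT(" & colA & "2," & colB & "2)")
		end repeat
	end timeout
end tell
```

If a single column is still too slow, the same loop could be split further into blocks of rows, which would also help narrow down where the script bogs down.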