Complex Lookup/Cross-Reference without deep-nest IFs?

I’ve got a project where I am writing a script that fixes File Type and Creator Type information that has been stripped from files.

The detection involves a combination of:

–check extension
–matches against known naming standards
–scan of the file “header” for known strings

So for example, a file ending in “.eps” and a positive match for “Creator: Adobe Illustrator(R) 8” in the header is known to be an Illustrator 8 EPS file and should be of File Type = EPSF and Creator Type = ART5 (the Illustrator version is simply logged, since all Illustrator files share the same type/creator in this case).

The criteria are all unique, so once a file meets one set of criteria (and can therefore be repaired), the script should stop checking it against the remaining criteria.

So the idea is:

CRITERIA1 = at least two dependent checks (CHECK1 & CHECK2 are TRUE)
CRITERIA2 = at least two dependent checks (CHECK3 & CHECK4 are TRUE)
CRITERIA3 = at least two dependent checks (CHECK4 & CHECK6 are TRUE)

IF ANYOF CRITERIA = TRUE
THEN load variables with values associated with found CRITERIA

So…

IF CRITERIA1 = TRUE
THEN variable1 = value1, variable2 = value2

I have about three dozen checks like this. I’m looking at one really ugly nested if-then-else setup. I had already planned to put the most common files at the top of the if-then-else so it is more likely to stop early. But this seems inefficient somehow, and I was wondering if there is some other structure I could use to do this lookup.

Thanks in advance for your help,

–Kevin

hi calvin,

not sure if this helps you, but could you write a generic handler to do the compares & then use your checks as parameters to the handler?

something like this:


set CHECK1 to {"1", "2", "3"}
set CHECK2 to {"a", "b", "c"}
set CHECK3 to {"4", "5", "6"}
set CHECK4 to {"d", "e", "f"}

set my1Var to "1a"
set my2Var to "4d"


CRITERIA(my1Var, CHECK1, CHECK2)
CRITERIA(my1Var, CHECK3, CHECK4)
CRITERIA(my2Var, CHECK1, CHECK2)
CRITERIA(my2Var, CHECK3, CHECK4)


on CRITERIA(theVar, aCheck, bCheck)
	set aBool to 0
	set bBool to 0
	repeat with x in theVar
		if aCheck contains x then
			set aBool to 1
		else if bCheck contains x then
			set bBool to 1
		end if
	end repeat
	
	if aBool + bBool is 2 then
		display dialog theVar & " is a match" & " to " & aCheck & " & " & bCheck
	end if
	
end CRITERIA

Suppose, for the sake of argument, that there were 31 tests whose results could be stored as a variable-length binary number. The tests would be run in order of how likely they are to pass, and each result (0 or 1) would be both added to a running sum and recorded at the beginning of a list, so that when the sum of the list elements reaches 2, the list forms a unique binary number identifying the two tests that passed.

If tests 4 and 6 passed, for example, we would stop with {1, 0, 1, 0, 0, 0} or (as string as number) 101000 = 40 decimal. A simple series of new ifs would identify the criterion to which this number belonged and generate the file and creator types you wanted, or the number itself could be coded as an address in a list somehow.
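
A minimal sketch of that idea, with the caveat that everything named here is made up for illustration: codeForResults, typesForCode, and codeTable are hypothetical, and "TYP3"/"CRE3" are placeholder codes. It assumes the test results are already collected as booleans with test 1 first, so test n contributes 2^(n-1) to the number.

```applescript
-- Hypothetical helper: pack pass/fail results (test 1 first) into one integer.
-- Test n contributes 2^(n-1), so tests 4 and 6 give 8 + 32 = 40.
on codeForResults(testResults)
	set theCode to 0
	set bitValue to 1
	repeat with aResult in testResults
		if contents of aResult then set theCode to theCode + bitValue
		set bitValue to bitValue * 2
	end repeat
	return theCode
end codeForResults

-- Hypothetical lookup: find the record whose resultCode matches.
-- Returns missing value when no criterion has that code.
on typesForCode(theCode, codeTable)
	repeat with anEntry in codeTable
		if (resultCode of anEntry) is theCode then
			return {fileType of anEntry, creatorType of anEntry}
		end if
	end repeat
	return missing value
end typesForCode

-- "TYP3" and "CRE3" are placeholders, not real type/creator codes.
set codeTable to {{resultCode:40, fileType:"TYP3", creatorType:"CRE3"}}
set theCode to codeForResults({false, false, false, true, false, true}) --> 40
typesForCode(theCode, codeTable) --> {"TYP3", "CRE3"}
```

The table replaces the second series of ifs: adding a criterion means adding one record rather than another branch.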

Yeah, if I am stuck with a basic if-then-else nest, then a handler along those lines will certainly clean up the code.

The problem still becomes…


if someCheck = someResult then
	doSomething()
else
	if someOtherCheck = someResult then
		doSomething()
	else
		if someOtherOtherCheck = someResult then
			doSomething()
		else
			if someOtherOtherOtherCheck = someResult then
				doSomething()
			else
				if someOtherOtherOtherOtherCheck = someResult then
					doSomething()
				else
					justMoreNesting()
				end if
			end if
		end if
	end if
end if

…gets very tedious with more than a half dozen checks, and I have at least two dozen, might be three dozen when I’m done. Eeek!

–Kevin

I’ve re-read it a number of times, but it still sounds like a series of nested IFs in order to get the binary sequence, then a second set of IFs to determine what the first set of IFs figured out.

But it’s even more likely that your answer just went way over my head.

–Kevin

hi calvin,

well, you could call your generic handlers from a generic handler:


set CHECK1 to {"1", "2", "3"}
set CHECK2 to {"a", "b", "c"}
set CHECK3 to {"4", "5", "6"}
set CHECK4 to {"d", "e", "f"}

set my1Var to "1a"
set my2Var to "4d"

someCheck(my1Var, my2Var, CHECK1, CHECK2, CHECK3, CHECK4)

on someCheck(a1Var, a2Var, CHECK1, CHECK2, CHECK3, CHECK4)
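	-- require both handlers to report a match (comparing them with = would also succeed when both return 0)
	if (CRITERIA(a1Var, CHECK1, CHECK2) = 1) and (OTHERCRITERIA(a2Var, CHECK3, CHECK4) = 1) then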
		display dialog "We have a winner!"
	end if
end someCheck

on CRITERIA(theVar, aCheck, bCheck)
	set aBool to 0
	set bBool to 0
	repeat with x in theVar
		if aCheck contains x then
			set aBool to 1
		else if bCheck contains x then
			set bBool to 1
		end if
	end repeat
	
	if aBool + bBool is 2 then
		return 1
	end if
	
	return 0
	
end CRITERIA

on OTHERCRITERIA(Var1, my1Check, my2Check)
	set aBool to 0
	set bBool to 0
	repeat with x in Var1
		if my1Check contains x then
			set aBool to 1
		else if my2Check contains x then
			set bBool to 1
		end if
	end repeat
	
	if aBool + bBool is 2 then
		return 1
	end if
	
	return 0
	
end OTHERCRITERIA

that would certainly shrink your code some, but i’m not sure it makes your problem any easier to solve.

NOTE: although this code works, it’s just an example. there are problems with the example, but the basic idea should be obvious.

Using else if would reduce the degree of nesting:

if someCheck = someResult then
	doSomething()
else if someOtherCheck = someResult then
	doSomething()
else if someOtherOtherCheck = someResult then
	doSomething()
else if someOtherOtherOtherCheck = someResult then
	doSomething()
else if someOtherOtherOtherOtherCheck = someResult then
	doSomething()
else
	justMoreElseIfing()
end if

Presumably, a hit with one of your checks narrows the range of other checks you need to do, so you might try a hierarchical approach. It wouldn’t necessarily be more compact, but it would be quite fast and easy to manage. Not knowing the exact form of your three dozen checks and how they relate to the various criteria, it’s difficult to suggest any exact code. The sort of thing I have in mind is:

  1. Get the information from the file.
  2. Check the extension (say) and pass the rest of the information to a handler that knows what other checks to do based on the expectations raised by the extension being what it is.

if (extn is ".eps") then
	doTheseChecks(theName, theHeader)
else if (extn is ".abc") or (extn is ".xyz") then
	doThisOne(theName, theHeader)
else if (extn is ".uh") then
	doSeveralMoreChecks(theName, theHeader)
else
	etc()
end if

After arranging the other check(s) and coming to a decision, the handler could then return the appropriate creator and file types to the main part of the script.
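
As a rough sketch of how such a handler might hand its result back, here is the Illustrator 8 EPS case from the original question. The handler names identifyFile and checkEPS, and the convention of returning an empty list for "no match", are assumptions made for this example only.

```applescript
-- Sketch only: handler names and the empty-list "no match" convention are assumptions.
on identifyFile(extn, theName, theHeader)
	if extn is ".eps" then
		return checkEPS(theName, theHeader)
	end if
	return {} -- no criteria matched
end identifyFile

-- Knows which further checks make sense once the extension is ".eps".
on checkEPS(theName, theHeader)
	if theHeader contains "Creator: Adobe Illustrator(R) 8" then
		return {"EPSF", "ART5"} -- {file type, creator type}
	end if
	return {}
end checkEPS

identifyFile(".eps", "logo.eps", "%%Creator: Adobe Illustrator(R) 8")
--> {"EPSF", "ART5"}
```

The main script only has to test whether the returned list is empty before repairing the file.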

My checks are not consistent; they can take the form of things like: 2-criteria AND, 2-criteria OR, 3-criteria mixed, member-of checks, etc. So using a “check handler” was too weird. Now that I’ve gotten to the script writing I decided on another “technique” (a term I use somewhat loosely, I admit) I’ve used before for complex multi-step checks.

Basically I write a series of stand-alone if-then-else blocks. Each block checks a certain trait, and perhaps has a few nested sub-ifs to narrow down what I’m looking for. The outermost if-then checks the value of one or more “flag” variables. The blocks themselves may or may not set these flags to something else.

So the first if-then check finds the “flags” in their default empty state, thus the check(s) inside will proceed. If the check comes up true, then the flags are changed; if not, they remain at their defaults. Once this block is done, the next checks are also surrounded by this “flag check.” If the first check found what it needed, then the second if-then block “sees” this via the flags, and never executes the inner checks.

Bit hard to verbalize, so here is dummy code:


--
-- what kind of file is it?
--
on lessNests()
	set variable1 to ""
	set variable2 to ""
	
	--
	--START DETECTIVE WORK
	--
	--CHECK #1
	if (variable1 = "") and (variable2 = "") then
		if someCheck then
			doSomething()
			set variable1 to someValue
			set variable2 to someValue
		else
			if someOtherCheck then
				doSomething()
				set variable1 to someValue
				set variable2 to someValue
			else
				if someOtherCheck then
					doSomething()
					set variable1 to someValue
					set variable2 to someValue
				end if
			end if
		end if
	end if
	
	--CHECK #2
	if (variable1 = "") and (variable2 = "") then
		if someCheck then
			doSomething()
			set variable1 to someValue
			set variable2 to someValue
		end if
	end if
	
	--CHECK #3
	if (variable1 = "") and (variable2 = "") then
		if someCheck then
			doSomething()
			set variable1 to someValue
			set variable2 to someValue
		end if
	end if
	
	--ERROR TRAP (if none of the checks yield anything)
	if (variable1 = "") and (variable2 = "") then
		if someCheck then
			doSomething()
		end if
	end if
	
end lessNests

This may not be 100% as efficient as a huge nest of if-then-else (and I’m staring at 12 “blocks” which each have 3-6 sub-nests, for a total of 40 checks in all), but it’s a lot easier to look at and troubleshoot. Well, for me anyway.
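
For comparison, the same stand-alone blocks can be sketched without flag variables if each block returns as soon as it succeeds, since a return leaves the handler and skips every later block. This is only a sketch under assumptions: the handler name, its parameters, and the second check (with its "TYP2"/"CRE2" placeholder codes) are invented for illustration.

```applescript
-- Sketch only: parameters and the second check are assumptions for illustration.
on identifyWithEarlyReturn(extn, theHeader)
	-- CHECK #1: two-criteria AND (the Illustrator EPS example)
	if (extn is ".eps") and (theHeader contains "Creator: Adobe Illustrator(R) 8") then
		return {fileType:"EPSF", creatorType:"ART5"}
	end if
	
	-- CHECK #2: two-criteria OR (placeholder extensions and codes)
	if (extn is ".abc") or (extn is ".xyz") then
		return {fileType:"TYP2", creatorType:"CRE2"}
	end if
	
	-- ERROR TRAP: reached only when no check matched
	return {fileType:"", creatorType:""}
end identifyWithEarlyReturn

identifyWithEarlyReturn(".eps", "%%Creator: Adobe Illustrator(R) 8")
--> {fileType:"EPSF", creatorType:"ART5"}
```

Each block can still hold whatever mix of AND, OR, and member-of tests it needs; only the outer flag test goes away.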

If anyone has better ways to do it, I’m still open to suggestions. I’m only about 1/3 of the way through coding the checks.

Thanks again for all the insights.

–Kevin