Lots of binary combinations

Hi there,

I need to calculate 131,072 unique binary numbers, where the 1s and 0s are always 17 characters long
For example
10000000000000000
01000000000000000
00100000000000000

The 1s designate the position of an item in a physical code (a bit like a barcode, with a variable number of bars)

This is what I have so far:


script so
	property theBigList : {}
	property posOnly : {}
end script
repeat with q from 1 to 20000
	set theList to {}
	set z to q
	repeat
		if ((z mod 2) is 0) is true then
			set theList to 1 & theList
			set z to ((z - 2) / 2)
		else
			set theList to 0 & theList
			set z to ((z - 1) / 2)
		end if
		if z is 0 then
			exit repeat
		end if
	end repeat
	set xCount to (count theList)
	if xCount is less than 17 then
		repeat with i from (xCount + 1) to 17
			set end of theList to 0
		end repeat
	end if
	if theList is not {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} then
		set listwithTabs to my getdelimitedList(theList, tab, text)
		set isUnique to listwithTabs is in so's posOnly
		if not isUnique then
			copy listwithTabs to end of so's posOnly
			copy (q & tab & listwithTabs) as text to end of so's theBigList
		end if
	end if
end repeat

set finalTocopy to my getdelimitedList(so's theBigList, return, text)

on getdelimitedList(theInput, theDL, myClass)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDL
	if myClass is list then
		set theList to text items of theInput
	else
		set theList to text items of theInput as text
	end if
	set AppleScript's text item delimiters to tid
	return theList
end getdelimitedList

The script above is set to 20,000 repeats which takes 84 seconds and gives me 10,000 unique codes. North of 20,000 iterations, the time taken goes right up.
I changed the main repeat loop to 131,072 and the script took 1 hour and 8 minutes and only then produced half as many codes (65536). Before I leave my script running all night, is there a better way to achieve the same thing?

The end result needs to go into a mySQL database, so that is why I have the results tab delimited.

Many thanks for any help you could give

Ian

I’m not sure whether I got the proper output, but this is a bit faster


on StringWithBits(mask)
	set binaryString to {mask * 2 as text}
	repeat with i from 1 to 17
		set end of binaryString to mask mod 2 as text
		set mask to mask div 2
	end repeat
	set binaryString to binaryString as text
	return binaryString
end StringWithBits

script so
	property resultList : {}
end script

set {tid, text item delimiters} to {text item delimiters, tab}
repeat with i from 1 to 131072
	set end of so's resultList to StringWithBits(i)
end repeat
set text item delimiters to tid
so's resultList


Hi.

This takes about 92 seconds on my MacBook Pro:

local o, astid, binaryList

script
	property l : {}
end script
set o to result

repeat with i from 0 to 131071
	set end of o's l to (i div 65536 as text) & ¬
		text 2 thru 9 of (100000000 + i div 32768 mod 2 * 10000000 + i div 16384 mod 2 * 1000000 + i div 8192 mod 2 * 100000 + i div 4096 mod 2 * 10000 + i div 2048 mod 2 * 1000 + i div 1024 mod 2 * 100 + i div 512 mod 2 * 10 + i div 256 mod 2 as text) & ¬
		text 2 thru 9 of (100000000 + i div 128 mod 2 * 10000000 + i div 64 mod 2 * 1000000 + i div 32 mod 2 * 100000 + i div 16 mod 2 * 10000 + i div 8 mod 2 * 1000 + i div 4 mod 2 * 100 + i div 2 mod 2 * 10 + i mod 2 as text)
end repeat
-- At this point, o's l is a list of the binary-number strings.

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set binaryList to o's l as text
set AppleScript's text item delimiters to astid
-- binaryList is a single text containing all the binary numbers, separated by linefeeds.

-- Return the first ten and last ten lines, just to show they're there.
tell binaryList to return {text from paragraph 1 to paragraph 10, text from paragraph 131063 to paragraph 131072}

amazing as always :slight_smile:

But the (non-competitive) winner is the Objective-C version: 2,2 seconds

This is a variation on Nigel’s fine script, but on my Mac runs about four times faster.

local o, astid, binaryList, s

script
	property l : {}
end script
set o to result

script
	property m : {}
end script
set s to result

repeat with i from 0 to 15
	set end of s's m to text 2 thru 5 of (10000 + i div 64 mod 2 * 1000000 + i div 32 mod 2 * 100000 + i div 16 mod 2 * 10000 + i div 8 mod 2 * 1000 + i div 4 mod 2 * 100 + i div 2 mod 2 * 10 + i mod 2 as text)
end repeat
repeat with x in {"0", "1"}
	repeat with i from 1 to 16
		repeat with j from 1 to 16
			repeat with k from 1 to 16
				repeat with n from 1 to 16
					set end of o's l to x & item i of s's m & item j of s's m & item k of s's m & item n of s's m
				end repeat
			end repeat
		end repeat
	end repeat
end repeat

-- At this point, o's l is a list of the binary-number strings.

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set binaryList to o's l as text
set AppleScript's text item delimiters to astid
-- binaryList is a single text containing all the binary numbers, separated by linefeeds.

-- Return the first ten and last ten lines, just to show they're there.
tell binaryList to return {text from paragraph 1 to paragraph 10, text from paragraph 131063 to paragraph 131072}

Maybe I have a faster Mac than Stephen, but 5.5 seconds is not too shabby compared with his Objective-C time. (Nigel’s original takes about 22 seconds here.)

Sorry, I was using a NSLog statement at the end which tampered the result.
Without any logging the ObjC code takes 270 ms


NSString *NSStringWithBits(uint32_t mask) {
    NSMutableString *mutableStringWithBits = [NSMutableString new];
    for (int8_t bitIndex = 0; bitIndex < 17; bitIndex++) {
        [mutableStringWithBits appendString:mask & 1 ? @"1" : @"0"];
        mask >>= 1;
    }
    return [mutableStringWithBits copy];
}
         
 NSMutableArray *list = [NSMutableArray array];
   for (uint32_t i = 0; i < 131072; ++i) {
      [list addObject:NSStringWithBits(i)];
 }


Ah, that makes sense. I was thinking later, after a second coffee, that your original figure seemed pretty slow…

Brilliant! Thanks for all of your replies.

I obviously only need to perform this once to get my result for the database, so I can use any of the fine examples you have provided, but out of interest, how would I make the Obj-C one usable? As in, can it be called somehow from a call method in Applescript, or turn it into a command line application (I have only just recently dabbled with the basics of Obj-C)

Now to figure out how your examples work!

Thanks

Ian

If you’re running Mavericks, the easiest way is to save it as a framework and call it from an ASObjC-based library. Otherwise you could save it a command-line app, or a scriptable app.

Snow Leopard here (because I’m frightened of making the jump to Obj-C /ASOC :smiley: )

I’ll try compiling it in XCode

still much faster with NSStringWithBits written in C


static const uint8_t bitLength = 17;

NSString *NSStringWithBits(int32_t mask) {
    char array[bitLength];
    for (uint8_t bitIndex = 0; bitIndex < bitLength; bitIndex++) {
        array[bitIndex] = 0x030 + (mask & 1);
        mask >>= 1;
    }
    return [[NSString alloc] initWithBytes:(const void*)array length:sizeof(array) encoding:NSASCIIStringEncoding];
}


it takes 46 ms for 2^17 iterations

Wonderful! Calculate only the 16 possible four-bit values and then just plug them into the seventeen-bit strings as required! (I like the form of the outer loop for the hi-end bit. :wink: ) It’s only just over twice as fast as mine on my machine (43 seconds as opposed to 92), but that’s still a considerable saving in execution time!

I haven’t been able to try Stefan’s code, but 46 ms is certainly a better prospect than Ian’s “all night” and the C version is particularly easy on the eye. :slight_smile:

Maybe it’s fair to say for the others that this example does most of the processing on stack memory and not on heap memory. It’s not only C that is faster than ObjC but also the difference in memory.

On closer inspection, there’s some superfluous calculation here. But it adds virtually nothing to the running time:

repeat with i from 0 to 15
	set end of s's m to text 2 thru 5 of (10000 + i div 8 mod 2 * 1000 + i div 4 mod 2 * 100 + i div 2 mod 2 * 10 + i mod 2 as text)
end repeat

See, I really did cut-and-paste :wink: