Better code for parsing an array?

Hello, better-code writers,

Look at this code:

I suppose there is a better way to do this. But imagine some_value could not be found in the colorTable NSArrayController.
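(The snippet itself isn't shown above; roughly, it's a fast enumeration over the controller's arrangedObjects with an equality test on mColorRef. A sketch, with names taken from later in the thread, might look like this:)

    NSColor *color = nil;
    // Walk the controller's arrangedObjects and keep the color whose
    // mColorRef matches some_value; color stays nil if nothing matches.
    for (NSManagedObject *entry in [colorTable arrangedObjects]) {
        if ([[entry valueForKey:@"mColorRef"] integerValue] == some_value) {
            color = [entry valueForKey:@"mColor"];
            break;
        }
    }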

Regards,

From which perspective? Performance-wise, or better looking?

Both.

I’d use Grand Central Dispatch on a background thread for this; it all depends on how many records you have to go through… 1,000? 10,000? More?

I find that the for loop is pretty fast for most jobs; I rarely notice any lag, and if I suspect any I use GCD or similar things, like performSelectorInBackground:withObject:.
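A minimal sketch of that GCD route, assuming hypothetical helpers colorForRef: and applyColor: (and keeping in mind, per the Core Data quote below, that managed objects are confined to their own thread):

    // Do the heavy work on a background queue, deliver the result on the main queue.
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSColor *result = [self colorForRef:some_value];  // hypothetical helper running the search
        dispatch_async(dispatch_get_main_queue(), ^{
            [self applyColor:result];                     // hypothetical: use the result for drawing
        });
    });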

What kind of performance issues do you experience, if any?

Core Data Programming Guide:

Sounds like a “forget it” to me. “Inconsistent state”. Mr. Spock has stopped in Cupertino.

Well, I’ll stick to my fast enumeration. It was just a cosmetic problem.

It looks like you’re coding something “manually” that Core Data is supposed to do automatically.

Hi Stefan,

Core Data is doing a lot for me (for instance, multiple undo/redo is amazing) but I haven’t explored all its possibilities yet.

This framework forces you to think before coding: if the model is clear, there is very little coding to do. If it’s unclear, don’t even think about creating files, or you will quickly fall into versioning problems. When you archive files with NSArchiver and keyed encoded data (or dictionaries), you can much more easily recover old files, even from previous versions. With Core Data, it’s very painful.

One of the problems here is that I’m testing unsorted data. “arrangedObjects” is a misleading word in this case: I can’t rely on the order in which Core Data recovers a file’s content (I use a sort descriptor to arrange the objects alphabetically).

The other problem is that I can’t establish a to-many relationship here, because relationships work between objects, not between their attributes.

Filtering the array with a predicate won’t add much clarity or speed over a fast enumeration.
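For comparison, the predicate version being dismissed would look roughly like this (same hypothetical names as the loop above):

    // NSPredicate version: tidier on the surface, still a linear scan underneath.
    NSPredicate *match = [NSPredicate predicateWithFormat:@"mColorRef == %d", some_value];
    NSArray *hits = [[colorTable arrangedObjects] filteredArrayUsingPredicate:match];
    NSColor *color = ([hits count] > 0) ? [[hits objectAtIndex:0] valueForKey:@"mColor"] : nil;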

I was wondering, since this test is made 4096 times for a given set of “arrangedObjects”, whether I could establish a reference table between the integerValue of mColorRef and the corresponding mColor value.

mColorRef is assigned when a record is created, by finding the first available integer. Although it could theoretically be as big as 32767, in practice it is much smaller, maybe 20 or 30.

I envisaged a C vector (freed after the test). In this vector, mColorRef would be the index and mColor the value, so I could avoid the loop and the IF test by simply doing

color = myCvector [n]

A change to this design would be independent of the Core Data model.
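A rough sketch of that idea, pre-ARC style, with illustrative names (the table is built once, used for the 4096 tests, then freed):

    // Build a transient table: index = mColorRef, value = the matching mColor.
    NSInteger tableSize = 0;
    for (NSManagedObject *entry in [colorTable arrangedObjects]) {
        NSInteger ref = [[entry valueForKey:@"mColorRef"] integerValue];
        if (ref >= tableSize) tableSize = ref + 1;
    }
    NSColor **lut = (NSColor **)calloc(tableSize, sizeof(NSColor *));  // under ARC this would need __unsafe_unretained storage
    for (NSManagedObject *entry in [colorTable arrangedObjects]) {
        lut[[[entry valueForKey:@"mColorRef"] integerValue]] = [entry valueForKey:@"mColor"];
    }

    // Inside the 4096-iteration test, the loop and the IF disappear:
    NSColor *color = (n >= 0 && n < tableSize) ? lut[n] : nil;

    // When the test is done:
    free(lut);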

My version of the code is working, and working fast (you only notice a delay when the mouse is dragged, because the 4096-iteration test then runs continuously and is sometimes skipped).

Maybe it’s more a problem of drawing. I could refresh only the elements that changed, not the whole thing.
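In Cocoa terms that would just mean invalidating the changed cell’s rect instead of the whole view (rectForCellAtIndex: is a hypothetical helper):

    // Whole-view refresh:
    [self setNeedsDisplay:YES];

    // Targeted refresh of the one cell that changed:
    NSRect dirty = [self rectForCellAtIndex:n];  // hypothetical
    [self setNeedsDisplayInRect:dirty];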

After testing, including with Quartz Debug, the answer is:

No optimization possible.

  • A color look-up table is not faster than fetching a key from an NSManagedObject.
  • Drawing the whole view is as fast as calculating a “dirty rect” for the update.

I suppose these basic functions of the frameworks are highly optimized. I shall stop thinking I’m coding for the Mac 512K.

Do you mean the color lookup table in C (the one you called a vector)?

DJ,

Yes. I tested three possibilities:

  1. A pure C vector (integer);
  2. A C matrix;
  3. An NSColor array.

Maybe the optimization becomes noticeable as the number of indexes gets large, say 1,000 or 32,767 items, but for the tests I had fewer than 20 items. Anyway, I’m not sure it makes a difference for NSArray whether it fetches the first or the 30,000th item.

On the other hand, the loop counter (the number of times I look into the color table) is always 4096.

The point is: does this loop really need optimization? A color lookup table consumes more memory (int size × number of indexes), and if an index cannot be found (one of those “should never happen” situations), that’s no problem for NSArray. For a C structure, it’s not so sure.
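Concretely, for the “not found” case (colorArray standing in for the NSColor-array variant from the list above, lut for the C table sketched earlier):

    // NSArray: an out-of-range index raises NSRangeException — loud, but well defined.
    NSColor *c1 = (n < (NSInteger)[colorArray count]) ? [colorArray objectAtIndex:n] : nil;

    // Raw C table: an out-of-range index is silent undefined behaviour,
    // so the bounds check is entirely your responsibility.
    NSColor *c2 = (n >= 0 && n < tableSize) ? lut[n] : nil;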

And a “Core-Dataed” version would add the useless complication of handling to-many relationships between elements, and the storage would increase drastically. So: forget it.

20 years ago it was really worthwhile to split an integer into 32 binary flags. But nowadays…

Thank you for your interest, I’ll certainly be back with other questions. :wink:

You mean a singly linked list? Vector is something I use in C++, not in C. A linked list is used when items continuously need to be added to and removed from the list; the drawback is that getting an item by its index is slow(er). Indexed lists are normally used for lists that only add data at the end and need to get an item at a specific index. Also, to remove data, a marker value is usually written in place rather than actually removing the data (it’s faster).
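In C terms, the trade-off looks roughly like this (minimal sketch, nothing from the original code):

    /* Indexed storage: constant-time access by index. */
    int values[20] = {0};
    int third = values[2];

    /* Singly linked list: cheap insertion and removal, but reaching
       index i means walking i nodes from the head. */
    struct Node { int value; struct Node *next; };
    struct Node *node = head;               /* 'head' assumed to exist */
    for (int i = 0; i < 2 && node != NULL; i++)
        node = node->next;
    int thirdFromList = (node != NULL) ? node->value : 0;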

First of all: you decide whether there is a problem or not. The code is good, but if you want it to perform faster, look at valueForKey:. It’s very expensive and time-consuming, so if you’re looking to reduce time, you should start there.

Objective-C binds almost everything late and is pretty simple: there is a table of method names that point to C functions. So calling a method that doesn’t exist won’t produce a compile-time error. But when you dereference pointer addresses directly, which is what we’re doing here, it doesn’t matter whether you use Objective-C or C; that “safety” feature of Objective-C doesn’t apply.
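A rough illustration of what that late binding buys you, and where it stops (the selector and variables are the hypothetical ones from the earlier sketches):

    // Objective-C: the method is looked up by name at run time, so the
    // receiver can be asked first whether it actually implements it.
    SEL sel = NSSelectorFromString(@"mColor");
    if ([entry respondsToSelector:sel]) {
        NSColor *c = [entry performSelector:sel];
    }

    // A raw pointer dereference has no such lookup or check: whatever
    // sits at lut[n] is what you get.
    NSColor *c2 = lut[n];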

Bit flags are still used inside your machine, a lot more than you think. The wider your processor’s integers, the more flags you can store inside a single unsigned integer. Think about fonts in your system: bold, italic and underline are all stored in a single bit mask.
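In code, the font example comes down to something like this (constant names invented for the illustration):

    enum {
        kTraitBold      = 1 << 0,   // one bit per trait
        kTraitItalic    = 1 << 1,
        kTraitUnderline = 1 << 2
    };

    unsigned int traits = kTraitBold | kTraitUnderline;  // set two flags
    BOOL isItalic = (traits & kTraitItalic) != 0;        // test a flag
    traits &= ~kTraitBold;                               // clear a flag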

I found someone who made the test for me at:
http://belkadan.com/blog/2007/10/Performance-Optimization-Why-We-Cannot-Use-valueForKeyPath/

It seems that valueForKeyPath: is indeed very time-consuming: the hand-expanded version is ugly, but more than 10 times faster than the single valueForKeyPath: call.
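The snippets from that post aren’t reproduced here, but the shape of the comparison is roughly this (keys are hypothetical):

    // One valueForKeyPath: call — the string is parsed and each component
    // resolved through KVC.
    NSColor *slow = [entry valueForKeyPath:@"mGroup.mColor"];

    // Splitting the path by hand — uglier, but it skips the key-path
    // parsing step.
    NSColor *fast = [[entry valueForKey:@"mGroup"] valueForKey:@"mColor"];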

Interesting. On the other hand, the difference between objectForKey: (the NSDictionary method) and valueForKey: (the KVC method) is very small (let’s say insignificant).

My optimization question was due to this: I color cells with a Pencil tool. When the pencil moves too fast, some of the cells are skipped. This never happens with Photoshop :wink: but I suppose its bitmap operations use a completely different and much better optimized design (C or even assembler).

Because the same hash function is used for many key-value lookups in Cocoa.