PDA

View Full Version : autorelease malloc data?




MrFusion
Apr 11, 2011, 03:23 AM
I am currently messing around with the sgetrf_ and sgetrs_ functions in the accelerate framework. When returning the solution, these functions overwrite their array parameters. "columnMajorMatrix" in this snippet is overwritten to return the output.


sgetrf_(&numberofRows, &numberofColumns, columnMajorMatrix, &numberofRows, pivotIndices, &errorCode);


I would like to hang on to that data. Therefore I copy the data into a new array "result" and feed that to sgetrf_. The "result" is returned to the user. Copying of the data is done via malloc and memcpy. However, I have no free equivalent in the code below. In other words, there is a memory leak here. I also don't want to rely on the user to free the allocated memory. As far as I am aware, there is no autorelease equivalent for malloc memory. Is there a way around this?

This snippet is incomplete (with regard to "numberofRows", etc.) , but illustrates the basic problem I want to solve.

-(float *) solveProblemFor:(float *) columnMajorMatrix
... : ...
{
//copy data
float *result = (float *)malloc(numberofRows*numberofColumns*sizeof(float));
memcpy(&result,columnMajorMatrix,numberofRows*numberofColumns*sizeof(float));

//solve for dataset
....
sgetrf_(&numberofRows, &numberofColumns, result, &numberofRows, pivotIndices, &errorCode);
....

//return solution. And free data?
return result;
}


I would prefer to use malloc since I don't know in advance how big these arrays can get.

Thanks!



jiminaus
Apr 11, 2011, 03:58 AM
Perhaps try this untested equivalent:

-(NSMutableData *) solveProblemFor:(float *) columnMajorMatrix
... : ...
{
//copy data
NSMutableData *result =
[NSMutableData dataWithBytes:columnMajorMatrix
length:numberofRows*numberofColumns*sizeof(float)];

//solve for dataset
....
sgetrf_(&numberofRows, &numberofColumns, (float *)[result mutableBytes], &numberofRows, pivotIndices, &errorCode);
....

return result;
}


Actually you try to use NData/NSMutableData instead of naked pointers to memory blocks as much as possible in Objective-C. Not only does it have the advantage of carrying useful metadata like its length, you don't need special memory-management, just use the usual retain/release/autorelease techniques. It will even garbage collect in a GC environment.

It can also simply your code. For example, if you used NSMutableData through out, the code above becomes:

-(NSMutableData *) solveProblemFor:NSData *columnMajorMatrix
... : ...
{
//copy data
NSMutableData *result = [[columnMajorMatrix mutableCopy] autorelease];

//solve for dataset
....
sgetrf_(&numberofRows, &numberofColumns, (float *)[result mutableBytes], &numberofRows, pivotIndices, &errorCode);
....

return result;
}

chown33
Apr 11, 2011, 01:03 PM
I think you should make a wafer-thin object layer on top of your malloc'ed memory. Then use those objects whenever you want anything more than the basic primitive C functions. You didn't say why you wanted to preserve the incoming data, so that explanation might be important for designing the encapsulation layer.

You already have a class with this method:
-(float *) solveProblemFor:(float *) columnMajorMatrix
... : ...

So make another class that encapsulates the malloc'ed float*, the numberOfColumns, numberOfRows, etc. Then only do copying operations with that object.

I think NSData and NSMutableData are a bad idea, because it loses important structural metadata like numRows, numCols, etc. The number of bytes is insufficient metadata for interpreting the structure of the data, and you can't operate on the data without knowing its structure. If you go to the trouble of putting the structural metadata in the NSData, then you might as well just make an object that has actual members and properties, rather than relying on what is essentially a serialized-to-bytes representation in NSData. Serializing for storage or transport is a different question than encapsulating.

MrFusion
Apr 11, 2011, 04:43 PM
Thanks for the replies!

I think you should make a wafer-thin object layer on top of your malloc'ed memory. Then use those objects whenever you want anything more than the basic primitive C functions. You didn't say why you wanted to preserve the incoming data, so that explanation might be important for designing the encapsulation layer.


For undo/redo operations when the user doesn't like the result.


You already have a class with this method:
-(float *) solveProblemFor:(float *) columnMajorMatrix
... : ...

So make another class that encapsulates the malloc'ed float*, the numberOfColumns, numberOfRows, etc. Then only do copying operations with that object.

I think NSData and NSMutableData are a bad idea, because it loses important structural metadata like numRows, numCols, etc. The number of bytes is insufficient metadata for interpreting the structure of the data, and you can't operate on the data without knowing its structure. If you go to the trouble of putting the structural metadata in the NSData, then you might as well just make an object that has actual members and properties, rather than relying on what is essentially a serialized-to-bytes representation in NSData. Serializing for storage or transport is a different question than encapsulating.

Will this not end up in (badly) recreating NSMutableArray and NSNumber?
This snippet also gives a bunch of methods (last three) which are not always applicable, depending on primitiveType.
When dealing with data structures, I certainly feel the lack of a formal CS background. :(



@interface Matrix : NSObject
{
NSInteger numberRows,numberColumns;

@private
NSInteger primitiveType; //int=0, float=1, double=2 because sizeof(primitiveType) needs to be known as well
NSMutableData *matrixData; // copied | created via malloc
}

-(id) initWithData:(NSMutableData *)newData
ofType:(NSInteger) type;

-(void) setData:(NSMutableData *)newData
ofType:(NSInteger) type;

-(void) setValue:(NSNumber *) newValue
atRow:(int) r
column:(int) c;

-(NSNumber *) valueAtRow:(int) r
column:(int) c;

-(int) intAtRow:(int) r
column:(int) c;
-(float) floatAtRow:(int) r
column:(int) c;
-(double) doubleAtRow:(int) r
column:(int) c;

@end

jiminaus
Apr 11, 2011, 05:33 PM
So make another class that encapsulates the malloc'ed float*, the numberOfColumns, numberOfRows, etc. Then only do copying operations with that object.


Agree. Especially because the array is in column-major order. The opposite to C's row-major order. It would be too easy to accidentally code arr[row][col] instead of arr[col][row]. Whereas [arr valueAtRow:row column:col] is hard to get wrong.


I think NSData and NSMutableData are a bad idea, because it loses important structural metadata like numRows, numCols, etc. The number of bytes is insufficient metadata for interpreting the structure of the data, and you can't operate on the data without knowing its structure. If you go to the trouble of putting the structural metadata in the NSData, then you might as well just make an object that has actual members and properties, rather than relying on what is essentially a serialized-to-bytes representation in NSData. Serializing for storage or transport is a different question than encapsulating.

Here I disagree. I think the custom class should still wrap the memory buffer in an NSData. For the reasons I said above, particularly the advantage of integrated memory management. I'm not suggesting to subclass NData, but have an NSMutableData ivar instead of a bare pointer.

While you typically see NSData being used in serialisation and transportation in Objective-C, that's not its reason d'etre. NSData's reason d'etre is just to manage a memory block. You only typically see it in the contexts of serialisation and transportation because there's little need for memory blocks in Objective-C otherwise. In Objective-C you'd typically use a higher structure like NSArray where C would use a memory block.

I use NSData (or more typically NSMutableData) any time I create a memory block in Objective-C. When interfacing with a C library, I wrap a received memory block as soon a possible using one of the NoCopy initialisers. Because I code in a GC environment, I then don't need to take any special care with it, and I don't encounter anything like of problems that the OP had with passing up responsibility for freeing the memory block.


Will this not end up in (badly) recreating NSMutableArray and NSNumber?
This snippet also gives a bunch of methods (last three) which are not always applicable, depending on primitiveType.
When dealing with data structures, I certainly feel the lack of a formal CS background. :(


Do you have a good reason for returning making Matrix generalised like this? I think it would be better performance-wise to have a FloatMatrix. The wrapping of the primitiveType in matrixData into and out of an NSNumber wouldn't be so cheap. Think about iterating over the 10x10 Matrix. The would be 100 conversions of a float to NSNumber.

Or perhaps this is a good place to implement a class cluster.

And if you do decide to keep this generalised, consider using Objective-C's type encodings instead of your custom NSInteger codings for primativeType. See the Type Encodings chapter of the Objective-C Runtime Programming Guide. There's even an @encode keyword to help keep your code self-documenting.

chown33
Apr 11, 2011, 06:32 PM
I use NSData (or more typically NSMutableData) any time I create a memory block in Objective-C. When interfacing with a C library, I wrap a received memory block as soon a possible using one of the NoCopy initialisers.

Aha, I see where you're going with this. The NoCopy initializer is effectively wrapping the malloc'ed block in a wafer-thin object wrapper. I forgot that the NoCopy variants require a malloc'ed pointer:
The returned [NSData] object takes ownership of the bytes pointer and frees it on deallocation. Therefore, bytes must point to a memory block allocated with malloc.

In that case, I concur; wrapping in NSData using NoCopy is a good approach.


As to the question of undo, I think the approach should be to make the copy as part of an explicit linkage to an undo strategy. Smashing it all into a single solver method is poor factoring. And frankly, if you're making a copy, you should have an entire copy of the object, not just the data array. That means copying the numRows and numCols, too, since those appear to be as mutable as the data values.

And I agree with the suggestion of using a simple FloatMatrix. Design it as simply as possible. Worry about generalizations later, after you have a demonstrated need. At that point it may make more sense to use a protocol instead of class/subclass. You'll have to see how the design and implementation progresses.

And if you're going to make primitiveType an integer, why mess around with codes that stand for sizes? Just use the actual sizeof(type). Does it matter if the size for, e.g. int and float, are the same?


EDIT:
The more I think about it, the more I think NSData is unnecessary here. Just store the malloc'ed pointer as an ivar, and free() it in dealloc after checking for null.

@interface FloatMatrix : NSObject
{
NSInteger numberRows,numberColumns;

@private
float * mallocedData;
}

Why? Because there doesn't seem to be any reason for two wrappers, when one will do. You have to release an NSMutableData in dealloc, so you're not gaining anything there. You have to copy the contents of the data when you copy the FloatMatrix, so you're not gaining anything there. If you have an NSMutableData, you have to send a message to get the pointer, so you're paying a price there. Since you need the malloc'ed pointer (not the NSMutableData*) for almost all the operations, you're paying the message cost every time. And for what? Just so you can have an NSMutableData* in the @interface instead of a float*?


On the subject of undo:
If the sole purpose of copying is to undo, then you can simplify the object a lot by only making it be redo-able data. That is, to do anything with the data, it must be put back into a FloatMatrix object, or whatever the C struct is that's got the encapsulated structural info like numRows and numCols.

I'm assuming there's already a struct of some kind like:
struct myMatrix
{
int numRows, numCols;
float * floatData;
};

If not, I'd like to know how the structural metadata like numRows, numCols, (and apparently primitiveSize) is kept with the malloc'ed data. Maybe the whole thing is a malloc'ed struct.

MrFusion
Apr 12, 2011, 04:48 AM
Do you have a good reason for returning making Matrix generalised like this? I think it would be better performance-wise to have a FloatMatrix. The wrapping of the primitiveType in matrixData into and out of an NSNumber wouldn't be so cheap. Think about iterating over the 10x10 Matrix. The would be 100 conversions of a float to NSNumber.


For mathematical operations and plotting, the back and forth conversion with NSNumber is indeed inefficient for what I want to do. However, and NSArray of NSNumber values is convenient for displaying data in a NSTable via bindings.


Or perhaps this is a good place to implement a class cluster.


This is also a good suggestion. NSNumber is a class cluster. All the access methods are public, but the storage implementation is private (according to the documentation on class clusters). This does give a long list of accessor methods for each primitive type.
But this can be extended via categories.


And if you do decide to keep this generalised, consider using Objective-C's type encodings instead of your custom NSInteger codings for primativeType. See the Type Encodings chapter of the Objective-C Runtime Programming Guide. There's even an @encode keyword to help keep your code self-documenting.

I didn't know about this. Thanks.