Unable to locate memory leak

Discussion in 'Mac Programming' started by ranguvar, Apr 2, 2010.

  1. ranguvar macrumors 6502

    Joined:
    Sep 18, 2009
    #1
    Hi,

    I have yet another memory management problem. Here's my code:

    Code:
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    
    // expression is an NSString object.
    expression = [expression stringByTrimmingCharactersInSet:
    			  [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    
    NSArray *arguments = [NSArray arrayWithObjects:expression, [@"~/Desktop/file.txt" stringByExpandingTildeInPath], @"-n", @"--line-number", nil];
    NSPipe *outPipe = [[NSPipe alloc] init];
    
    NSTask *task = [[NSTask alloc] init];
    [task setLaunchPath:@"/usr/bin/grep"];
    [task setArguments:arguments];
    [task setStandardOutput:outPipe];
    [outPipe release];
    
    [task launch];
    
    NSData *data = [[outPipe fileHandleForReading] readDataToEndOfFile];
    
    [task waitUntilExit];
    [task terminate];
    [task release];
    
    NSString *stringOrig = [[NSString alloc] initWithBytes:[data bytes] length:[data length] encoding:NSUTF8StringEncoding];
    NSString *string = [stringOrig stringByReplacingOccurrencesOfString:@"\r" withString:@""];
    
    int linesNum = 0;
    
    NSMutableArray *possibleMatches = [[NSMutableArray alloc] init];
    
    // Check if the output appears to be valid.
    if ([string length] > 0) {
    	
    	// Separate the different lines.
    	NSArray *lines = [string componentsSeparatedByString:@"\n"];
    	linesNum = [lines count];
    	
    	// Now go through all the different lines.
    	for (int i = 0; i < [lines count]; i++) {
    		
    		NSString *currentLine = [lines objectAtIndex:i];
    		NSArray *values = [currentLine componentsSeparatedByString:@"\t"];
    		
    		// Let's check if this is a valid line.
    		if ([values count] == 20) {
    			// OK, this might be something we're looking for. Add it to the array for later inspection.
    			[possibleMatches addObject:currentLine];
    		}
    	}
    }
    
    [stringOrig release];
    [pool release];
    
    return [possibleMatches autorelease];
    The code keeps allocating more and more memory, although it shouldn't. It's noticeable when you grep a large file for an expression that occurs often in that file (e.g. if grep returns something like 400 lines).

    I've checked with the Leaks tool from Instruments and the Clang Static Analyzer, but neither finds any leaks.

    I really tried to adhere to the simple Cocoa memory management rules, but it seems like I'm doing something wrong.

    Thanks for any help!

    -ranguvar
     
  2. gnasher729 macrumors P6

    Joined:
    Nov 25, 2005
    #2
    At which point do you think all your tons and tons of autoreleased objects will actually be released? Think about that question unless you know the answer (and then you will know the answer to your problem as well). If you can't figure out when your autoreleased objects will be released, then tell us.
     
  3. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #3
    Every time I call this code I surround it by first creating an NSAutoreleasePool, then executing the code, and finally releasing the NSAutoreleasePool. Why would any objects not get deallocated?
     
  4. eharley macrumors newbie

    Joined:
    Dec 27, 2007
    #4
    You have to add the objects you want collected to the pool before you drain it. Where are you telling the objects you're allocating to be autoreleased?
     
  5. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #5
    I believed objects that are not created with init, copy or new get added to the current pool automatically? Anyways, would you mind posting code so that I can see what you mean?
    Why would I send autorelease to any of the objects in my code?
     
  6. Catfish_Man macrumors 68030

    Joined:
    Sep 13, 2001
    Location:
    Portland, OR
    #6
    That code looks correct offhand to me. Have you tried using the ObjectAlloc instrument to see which objects are sticking around?
     
  7. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #7
    Yes, CFString and CFArray (store-deque).
     
  8. jared_kipe macrumors 68030

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #8
    Well, I didn't read through the code all the way, but it seems that line 4:
    expression = [expression stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    would probably leak.

    The reason is that apparently expression is already an NSString object and you are stomping on its pointer with a new object.

    Roughly akin to..
    NSString *aStringToLeak = [[NSString alloc] initWithFormat: @"integer = %d", someInt];
    aStringToLeak = [aStringToLeak substringToIndex: 6];

    At the end of this aStringToLeak is now an autoreleased string, but the original string just leaked.
     
  9. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #9
    Thanks, but I don't think that's what's happening, because expression is an autoreleased string anyway. If I comment out everything but that line, the code doesn't leak.

    After some testing I think the leak is related to the NSTask object not properly finishing up or something. Just creating the object, launching the task and releasing the object for some reason eats up memory. (Found this out by looping, it does leak quite significantly).

    Unfortunately, I have no idea why this would happen, how it happens and how to prevent it from happening...
     
  10. jared_kipe macrumors 68030

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #10
    I just reread your first post, and it says that your app isn't "leaking" but that it is using a lot of memory. Now you say that NSTask is leaking? What objects are being leaked? The CFStrings and CFArrays?

    I kinda assumed you already fixed this, but at the end [pool release] should probably be [pool drain]
     
  11. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #11
    First, you should post your looping code, which demonstrates the problem. It should be compilable by someone else. I'm not convinced that NSTask is the cause of the problem, mainly because I can't demonstrate the problem, since you haven't posted all your code.


    Second, if you have no idea how NSTask might be the problem, but you believe it is a problem, what should your first step be in finding out if someone else might have seen the same problem? If your answer is "Ask questions on the MacRumors programming forum", then I have to say that's wrong.

    Your first step should always be to search the web for it, especially when the problem is this specific. There are many more programmers around the world than the ones that hang out here. If one of them has already seen and solved the problem, and that information is in one of the many locations outside of this forum, then you wouldn't even have to ask here.

    I entered the two most obvious keywords from your last post, NSTask leak, into Google and came up with a wealth of posts on Apple's cocoa-dev list and elsewhere, many of which include carefully outlined solutions. Are any of them a solution for your problem? I don't know, because you haven't posted your code that fully demonstrates the problem.

    If you haven't tried googling first (or binging, or searching in general), then you've barely tried solving the problem. Searching should always be the first step, not the last. And if you haven't posted code that demonstrates the problem, then no one anywhere can really help you, because without your code, everyone other than you is just guessing at what code you wrote.

    You should read this:
    http://www.mikeash.com/getting_answers.html

    The two most important points in your case are:
    Post Your Code.
    Do Your Research {BeforeHand,During,After}
     
  12. gnasher729 macrumors P6

    Joined:
    Nov 25, 2005
    #12
    That's the point: Everything that gets allocated stays there until you release the autorelease pool. If there is a lot that is allocated then you will use up a lot of memory. You mentioned hundreds of lines (one string per line) separated apparently into 20 objects per line so that will eat up a bit of memory.
     
  13. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #13
    This code fragment is wrong.

    When outPipe is released, you no longer own it and must no longer refer to it. Yet you do refer to it after the release, in the line that calls [outPipe fileHandleForReading].
     
  14. jared_kipe macrumors 68030

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #14
    I thought that was weird, and not in vogue, but I assumed that putting it in the NSTask object retained it.

    So while
    NSData *data = [[[task standardOutput] fileHandleForReading] readDataToEndOfFile];
    would be the correct way of doing it, I don't see why
    NSData *data = [[outPipe fileHandleForReading] readDataToEndOfFile];
    would cause any leaks. At least in a "perfect" world.
     
  15. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #15
    If the NSTask retains it, it's because it owns it. The posted code presumes that the calling code also still owns it, but it doesn't. The pipe's continued existence is a side-effect of NSTask's ownership, not due to the calling code being correct.

    If by "not in vogue" you mean "wrong but might still work", then I'd agree. Yes, the posted code might still work. But just because it works doesn't mean it's correct.

    Note that I didn't say it was leaking. I only said the posted code was wrong. It also made me wonder if any of the unposted code might be doing something that worked (more or less) but was still wrong. It's the kind of error that makes me doubt the correctness of unseen code.
     
  16. mdatwood macrumors 6502a

    Joined:
    Mar 14, 2010
    Location:
    Denver, CO
    #16
    While you may be correct that NSTask retains outPipe, it is still semantically incorrect for you to reference it after releasing it. You're relying on a side effect of NSTask that may change at some point in the future. If you want to use outPipe, you need to retain ownership of it until you are done with it.

    *EDIT* chown33 beat me to it :)
     
  17. jared_kipe macrumors 68030

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #17
    I'm not saying I like it, but the OP isn't saying the code doesn't "work" so I have to make certain assumptions as to how the code is working.

    I agree, I have already pointed out one place where unseen code "could" have made it leak.

    EDIT: I've also seen lots of code where you make an object, put it in an array, and then immediately release the object so you don't forget to later. Then modify/work with that object. I'd bet everyone has done that at least once.
     
  18. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #18
    When I say I loop the code (the code I posted) I really mean I just loop it. It's really just a for-loop that passes a new value as expression to the code I posted (I stuffed the code into a function):

    Code:
    NSArray *values; // this contains a bunch of strings
    for (int i = 0; i < [values count]; i++) {
    	// here I call the function that contains the code I previously posted,
    	// passing [values objectAtIndex:i] as a parameter (that will be
    	// the expressions object from the code I previously posted.
    }

    I checked that there were no problems in the rest of the code that would result in the program eating up memory, so the problem definitely lies in the code I posted. I'd love to post the entire code, but unfortunately it depends on certain files that are too large to post here (~1.8 GB).
    I already spent more than a day on this problem, searching on the net and trying to fix my code. I did not post a new topic here without doing any work myself.

    Concerning the problem with releasing outPipe and then referencing it afterwards: NSTask retains outPipe so it's okay to release it. While I know the talk about "don't be a copy-and-paste kiddie" I took the code from Hillegass' book, so I guess it should be correct, even semantically.

    In a reference counted environment, release acts the same way as drain for NSAutoreleasePool objects (check Apple docs).
    The "leak" is maybe related to the NSData object which in turn is related to the NSTask object.
     
  19. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #19
    Post your code.

    If it's necessary to use smaller data files, and a well-isolated but complete code base that demonstrates the problem, then do that.

    If the problem is really what you think it is, then smaller input files will still cause the problem, it just won't happen as quickly.

    Posting parts of code and descriptions is not the same thing. In particular, it is necessary to see what you do with the NSArray returned by your previously posted code.

    BTW, your previously posted code is not a complete function or method. Are we supposed to guess what the parameters and ivars are?

    You're assuming that authors never make mistakes. Did you check his website for errata in the published books? Which edition of the book?

    Also, context is important. In a GC environment, your posted code isn't wrong, because release does nothing and you still have a reference.
     
  20. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #20
    Here's the complete function:

    Code:
    NSArray* grepFileForExpression(NSString *expression) {
    	
    	NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    	
    	expression = [expression stringByTrimmingCharactersInSet:
    				  [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    	
    	NSArray *arguments = [NSArray arrayWithObjects:expression, [@"~/Desktop/file.txt" stringByExpandingTildeInPath], @"-n", @"--line-number", nil];
    	NSPipe *outPipe = [[NSPipe alloc] init];
    	
    	NSTask *task = [[NSTask alloc] init];
    	[task setLaunchPath:@"/usr/bin/grep"];
    	[task setArguments:arguments];
    	[task setStandardOutput:outPipe];
    	[outPipe release];
    	
    	[task launch];
    	
    	NSData *data = [[outPipe fileHandleForReading] readDataToEndOfFile];
    	[task waitUntilExit];
    	[task release];
    	
    	NSString *stringOrig = [[NSString alloc] initWithBytes:[data bytes] length:[data length] encoding:NSUTF8StringEncoding];
    	NSString *string = [stringOrig stringByReplacingOccurrencesOfString:@"\r" withString:@""];
    	
    	int linesNum = 0;
    	
    	NSMutableArray *possibleMatches = [[NSMutableArray alloc] init];
    	
    	// Check if the output appears to be valid.
    	if ([string length] > 0) {
    		
    		// Separate the different lines.
    		NSArray *lines = [[string componentsSeparatedByString:@"\n"] autorelease];
    		linesNum = [lines count];
    		
    		// Now go through all the different lines.
    		for (int i = 0; i < [lines count]; i++) {
    			
    			NSString *currentLine = [lines objectAtIndex:i];
    			NSArray *values = [currentLine componentsSeparatedByString:@"\t"];
    			
    			// Let's check if this is a valid line.
    			if ([values count] == 20) {
    				// OK, we found a possible match. Add it to the array for later inspection.
    				[possibleMatches addObject:currentLine];
    			}
    		}
    	}
    	
    	[stringOrig release];
    	[pool release];
    	
    	return [possibleMatches autorelease];
    }
    Here's the code of the loop:

    Code:
    NSArray *values; // contains a bunch of NSStrings.
    for (int i = 0; i < [values count]; i++) {
    	NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    	NSArray *possibleMatches = grepFileForExpression([values objectAtIndex:i]);
    	[pool release];
    }
    I'm sorry I cannot post the rest of the code. It doesn't leak, I checked a billion times, the problem's not there.
     
  21. jared_kipe macrumors 68030

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #21
    Again, is the Leaks tool finding Leaks, or are you just talking about the program using a lot of memory?
     
  22. ranguvar thread starter macrumors 6502

    Joined:
    Sep 18, 2009
    #22
    I am just talking about my program using a lot of memory.
     
  23. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #23
    Thank you for posting the code.

    However, I want to point something out. Your posted code does not match your earlier description:
    Code:
    NSArray *values; // this contains a bunch of strings
    for (int i = 0; i < [values count]; i++) {
    	// here I call the function that contains the code I previously posted,
    	// passing [values objectAtIndex:i] as a parameter (that will be
    	// the expressions object from the code I previously posted.
    }
    
    Your description does not mention creation and release of an autorelease pool. This is crucial information, and illustrates exactly why I was so adamant about you posting your code. If you think the difference is insignificant, you should try it both ways. If you agree the difference is significant, then why didn't you just post the 3 lines of code instead of creating 3 lines of incorrect description?


    Unfortunately, you still haven't provided any sample data, nor have you provided anything that gives even a hint as to the contents of the values array. You may consider these to be insignificant, but your previous posts suggest otherwise.

    In particular, you already noted that the problem only occurs when grep returns a large number of results. Are we supposed to make up sample data ourselves? How would we know if it matched your data?

    I'm not asking you to upload a multi-megabyte file. I wouldn't download it even if you did.

    I'm asking you to prepare a reasonably sized file of sample data that clearly and consistently causes the problem.

    I'm also asking you to provide the list of strings that go into the values array, which clearly have some relationship to the problem.

    In short, we still need a complete set of data and code that consistently demonstrates the problem. That's all anyone has ever needed you to provide.

    You've gotten so caught up in telling us where the problem isn't, and how convinced you are of it, that you've failed to recognize that the one thing we need most is a running example of the problem (code and data), so someone else can actually run it, see it, and debug it. Unfortunately, we can't reach into your computer and debug your program for you. You have to give us something that we can compile, run, and debug on our computers.

    If we have to fill in the missing pieces using code we write, or data we generate, there is no way to know if it will demonstrate the problem. If it does demonstrate the problem, there's no way to know if it's for the same reason you see the problem. It could be there's a completely different bug in the code one of us writes. Or if the problem doesn't appear, it could depend on the data we generated.

    I am explaining this so you understand why I keep asking for your actual compilable code and sample data that consistently shows the problem.


    One last thing: double-check that you aren't running your program in GC mode.
     
  24. jared_kipe macrumors 68030

    Joined:
    Dec 8, 2003
    Location:
    Seattle
    #24
    Then don't use the word leak!

    Just make your algorithm more efficient. Obviously, if you're processing a large file, you're going to take up a lot of memory.

    For instance
    Code:
    NSString *currentLine = [lines objectAtIndex:i];
    NSArray *values = [currentLine componentsSeparatedByString:@"\t"];
    			
    // Let's check if this is a valid line.
    if ([values count] == 20) {
    	// OK, we found a possible match. Add it to the array for later inspection.
    	[possibleMatches addObject:currentLine];
    }
    
    is very memory inefficient, since all you need to know is whether the line has twenty tab-separated fields. Counting the tabs is enough:

    Code:
    // 20 tab-separated fields are delimited by 19 tabs.
    #define kCONDITION 19
    int counter, index;
    NSString *currentLine;
    
    currentLine = (NSString *) [lines objectAtIndex: i];
    counter = 0;
    for ( index = 0; index < [currentLine length]; index++ ) {
    	if ( [currentLine characterAtIndex: index] == '\t' ){
    		counter++;
    	}
    	// Stop scanning early once the line already has too many tabs.
    	if ( counter > kCONDITION ){
    		break;
    	}
    }
    
    // Same test as [values count] == 20, without creating the substrings.
    if ( counter == kCONDITION ) {
    	[possibleMatches addObject:currentLine];
    }
    
    While longer, this is much, much less memory intensive, because it never creates the twenty substrings or the array that holds them. I'd declare counter, index, and the string pointer currentLine outside of the enclosing loop, since the same variables can be reused on every iteration.
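
    As a sketch of the same idea applied one step earlier: since grep's output arrives as raw bytes in an NSData, the tabs could be counted directly on the byte buffer, before any NSString or NSArray is created for a line. The plain C below compiles unchanged inside an Objective-C file. The function names count_tabs and has_field_count are hypothetical, and the sketch assumes the output is UTF-8 (where a tab is always the single byte 0x09) and that the caller has already found the start and length of each line.

```c
#include <stddef.h>

/* Count tab bytes in line[0..len), stopping early once the count
 * exceeds max_tabs. Returns the number of tabs seen, which is at
 * most max_tabs + 1 because of the early exit. */
static size_t count_tabs(const char *line, size_t len, size_t max_tabs)
{
    size_t tabs = 0;
    for (size_t i = 0; i < len; i++) {
        if (line[i] == '\t' && ++tabs > max_tabs) {
            break;  /* already too many tabs; no need to scan further */
        }
    }
    return tabs;
}

/* A line with `fields` tab-separated fields contains exactly
 * fields - 1 tabs. Returns 1 on a match, 0 otherwise. */
static int has_field_count(const char *line, size_t len, size_t fields)
{
    if (fields == 0) {
        return 0;
    }
    return count_tabs(line, len, fields - 1) == fields - 1;
}
```

    With 20 fields, has_field_count(line, len, 20) is true only for lines that contain exactly 19 tabs, which matches the [values count] == 20 test in the original code. Only those lines would then need to be turned into NSString objects.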
     
  25. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #25
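    The autorelease sent to the result of componentsSeparatedByString: (in the line that assigns lines) is wrong. You don't own that array, so the extra autorelease over-releases it and causes a double-free when the code is run.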


    My code follows.

    ranguvar1.m
    (slightly modified from original, with autorelease bug fixed, and testing code added)
    Code:
    #import <Foundation/Foundation.h>
    
    #import <unistd.h>
    
    
    //#define INPUT_PATH	@"~/Desktop/file.txt"
    
    #define INPUT_PATH	@"./abc.txt"
    
    
    
    NSArray* exprsArray( NSString* exprsPathname );
    NSArray* grepFileForExpression(NSString *expression);
    
    
    int main( int argc, const char * argv[] ) 
    {
        NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    	
    	NSArray *values = exprsArray( @"./exprs.txt" );  // pathname may contain ~/
    
    	for (int i = 0; i < [values count]; i++) 
    	{
    		// Per-iteration pool; the distinct name avoids shadowing the outer pool.
    		NSAutoreleasePool *loopPool = [[NSAutoreleasePool alloc] init];
    		NSArray *possibleMatches = grepFileForExpression([values objectAtIndex:i]);
    		NSLog( @" possible count: %lu", (unsigned long)[possibleMatches count] );
    //		NSLog( @" array: %@", possibleMatches );
    		[loopPool drain];
    	}
    
    	NSLog( @"Done" );
    //	sleep( 10 );  // time in secs  or pause()
    	pause();  // waits for signal
    
        [pool drain];
        return 0;
    }
    
    
    NSArray* exprsArray( NSString* exprsPathname )
    {
    	NSString* exprPath = [exprsPathname stringByExpandingTildeInPath];
    	NSStringEncoding enc = NSUTF8StringEncoding;
    	NSError* error = nil;
    	NSString* exprStr = [NSString stringWithContentsOfFile:exprPath encoding:enc error:&error];
    
    	if ( exprStr )
    	{
    		NSString* trimmed = [exprStr stringByTrimmingCharactersInSet:
    				  [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    
    		return [trimmed componentsSeparatedByString:@"\n"];
    	}
    
    	NSLog( @"exprsArray() error: %@", error );
    	return nil;
    }
    
    
    
    // Notes:
    // 1. Input filename defined by INPUT_PATH (differs from original).
    //   Therefore, no other file can be grepped.
    //
    // 2. Whitespace is trimmed from the expression string.
    //  Therefore, patterns must not start or end with whitespace.
    
    NSArray* grepFileForExpression(NSString *expression) 
    {
    	
    	NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    	
    	expression = [expression stringByTrimmingCharactersInSet:
    				  [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    	
    	NSArray *arguments = [NSArray arrayWithObjects:expression, 
    //		[@"~/Desktop/file.txt" stringByExpandingTildeInPath], 
    		[INPUT_PATH stringByExpandingTildeInPath], 
    		@"-n", @"--line-number",   // ?? redundant: both options mean the same thing
    		nil];
    	NSPipe *outPipe = [[NSPipe alloc] init];
    	
    	NSTask *task = [[NSTask alloc] init];
    	[task setLaunchPath:@"/usr/bin/grep"];
    	[task setArguments:arguments];
    	[task setStandardOutput:outPipe];
    	[outPipe release];
    	
    	[task launch];
    	
    	NSData *data = [[outPipe fileHandleForReading] readDataToEndOfFile];
    	[task waitUntilExit];
    	[task release];
    	
    	NSString *stringOrig = [[NSString alloc] initWithBytes:[data bytes] length:[data length] encoding:NSUTF8StringEncoding];
    
    //	NSString *string = [stringOrig stringByReplacingOccurrencesOfString:@"\r" withString:@""];
    	NSString *string = [stringOrig stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"\r"]];
    
    
    	int linesNum = 0;  // ?? unused
    	
    	NSMutableArray *possibleMatches = [[NSMutableArray alloc] init];
    	
    	// Check if the output appears to be valid.
    	if ([string length] > 0) {
    		
    		// Separate the different lines.
    //		NSArray *lines = [[string componentsSeparatedByString:@"\n"] autorelease];  // ERROR: not owned
    		NSArray *lines = [string componentsSeparatedByString:@"\n"];  
    		linesNum = [lines count];  // ?? unused
    		
    		// Now go through all the different lines.
    		for (int i = 0; i < [lines count]; i++) {
    			
    			NSString *currentLine = [lines objectAtIndex:i];
    			NSArray *values = [currentLine componentsSeparatedByString:@"\t"];
    			
    			// Let's check if this is a valid line.
    			if ([values count] == 20) {
    				// OK, we found a possible match. Add it to the array for later inspection.
    				[possibleMatches addObject:currentLine];
    			}
    		}
    	}
    	
    	[stringOrig release];
    	[pool release];
    	
    	return [possibleMatches autorelease];
    }
    
    Generator for data and exprs files.
    RanguvarGen.java
    Code:
    import java.io.*;
    import java.util.*;
    
    public class RanguvarGen
    {
    	public static void
    	main( String[] args )
    	throws IOException
    	{
    		// Blindly require single input arg: output pathname
    		// Throws exception if no arg.
    		File outFile = new File( args[ 0 ] );
    
    	
    		// These are the patterns that will be in the "exprs.txt" file.
    		// The generator must produce these in the output at a non-zero rate.
    		// The "exprs.txt" file is written to working directory.
    		String[] exprs = { "abc", "def", "ghi" };
    		writeExprs( "./exprs.txt", exprs );
    
    
    		// This is the number of tab-separated fields on each line.
    		// It's hard-wired at 20 because the Obj-C code is hard-wired at 20.
    		final int fieldCount = 20;
    
    		// Line-count from "lines" property, or default to 40.
    		int lineCount = Integer.getInteger( "lines", 40 ).intValue();
    
    		genData( outFile, fieldCount, lineCount, exprs );
    	}
    
    
    	private static void
    	writeExprs( String outPath, String[] exprs )
    	throws IOException
    	{
    		PrintStream out = new PrintStream( outPath, "UTF8" );
    		try
    		{
    			for ( String s : exprs )
    			{  out.println( s );  }
    		}
    		finally
    		{
    			out.close();
    		}
    	}
    
    
    	private static void
    	genData( File outFile, final int fieldCount, final int lineCount, String[] exprs )
    	throws IOException
    	{
    		PrintStream out = new PrintStream( outFile, "UTF8" );
    		try
    		{
    			final int xCount = exprs.length;
    			for ( int i = 0;  i < lineCount;  ++i )
    			{  out.println( genLine( fieldCount, exprs[ i % xCount ] ) );  }
    		}
    		finally
    		{
    			out.close();
    		}
    	}
    
    	// For generating random field values.
    	private static Random rand = new Random();
    
    	private static String
    	genLine( final int fieldCount, String expr )
    	throws IOException
    	{
    		// To make grep's job non-trivial, put the expr to be found at mid-point of line.
    		final int exprPosn = fieldCount / 2;
    		StringBuilder s = new StringBuilder();
    
    		for ( int i = 0;  i < fieldCount;  ++i )
    		{
    			if ( i == exprPosn )
    				s.append( expr ).append( '\t' );
    			else
    				s.append( rand.nextInt( 1000 ) ).append( '\t' );  // random 0..999
    		}
    
    		return s.toString().trim();  // no leading or trailing whitespace (tabs).
    	}
    
    }
    
    Example Terminal commands:
    Code:
    javac RanguvarGen.java
    
    # 40 lines in output file
    java RanguvarGen abc.txt
    
    # 50k lines in output file
    java -Dlines=50000 RanguvarGen abc.txt
    
    gcc -framework Foundation -std=c99  ranguvar1.m
    ./a.out
    
    When a.out is run, it will pause without exiting, awaiting a signal (the pause() call in source). You can terminate it, or poke around at it. You can also take out the pause(), and then run it under Instruments or whatever.

    You can also try running RanguvarGen to produce much larger "abc.txt" files, say a million lines, without having to upload or download any million-line files.
     
