creative text file parsing methodology? any ideas?

Discussion in 'Mac Programming' started by dj.mooky, Jun 28, 2009.

  1. macrumors newbie

    Feb 26, 2008
    I made major headway today on the poker application I'm working on, but noticed a glitch in the money tracking.

    Right now I break the text file down line by line, then look at each line's prefix to decide where to send the next few lines (without giving too much away: most of the time I can establish EXACTLY what comes before the name, so the start of the search is covered). I then split the line on spaces into an array to fetch other info, like numbers or actions, with "objectAtIndex:x" style lookups. But on to the problem.

    In a nutshell, players are allowed to have names with spaces in them, and the text files generated by PokerStars give no set way to discern EXACTLY where a name begins or ends.

    An example from the .txt file:

    Seat 2: pebos train ($6.59 in chips) 
    pebos train: folds  //(however this could be checks, bets, raises, calls....)
    Seat 2: pebos train folded before Flop
    These are 3 example lines for one person named "pebos train," but they are not the only lines throwing me for a loop. The name could have even more spaces, or it could contain something I would expect to come after it. For example, he could have the name "pebos folds a lot"; the line would then look like "pebos folds a lot: folds", and if I just searched for the first occurrence of "folds" I would get the wrong name, and the rest of the data would be wrong too.

    So I've set about writing a method to fetch the name without knowing EXACTLY what will come after it. I can come up with a list of characters, such as ":" or "(", that will follow a name in given scenarios, but I was hoping to do it all in one function.

    So far I can't hammer down a perfect method for either A: getting the full range (again, mainly the end point) of a given player's name, or B: figuring out where a name ends in general, given the naming conventions PokerStars allows.

    Any brilliant ideas? Has anyone else done something like this?

    The function I'm using right now looks something like this:
    + (NSString *)getNameFromString:(NSArray *)anArray from:(NSString *)preString to:(NSString *)postString {
        NSUInteger count = [anArray count];
        NSUInteger start = 0;
        // Find the start of the name: it begins on the word after the one matching the pre-string.
        if (preString != nil) {
            while (start < count && ![[anArray objectAtIndex:start] hasPrefix:preString]) {
                start++;
            }
            start++;
        }
        if (start >= count) {
            return nil; // pre-string not found, or nothing follows it
        }
        // Find the end of the name: the first later word that begins with the post-string.
        // If no word matches, the name runs to the end of the line.
        NSUInteger finish = start + 1;
        while (finish < count && ![[anArray objectAtIndex:finish] hasPrefix:postString]) {
            finish++;
        }
        // Rejoin the words of the name, putting a space back between them.
        NSRange nameRange = NSMakeRange(start, finish - start);
        return [[anArray subarrayWithRange:nameRange] componentsJoinedByString:@" "];
    }
    I've thought about accepting a string instead of an array as the argument and searching for specific strings that would be an expected ending, but I always stumble back to the same point: the name could contain the ending string I'm searching for. And for some reason, searching FROM the end of the string to the beginning just does not seem like a good idea, especially when I could be looking for multiple options. I know there is a better way, I just cannot think of it.
  2. Moderator emeritus


    Aug 16, 2005
    Generally, when I work with flat files (plain text, often one record per line) I use a separator string that won't occur in the text.

    field 1#$#field-2#$#some other field#$#and so on
    You then just split the line based on that separator (#$#).
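
    A minimal sketch of that split in Objective-C, for files whose format you control. FieldsFromRecord is a hypothetical helper name; "#$#" is the separator from the example above.

```objc
#import <Foundation/Foundation.h>

// Splits one record into its fields. This only works if the separator
// never occurs inside a field, which is the whole point of picking an
// unusual one such as "#$#".
static NSArray *FieldsFromRecord(NSString *record) {
    return [record componentsSeparatedByString:@"#$#"];
}
```

    For the sample record above this gives four fields, the second being "field-2".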
  3. macrumors 604

    Jul 29, 2003
    Silicon Valley
    You could always send the data to a language environment that has better support for string processing and regexes. For instance, pipe to Perl on a Mac, or send to JavaScript in a UIWebView on an iPhone, to do any fancy string processing needed.
  4. thread starter macrumors newbie

    Feb 26, 2008
    I'm not the one writing the text files, actually. They are produced by a program for playing poker online (if you are interested). So my sole intent is to draw information OUT of the file, as opposed to writing it :)

    Writing it would be a piece of cake comparatively.

    And to firewood:
    I currently know no Perl whatsoever. Could you perhaps expound just enough that I know what to search for in terms of examples, so that I can try this idea of yours? I am always looking to expand my horizons to be able to meet any challenge that comes up.
  5. macrumors 6502a


    Jan 4, 2002
    Austin, TX
    It looks like the cash amounts do not vary in format at all: they always start with the "($" chars. So you could do a first pass that looks for the lines with "($" in them and gathers up all the unique names.

    Once you have the names, you can break down the other strings using the name as the delimiter.

    Yeah, it's not all one function, but darn, sometimes you deal with the problem at hand any way that works.
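
    A rough sketch of those two passes, under some assumptions: seat lines look like the "Seat 2: pebos train ($6.59 in chips)" sample from the first post, and action lines start with the name followed by ": ". CollectPlayerNames and PlayerNameForActionLine are hypothetical helper names, not anything from the original app.

```objc
#import <Foundation/Foundation.h>

// Pass 1: gather the unique names from "Seat N: name ($x in chips)" lines.
static NSSet *CollectPlayerNames(NSArray *lines) {
    NSMutableSet *names = [NSMutableSet set];
    for (NSString *line in lines) {
        if (![line hasPrefix:@"Seat "]) continue;
        NSRange colon = [line rangeOfString:@": "];
        NSRange chips = [line rangeOfString:@" in chips)" options:NSBackwardsSearch];
        if (colon.location == NSNotFound || chips.location == NSNotFound) continue;
        // Take the last " ($" before " in chips)" so that a name which
        // itself contains " ($" is still captured whole.
        NSRange cash = [line rangeOfString:@" ($"
                                   options:NSBackwardsSearch
                                     range:NSMakeRange(0, chips.location)];
        NSUInteger nameStart = NSMaxRange(colon);
        if (cash.location == NSNotFound || cash.location <= nameStart) continue;
        NSRange nameRange = NSMakeRange(nameStart, cash.location - nameStart);
        [names addObject:[line substringWithRange:nameRange]];
    }
    return names;
}

// Pass 2: find which known name an action line starts with. The longest
// match wins, in case one name plus ": " is a prefix of another name.
static NSString *PlayerNameForActionLine(NSString *line, NSSet *names) {
    NSString *best = nil;
    for (NSString *name in names) {
        NSString *prefix = [name stringByAppendingString:@": "];
        if ([line hasPrefix:prefix] && [name length] > [best length]) {
            best = name;
        }
    }
    return best;
}
```

    Once the name is known, the rest of an action line starts at index [name length] + 2, and can be split on spaces as before.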
  6. macrumors regular

    Nov 24, 2006
    The Netherlands
    How I would do it

    As I'm normally a Perl programmer I would probably use a regular expression (I'm pretty sure ObjC/Cocoa supports this, right?)

    Now, I don't know much about your input, but assuming you have one file per match (and I read your input correctly) and want to know all the users and the seat each user is on, I would use this regexp to walk over the file once and collect all the users and their seats:

    I'm assuming this is an 'introduction' line which describes the players:

    Seat 2: pebos train ($6.59 in chips)

    So for every line:

    Try to match:

    The literal 'Seat'
    followed by whitespace
    followed by an integer (and save the integer)
    followed by colon and whitespace
    Any text (the username, certainly save this)
    followed by ' ($'
    followed by an amount (might as well save this)
    followed by 'in chips)'

    In Perl (can't test this second, but should work):
    if (/Seat\s+(\d+):\s+(.*?)\s+\(\$([0-9.]+)\s+in chips\)$/) {
     my $seat        = $1;
     my $username    = $2;
     my $person_cash = $3;
    }
    There might be some stuff here which deals with regexps for NSStrings:

    If you have a lot of sample data and can let me know what you need exactly I might be able to figure this out in Obj-C for you tonight.
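
    For reference, a rough Objective-C rendering of the Perl pattern above. It assumes NSRegularExpression, which only appeared in Foundation after this thread (OS X 10.7 / iOS 4), so treat it as a sketch for a current SDK. ParseSeatLine and the dictionary keys are hypothetical names, and the pattern adds a leading "^" and a trailing "\s*" so that lines ending in a space or newline still match.

```objc
#import <Foundation/Foundation.h>

// Returns the seat, name and cash of a "Seat N: name ($x in chips)" line
// as strings, or nil if the line does not have that shape.
static NSDictionary *ParseSeatLine(NSString *line) {
    // Same pattern as the Perl version; each backslash is doubled because
    // it sits inside an Objective-C string literal.
    NSString *pattern =
        @"^Seat\\s+(\\d+):\\s+(.*?)\\s+\\(\\$([0-9.]+)\\s+in chips\\)\\s*$";
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:line
                          options:0
                            range:NSMakeRange(0, [line length])];
    if (match == nil) return nil;
    return [NSDictionary dictionaryWithObjectsAndKeys:
            [line substringWithRange:[match rangeAtIndex:1]], @"seat",
            [line substringWithRange:[match rangeAtIndex:2]], @"name",
            [line substringWithRange:[match rangeAtIndex:3]], @"cash",
            nil];
}
```

    Because the pattern is anchored at the end of the line, the lazy (.*?) has to stretch until the final "($... in chips)", so a name that itself contains " ($" is still captured whole.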
  7. macrumors regular

    Nov 24, 2006
    The Netherlands
  8. macrumors member

    Jun 4, 2009
    Use a regular expression. There are other ways but nothing so powerful or elegant. When the format of the file changes you will be glad you went with this method.

    I haven't checked, but I think PCRE comes with OS X. I'm sure there will be an Obj-C wrapper, though I don't know if that comes by default.


  9. macrumors regular

    Nov 24, 2006
    The Netherlands
    a try

    Okay, I'm still pretty new with Obj-C... But how would something like this work for you?

    #import <Foundation/Foundation.h>
    int main (int argc, const char * argv[]) {
        NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
        NSArray *testFile = [NSArray arrayWithObjects: @"Seat 2: pebos train ($6.59 in chips)\n",
                                                       @"Seat 5: ChOas ($15.0 in chips)\n",
                                                       @"something bad about iphonejudy\n",
                                                       @"Seat 7: Bastard ($5.5 in chips) player ($15.0 in chips)\n",
                                                       @"Seat 4: Easier in Perl ($15.0 in chips)\n",nil];
        NSInteger userSeat;
        NSString *userName = nil;
        float     userCash;
        for (NSString *curLine in testFile) {
            // A fresh scanner per line; by default it skips whitespace before each token.
            NSScanner *myScanner = [NSScanner scannerWithString:curLine];
            if ([myScanner scanString:@"Seat" intoString:NULL]         &&
                [myScanner scanInteger:&userSeat]                      &&
                [myScanner scanString:@":" intoString:NULL]            &&
                [myScanner scanUpToString:@" ($" intoString:&userName] &&
                [myScanner scanString:@"($" intoString:NULL]           &&
                [myScanner scanFloat:&userCash]                        &&
                [myScanner scanString:@"in chips)" intoString:NULL]) {
                // NSInteger is cast to long so the format matches on 32- and 64-bit builds.
                NSLog(@"Found: \nSeat: \"%ld\"\nUser:\"%@\"\nCash:\"%0.2f\"\n", (long)userSeat, userName, userCash);
            }
        }
        [pool drain];
        return 0;
    }
    maybe sscanf would have been possible too, but I'm not sure...

    Oh, seat 7 messes stuff up... dunno how to anchor the searching to the back so I hope they sanitise the player names :)
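
    One possible way to anchor the search to the back, as a sketch: let NSScanner consume the "Seat N:" part, then use rangeOfString:options: with NSBackwardsSearch to find the last " ($" in the line. NameFromSeatLine is a hypothetical helper name, and it assumes the line has already been recognised as an "in chips" seat line.

```objc
#import <Foundation/Foundation.h>

// Returns the player name from a "Seat N: name ($x in chips)" line, treating
// the LAST " ($" in the line as the end of the name. Returns nil if the line
// does not start with "Seat N:" or has no " ($" after it.
static NSString *NameFromSeatLine(NSString *line) {
    NSScanner *scanner = [NSScanner scannerWithString:line];
    NSInteger seat;
    if (![scanner scanString:@"Seat" intoString:NULL] ||
        ![scanner scanInteger:&seat] ||
        ![scanner scanString:@":" intoString:NULL]) {
        return nil;
    }
    NSUInteger start = [scanner scanLocation];  // just past the colon
    NSRange cash = [line rangeOfString:@" ($" options:NSBackwardsSearch];
    if (cash.location == NSNotFound || cash.location <= start) {
        return nil;
    }
    NSString *raw = [line substringWithRange:
                     NSMakeRange(start, cash.location - start)];
    return [raw stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceCharacterSet]];
}
```

    With this, the seat 7 test line yields the whole "Bastard ($5.5 in chips) player" as the name instead of stopping at the first " ($".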
  10. thread starter macrumors newbie

    Feb 26, 2008
    I looked over the link from ChOas, and think I am going to go with the regex route. Mainly because what cqexbesd said rang so true. I think overall it will allow for easier searches when things change, or I decide to incorporate compatibility for other poker sites...

    Now if I can only fix that strange math error >.> When you have only 6 people total, 3 people win $30 and no one lost more than $4 something is up for sure :)

    Thanks for all the help, I think I have plenty to get me going down the right path now
  11. macrumors 65816


    Jul 17, 2008
    1) You know how to find dollar amounts
    2) You know all the words in pokerstars' "dictionary"

    Names would be what's left, right?
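
    A small sketch of that idea for the action lines, assuming they always look like "name: action ..." and that the action is one of the words listed in the first post (folds, checks, bets, raises, calls). The real dictionary is certainly longer, and NameFromActionLine is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Returns the name from a "name: action ..." line by looking at the LAST
// ": " in the line and checking that a known action word follows it.
// Whatever comes before that separator is taken to be the name.
static NSString *NameFromActionLine(NSString *line) {
    // Partial dictionary, taken from the examples in this thread.
    NSArray *actions = [NSArray arrayWithObjects:
                        @"folds", @"checks", @"bets", @"raises", @"calls", nil];
    NSRange sep = [line rangeOfString:@": " options:NSBackwardsSearch];
    if (sep.location == NSNotFound || sep.location == 0) return nil;
    NSString *rest = [line substringFromIndex:NSMaxRange(sep)];
    for (NSString *action in actions) {
        if ([rest hasPrefix:action]) {
            return [line substringToIndex:sep.location];
        }
    }
    return nil;
}
```

    So "pebos folds a lot: folds" gives "pebos folds a lot", because only the text after the last ": " is checked against the dictionary.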

  12. thread starter macrumors newbie

    Feb 26, 2008
    I managed to get it working (100%, as far as I can see from my hand history files) with a combination of regexes and NSScanner to do the info sucking in the strings.

    Much thanks for all the help from everyone.

    Currently seeking 2-3 beta testers if anyone is interested >.>

    App is still buggy as all get out but I really want to hammer down the cash tracking before I progress much further down the line
  13. thread starter macrumors newbie

    Feb 26, 2008
    oh, and the solution..... was something like this

    NSPredicate *regextest;
    // The parentheses are escaped so they match literally instead of forming a group.
    NSString *seatedPlayer = @"Seat .*: .* \\(.* in chips\\).*";
    NSCharacterSet *colonSet;
    NSCharacterSet *dollarSet;
    NSCharacterSet *openPenSet;
    NSCharacterSet *closePenSet;
    colonSet = [NSCharacterSet characterSetWithCharactersInString:@":"];
    dollarSet = [NSCharacterSet characterSetWithCharactersInString:@"$"];
    openPenSet = [NSCharacterSet characterSetWithCharactersInString:@"("];
    closePenSet = [NSCharacterSet characterSetWithCharactersInString:@")"];
    NSString *thisGuysName;
    NSScanner *theScanner;
    theScanner = [NSScanner scannerWithString:[lineByLine objectAtIndex:line]];
    float amount;
    NSLog(@"%@", [lineByLine objectAtIndex:line]);
    regextest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", seatedPlayer];
    //@"Seat .*: .* .*.* in chips)";
    if ([regextest evaluateWithObject:[lineByLine objectAtIndex:line]] == YES) {
        // Skip past "Seat N:", then take everything up to "($" as the name.
        if ( [theScanner scanUpToCharactersFromSet:colonSet intoString:NULL] &&
            [theScanner scanCharactersFromSet:colonSet intoString:NULL] &&
            [theScanner scanUpToString:@"($" intoString: &thisGuysName] ) {
            NSLog(@"Seated player: %@", thisGuysName);
        } else {
            NSLog(@"Match posted! but we have a problem");
        }
        NSLog(@"Match posted blind!");
    }
    I used a similar fashion to glean the rest of the info I wanted out of other lines
