
TastyCocoa

macrumors newbie
Original poster
Mar 21, 2011
I was wondering if anyone had any advice for me in how I can speed up the reading in and processing of several large ASCII data files.

What I'm trying to do is read in 40 or so 200 MB+ files, take the data they contain, and dump it into a large array of objects.

What I'm currently doing is pulling the files in line by line with getline() and then processing each resulting string with a stringstream. The resulting variables are then added to my objects.
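
Roughly this pattern, in other words (the file name and the fields in the struct are just stand-ins for my real data):

Code:
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Record { double a, b, c; };  // stand-in for my real object

int main()
{
    std::ifstream in("data.txt");
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);  // a fresh stringstream per line
        Record r;
        ss >> r.a >> r.b >> r.c;
        records.push_back(r);
    }
}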

The whole process isn't exactly slow, but it can take 30 seconds or so per file, and I generate a fair number of them.

Thanks for any help you can give me!
 
At first blush my suggestion would be memory-mapped I/O; look into mmap(). At a certain point, though, you're still reading tens or hundreds of megabytes off slow storage.
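
A bare-bones sketch of the idea (file name made up, most error handling trimmed):

Code:
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    int fd = open("data.txt", O_RDONLY);
    if (fd < 0) return 1;

    struct stat st;
    if (fstat(fd, &st) < 0) return 1;
    size_t len = static_cast<size_t>(st.st_size);

    void *p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;
    madvise(p, len, MADV_SEQUENTIAL);  // hint that we'll scan front to back

    const char *data = static_cast<const char *>(p);
    // walk data[0..len) directly -- e.g. find each '\n' and parse the
    // line in place, instead of copying every line into a std::string

    munmap(p, len);
    close(fd);
}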

-Lee
 
Read a file using [NSData dataWithContentsOfFile:], then dispatch a block to process it, read the next file, and so on. That way you keep all the processors busy. On the other hand, 8 GB of data may be a bit much to keep in RAM. Look for "block programming" in your Xcode documentation.
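
Something like this, say, using libdispatch's plain C API so it drops straight into C++ code (file names and the parsing step are placeholders; the std::string slurp stands in for dataWithContentsOfFile:):

Code:
#include <dispatch/dispatch.h>
#include <fstream>
#include <iterator>
#include <string>

// Runs on a background thread; parse the file's contents here.
// NB: if the parsed objects go into one shared array, that append
// needs its own lock or a serial queue.
static void process_file(void *ctx)
{
    std::string *contents = static_cast<std::string *>(ctx);
    // ... parse *contents into your objects ...
    delete contents;
}

int main()
{
    const char *paths[] = { "data01.txt", "data02.txt" /* ... */ };
    dispatch_queue_t q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();

    for (const char *path : paths) {
        // slurp the whole file in one read
        std::ifstream in(path, std::ios::binary);
        std::string *contents = new std::string(
            std::istreambuf_iterator<char>(in),
            std::istreambuf_iterator<char>());
        // hand parsing to the concurrent queue, then move on to the next file
        dispatch_group_async_f(group, q, contents, process_file);
    }

    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);  // block until all parsing is done
    dispatch_release(group);
}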
 
Have you tried making your stream buffer larger? You can provide your own buffer (of any size) for your stream to use; give that a shot and see if it improves I/O. Do a Google search for streambuf and the pubsetbuf method.
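
For example (the 1 MB size is just a starting point to experiment with; note that the buffer has to be installed before the file is opened, or most implementations will silently ignore the call):

Code:
#include <fstream>
#include <string>
#include <vector>

int main()
{
    std::vector<char> buf(1 << 20);  // 1 MB; tune this
    std::ifstream in;
    // install the buffer *before* open()
    in.rdbuf()->pubsetbuf(buf.data(), static_cast<std::streamsize>(buf.size()));
    in.open("data.txt");

    std::string line;
    while (std::getline(in, line)) {
        // ... same parsing as before ...
    }
}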

Also, if your data files are structured and every line has the same format, fscanf() may be a huge improvement over running the same string processing over and over.
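
e.g. if each line were three numbers, something like this (the format string is obviously whatever matches your actual data):

Code:
#include <cstdio>

int main()
{
    std::FILE *fp = std::fopen("data.txt", "r");
    if (!fp) return 1;

    double x, y, z;
    // assumes every line is three whitespace-separated numbers
    while (std::fscanf(fp, "%lf %lf %lf", &x, &y, &z) == 3) {
        // ... build the object straight from x, y, z ...
    }
    std::fclose(fp);
}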
 