Fast C++ File IO

Discussion in 'Mac Programming' started by TastyCocoa, Jul 13, 2011.

  1. TastyCocoa macrumors newbie

    Joined:
    Mar 21, 2011
    #1
    I was wondering if anyone had any advice on how I can speed up reading in and processing several large ASCII data files.

    What I'm trying to do is read in 40 or so 200 MB+ files, take the data they contain, and dump it into a large array of objects.

    What I'm currently doing is pulling my files in line by line using getline() and then processing the resulting strings using a stringstream. The resulting variables are then added to my objects.
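
    For reference, the approach described above looks roughly like the sketch below. The Record layout (an int and two doubles per line) and the name load_records are made up for illustration; the real files and objects will differ.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record layout: one "id x y" triple per line.
struct Record {
    int id;
    double x;
    double y;
};

// Baseline: read line by line with getline(), parse each line with a
// stringstream, and append the result to the array of objects.
std::vector<Record> load_records(const std::string& path) {
    std::vector<Record> out;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);  // a new stream is built for every line
        Record r;
        if (fields >> r.id >> r.x >> r.y) {
            out.push_back(r);  // lines that do not parse are skipped
        }
    }
    return out;
}
```

    Constructing a fresh istringstream for every line is a likely hot spot in a loop like this, so it is worth profiling before changing anything else.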

    The whole process isn't exactly slow, but it can take 30 seconds or so per file, and I generate a fair number of them.

    Thanks for any help you can give me!
     
  2. lee1210 macrumors 68040

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #2
    At first blush, my suggestion would be memory-mapped I/O. Look into mmap. At a certain point, though, you're still reading tens or hundreds of MB off slow storage.
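
    A minimal sketch of the mmap idea, assuming a POSIX system such as Mac OS X. The function name count_lines_mmap is made up, and it only counts lines; the real per-line parsing would go where the counter is incremented.

```cpp
#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close

#include <cassert>
#include <cstddef>
#include <cstdio>   // perror
#include <cstring>  // memchr

// Hypothetical helper: maps the file read-only and walks it line by line.
// Returns the number of lines, or -1 if the file cannot be opened or mapped.
long count_lines_mmap(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        std::perror("open");
        return -1;
    }
    struct stat st;
    if (fstat(fd, &st) != 0) {
        std::perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {  // mmap rejects a zero-length mapping
        close(fd);
        return 0;
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);
    void* map = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (map == MAP_FAILED) {
        std::perror("mmap");
        return -1;
    }
    const char* p = static_cast<const char*>(map);
    const char* end = p + size;
    long lines = 0;
    while (p < end) {
        const void* nl = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        ++lines;  // [p, nl) is one line; parse it here instead of just counting
        if (nl == nullptr) break;  // last line has no trailing newline
        p = static_cast<const char*>(nl) + 1;
    }
    munmap(map, size);
    return lines;
}
```

    The mapped bytes are not NUL-terminated, so any parsing has to respect the end pointer rather than rely on C string functions.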

    -Lee
     
  3. mfram macrumors 65816

    Joined:
    Jan 23, 2010
    Location:
    San Diego, CA USA
    #3
    Um, how much RAM do you have? 40 × 200 MB is already 8 GB, and that's not counting any overhead, of which there may be a lot.
     
  4. gnasher729 macrumors P6

    Joined:
    Nov 25, 2005
    #4
    Read a file using [NSData dataWithContentsOfFile:], then dispatch a block to process it, read the next file, etc. That way you keep all processors busy. On the other hand, 8 GB of data may be a bit much to keep in RAM. Look for "Block programming" in your Xcode documentation.
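
    For a pure C++ code base, the same keep-every-core-busy idea can be sketched with std::thread. Note this swaps in standard C++ threads for NSData and dispatched blocks, so it is not the API named above. The names count_lines and process_files are made up, and counting lines stands in for the real per-file processing.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <fstream>
#include <string>
#include <system_error>
#include <thread>
#include <vector>

// Hypothetical per-file work: counts lines, returns -1 if the file won't open.
long count_lines(const std::string& path) {
    std::ifstream in(path);
    if (!in) return -1;
    long n = 0;
    std::string line;
    while (std::getline(in, line)) ++n;
    return n;
}

// Processes every file, sharing the list among a small pool of threads.
// results[i] always corresponds to paths[i].
std::vector<long> process_files(const std::vector<std::string>& paths) {
    std::vector<long> results(paths.size(), -1);
    std::atomic<std::size_t> next{0};

    // Each worker repeatedly claims the next unprocessed index.
    auto worker = [&] {
        for (std::size_t i = next.fetch_add(1); i < paths.size();
             i = next.fetch_add(1)) {
            results[i] = count_lines(paths[i]);
        }
    };

    const unsigned hw = std::thread::hardware_concurrency();  // may be 0
    const std::size_t extra =
        std::min<std::size_t>(hw > 1 ? hw - 1 : 0, paths.size());

    std::vector<std::thread> pool;
    try {
        for (std::size_t t = 0; t < extra; ++t) pool.emplace_back(worker);
    } catch (const std::system_error&) {
        // Could not start more threads; carry on with the ones we have.
    }
    worker();  // the calling thread takes a share of the files too
    for (auto& t : pool) t.join();
    return results;
}
```

    Capping the pool at the core count also limits how many files are in flight at once, which matters given the RAM concern above.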
     
  5. brianbauer04 macrumors member

    Joined:
    Dec 4, 2010
    #5
    Have you tried making your stream buffer larger? You can provide your own buffer (of any size) for your stream reader; give that a shot to see if it improves I/O. Do a Google search for streambuf and the pubsetbuf method.
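
    A minimal sketch of the larger-buffer idea. The function name count_lines_big_buffer and the 1 MB default are made up for illustration. What pubsetbuf actually does on a file stream is implementation-defined; calling it before open(), as here, is the ordering most standard libraries honor, but measure to confirm it helps.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the file through a caller-sized buffer and
// counts lines. Returns -1 if the file cannot be opened.
long count_lines_big_buffer(const std::string& path,
                            std::size_t buf_size = 1 << 20) {
    assert(buf_size > 0);
    // Declared before the stream so the buffer outlives it.
    std::vector<char> buf(buf_size);

    std::ifstream in;
    // Install the buffer before opening the file; a call made after I/O has
    // started may be ignored.
    in.rdbuf()->pubsetbuf(buf.data(), static_cast<std::streamsize>(buf.size()));
    in.open(path);
    if (!in) return -1;

    long n = 0;
    std::string line;
    while (std::getline(in, line)) ++n;  // real parsing would go here
    return n;
}
```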

    Also, if your data files are structured and each line has the same data formatting, fscanf may be a huge improvement over doing the same string processing over and over and over.
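
    A minimal sketch of the fscanf route, assuming every record is an int followed by two doubles. That layout, the Sample struct, and the name read_samples are all made up; the format string has to be adjusted to match the real files.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical record layout: "id x y" separated by whitespace.
struct Sample {
    int id;
    double x;
    double y;
};

// Appends every record in the file to `out`. Returns true only if the whole
// file was consumed; false if it could not be opened or a record was malformed.
bool read_samples(const char* path, std::vector<Sample>& out) {
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "r");
    if (f == nullptr) return false;

    Sample s;
    // fscanf returns the number of fields converted; 3 means a full record.
    while (std::fscanf(f, "%d %lf %lf", &s.id, &s.x, &s.y) == 3) {
        out.push_back(s);
    }
    const bool reached_end = std::feof(f) != 0;  // false if parsing stopped early
    std::fclose(f);
    return reached_end;
}
```

    Whitespace in the format string matches newlines as well as spaces, so this reads records regardless of how they are split across lines.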
     
