NSXMLDocument/NSXMLParser large memory footprint

Discussion in 'Mac Programming' started by electronpusher, Nov 17, 2008.

  1. electronpusher macrumors newbie

    Joined:
    Nov 17, 2008
    #1
    Hello, all

    I'm parsing large XML files (e.g. 50-200 MB) and I find that the memory usage is huge, both during the parse and after releasing the NSXMLDocument or the NSXMLParser.

    With alloc/init of a large XML file with NSXMLDocument I find that the memory consumed is about 7-10 times the size of the file.

    With alloc/init of large XML with NSXMLParser the memory consumed is about 4X the size of the file.

    After releasing either the NSXMLDocument or the NSXMLParser (and thus triggering dealloc) I find that about 80% of the memory used by that object is still consumed.

    I am sure that my retain/release are balanced and I'm sure that dealloc is getting triggered on both types of parsers.

    I am using Activity Monitor to view the memory consumption of the process. I've also used Instruments/Object-Alloc to see what AppKit is doing with the memory. The breakdown is the following:
    GeneralBlock-<some number>: takes up about half or more of the memory consumed by the NSXML parser
    CFString: takes up about a quarter
    CFDictionary: takes up about a quarter if you use NSXMLParser; if you use NSXMLDocument as the parser, the remaining quarter is in GeneralBlock

    It seems to me like the memory in GeneralBlock-<some number> is a buffer that the NSXML parser uses to take bytes from the file and then create Foundation/AppKit objects from.

    Whether or not this is true, why am I not recovering all of the memory once I'm done parsing the large XML? Note: I'm not keeping references to or retaining any of the objects created in the object graph rooted at the NSXMLDocument or NSXMLParser instance.

    Is Activity Monitor giving me a false picture of the memory consumption?

    Thanks in advance for help!
    Code on!
    -Michael C Gilson
     
  2. kainjow Moderator emeritus

    Joined:
    Jun 15, 2000
    #2
    I'm not sure if this would help but you could try adding your own autorelease pool and releasing it when the XML is done parsing.
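
    A minimal sketch of that suggestion, not tested against files of that size. ParseLargeXML is a hypothetical helper name, and the delegate is whatever object already handles the parser callbacks. @autoreleasepool is the current spelling; with the 2008 toolchain the equivalent would be [[NSAutoreleasePool alloc] init] before the parse and [pool drain] after it.

    ```objective-c
    #import <Foundation/Foundation.h>

    // Hypothetical helper: runs the whole parse inside its own autorelease
    // pool, so temporaries autoreleased during parsing are released when the
    // pool ends instead of lingering until the enclosing pool drains.
    static BOOL ParseLargeXML(NSString *path, id<NSXMLParserDelegate> delegate) {
        BOOL ok = NO;
        @autoreleasepool {
            NSURL *url = [NSURL fileURLWithPath:path];
            NSXMLParser *parser = [[NSXMLParser alloc] initWithContentsOfURL:url];
            [parser setDelegate:delegate];
            ok = [parser parse];
    #if !__has_feature(objc_arc)
            [parser release];   // manual retain/release, as in the original post
    #endif
        }
        return ok;
    }

    int main(int argc, const char *argv[]) {
        @autoreleasepool {
            if (argc < 2) {
                fprintf(stderr, "usage: %s file.xml\n", argv[0]);
                return 1;
            }
            NSString *path = [NSString stringWithUTF8String:argv[1]];
            // nil delegate: the parser still reads the file, it just reports nothing.
            BOOL ok = ParseLargeXML(path, nil);
            NSLog(@"parse %@", ok ? @"succeeded" : @"failed");
        }
        return 0;
    }
    ```

    This only helps if the memory is held by autoreleased objects. It does not change what the allocator does with the pages once those objects are freed.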
     
  3. garethlewis2 macrumors 6502

    Joined:
    Dec 6, 2006
    #3
    Reboot your machine.

    Write down the figures Activity Monitor records when displaying memory, e.g. Free, Wired, Active, Inactive and Used, run your program, then write down the figures again. I am only guessing, but I would expect that most of the memory is in the wired section. This isn't lost. It can still be used by the OS and by you when you need memory allocated.
     
  4. Catfish_Man macrumors 68030

    Joined:
    Sep 13, 2001
    Location:
    Portland, OR
    #4
    Why on earth would NSXMLParser/Document be wiring memory?

    Anyway, how are you measuring memory usage? I've found that the rprvt numbers in top provide a good first estimate. If that seems odd, the 'heap' and 'vmmap' tools can give more detailed views (along with Instruments, etc., which you've already tried). One thing to consider when looking at heap output is the percentage of the heap that's in use after freeing the NSXMLParser; it's possible that the memory allocator is not returning the RAM to the system (either intentionally for performance, or unintentionally due to heap fragmentation).
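
    The same check can be roughed out from inside the process. This is only a sketch under assumptions: LogMemory is a hypothetical helper, and the malloc/free loop in main is a stand-in workload, not the XML parse. It uses malloc_zone_statistics from <malloc/malloc.h> and task_info from <mach/mach.h> to compare what malloc considers in use with what it has reserved and with the resident size of the process.

    ```objective-c
    #import <Foundation/Foundation.h>
    #include <malloc/malloc.h>
    #include <mach/mach.h>
    #include <stdlib.h>

    // Hypothetical helper: logs malloc's view of the heap next to the
    // resident size the kernel reports for this process.
    static void LogMemory(const char *label) {
        malloc_statistics_t stats;
        malloc_zone_statistics(NULL, &stats);   // NULL sums over all malloc zones

        mach_task_basic_info_data_t info;
        mach_msg_type_number_t count = MACH_TASK_BASIC_INFO_COUNT;
        kern_return_t kr = task_info(mach_task_self(), MACH_TASK_BASIC_INFO,
                                     (task_info_t)&info, &count);
        unsigned long long resident =
            (kr == KERN_SUCCESS) ? (unsigned long long)info.resident_size : 0ULL;

        NSLog(@"%s: in use %zu bytes, reserved by malloc %zu bytes, resident %llu bytes",
              label, stats.size_in_use, stats.size_allocated, resident);
    }

    int main(void) {
        @autoreleasepool {
            enum { kBlocks = 100000 };
            LogMemory("start");

            // Stand-in workload: many small blocks, then free them all.
            void **blocks = calloc(kBlocks, sizeof(void *));
            if (blocks == NULL) return 1;
            for (int i = 0; i < kBlocks; i++) blocks[i] = malloc(256);
            LogMemory("after allocating");

            for (int i = 0; i < kBlocks; i++) free(blocks[i]);
            free(blocks);
            LogMemory("after freeing");
        }
        return 0;
    }
    ```

    Calling LogMemory before the parse, after it, and after releasing the parser would show which case applies: if "in use" drops but the reserved and resident figures stay high, the allocator is holding on to freed memory; if "in use" itself stays high, something is still alive.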
     
  5. garethlewis2 macrumors 6502

    Joined:
    Dec 6, 2006
    #5
    Even if memory is wired after use, it is still free for another program to use. It is cached: since you loaded up such a massive amount of data, you might do it again. How the hell do you think that apps load so much quicker than the very first time they load on OS X after a reboot?
     
  6. Catfish_Man macrumors 68030

    Joined:
    Sep 13, 2001
    Location:
    Portland, OR
    #6
    You seem to be confused about what wiring memory does. All it's for is making sure that memory can't be paged out to disk.
     
  7. garethlewis2 macrumors 6502

    Joined:
    Dec 6, 2006
    #7
    I bow to your superior knowledge.

    Now you can answer the poster's question about why the memory is being held. I believe it to be cached. So it doesn't show up in the free pool, but it can be allocated to another task if required. But if the OP runs the original program again and OS X decides to load the program into the same memory location, then it can use the cached memory.
     
  8. Catfish_Man macrumors 68030

    Joined:
    Sep 13, 2001
    Location:
    Portland, OR
    #8
    Certainly could be caching, either at the filesystem level (iirc OS X's fs cache is called the 'unified buffer cache') or at the memory allocator level. I think ruling out mismeasurement or heap fragmentation first would be good though.
     
