How should I parse HTML5?

Discussion in 'iOS Programming' started by ArtOfWarfare, Aug 21, 2013.

  1. ArtOfWarfare macrumors G3


    Nov 26, 2007
    So I'm testing my app, which needs to be able to parse websites. I set up a unit test that parses a copy of the MacRumors homepage and threw an NSXMLParser at it.

    It stopped at line 49, column 8, complaining that the head tag was being closed while a meta tag inside it was still open.

    After a bit of investigating, I discovered that in HTML5, meta tags don't need a closing tag, so a strict XML parser treats them as left open.

    So... I need suggestions. Is there a way to make NSXMLParser handle this? Is there another parser for iOS that will handle it, or should I just try rolling my own?
  2. ElectricSheep macrumors 6502


    Feb 18, 2004
    Wilmington, DE
    You might try an Objective-C wrapper for Gumbo (an HTML5 parser written in C), such as ObjectiveGumbo.
  3. ArtOfWarfare thread starter macrumors G3


    Nov 26, 2007
    Thanks for sharing - I'll look into that.
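
The failure described in the first post can be reproduced outside iOS. Below is a rough, language-neutral sketch in Python (standard library only), not the poster's actual test: a strict XML parser rejects an unclosed meta tag, while a lenient HTML tokenizer reads the same markup without complaint. The sample page and the names `PAGE`, `xml_parse_error`, `TagCollector`, and `html_tags` are made up for illustration. Python's `html.parser` is only a tokenizer, not a full HTML5 tree builder like Gumbo.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample page: the <meta> tag has no closing tag, which HTML5 allows.
PAGE = """<html>
<head>
<meta charset="utf-8">
<title>Example</title>
</head>
<body><p>Hello</p></body>
</html>"""


def xml_parse_error(text):
    """Return the strict XML parser's error message, or None if the text parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None


class TagCollector(HTMLParser):
    """Lenient HTML tokenizer that records every start tag it encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(text):
    """Tokenize text as HTML and return the start tags in document order."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("XML parser:", xml_parse_error(PAGE))   # an error message, not None
    print("HTML tokenizer:", html_tags(PAGE))     # includes 'meta'
```

The XML parser only succeeds if the meta tag is written in self-closing form (`<meta charset="utf-8"/>`), which real-world HTML5 pages are not obliged to do. That is why a dedicated HTML5 parser is the usual answer rather than coaxing an XML parser into accepting the page.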
