How should I parse HTML5?

Discussion in 'iOS Programming' started by ArtOfWarfare, Aug 21, 2013.

  1. ArtOfWarfare macrumors 604

    Joined:
    Nov 26, 2007
    #1
    So I'm testing my app, which needs to be able to parse websites. I set up a unit test for parsing a copy of the MacRumors homepage and threw an NSXMLParser at it.

    It stopped at line 49, column 8, complaining that a meta tag hadn't been closed by the time the enclosing head tag was closed.

    After a bit of investigating, I discovered that in HTML5, meta tags don't need a closing tag.

    So... I need suggestions. Is there a way I can make NSXMLParser handle this? Is there another parser for iOS that will handle this, or should I just try rolling my own?
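For readers who want to reproduce the failure described above, here is a minimal sketch in Python rather than Objective-C, chosen only because its standard library ships both a strict XML parser and a lenient HTML tokenizer. The `PAGE` string and the helper names `strict_parse_error`, `TagCollector`, and `lenient_tags` are made up for illustration. The strict parser plays the role of NSXMLParser here, and `html.parser` is not a full HTML5 parser, just a tokenizer that tolerates unclosed tags.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical stand-in for a real page: note the unclosed <meta> tag.
PAGE = """<html>
<head>
<meta charset="utf-8">
<title>Example</title>
</head>
<body><p>Hello</p></body>
</html>"""


def strict_parse_error(markup):
    """Parse as XML; return the (line, column) of the first error, or None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return err.position
    return None


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(markup):
    """Tokenize as HTML and return the start tags in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    # The XML parser gives up when </head> arrives while <meta> is still open.
    print("XML parse error at:", strict_parse_error(PAGE))
    # The HTML tokenizer keeps going and reports every tag.
    print("HTML start tags:", lenient_tags(PAGE))
```

The point of the comparison is that the error is a property of XML well-formedness rules, not of the page itself, so the fix is to switch to an HTML-aware parser rather than to reconfigure the XML one.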
     
  2. ElectricSheep macrumors 6502

    Joined:
    Feb 18, 2004
    Location:
    Wilmington, DE
    #2
    You might try looking at an Objective-C wrapper for Gumbo, such as ObjectiveGumbo.
     
  3. ArtOfWarfare thread starter macrumors 604

    Joined:
    Nov 26, 2007
    #3
    Thanks for sharing - I'll look into that.
     
