Help with regex (RegexKitLite)

Discussion in 'Mac Programming' started by teek, Mar 8, 2010.

  1. teek macrumors member

    Joined:
    Feb 12, 2008
    Location:
    Norway
    #1
    I'm trying to parse a few <input ../> tags, but this regex only works for input tags that are on separate lines:
    NSString *html = @"html with multiple lines containing a few lines with input elements only and then a few lines with multiple input elements";

    NSString *regex = @"(?i:<input.*name=\"(.*))\".*value=\"(.*)\".*/>";
    NSArray *names = [html componentsMatchedByRegex:regex capture:1];
    NSArray *values = [html componentsMatchedByRegex:regex capture:2];

    This works fine for lines with a single tag, but it does NOT work on lines that contain multiple <input .../> elements.

    What's wrong with my regex? Also, this regex won't work if the name and value attributes are not in the expected order. I want it to work regardless of attribute order and regardless of any other attributes that are specified, e.g.:


    <input id="foo" value="bar" name="something"/>
    <input value="foo" id="y" name="bar" other_attribute="x"/>

    Can someone please help with this regex? I've looked at the docs but I can't figure it out.
     
  2. JoshDC macrumors regular

    Joined:
    Apr 8, 2009
    #2
    Think about what the last .* is doing. It's greedy and matches any character, so it also swallows the whole next <input.../> tag (the same goes for every other .* in the pattern). Either use something more specific than .* (\s for whitespace looks like what you want) or add a ? after the .* to force that .* to match as little as possible.
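
    To make the greedy vs. lazy difference concrete, here is a minimal sketch. It assumes Foundation's NSRegularExpression as a stand-in for RegexKitLite (both take ICU pattern syntax), and the names Captures, kGreedy and kLazy are made up for this example. kGreedy is the original pattern with the (?i:...) group replaced by the case-insensitive option.

```objective-c
#import <Foundation/Foundation.h>

// Original-style pattern: every .* is greedy.
static NSString *const kGreedy = @"<input.*name=\"(.*)\".*value=\"(.*)\".*/>";
// Same pattern with a ? after each .* so each one matches as little as possible.
static NSString *const kLazy = @"<input.*?name=\"(.*?)\".*?value=\"(.*?)\".*?/>";

// Hypothetical helper: returns capture group `group` of every match of
// `pattern` in `text`, matching case-insensitively.
static NSArray *Captures(NSString *pattern, NSString *text, NSUInteger group) {
    NSRegularExpression *re =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray *found = [NSMutableArray array];
    NSArray *matches = [re matchesInString:text
                                   options:0
                                     range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *m in matches) {
        // Ranges are expressed in the coordinates of the full input string.
        [found addObject:[text substringWithRange:[m rangeAtIndex:group]]];
    }
    return found;
}
```

    On a line such as <input name="a" value="1"/><input name="b" value="2"/>, the greedy pattern should produce a single match spanning both tags and capture only the name b, while the lazy one captures a and b separately. Lazy quantifiers can still run past the end of a tag if that tag has no name attribute, so a negated class like [^>]* is usually the safer replacement for .* inside a tag.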

    As for matching arbitrary numbers of attributes, you could wrap that section in a non-capture group and + or * it. Something like this (note I haven't tested it):

    ...input\\s*(?:\\w+=\"(.*?)\"\\s*)+...
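
    One caveat with a capture group inside a repeated group: only the last repetition's capture is kept, so a single pattern like this cannot hand back every attribute. A two-pass approach avoids that and also makes attribute order irrelevant: first find each <input ...> tag, then pull name="value" pairs out of that tag's range. A rough sketch, again assuming NSRegularExpression in place of RegexKitLite; InputAttributes is a hypothetical helper, and it only handles double-quoted values that contain no > character.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns one dictionary per <input ...> tag,
// mapping lowercased attribute name -> attribute value.
static NSArray *InputAttributes(NSString *html) {
    // Pass 1: one match per tag; [^>]* cannot run into the next tag.
    NSRegularExpression *tagRe =
        [NSRegularExpression regularExpressionWithPattern:@"<input\\b[^>]*>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    // Pass 2: attribute="value" pairs, in whatever order they appear.
    NSRegularExpression *attrRe =
        [NSRegularExpression regularExpressionWithPattern:@"(\\w+)\\s*=\\s*\"([^\"]*)\""
                                                  options:0
                                                    error:NULL];
    NSMutableArray *tags = [NSMutableArray array];
    NSArray *tagMatches = [tagRe matchesInString:html
                                         options:0
                                           range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *tag in tagMatches) {
        NSMutableDictionary *attrs = [NSMutableDictionary dictionary];
        // Restrict the attribute search to the range of the current tag.
        NSArray *attrMatches = [attrRe matchesInString:html options:0 range:tag.range];
        for (NSTextCheckingResult *a in attrMatches) {
            NSString *key = [[html substringWithRange:[a rangeAtIndex:1]] lowercaseString];
            NSString *value = [html substringWithRange:[a rangeAtIndex:2]];
            [attrs setObject:value forKey:key];
        }
        [tags addObject:attrs];
    }
    return tags;
}
```

    For the two example tags from the first post, this should give two dictionaries, and [dict objectForKey:@"name"] / [dict objectForKey:@"value"] work no matter where those attributes sit or what other attributes are present. For anything beyond simple, well-formed markup, a real HTML parser is likely the more robust choice.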

    Also, take a look at the ICU user guide (what RegexKitLite uses as its backend):

    http://userguide.icu-project.org/strings/regexp
     
