Help with regular expressions

Discussion in 'macOS' started by simX, Jul 2, 2006.

  1. simX macrumors 6502a

    simX

    Joined:
    May 28, 2002
    Location:
    Bay Area, CA
    #1
    So... I'm having a little trouble with regular expressions. I need to form a regular expression that will match this:

    [a href="http://www.apple.com/"]

    but not this:

    [a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]

    ...and the regex should replace the square brackets of the A tag with angle brackets. I've come up with /\[a(.*)\]/<a$1>/g , but it seems to match the longer string rather than the shorter one, so it ends up replacing the wrong closing bracket. In other words, I want the regex to specifically prohibit additional closing brackets ] between the initial opening bracket [ and the final closing bracket ]. How would I do this with a regular expression? (Or is it possible?)
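    The overmatching described here can be reproduced in a few lines. This is only a sketch in Python's re module, which is an assumption (the thread does not say which tool is in use); Python writes the backreference as \1 where the post writes $1, and the function name is made up for illustration.

```python
import re

# The pattern from the post, written for Python's re module (an assumption;
# the original tool is not named). Python uses \1 where the post uses $1.
GREEDY = re.compile(r"\[a(.*)\]")


def convert_greedy(line: str) -> str:
    """Apply the original substitution to a single line."""
    return GREEDY.sub(r"<a\1>", line)


short_line = '[a href="http://www.apple.com/"]'
long_line = '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]'

# The short line converts as intended.
print(convert_greedy(short_line))
# On the long line, .* runs to the LAST ] on the line, so only the outermost
# pair of brackets is replaced.
print(convert_greedy(long_line))
```

    Because .* is greedy, the second call swallows everything up to the closing bracket of [blockquote], which is the wrong closing bracket.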
     
  2. HexMonkey Administrator

    HexMonkey

    Staff Member

    Joined:
    Feb 5, 2004
    Location:
    New Zealand
    #2
    I think I've got it working. Firstly, you need to prevent [ and ] characters from appearing inside the tag, which can be done by changing .* to [^\[\]]*. Then you need to use a negative lookahead (in the format (?!<expression>)) to check that [/a] doesn't appear after it.

    This results in the following expression, which is almost as readable as machine code:
    Code:
    /\[a([^\[\]]*)\](?!.*\[\/a\])/<a$1>/g
    In my tests, these lines were matched:
    • [a href="http://www.apple.com/"]
    • [a href="http://www.apple.com/"] oogity boogity
    • [a href="http://www.apple.com/"][b][/b]

    These lines weren't matched:
    • [a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]
    • [a href="http://www.apple.com/"][/a]
    • [a href="http://www.apple.com/"][a][/a]

    Note the last one: if you have multiple [a] tags on the same line and one of them is closed, none of the tags that come before the [/a] will be matched, even the ones it doesn't close. If they're all on different lines, it will work properly. Hopefully this won't be a problem in your situation; let me know if it is and I'll look into a fix.
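    The six test lines above can be checked with a short script. This is a sketch in Python's re module (an assumption; the thread does not name the tool), with \1 standing in for $1 and a made-up function name.

```python
import re

# Negated character class plus negative lookahead, as in the post.
# Python's re module and the names below are assumptions for illustration.
OPEN_WITHOUT_CLOSE = re.compile(r"\[a([^\[\]]*)\](?!.*\[/a\])")


def convert_unclosed(line: str) -> str:
    """Convert [a ...] to <a ...> only when no [/a] follows on the same line."""
    return OPEN_WITHOUT_CLOSE.sub(r"<a\1>", line)


url_tag = '[a href="http://www.apple.com/"]'

converted = [
    url_tag,
    url_tag + " oogity boogity",
    url_tag + "[b][/b]",
]
left_alone = [
    url_tag + " oogity boogity [/a] [blockquote]",
    url_tag + "[/a]",
    url_tag + "[a][/a]",
]

for line in converted + left_alone:
    print(convert_unclosed(line))
```

    The lookahead uses .*, which does not cross a line break by default, so the check for [/a] only applies within a single line.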
     
  3. PatrickF macrumors 6502

    Joined:
    Feb 16, 2006
    Location:
    Blighty
    #3
    It's even easier if you just want the first part of the tag:

    Code:
    /\[[aA](.*?)\]/<a$1>/g
    
    The .*? is essentially the same as .*, but it switches from greedy matching to non-greedy matching (i.e. it will stop at the first ] instead of matching as much as possible and stopping at the last ]).

    You could also do it like this:

    Code:
    /\[[aA]([^\]]*)\]/<a$1>/g
    
    This basically says: match [, then either a or A, followed by zero or more characters that are not ], and then match the ] to complete it.
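    Both expressions can be compared side by side. The sketch below assumes Python's re module (the thread does not name the tool), uses \1 in place of $1, and the constant names are made up for illustration.

```python
import re

# The two expressions from this post, written for Python's re module
# (an assumption; the names are illustrative only).
LAZY = re.compile(r"\[[aA](.*?)\]")        # non-greedy quantifier
NEGATED = re.compile(r"\[[aA]([^\]]*)\]")  # negated character class

line = '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]'

# Both stop at the first ] and leave the rest of the line untouched.
print(LAZY.sub(r"<a\1>", line))
print(NEGATED.sub(r"<a\1>", line))

# One difference: by default . does not match a line break, but [^\]] does,
# so only the second form converts a tag that is split across two lines.
split_tag = '[a\nhref="x"]'
print(repr(LAZY.sub(r"<a\1>", split_tag)))
print(repr(NEGATED.sub(r"<a\1>", split_tag)))
```

    On a single line the two give the same result; they differ only when the tag itself contains a line break.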

    Hope this helps.
     
  4. HexMonkey Administrator

    HexMonkey

    Staff Member

    Joined:
    Feb 5, 2004
    Location:
    New Zealand
    #4
    I used the non-greedy quantifier in my first attempt, but changed it as it wasn't working as expected. For example, in the line:
    [a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]
    It would first try to match [a href="http://www.apple.com/"], since it's non-greedy, but would see it has a [/a] following it. It would then try to match [a href="http://www.apple.com/"] oogity boogity [/a], which would succeed as it has no [/a] following it.
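    The backtracking described here can be observed directly. The first attempt is not quoted in the thread, so the lazy pattern below is an assumed reconstruction (non-greedy quantifier plus the same lookahead), written for Python's re module with made-up names.

```python
import re

# Assumed reconstruction of the first attempt: lazy quantifier + lookahead.
LAZY_LOOKAHEAD = re.compile(r"\[a(.*?)\](?!.*\[/a\])")
# The version that was posted: negated character class + lookahead.
CLASS_LOOKAHEAD = re.compile(r"\[a([^\[\]]*)\](?!.*\[/a\])")

line = '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]'

lazy_match = LAZY_LOOKAHEAD.search(line)
class_match = CLASS_LOOKAHEAD.search(line)

# The lazy version fails the lookahead at the first ], then extends the
# match to the next ] where the lookahead finally succeeds.
print(lazy_match.group(0) if lazy_match else None)
# The character class cannot extend past a ], so there is no match at all.
print(class_match.group(0) if class_match else None)
```

    A lazy quantifier only prefers the shortest match; it still grows when the rest of the pattern fails, whereas the negated class cannot cross a bracket.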

    Of course, if you want to match the opening part of the tag regardless of whether it has a closing part, either of PatrickF's expressions will work.
     
