
simX

macrumors 6502a
Original poster
May 28, 2002
765
4
Bay Area, CA
So... I'm having a little trouble with regular expressions. I need to form a regular expression that will match this:

[a href="http://www.apple.com/"]

but not this:

[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]

... and the result should be that the regex replaces the square brackets of the A tags with angle brackets. I've come up with this: /\[a(.*)\]/<a$1>/g , but it seems to match the latter first rather than the former, so it ends up replacing the wrong closing bracket. In other words, I want the regex to specifically prohibit additional closing brackets ] between the initial opening bracket [ and the final closing bracket ]. How would I do this with a regular expression? (Or is it even possible?)
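
A minimal sketch of the problem described above, assuming Python's re module (the thread doesn't name a language, so that choice and the variable names are assumptions). The greedy .* runs to the last ] on the line, so the wrong closing bracket gets replaced:

```python
import re

line = '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]'

# Greedy (.*) grabs everything up to the LAST ] on the line,
# so only the final closing bracket is turned into >.
greedy = re.sub(r'\[a(.*)\]', r'<a\1>', line)
print(greedy)
# <a href="http://www.apple.com/"] oogity boogity [/a] [blockquote>
```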
 

HexMonkey

Administrator emeritus
Feb 5, 2004
2,240
504
New Zealand
I think I've got it working. Firstly, you need to prevent [ and ] characters from appearing inside the tag, which can be done by changing .* to [^\[\]]*. Then you need to use a negative lookahead (in the format (?!<expression>)) to check that [/a] doesn't appear after it.

This results in the following expression, which is almost as readable as machine code:
Code:
/\[a([^\[\]]*)\](?!.*\[\/a\])/<a$1>/g

In my tests, these lines were matched:
  • [a href="http://www.apple.com/"]
  • [a href="http://www.apple.com/"] oogity boogity
  • [a href="http://www.apple.com/"][b][/b]

These lines weren't matched:
  • [a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]
  • [a href="http://www.apple.com/"][/a]
  • [a href="http://www.apple.com/"][a][/a]

Note the last one: if you have multiple [a] tags on the same line and one of them is closed, none of them will be matched. If they're all on different lines, it will work properly. Hopefully this won't be a problem in your situation; let me know if it is and I'll look into a fix.
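
For anyone who wants to reproduce the lists above, here is a rough sketch using Python's re module. The engine choice is an assumption (the thread doesn't say which one was used), and the names OPEN_A, should_match and should_not_match are made up for illustration:

```python
import re

# The lookahead pattern from this post, written as a Python raw string.
# No slash delimiters are used here, so the / in [/a] needs no escaping.
OPEN_A = re.compile(r'\[a([^\[\]]*)\](?!.*\[/a\])')

should_match = [
    '[a href="http://www.apple.com/"]',
    '[a href="http://www.apple.com/"] oogity boogity',
    '[a href="http://www.apple.com/"][b][/b]',
]
should_not_match = [
    '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]',
    '[a href="http://www.apple.com/"][/a]',
    '[a href="http://www.apple.com/"][a][/a]',
]

# search() reports whether the pattern matches anywhere in the line.
matched = [bool(OPEN_A.search(s)) for s in should_match]
unmatched = [bool(OPEN_A.search(s)) for s in should_not_match]

# The replacement step: \1 is the text captured between [a and ].
converted = OPEN_A.sub(r'<a\1>', should_match[0])

print(matched)    # [True, True, True]
print(unmatched)  # [False, False, False]
print(converted)  # <a href="http://www.apple.com/">
```

Since . does not match a newline by default in Python, the lookahead only scans to the end of the current line, which lines up with the same-line caveat above.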
 

PatrickF

macrumors 6502
Feb 16, 2006
335
0
Blighty
It's even easier if you just want the first part of the tag:

Code:
/\[[aA](.*?)\]/<a$1>/g

The .*? is essentially the same as .* but it switches from greedy matching to non-greedy matching (i.e. it will stop at the first ] instead of trying to match as much as possible before finding the last ]).

You could also do it like this:

Code:
/\[[aA]([^\]]*)\]/<a$1>/g

This basically says: match [, then either a or A, followed by zero or more characters that aren't ], and then match the ] to complete it.
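
A quick sketch of both expressions on the original poster's second line, assuming Python's re module (an assumption, since no language is named in the thread; the variable names are illustrative). Both stop at the first ], so only the opening tag is rewritten:

```python
import re

line = '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]'

# Non-greedy version: (.*?) stops at the first ] it can.
lazy = re.sub(r'\[[aA](.*?)\]', r'<a\1>', line)

# Negated character class version: [^\]]* can never run past a ].
negated = re.sub(r'\[[aA]([^\]]*)\]', r'<a\1>', line)

print(lazy)
print(negated)
# Both print:
# <a href="http://www.apple.com/"> oogity boogity [/a] [blockquote]
```

Note that [/a] and [blockquote] are left alone because neither has a or A right after the opening bracket.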

Hope this helps.
 

HexMonkey

Administrator emeritus
Feb 5, 2004
2,240
504
New Zealand
I used the non-greedy quantifier in my first attempt, but changed it as it wasn't working as expected. For example, in the line:
[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]
It would first try to match [a href="http://www.apple.com/"], since it's non-greedy, but would see it has a [/a] following it. It would then try to match [a href="http://www.apple.com/"] oogity boogity [/a], which would succeed as it has no [/a] following it.
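
The backtracking described above can be seen in a small sketch, again assuming Python's re module (an assumption on the engine; the combined pattern here is a reconstruction of the "first attempt", not something quoted in the thread):

```python
import re

line = '[a href="http://www.apple.com/"] oogity boogity [/a] [blockquote]'

# Non-greedy quantifier combined with the negative lookahead.
# When the lookahead fails at the first ], the engine backtracks and
# lets (.*?) grow until the next ] where the lookahead succeeds.
m = re.search(r'\[a(.*?)\](?!.*\[/a\])', line)

print(m.group(0))
# [a href="http://www.apple.com/"] oogity boogity [/a]
```

So the match swallows the closing [/a] as well, which is why the character class [^\[\]]* is used in place of .*? in the working expression.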

Of course, if you want to match the opening part of the tag regardless of whether it has a closing part, either of PatrickF's expressions will work.
 