
ppc_michael

Guest
Original poster
Apr 26, 2005
1,498
2
Los Angeles, CA
I'm using preg_replace, and basically I want to match a certain HTML tag and turn it into a <div>. So I want to take these:

Code:
<cut title="titletext">Some content</cut>
<cut>Some more content</cut>

And change them to these:

Code:
<div class="cut">Some content</div>
<div class="cut">Some more content</div>

This is the pattern I used with my preg_replace function:

Code:
/<cut.*>(.*)<\/cut>/s

That pattern matches the <cut>Some more content</cut>, but it does not match the <cut title="titletext">Some content</cut> one.

Obviously I must be doing something wrong, but what? I would think using <cut.*> would accept <cut> or <cut ANYTHING>.
 
I'm not a preg_replace user, but I know my regular expressions.

Are you passing the lines to the call as an array?

Does the first line appear in the output (unaltered), or is it missing?

If it's missing, I think your first <cut.*> is matching the whole of the first line and everything up to the end of the <cut> on the second line. Regular expressions do this: they match as early as they can, and as much as they can.
You could test this by replacing the period '.' (any character) in <cut.*> with a class that excludes the right angle bracket '>', i.e. <cut[^>]*>, so the pattern stops at the first '>'.
If you're passing the lines as an array you shouldn't have this issue (from my quick glimpse at the preg_replace page).
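To make the diagnosis above concrete, here is a minimal sketch. It assumes both tags arrive in one string separated by a newline; the variable names are made up for illustration. It compares the original pattern with a variant that uses a negated class for the opening tag and a lazy quantifier for the content:

```php
<?php
// Hypothetical input: both tags in one string, separated by a newline.
$html = "<cut title=\"titletext\">Some content</cut>\n<cut>Some more content</cut>";
$replacement = '<div class="cut">$1</div>';

// Original pattern: the greedy ".*" after "<cut" runs to the last ">" that
// still leaves a "</cut>" to match, so both tags collapse into one match.
$greedy = preg_replace('/<cut.*>(.*)<\/cut>/s', $replacement, $html);

// "[^>]*" stops the opening tag at its own ">", and the lazy "(.*?)" stops
// the content at the nearest "</cut>". "\b" keeps "<cutter>" from matching.
$fixed = preg_replace('/<cut\b[^>]*>(.*?)<\/cut>/s', $replacement, $html);

echo $greedy, "\n\n", $fixed, "\n";
```

With this input the first call leaves only the second tag's content in a single div, while the second call converts each tag separately. Note that swapping in [^>]* alone is not enough when the whole text is one string: the capture group (.*) is also greedy, so it needs the lazy form as well.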
 
@OP

I use a nice little custom function named strip_tags_attributes() that works great for stuff like this.

Using your example posted here:

PHP:
$mystring="<cut title=\"titletext\">Some content</cut>";
$newstring=strip_tags_attributes($mystring,"<cut>");

Output would be: <cut>Some content</cut>

The reason you might use this function instead of a single regular expression is that you can specify which tags in a string to keep and which attributes to leave in place. It also handles extra white space and single or double quotes around attribute values, and attribute matching is case-insensitive.

For example, to keep foo and remove only the title attribute:

PHP:
$mystring="<cut foo='foobar' title=\"titletext\">Some content</cut>";
$newstring=strip_tags_attributes($mystring,"<cut>","foo");

You'll end up with "<cut foo='foobar'>Some content</cut>"


PHP:
/* 
Usage:
argument 1: string to search
argument 2: tag names just like strip_tags() uses in PHP
argument 3: optional - attributes to allow, may be comma separated or an array, otherwise remove all if not defined
*/

function strip_tags_attributes($string,$allowtags=NULL,$allowattributes=""){
    // Remove every tag that is not listed in $allowtags
    $string = strip_tags($string,$allowtags);
    if (!is_null($allowattributes)) {
        if(!is_array($allowattributes))
            $allowattributes = explode(",",$allowattributes);
        // Build one negative lookbehind per allowed attribute name
        $lookbehinds = "";
        foreach ($allowattributes as $attribute) {
            $attribute = trim($attribute);
            if ($attribute !== "")
                $lookbehinds .= "(?<!".preg_quote($attribute,"/").")";
        }
        // Matches name="value" or name='value' unless the name is allowed
        $pattern = "/ [^ =]*".$lookbehinds."=(\"[^\"]*\"|'[^']*')/i";
        // A closure replaces create_function(), which was removed in PHP 8.0
        $string = preg_replace_callback("/<[^>]*>/i",function($matches) use ($pattern){
            return preg_replace($pattern, "", $matches[0]);
        },$string);
    }
    return $string;
}
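The function above only strips attributes, so getting to the <div class="cut"> form from the original question still needs a rename step. A minimal sketch of that step, assuming the attributes are already gone ($stripped is a hypothetical stand-in for the function's output, so the snippet runs on its own):

```php
<?php
// Stand-in for the output of strip_tags_attributes($input, "<cut>"):
// every <cut> tag is assumed to have lost its attributes already.
$stripped = "<cut>Some content</cut>\n<cut>Some more content</cut>";

// With no attributes left, a plain string replacement is enough.
$converted = str_replace(
    array('<cut>', '</cut>'),
    array('<div class="cut">', '</div>'),
    $stripped
);

echo $converted, "\n";
```

str_replace() is case-sensitive; if tags such as <CUT> can appear, str_ireplace() would be the variant to try.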


-jim
 
Jim has a good solution there. The regex does seem to be fine, though, when I tested it in my regular expression testing tool (see attachment). That means the issue is more likely in the other PHP you have. Show us the code and we'll see if anything pops out.
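One possible explanation for the pattern passing in a tester but failing in a script is that the tester saw one tag at a time. That is an assumption, not something the thread confirms. The sketch below shows the difference using preg_replace() with an array subject, which applies the pattern to each element separately; the variable names are illustrative only:

```php
<?php
$pattern = '/<cut.*>(.*)<\/cut>/s';
$replacement = '<div class="cut">$1</div>';

// Assumption: the tester was given one tag per run, like each element here.
$lines = array(
    '<cut title="titletext">Some content</cut>',
    '<cut>Some more content</cut>',
);

// preg_replace() accepts an array subject and returns an array of results.
$perLine = preg_replace($pattern, $replacement, $lines);

// The same two tags joined into a single string are matched as one unit.
$joined = preg_replace($pattern, $replacement, implode("\n", $lines));

print_r($perLine);
echo $joined, "\n";
```

Per element, the original pattern converts both tags as intended; on the joined string it returns a single div, which would be consistent with the symptom described in the first post.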
 

Attachments

  • regex.png (11.6 KB)