PHP + Regex: Why isn't this pattern matching?

Discussion in 'Web Design and Development' started by ppc_michael, Oct 22, 2009.

  1. ppc_michael Guest


    Apr 26, 2005
    Los Angeles, CA
    I'm using preg_replace, and basically I want to match a certain HTML tag and turn it into a <div>. So I want to take these:

    <cut title="titletext">Some content</cut>
    <cut>Some more content</cut>
    And change them to these:

    <div class="cut">Some content</div>
    <div class="cut">Some more content</div>
    This is the pattern I used with my preg_replace function:

    That pattern matches the <cut>Some more content</cut>, but it does not match the <cut title="titletext">Some content</cut> one.

    Obviously I must be doing something wrong, but what? I would think using <cut.*> would accept <cut> or <cut ANYTHING>.
  2. BertyBoy macrumors 6502

    Feb 1, 2009
    I'm not a preg_replace user, but I know my regular expressions.

    Are you passing the lines to the call in an array ?

    Does the first line appear in the output (unaltered) or is it missing ?

    If it's missing, I think your first <cut.*> is matching the whole of the first line and on up to the end of the <cut> on the second line. Regular expressions do this: they match as early as they can and for as long as they can. (That assumes both tags are in the same string with no newline between them, or that the pattern uses the s modifier, since '.' does not match a newline by default.)
    You could test this by replacing the period '.' (any character) in <cut.*> with a class that excludes right angle brackets, [^>], so the pattern stops at the first '>'.
    If you're using arrays you shouldn't have this issue (from my quick glimpse at the preg_replace page).
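
    To make the greedy-versus-bounded point concrete, here is a minimal sketch. The original pattern is not shown in the thread, so both patterns and the sample input below are assumptions for illustration only, not the poster's actual code.

```php
<?php
// Hypothetical input: both tags in one string with no newline between them,
// which is the situation where a greedy ".*" causes trouble.
$html = '<cut title="titletext">Some content</cut> <cut>Some more content</cut>';

// Greedy: ".*" runs to the last ">" that still lets the rest of the pattern
// match, so both tags collapse into a single match.
$greedy = preg_replace('/<cut.*>(.*)<\/cut>/', '<div class="cut">$1</div>', $html);

// Bounded: "[^>]*" stops at the first ">", and the lazy ".*?" stops at the
// first "</cut>". "\b" keeps the pattern from matching tags like <cutter>.
$bounded = preg_replace('/<cut\b[^>]*>(.*?)<\/cut>/s', '<div class="cut">$1</div>', $html);

echo $greedy, "\n";
echo $bounded, "\n";
```

    With the greedy version the first tag's content is swallowed; the bounded version converts each tag separately.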
  3. SrWebDeveloper macrumors 68000


    Dec 7, 2007
    Alexandria, VA, USA

    I use a nice little custom function named strip_tags_attributes() that works great for stuff like this.

    Using your example posted here:

    $mystring="<cut title=\"titletext\">Some content</cut>";
    Output would be: <cut>Some content</cut>

    The reason you might use this function instead of a single regular expression is that it gives you more control: you can specify which tags in the string to keep, you can keep the attributes you don't want removed, and it handles extra white space and attribute values in single or double quotes. Attribute names are matched case-insensitively.

    For example, to keep foo and remove only the title attribute:

    $mystring="<cut foo='foobar' title=\"titletext\">Some content</cut>";
    You'll end up with "<cut foo='foobar'>Some content</cut>"

    argument 1: string to search
    argument 2: tag names just like strip_tags() uses in PHP
    argument 3: optional - attributes to allow, may be comma separated or an array, otherwise remove all if not defined

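    function strip_tags_attributes($string, $allowtags = NULL, $allowattributes = "") {
        // Remove every tag that is not listed in $allowtags.
        $string = strip_tags($string, $allowtags);
        if (!is_null($allowattributes)) {
            // Accept either a comma-separated string or an array of attribute names.
            if (!is_array($allowattributes))
                $allowattributes = explode(",", $allowattributes);
            // Turn the allowed names into negative lookbehinds, e.g. (?<!foo)(?<!bar).
            $allowattributes = implode(")(?<!", $allowattributes);
            if (strlen($allowattributes) > 0)
                $allowattributes = "(?<!" . $allowattributes . ")";
            // create_function() was removed in PHP 8, so a closure is used here.
            $pattern = '/ [^ =]*' . $allowattributes . '=("[^"]*"|\'[^\']*\')/i';
            // The posted code ends mid-call; the closing lines are reconstructed
            // and may differ from the author's.
            $string = preg_replace_callback("/<[^>]*>/i", function ($matches) use ($pattern) {
                return preg_replace($pattern, "", $matches[0]);
            }, $string);
        }
        return $string;
    }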

  4. angelwatt Moderator emeritus


    Aug 16, 2005
    Jim has a good solution there. The regex does seem to be fine though when I tested it in my regular expression testing tool (see attachment). That means it's more likely an issue in other PHP you have. Show us the code and we'll see if anything pops out.
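
    One quick way to check whether the pattern or the surrounding PHP is at fault is to run the pattern against each input on its own and look at what preg_match reports. This is only a sketch: the pattern and inputs below are stand-ins, since the actual code was not posted.

```php
<?php
// Hypothetical pattern and inputs; substitute the ones from your own script.
$pattern = '/<cut\b[^>]*>(.*?)<\/cut>/s';
$lines = [
    '<cut title="titletext">Some content</cut>',
    '<cut>Some more content</cut>',
];

$results = [];
foreach ($lines as $line) {
    // preg_match returns 1 on a match, 0 on no match, and false on error.
    $result = preg_match($pattern, $line, $m);
    $results[] = $result;
    echo $line, ' => ', var_export($result, true), "\n";
}

// PREG_NO_ERROR means the regex engine itself did not fail on the last call.
$engineOk = (preg_last_error() === PREG_NO_ERROR);
```

    If every line reports a match here but the real script still misses one, the problem is more likely in how the string is built or passed to preg_replace than in the pattern.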
