preg_replace to only allow <br />

Discussion in 'Web Design and Development' started by Cabbit, May 4, 2009.

  1. Cabbit macrumors 68020

    Cabbit

    Joined:
    Jan 30, 2006
    Location:
    Scotland
    #1
    ^_^ Hi hi, I would like my preg_replace to only allow the <br /> tag, if anyone can help.

    It looks like this just now:

    $value = preg_replace('/<\/?(?:\b(?!)[^>]+?)>/i', '', $value);


    Also, I have BBCode for lists. It looks like the following:

    [ LIST=1 ]
    [*]item1
    [*]item2
    [ /LIST ]


    but I am unsure how to get a good replace for it:
    // Numbered list //
    $pattern[7] = "/\[LIST=1\](.*?)\[\/LIST\]/is";
    $replace[7] = "<ol>$1</ol>";
    $pattern[8] = "/\[*\](.*?)\[\/*\]/is";
    $replace[8] = "<li>$1</li>";
     
  2. angelwatt Moderator emeritus

    angelwatt

    Joined:
    Aug 16, 2005
    Location:
    USA
    #2
    I didn't thoroughly test this, but I believe this regex should work for the br:
    Code:
    /<\/?((?!(br))[^>]+?)>/i
    The (?!(br)) is a negative lookahead: it makes sure the tag name does not start with 'br', so any tag that does start with 'br' is left alone while every other tag is stripped. This would also let <brown> through, but not <abraham>. Offhand I couldn't think of any other legitimate HTML tags that start with br. The regex could be modified to take care of such cases, though.
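A quick way to check the lookahead behaviour described above, as a minimal sketch. The sample input and the stricter second pattern are assumptions added for illustration; they are not from the thread.

```php
<?php
// Sample input (made up for illustration).
$input = 'Hello<br />there <b>bold</b> <brown>x</brown> <abraham>y</abraham>';

// Pattern from the post: strip any tag whose name does not start with "br".
$loose = '/<\/?((?!(br))[^>]+?)>/i';
$looseResult = preg_replace($loose, '', $input);

// Assumed stricter variant: keep only <br>, <br/> and <br />, strip the rest.
$strict = '/<(?!br\s*\/?>)[^>]*>/i';
$strictResult = preg_replace($strict, '', $input);

echo $looseResult, "\n";  // Hello<br />there bold <brown>x y
echo $strictResult, "\n"; // Hello<br />there bold x y
```

Note that the first pattern still strips closing tags such as </brown>: the optional slash can backtrack to match nothing, and the lookahead then sees "/brown" rather than "brown".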

    I'll have to stare at the BBCode some more before I have a solution for that.
     
  3. Cabbit thread starter macrumors 68020

    #3
    ^_^ Thanks, I am plodding along on a few other things before I look at it again later today. I tend to need a break before the answer comes.

    What would be more useful and cleaner: I am trying to support line breaks coming from a textarea into PHP. It would be good if the text blocks were made into paragraphs wherever there is a line break.

    For example, the user enters

    Hello this is my post.
    Hey


    PHP returns just now
    Hello this is my post.<br />Hey

    but
    <p>Hello this is my post.</p><p>Hey</p>
    would be much better.

    EDIT: sorted this bit with

    PHP:
    $paragraphs = explode("\n", $value);
    for ($i = 0; $i < count($paragraphs); $i++)
    {
        $paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
    }
    $value = implode('', $paragraphs);
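For comparison, here is a small variation on the same idea, offered only as a sketch. The function name lines_to_paragraphs is hypothetical, and skipping blank lines and accepting Windows line endings are assumptions beyond what the post asks for.

```php
<?php
// Hypothetical helper: wrap each non-empty line of a textarea value in <p> tags.
function lines_to_paragraphs(string $text): string
{
    // \R matches any line break sequence (\n, \r\n or \r);
    // PREG_SPLIT_NO_EMPTY drops the empty pieces left by blank lines.
    $lines = preg_split('/\R/', $text, -1, PREG_SPLIT_NO_EMPTY);

    $html = '';
    foreach ($lines as $line) {
        $html .= '<p>' . $line . '</p>';
    }
    return $html;
}

echo lines_to_paragraphs("Hello this is my post.\r\nHey");
// <p>Hello this is my post.</p><p>Hey</p>
```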
     
  4. angelwatt Moderator emeritus

    #4
    OK, got the [*] part figured out:
    Code:
    \[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))
    My test case for this was:
    Code:
    [list=1]
    [*]hello there.
    [*]second[*]oops
    [*]third
    [/list]
    I think you pretty well had the ol part figured out. If not, let me know.

    Note: this regex looks ahead for the closing [/list] tag, so you'll want to run it before the list regex.
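To see the ordering in action, here is a minimal sketch that runs the item regex first and the list regex second on the test case above. The / delimiters and the i modifier are added here as assumptions so the pattern runs under preg_replace; the variable names are made up.

```php
<?php
// Test case from the post.
$input = "[list=1]\n[*]hello there.\n[*]second[*]oops\n[*]third\n[/list]";

// Step 1: list items. The lookahead needs [/list] to still be in the text.
$itemPattern = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))/i';
$withItems = preg_replace($itemPattern, '<li>$1</li>', $input);

// Step 2: the surrounding ordered list.
$listPattern = '/\[list=1\](.*?)\[\/list\]/is';
$html = preg_replace($listPattern, '<ol>$1</ol>', $withItems);

echo $html, "\n";
```

If the two steps were swapped, [/list] would already be </ol> by the time the item regex ran, and the last item would never find its lookahead.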
     
  5. Cabbit thread starter macrumors 68020

    #5

    This is the function just now, so where would I want to fit it in?
    PHP:
    // Function to convert the bbtages to html. //
    private function bbtags_html()
    {
        // Variables
        $value = $this->build_paragraphs();

        /*
            Formating Tags
        */
        // Bold //
        $pattern[0] = "/\[b\](.*?)\[\/b\]/is";
        $replace[0] = "<strong>$1</strong>";

        // Italic //
        $pattern[1] = "/\[i\](.*?)\[\/i\]/is";
        $replace[1] = "<i>$1</i>";

        // Underlined //
        $pattern[2] = "/\[u\](.*?)\[\/u\]/is";
        $replace[2] = "<u>$1</u>";

        // url //
        $pattern[3] = "/\[url\](.*?)\[\/url\]/is";
        $replace[3] = "<a href=\"$1\">$1</a>";

        // img //
        $pattern[4] = "/\[img\](.*?)\[\/img\]/is";
        $replace[4] = "<img src=\"$1\" alt=\"$1\" \>";

        // quote //
        $pattern[5] = "/\[quote\](.*?)\[\/quote\]/is";
        $replace[5] = "<div class=\"quote\">$1</div>";

        // code //
        $pattern[6] = "/\[code\](.*?)\[\/code\]/is";
        $replace[6] = "<div class=\"code\">$1</div>";

        // Numbered list //
        $pattern[7] = "/\[LIST=1\](.*?)\[\/LIST\]/is";
        $replace[7] = "<ol>$1</ol>";

        // unordered list //
        $pattern[8] = "/\[LIST\](.*?)\[\/LIST\]/is";
        $replace[8] = "<ul>$1</ul>";

        // list element //
        $pattern[9] = "/\[*\](.*?)\[\/*\]/is";
        $replace[9] = "<li>$1</li>";

        /*
            Image Tags
        */
        // smile
        $pattern[10] = '/\:\)/';
        $replace[10] = '<img src="/include/class/bbeditor/images/smilies/emoticon_smile.png" alt=":)" />';

        // tongue
        $pattern[11] = '/\:p/';
        $replace[11] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';
        $pattern[12] = '/\:P/';
        $replace[12] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';

        /*
            Output
        */
        $value = preg_replace($pattern, $replace, $value);
        return $value;
    }
     
  6. angelwatt Moderator emeritus

    #6
    Mine would become #7 and your current #9 would disappear.
     
  7. Cabbit thread starter macrumors 68020

    #7
    I did it like so:

    PHP:
    // Numbered list //
    $pattern[7] = "\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))";
    $replace[7] = "<ol>$1</ol>";
    though it seems to have knocked out the function, as it does not return the value now.
     
  8. angelwatt Moderator emeritus

    #8
    That regex was for the li not ol. So the replace array needs updating.
     
  9. Cabbit thread starter macrumors 68020

    #9
    HTML:
    // list element //
    $pattern[7] = "\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))";
    $replace[7] = "<li>$1</li>";

    // Numbered list //
    $pattern[8] = "/\[list=1\](.*?)\[\/LIST\]/is";
    $replace[8] = "<ol>$1</ol>";

    // unordered list //
    $pattern[9] = "/\[LIST\](.*?)\[\/LIST\]/is";
    $replace[9] = "<ul>$1</ul>";

    Written like this, it returns
    Warning: preg_replace() [function.preg-replace]: Delimiter must not be alphanumeric or backslash in /home/abcomfor/development_html/include/class/textarea.class.php on line 152
     
  10. angelwatt Moderator emeritus

    #10
    I believe it needs the / delimiters around the regex I gave. I've been messing around with the regex in a non-PHP environment, so I forgot to include them.
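The delimiter requirement can be shown in isolation. This is only a sketch with a made-up subject string; the @ operator is used purely to keep the warning out of the demo output.

```php
<?php
$subject = '[*]item';

// Without delimiters PCRE rejects the pattern: preg_replace() emits a
// warning and returns null. The @ only silences the warning for this demo.
$broken = @preg_replace('\[\*\]', '<li>', $subject);

// With / delimiters the same expression compiles and runs.
$fixed = preg_replace('/\[\*\]/', '<li>', $subject);

var_dump($broken); // NULL
var_dump($fixed);  // string(8) "<li>item"
```

The null return is a likely explanation for the function "not returning the value" earlier in the thread: one pattern that fails to compile is enough to make the whole preg_replace() call come back empty.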

    A note for the ol and ul regexes: they currently won't be able to handle nested lists. It is uncommon for people to use them, though admittedly I have here on MacRumors. A single regex won't be able to handle it, but I think a regex with a for loop may be able to do it. You may also want to download someone else's BBCode parser to see how they handle it. (A quick Google showed many parsers just ignore lists. The wimps :))

    Essentially you could use
    Code:
    /\[list\]([\w\W]+)\[\/list\]/
    with a replace, and loop over it until no match is found. At least that would work in theory. I have my doubts whether it would hold up when multiple lists are used in the post. It may need a fancier regex.
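One way the "loop until no match" idea might be sketched is shown below. Note that this swaps in a different pattern from the one above: it only matches a list whose body contains no further [list] or [/list], so the innermost lists are converted first and separate lists in one post are not merged. The function name convert_nested_lists is hypothetical.

```php
<?php
// Hypothetical helper: convert [list]...[/list] pairs from the inside out.
function convert_nested_lists(string $text): string
{
    // The body may not contain another [list] or [/list], so only
    // innermost lists match on each pass.
    $innermost = '/\[list\]((?:(?!\[\/?list\]).)*)\[\/list\]/is';

    do {
        // $count receives the number of replacements made in this pass.
        $text = preg_replace($innermost, '<ul>$1</ul>', $text, -1, $count);
    } while ($count > 0);

    return $text;
}

echo convert_nested_lists('[list][*]a[list][*]b[/list][/list] and [list][*]c[/list]');
// <ul>[*]a<ul>[*]b</ul></ul> and <ul>[*]c</ul>
```

The [*] items are deliberately left alone here; they would still need their own replacement step.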
     
  11. Cabbit thread starter macrumors 68020

    #11
    PHP:
    // list element //
    $pattern[7] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))/';
    $replace[7] = "<li>$1</li>";

    // Numbered list //
    $pattern[8] = "/\[list=1\](.*?)\[\/LIST\]/is";
    $replace[8] = "<ol>$1</ol>";

    // unordered list //
    $pattern[9] = "/\[LIST\](.*?)\[\/LIST\]/is";
    $replace[9] = "<ul>$1</ul>";
    If you try it out (it posts the input formatted), making a list 1 2 3 returns <li>1</li><li>2</li>[*]3

    try me


    And my other function happily shoves the <ol> etc. in <p> tags >.< Needs more logic.
    PHP:
    private function build_paragraphs()
    {
        $value = $this->bbtags_html();

        // Rebuilding the paragraphs //
        $paragraphs = explode("\n", $value);
        for ($i = 0; $i < count($paragraphs); $i++)
        {
            $paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
        }
        $value = implode('', $paragraphs);

        // Making the paragraphs HTML
        $value = preg_replace('/<(.p?)>/', '<$1>', $value);

        // Return the value
        return $value;
    }
     
  12. angelwatt Moderator emeritus

    #12
    It seemed to work for me.
    Given:
    Code:
    [list]
    [*]1
    [*]2
    [*]3
    [/list]
    I got back:
    HTML:
    <ul>
    <p></p><li>1
    </li><li>2
    </li><li>3
    </li></ul>
     
  13. Cabbit thread starter macrumors 68020

    #13
    With the same data I got back

    HTML:
    <p><ul>
    </p><p><li>1
    </li><li>2
    </li>[*]3
    </p><p></ul></p>
    Ah, I found the issue: the BBCode I am using is uppercase, but the pattern in the replace is lowercase.
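The case mismatch can be reproduced in isolation. A small sketch with a made-up input follows; it also shows the s modifier, which the list patterns in this thread already carry alongside i.

```php
<?php
$bbcode = "[LIST]\n[*]1\n[/LIST]";

// Lowercase pattern, no modifiers: the uppercase tags are not recognised.
$plain = preg_match('/\[list\](.*?)\[\/list\]/', $bbcode);

// i alone ignores case, but . still stops at line breaks, so a
// multi-line list body does not match.
$ignoreCase = preg_match('/\[list\](.*?)\[\/list\]/i', $bbcode);

// i plus s: ignore case and let . match newlines as well.
$both = preg_match('/\[list\](.*?)\[\/list\]/is', $bbcode);

var_dump($plain, $ignoreCase, $both); // int(0) int(0) int(1)
```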
     
  14. angelwatt Moderator emeritus

    #14
    Hmm, interesting. For what it's worth I'm on Windows XP using Firefox 3.0.10.

    Are you doing the paragraph part before or after the BBCode replacements?
     
  15. Cabbit thread starter macrumors 68020

    #15
    The paragraph bit happens after the BBCode. I have tried it both ways. Here is the full class for you to have a noodle at. I got the list thing working by adding is at the end of the pattern; the i, I think, means ignore case.

    PHP:
    <?php
    /************************************************************************************
                                      Kittenbunny CMS
                                     Filename: textarea.class.php
                                     Class: Validate Textarea 
    ************************************************************************************/

    class textarea
    {
        
        // Variables
        public $post;

        // Function to remove html tags //
        private function sanitize_htmltags()
        {
            $value = $this->post;

            // Sets the allowed taggs //
            $value = preg_replace('/<\/?(?:\b(?!)[^>]+?)>/i', '', $value);

            // Does the htmlspecialchars bit //
            $value = htmlspecialchars($value, ENT_QUOTES, "UTF-8");

            return $value;
        }
        
        
        // Adds paragraph tags to the html
        private function build_paragraphs()
        {
            $value = $this->bbtags_html();

            // Rebuilding the paragraphs //
            $paragraphs = explode("\n", $value);
            for ($i = 0; $i < count($paragraphs); $i++)
            {
                $paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
            }
            $value = implode('', $paragraphs);

            // Making the paragraphs HTML
            $value = preg_replace('/<(.p?)>/', '<$1>', $value);

            // Return the value
            return $value;
        }
        
        
        // Function to convert the bbtages to html. //
        private function bbtags_html()
        {
            // Variables
            $value = $this->sanitize_htmltags();

            /*
                Formating Tags
            */
            // Bold //
            $pattern[0] = "/\[b\](.*?)\[\/b\]/is";
            $replace[0] = "<strong>$1</strong>";

            // Italic //
            $pattern[1] = "/\[i\](.*?)\[\/i\]/is";
            $replace[1] = "<i>$1</i>";

            // Underlined //
            $pattern[2] = "/\[u\](.*?)\[\/u\]/is";
            $replace[2] = "<u>$1</u>";

            // url //
            $pattern[3] = "/\[url\](.*?)\[\/url\]/is";
            $replace[3] = "<a href=\"$1\">$1</a>";

            // img //
            $pattern[4] = "/\[img\](.*?)\[\/img\]/is";
            $replace[4] = "<img src=\"$1\" alt=\"$1\" \>";

            // quote //
            $pattern[5] = "/\[quote\](.*?)\[\/quote\]/is";
            $replace[5] = "<div class=\"quote\">$1</div>";

            // code //
            $pattern[6] = "/\[code\](.*?)\[\/code\]/is";
            $replace[6] = "<div class=\"code\">$1</div>";

            // list element //
            $pattern[7] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/LIST\])))/is';
            $replace[7] = "<li>$1</li>";

            // Numbered list //
            $pattern[8] = "/\[LIST=1\](.*?)\[\/LIST\]/is";
            $replace[8] = "<ol>$1</ol>";

            // unordered list //
            $pattern[9] = "/\[LIST\](.*?)\[\/LIST\]/is";
            $replace[9] = "<ul>$1</ul>";

            /*
                Image Tags
            */
            // smile
            $pattern[10] = '/\:\)/';
            $replace[10] = '<img src="/include/class/bbeditor/images/smilies/emoticon_smile.png" alt=":)" />';

            // tongue
            $pattern[11] = '/\:p/';
            $replace[11] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';
            $pattern[12] = '/\:P/';
            $replace[12] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';

            // happy
            $pattern[13] = '/\:d/';
            $replace[13] = '<img src="/include/class/bbeditor/images/smilies/emoticon_happy.png" alt=":D" />';
            $pattern[14] = '/\:D/';
            $replace[14] = '<img src="/include/class/bbeditor/images/smilies/emoticon_happy.png" alt=":D" />';

            // Kitty Smile
            $pattern[15] = '/\:3/';
            $replace[15] = '<img src="/include/class/bbeditor/images/smilies/emoticon_waii.png" alt=":3" />';

            // grin
            $pattern[16] = '/\:grin\:/';
            $replace[16] = '<img src="/include/class/bbeditor/images/smilies/emoticon_grin.png" alt=":grin:" />';
            $pattern[17] = '/\:GRIN\:/';
            $replace[17] = '<img src="/include/class/bbeditor/images/smilies/emoticon_grin.png" alt=":grin:" />';

            // wink
            $pattern[18] = '/\;\)/';
            $replace[18] = '<img src="/include/class/bbeditor/images/smilies/emoticon_wink.png" alt=";)" />';

            // twisted
            $pattern[19] = '/\:twisted\:/';
            $replace[19] = '<img src="/include/class/bbeditor/images/smilies/emoticon_evilgrin.png" alt=":twisted:" />';
            $pattern[20] = '/\:TWISTED\:/';
            $replace[20] = '<img src="/include/class/bbeditor/images/smilies/emoticon_evilgrin.png" alt=":twisted:" />';

            // surprised
            $pattern[21] = '/\:o/';
            $replace[21] = '<img src="/include/class/bbeditor/images/smilies/emoticon_surprised.png" alt=":o" />';
            $pattern[22] = '/\:O/';
            $replace[22] = '<img src="/include/class/bbeditor/images/smilies/emoticon_surprised.png" alt=":o" />';

            // sad
            $pattern[23] = '/\:\(/';
            $replace[23] = '<img src="/include/class/bbeditor/images/smilies/emoticon_unhappy.png" alt=":(" />';

            /*
                Output
            */
            $value = preg_replace($pattern, $replace, $value);
            return $value;
        }
        
        
        // Function to check for errors. //
        private function error_checking()
        {
            $value = $this->sanitize_htmltags();

            // Trimming excess space from the value. //
            if (!$value || strlen($value = trim($value)) == 0)
            {
                // If the value is empty. //
                return "The post is empty.";
            }
            else if (preg_match('/[\w]{1,}/', $value))
            {
                // Checking if the value is a number. //
                if (is_numeric($value))
                {
                    // Sets the is numeric error. //
                    return "The post is numeric it must be alpha-numeric.";
                }
                // The value is not a number so lets proceed. //
                else
                {
                    // Now making sure the string is not too short to avoid laziness //
                    if (strlen($value) < 5)
                    {
                        // Sets the error as too short. //
                        return "The post is too short, it must be at least 5 characters.";
                    }
                    else if (strlen($value) > 12500)
                    {
                        // Sets the error as too long. //
                        return "The post is too long, it must be no more than 12500 characters.";
                    }
                }
            }
        }
        
        
        // Function to return any errors. //
        public function error_return()
        {
            return $this->error_checking();
        }

        // Function to return the post with bbcode. html tags removed. //
        public function return_bbcode()
        {
            return $this->sanitize_htmltags();
        }

        // Function to return the post in html format for display. //
        public function return_formatedhtml()
        {
            return $this->build_paragraphs();
        }
    }
    ?>
     
  16. angelwatt Moderator emeritus

    #16
    How attached are you to using list=1 for ol? It makes the regex quite difficult.

    Here's an update for the paragraph code. I just added an if statement. It's essentially checking whether the line starts with a tag, and if so it doesn't make it a paragraph. This stops it from wrapping ul/ol in a paragraph. It would also skip any line that starts with another tag, like strong. This can be fine-tuned further if you like.
    PHP:
    // Check whether the line starts with a tag
    if (substr($paragraphs[$i], 0, 1) != '<') {
        $paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
    }
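Dropped into a standalone function, the check might look like the sketch below. The function name wrap_paragraphs and the sample input are made up for illustration; in the class this logic would live inside build_paragraphs().

```php
<?php
// Hypothetical standalone version of the paragraph loop with the tag check.
function wrap_paragraphs(string $value): string
{
    $paragraphs = explode("\n", $value);
    foreach ($paragraphs as $i => $line) {
        // Leave lines alone when they already start with a tag.
        if (substr($line, 0, 1) != '<') {
            $paragraphs[$i] = '<p>' . $line . '</p>';
        }
    }
    return implode('', $paragraphs);
}

echo wrap_paragraphs("Intro text\n<ul>\n<li>1</li>\n</ul>\nOutro");
// <p>Intro text</p><ul><li>1</li></ul><p>Outro</p>
```

As noted above, a line such as "<strong>Hi</strong> there" is skipped as well, so the test would need tightening if inline tags should still end up inside paragraphs.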
    On the pattern/replace arrays, you don't need to give an index as you're adding to them, so you can have
    PHP:
    $pattern[] = '/.*/';
    Here are patterns/replacements for the ul, ol, and li:
    PHP:
    // unordered list //
    $pattern[] = '/\[(\/?)list\]/is';
    $replace[] = '<$1ul>';

    // Numbered list //
    $pattern[] = '/\[(\/?)ol\]/is';
    $replace[] = '<$1ol>';

    // list element //
    $pattern[] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:<\/ul>)|(?:<\/ol>)|(?:<li>)))/is';
    $replace[] = "<li>$1</li>\n";
    It assumes you're up for changing the way ordered lists are done.

    The last piece is for handling nested lists, which gets executed right before all of the other replacements.
    PHP:
    // handle nested lists inside list items
    $value = preg_replace('/\[\*\]([^\*]+?(?:\[list\]|\[ol\]).*?(?:\[\/list\]|\[\/ol\]))/is', '<li>$1</li>', $value);
    $value = preg_replace($pattern, $replace, $value);
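Putting the three list patterns together on a small ordered list gives a self-contained check. This is a sketch only: the sample input is made up, and the list-element lookahead here is assumed to cover both </ul> and </ol>, since the list tags have already been converted by the time it runs.

```php
<?php
$pattern = [];
$replace = [];

// unordered list
$pattern[] = '/\[(\/?)list\]/is';
$replace[] = '<$1ul>';

// ordered list, written as [ol] ... [/ol]
$pattern[] = '/\[(\/?)ol\]/is';
$replace[] = '<$1ol>';

// list element; runs last, so it sees the converted </ul> and </ol> tags
$pattern[] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:<\/ul>)|(?:<\/ol>)|(?:<li>)))/is';
$replace[] = "<li>$1</li>\n";

$value = "[ol]\n[*]one\n[*]two\n[/ol]";

// With array arguments, each pattern is applied in order to the result
// of the previous one.
$value = preg_replace($pattern, $replace, $value);

echo $value, "\n";
```

Because the patterns are applied in array order, moving the list-element entry ahead of the other two would leave its lookahead with nothing to find at the end of a list.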
     
