preg_replace to only allow <br />
babyjenniferLB
May 4, 2009, 08:38 AM
^_^ hi hi, I would like my preg_replace to only allow the <br /> tag, if anyone can help.
It looks like this just now:
$value = preg_replace('/<\/?(?:\b(?!)[^>]+?)>/i', '', $value);
Also, I have BBCode for lists; it looks like the following:
[ LIST=1 ]
item1
item2
[ /LIST ]
but I am unsure how to get a good replace for it:
// Numbered list //
$pattern[7] = "/\[LIST=1\](.*?)\[\/LIST\]/is";
$replace[7] = "<ol>$1</ol>";
$pattern[8] = "/\[*\](.*?)\[\/*\]/is";
$replace[8] = "<li>$1</li>";
angelwatt
May 4, 2009, 10:55 AM
I didn't thoroughly test this, but I believe this regex should work for the br,
/<\/?((?!(br))[^>]+?)>/i
The (?!(br)) says to make sure the tag name does not start with 'br', so any tag that does start with it is left alone. That means <brown> would also be allowed through, while <abraham> would still be stripped. Offhand I couldn't think of any legitimate HTML tags that start with br other than br itself. The regex could be modified to take care of such cases though.
I'll have to stare at the BBCode some more before I have a solution for that.
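For what it's worth, here is a minimal sketch of that regex in use. The function name strip_tags_except_br is made up for illustration, and the \b after br is an extra tweak (an assumption, not part of the regex above) so that <brown> gets stripped as well.

```php
<?php
// Hypothetical helper: remove every tag except <br>, <br/> and <br />.
function strip_tags_except_br($value)
{
    // (?!br\b) refuses to match when the tag name is exactly "br";
    // the \b keeps names such as "brown" from slipping through.
    return preg_replace('/<\/?(?!br\b)[^>]+?>/i', '', $value);
}

echo strip_tags_except_br('<b>Hi</b><br />there<brown>x</brown>');
// prints: Hi<br />therex
```

Attributes on a br (an onclick, for example) would still pass through untouched, so this is not a full sanitizer. PHP's built-in strip_tags($value, '<br>') is another way to keep only br.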
babyjenniferLB
May 4, 2009, 11:03 AM
^_^ thanks, i am plodding along on a few other things before i look at it again later today. I tend to need a break before the answer comes.
What would be more useful and cleaner: I am trying to get this to support line breaks from a textarea to PHP. It would be good if the text blocks were made into paragraphs wherever there is a line break.
For example, the user enters
Hello this is my post.
Hey
php returns just now
Hello this is my post.<br />Hey
but
<p>Hello this is my post.</p><p>Hey</p>
would be much better.
EDIT: sorted this bit with
$paragraphs = explode("\n", $value);
for ($i = 0; $i < count ($paragraphs); $i++)
{
$paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
}
$value = implode ('', $paragraphs);
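A slightly more defensive sketch of the same idea, in case it helps. The function name text_to_paragraphs is hypothetical. It assumes the textarea may send "\r\n" line endings (explode("\n") would leave a stray "\r" in each paragraph) and that blank lines should not become empty <p></p> pairs.

```php
<?php
// Hypothetical helper: wrap each non-empty line of textarea input in <p>...</p>.
function text_to_paragraphs($text)
{
    // Browsers usually submit "\r\n", so split on any newline style.
    $lines = preg_split('/\r\n|\r|\n/', $text);
    $html = '';
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines instead of emitting <p></p>
        }
        $html .= '<p>' . $line . '</p>';
    }
    return $html;
}

echo text_to_paragraphs("Hello this is my post.\r\n\r\nHey");
// prints: <p>Hello this is my post.</p><p>Hey</p>
```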
angelwatt
May 4, 2009, 11:42 AM
OK got the [ * ] part figured out,
\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))
My test case for this was,
hello there.
second oops
third
I think you pretty well had the ol part figured out. If not, let me know.
Note: you'll want to run this regex before the list one, since its lookahead relies on the [/list] tag still being in the text.
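To illustrate why the order matters, here is a small sketch. The sample input is an assumption, and the item regex is the one above wrapped in / delimiters so PHP accepts it.

```php
<?php
// Sample input is an assumption; the item pattern is the one given above.
$input = "[list]\n[*]hello there.\n[*]second oops\n[*]third\n[/list]";

$item = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))/';
$list = '/\[list\](.*?)\[\/list\]/is';

// Items first: every [*] still has a following [*] or [/list] to look ahead to.
$right = preg_replace(array($item, $list), array('<li>$1</li>', '<ul>$1</ul>'), $input);

// List first: [/list] is already gone, so the last [*] is never matched.
$wrong = preg_replace(array($list, $item), array('<ul>$1</ul>', '<li>$1</li>'), $input);

echo $right, "\n---\n", $wrong, "\n";
```

In the second ordering the last item is left behind as a raw [*]third, because nothing the lookahead accepts follows it any more.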
babyjenniferLB
May 4, 2009, 12:18 PM
Note: Given this regex you'll want to do this regex before the list one.
This is the function just now, so where would I want to fit it in?
// Function to convert the BBCode tags to HTML. //
private function bbtags_html()
{
// Variables
$value = $this->build_paragraphs();
/*
Formatting Tags
*/
// Bold //
$pattern[0] = "/\[b\](.*?)\[\/b\]/is";
$replace[0] = "<strong>$1</strong>";
// Italic //
$pattern[1] = "/\[i\](.*?)\[\/i\]/is";
$replace[1] = "<i>$1</i>";
// Underlined //
$pattern[2] = "/\[u\](.*?)\[\/u\]/is";
$replace[2] = "<u>$1</u>";
// url //
$pattern[3] = "/\[url\](.*?)\[\/url\]/is";
$replace[3] = "<a href=\"$1\">$1</a>";
// img //
$pattern[4] = "/\[img\](.*?)\[\/img\]/is";
$replace[4] = "<img src=\"$1\" alt=\"$1\" />";
// quote //
$pattern[5] = "/\[quote\](.*?)\[\/quote\]/is";
$replace[5] = "<div class=\"quote\">$1</div>";
// code //
$pattern[6] = "/\[code\](.*?)\[\/code\]/is";
$replace[6] = "<div class=\"code\">$1</div>";
// Numbered list //
$pattern[7] = "/\[LIST=1\](.*?)\[\/LIST\]/is";
$replace[7] = "<ol>$1</ol>";
// unordered list //
$pattern[8] = "/\[LIST\](.*?)\[\/LIST\]/is";
$replace[8] = "<ul>$1</ul>";
// list element //
$pattern[9] = "/\[*\](.*?)\[\/*\]/is";
$replace[9] = "<li>$1</li>";
/*
Image Tags
*/
// smile
$pattern[10] = '/\:\)/';
$replace[10] = '<img src="/include/class/bbeditor/images/smilies/emoticon_smile.png" alt=":)" />';
// tongue
$pattern[11] = '/\:p/';
$replace[11] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';
$pattern[12] = '/\:P/';
$replace[12] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';
/*
Output
*/
$value = preg_replace($pattern, $replace, $value);
return $value;
}
angelwatt
May 4, 2009, 12:27 PM
This is the function just now, so where would I want to fit it in?
Mine would become #7 and your current #9 would disappear.
babyjenniferLB
May 4, 2009, 12:34 PM
Mine would become #7 and your current #9 would disappear.
I did it like so:
// Numbered list //
$pattern[7] = "\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))";
$replace[7] = "<ol>$1</ol>";
though it seems to have knocked out the function, as it does not return the value now.
angelwatt
May 4, 2009, 12:38 PM
// Numbered list //
$pattern[7] = "\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))";
$replace[7] = "<ol>$1</ol>";
though it seems to have knocked out the function, as it does not return the value now.
That regex was for the li not ol. So the replace array needs updating.
babyjenniferLB
May 4, 2009, 12:56 PM
// list element //
$pattern[7] = "\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))";
$replace[7] = "<li>$1</li>";
// Numbered list //
$pattern[8] = "/\[list=1\](.*?)\[\/LIST\]/is";
$replace[8] = "<ol>$1</ol>";
// unordered list //
$pattern[9] = "/\[LIST\](.*?)\[\/LIST\]/is";
$replace[9] = "<ul>$1</ul>";
like this returns
Warning: preg_replace() [function.preg-replace]: Delimiter must not be alphanumeric or backslash in /home/abcomfor/development_html/include/class/textarea.class.php on line 152
angelwatt
May 4, 2009, 01:06 PM
Warning: preg_replace() [function.preg-replace]: Delimiter must not be alphanumeric or backslash in /home/abcomfor/development_html/include/class/textarea.class.php on line 152
I believe it needs / delimiters around the regex I gave. I've been messing around with the regex in a non-PHP environment, so I forgot to include them.
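A quick sketch of the difference, using a shortened item pattern and a made-up input (both are assumptions for illustration). Without delimiters preg_replace() gives up and returns null, which would also explain the function handing back nothing earlier.

```php
<?php
$input = "[*]one\n[*]two\n[/list]";

// No delimiters: PHP reads the leading backslash as the delimiter, raises the
// warning quoted above, and preg_replace() returns null. (@ hides the warning here.)
$broken = @preg_replace('\[\*\](.+?)\n?(?=\[\*\]|\[\/list\])', '<li>$1</li>', $input);
var_dump($broken); // NULL

// Same pattern wrapped in / ... / with the s modifier after the closing delimiter.
$fixed = preg_replace('/\[\*\](.+?)\n?(?=\[\*\]|\[\/list\])/s', '<li>$1</li>', $input);
echo $fixed, "\n"; // <li>one</li><li>two</li>[/list]
```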
A note on the ol and ul regexes: they currently won't be able to handle nested lists. It is uncommon for people to use them, though admittedly I have here on Mac Rumors. A single regex won't be able to handle it, but I think a regex with a loop may be able to do it. You may also want to download someone else's BBCode parser to see how they handle it. (A quick Google showed many parsers just ignored lists. The wimps :))
Essentially you could use,
/\[list\]([\w\W]+)\[\/list\]/
with a replace and loop over it until no match is found. At least that would work in theory. I have my doubts whether it would hold up when multiple lists are used in the post. It may need a fancier regex.
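Here is a rough sketch of that loop idea, under a few assumptions: the function name convert_nested_lists is made up, and instead of the greedy regex above it uses a stricter pattern that only matches a list with no other [list] inside it, so separate lists in the same post are not merged.

```php
<?php
// Hypothetical helper: convert [list]...[/list] pairs from the inside out.
function convert_nested_lists($value)
{
    // (?:(?!\[list\]).)*? refuses to run past another [list], so only the
    // innermost lists (those with no [list] inside them) match on each pass.
    $innermost = '/\[list\]((?:(?!\[list\]).)*?)\[\/list\]/is';
    do {
        $value = preg_replace($innermost, '<ul>$1</ul>', $value, -1, $count);
    } while ($count > 0);
    return $value;
}

echo convert_nested_lists('[list]a[list]b[/list]c[/list] and [list]d[/list]');
// prints: <ul>a<ul>b</ul>c</ul> and <ul>d</ul>
```

Each pass peels off one level of nesting, and the loop stops once a pass replaces nothing.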
babyjenniferLB
May 4, 2009, 01:24 PM
// list element //
$pattern[7] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/list\])))/';
$replace[7] = "<li>$1</li>";
// Numbered list //
$pattern[8] = "/\[list=1\](.*?)\[\/LIST\]/is";
$replace[8] = "<ol>$1</ol>";
// unordered list //
$pattern[9] = "/\[LIST\](.*?)\[\/LIST\]/is";
$replace[9] = "<ul>$1</ul>";
If you try it out (it posts the input formatted), making a list 1 2 3 returns <li>1</li><li>2</li>[ * ]3
try me (http://development.abcomforts.com/blog/1/beginnings/newcomment)
and my other function happily shoves the <ol> etc. in <p> tags >.< need more logics.
private function build_paragraphs()
{
$value = $this->bbtags_html();
// Rebuilding the paragraphs //
$paragraphs = explode("\n", $value);
for ($i = 0; $i < count ($paragraphs); $i++)
{
$paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
}
$value = implode ('', $paragraphs);
// Making the paragraphs HTML
$value = preg_replace('/<(.p?)>/', '<$1>', $value);
// Return the value
return $value;
}
angelwatt
May 4, 2009, 01:30 PM
It seemed to work for me.
Given:
1
2
3
I got back,
<ul>
<p></p><li>1
</li><li>2
</li><li>3
</li></ul>
babyjenniferLB
May 4, 2009, 01:35 PM
with the same data i got back
<p><ul>
</p><p><li>1
</li><li>2
</li> 3
</p><p></ul></p>
Ah i found the issue, the bbcode i am using is uppercase, but the actual replace is lower case
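A small sketch of what the modifiers after the closing / change, with made-up sample strings: i makes the match ignore case, and s lets the dot run across newlines.

```php
<?php
// Without i, the lower-case pattern does not match upper-case BBCode.
var_dump(preg_match('/\[\/list\]/', '[/LIST]'));   // int(0)
var_dump(preg_match('/\[\/list\]/i', '[/LIST]'));  // int(1)

// Without s, the dot stops at a newline, so a multi-line list is missed.
var_dump(preg_match('/\[list\](.*?)\[\/list\]/', "[list]\nx\n[/list]"));   // int(0)
var_dump(preg_match('/\[list\](.*?)\[\/list\]/s', "[list]\nx\n[/list]"));  // int(1)
```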
angelwatt
May 4, 2009, 01:37 PM
Hmm, interesting. For what it's worth I'm on Windows XP using Firefox 3.0.10.
Are you doing the paragraph part before or after the BBCode replacements?
babyjenniferLB
May 4, 2009, 01:43 PM
Hmm, interesting. For what it's worth I'm on Windows XP using Firefox 3.0.10.
Are you doing the paragraph part before or after the BBCode replacements?
The paragraph bit happens after the BBCode. I have tried it both ways; here is the full class for you to have a noodle. I got the list thing working by adding is at the end of the pattern; I think the i means ignore case.
<?php
/************************************************************************************
Kittenbunny CMS
Filename: textarea.class.php
Class: Validate Textarea
************************************************************************************/
class textarea
{
// Variables
public $post;
// Function to remove html tags //
private function sanitize_htmltags()
{
$value = $this->post;
// Sets the allowed tags //
$value = preg_replace('/<\/?(?:\b(?!)[^>]+?)>/i', '', $value);
// Does the htmlspecialchars bit //
$value = htmlspecialchars($value, ENT_QUOTES, "UTF-8");
return $value;
}
// Adds paragraph tags to the html
private function build_paragraphs()
{
$value = $this->bbtags_html();
// Rebuilding the paragraphs //
$paragraphs = explode("\n", $value);
for ($i = 0; $i < count ($paragraphs); $i++)
{
$paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
}
$value = implode ('', $paragraphs);
// Making the paragraphs HTML
$value = preg_replace('/<(.p?)>/', '<$1>', $value);
// Return the value
return $value;
}
// Function to convert the BBCode tags to HTML. //
private function bbtags_html()
{
// Variables
$value = $this->sanitize_htmltags();
/*
Formatting Tags
*/
// Bold //
$pattern[0] = "/\[b\](.*?)\[\/b\]/is";
$replace[0] = "<strong>$1</strong>";
// Italic //
$pattern[1] = "/\[i\](.*?)\[\/i\]/is";
$replace[1] = "<i>$1</i>";
// Underlined //
$pattern[2] = "/\[u\](.*?)\[\/u\]/is";
$replace[2] = "<u>$1</u>";
// url //
$pattern[3] = "/\[url\](.*?)\[\/url\]/is";
$replace[3] = "<a href=\"$1\">$1</a>";
// img //
$pattern[4] = "/\[img\](.*?)\[\/img\]/is";
$replace[4] = "<img src=\"$1\" alt=\"$1\" />";
// quote //
$pattern[5] = "/\[quote\](.*?)\[\/quote\]/is";
$replace[5] = "<div class=\"quote\">$1</div>";
// code //
$pattern[6] = "/\[code\](.*?)\[\/code\]/is";
$replace[6] = "<div class=\"code\">$1</div>";
// list element //
$pattern[7] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:\[\/LIST\])))/is';
$replace[7] = "<li>$1</li>";
// Numbered list //
$pattern[8] = "/\[LIST=1\](.*?)\[\/LIST\]/is";
$replace[8] = "<ol>$1</ol>";
// unordered list //
$pattern[9] = "/\[LIST\](.*?)\[\/LIST\]/is";
$replace[9] = "<ul>$1</ul>";
/*
Image Tags
*/
// smile
$pattern[10] = '/\:\)/';
$replace[10] = '<img src="/include/class/bbeditor/images/smilies/emoticon_smile.png" alt=":)" />';
// tongue
$pattern[11] = '/\:p/';
$replace[11] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';
$pattern[12] = '/\:P/';
$replace[12] = '<img src="/include/class/bbeditor/images/smilies/emoticon_tongue.png" alt=":p" />';
// happy
$pattern[13] = '/\:d/';
$replace[13] = '<img src="/include/class/bbeditor/images/smilies/emoticon_happy.png" alt=":D" />';
$pattern[14] = '/\:D/';
$replace[14] = '<img src="/include/class/bbeditor/images/smilies/emoticon_happy.png" alt=":D" />';
// Kitty Smile
$pattern[15] = '/\:3/';
$replace[15] = '<img src="/include/class/bbeditor/images/smilies/emoticon_waii.png" alt=":3" />';
// grin
$pattern[16] = '/\:grin\:/';
$replace[16] = '<img src="/include/class/bbeditor/images/smilies/emoticon_grin.png" alt=":grin:" />';
$pattern[17] = '/\:GRIN\:/';
$replace[17] = '<img src="/include/class/bbeditor/images/smilies/emoticon_grin.png" alt=":grin:" />';
// wink
$pattern[18] = '/\;\)/';
$replace[18] = '<img src="/include/class/bbeditor/images/smilies/emoticon_wink.png" alt=";)" />';
// twisted
$pattern[19] = '/\:twisted\:/';
$replace[19] = '<img src="/include/class/bbeditor/images/smilies/emoticon_evilgrin.png" alt=":twisted:" />';
$pattern[20] = '/\:TWISTED\:/';
$replace[20] = '<img src="/include/class/bbeditor/images/smilies/emoticon_evilgrin.png" alt=":twisted:" />';
// surprised
$pattern[21] = '/\:o/';
$replace[21] = '<img src="/include/class/bbeditor/images/smilies/emoticon_surprised.png" alt=":o" />';
$pattern[22] = '/\:O/';
$replace[22] = '<img src="/include/class/bbeditor/images/smilies/emoticon_surprised.png" alt=":o" />';
// sad
$pattern[23] = '/\:\(/';
$replace[23] = '<img src="/include/class/bbeditor/images/smilies/emoticon_unhappy.png" alt=":(" />';
/*
Output
*/
$value = preg_replace($pattern, $replace, $value);
return $value;
}
// Function to check for errors. //
private function error_checking()
{
$value = $this->sanitize_htmltags();
// Trimming excess space from the value. //
if(!$value || strlen($value = trim($value)) == 0)
{
// If the value is empty. //
return "The post is empty.";
}
else if (preg_match('/[\w]{1,}/', $value))
{
// Checking if the value is a number. //
if (is_numeric($value))
{
// Sets the is numeric error. //
return "The post is numeric; it must be alpha-numeric.";
}
// The value is not a number so lets proceed. //
else
{
// Now making sure the string is not too short, to avoid laziness //
if (strlen($value) < 5)
{
// Sets the error as too short. //
return "The post is too short. It must be at least 5 characters.";
}
else if (strlen($value) > 12500)
{
// Sets the error as too long. //
return "The post is too long. It must be at most 12500 characters.";
}
}
}
}
// Function to return any errors. //
public function error_return()
{
return $this->error_checking();
}
// Function to return the post with bbcode. html tags removed. //
public function return_bbcode()
{
return $this->sanitize_htmltags();
}
// Function to return the post in html format for display. //
public function return_formatedhtml()
{
return $this->build_paragraphs();
}
}
?>
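One thing that may be worth checking in the class above: htmlspecialchars() runs before the smiley patterns, and an escaped quote ends in a semicolon, so text like "hi") could pick up a wink. A minimal sketch of one way around it, assuming smilies should only count at the start of the text or after whitespace (the function name wink_smiley_sketch is made up):

```php
<?php
// Hypothetical helper; the image path is the one used in the class above.
function wink_smiley_sketch($value)
{
    $img = '<img src="/include/class/bbeditor/images/smilies/emoticon_wink.png" alt=";)" />';
    // (?<!\S) only lets the smiley match at the start of the text or after whitespace.
    return preg_replace('/(?<!\S);\)/', $img, $value);
}

// htmlspecialchars() turns the closing quote into &quot; which ends in ";",
// so a plain /;\)/ would see a wink inside &quot;)
$escaped = htmlspecialchars('(she said "hi")', ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";                      // (she said &quot;hi&quot;)
echo wink_smiley_sketch($escaped), "\n";  // unchanged
echo wink_smiley_sketch('see you ;)'), "\n";
```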
angelwatt
May 4, 2009, 10:11 PM
How attached are you to using list=1 for ol? It makes the regex quite difficult.
Here's an update for the paragraph code. I just added an if statement. It checks whether the line starts with a tag, and if so doesn't make it a paragraph. This stops it from wrapping ul/ol in a paragraph. It would also skip any line that starts with another tag, like strong. This can be fine-tuned further if you like.
// Check whether the line starts with a tag
if (substr($paragraphs[$i], 0,1) != '<') {
$paragraphs[$i] = '<p>' . $paragraphs[$i] . '</p>';
}
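As one possible way to fine-tune it, here is a sketch that skips only lines starting with a block-level tag, so a line beginning with strong still gets its paragraph. The function name and the tag list are assumptions, not part of the class.

```php
<?php
// Hypothetical helper: wrap lines in <p> unless they start with a block-level tag.
function wrap_paragraphs_sketch($value)
{
    $out = '';
    foreach (explode("\n", $value) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // nothing to wrap
        }
        // The tag list here is an assumption; extend it to suit your markup.
        if (preg_match('/^<\/?(?:ul|ol|li|div)\b/i', $line)) {
            $out .= $line;                  // leave block-level markup alone
        } else {
            $out .= '<p>' . $line . '</p>'; // plain text or inline tags
        }
    }
    return $out;
}

echo wrap_paragraphs_sketch("Intro\n<ul>\n<li>1</li>\n</ul>\n<strong>bold</strong> text");
// prints: <p>Intro</p><ul><li>1</li></ul><p><strong>bold</strong> text</p>
```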
On the pattern/replace arrays, you don't need to give an index as you're adding to them, so you can have:
$pattern[] = '/.*/';
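Going one step further, the pattern and its replacement can sit side by side in a single array. This is only a sketch of that style; the function name bbcode_inline_sketch is hypothetical and only three of the tags are shown.

```php
<?php
// Hypothetical helper showing the index-free style with a pattern => replacement map.
function bbcode_inline_sketch($value)
{
    $rules = array(
        '/\[b\](.*?)\[\/b\]/is' => '<strong>$1</strong>',
        '/\[i\](.*?)\[\/i\]/is' => '<i>$1</i>',
        '/\[u\](.*?)\[\/u\]/is' => '<u>$1</u>',
    );
    // Rules run in the order they are listed, so reordering is cut and paste.
    return preg_replace(array_keys($rules), array_values($rules), $value);
}

echo bbcode_inline_sketch('[b]bold[/b] and [I]italic[/I]');
// prints: <strong>bold</strong> and <i>italic</i>
```

Keeping each pair on one line makes it harder for a pattern and its replacement to drift apart when rules are inserted or removed.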
Here's patterns/replacements for the ul, ol, and li,
// unordered list //
$pattern[] = '/\[(\/?)list\]/is';
$replace[] = '<$1ul>';
// Numbered list //
$pattern[] = '/\[(\/?)ol\]/is';
$replace[] = '<$1ol>';
// list element //
$pattern[] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:<\/ul>)|(?:<\/ol>)|(?:<li>)))/is';
$replace[] = "<li>$1</li>\n";
It assumes you're up for changing the way ordered lists are done.
The last piece is for handling nested lists, which gets executed right before all of the other replacements.
// handle nested lists inside list items
$value = preg_replace('/\[\*\]([^\*]+?(?:\[list\]|\[ol\]).*?(?:\[\/list\]|\[\/ol\]))/is', '<li>$1</li>', $value);
$value = preg_replace($pattern, $replace, $value);
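Putting the three steps together, a minimal end-to-end sketch might look like the following. The wrapper name lists_to_html_sketch and the sample input are assumptions, and I have only walked through it with one level of nesting.

```php
<?php
// Hypothetical wrapper around the three steps above. The item lookahead here
// lists both </ul> and </ol> as closing tags.
function lists_to_html_sketch($value)
{
    // Step 1: wrap items that contain a nested list before anything else runs.
    $value = preg_replace(
        '/\[\*\]([^\*]+?(?:\[list\]|\[ol\]).*?(?:\[\/list\]|\[\/ol\]))/is',
        '<li>$1</li>',
        $value
    );

    // Step 2: list tags, then step 3: the remaining plain items.
    $pattern = array();
    $replace = array();
    $pattern[] = '/\[(\/?)list\]/is';
    $replace[] = '<$1ul>';
    $pattern[] = '/\[(\/?)ol\]/is';
    $replace[] = '<$1ol>';
    $pattern[] = '/\[\*\]([\w\W]+?)\n?(?=(?:(?:\[\*\])|(?:<\/ul>)|(?:<\/ol>)|(?:<li>)))/is';
    $replace[] = "<li>$1</li>\n";

    return preg_replace($pattern, $replace, $value);
}

$input = "[list]\n[*]one\n[*]two\n[list]\n[*]inner\n[/list]\n[*]three\n[/list]";
echo lists_to_html_sketch($input);
```

For this input the inner list ends up inside the li for "two", with "one" and "three" as ordinary items around it.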