Parsing Data Removing Line Breaks

Discussion in 'Web Design and Development' started by Vitaminwater, Oct 5, 2010.

  1. Vitaminwater macrumors newbie

    Aug 18, 2009
    Hello All,

    Currently, I'm trying to parse some information from another website using file_get_contents. Once I have the source code, I parse out the information I need, which will then be put into a database. However, I'm having some trouble manipulating the last bit of information. Here is an example of how the information I need is being displayed:

    I would like to have the line breaks removed from this string of text. I have tried various combinations of str_replace, preg_replace, etc., but none have done what I want.

    When completed, I would also like to add <br /> to the end of each line (which I should be able to do), so that the plain-text output looks like this:
    Any help is greatly appreciated! Thanks!
  2. angelwatt Moderator emeritus


    Aug 16, 2005
    Can you show what you've tried? I've done this with preg_replace a number of times so I know it's possible. Just need to see where you may be messing up. A function that may help in the later part of this process is the nl2br function.
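
    As a quick illustration of the nl2br function mentioned above, here is a minimal sketch. The sample text is made up, since the actual menu content is not shown in the thread.

```php
<?php
// Hypothetical sample text; the real menu content is not shown in the thread.
$text = "Soup\nSalad\nPasta";

// nl2br() inserts "<br />" before each newline and keeps the newline itself.
$html = nl2br($text);

echo $html;
```

    Note that nl2br does not remove anything, so repeated blank lines in the input would turn into repeated <br /> tags in the output.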
  3. Vitaminwater thread starter macrumors newbie

    Aug 18, 2009
    Well, currently I'm using
    $dinnerMenu = ereg_replace("[\n\r]", "\t", $dinnerMenu);
    $dinnerMenu = ereg_replace("\t\t+", "\n", $dinnerMenu);
    to make it look close enough to what I'm trying to do, but when I try to add the <br /><br /> to the end of each line it messes up. I'm adding the br's with this line of code:
    $dinnerMenu = str_replace("\n", "<br /><br />", $dinnerMenu);
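
    For anyone reading this later: ereg_replace was deprecated in PHP 5.3 and removed in PHP 7. A rough preg_replace equivalent of the two calls above might look like the sketch below. The sample input is an assumption, as the real page source is not included in the thread.

```php
<?php
// Hypothetical sample input with blank lines and stray tabs.
$dinnerMenu = "Soup\r\n\r\n\t\tSalad\n\n\nPasta";

// Same two steps as the ereg_replace calls, written for preg_replace
// (PCRE patterns need delimiters, here "/").
$dinnerMenu = preg_replace('/[\n\r]/', "\t", $dinnerMenu); // every line-break character becomes a tab
$dinnerMenu = preg_replace('/\t\t+/', "\n", $dinnerMenu);  // a run of two or more tabs becomes one newline

echo $dinnerMenu;
```

    One thing to watch with this approach: a lone line break becomes a single tab, which the second pattern does not convert back, so two lines separated by exactly one "\n" end up joined by a tab.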
  4. Darth.Titan macrumors 68030


    Oct 31, 2007
    Austin, TX
    Well, using your example page, repeated newlines aren't the only problem. There are a whole bunch of excess tabs and spaces in there as well. I chose to get rid of any newlines that were at the beginning of a line, then I filtered out the excess tabs and spaces and ran the result through nl2br.

    This seems to get the result you said you wanted in the OP.
    $content = file_get_contents('');

    $cleaned = preg_replace("`^[\n\r]|\t|[ ]{2,}`", "", $content);
    $final = nl2br($cleaned);
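
    The same idea can also be sketched as a small step-by-step function. This is a variant, not the exact code above: the function name cleanMenuText and the sample input are hypothetical, and it assumes PHP 7 or later for the type declarations. It relies on the "m" pattern modifier, without which ^ only matches at the very start of the string rather than at the start of every line.

```php
<?php
// Hypothetical helper; the name and the sample input are illustrative only.
function cleanMenuText(string $content): string
{
    // Normalise Windows and old Mac line endings to "\n".
    $content = str_replace(["\r\n", "\r"], "\n", $content);
    // Drop tabs and runs of two or more spaces.
    $content = preg_replace('/\t| {2,}/', '', $content);
    // With the "m" modifier, ^ matches at the start of every line,
    // so this removes the newline of each empty line.
    $content = preg_replace('/^\n/m', '', $content);
    // Trim leading/trailing whitespace, then add <br /> before each newline.
    return nl2br(trim($content));
}

$sample = "\n\n\t\tSoup of the day\r\n\r\n\r\n    Garden salad\n\n\tPasta\n";
echo cleanMenuText($sample);
```

    Tabs and extra spaces are stripped before the empty-line pass here so that lines containing only tabs or runs of spaces count as empty. A line holding a single space would still survive, so real input may need a stricter pattern.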

  5. Vitaminwater thread starter macrumors newbie

    Aug 18, 2009
    Ah yes, that did it, thank you very much for your help Darth and angelwatt.

    Have a good day.
