Sed replace whitespace

Discussion in 'Mac Programming' started by Ti_Poussin, Aug 28, 2008.

  1. macrumors regular

    Ti_Poussin

    Joined:
    May 6, 2005
    #1
    I'm having a problem with sed on OS X 10.5. I would like to remove leading whitespace from a file.

    I tried these two sed commands:
    sed -e 's/^[:space:]+//' file1.txt
    sed -e 's/^[ \t]+//' file1.txt

    With and without -e and -E. Nothing seems to do it. What am I missing here?!? Is the OS X sed standard?
     
  2. macrumors 68040

    lee1210

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #2
    + for one or more of a pattern wasn't working in my copy. I'm not a sed guy, so I'm not sure if that's standard or not. Just double it up:
    Code:
    sed -e 's/^[ \t][ \t]*//' file.txt
    or, really:
    Code:
    sed -e 's/^[ \t]*//' file.txt
    Since... what's the worst that happens? It replaces the beginning of the line with nothing?

    -Lee
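
A minimal sketch of the two spellings of "one or more" discussed above, using spaces only so the tab question stays out of it. The function names are made up for illustration, and the -E variant assumes a sed that accepts that flag:

```shell
# Basic regex: "one or more spaces" spelled as one space followed by " *".
strip_bre() {
    sed 's/^  *//'
}

# Extended regex: + means "one or more".
# Assumption: this sed accepts -E (BSD sed does; recent GNU sed does too).
strip_ere() {
    sed -E 's/^ +//'
}

# Both calls should strip the three leading spaces and leave "none" alone.
printf '   three spaces\nnone\n' | strip_bre
printf '   three spaces\nnone\n' | strip_ere
```

Without -E, sed treats the pattern as a basic regex, where an unescaped + is just a literal plus sign.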
     
  3. macrumors 601

    HiRez

    Joined:
    Jan 6, 2004
    Location:
    Western US
    #3
    Try:

    Code:
    sed 's/^[ \t]*//'
    EDIT: Rats...beat to it :p
     
  4. thread starter macrumors regular

    Ti_Poussin

    Joined:
    May 6, 2005
    #4
    No, it looks like the old sed version that comes with OS X doesn't know what a tab is and doesn't match it. The only solution so far is to install a real sed with Fink, which solves the problem for me, but I need to distribute this to the other users in my fleet of machines, so that's not cool at all.

    Really, when it comes to implementing Unix tools, Apple s**ks big time. I'm really getting tired of always having a weird install path to support and tools that behave oddly, are old crap, or have a special implementation.

    Is the best solution a good Linux in VMware, or what?!

    Right now I'm trying to escape the tab char with something like \'$'\t'', sadly without much success so far.
     
  5. thread starter macrumors regular

    Ti_Poussin

    Joined:
    May 6, 2005
    #5
    Replacing the + with * doesn't work either. Anyway, replacing 0 occurrences of a space with nothing won't do much.
     
  6. macrumors 68040

    lee1210

    Joined:
    Jan 10, 2005
    Location:
    Dallas, TX
    #6
    You're using a different OS X than I'm using, then. OS X is a UNIX, not Linux. I suppose that could be viewed as a problem, but it generally just takes a little adjusting to the BSD-style switches to commands rather than the GNU/Linux versions.

    I certainly didn't make up the regex without verifying its functionality. If you need GNU versions of tools you can build them or use Fink, but in this particular case I can't imagine why this wouldn't be working on your system.

    Edit: I was wrong about the tab. I could clean this up with yet another edit, but that would be dishonest. There was a simple solution to this, and from the docs this does seem to be standard behavior.

    -Lee

    Edit: Also, * means 0 or more occurrences, not just 0 occurrences. I also verified that the regex works when the tab is typed literally: press Ctrl+V, then Tab.

    Edit: ... Oops. The file I was using had only been indented with spaces. \t does not work. Apologies for the mistake. Using a literal tab character does work without issue. From this page:
    http://www.cims.nyu.edu/cgi-systems/info2html?(sed)Regular%20Expressions

    So literal tab it is, no \t
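
For scripts, where typing Ctrl+V Tab is awkward, here is a hedged sketch of two ways to get the same effect without \t in the regex. The function names are hypothetical. The second variant also points at a likely second problem with the first attempt in post #1: [:space:] is only a character class inside a bracket expression, so it needs double brackets.

```shell
# Hypothetical helper: strip leading spaces and tabs from each line of stdin
# without relying on \t inside the regex.
strip_leading_blanks() {
    # printf understands \t even when sed does not, so build the tab here.
    tab=$(printf '\t')
    sed "s/^[ ${tab}]*//"
}

# Same idea with the POSIX character class. Note the double brackets:
# a bare [:space:] matches the characters ':', 's', 'p', 'a', 'c', 'e'.
strip_leading_space_class() {
    sed 's/^[[:space:]]*//'
}

# Sample input made up for illustration: a tab-indented line,
# a space-indented line, and an unindented line.
printf '\t  indented with a tab\n    indented with spaces\nnot indented\n' |
    strip_leading_blanks
```

Both variants stick to basic regex syntax, so they should not depend on which sed is installed.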
     
  7. macrumors newbie

    RidgeRacerType4

    Joined:
    May 11, 2005
    Location:
    San Antonio. TX
    #7
    Correct, \t is not part of all sets of regular expressions. (I believe '\t' and '+' are used in Perl's extended version of regular expressions.)

    When in doubt I always refer to this place: http://www.grymoire.com/Unix/Regular.html

    They've got a convenient chart at the bottom of the page.

    They've also got sed and awk tutorials: http://www.grymoire.com/Unix/

    Got me through systems programming :D
     
  8. thread starter macrumors regular

    Ti_Poussin

    Joined:
    May 6, 2005
    #8
    Yeah, the literal tab should work (note that \x09 doesn't work either), though I haven't tested it thoroughly. I found another solution in the meantime:

    expand -t4 oldFile > newfile
    sed ...

    What bugs me the most is that it works in recent versions of sed; Apple just threw us an old 2005 version of sed. The GNU version does indeed support it, and so does the recent BSD version. Many tools that come with OS X are really outdated; I had to update many of them for compatibility with others at work. I wish they would release updates for those tools from time to time.
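
A rough sketch of how the expand workaround above might be completed. The sed step is elided in the post, so the space-stripping expression here is an assumption, and the function name is made up:

```shell
# Hypothetical helper: turn tabs into spaces first, then strip leading spaces.
# The tab width of 4 follows the expand -t4 call in the post.
strip_after_expand() {
    # After expand there are no tabs left, so the regex only needs a space.
    expand -t 4 "$1" | sed 's/^ *//'
}

# Demo on a throwaway file with tab-indented and space-indented lines.
tmp=$(mktemp)
printf '\tone\n\t\ttwo\n  three\n' > "$tmp"
strip_after_expand "$tmp"
rm -f "$tmp"
```

One caveat: expand converts every tab on the line, not just the leading ones, so tabs inside the text end up as spaces too.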
     
  9. macrumors 6502

    Joined:
    Jul 18, 2002
    Location:
    Orange County, CA
    #9
    I don't know if this will help, and I'm not even 100% sure that I have this right, because I'm just learning regular expressions today, but escaping the + sign in a regular expression on OS X 10.5.4 seems to make it mean "1 or more occurrences". Without the escape, it seems to take it literally as a plus sign.

    In other words, to get a list of all lines with a number in it, I had to specify

    Code:
    grep '[0-9]\+' testfile
    instead of

    Code:
    grep '[0-9]+' testfile
    which is backwards from how I understood it was supposed to work.
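
That matches the split between basic and extended regular expressions: grep uses basic syntax by default, where a bare + is a literal plus sign, and -E switches to extended syntax, where + means "one or more". The escaped \+ form is an extension rather than something POSIX guarantees. A small sketch, with sample input made up for illustration:

```shell
# Made-up sample input: letters, digits, and two lines containing "+".
sample() {
    printf 'abc\n123\na+b\nx9+\n'
}

# Basic regex: + is literal, so this only matches a digit followed by "+".
sample | grep '[0-9]+'

# Extended regex (-E): + means "one or more digits".
sample | grep -E '[0-9]+'

# Portable basic-regex spelling of "one or more digits".
sample | grep '[0-9][0-9]*'
```

The first command prints only the x9+ line, while the other two print both 123 and x9+.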
     
  10. macrumors 6502a

    Sayer

    Joined:
    Jan 4, 2002
    Location:
    Austin, TX
    #10
    Saying that OS X is non-standard for UNIX while Linux is magically compatible is nonsense.

    Look at the configure/make files for any distro and you'll see just as much "hand holding" for non-standard paths on OSes other than Mac OS X (like, say, Linux).

    And installing via Fink is not a "standard path", uhm, OK. On this computer I have "special" paths for Fink, MacPorts and Mac OS X's UNIX layer (Darwin). It's a big freaking mess trying to be "open" no matter what OS you use.
     
