Search and replace

Discussion in 'Mac Programming' started by mysterytramp, Oct 22, 2009.

  1. mysterytramp macrumors 65816


    Jul 17, 2008
    This should be an easy project but one that's eaten up way too much time.

    I have about 50 database files (text), between 750 and 850K, where I need to replace the pipe character ("|") for a tab so the files can be easily imported into Excel or any other spreadsheet.

    I was willing to let Automator do the job, even if it would take loads longer, but Automator looks really deficient in the text manipulation department.

    It probably looks like a job for sed or perl, but the quickie tutorials I could find left out how to handle things like the tab character. So I'm doing the weasel thing in asking for help for something I should be able to do myself.

    Can you help a guy out?


  2. HiRez macrumors 603


    Jan 6, 2004
    Western US
    I would just use the free TextWrangler, which has a multifile search and replace function. Using TextWrangler 3.0 (the menus are different in previous versions), select Multi-File Search from the Search menu, use \t (backslash-lowercase-t) to represent the Tab character, then select the folders you want to search in (make sure ONLY the files you want to modify are in this folder and make a copy before you try this).

    Certainly you could use sed, awk, etc. but TextWrangler is easy and it's a great app to have anyway, I use it for all kinds of things.
  3. chown33 macrumors 604

    Aug 9, 2009
    Sailing beyond the sunset
    +1 for TextWrangler.

    If it's a one-for-one replacement of '|' with '\t', also see the 'tr' command. Sed, awk, perl, etc. are good when you want something more than straight translation. But tr is the command of choice for plain translation.
  4. mysterytramp thread starter macrumors 65816


    Jul 17, 2008
    I had my doubts about TextWrangler. Though I use it for a lot of text processing, I didn't think it was the best tool for this job. Whew, definitely way wrong ...

    First thing I did was try to get its AppleScript dictionary. For some reason, the dictionary wouldn't open, but strangely, it's recordable (do people even make recordable apps any more?). That gave me the basics of a script that zipped through the files, as well as 300 or so more. I expect some Unix voodoo would have been even faster, but the performance was adequate.

  5. ChOas macrumors regular

    Nov 24, 2006
    The Netherlands
    I might be a bit late, but from the commandline you can just do this (in the dir where your textfiles are):

    perl -pi'*.org' -e 's/\|/\t/g' *
    This would create a copy of all your files with an .org extension and replace all
    the pipes in the files with a tab.
  6. BertyBoy macrumors 6502

    Feb 1, 2009
    +1 for the one line Unix solutions. In sed.

    for i in *.dat; do sed 's/|/<tab>/g' $i > $i.tsv; done

    where <tab> means to use the tab key.

    You could do something similar in any number of Unix scripting languages.

    OR, you could just import it into Excel, which you've stated you want to do anyway. Or MySQL, Or Oracle, or whatever. All these allow you to define the field separator.
  7. balamw Moderator


    Staff Member

    Aug 16, 2005
    New England
    What's wrong with tr?

    tr '|' '\t'
    seems to do the trick too. So many ways to skin a cat in unix!

    You could wrap that in a for loop like above

    for i in *.dat; do tr '|' '\t' < $i > $i.tsv; done
    I'm sure there's a way to do it in awk as well since both input and output are delimited, and you could probably use find/xargs instead of for depending on where the files are.


Share This Page