Using Grep patterns in TextWrangler

Discussion in 'Mac Apps and Mac App Store' started by namygirl, Feb 16, 2007.

  1. namygirl macrumors newbie

    Joined:
    Feb 16, 2007
    #1
    Hi,

    I'm new to using TextWrangler and want to find out how to use a grep search pattern to extract the first line from each paragraph in my data files.

    The files consist of anywhere between 100 and 3000 sets of data. Each paragraph is preceded by a single blank line and consists of between 3 and 5 lines, each with a line return at the end. The files are in .rtf format. I have tried several different grep patterns to extract the first line from each paragraph, but so far have not had any success.

    If anyone has any suggestions I would be exceedingly grateful. I have worked out that this could save me 200 hours of work!

    Thanks,

    Amy
     
  2. djdawson macrumors member


    Joined:
    Apr 28, 2005
    Location:
    Minnesota
    #2
    I did some testing with .rtf files (created in TextEdit, so they might be different from yours) and it looks like they precede carriage return characters (CR) with the backslash character, "\". In grep, you can use the two-character shorthand "\r" to match a CR, and the pattern "\\" to match a single "\" character. Since a blank line is normally just two consecutive carriage returns, a blank line in a .rtf document appears in TextWrangler as a line with just a single "\" on it. You could match such a line in TextWrangler with this pattern:

    ^\\\r

    This means the beginning of a line (^) followed by a "\" (\\) followed by a carriage return (\r).

    To get the next line, you'd have to add a pattern that will match anything that's not a carriage return, which is this pattern:

    [^\r]*

    Combined, this results in this grep search pattern:

    ^\\\r[^\r]*
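
    If you want to try the same idea outside TextWrangler, here is a minimal Python sketch. It is an adaptation, not the TextWrangler pattern itself: it assumes the file's line breaks arrive as "\n" rather than "\r", the sample text is made up, and a capture group is added so only the first line is kept (with its trailing RTF backslash stripped).

```python
import re

# Made-up RTF-like sample: each line ends with a backslash, so a blank
# line shows up as a line holding only "\".
sample = (
    "\\\n"
    "first line of record 1\\\n"
    "second line of record 1\\\n"
    "\\\n"
    "first line of record 2\\\n"
    "second line of record 2\\\n"
)

# Same shape as ^\\\r[^\r]* but using \n as the line break (assumption),
# with a group around the part that is the paragraph's first line.
pattern = re.compile(r"^\\\n([^\n]*)", re.MULTILINE)

# Drop the trailing backslash that RTF puts before each line break.
first_lines = [line.rstrip("\\") for line in pattern.findall(sample)]
print(first_lines)
```

    Real .rtf files also carry header and formatting codes, so treat this as a starting point rather than something that will work unchanged on your data.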

    I couldn't find a way in TextWrangler to have it output just the matched lines to a file, but there may be a way to do that.

    Personally, I'd try to do this on plain text files instead, since that way you don't have to worry about those pesky backslash characters. If that's an option and you have more questions about doing that, post a follow-up here and I'll try to help.
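
    For what it's worth, here is a rough Python sketch of the plain-text route. It assumes the files have been saved as plain text and that paragraphs are separated by blank lines; the function name first_lines is just something I made up for illustration.

```python
import re

def first_lines(text):
    """Return the first line of every blank-line-separated paragraph."""
    # Normalise Windows (CRLF) and classic Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split wherever one or more blank lines separate two paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Keep only the first line of each non-empty paragraph.
    return [p.split("\n", 1)[0] for p in paragraphs if p]

# Made-up sample data using CR line endings.
sample = "\ra1\ra2\ra3\r\rb1\rb2\rb3\rb4\r"
print(first_lines(sample))
```

    Writing the returned list to a new file, one item per line, would give you just the extracted lines.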

    HTH - Good luck!
     
