Help with RegEx expression

Discussion in 'Mac Programming' started by macfaninpdx, Apr 11, 2007.

  1. macrumors regular

    Mar 6, 2007
    I am wondering if anyone can help me form a Unix RegEx command that will parse a text file. I am familiar with Unix commands, but the regular expression complexity always seems to evade my comprehension.

    Here is what I want to do:
    Read a text file (see file format below) and load into some javascript arrays. The text file looks like this:
    A bunch of text that can be ignored is at the top of the file.
    Blah blah blah.
    Filepath: [COLOR="Red"]/full/path/to/a/filename.ext[/COLOR]
    Filename: [COLOR="red"]filename[/COLOR]
    Title: filename [COLOR="red"]"Some Title"[/COLOR]
    File Contents:
    A bunch of other stuff I can ignore.
    Filepath: [COLOR="Blue"]/full/path/to/a/filename2.ext[/COLOR]
    Filename: [COLOR="Blue"]filename2[/COLOR]
    Title: filename [COLOR="Blue"]"Some Other Title"[/COLOR]
    File Contents:
    A bunch of other stuff I can ignore.
    So the end result will be that I will have three arrays. The "Red" text above will be the first element of the arrays, "Blue" will be second, etc. It will look like this:
    array1[0] = /full/path/to/a/filename.ext;
    array1[1] = /full/path/to/a/filename2.ext;
    array2[0] = filename;
    array2[1] = filename2;
    array3[0] = "Some Title"
    array3[1] = "Some Other Title"
    I realize that I am in need of a lot of help, so I appreciate any advice you can give. I also realize that a small script, or at least a couple of commands, will probably be necessary. I am poring over the man pages now, and scouring the web for examples as you read this.

    But if it is simple enough for someone to reply, I would tremendously appreciate it. :)

    Thanks in advance!
  2. macrumors 68000


    Jun 6, 2003
    District of Columbia
    Do you want the script to generate JavaScript code, or do you want it to generate JavaScript arrays?

    In either case, the regexes are pretty simple. These are Perl-style regexes:

    /^Filepath: (.*)$/
    /^Filename: (.*)$/
    /^Title: (.*)$/

    Explanation of the first one (the other two are very similar): Match a line that begins with the text "Filepath: " and capture the part between the ": " and the end of the line.

    In Perl, for example, after executing "/^Filepath: (.*)$/" against a matching line, the captured part of that line would be available in a variable called $1. I imagine there is something similar in JavaScript too. I still don't quite understand what you're trying to do.
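
    For reference, JavaScript does offer something similar: `RegExp.prototype.exec` returns an array whose element 1 holds the first captured group. Below is a minimal sketch, assuming the file contents are already available as a string. The helper name `collectField` and the sample text are made up for illustration. The `m` flag is needed so that `^` and `$` match at line boundaries rather than only at the ends of the whole string.

```javascript
// Hypothetical helper: collect every captured value for a given label.
// The label is inserted into the pattern as-is, so it is assumed to
// contain no regex metacharacters.
function collectField(text, label) {
  // "g" finds every match; "m" makes ^ and $ match at each line.
  var re = new RegExp("^" + label + ": (.*)$", "gm");
  var values = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    values.push(m[1]); // m[1] plays the role of Perl's $1
  }
  return values;
}

// Sample input modeled on the format described in the first post.
var sample = [
  "Blah blah blah.",
  "Filepath: /full/path/to/a/filename.ext",
  "Filename: filename",
  "Title: filename \"Some Title\"",
  "File Contents:",
  "Filepath: /full/path/to/a/filename2.ext",
  "Filename: filename2",
  "Title: filename \"Some Other Title\"",
  "File Contents:"
].join("\n");

var array1 = collectField(sample, "Filepath");
var array2 = collectField(sample, "Filename");
var array3 = collectField(sample, "Title");
```

    Note that with these patterns the Title capture holds everything after "Title: ", including the leading filename, so a tighter pattern would be needed to keep only the quoted part.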
  3. thread starter macrumors regular

    Mar 6, 2007
    It might make more sense if I give the bigger picture. I am creating a Widget. In the JavaScript code of the widget, I will be using a Unix shell command to read and parse a text file. The stdout will be stored in a JavaScript variable, or a JavaScript array, whichever is easier depending on the output.

    So, in a Widget JavaScript I can do something like this:
    var myresult = widget.system("/bin/egrep '<regex expression here>' ~/inputfile.txt").outputString;
    I will look at the Perl expressions you listed. If the variable myresult (above) is a text string, I can split it into an array, which will be perfect.

  4. thread starter macrumors regular

    Mar 6, 2007
    Got it working

    OK, using the perl suggestion above, I came up with the following:
    var myresult = widget.system("/usr/bin/egrep '^Filepath: (.*)$' inputfile.txt", null).outputString;
    var myarray = myresult.split(/(\n)/);
    Thanks for your help. The reason I was having so many problems in the first place was that the text file had Mac line endings instead of Unix, so I had to use tr to convert it. Then my results were much easier to understand. ;)
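
    The line-ending problem could also be handled on the JavaScript side. Here is a minimal sketch, assuming the raw file text is available as a string. The name `normalizeNewlines` and the sample text are made up for illustration. Classic Mac files end each line with a bare carriage return (`\r`), which line-oriented tools such as grep do not treat as a line break, so the whole file looks like one long line.

```javascript
// Hypothetical helper: convert Windows (\r\n) and classic Mac (\r)
// line endings to Unix (\n).
function normalizeNewlines(text) {
  return text.replace(/\r\n?/g, "\n");
}

// Sample text with classic Mac line endings.
var macText = "Blah blah blah.\rFilepath: /full/path/to/a/filename.ext\rFilename: filename\r";
var lines = normalizeNewlines(macText).split("\n");
// lines[1] is now "Filepath: /full/path/to/a/filename.ext"
```

    On the shell side, the tr conversion was presumably something along the lines of `tr '\r' '\n' < inputfile.txt`, though the exact command used is not shown here.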

    One more question: The output above returns the entire line, including the "Filepath: " text. Is there a way I can return the line not including the "Filepath: "? I can always clean it up afterward using a replace, but I am always trying to learn more.
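
    Two possible approaches, sketched under assumptions rather than tested in a widget. On the shell side, `sed -n 's/^Filepath: //p' inputfile.txt` prints only the lines where the substitution succeeded, with the prefix already removed, so it could stand in for the egrep call. On the JavaScript side, the prefix can be stripped while building the array. The helper `stripPrefix` and the sample output string below are made up for illustration.

```javascript
// Hypothetical helper: turn grep-style output into an array of values
// with the leading label (e.g. "Filepath: ") removed.
function stripPrefix(output, prefix) {
  return output
    .split("\n")
    .filter(function (line) { return line.length > 0; }) // drop the trailing empty entry
    .map(function (line) {
      // Remove the prefix only when it is at the start of the line.
      return line.indexOf(prefix) === 0 ? line.slice(prefix.length) : line;
    });
}

// Stand-in for what widget.system(...).outputString might contain.
var myresult = "Filepath: /full/path/to/a/filename.ext\nFilepath: /full/path/to/a/filename2.ext\n";
var myarray = stripPrefix(myresult, "Filepath: ");
// myarray[0] is "/full/path/to/a/filename.ext"
```

    Splitting on a plain "\n" instead of /(\n)/ also matters: when the separator regex contains a capturing group, the captured newline characters are included in the resulting array as extra elements.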

