Text extraction help




etnad
May 3, 2013, 08:20 AM
Hello, and thanks in advance. I hope someone can suggest an AppleScript to solve this problem:

I need to extract parts of the text from files that sit in several different folders, all contained in one folder, like this:

Folder Main contains Folder A, Folder B, Folder C and so on, which hold the various files. Folder Main is always the one where I need to find my files.

Each plain-text UTF-8 file has the same structure, no matter which folder it is in:

A - Some text of various length
B - This exact sequence, which is the same in every file in every folder: CR Student_ID tab tab CR. It marks the beginning of the text I need.
C - The part I need: everything after the sequence CR Student_ID tab tab CR, up to the end of the file. Its length varies, but that does not matter since I need everything to the end of each file.

How can I write a script that returns a text file containing only "part C" of each file in the various folders?

The text CR Student_ID tab tab CR does not appear anywhere else in the files, so it could serve as the starting point for extracting the text all the way to the end of each file.


Thanks a lot for any help. I have been unable to solve this problem.



Big Dave
May 3, 2013, 07:06 PM
I don't think you need an AppleScript to do this. Here is how I would do it.

Open a terminal.
cd to your Main folder. Ex: cd /Users/yourname/Documents/folder/

Now we want to use find to locate your files.

find ./ -name "*somecommonterm*" -exec sed -n -e '/Student_ID/,$p' {} \;

This command looks for your files and prints to the screen everything from the line containing the search term through the end of each file. (I match on Student_ID alone, since the CR in your description is a line break rather than literal text.) If you want it to write out to a file, then..

find ./ -name "*somecommonterm*" -exec sed -n -e '/Student_ID/,$p' {} \; > outfile

The results from all the files go into that one outfile. Note that > overwrites outfile every time you run the command, so use >> outfile if you want to keep adding to the results from earlier runs.
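
If you want only the text after the marker line (a sed range like /pattern/,$p also prints the matching line itself), here is a rough sketch of one way to do it with awk. Treat it as a starting point, not a tested solution for your files. The function name extract_part_c, the *.txt file pattern, and the LF or CRLF line endings are all my assumptions, so adjust them to match what you actually have.

```shell
#!/bin/sh
# Sketch only. Assumptions not taken from the thread:
#   - the files are named *.txt (hypothetical pattern, change as needed)
#   - lines end in LF or CRLF
#   - the marker is a line holding exactly: Student_ID<tab><tab>
# extract_part_c is a made-up helper name, not a standard tool.
extract_part_c() {
    root=$1   # top folder to search, e.g. your Main folder
    out=$2    # combined output file; keep it outside "$root"
    find "$root" -type f -name '*.txt' -exec awk '
        FNR == 1 { found = 0 }        # new input file: reset the flag
        { sub(/\r$/, "") }            # drop a trailing CR (CRLF files)
        found { print; next }         # past the marker: print the line
        $0 == "Student_ID\t\t" { found = 1 }  # marker line, not printed
    ' {} + > "$out"
}
```

You would call it as something like: extract_part_c /Users/yourname/Documents/folder /Users/yourname/Documents/part_c.txt. The output file is kept outside the searched folder so find does not pick it up as input on a later run.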

-Big Dave

etnad
May 13, 2013, 04:40 PM
Thanks ...