method to extract data from multiple csv files

Discussion in 'Mac Programming' started by shaunb83, Jun 8, 2010.

  1. shaunb83 macrumors member

    Dec 11, 2008

    Let me preface this by saying that I'm a programming idiot, mostly used to a little bit of visual basic and matlab stuff - nothing at all advanced. I was just wondering if anyone could give me some ideas about how I could simply pull this off. I currently do the following manually in excel, which takes me about 1 minute per file, and I have to do about 50 files a day - so this is something that I would like to solve, but isn't worth days of my life.

    I have multiple, identical comma delimited txt files generated from an analytical instrument. I need to extract data from specific parts of the csv file, compute some simple statistics (average, standard deviation) on that data, and output the statistics and the file name to a separate file.

    What I want to be able to do is the following:

    (1) Be able to select multiple comma delimited text files in a directory that I want to crunch the numbers on.

    (2) Have the macro calculate an average and standard deviation on the values in particular columns (it will be the same set of columns) in each file

    (3) Write out that data (average and standard deviation for each column) to a separate spreadsheet (or the active workbook or whatever) as a row of data from each file, with the averages and accompanying standard deviations, and also write out the name of the file that the data was generated from.

    What I have done so far is fairly inelegant I think - basically I have a macro that will insert each file into a separate worksheet in an excel workbook which I then need to manually crunch the numbers on.

    Thanks for any advice or suggestions about the best way to do this.
  2. mrbash macrumors 6502

    Aug 10, 2008
    ignoring the header, read each line as a string. Then tokenize the line with the "," character. You should then have an array of strings. The index of wich corresponds to the columns in each file.

    You don't even need to write a program to do this You can use awk and sed and some shell scripting.

    awk -F "*,*" '{print $3}' file.csv

    The above would print out the 3rd column of file.csv. You can pipe that to wc or sum to get the numbers that you want.

Share This Page