Create files from list in CSV

Discussion in 'Mac Programming' started by jasperx, Oct 17, 2012.

  1. macrumors newbie

    Joined:
    Oct 17, 2012
    #1
    I need to create several hundred files from a list exported from excel to a csv.
    The csv looks like this:
    Name1,City1,State1
    Name2,City2,State2
    continues to
    Name600,City600,State600

    The desired result is 600 files like this:
    Name1.txt which contains Name1,City1,State1
    Name2.txt which contains Name2,City2,State2

    My coding skills are quite rusty but it seems like this should be pretty simple
     
  2. macrumors 68000

    Joined:
    Mar 29, 2009
    Location:
    USA
  3. macrumors 6502a

    Joined:
    Mar 17, 2012
    Location:
    Belgium
    #3
    Try this :

    Code:
    -- csv file with Unix (LF) line feed
    set theFile to choose file with prompt "Choose your csv file"
    set theFolder to choose folder with prompt "Choose folder for the txt files"
    set theParagraphs to paragraphs of (read theFile)
    repeat with aParagraph in theParagraphs
    	try
    		set TID to AppleScript's text item delimiters
    		set AppleScript's text item delimiters to {","}
    		set theItems to text items of aParagraph
    		set theName to item 1 of theItems
    		set AppleScript's text item delimiters to TID
    		do shell script "echo " & quoted form of aParagraph & " > " & quoted form of POSIX path of (theFolder as string) & quoted form of theName & ".txt"
    	on error
    		set AppleScript's text item delimiters to TID
    	end try
    end repeat
    
     
  4. macrumors member

    Joined:
    Aug 28, 2008
    #4
    Code:
    #!/bin/bash
    
    for i in $(cat the.csv); do
      name=$(echo "$i"|awk -F',' '{print $1}')
      echo "doing $name"
      echo "$i" >${i}.txt
    done
    
    
     
  5. macrumors 6502

    Joined:
    Mar 8, 2004
    #5
    Obligatory Perl implementation:

    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    open CSV, "<", "filename.csv" or die $!;
    
    while (<CSV>){
      my @line = split(',',$_);
      open OUTFILE, '>'.$line[0].'.csv' or die $!;
      print OUTFILE $_;
      close OUTFILE;
    }
    
    close CSV;
    
     
  6. thread starter macrumors newbie

    Joined:
    Oct 17, 2012
    #6
    OK had a go with the bash version... and something is not quite right
    The code is in a file rosetta which has:
    Code:
    #!/bin/bash
    for i in $(cat rosettaTest.csv); do
      name=$(echo "$i"|awk -F',' '{print $1}')
      echo "doing $name"
      echo "$i" >${i}.txt
    done
    Here is my output

    bash-3.2$ bash rosetta
    rosetta: line 1: {rtf1ansiansicpg1252cocoartf1138cocoasubrtf510: command not found
    rosetta: line 2: syntax error near unexpected token `}'
    rosetta: line 2: `{\fonttbl\f0\fmodern\fcharset0 Courier;}'
    bash-3.2$
     
  7. macrumors 6502a

    Joined:
    Jan 23, 2010
    Location:
    San Diego, CA USA
    #7
    You need to save the bash script as a text file. That looks like RTF. Once that is done, then...

    Code:
    bash> chmod +x rosetta
    bash> ./rosetta
     
  8. macrumors 68000

    Joined:
    Mar 29, 2009
    Location:
    USA
    #8
    Yup, use TextEdit and go to the Format menu (in the menu bar) and choose 'Make plain text'

    Then save the file. Or even better, use something like TextWrangler
     
  9. macrumors 6502

    Joined:
    Mar 8, 2004
    #9
    Or go full Command Line and use nano, vi, or emacs.

    Also if this is a one-off, you could even copy and paste that script into the command line and it should work.
     
  10. thread starter macrumors newbie

    Joined:
    Oct 17, 2012
    #10
    Making it a txt file made a huge difference. The script runs but is not doing as intended. Trying again on what the desired results would be:
    A file created for every record or line in a csv
    Each line or record in the csv has three values:
    Name1,City1,State1

    The desired result is to create a file for each record named by the first value in the record. If I have 10 records the result would be 10 files named:
    Name1.txt, Name2.txt....Name10.txt
    The contents of Name1.txt would be the record or Name1,City1,State1

    I ran this on a csv with 10 records. The output was 3 files... not 10.
    The first file contained records 1 and 2.
    The second file contained records 3,4,5,6
    The third file contained records 7,8,9,10
    Here is the code
    Code:
    #!/bin/bash
    
    for i in $(cat the.csv); do
      name=$(echo "$i"|awk -F',' '{print $1}')
      echo "doing $name"
      echo "$i" >${i}.txt
    done
     
  11. macrumors 603

    Joined:
    Aug 9, 2009
    #11
    Post your CSV data. Otherwise all anyone can do is guess what input produced your stated output.


    There's a bug in the code. Change this:
    Code:
      echo "$i" >${i}.txt
    
    to this:
    Code:
      echo "$i" >"${name}.txt"
    

    My example "the.csv" file (manually created):
    Code:
    able,akron,state
    bravo,baltimore,state
    charlie,city,state
    delta,city,state
    eeevill,eville,state
    foxtrot,tango,whiskey
    golf,glasgow,whisky
    hotel,homer,state
    indigo,indio,state
    juliet,juarez,state
    
     
  12. macrumors 6502a

    Joined:
    Mar 17, 2012
    Location:
    Belgium
    #12
    I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.
     

    Attached Files:

  13. macrumors regular

    Joined:
    Aug 5, 2007
    #13
    Here's an implementation in Python:

    Code:
    import sys
    
    def createFiles(csv):
    	with open(csv, 'r') as f:
    		for line in f:
    			s = line.split(',')
    			with open(s[0] + '.txt', 'w') as w:
    				w.write(s[0] + ',' + s[1] + ',' + s[2])
                	
    createFiles(sys.argv[1])
    
    
    You can use it by running: python createFiles.py <example.csv>
     
  14. macrumors 603

    Joined:
    Aug 9, 2009
    #14
    Here's the all-awk version, which handles spaces in names:
    Code:
    awk -F',' '{ print $0 >($1 ".txt") }'  the.csv
    
    And here's the version that spits out each name:
    Code:
    awk -F',' '{ print "doing " $1 >"/dev/stderr"; print $0 >($1 ".txt") }'  the.csv
    
    So this would be the new contents of "rosetta":
    Code:
    #!/bin/bash
    awk -F',' '{ print $0 >($1 ".txt") }'  the.csv
    
    or replace that awk one-liner with the longer awk one-liner.


    Test data, in "the.csv":
    Code:
    able alfa,akron,state
    baker bravo,baltimore,state
    Charley Gordon,city,state
    delta dawn,city,state
    eeevill twin,eville,state
    foxtrot,tango,whiskey
    golf glove,glasgow,whisky
    hotel,homer,ak
    Inigo Montoya,indio,state
    Juliet is the sun,juarez,state
    
     
  15. Mac_Max, Oct 18, 2012
    Last edited: Oct 18, 2012

    macrumors 6502

    Joined:
    Mar 8, 2004
    #15
    Ah, yes in that case:

    Code:
    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    die "Please provide a filename\n" if @ARGV == 0;
    open CSV, "<", $ARGV[0] or die $!;
    
    while (<CSV>){
      my @line = split(',',$_);
      $line[0] =~ s/ /-/g; #comment this out with if you prefer to keep spaces as spaces.
      open OUTFILE, '>'.$line[0].'.txt' or die $!;
      print OUTFILE $_;
      close OUTFILE;
    }
    
    close CSV;
    
    My original version will handle spaces just fine but this version will also replace them with dashes (or whatever character you'd like, remember to escape regex control chars). It also now accepts the input file as an argument.

    Perl is a great language. It took me about three weeks to become armed and dangerous with it and about 5 months to become rather good at it. Here's the book I learned from: http://onyxneon.com/books/modern_perl/
     
  16. thread starter macrumors newbie

    Joined:
    Oct 17, 2012
    #16
    You guys are having way too much fun... running off and leaving me with my eyes crossed but each time a little closer to the desired result. Stick with me as I think we are getting close:

    This code as corrected and expanded on by chown33 :
    Code:
    #!/bin/bash
    for i in $(cat rosettaTest.csv); do
      name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosettaTest.csv)
      echo "doing $name"
      echo "$i" >${name}.txt
    done
    Acting on this rosettaTest.csv which contains:
    13_01,1.70452E+11,182797
    13_02,1.7045E+11,21676
    13_03,1.7045E+11,22022
    13_04,1.70452E+11,206985
    2_01,1.70452E+11,117549
    2_02,1.7045E+11,47029
    2_03,1.70452E+11,118304
    2_04,1.7045E+11,31579
    2_05,1.7045E+11,30966
    2_06,1.7045E+11,48157
    Output was three (not 10) properly named files:
    2_03.txt
    13_01.txt
    13_03.txt

    13_01.txt contained
    13_01,1.70452E+11,182797 good this is correct
    13_02,1.7045E+11,21676 wrong it should have iterated to create file 13_02.txt

    13_03.txt contained
    13_03,1.7045E+11,22022
    13_04,1.70452E+11,206985
    2_01,1.70452E+11,117549
    2_02,1.7045E+11,47029

    2_03.txt contained
    2_03,1.70452E+11,118304
    2_04,1.7045E+11,31579
    2_05,1.7045E+11,30966
    2_06,1.7045E+11,48157

    The loops are not working right… it should iterate to 13_02 immediately after writing the first record into 13_01… instead it writes two records, skips to the third record writes 4 records skips to the 7th record.

    I got nowhere with Perl as it gave this back
    use: command not found
    I did use
    Code:
    #!/usr/bin/perl


    ----------

    Is this the problem?
    I ran a file command on rosettaTest.csv
    rosettaTest.csv: ASCII text, with CR, LF line terminators

    My csv came from Excel 2011... I read somewhere that there was a problem with the way Excel 2008 wrote out line terminators.

    I am going off in search of a fix.
     
  17. thread starter macrumors newbie

    Joined:
    Oct 17, 2012
    #17
    Code:
    bash-3.2$ tr '\15' '\n' < rosettaTest.csv
    got me
    rosettaTest.csv: ASCII text, with CR, LF line terminators
    with same incorrect results
     
  18. Freez, Oct 18, 2012
    Last edited: Oct 18, 2012

    macrumors newbie

    Joined:
    Feb 9, 2011
    #18
    Easier than all that

    Copy the text into textWrangler.

    Use find and replace to get rid of all those commas and spaces.

    Use the Prefix/Suffix line...
    To add touch with a space at begining of each line.
    Then add .txt to end of each line.

    touch billyMinneapolisMN.txt
    touch AliceDallasTX.txt



    Select All and Copy.

    Open Terminal.

    Us cd to get to your directory of choice.

    Paste.

    Done.
     
  19. thread starter macrumors newbie

    Joined:
    Oct 17, 2012
    #19
    TextWrangler fixed the problem.
    Now I get 17 perfect files before getting 500+ copies of
    Code:
    awk: 2_14.txt makes too many open files
     input record number 18, file rosetta.csv
     source line number 1
    doing 
    Seems like the file needs closing so I tried adding a close file command like this:
    Code:
    #!/bin/bash
    for i in $(cat rosetta.csv); do
      name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosetta.csv)
      echo "doing $name"
      echo "$i" >${name}.txt
      close file "${name}.txt"
    done
    that was just a guess and it didn't work
     
  20. chown33, Oct 18, 2012
    Last edited: Oct 18, 2012

    macrumors 603

    Joined:
    Aug 9, 2009
    #20
    Guessing almost never works.

    Try replacing your entire shell script (not just the line that uses awk) with this:
    Code:
    #!/bin/bash
    
    mkdir -p out
    
    awk -F',' \
     '{ f="\"out/"$1".txt\""; system("echo \"" $0 "\" >" f ); }'  \
     [B][COLOR="Red"]the.csv[/COLOR][/B]
    
    I got tired of always deleting the individual output files mixed in with the CSV and shell script, so it now makes a sub-folder named "out" and creates all the files there. If you don't want that, and can't figure out how to change it, ask again and I'll revise it.

    The above is a COMPLETE REPLACEMENT for the original shell script. It is not just a few lines to be merged in with the original.


    My test CSV file:
    Code:
    able alfa,akron,state
    baker bravo,baltimore,state
    Charley Gordon,city,state
    delta dawn,city,state
    eeevill twin,eville,state
    foxtrot,tango,whiskey
    golf glove,glasgow,whisky
    hotel,homer,ak
    Inigo Montoya,indio,state
    Juliet is the sun,juarez,state
    13_01,1.70452E+11,182797
    13_02,1.7045E+11,21676
    13_03,1.7045E+11,22022
    13_04,1.70452E+11,206985
    2_01,1.70452E+11,117549
    2_02,1.7045E+11,47029
    2_03,1.70452E+11,118304
    2_04,1.7045E+11,31579
    2_05,1.7045E+11,30966
    2_06,1.7045E+11,48157
    
    EDIT

    I have hilited above in red the ONE place where you should change it to rosetta.csv, or whatever the name of your CSV file really is. Do not change anything else.

    You somehow made a change to the original shell script. You have this:
    Code:
    name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosetta.csv)
    
    where the original is this:
    Code:
    name=$(echo "$i"|awk -F',' '{print $1}')
    
    These two lines do very different things.

    Accuracy is essential in programming. Seemingly innocuous changes can have unexpectedly large effects.
     
  21. macrumors 6502

    Joined:
    Mar 8, 2004
    #21


    Strange, Perl is included with OS X. Was the file chmod'ed with +x? Either way, I went ahead and attached a properly setup version. To use it type:

    ./csvparse filename
     

    Attached Files:

  22. thread starter macrumors newbie

    Joined:
    Oct 17, 2012
    #22
    It's been a very useful, informative exchange. Thankyou!
    The script is working, I got introduced to textWrangler (I used BBedit years ago), got a lesson in careful editing and my hard work is done. I will give the Perl a run in a bit. This has inspired me to pull out my Sed Awk book and do a bit more. Compared to the 50 lines of code required to do this job with VB in Excel sed and awk are amazing.
     
  23. macrumors regular

    Joined:
    Jun 13, 2007
    Location:
    Above ground.
    #23
    Late to the party I know but I felt like the Cocoa guys were missing out! :)

    Code:
    - (void)applicationDidFinishLaunching:(NSNotification *)aNotification
    {
        NSError*    error = nil;
        NSString*   cvsData = [NSString stringWithContentsOfFile:[@"~/cvsData.txt" stringByExpandingTildeInPath] encoding:NSUTF8StringEncoding error:&error];
        
        for ( NSString* line in [cvsData componentsSeparatedByString:@"\n"] )
        {
            NSArray*    lineParts = [line componentsSeparatedByString:@","];
    
            NSString*   name = [@"~/output/" stringByAppendingString:lineParts[0]];
            name = [name stringByAppendingPathExtension:@"txt"];
            name = [name stringByExpandingTildeInPath];
    
            [line writeToFile:name atomically:YES encoding:NSUTF8StringEncoding error:&error];
        }
    }
    
    The source data goes in a file named "cvsData.txt" in your home folder. The output is placed into a folder named "output" in your home folder (you'll need to make this).

    Enjoy.
     

Share This Page