Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
I need to create several hundred files from a list exported from excel to a csv.
The csv looks like this:
Name1,City1,State1
Name2,City2,State2
continues to
Name600,City600,State600

The desired result is 600 files like this:
Name1.txt which contains Name1,City1,State1
Name2.txt which contains Name2,City2,State2

My coding skills are quite rusty but it seems like this should be pretty simple
 

kryten2

macrumors 65816
Mar 17, 2012
1,114
99
Belgium
Try this :

Code:
-- csv file with Unix (LF) line feed
set theFile to choose file with prompt "Choose your csv file"
set theFolder to choose folder with prompt "Choose folder for the txt files"
set theParagraphs to paragraphs of (read theFile)
repeat with aParagraph in theParagraphs
	try
		set TID to AppleScript's text item delimiters
		set AppleScript's text item delimiters to {","}
		set theItems to text items of aParagraph
		set theName to item 1 of theItems
		set AppleScript's text item delimiters to TID
		do shell script "echo " & quoted form of aParagraph & " > " & quoted form of POSIX path of (theFolder as string) & quoted form of theName & ".txt"
	on error
		set AppleScript's text item delimiters to TID
	end try
end repeat
 

sero

macrumors member
Aug 28, 2008
91
14
Code:
#!/bin/bash

for i in $(cat the.csv); do
  name=$(echo "$i"|awk -F',' '{print $1}')
  echo "doing $name"
  echo "$i" >${i}.txt
done
 

Mac_Max

macrumors 6502
Mar 8, 2004
404
1
Obligatory Perl implementation:

Code:
#!/usr/bin/perl

use strict;
use warnings;

open CSV, "<", "filename.csv" or die $!;

while (<CSV>){
  my @line = split(',',$_);
  open OUTFILE, '>'.$line[0].'.csv' or die $!;
  print OUTFILE $_;
  close OUTFILE;
}

close CSV;
 

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
OK had a go with the bash version... and something is not quite right
The code is in a file rosetta which has:
Code:
#!/bin/bash
for i in $(cat rosettaTest.csv); do
  name=$(echo "$i"|awk -F',' '{print $1}')
  echo "doing $name"
  echo "$i" >${i}.txt
done
Here is my output

bash-3.2$ bash rosetta
rosetta: line 1: {rtf1ansiansicpg1252cocoartf1138cocoasubrtf510: command not found
rosetta: line 2: syntax error near unexpected token `}'
rosetta: line 2: `{\fonttbl\f0\fmodern\fcharset0 Courier;}'
bash-3.2$
 

mfram

Contributor
Jan 23, 2010
1,307
343
San Diego, CA USA
You need to save the bash script as a text file. That looks like RTF. Once that is done, then...

Code:
bash> chmod +x rosetta
bash> ./rosetta
 

robvas

macrumors 68040
Mar 29, 2009
3,240
629
USA
You need to save the bash script as a text file. That looks like RTF. Once that is done, then...

Code:
bash> chmod +x rosetta
bash> ./rosetta

Yup, use TextEdit and go to the Format menu (in the menu bar) and choose 'Make plain text'

Then save the file. Or even better, use something like TextWrangler
 

Mac_Max

macrumors 6502
Mar 8, 2004
404
1
Or go full Command Line and use nano, vi, or emacs.

Also if this is a one-off, you could even copy and paste that script into the command line and it should work.
 

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
Making it a txt file made a huge difference. The script runs but is not doing as intended. Trying again on what the desired results would be:
A file created for every record or line in a csv
Each line or record in the csv has three values:
Name1,City1,State1

The desired result is to create a file for each record named by the first value in the record. If I have 10 records the result would be 10 files named:
Name1.txt, Name2.txt....Name10.txt
The contents of Name1.txt would be the record or Name1,City1,State1

I ran this on a csv with 10 records. The output was 3 files... not 10.
The first file contained records 1 and 2.
The second file contained records 3,4,5,6
The third file contained records 7,8,9,10
Here is the code
Code:
#!/bin/bash

for i in $(cat the.csv); do
  name=$(echo "$i"|awk -F',' '{print $1}')
  echo "doing $name"
  echo "$i" >${i}.txt
done
 

chown33

Moderator
Staff member
Aug 9, 2009
10,747
8,421
A sea of green
I ran this on a csv with 10 records. The output was 3 files... not 10.
The first file contained records 1 and 2.
The second file contained records 3,4,5,6
The third file contained records 7,8,9,10

Post your CSV data. Otherwise all anyone can do is guess what input produced your stated output.


There's a bug in the code. Change this:
Code:
  echo "$i" >${i}.txt
to this:
Code:
  echo "$i" >"${name}.txt"


My example "the.csv" file (manually created):
Code:
able,akron,state
bravo,baltimore,state
charlie,city,state
delta,city,state
eeevill,eville,state
foxtrot,tango,whiskey
golf,glasgow,whisky
hotel,homer,state
indigo,indio,state
juliet,juarez,state
 

kryten2

macrumors 65816
Mar 17, 2012
1,114
99
Belgium
There's a bug in the code.

I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.
 

Attachments

  • Picture 4.png
    Picture 4.png
    73.7 KB · Views: 169

samwich

macrumors regular
Aug 5, 2007
183
0
Here's an implementation in Python:

Code:
import sys

def createFiles(csv):
	with open(csv, 'r') as f:
		for line in f:
			s = line.split(',')
			with open(s[0] + '.txt', 'w') as w:
				w.write(s[0] + ',' + s[1] + ',' + s[2])
            	
createFiles(sys.argv[1])

You can use it by running: python createFiles.py <example.csv>
 

chown33

Moderator
Staff member
Aug 9, 2009
10,747
8,421
A sea of green
I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.

Here's the all-awk version, which handles spaces in names:
Code:
awk -F',' '{ print $0 >($1 ".txt") }'  the.csv
And here's the version that spits out each name:
Code:
awk -F',' '{ print "doing " $1 >"/dev/stderr"; print $0 >($1 ".txt") }'  the.csv

So this would be the new contents of "rosetta":
Code:
#!/bin/bash
awk -F',' '{ print $0 >($1 ".txt") }'  the.csv
or replace that awk one-liner with the longer awk one-liner.


Test data, in "the.csv":
Code:
able alfa,akron,state
baker bravo,baltimore,state
Charley Gordon,city,state
delta dawn,city,state
eeevill twin,eville,state
foxtrot,tango,whiskey
golf glove,glasgow,whisky
hotel,homer,ak
Inigo Montoya,indio,state
Juliet is the sun,juarez,state
 

Mac_Max

macrumors 6502
Mar 8, 2004
404
1
I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.

Ah, yes in that case:

Code:
#!/usr/bin/perl

use strict;
use warnings;

die "Please provide a filename\n" if @ARGV == 0;
open CSV, "<", $ARGV[0] or die $!;

while (<CSV>){
  my @line = split(',',$_);
  $line[0] =~ s/ /-/g; #comment this out with if you prefer to keep spaces as spaces.
  open OUTFILE, '>'.$line[0].'.txt' or die $!;
  print OUTFILE $_;
  close OUTFILE;
}

close CSV;

My original version will handle spaces just fine but this version will also replace them with dashes (or whatever character you'd like, remember to escape regex control chars). It also now accepts the input file as an argument.

Perl is a great language. It took me about three weeks to become armed and dangerous with it and about 5 months to become rather good at it. Here's the book I learned from: http://onyxneon.com/books/modern_perl/
 
Last edited:

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
You guys are having way too much fun... running off and leaving me with my eyes crossed but each time a little closer to the desired result. Stick with me as I think we are getting close:

This code as corrected and expanded on by chown33 :
Code:
#!/bin/bash
for i in $(cat rosettaTest.csv); do
  name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosettaTest.csv)
  echo "doing $name"
  echo "$i" >${name}.txt
done

Acting on this rosettaTest.csv which contains:
13_01,1.70452E+11,182797
13_02,1.7045E+11,21676
13_03,1.7045E+11,22022
13_04,1.70452E+11,206985
2_01,1.70452E+11,117549
2_02,1.7045E+11,47029
2_03,1.70452E+11,118304
2_04,1.7045E+11,31579
2_05,1.7045E+11,30966
2_06,1.7045E+11,48157
Output was three (not 10) properly named files:
2_03.txt
13_01.txt
13_03.txt

13_01.txt contained
13_01,1.70452E+11,182797 good this is correct
13_02,1.7045E+11,21676 wrong it should have iterated to create file 13_02.txt

13_03.txt contained
13_03,1.7045E+11,22022
13_04,1.70452E+11,206985
2_01,1.70452E+11,117549
2_02,1.7045E+11,47029

2_03.txt contained
2_03,1.70452E+11,118304
2_04,1.7045E+11,31579
2_05,1.7045E+11,30966
2_06,1.7045E+11,48157

The loops are not working right… it should iterate to 13_02 immediately after writing the first record into 13_01… instead it writes two records, skips to the third record writes 4 records skips to the 7th record.

I got nowhere with Perl as it gave this back
use: command not found
I did use
Code:
#!/usr/bin/perl


----------

Is this the problem?
I ran a file command on rosettaTest.csv
rosettaTest.csv: ASCII text, with CR, LF line terminators

My csv came from Excel 2011... I read somewhere that there was a problem with the way Excel 2008 wrote out line terminators.

I am going off in search of a fix.
 

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
Code:
bash-3.2$ tr '\15' '\n' < rosettaTest.csv
got me
rosettaTest.csv: ASCII text, with CR, LF line terminators
with same incorrect results
 

Freez

macrumors newbie
Feb 9, 2011
15
0
Easier than all that

Copy the text into textWrangler.

Use find and replace to get rid of all those commas and spaces.

Use the Prefix/Suffix line...
To add touch with a space at begining of each line.
Then add .txt to end of each line.

touch billyMinneapolisMN.txt
touch AliceDallasTX.txt



Select All and Copy.

Open Terminal.

Us cd to get to your directory of choice.

Paste.

Done.
 
Last edited:

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
TextWrangler fixed the problem.
Now I get 17 perfect files before getting 500+ copies of
Code:
awk: 2_14.txt makes too many open files
 input record number 18, file rosetta.csv
 source line number 1
doing
Seems like the file needs closing so I tried adding a close file command like this:
Code:
#!/bin/bash
for i in $(cat rosetta.csv); do
  name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosetta.csv)
  echo "doing $name"
  echo "$i" >${name}.txt
  close file "${name}.txt"
done
that was just a guess and it didn't work
 

chown33

Moderator
Staff member
Aug 9, 2009
10,747
8,421
A sea of green
Guessing almost never works.

Try replacing your entire shell script (not just the line that uses awk) with this:
Code:
#!/bin/bash

mkdir -p out

awk -F',' \
 '{ f="\"out/"$1".txt\""; system("echo \"" $0 "\" >" f ); }'  \
 [B][COLOR="Red"]the.csv[/COLOR][/B]
I got tired of always deleting the individual output files mixed in with the CSV and shell script, so it now makes a sub-folder named "out" and creates all the files there. If you don't want that, and can't figure out how to change it, ask again and I'll revise it.

The above is a COMPLETE REPLACEMENT for the original shell script. It is not just a few lines to be merged in with the original.


My test CSV file:
Code:
able alfa,akron,state
baker bravo,baltimore,state
Charley Gordon,city,state
delta dawn,city,state
eeevill twin,eville,state
foxtrot,tango,whiskey
golf glove,glasgow,whisky
hotel,homer,ak
Inigo Montoya,indio,state
Juliet is the sun,juarez,state
13_01,1.70452E+11,182797
13_02,1.7045E+11,21676
13_03,1.7045E+11,22022
13_04,1.70452E+11,206985
2_01,1.70452E+11,117549
2_02,1.7045E+11,47029
2_03,1.70452E+11,118304
2_04,1.7045E+11,31579
2_05,1.7045E+11,30966
2_06,1.7045E+11,48157

EDIT

I have hilited above in red the ONE place where you should change it to rosetta.csv, or whatever the name of your CSV file really is. Do not change anything else.

You somehow made a change to the original shell script. You have this:
Code:
name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosetta.csv)
where the original is this:
Code:
name=$(echo "$i"|awk -F',' '{print $1}')
These two lines do very different things.

Accuracy is essential in programming. Seemingly innocuous changes can have unexpectedly large effects.
 
Last edited:

Mac_Max

macrumors 6502
Mar 8, 2004
404
1
I got nowhere with Perl as it gave this back
use: command not found
I did use
Code:
#!/usr/bin/perl



Strange, Perl is included with OS X. Was the file chmod'ed with +x? Either way, I went ahead and attached a properly setup version. To use it type:

./csvparse filename
 

Attachments

  • csvparse.zip
    417 bytes · Views: 92

jasperx

macrumors newbie
Original poster
Oct 17, 2012
7
0
It's been a very useful, informative exchange. Thankyou!
The script is working, I got introduced to textWrangler (I used BBedit years ago), got a lesson in careful editing and my hard work is done. I will give the Perl a run in a bit. This has inspired me to pull out my Sed Awk book and do a bit more. Compared to the 50 lines of code required to do this job with VB in Excel sed and awk are amazing.
 

badNameErr

macrumors regular
Jun 13, 2007
238
38
Above ground.
Late to the party I know but I felt like the Cocoa guys were missing out! :)

Code:
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    NSError*    error = nil;
    NSString*   cvsData = [NSString stringWithContentsOfFile:[@"~/cvsData.txt" stringByExpandingTildeInPath] encoding:NSUTF8StringEncoding error:&error];
    
    for ( NSString* line in [cvsData componentsSeparatedByString:@"\n"] )
    {
        NSArray*    lineParts = [line componentsSeparatedByString:@","];

        NSString*   name = [@"~/output/" stringByAppendingString:lineParts[0]];
        name = [name stringByAppendingPathExtension:@"txt"];
        name = [name stringByExpandingTildeInPath];

        [line writeToFile:name atomically:YES encoding:NSUTF8StringEncoding error:&error];
    }
}

The source data goes in a file named "cvsData.txt" in your home folder. The output is placed into a folder named "output" in your home folder (you'll need to make this).

Enjoy.
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.