Go Back   MacRumors Forums > Apple Systems and Services > Programming > Mac Programming

Reply
 
Thread Tools Search this Thread Display Modes
Old Oct 17, 2012, 09:59 AM   #1
jasperx
macrumors newbie
 
Join Date: Oct 2012
Create files from list in CSV

I need to create several hundred files from a list exported from excel to a csv.
The csv looks like this:
Name1,City1,State1
Name2,City2,State2
continues to
Name600,City600,State600

The desired result is 600 files like this:
Name1.txt which contains Name1,City1,State1
Name2.txt which contains Name2,City2,State2

My coding skills are quite rusty but it seems like this should be pretty simple
jasperx is offline   0 Reply With Quote
Old Oct 17, 2012, 11:24 AM   #2
robvas
macrumors 68000
 
Join Date: Mar 2009
Location: USA
Do you want to do it bash, Perl, Ruby, ?
robvas is offline   0 Reply With Quote
Old Oct 17, 2012, 01:29 PM   #3
kryten2
macrumors 6502a
 
Join Date: Mar 2012
Location: Belgium
Try this :

Code:
-- csv file with Unix (LF) line feed
set theFile to choose file with prompt "Choose your csv file"
set theFolder to choose folder with prompt "Choose folder for the txt files"
set theParagraphs to paragraphs of (read theFile)
repeat with aParagraph in theParagraphs
	try
		set TID to AppleScript's text item delimiters
		set AppleScript's text item delimiters to {","}
		set theItems to text items of aParagraph
		set theName to item 1 of theItems
		set AppleScript's text item delimiters to TID
		do shell script "echo " & quoted form of aParagraph & " > " & quoted form of POSIX path of (theFolder as string) & quoted form of theName & ".txt"
	on error
		set AppleScript's text item delimiters to TID
	end try
end repeat
__________________
Space Corps Directive 34124
kryten2 is offline   0 Reply With Quote
Old Oct 17, 2012, 01:29 PM   #4
sero
macrumors member
 
Join Date: Aug 2008
Code:
#!/bin/bash

for i in $(cat the.csv); do
  name=$(echo "$i"|awk -F',' '{print $1}')
  echo "doing $name"
  echo "$i" >${i}.txt
done
sero is offline   3 Reply With Quote
Old Oct 17, 2012, 02:28 PM   #5
Mac_Max
macrumors 6502
 
Join Date: Mar 2004
Obligatory Perl implementation:

Code:
#!/usr/bin/perl

use strict;
use warnings;

open CSV, "<", "filename.csv" or die $!;

while (<CSV>){
  my @line = split(',',$_);
  open OUTFILE, '>'.$line[0].'.csv' or die $!;
  print OUTFILE $_;
  close OUTFILE;
}

close CSV;
Mac_Max is offline   0 Reply With Quote
Old Oct 17, 2012, 04:35 PM   #6
jasperx
Thread Starter
macrumors newbie
 
Join Date: Oct 2012
OK had a go with the bash version... and something is not quite right
The code is in a file rosetta which has:
Code:
#!/bin/bash
for i in $(cat rosettaTest.csv); do
  name=$(echo "$i"|awk -F',' '{print $1}')
  echo "doing $name"
  echo "$i" >${i}.txt
done
Here is my output

bash-3.2$ bash rosetta
rosetta: line 1: {rtf1ansiansicpg1252cocoartf1138cocoasubrtf510: command not found
rosetta: line 2: syntax error near unexpected token `}'
rosetta: line 2: `{\fonttbl\f0\fmodern\fcharset0 Courier;}'
bash-3.2$
jasperx is offline   0 Reply With Quote
Old Oct 17, 2012, 04:37 PM   #7
mfram
macrumors 6502a
 
Join Date: Jan 2010
Location: San Diego, CA USA
You need to save the bash script as a text file. That looks like RTF. Once that is done, then...

Code:
bash> chmod +x rosetta
bash> ./rosetta
mfram is online now   0 Reply With Quote
Old Oct 17, 2012, 06:28 PM   #8
robvas
macrumors 68000
 
Join Date: Mar 2009
Location: USA
Quote:
Originally Posted by mfram View Post
You need to save the bash script as a text file. That looks like RTF. Once that is done, then...

Code:
bash> chmod +x rosetta
bash> ./rosetta
Yup, use TextEdit and go to the Format menu (in the menu bar) and choose 'Make plain text'

Then save the file. Or even better, use something like TextWrangler
robvas is offline   0 Reply With Quote
Old Oct 17, 2012, 08:26 PM   #9
Mac_Max
macrumors 6502
 
Join Date: Mar 2004
Or go full Command Line and use nano, vi, or emacs.

Also if this is a one-off, you could even copy and paste that script into the command line and it should work.
Mac_Max is offline   0 Reply With Quote
Old Oct 17, 2012, 08:35 PM   #10
jasperx
Thread Starter
macrumors newbie
 
Join Date: Oct 2012
Making it a txt file made a huge difference. The script runs but is not doing as intended. Trying again on what the desired results would be:
A file created for every record or line in a csv
Each line or record in the csv has three values:
Name1,City1,State1

The desired result is to create a file for each record named by the first value in the record. If I have 10 records the result would be 10 files named:
Name1.txt, Name2.txt....Name10.txt
The contents of Name1.txt would be the record or Name1,City1,State1

I ran this on a csv with 10 records. The output was 3 files... not 10.
The first file contained records 1 and 2.
The second file contained records 3,4,5,6
The third file contained records 7,8,9,10
Here is the code
Code:
#!/bin/bash

for i in $(cat the.csv); do
  name=$(echo "$i"|awk -F',' '{print $1}')
  echo "doing $name"
  echo "$i" >${i}.txt
done
jasperx is offline   0 Reply With Quote
Old Oct 17, 2012, 09:24 PM   #11
chown33
macrumors 603
 
Join Date: Aug 2009
Quote:
Originally Posted by jasperx View Post
I ran this on a csv with 10 records. The output was 3 files... not 10.
The first file contained records 1 and 2.
The second file contained records 3,4,5,6
The third file contained records 7,8,9,10
Post your CSV data. Otherwise all anyone can do is guess what input produced your stated output.


There's a bug in the code. Change this:
Code:
  echo "$i" >${i}.txt
to this:
Code:
  echo "$i" >"${name}.txt"

My example "the.csv" file (manually created):
Code:
able,akron,state
bravo,baltimore,state
charlie,city,state
delta,city,state
eeevill,eville,state
foxtrot,tango,whiskey
golf,glasgow,whisky
hotel,homer,state
indigo,indio,state
juliet,juarez,state
chown33 is offline   0 Reply With Quote
Old Oct 17, 2012, 10:49 PM   #12
kryten2
macrumors 6502a
 
Join Date: Mar 2012
Location: Belgium
Quote:
There's a bug in the code.
I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.
Attached Thumbnails
Click image for larger version

Name:	Picture 4.png
Views:	37
Size:	73.7 KB
ID:	370235  
__________________
Space Corps Directive 34124
kryten2 is offline   0 Reply With Quote
Old Oct 18, 2012, 12:51 AM   #13
samwich
macrumors regular
 
Join Date: Aug 2007
Here's an implementation in Python:

Code:
import sys

def createFiles(csv):
	with open(csv, 'r') as f:
		for line in f:
			s = line.split(',')
			with open(s[0] + '.txt', 'w') as w:
				w.write(s[0] + ',' + s[1] + ',' + s[2])
            	
createFiles(sys.argv[1])
You can use it by running: python createFiles.py <example.csv>
__________________
my portfolio
samwich is offline   0 Reply With Quote
Old Oct 18, 2012, 01:11 AM   #14
chown33
macrumors 603
 
Join Date: Aug 2009
Quote:
Originally Posted by kryten2 View Post
I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.
Here's the all-awk version, which handles spaces in names:
Code:
awk -F',' '{ print $0 >($1 ".txt") }'  the.csv
And here's the version that spits out each name:
Code:
awk -F',' '{ print "doing " $1 >"/dev/stderr"; print $0 >($1 ".txt") }'  the.csv
So this would be the new contents of "rosetta":
Code:
#!/bin/bash
awk -F',' '{ print $0 >($1 ".txt") }'  the.csv
or replace that awk one-liner with the longer awk one-liner.


Test data, in "the.csv":
Code:
able alfa,akron,state
baker bravo,baltimore,state
Charley Gordon,city,state
delta dawn,city,state
eeevill twin,eville,state
foxtrot,tango,whiskey
golf glove,glasgow,whisky
hotel,homer,ak
Inigo Montoya,indio,state
Juliet is the sun,juarez,state
chown33 is offline   0 Reply With Quote
Old Oct 18, 2012, 03:29 AM   #15
Mac_Max
macrumors 6502
 
Join Date: Mar 2004
Quote:
Originally Posted by kryten2 View Post
I was waiting for someone to catch that. It also doesn't handle spaces. Mac_Max's Obligatory Perl implementation creates csv files instead of txt files. (at least it did when I tested it) When fixed it's very fast. I wish I knew Perl.
Ah, yes in that case:

Code:
#!/usr/bin/perl

use strict;
use warnings;

die "Please provide a filename\n" if @ARGV == 0;
open CSV, "<", $ARGV[0] or die $!;

while (<CSV>){
  my @line = split(',',$_);
  $line[0] =~ s/ /-/g; #comment this out with if you prefer to keep spaces as spaces.
  open OUTFILE, '>'.$line[0].'.txt' or die $!;
  print OUTFILE $_;
  close OUTFILE;
}

close CSV;
My original version will handle spaces just fine but this version will also replace them with dashes (or whatever character you'd like, remember to escape regex control chars). It also now accepts the input file as an argument.

Perl is a great language. It took me about three weeks to become armed and dangerous with it and about 5 months to become rather good at it. Here's the book I learned from: http://onyxneon.com/books/modern_perl/

Last edited by Mac_Max; Oct 18, 2012 at 03:10 PM. Reason: typo
Mac_Max is offline   0 Reply With Quote
Old Oct 18, 2012, 11:46 AM   #16
jasperx
Thread Starter
macrumors newbie
 
Join Date: Oct 2012
You guys are having way too much fun... running off and leaving me with my eyes crossed but each time a little closer to the desired result. Stick with me as I think we are getting close:

This code as corrected and expanded on by chown33 :
Code:
#!/bin/bash
for i in $(cat rosettaTest.csv); do
  name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosettaTest.csv)
  echo "doing $name"
  echo "$i" >${name}.txt
done
Acting on this rosettaTest.csv which contains:
13_01,1.70452E+11,182797
13_02,1.7045E+11,21676
13_03,1.7045E+11,22022
13_04,1.70452E+11,206985
2_01,1.70452E+11,117549
2_02,1.7045E+11,47029
2_03,1.70452E+11,118304
2_04,1.7045E+11,31579
2_05,1.7045E+11,30966
2_06,1.7045E+11,48157
Output was three (not 10) properly named files:
2_03.txt
13_01.txt
13_03.txt

13_01.txt contained
13_01,1.70452E+11,182797 good this is correct
13_02,1.7045E+11,21676 wrong it should have iterated to create file 13_02.txt

13_03.txt contained
13_03,1.7045E+11,22022
13_04,1.70452E+11,206985
2_01,1.70452E+11,117549
2_02,1.7045E+11,47029

2_03.txt contained
2_03,1.70452E+11,118304
2_04,1.7045E+11,31579
2_05,1.7045E+11,30966
2_06,1.7045E+11,48157

The loops are not working right… it should iterate to 13_02 immediately after writing the first record into 13_01… instead it writes two records, skips to the third record writes 4 records skips to the 7th record.

I got nowhere with Perl as it gave this back
use: command not found
I did use
Code:
#!/usr/bin/perl


----------

Is this the problem?
I ran a file command on rosettaTest.csv
rosettaTest.csv: ASCII text, with CR, LF line terminators

My csv came from Excel 2011... I read somewhere that there was a problem with the way Excel 2008 wrote out line terminators.

I am going off in search of a fix.
jasperx is offline   0 Reply With Quote
Old Oct 18, 2012, 12:01 PM   #17
jasperx
Thread Starter
macrumors newbie
 
Join Date: Oct 2012
Code:
bash-3.2$ tr '\15' '\n' < rosettaTest.csv
got me
rosettaTest.csv: ASCII text, with CR, LF line terminators
with same incorrect results
jasperx is offline   0 Reply With Quote
Old Oct 18, 2012, 12:11 PM   #18
Freez
macrumors newbie
 
Join Date: Feb 2011
Easier than all that

Copy the text into textWrangler.

Use find and replace to get rid of all those commas and spaces.

Use the Prefix/Suffix line...
To add touch with a space at begining of each line.
Then add .txt to end of each line.

touch billyMinneapolisMN.txt
touch AliceDallasTX.txt



Select All and Copy.

Open Terminal.

Us cd to get to your directory of choice.

Paste.

Done.

Last edited by Freez; Oct 18, 2012 at 01:14 PM.
Freez is offline   1 Reply With Quote
Old Oct 18, 2012, 02:08 PM   #19
jasperx
Thread Starter
macrumors newbie
 
Join Date: Oct 2012
TextWrangler fixed the problem.
Now I get 17 perfect files before getting 500+ copies of
Code:
awk: 2_14.txt makes too many open files
 input record number 18, file rosetta.csv
 source line number 1
doing
Seems like the file needs closing so I tried adding a close file command like this:
Code:
#!/bin/bash
for i in $(cat rosetta.csv); do
  name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosetta.csv)
  echo "doing $name"
  echo "$i" >${name}.txt
  close file "${name}.txt"
done
that was just a guess and it didn't work
jasperx is offline   0 Reply With Quote
Old Oct 18, 2012, 02:32 PM   #20
chown33
macrumors 603
 
Join Date: Aug 2009
Guessing almost never works.

Try replacing your entire shell script (not just the line that uses awk) with this:
Code:
#!/bin/bash

mkdir -p out

awk -F',' \
 '{ f="\"out/"$1".txt\""; system("echo \"" $0 "\" >" f ); }'  \
 the.csv
I got tired of always deleting the individual output files mixed in with the CSV and shell script, so it now makes a sub-folder named "out" and creates all the files there. If you don't want that, and can't figure out how to change it, ask again and I'll revise it.

The above is a COMPLETE REPLACEMENT for the original shell script. It is not just a few lines to be merged in with the original.


My test CSV file:
Code:
able alfa,akron,state
baker bravo,baltimore,state
Charley Gordon,city,state
delta dawn,city,state
eeevill twin,eville,state
foxtrot,tango,whiskey
golf glove,glasgow,whisky
hotel,homer,ak
Inigo Montoya,indio,state
Juliet is the sun,juarez,state
13_01,1.70452E+11,182797
13_02,1.7045E+11,21676
13_03,1.7045E+11,22022
13_04,1.70452E+11,206985
2_01,1.70452E+11,117549
2_02,1.7045E+11,47029
2_03,1.70452E+11,118304
2_04,1.7045E+11,31579
2_05,1.7045E+11,30966
2_06,1.7045E+11,48157
EDIT

I have hilited above in red the ONE place where you should change it to rosetta.csv, or whatever the name of your CSV file really is. Do not change anything else.

You somehow made a change to the original shell script. You have this:
Code:
name=$(echo "$i"|awk -F',' '{ print $0 >($1 ".txt") }'  rosetta.csv)
where the original is this:
Code:
name=$(echo "$i"|awk -F',' '{print $1}')
These two lines do very different things.

Accuracy is essential in programming. Seemingly innocuous changes can have unexpectedly large effects.

Last edited by chown33; Oct 18, 2012 at 02:53 PM.
chown33 is offline   1 Reply With Quote
Old Oct 18, 2012, 03:09 PM   #21
Mac_Max
macrumors 6502
 
Join Date: Mar 2004
Quote:
Originally Posted by jasperx View Post
I got nowhere with Perl as it gave this back
use: command not found
I did use
Code:
#!/usr/bin/perl
[COLOR="#808080"]
Strange, Perl is included with OS X. Was the file chmod'ed with +x? Either way, I went ahead and attached a properly setup version. To use it type:

./csvparse filename
Attached Files
File Type: zip csvparse.zip (417 Bytes, 10 views)
Mac_Max is offline   1 Reply With Quote
Old Oct 18, 2012, 04:36 PM   #22
jasperx
Thread Starter
macrumors newbie
 
Join Date: Oct 2012
It's been a very useful, informative exchange. Thankyou!
The script is working, I got introduced to textWrangler (I used BBedit years ago), got a lesson in careful editing and my hard work is done. I will give the Perl a run in a bit. This has inspired me to pull out my Sed Awk book and do a bit more. Compared to the 50 lines of code required to do this job with VB in Excel sed and awk are amazing.
jasperx is offline   0 Reply With Quote
Old Oct 19, 2012, 10:27 PM   #23
badNameErr
macrumors regular
 
Join Date: Jun 2007
Location: Above ground.
Late to the party I know but I felt like the Cocoa guys were missing out!

Code:
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    NSError*    error = nil;
    NSString*   cvsData = [NSString stringWithContentsOfFile:[@"~/cvsData.txt" stringByExpandingTildeInPath] encoding:NSUTF8StringEncoding error:&error];
    
    for ( NSString* line in [cvsData componentsSeparatedByString:@"\n"] )
    {
        NSArray*    lineParts = [line componentsSeparatedByString:@","];

        NSString*   name = [@"~/output/" stringByAppendingString:lineParts[0]];
        name = [name stringByAppendingPathExtension:@"txt"];
        name = [name stringByExpandingTildeInPath];

        [line writeToFile:name atomically:YES encoding:NSUTF8StringEncoding error:&error];
    }
}
The source data goes in a file named "cvsData.txt" in your home folder. The output is placed into a folder named "output" in your home folder (you'll need to make this).

Enjoy.
badNameErr is offline   0 Reply With Quote

Reply
MacRumors Forums > Apple Systems and Services > Programming > Mac Programming

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Similar Threads
thread Thread Starter Forum Replies Last Post
Mavericks - extremely slow to list files when selecting files in finder and Safari iosuser OS X Mavericks (10.9) 33 Aug 29, 2014 04:40 AM
Batch Rename files using CSV list diddykiddy Mac Programming 4 Jan 7, 2014 03:53 AM
IMAP tool to export list of emails to Excel/CSV ? paulCC Mac Applications and Mac App Store 2 Jun 4, 2013 03:01 PM
problem importing csv files to adress book terra OS X 0 Feb 23, 2013 02:45 PM
What be the best way to create an app with a list of editable items fstigre iPhone/iPad Programming 14 Nov 24, 2012 08:52 PM

Forum Jump

All times are GMT -5. The time now is 01:30 AM.

Mac Rumors | Mac | iPhone | iPhone Game Reviews | iPhone Apps

Mobile Version | Fixed | Fluid | Fluid HD
Copyright 2002-2013, MacRumors.com, LLC