Help creating script to sort through text files

Discussion in 'Mac Programming' started by Kirkman, Sep 25, 2005.

  1. Kirkman macrumors member

    Dec 27, 2002
    Okay... I've got about 100 individual text files that each contain a list of song titles that somebody likes.

    I want to create a script that will scan through the files and count the song titles. It would work like this:

    Open text file
    Get song title from first line
    Search through song title array. If script hasn't encountered the song, add a new entry to the array.
    Add "1" to that song's counter in the array
    Go to next line and repeat this process to the end of file
    Go to next file and repeat this process until end of all files

    At the end of the process, it would output a list of all the songs it encountered in the files, and how many times each song appeared.

    That's the process. But I have no idea how to turn this into a shell script (or AppleScript). Can anyone help me do this?

  2. robbieduncan Moderator emeritus


    Jul 24, 2002
    Can we assume that each song is on a separate line? It'd make things a lot easier if they were.

    Do you want to treat "A SONG TITLE" as the same song as "A Song Title"?
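
    If the answer is yes, one rough way to sketch it in Python (assuming one title per line; the names normalize_title and count_titles are made up for illustration) is to fold case and tidy whitespace before counting:

```python
from collections import Counter


def normalize_title(title):
    """Hypothetical helper: collapse runs of whitespace and ignore case."""
    return " ".join(title.split()).casefold()


def count_titles(lines):
    """Count titles, treating differently-cased spellings as one song."""
    counts = Counter()
    for line in lines:
        key = normalize_title(line)
        if key:  # skip blank lines
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = ["A SONG TITLE\n", "A Song Title\n", "Another Song\n", "\n"]
    for title, n in count_titles(sample).most_common():
        print("%3d %s" % (n, title))
```

    The catch is that the output shows the folded (lowercase) title rather than any of the original spellings.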
  3. HiRez macrumors 603


    Jan 6, 2004
    Western US
    Try this:
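    #! /usr/bin/python
    import os, sys
    # Directory to scan: first argument, or the current directory if none is given
    if len(sys.argv) > 1:
        rootPath = sys.argv[1]
    else:
        rootPath = os.getcwd()
    songFiles = []
    def parseDir(dirPath):
        # Recursively collect every .txt file under dirPath
        for name in os.listdir(dirPath):
            path = os.path.join(dirPath, name)
            if os.path.isfile(path):
                if os.path.splitext(path)[1] == ".txt":
                    songFiles.append(path)
            elif os.path.isdir(path):
                parseDir(path)
    parseDir(rootPath)
    print("Scanning %d files..." % len(songFiles))
    # Maps each song title to the number of times it has been seen
    songList = {}
    for file in songFiles:
        f = open(file)
        for line in f.readlines():
            line = line.strip()
            if not line:
                continue
            if line in songList:
                songList[line] = songList[line] + 1
            else:
                songList[line] = 1
        f.close()
    for key in songList.keys():
        print("%3d %s" % (songList[key], key))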
    Save the script to a file and run it in Terminal by typing python, then the script's file name, then <directory-to-search>; leave off the directory to search files in the current working directory. Only files with a .txt extension will be searched.

    Disclaimer: This is totally untested! Use at your own risk! It may not even work at all!
  4. WebMongol macrumors member

    Sep 19, 2004
    Bay Area, CA
    Well, it's trivial to do with standard Unix tools:

    $ cat *.txt | sort | uniq -c | sort -nr

    Output is a list of songs sorted by frequency.
    You can put this sequence of commands into a file and invoke it by name from Terminal.
    File: most
    #! /bin/bash

    cat "$@" | sort | uniq -c | sort -nr

    $ most filenames
    $ most SongDir/song*.txt
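
    For comparison, roughly the same pipeline can be sketched in Python. This is only an illustration, assuming one title per line; the function name count_lines is made up, and songs with equal counts may come out in a different order than with sort -nr:

```python
import sys
from collections import Counter


def count_lines(paths):
    """Tally identical lines across files, like `sort | uniq -c`."""
    counts = Counter()
    for path in paths:
        with open(path) as handle:
            for line in handle:
                counts[line.rstrip("\n")] += 1
    return counts


if __name__ == "__main__":
    # most_common() orders by descending count, like the final `sort -nr`
    for title, n in count_lines(sys.argv[1:]).most_common():
        print("%4d %s" % (n, title))
```

    It takes the file names as arguments, the same way the most script does.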

