
Eazkk123

macrumors member
Original poster
I have a list of 40,000 words with the endings .x12 and .y15, for example talktous.x12. I want to remove all the names that end in .x12, and then I want to arrange all the .y15 names in order of number of characters, from 1 character up to the maximum, which I have not counted.

Help appreciated.

Eazkk123
 
Last edited by a moderator:
40,000 words right? Not files or icons or anything right?

If this is the case you could probably use Excel. Otherwise I'm gonna see what I can figure out using Automator.
 
Just words! I have Excel, but because it is such a big list it has difficulty importing it, and many errors occur (typical of Microsoft 🙂). If Automator or TextMate can do it, that would be great.

 
Hmm. I would have thought Excel could deal with it.

You need a database program, the kind of program that is specially designed to deal with very long lists.

FileMaker Pro could probably do it. I've created test databases with 200,000 records, each with 10 fields, and done global sorts with no problem.

You don't need the latest version; something a few years old will do fine. It should be able to import your file as CSV or Excel or whatever.

A shell script (bash, for example) using AWK could do it too. Check out

http://en.wikipedia.org/wiki/AWK_programming_language

It might be a little technical for you.

Hope this helps.

What do you need this for anyway?
 
It is a massive .txt file. Excel does import the list, but there are problems: for some words the extension (.x12) goes to the second column, while for others it stays in the first column, so it is all scattered rather than one straight list. Basically I just need help removing all the words ending in .x12, and then I want to arrange all the words ending in .y15 from the fewest characters to the most.

Help Appreciated

Eazkk123


 
Well I've had no luck with Automator (and I was only trying a list of 20).

BBEdit does have some features for removing lines of text, though. I'm still not sure what syntax would be used to sort your list from short to long words. BBEdit is shareware, so you can give it a try.
 
I don't think the problems you are having with Excel have anything to do with the size of the list. 40k entries should be well within its limits. You are having problems because you are not setting the text import parameters correctly. Go back through it manually and make sure that the number of fields and the delimiters are correct.
 

I agree. It is also worth scrolling through your text file and checking by eye that all the data is clean. Many times I've found that what I thought was a clean data file had 2 items on the same line or some other small screw-up.

You could try just importing 100 into Excel, check if that works (clean data, correct import parameters, sorted properly etc), then 1000, then 2000, then 5000 words.

Or split your text file into blocks of 5000 words.

You still haven't said what all this is for. Considering all the help we're giving, we'd like to know 🙄
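The two suggestions above (checking that the data is clean, and splitting the file into blocks) could be sketched roughly like this in Python. It is only a sketch under assumptions: it assumes one entry per line and that a "clean" entry is a single run of characters ending in .x12 or .y15. The names `suspect_lines` and `blocks` are made up for illustration.

```python
import re

# Assumption: a clean line is one run of non-space characters ending in .x12 or .y15.
CLEAN = re.compile(r"^\S+\.(x12|y15)$")


def suspect_lines(lines):
    """Return (line_number, text) for each non-blank line that is not one clean entry."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if line.strip() and not CLEAN.match(line.strip())
    ]


def blocks(lines, size=5000):
    """Yield consecutive blocks of at most `size` lines from a list."""
    for start in range(0, len(lines), size):
        yield lines[start:start + size]


if __name__ == "__main__":
    sample = ["a.y15\n", "two words.x12\n", "b.x12 c.y15\n", "\n", "noext\n"]
    # Lines with a space, two items, or no extension are reported with their line number.
    for number, text in suspect_lines(sample):
        print(number, repr(text))
```

Anything `suspect_lines` reports is worth fixing by hand before importing. Each block from `blocks` could then be written to its own file and imported separately.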
 

What about find and replace in TextEdit? Simple enough. 😕

TextEdit

Edit > Find > Find...

Fill in the bits you want to take out in the 'find' box and leave the 'replace with' empty.

That's it.
 
Use TextWrangler to find and replace ".x" and ".y" with "\tx" (tab x) and "\ty". This forces the extension into a different column while ignoring any other periods that are not followed by an x or y.
Then import into Excel.
Then use Excel's sorting functions to correct errors (errors show up grouped nicely together when you sort cleverly).
Then write some calculations to make values in other columns for word length etc., and a concatenation calculation to put the word back together with the period back in.
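For anyone who would rather prepare the columns before Excel sees the file, here is a rough Python sketch of the same idea: split each entry into word and extension, add a word-length column, and produce tab-separated text. It is a variation on the find-and-replace above, not the same thing: it assumes the extension is a final ".x" or ".y" followed by digits, and the names `to_rows` and `to_tsv` are made up for illustration.

```python
import csv
import io
import re

# Assumption: the extension is a trailing ".x" or ".y" plus digits, e.g. ".x12" or ".y15".
EXT = re.compile(r"^(?P<word>.*)\.(?P<ext>[xy]\d+)$")


def to_rows(lines):
    """Turn each entry into a (word, extension, word_length) row."""
    rows = []
    for line in lines:
        entry = line.strip()
        match = EXT.match(entry)
        if match:
            rows.append((match["word"], match["ext"], len(match["word"])))
        elif entry:
            # No recognisable extension: keep the entry so the error stays visible.
            rows.append((entry, "", len(entry)))
    return rows


def to_tsv(rows):
    """Render the rows as tab-separated text, one row per line."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_tsv(to_rows(["talktous.x12", "a.b.y15", "oops"])), end="")
```

Only the last period before the extension is used as the split point, so other periods inside a word stay in the first column. Entries with an empty extension column are the ones to check by hand.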
 

Open up a terminal window. Use "grep" to remove all the .x12 words (or keep all the .y15 words). Next comes the sort, which will be harder. You will have to add a second column to the data that holds the character count, then sort on the count, and then delete the column. I'd use a short Perl script to add the column, but you could do this in Excel.
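The steps above can also be sketched in Python instead of grep and Perl (Python is a swapped-in choice here, not what was suggested). This is a minimal sketch, assuming one word per line; the function name `keep_and_sort` is made up for illustration.

```python
def keep_and_sort(lines, drop_ext=".x12"):
    """Drop entries ending in drop_ext, then order the rest shortest first."""
    # Step 1 (the "grep" step): strip whitespace, skip blank lines and .x12 entries.
    words = [line.strip() for line in lines]
    kept = [word for word in words if word and not word.endswith(drop_ext)]
    # Step 2: add a character-count column next to each word.
    counted = [(len(word), word) for word in kept]
    # Step 3: sort on the count only; words of equal length keep their original order.
    counted.sort(key=lambda pair: pair[0])
    # Step 4: delete the count column again.
    return [word for _, word in counted]


if __name__ == "__main__":
    sample = ["talktous.x12", "hello.y15", "a.y15", "macintosh.y15", "b.x12"]
    for word in keep_and_sort(sample):
        print(word)
```

The count here includes the ".y15" ending. Since every remaining word has the same ending, that does not change the order. To run it on a real file, the lines could be read with `open(...)` and passed in as a list.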
 
I want to remove the WHOLE word with the extension .x12, and every word is different, so I can't just use find. As for importing it into Excel, can someone help with the parameters?! It is very confusing.
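A plain find cannot match words that are all different, but a pattern (a regular expression) can match "any line that ends in .x12" and remove the whole line. Here is a small Python sketch of that idea, assuming one word per line; the names `X12_LINE` and `remove_x12` are made up for illustration.

```python
import re

# Matches a whole line ending in ".x12" (optional trailing spaces or tabs),
# together with its line break if there is one.
X12_LINE = re.compile(r"^.*\.x12[ \t]*(?:\r?\n|$)", re.MULTILINE)


def remove_x12(text):
    """Return text with every line that ends in .x12 removed."""
    return X12_LINE.sub("", text)


if __name__ == "__main__":
    sample = "talktous.x12\nhello.y15\nbye.x12\na.y15\n"
    print(remove_x12(sample), end="")
```

Lines that merely contain ".x12" somewhere in the middle are left alone, because the pattern requires it at the end of the line.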

 