
occamsrazor

macrumors 6502
Original poster
Feb 25, 2007
427
17
Hi,

I have a long text document that has data in this exact format (it's a list of soccer players and their teams etc):

AUS 1 GK Andrew REDMAYNE - (Central Coast Mariners, AUS)
AUS 2 D Daniel MULLEN - (Adelaide Utd., AUS)
AUS 3 D Luke DEVERE - (Brisbane Roar, AUS)

The spaces between "AUS", "1", "GK", and "Andrew" are all Tabs. I need to get this document into this exact format:

AUS1 Andrew Redmayne
AUS2 Daniel Mullen

i.e. I need to delete the extra information, remove some of the tabs, and convert words in ALLCAPS to first letter in uppercase, rest of word lowercase.

Does anyone know a text editor that can do this? With Apple's TextEdit I can't even seem to get rid of the tabs. Is there a way to search/replace a tab character?

Any help much appreciated.....
 
I use Word for advanced stuff like this. I'm sure there are text editors that can search/replace special characters, but I prefer to keep it simple: TextEdit for simple stuff; Word for everything else.

Edit: Actually, you can do this with TextEdit. Open the file, then highlight one of the tab spaces and copy. Then search and paste the tab space for your search, replacing it with a space:
Picture 2.jpg
 
Thanks... kind of stupid, but I never even thought of using Word as a text editor! So far I've managed to get it almost how I need it. I've managed to get this:

AUS 1 GK Andrew REDMAYNE - (Central Coast Mariners, AUS)
AUS 2 D Daniel MULLEN - (Adelaide Utd., AUS)
AUS 3 D Luke DEVERE - (Brisbane Roar, AUS)

to this

AUS1 goalkeeper Andrew REDMAYNE - (Central Coast Mariners, AUS)
AUS2 defender Daniel MULLEN - (Adelaide Utd., AUS)
AUS3 defender Luke DEVERE - (Brisbane Roar, AUS)

Now I just need to do two things:

1. delete everything after and including the "-" on each line
2. Find a way to look for a word that's in "ALLCAPS" and replace it with "Allcaps"

Any ideas?
 
If you're using Word, the all caps is easily removed by Format > Change Case
For deleting everything after any "-", I would copy the text to Excel, then use Data > Text to Columns with "-" as a delimiter to truncate what you want, then copy from Excel and paste back into Word. If you don't want to use Excel, you could replace all the "-" with a paragraph mark in Word, which would put everything after the "-" on a separate line:

AUS1 goalkeeper Andrew REDMAYNE
(Central Coast Mariners, AUS)
AUS2 defender Daniel MULLEN
(Adelaide Utd., AUS)
AUS3 defender Luke DEVERE
(Brisbane Roar, AUS)

then sort and delete those lines.
 
If you're using Word, the all caps is easily removed by Format > Change Case

Thanks again. I see. But wouldn't that require me to manually select each surname? Or is there some way to find all text in ALLCAPS and auto-apply the Format>Change Case>Title Case option?

For deleting everything after any "-", I would copy the text to Excel, then use Data > Text to Columns with "-" as a delimiter to truncate what you want, then copy from Excel and paste back into Word.

I'll have a try with Excel... It does seem like it would be easier to manipulate once the text is in different columns.

What I was thinking, but didn't succeed in doing, was if there was a way to use wildcards to say:

find everything that starts with a "-", has a variable number of wildcard characters in between, and ends with a ")"

Then I could just replace it with nothing.
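
For anyone happy to script it, the wildcard idea above maps fairly directly onto a regular expression. Here is a minimal Python sketch, assuming each line ends with " - (Club, CC)" as in the samples; the names `TRAILING_CLUB` and `strip_club` are made up for illustration.

```python
import re

# Optional whitespace, a hyphen, optional whitespace, then "(...)" running to the end of the line.
# Requiring "(" after the hyphen keeps hyphens inside names from matching.
TRAILING_CLUB = re.compile(r"\s*-\s*\(.*\)\s*$")

def strip_club(line: str) -> str:
    """Remove the trailing ' - (Club, CC)' part of a line, if present."""
    return TRAILING_CLUB.sub("", line)

lines = [
    "AUS1 goalkeeper Andrew REDMAYNE - (Central Coast Mariners, AUS)",
    "AUS2 defender Daniel MULLEN - (Adelaide Utd., AUS)",
    "AUS3 defender Luke DEVERE - (Brisbane Roar, AUS)",
]

for line in lines:
    print(strip_club(line))
```

Lines that do not end in a parenthesised club are returned unchanged, so it should be safe to run over the whole document.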
 
Thanks again. I see. But wouldn't that require me to manually select each surname? Or is there some way to find all text in ALLCAPS and auto-apply the Format>Change Case>Title Case option?
I don't know of any text editor or word processor that could do that.
What I was thinking, but didn't succeed in doing, was if there was a way to use wildcards to say:
find everything that starts with a "-", has a variable number of wildcard characters in between, and ends with a ")"
Then I could just replace it with nothing.
I don't know of any other way to do that besides using the Text to Columns in Excel. When you use the Text to Columns, you're not really going to put text into separate columns. You're going to eliminate everything after the "-" by not importing that column:
Picture 3.jpg

Or you could use my other recommendation:
...If you don't want to use Excel, you could replace all the "-" with a paragraph mark in Word, which would put everything after the "-" on a separate line:

AUS1 goalkeeper Andrew REDMAYNE
(Central Coast Mariners, AUS)
AUS2 defender Daniel MULLEN
(Adelaide Utd., AUS)
AUS3 defender Luke DEVERE
(Brisbane Roar, AUS)

then sort and delete those lines.
 
Sweet!
Using your Excel method I've now got it down to:

AUS1 goalkeeper Andrew REDMAYNE
AUS2 defender Daniel MULLEN
AUS3 defender Luke DEVERE

I just need to go through, manually, it seems, and convert all the ALLCAPS to Allcaps. Seems strange there's no way to automate it, but I'm going to try.... Thanks so much for all your help you've been really useful.
 
I just need to go through, manually, it seems, and convert all the ALLCAPS to Allcaps. Seems strange there's no way to automate it, but I'm going to try.... Thanks so much for all your help you've been really useful.

There is, but you'll need a real text editor and some knowledge of regular expressions. For a text editor, try TextWrangler. For regular expressions, use Google. Or you could write a shell script, but you'll still need regular expressions to do it.

I don't know of any text editor or word processor that could do that.

Pretty much any text editor with regex matching can do this.
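
As a rough illustration of the regex approach (a Python sketch, not any particular editor's feature): match runs of two or more capital letters and rewrite each match in place. The names `ALLCAPS_WORD` and `title_case_caps` are hypothetical, and the pattern only covers plain A-Z, so accented surnames would need a broader character class.

```python
import re

# A whole word made of two or more ASCII capitals.
# "AUS1" is skipped because there is no word boundary between "S" and "1".
ALLCAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")

def title_case_caps(line: str) -> str:
    """Turn every ALLCAPS word in the line into Allcaps, leaving other text alone."""
    return ALLCAPS_WORD.sub(lambda m: m.group(0).capitalize(), line)

for line in [
    "AUS1 goalkeeper Andrew REDMAYNE",
    "AUS2 defender Daniel MULLEN",
    "AUS3 defender Luke DEVERE",
]:
    print(title_case_caps(line))
```

One caveat: a country code standing on its own (for example "AUS" followed by a tab) would also be matched and become "Aus", so this is best run after the code and number have been joined.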
 
I managed to locate those words that were in ALLCAPS using the expression [{A-Z}]{2,100} but only to find and select them.
I couldn't see how you could do something along the lines of:

"find words matching [{A-Z}]{2,100} and then replace them all in one go with the same word in title/proper case, i.e. Smith not SMITH"

You still seemed to have to do it one by one. I'm halfway through the list at the moment, using MS Word, and just clicking on the ALLCAPS word and then using the keyboard shortcut cmd-option-C twice to toggle from uppercase to title/proper case.
 
Wow. TextSoap does everything.
I created a "custom cleaner" with:

FIND: [{A-Z}]{2,100}
APPLY: Capitalize with Title Case

...and it converted all the ALLCAPS words instantly. Nice find - Thanks :)
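
For completeness, the whole job from the first post could probably also be done in one pass with a short script. This is only a sketch in Python under some assumptions: the first three separators are tabs as described, the club part starts with " - (", and the output uses a single space between "AUS1" and the name. The function name `convert` is hypothetical.

```python
import re

def convert(line: str) -> str:
    """Turn 'AUS<tab>1<tab>GK<tab>Andrew REDMAYNE - (Club, AUS)' into 'AUS1 Andrew Redmayne'."""
    # Tab-separated fields: country code, squad number, position, then name plus club.
    country, number, _position, rest = line.rstrip("\n").split("\t", 3)
    # Keep only the text before the " - (" that introduces the club.
    name = re.split(r"\s+-\s+\(", rest, maxsplit=1)[0]
    # Title-case only the words that are entirely uppercase (the surnames).
    words = [w.capitalize() if w.isupper() else w for w in name.split()]
    return f"{country}{number} {' '.join(words)}"

sample = [
    "AUS\t1\tGK\tAndrew REDMAYNE - (Central Coast Mariners, AUS)",
    "AUS\t2\tD\tDaniel MULLEN - (Adelaide Utd., AUS)",
    "AUS\t3\tD\tLuke DEVERE - (Brisbane Roar, AUS)",
]

for line in sample:
    print(convert(line))
```

A line with fewer than three tabs would raise a ValueError here, which is a reasonable way to spot rows that don't follow the expected format.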
 