Advice on 'cleaning up' text files

Discussion in 'Web Design and Development' started by torndownunit, Feb 1, 2010.

  1. torndownunit macrumors regular

    Jan 4, 2009
    This question could likely also be placed in other sections of the forum, but it's a problem I am trying to solve related to a site I am working on so I am posting it here.

    I normally get clients to supply me copy for their web sites so I can cut and paste, then format. A current client has very little computer knowledge, but did their best (I guess) supplying me copy in Word documents.

    The problem is that the formatting is a complete disaster. The main issue is that there are 2, 3, even 4 spaces between a lot of the words in the sentences and paragraphs.

    Is there a way to clean up these files without having to go through and manually erase all those extra spaces?

    Thanks a lot.
  2. UTclassof89 macrumors 6502


    Jun 10, 2008
    Do a GREP find/replace in Word or InDesign. GREP can search for patterns, such as any number of spaces or paragraph breaks.
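
    For anyone curious what a pattern-based find/replace looks like, here is a minimal sketch in Python rather than Word or InDesign. The function name `squeeze` is made up for illustration, and the exact pattern syntax in Word or InDesign may differ from Python's. The idea is the same, though: one pattern matches "two or more spaces" and another matches "three or more line breaks in a row".

```python
import re


def squeeze(text):
    """Collapse runs of spaces and runs of blank lines (illustrative helper)."""
    # " {2,}" matches two or more consecutive spaces; replace each run with one space.
    text = re.sub(r" {2,}", " ", text)
    # "\n{3,}" matches three or more consecutive newlines (two or more blank lines);
    # replace each run with a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text


print(squeeze("Too   many    spaces  here."))
```

    Because the pattern matches a whole run at once, a single pass is enough however many spaces sit between two words.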
  3. SrWebDeveloper macrumors 68000


    Dec 7, 2007
    Alexandria, VA, USA
    Alternative solution, if Word drives you nuts as it does me:

    If it's 100% text (no embedded images or objects), first save it as .txt to remove all formatting. If there is embedded stuff, then instead highlight and copy only the text portions and paste them into NoteTab (Windows) or BBEdit (Mac).

    Both have features to remove excess spaces, newlines, HTML in general, etc.

  4. torndownunit thread starter macrumors regular

    Jan 4, 2009
    Thanks for the tip. I looked up Grep but was a little overwhelmed by it. I tried to see if there was a GUI version but couldn't find anything.

    Are there any other editors for Mac that have those features? E.g., does TextWrangler have it somewhere?

  5. chown33 macrumors 604

    Aug 9, 2009
    on the Western Slopes, with E. A. Poe
    Use any text editor with Replace All.

    The pattern to find is two spaces. The replacement is one space. Click Replace All until no more replacements occur.

    To understand how it works, consider a series of 7 spaces. This is also a series of 3 pairs-of-spaces (the pattern to find) plus one additional space. So those 3 pairs will each be replaced by one space, which together with the leftover space makes a series of 4 spaces, or 2 pairs. The 2 pairs are replaced by one space each, leaving 2 spaces, or 1 pair. And that pair is replaced by one space.
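
    The same "Replace All until nothing changes" procedure can be sketched in a few lines of Python. This is only an illustration of the idea, not something any particular editor runs internally, and the function name `collapse_spaces` is made up for the example. It also counts the passes so you can check the 7-space walkthrough.

```python
def collapse_spaces(text):
    """Replace two spaces with one, repeatedly, until no pair of spaces remains.

    Returns the cleaned text and the number of Replace All passes it took.
    """
    passes = 0
    while "  " in text:                 # is the two-space pattern still present?
        text = text.replace("  ", " ")  # one "Replace All" pass
        passes += 1
    return text, passes


cleaned, passes = collapse_spaces("word" + " " * 7 + "word")
print(repr(cleaned), passes)
```

    For 7 spaces this takes three passes (7 to 4, 4 to 2, 2 to 1), matching the explanation above.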
