
alwaysanewbie

macrumors newbie
Original poster
Sep 11, 2015
Yorkshire, UK
Hello. Newbie here.


I mean something like this:

I WANDERED lonely as a cloud
That floats on high o'er vales and hills

becomes after pasting into this possibly non-existent application:

- -------- ------ -- - -----
---- ------ -- ---- ---- ----- --- -----


I would love it to exist. Please say it does and it's available for Mac.
 
perl -pi -e 's/\S/\-/g' <file.txt>

Code:
$ cat test.txt
I WANDERED lonely as a cloud
That floats on high o'er vales and hills
$ perl -pi -e 's/\S/\-/g' test.txt
$ cat test.txt
- -------- ------ -- - -----
---- ------ -- ---- ---- ----- --- -----
$
 


Thanks for that. I'm excited that what I asked for can be done but I'm afraid I know nothing about Perl programming or any other kind of programming. I've just looked up a few things about Perl and I know I need to use Terminal on my Mac and put the code into a text editor. Beyond that I'm struggling. Which bits of what you wrote above need to go into my text file each time I want to turn a piece of text into dashes and spaces? And how do I run it on Terminal?
 
That text block is a copy and paste of my terminal session ($ is just the shell prompt).

The first cat test.txt just shows the contents of test.txt (a plaintext file containing the lines in your example). It isn't part of the conversion.

The command perl -pi -e 's/\S/\-/g' test.txt does the actual conversion. Perl is an interpreted language, so you can run Perl code straight from the command line without putting it into a script file.

So in the command perl -pi -e 's/\S/\-/g' test.txt:
  • test.txt is the file to run the perl program on (just think "input file")
  • -p means run the perl program on every line of the input file and print each line afterwards
  • -i means edit the file in place (the result is written back to the same file instead of to a new copy)
  • -e means execute the following "program" (perl code)
  • 's/\S/\-/g' is the actual "program" being executed
    • the three forward slashes are separators that define a search and replace: /search_for_this_pattern/replace_with_this_text/
    • s before the search and replace (///) means substitute (replace whatever the pattern matches with the replacement text)
    • The first half of the /// is a regular expression, where \S matches any single non-whitespace character
    • The second half of the /// is the replacement text, where \- produces a single hyphen (the backslash is harmless but not strictly required here, since a hyphen is only special inside a regular expression character class such as [a-z])
    • g after the search and replace (///) means replace every match on the line (not just the first one).
The second cat test.txt (after the perl command) just shows the contents of test.txt after the perl code was run on it, which confirms that the conversion worked.

So whenever you need to do the character-to-hyphen conversion, put the text into a plaintext file and save it.

Then run perl -pi -e 's/\S/\-/g' <filename> on it, replacing <filename> with the name of your file. The file will be converted in place.

The heart of all this is knowing regular expressions. If you know regular expressions, you could also use a text editor that accepts them in its find/replace function, so you don't have to run perl from the command line. You open the editor, paste the text to convert into a new document, run a find/replace that searches for non-whitespace (\S) and replaces it with a hyphen, and copy the resulting text from the document.

It looks like TextWrangler supports find/replace with regular expressions: http://www.barebones.com/products/textwrangler/
 
I managed to get my text converted to hyphens, but now I realise the script needs an extra subtlety. When it converts my chosen text to hyphens, it needs to leave in all the punctuation from the English passage: commas, colons, semi-colons, full stops, etc. Basically any punctuation mark. Could you show how to do that?

Thank you very much for the detailed help you have given me.
 
This could probably be put into a Service created with Automator, and made to process the current selection in whatever app the Service is invoked from. There are plenty of examples and tutorials around about how to make Services using Automator.

If not a Service, it could easily be made into an Automator Application that accepts dropped files as input and converts them. Again, plenty of Automator examples.
 
s/\w/\-/g

where \w refers to any 'word' character (alphanumeric including underscore).

$ cat test.txt
I managed to get my text converted to hyphens but now I realise the script needs to have an extra subtlety. When it converts my chosen text to hyphens it needs to leave in from the passage of English all the commas, colons, semi-colons, full stops, etc. Basically any punctuation mark. Is it possible you could show how to do that?
$ perl -pi -e 's/\w/\-/g' test.txt
$ cat test.txt
- ------- -- --- -- ---- --------- -- ------- --- --- - ------- --- ------ ----- -- ---- -- ----- --------. ---- -- -------- -- ------ ---- -- ------- -- ----- -- ----- -- ---- --- ------- -- ------- --- --- ------, ------, -----------, ---- -----, ---. --------- --- ----------- ----. -- -- -------- --- ----- ---- --- -- -- ----?

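Note: \w matches only letters, digits, and underscore, so everything else (punctuation, but also symbols such as brackets and curly braces) is left untouched. If you want those symbols converted too, you'll have to add them individually. Perl's ready-made [[:punct:]] class doesn't separate the two groups, since it covers brackets and braces as well as commas and full stops.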

So, the above might be expanded further to,

s/\w|\[|\]|\{|\}/\-/g

which will match any "word" character, any [, any ], any {, or any }. But obviously, you'll have to expand this further to include things like ampersand (&), dollar sign ($), so the regexp might get quite long.

Then again, you could try a reverse replace, but one-line perl'ing that will get quite ugly.
 
That is great. It worked very nicely. It's going to be very useful.
 