Software for turning HTML into Plain Text?




Macette
Sep 12, 2003, 09:12 PM
Howdy,

Does anyone know of software for Mac OS X that I can use to strip HTML tags from a document and turn it into plain text? I've got a whole lot of horribly marked-up pages on a site that I'm redesigning using CSS, and I don't want to have to go through them one by one, getting rid of all the crud.

Something AppleScriptable would be cool...



mrjamin
Sep 13, 2003, 07:41 PM
Originally posted by Macette
Howdy,

Does anyone know of software for Mac OS X that I can use to strip HTML tags from a document and turn it into plain text? I've got a whole lot of horribly marked-up pages on a site that I'm redesigning using CSS, and I don't want to have to go through them one by one, getting rid of all the crud.

Something AppleScriptable would be cool...

a few lines of PHP could do it for you, using a regexp to strip out everything between, and including, each < and >
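
To illustrate the regexp idea, here is a minimal sketch. It is written in Python rather than PHP, and the name strip_tags_regex is made up for this example. It is a blunt tool: a literal < or > in the page text or inside an attribute value will confuse it, and the contents of script and style blocks are left behind as text.

```python
import re

# Matches "<", then any run of characters that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags_regex(html: str) -> str:
    """Remove everything between, and including, each < and >."""
    return TAG_PATTERN.sub("", html)


if __name__ == "__main__":
    sample = '<p class="intro">Hello, <b>world</b>!</p>'
    print(strip_tags_regex(sample))  # prints: Hello, world!
```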

Macette
Sep 13, 2003, 07:50 PM
thanks - that's a good idea. Although it will probably take me as long to write the script as it would just to delete the tags by hand (but at least I'd learn something...)

ta.

mrjamin
Sep 13, 2003, 07:59 PM
here's a hint on how to get started:

http://www.php.net/striptags

that pretty much does it for you!

make sure you read through the comments on the page for further advice as it's not 100% reliable
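
One common way around the unreliability of simple tag stripping is to let a real HTML parser find the text. The sketch below is an assumption-laden illustration, not what strip_tags does internally: it uses Python's standard html.parser module, and the names TextExtractor and html_to_text are invented for the example. It drops the contents of script and style elements and decodes entities such as &amp;.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, ignoring tags and script/style bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    page = "<style>p{color:red}</style><p>Tom &amp; Jerry</p>"
    print(html_to_text(page))  # prints: Tom & Jerry
```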

Macette
Sep 13, 2003, 08:06 PM
thanks! i've been doing a bit of php stuff recently and feel like i'm getting the hang of the syntax, but of course i still don't know a quarter of the available functions. this one is good.

mrjamin
Sep 13, 2003, 08:54 PM
Originally posted by Macette
thanks! i've been doing a bit of php stuff recently and feel like i'm getting the hang of the syntax, but of course i still don't know a quarter of the available functions. this one is good.

i doubt many programmers do! there's no need to know many of them as the php site tells you all you need.

with php (and most other languages) all you need is an idea of how it's structured, then you can find a function that'll do the job.

best piece of advice for any programming? FLOW DIAGRAMS!

Rower_CPU
Sep 13, 2003, 09:24 PM
How about regexp matching in BBEdit?

It'd be easier to play with expressions and see which works best in there than in PHP, IMO.

mrjamin
Sep 14, 2003, 10:29 AM
Originally posted by Rower_CPU
How about regexp matching in BBEdit?

It'd be easier to play with expressions and see which works best in there than in PHP, IMO.

good idea, but BBEdit costs money, and solving problems in PHP does your programming the world of good - it's amazing how versatile PHP actually is, i've started using it for a lot of non-web-based stuff.

Rower_CPU
Sep 14, 2003, 01:55 PM
*ahem*

BBEdit Lite (http://www.versiontracker.com/dyn/moreinfo/macosx/604)

Doctor Q
Sep 14, 2003, 02:31 PM
Will you get what you want if you open the page in a web browser and then save the page as text?

mrjamin
Sep 14, 2003, 03:33 PM
Originally posted by Rower_CPU
*ahem*

BBEdit Lite (http://www.versiontracker.com/dyn/moreinfo/macosx/604)

rock on!

Bare Bones took the download link off their site a while ago and i've been looking for it for a while - quite why versiontracker.com wasn't my first port of call i don't know...

matthew24
Sep 14, 2003, 03:41 PM
Netscape works fine for me (Save as).
I would like to know if it is simple with BBEdit.

tjwett
Sep 14, 2003, 03:56 PM
or you could view the site in a browser and simply copy and paste into TextEdit and save it out.

Macette
Sep 14, 2003, 05:03 PM
thanks for suggestions - I'd thought of the browser one already, but i've got, well, ten years of quarterly journals at about 12-15 articles an issue... so that's... 600 pages or something.

is it possible to AppleScript something like that? I've only ever used AppleScripts made by other people, so I'm not really sure what i can do with it.

i've got bbedit - the full version! registered! paid for! - but again, am looking to do this thing as a batch, and haven't worked out how to make bbedit do that.

tjwett
Sep 14, 2003, 05:54 PM
Originally posted by Macette
...is it possible to AppleScript something like that?...

absolutely! i'm almost certain a script like that exists for BBEdit. it's pretty simple. check out http://www.macscripter.net for scripts or look for one over at the Apple boards.
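
For the batch job Macette describes (600 or so pages), a script can walk the folder and write a plain-text copy of each page. The following is a rough sketch in Python, not an AppleScript or BBEdit solution. It assumes the pages end in .html and are readable as UTF-8, and the function name convert_folder and the folder names in the usage note are hypothetical. It uses a simple regexp to strip tags, so the usual caveats about that approach apply.

```python
import re
from pathlib import Path

# Simple tag matcher; assumes no stray < or > in the page text.
TAG_PATTERN = re.compile(r"<[^>]*>")


def convert_folder(source_dir: str, output_dir: str) -> int:
    """Write a tag-stripped .txt copy of every .html file; return the count."""
    source = Path(source_dir)
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    count = 0
    for page in sorted(source.rglob("*.html")):
        # errors="replace" keeps going if a page is not valid UTF-8.
        html = page.read_text(encoding="utf-8", errors="replace")
        text = TAG_PATTERN.sub("", html)
        # Mirror the source folder layout so each issue stays separate.
        target = (output / page.relative_to(source)).with_suffix(".txt")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        count += 1
    return count
```

Calling convert_folder("journal_site", "journal_text") would leave the original pages untouched and put the text copies in a separate folder, which makes it easy to spot-check a few before trusting the rest.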