Hello!
I am using the following code to display the popular Digg stories on my desktop using GeekTool:
Code:
#!/bin/sh
URL="http://feeds.digg.com/digg/popular.rss"
if [ $# -eq 1 ] ; then
headarg="-$(( $1 * 2 ))"
else
headarg="-8"
fi
curl --silent "$URL" | grep -E '(title>|description>)' | \
sed -n '3,$p' | \
sed -e 's/<title>//' -e 's/<\/title>//' -e 's/<description>/ /' \
-e 's/<\/description>//' | \
sed -e 's/<!\[CDATA\[//g' |
sed -e 's/\]\]>//g' |
sed -e 's/<[^>]*>//g' |
head $headarg | sed G | fmt
In addition to my lovely Digg articles, I get some nasty HTML. Each fragment starts with '<a' and ends with '/>' (without quotes). Like this:
Triumph Takes on Bonnaroo
The Tonight Show's Triumph The Insult Comic Dog was everywhere
at Bonnaroo '09 -- backstage, onstage, Tent City, the Discotheque
Arcade, Scratch DJ Academy.
<a
href="http://feedads.g.doubleclick.net/~at/d5vQ3Wnr3c5yLVVPb2lZGfWopQk/1/da"><img
src="http://feedads.g.doubleclick.net/~at/d5vQ3Wnr3c5yLVVPb2lZGfWopQk/1/di"
border="0" ismap="true"></img></a></p><img
src="http://feeds.feedburner.com/~r/digg/popular/~4/5kjuVK9EVp4"
height="1" width="1"/>
A troubled week in Iran [Pics]
Is there a way I can filter these out, or keep only what I want?
Thanks!
Donald J
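
One possible way to filter the leftovers, offered as a sketch rather than a tested fix. It assumes the ad markup arrives in the feed entity-escaped (as &lt;a ... /&gt;), which would explain why the existing 's/<[^>]*>//g' stage never sees a literal '<'. The function name strip_html is made up for illustration.

```shell
#!/bin/sh
# strip_html (hypothetical helper): read text on stdin, write it to stdout
# with HTML tags removed, whether they are literal or entity-escaped.
# Assumption: each tag opens and closes on the same input line.
strip_html() {
  # 1. Decode &lt; &gt; &quot; so escaped tags become real tags.
  # 2. Delete every <...> tag.
  # 3. Decode &amp; last, so text such as "&amp;lt;" is not turned into a tag.
  # 4. Trim the trailing whitespace left where tags used to be.
  sed -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&quot;/"/g' \
      -e 's/<[^>]*>//g' \
      -e 's/&amp;/\&/g' \
      -e 's/[[:space:]]*$//'
}
```

Under that assumption, replacing the sed -e 's/<[^>]*>//g' stage of the pipeline with strip_html should drop the ad links and tracking images. It would also eat legitimate text that happens to look like a tag, for example '1 &lt; 2 and 3 &gt; 2'.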