
MrSugar

macrumors 6502a
Original poster
Jul 28, 2003
Hi, I have a task here at work: taking a webpage that is full of email addresses, each one separately embedded in a table. I need to get them into a list so all of them can be emailed at once. Is there any way to scan a webpage and pick up anything that is an email address and none of the other HTML? What are my options here? Thanks so much, guys!!
 
MrSugar said:
Hi, I have a task here at work: taking a webpage that is full of email addresses, each one separately embedded in a table. I need to get them into a list so all of them can be emailed at once. Is there any way to scan a webpage and pick up anything that is an email address and none of the other HTML? What are my options here? Thanks so much, guys!!

This is pretty easily done with regular expressions in Perl or a similar language.

So write a shell script to:
1. curl or wget the page and save it to a file
2. pass that filename to a Perl script, let it extract the email addresses, and save that list to a file
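Here is a rough sketch of those two steps, done entirely in the shell instead of Perl. It assumes curl is installed and that your grep supports -o (GNU and BSD grep both do). The function name extract_emails, the URL argument, and the output file emails.txt are made up for illustration, and the pattern only approximates what an email address looks like. It also pipes curl straight into the extractor instead of saving the page first.

```shell
#!/bin/sh
# Sketch only: the URL argument and emails.txt are hypothetical placeholders.

# Read HTML on stdin and print each email-like string once, one per line.
# The pattern is a rough approximation, not a full address validator.
extract_emails() {
    grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' | sort -u
}

# When a URL is given as the first argument:
#   step 1: fetch the page with curl,
#   step 2: extract the addresses and save the list to emails.txt.
if [ -n "${1:-}" ]; then
    curl -s "$1" | extract_emails > emails.txt
fi
```

Saved under a name of your choosing (say getmails.sh), it would be run as: sh getmails.sh followed by the page's URL.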


With that said, I hope you're not a spammer.
 
kingjr3 said:
This is pretty easily done with regular expressions in Perl or a similar language.

So write a shell script to:
1. curl or wget the page and save it to a file
2. pass that filename to a Perl script, let it extract the email addresses, and save that list to a file


With that said, I hope you're not a spammer.

I asked my friend Wade, and he did what you said. Check it out:

cat userinfo.txt |
  grep -E "([A-Za-z0-9_]|\\-|\\.)+@([A-Za-z0-9_]|\\-|\\.|\\&)+([A-Za-z0-9_]|\\-|\\.)+" |
  awk '{print $1}' |
  grep -v "<mailto"
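Since the goal is to email everyone at once, the one-per-line output can be joined into a single comma-separated line for pasting into a To or Bcc field. A small sketch, assuming the addresses are already saved one per line; join_recipients and emails.txt are made-up names:

```shell
#!/bin/sh
# Sketch only: join_recipients and emails.txt are hypothetical names.

# Read addresses on stdin, one per line, and print them as a single
# comma-separated line.
join_recipients() {
    paste -s -d ',' -
}

# Example use, assuming emails.txt holds one address per line:
#   join_recipients < emails.txt
```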


=) .. I am not a spammer btw
 