
toddburch

macrumors 6502a
Original poster
Dec 4, 2006
748
0
Katy, Texas
This is a general question - not necessarily pinned to anything Mac related.

How does one go about programmatically extracting data from a web site that was designed for humans running searches, and then processing that data?

Let me give you an example.

On http://www.realty.com, I can search for houses. Let's say I want to write an application that will find all houses in the Houston area, in Katy, that are selling for < $100,000 and have 3 bedrooms and 2 baths. Then, once I've found all those, I want to dig deeper.

If I go to that website, I click on Texas on a map of the US, then pick "Houston Area", then pick Katy from a drop-down list. Then a new window opens where I can specify price, rooms, and baths, and click "search now". The screen is then populated with data about price, square feet, blah blah blah.

And, then let's say I'm only interested in homes with a price per square foot under a certain threshold.

How would you go about writing a Ruby script or Java application that would do this automatically? Or, given this type of interface, could you even do that?

Todd
 

bbarnhart

macrumors 6502a
Jan 16, 2002
824
1
It is very possible that your data mining proposal is against the license to use the site.

Realty.com maintains this site (the "Site") for your personal entertainment, information, education, and communication. Please feel free to browse the Site. You may download material displayed on the Site for non - commercial, personal use only, provided you also retain all copyright and other proprietary notices contained on the materials. You may not, however, distribute, modify, transmit, reuse, report, or use the contents of the Site, including the text, images, audio, and video, for public or commercial purposes, without the written permission of Realty.com.

Stepping off my soapbox, if you search the tubes for "screen scraper" you will find how to do what you want. It will require some effort.
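
To give a rough idea of what a screen scraper does, here is a minimal sketch in Python (the same approach carries over to Ruby or Java). It only uses the standard library. Everything specific in it is an assumption for illustration: the sample HTML, the table layout, the addresses and prices, and the names `RowCollector`, `to_number`, and `cheap_listings` are all made up, and the real site's markup will certainly differ.

```python
from html.parser import HTMLParser

# Hypothetical search-results markup. The real site's HTML is unknown
# and will need its own parsing rules.
SAMPLE_HTML = """
<table id="results">
  <tr><th>Address</th><th>Price</th><th>Sq Ft</th></tr>
  <tr><td>101 Oak St</td><td>$95,000</td><td>1,900</td></tr>
  <tr><td>202 Elm Ave</td><td>$99,500</td><td>1,250</td></tr>
  <tr><td>303 Pine Rd</td><td>$89,900</td><td>1,600</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            self._in_cell = False
            if self._row:      # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def to_number(text):
    """Turn a string such as '$95,000' or '1,900' into a float."""
    return float(text.replace("$", "").replace(",", ""))


def cheap_listings(html, max_price_per_sqft):
    """Return (address, price per sq ft) for rows under the threshold."""
    parser = RowCollector()
    parser.feed(html)
    results = []
    for address, price, sqft in parser.rows:
        per_sqft = to_number(price) / to_number(sqft)
        if per_sqft < max_price_per_sqft:
            results.append((address, round(per_sqft, 2)))
    return results


if __name__ == "__main__":
    for address, per_sqft in cheap_listings(SAMPLE_HTML, 60):
        print(address, per_sqft)
```

In real use you would replace `SAMPLE_HTML` with the page text fetched over HTTP (for example with `urllib.request.urlopen`). The map clicks and drop-down choices usually end up as query parameters or form fields in the request the browser sends, so watching what the browser submits for one manual search is typically the first step.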
 

toddburch

macrumors 6502a
Original poster
Dec 4, 2006
748
0
Katy, Texas
I had considered that they would have a statement such as that, but didn't look for it. Thanks for posting. In the end, my intention would not violate their terms. Everything would be for my personal use.

I'll look into a screen scraper. Thanks.
 