Data Extraction - how's it work?

Discussion in 'Mac Programming' started by toddburch, Apr 13, 2007.

  1. toddburch macrumors 6502a

    Joined:
    Dec 4, 2006
    Location:
    Katy, Texas
    #1
    This is a general question - not necessarily pinned to anything Mac related.

    How does one go about extracting data from a web site, designed for humans running searches, to programmatically capture information and process it?

    Let me give you an example.

    On www.realty.com, I can search for houses. Let's say I want to write an application that will find all houses in the Houston area, in Katy, that are selling for < $100,000 and have 3 bedrooms and 2 baths. Then, once I've found all those, I want to dig deeper.

    If I go to that website, I click on Texas on a map of the US, then pick "Houston Area", then pick Katy from a drop down list. Then, a new window opens and I can specify price, rooms, baths, and then "search now". The screen is then populated with data about price, square feet, blah blah blah.

    And, then let's say I'm only interested in homes with a price per square foot under a certain threshold.

    How would you go about writing a ruby script or java application that would do this automatically? Or, based on this type of interface, could you even do that?

    Todd
     
  2. bbarnhart macrumors 6502a

    Joined:
    Jan 16, 2002
    Location:
    Stilwell, Kansas
    #2
    It is very possible that your data mining proposal is against the license to use the site.

    Realty.com maintains this site (the "Site") for your personal entertainment, information, education, and communication. Please feel free to browse the Site. You may download material displayed on the Site for non-commercial, personal use only, provided you also retain all copyright and other proprietary notices contained on the materials. You may not, however, distribute, modify, transmit, reuse, report, or use the contents of the Site, including the text, images, audio, and video, for public or commercial purposes, without the written permission of Realty.com.

    Stepping off my soapbox, if you search the tubes for "screen scraper" you will find how to do what you want. It will require some effort.
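
    To give a rough idea of what a screen scraper does, here is a minimal sketch in Python (the same approach carries over to Ruby or Java). It only covers the parse-and-filter half of the job. The HTML in SAMPLE_HTML is made up for illustration, as are the names TableScraper, to_number, and cheap_listings; the real site's markup will look different, so you would need to view the page source and adjust the parsing to match.

```python
from html.parser import HTMLParser

# Hypothetical results markup. A real site's HTML will differ and must be
# inspected first; nothing here reflects the actual realty.com pages.
SAMPLE_HTML = """
<table id="results">
  <tr><th>Address</th><th>Price</th><th>Sq Ft</th></tr>
  <tr><td>1 Example St</td><td>$95,000</td><td>1,900</td></tr>
  <tr><td>2 Example Ave</td><td>$99,000</td><td>1,100</td></tr>
  <tr><td>3 Example Ln</td><td>$80,000</td><td>1,600</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def to_number(text):
    """Turn a display string such as '$95,000' or '1,900' into a float."""
    return float(text.replace("$", "").replace(",", ""))


def cheap_listings(html, max_price_per_sqft):
    """Return (address, price per square foot) for rows under the threshold."""
    scraper = TableScraper()
    scraper.feed(html)
    matches = []
    for address, price, sqft in scraper.rows:
        per_sqft = to_number(price) / to_number(sqft)
        if per_sqft < max_price_per_sqft:
            matches.append((address, round(per_sqft, 2)))
    return matches


if __name__ == "__main__":
    for address, per_sqft in cheap_listings(SAMPLE_HTML, 60):
        print(address, per_sqft)
```

    The other half is getting the HTML in the first place. The map clicks, drop-down, and "search now" button normally end up as ordinary HTTP requests, so a script can usually request the results URL directly (for example with Python's urllib.request.urlopen) and feed the response text to the parser, assuming the site's terms allow it.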
     
  3. toddburch thread starter macrumors 6502a

    Joined:
    Dec 4, 2006
    Location:
    Katy, Texas
    #3
    I had considered that they would have a statement such as that, but didn't look for it. Thanks for posting. In the end, my intention would not violate their terms. Everything would be for my personal use.

    I'll look into a screen scraper. Thanks.
     
  4. SC68Cal macrumors 68000

    Joined:
    Feb 23, 2006
