Web page (text only?) searchable archive

Discussion in 'Web Design and Development' started by chesterville, Nov 1, 2008.

  1. chesterville macrumors newbie

    Sep 26, 2006
    Does anyone know of a program (or a way to build one, other than a cumbersome Access database) that can easily archive and search web pages, specifically their text? An RSS feed has been suggested, but would that let me archive the text and search it for keywords?

    My question probably isn't all that clear, so please allow me to elaborate. Basically, I'm a finance geek. I read 20+ finance blogs on a regular basis, plus the online editions of several newspapers. I frequently wish I could type in a keyword for a topic (say, "foreclosures" or "US GDP") and see the text of the blog posts or newspaper articles on that subject that I found interesting. This is particularly tricky when I'm looking for specific information from a blog post or newspaper article but don't remember which one it was in. Also, my dream (twisted, I know) is to eventually build up an archive spanning years of information so that I can refer back to it in the future (something about not knowing history dooming you to repeat it). So it would be nice if the database actually stored the text on my computer, or on some "cloud computer", where it doesn't depend on the blog still existing in the future. (I'm not looking to break copyright laws, though; it would be strictly for personal use.) If you made it to this part of the thread, wow! Thanks for reading all of this.

    Thanks in advance for any assistance.


  2. chilipie macrumors 6502a


    May 8, 2006
    Most RSS feed readers will archive the data on your computer, so you won't be relying on the sites to still be there. I think using feeds would be the most sensible way to do it, unless you want to manually copy and paste the relevant text from every new item.
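    For anyone who would rather roll their own than rely on a feed reader, here is a rough sketch of the feed-based approach in Python, using only the standard library. It assumes the feed is plain RSS 2.0 with the post text in each item's description element; the table layout, the function names (open_archive, archive_feed, search), and the example.com sample feed are all made up for illustration. Fetching the feed over the network is left out, and the search is a simple substring match rather than a proper full-text index.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET
from html import unescape

# Hypothetical sample feed; a real one would be downloaded from the blog.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example finance blog</title>
<item><title>Housing update</title><link>http://example.com/housing</link>
<description>&lt;p&gt;Foreclosures rose again this quarter.&lt;/p&gt;</description></item>
<item><title>Growth figures</title><link>http://example.com/gdp</link>
<description>US GDP growth slowed.</description></item>
</channel></rss>"""


def open_archive(path=":memory:"):
    """Open (or create) the local archive; pass a file path to keep it on disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "link TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    return conn


def strip_html(markup):
    """Drop tags, decode entities, and collapse whitespace to get plain text."""
    text = re.sub(r"<[^>]+>", " ", markup)
    return " ".join(unescape(text).split())


def archive_feed(conn, rss_xml):
    """Store every item of an RSS 2.0 document, skipping links already saved."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        title = item.findtext("title", default="")
        body = strip_html(item.findtext("description", default=""))
        conn.execute(
            "INSERT OR IGNORE INTO posts VALUES (?, ?, ?)", (link, title, body)
        )
    conn.commit()


def search(conn, keyword):
    """Return (title, link) pairs whose title or text contains the keyword."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT title, link FROM posts "
        "WHERE title LIKE ? OR body LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return rows.fetchall()


if __name__ == "__main__":
    archive = open_archive()
    archive_feed(archive, SAMPLE_FEED)
    for title, link in search(archive, "foreclosures"):
        print(title, link)
```

    Because the text is copied into a local SQLite file, the archive keeps working even if the blog disappears later. Running archive_feed on the same feed again should not create duplicates, since the link is used as the key.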
  3. Nugget macrumors 68000


    Nov 24, 2002
    Houston Texas USA
    You might want to take a look at BrowseBack, which is more or less designed to do what you describe (although it stores pages as PDF, so you retain formatting and image information as well).

    It's really slick, but can be a bit resource intensive.
  4. angelwatt Moderator emeritus


    Aug 16, 2005
  5. chesterville thread starter macrumors newbie

    Sep 26, 2006
    Thanks everyone for the responses. I'll be looking into all of your suggestions.
