Check for Changes without Downloading?




ArtOfWarfare
Mar 7, 2013, 08:28 PM
I'm making a web browser. One of the features I'd like is to have it only show a reload button if the page has changed since it last loaded.

I figured a good way of doing that would be if I could somehow request an MD5 checksum of the page every so many seconds and compare it against the checksum of the currently loaded page. Is that somehow possible?

Alternatively, is there some way I could only download a portion of a page? Like, maybe my browser could just download a random portion of a page, and if it appears to be unchanged, it halts?



ytk
Mar 7, 2013, 11:21 PM
I don't believe there's any way to request a checksum of a web page. That would have to be something supported by HTTP and the web server, and I've never heard of such a thing.

Downloading a portion of the page isn't really possible either as far as I know. You could simply request the hypertext portion of the page and not load linked images and whatnot. Not really sure why you'd want to do this, though, because it's not going to be terribly reliable in determining whether the overall “page” has changed, and it's going to use far more bandwidth and system resources than are really merited for the “problem” you're trying to solve.

Lastly, constantly hitting a server in the background just to see if the page has changed is rude. If I were running a web server, and I noticed that a certain type of browser is constantly reloading the page, I'd block that browser pretty quick.

chown33
Mar 8, 2013, 12:13 AM
I'm making a web browser. One of the features I'd like is to have it only show a reload button if the page has changed since it last loaded.

I advise against this.

For one thing, there's no way the browser can determine why I might want to perform a reload. There are plenty of reasons I might reload other than that the page changed. Preventing the user from doing something as simple as this seems misguided to me.

What are you trying to save?

If the goal is to save on unnecessary transfers, then caching is a good idea. Preventing users from overriding the cache-control is not.

I figured a good way of doing that would be if I could somehow request an MD5 checksum of the page every so many seconds and compare it against the checksum of the currently loaded page. Is that somehow possible?

You should take a closer look at the HTTP protocol, as described in RFCs. In particular, learn how HEAD requests work, what ETags are, and how to perform a conditional GET request.
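As a rough sketch of the HEAD idea: a HEAD request returns the headers a GET would, without the body, so a browser could compare validator headers such as ETag and Last-Modified against the values it saved when the page loaded. The function names and the `open_url` parameter below are made up for this example (the parameter only exists so the logic can be exercised without a network), and servers are not obliged to send either header.

```python
import urllib.request

# Headers a server may send to identify a particular version of a page.
VALIDATOR_HEADERS = ("ETag", "Last-Modified")


def fetch_validators(url, open_url=urllib.request.urlopen):
    """Send a HEAD request and return whichever validator headers came back."""
    request = urllib.request.Request(url, method="HEAD")
    with open_url(request) as response:
        found = {}
        for name in VALIDATOR_HEADERS:
            value = response.headers.get(name)
            if value is not None:
                found[name] = value
        return found


def looks_changed(saved, current):
    """Compare validators saved at load time with freshly fetched ones.

    Returns True or False when at least one validator exists on both sides,
    and None when there is nothing to compare (no way to tell).
    """
    shared = [n for n in VALIDATOR_HEADERS if n in saved and n in current]
    if not shared:
        return None
    return any(saved[n] != current[n] for n in shared)
```

The `None` case matters: when a server sends no validators at all, the honest answer is "unknown", not "unchanged".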

A starting point for ETags:
http://en.wikipedia.org/wiki/HTTP_ETag

In my experience, an ETag is often a hash; MD5 and SHA-1 are not uncommon. AFAIK there is no universal standard for how an ETag is calculated. And given the purpose of ETags, there doesn't need to be one.
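A conditional GET built on ETags might look roughly like this. It is only a sketch: `check_for_change` and its `open_url` parameter are names invented for the example, and it assumes the server actually sends ETags and honors If-None-Match. With Python's urllib, a 304 Not Modified reply surfaces as an `HTTPError`, which is why the "unchanged" case is handled in the `except` branch.

```python
import urllib.error
import urllib.request


def check_for_change(url, cached_etag, open_url=urllib.request.urlopen):
    """Return (changed, etag) using a conditional GET.

    cached_etag is the ETag value saved from the previous load, or None.
    open_url is injectable only so the logic can be tried without a network.
    """
    request = urllib.request.Request(url)
    if cached_etag:
        # Ask the server to send the body only if its ETag differs from ours.
        request.add_header("If-None-Match", cached_etag)
    try:
        with open_url(request) as response:
            # A full response came back, so treat the page as changed.
            return True, response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: the cached copy is still current.
            return False, cached_etag
        raise
```

A server that ignores If-None-Match will answer 200 every time, so this sketch would report "changed" on every check for such a site.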

Alternatively, is there some way I could only download a portion of a page? Like, maybe my browser could just download a random portion of a page, and if it appears to be unchanged, it halts?

HTTP requests support byte ranges. Servers may or may not support the capability. See the HTTP protocol specs as written in RFCs.
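For illustration, a byte-range request is an ordinary GET with a Range header. A server that supports ranges answers 206 Partial Content with a Content-Range header; one that does not may simply answer 200 and send the whole page. The helper names below are hypothetical, and the sketch only builds the request and interprets the reply without sending anything.

```python
import re
import urllib.request


def build_range_request(url, first_byte, last_byte):
    """Build a GET asking only for bytes first_byte..last_byte (inclusive)."""
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return request


def parse_content_range(value):
    """Parse a Content-Range value like 'bytes 0-99/1234' into (first, last, total).

    total is None when the server reports the full length as '*'.
    Returns None if the value is not in that form.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        return None
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)


def range_was_honored(status_code):
    """206 means only the requested part was sent; 200 means the Range
    header was ignored and the whole page came back."""
    return status_code == 206
```

Note that an unchanged slice says nothing about the rest of the page, so sampling a random portion can miss an edit elsewhere.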


This seems like you're adding a lot of complexity just to disable a Refresh button in a browser. What do you hope to gain? What benefit would users gain?

ArtOfWarfare
Mar 8, 2013, 09:44 AM
I know plenty of people who simply mash the refresh button every 5 seconds because they're waiting for something to appear (a response to a post, a news article, an edit they just uploaded from some other application, etc.). Many of them, in their paranoid mistrust of technology, will insist that the browser isn't properly reloading.

I feel like these people might feel better if there were an indicator that tells them nothing has changed, so don't try reloading. These people trust the popup in the Facebook iOS app that says "10 new comments!" at the top of the screen, so I figure they might similarly trust that if the browser isn't saying updates have occurred (in the form of the refresh button not being ghosted), then updates must not have occurred yet.

The ETags sound like they'd be great if they were required in any way... + it sounds like sites like Hulu screw the concept all up.

SPDY sounds like it does what I want... except it looks like not many websites use that...

960design
Mar 8, 2013, 10:17 AM
Have you tried reading up on websockets? (That read as snide, sorry; I'm probably way out of my league here.) You could push content to their page when it changes. This still won't stop the paranoid from refreshing, but nothing will, and you shouldn't try; it makes them happy. Haha.

robvas
Mar 8, 2013, 10:34 AM
Build your site as a big fat JavaScript app (using some framework like Angular) that polls the server for updates. You build an API on your web app so the clients can say stuff like 'has anything been updated since this time, or since the last update that I got, and if so, give me the new stuff'.

Then you're only pushing down the comments/stories you don't have saved locally on the client, instead of all of them.
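The server side of that "what's new since X" API could be sketched like this. It is written in Python only to show the logic; the `Comment` type, the field names, and the response shape are all assumptions for the example, not any particular framework's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    comment_id: int
    posted_at: float  # seconds since the epoch
    text: str


def updates_since(comments, last_seen_at):
    """Return only the comments posted after last_seen_at, oldest first."""
    newer = [c for c in comments if c.posted_at > last_seen_at]
    return sorted(newer, key=lambda c: c.posted_at)


def poll_response(comments, last_seen_at):
    """Build the reply to a client asking 'anything new since last_seen_at?'."""
    new_items = updates_since(comments, last_seen_at)
    return {
        "changed": bool(new_items),
        "items": new_items,
        # The client sends this value back as last_seen_at on its next poll.
        "latest": new_items[-1].posted_at if new_items else last_seen_at,
    }
```

A client that polls with the `latest` value from its previous reply only ever receives items it does not already have.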

960design
Mar 8, 2013, 10:59 AM
Build your site as a big fat Javascript app (using some framework like Angular)

or node