Fastest way to get DOMDocument from URL

Discussion in 'Mac Programming' started by Thethuthinang, Mar 18, 2011.

  1. Thethuthinang macrumors member

    Jan 3, 2011
    I am writing a program that parses the DOMDocument of a given URL. If I use a WebView, this process takes a very long time. I suspect it is because a WebView will process data for display, even if the WebView is not connected to a window (is this true?). An alternative would be to bypass WebView and retrieve the data as follows:

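    // Synchronous fetch: this call blocks until the whole response has arrived.
    NSData *nsdata = [NSData dataWithContentsOfURL:[NSURL URLWithString:someURL]];
    // Interpret the downloaded bytes as ASCII text.
    NSString *myString = [[NSString alloc] initWithData:nsdata encoding:NSASCIIStringEncoding];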
    I now have the data as an NSString. But how can I turn this into a DOMDocument so that I can use the tools provided in the DOM libraries?

    Is there a better way to do this?

    A related question: What data does "dataWithContentsOfURL:" get? What if there are several files located at the given URL? Does it retrieve them all?
  2. kainjow, Mar 18, 2011
    Last edited: Mar 18, 2011

    kainjow Moderator emeritus


    Jun 15, 2000
    I don't know of a way outside of WebKit. There may be third-party libraries you could incorporate but they probably won't handle things like JavaScript and such. It depends on what you need.
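
    If staying inside WebKit is acceptable, one possible route from an HTML string to a DOMDocument is to hand the string to an offscreen WebView and read the document once the frame finishes loading. The following is only a rough sketch, assuming ARC and the legacy WebView API; the class name DOMLoader and the link count are made up for illustration, and all of this has to run on the main thread.

    ```objc
    #import <WebKit/WebKit.h>

    // Hypothetical helper: parses an HTML string with an offscreen WebView.
    @interface DOMLoader : NSObject
    @property (nonatomic, strong) WebView *webView;
    - (void)loadHTML:(NSString *)html baseURL:(NSURL *)baseURL;
    @end

    @implementation DOMLoader

    - (void)loadHTML:(NSString *)html baseURL:(NSURL *)baseURL {
        // The view is never added to a window; it only serves as a parser.
        self.webView = [[WebView alloc] initWithFrame:NSZeroRect frameName:nil groupName:nil];
        // Cast to id because older SDKs declare the frame load delegate informally.
        [self.webView setFrameLoadDelegate:(id)self];
        [[self.webView mainFrame] loadHTMLString:html baseURL:baseURL];
    }

    // Frame load delegate callback; fires once per frame, so filter for the main frame.
    - (void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame {
        if (frame != [sender mainFrame]) {
            return;
        }
        DOMDocument *document = [frame DOMDocument];
        DOMNodeList *links = [document getElementsByTagName:@"a"];
        NSLog(@"Parsed document with %u links", [links length]);
    }

    @end
    ```

    The baseURL matters if the page uses relative links, since WebKit resolves them against it.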

    You could play around with WebView and try disabling certain features, like images, etc., and with the delegates to try to cut down on unnecessary loading.
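
    As a rough illustration of the kind of features that can be switched off, here is a sketch that gives a WebView its own WebPreferences object and disables images, plug-ins, Java and JavaScript. It assumes ARC; the function name and the preferences identifier are hypothetical. Whether JavaScript can be disabled depends on whether the pages you parse build their DOM with scripts.

    ```objc
    #import <WebKit/WebKit.h>

    // Hypothetical factory for an offscreen WebView that skips optional work.
    static WebView *MakeLeanWebView(void) {
        WebView *webView = [[WebView alloc] initWithFrame:NSZeroRect frameName:nil groupName:nil];

        // Use a separate preferences object so other WebViews keep their defaults.
        WebPreferences *prefs = [[WebPreferences alloc] initWithIdentifier:@"LeanParserPrefs"];
        [prefs setLoadsImagesAutomatically:NO];  // do not download images
        [prefs setPlugInsEnabled:NO];            // no Flash or other plug-ins
        [prefs setJavaEnabled:NO];               // no Java applets
        [prefs setJavaScriptEnabled:NO];         // only safe if the DOM does not depend on scripts
        [webView setPreferences:prefs];

        return webView;
    }
    ```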

    BTW, don't use the contentsOfURL methods for loading HTTP data. They work, but they are synchronous (they will block the main thread if called there) and give you no control over caching, cancellation, and so on. You can move the call to a background thread so it doesn't block, but you still have no control over it. Use NSURLConnection instead, and reserve the contentsOfURL methods for local file URLs.
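
    To make the NSURLConnection suggestion concrete, here is a minimal sketch of an asynchronous, cancellable fetch using the delegate callbacks. It assumes ARC; PageFetcher, the cache policy and the 30-second timeout are illustrative choices, not requirements. The connection is scheduled on the run loop of the thread that creates it, so it is normally started from the main thread.

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper that downloads one URL without blocking the caller.
    @interface PageFetcher : NSObject
    @property (nonatomic, strong) NSMutableData *buffer;
    @property (nonatomic, strong) NSURLConnection *connection;
    - (void)fetch:(NSURL *)url;
    - (void)cancel;
    @end

    @implementation PageFetcher

    - (void)fetch:(NSURL *)url {
        // The request controls caching and timeout, which contentsOfURL cannot.
        NSURLRequest *request =
            [NSURLRequest requestWithURL:url
                             cachePolicy:NSURLRequestReloadIgnoringLocalCacheData
                         timeoutInterval:30.0];
        self.buffer = [NSMutableData data];
        // Starts loading immediately; callbacks arrive on the current run loop.
        self.connection = [NSURLConnection connectionWithRequest:request delegate:self];
    }

    - (void)cancel {
        [self.connection cancel];
        self.connection = nil;
    }

    - (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response {
        // Can be called more than once (e.g. after a redirect), so start over.
        [self.buffer setLength:0];
    }

    - (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {
        [self.buffer appendData:data];
    }

    - (void)connectionDidFinishLoading:(NSURLConnection *)connection {
        NSLog(@"Received %lu bytes", (unsigned long)[self.buffer length]);
        self.connection = nil;
    }

    - (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error {
        NSLog(@"Download failed: %@", error);
        self.connection = nil;
    }

    @end
    ```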

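    As for your related question: it gets the HTTP response body for the URL you give it. If the server returns a directory listing as a web page, that page is what you'll get back; it won't fetch the individual files.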
  3. gnuguy macrumors newbie

    Nov 25, 2006
  4. Thethuthinang thread starter macrumors member

    Jan 3, 2011
    Thanks. I disabled automatic display of images. That seemed to speed things up a bit. I will look into NSXMLDocument more, but it seems at first glance that it does not enable me to use the DOM library directly.
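
    For comparison, this is roughly what the NSXMLDocument route might look like: it gives an NSXMLNode tree queried with XPath rather than the WebKit DOM classes. This is only a sketch, assuming ARC and assuming the tidied page can be queried with a plain "//a/@href" expression; the function name is hypothetical.

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper: returns the href values of all links in an HTML page.
    static NSArray *LinkTargetsInHTML(NSData *htmlData) {
        NSError *error = nil;
        // NSXMLDocumentTidyHTML asks the parser to clean up HTML into well-formed XML.
        NSXMLDocument *doc = [[NSXMLDocument alloc] initWithData:htmlData
                                                         options:NSXMLDocumentTidyHTML
                                                           error:&error];
        if (doc == nil) {
            NSLog(@"Could not parse HTML: %@", error);
            return nil;
        }

        NSArray *hrefNodes = [doc nodesForXPath:@"//a/@href" error:&error];
        if (hrefNodes == nil) {
            NSLog(@"XPath query failed: %@", error);
            return nil;
        }

        NSMutableArray *targets = [NSMutableArray array];
        for (NSXMLNode *node in hrefNodes) {
            NSString *value = [node stringValue];
            if (value != nil) {
                [targets addObject:value];
            }
        }
        return targets;
    }
    ```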
