Problem reading HTML contents and storing them in NSData

Discussion in 'Mac Programming' started by saleh.hi.62, Aug 1, 2011.

  1. saleh.hi.62 macrumors member

    Joined:
    Jul 25, 2011
    #1
    Hello guys,

    I am trying to read the contents of an HTML file and store them in an NSData object.

    Code:
    NSData *htmlData = [NSData dataWithContentsOfFile:@"index.html"];
    	NSLog(@"%@", htmlData);
    But when it runs, it prints only null!

    What is wrong here?
     
  2. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #2
    My first guess is that the path for the file is wrong. You're using a relative pathname, so there must be a file named "index.html" located in the current working directory. The current working directory is probably NOT the location where your app is.

    To find what the current working directory is, use NSFileManager's currentDirectoryPath method.
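A minimal sketch of that check, assuming a toolchain that supports `@autoreleasepool`. `AbsolutePathForRelativePath` is a hypothetical helper name used only for illustration, not part of Foundation. It logs the working directory and shows where the relative name `index.html` would actually be looked up:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: resolves a relative path against the current
// working directory, which is where a relative path is looked up.
static NSString *AbsolutePathForRelativePath(NSString *relativePath) {
    NSString *cwd = [[NSFileManager defaultManager] currentDirectoryPath];
    return [cwd stringByAppendingPathComponent:relativePath];
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        NSFileManager *fm = [NSFileManager defaultManager];
        NSLog(@"Current working directory: %@", [fm currentDirectoryPath]);

        // Show the full path a relative "index.html" resolves to,
        // and whether a file really exists there.
        NSString *fullPath = AbsolutePathForRelativePath(@"index.html");
        NSLog(@"Looking for: %@ (exists: %@)", fullPath,
              [fm fileExistsAtPath:fullPath] ? @"YES" : @"NO");
    }
    return 0;
}
```

If the logged directory is not the one that holds the file, either pass an absolute path or build the path from a known location.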


    You should also try using the NSData method +dataWithContentsOfFile:options:error:, and pass in a non-null error pointer. Read the description of the method so you know how to use it properly. If you can't get a useful NSError result, post your code.


    You should also re-read the NSData class reference for the +dataWithContentsOfFile: method. It says:
    If you need to know what was the reason for failure, use dataWithContentsOfFile:options:error:.
    So the way to get the reason for failure is right there in the basic reference documentation.
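As a rough sketch of that advice (`ReadFileLoggingError` is a hypothetical helper name, and passing `0` for the options is just one reasonable choice), the call could be wrapped so that a failed read reports its NSError instead of silently yielding nil:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reads a file into NSData and logs the NSError
// when the read fails.
static NSData *ReadFileLoggingError(NSString *path) {
    NSError *error = nil;
    NSData *data = [NSData dataWithContentsOfFile:path
                                          options:0
                                            error:&error];
    if (data == nil) {
        // Test the return value first; the error object is only
        // meaningful when the method reports failure.
        NSLog(@"Could not read %@: %@", path, error);
    }
    return data;
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        NSData *htmlData = ReadFileLoggingError(@"index.html");
        NSLog(@"Read %lu bytes", (unsigned long)[htmlData length]);
    }
    return 0;
}
```

When the path is wrong, the logged error should say so, which is more informative than the bare null from the original snippet.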
     
  3. saleh.hi.62 thread starter macrumors member

    Joined:
    Jul 25, 2011
    #3
    Thanks, my friend!
    I fixed the URL problem. It is OK now.

    But my program still does not work!

    Here is my code:
    Code:
    #import <Foundation/Foundation.h>
    #import "SSJuju.h"
    #import "SSNode.h"
    
    int main (int argc, const char * argv[]) {
        NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    
        NSError * error = nil;
        NSData *htmlData = [NSData dataWithContentsOfFile:@"/index.html" options: NSMappedRead error: &error];
    
        SSJuju *doc = [[[SSJuju alloc] initWithHTMLData:htmlData] autorelease];
        NSArray *elements = [doc search:@"//div[@class='gwblock']"];
        for (SSNode *e in elements) {
            SSINode *bNode = [e firstChild];                            // Node "b"
            do {
                NSString *idValue = [bNode attributeByName:@"id"];      // Easy access to attribute
                NSString *nodeName = [bNode name];                      // Tag name
                NSString *nodeValue = [[bNode firstChild] description];
                NSLog(@"%@ : %@ : %@",idValue,nodeName,nodeValue);
            } while ((bNode = bNode.right));                            // Access to sibling node
        }
    
        [pool drain];
        return 0;
    }
    
    When I tried to print out the contents of htmlData (the NSData), this was the output:
    [IMG: screenshot of the logged NSData contents]

    This is a simple parser program that uses the JUJU HTML parser, which was inspired by the Hpple parser.

    Aside from that printout, my program does not report any errors, but it also does not produce any results!
     


  4. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #4
    Use simpler and smaller data.

    Use a simpler and smaller xpath expression.

    Post the HTML text you're parsing, so we know what to expect. It could be the program is working perfectly, but your data doesn't contain the xpath expression you're expecting it to find. We can't read your disk. We can't see your files. We can't read your mind for what you're expecting to happen.


    Start small, prove that it works with simple data, then work your way up from that. If the complex thing doesn't work, go back to the simplest thing that does work. That's a basic principle of testing and debugging.
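One way to start small, sketched here with Foundation's NSXMLDocument (macOS only) rather than the JUJU parser from the thread, and assuming ARC: feed a tiny, well-formed snippet held in memory to a short XPath query and confirm it matches before moving to the real file. `NodesMatchingXPath` and the sample markup are hypothetical, made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses a small well-formed snippet and returns
// the nodes matching the XPath expression (nil if parsing or the query fails).
static NSArray *NodesMatchingXPath(NSString *markup, NSString *xpath) {
    NSData *data = [markup dataUsingEncoding:NSUTF8StringEncoding];
    NSError *error = nil;
    NSXMLDocument *doc = [[NSXMLDocument alloc] initWithData:data
                                                     options:0
                                                       error:&error];
    if (doc == nil) {
        NSLog(@"Parse failed: %@", error);
        return nil;
    }
    NSArray *nodes = [doc nodesForXPath:xpath error:&error];
    if (nodes == nil) {
        NSLog(@"XPath failed: %@", error);
    }
    return nodes;
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        // Made-up test data: one div of the class being searched for.
        NSString *markup =
            @"<html><body><div class='gwblock'><b id='first'>hello</b></div></body></html>";
        NSArray *nodes = NodesMatchingXPath(markup, @"//div[@class='gwblock']/b");
        NSLog(@"Matched %lu node(s)", (unsigned long)[nodes count]);
        for (NSXMLNode *node in nodes) {
            NSLog(@"%@ : %@", [node name], [node stringValue]);
        }
    }
    return 0;
}
```

If the small case matches but the real file does not, the difference lies in the data or the expression, not in the file reading.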


    Do you understand what the displayed values for the NSData represent? It's hexadecimal. The data is a sequence of bytes, displayed in groups of four.

    For example, the last line:
    Code:
    6970743e 0a0a3c2f 626f6479 3e0a3c2f 68746d6c 3e0a
    Look at an ASCII code table (or a UTF-8 code table), and decode what characters are represented:
    Code:
    ipt>
    
    </body>
    </html>
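
Rather than decoding by hand, the bytes can be turned back into text. A minimal sketch, assuming the file is UTF-8 encoded and the code is built with ARC; `TextFromData` is a hypothetical helper name, and the byte array simply repeats the hex line quoted above:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: interprets the bytes as UTF-8 text.
// Returns nil if the bytes are not valid UTF-8.
static NSString *TextFromData(NSData *data) {
    return [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        // The 22 bytes from the last line of the hex dump.
        const unsigned char bytes[] = {
            0x69, 0x70, 0x74, 0x3e, 0x0a, 0x0a, 0x3c, 0x2f,
            0x62, 0x6f, 0x64, 0x79, 0x3e, 0x0a, 0x3c, 0x2f,
            0x68, 0x74, 0x6d, 0x6c, 0x3e, 0x0a
        };
        NSData *data = [NSData dataWithBytes:bytes length:sizeof(bytes)];

        NSLog(@"%@", data);                // logs the NSData description (hex)
        NSLog(@"%@", TextFromData(data));  // logs the decoded text
    }
    return 0;
}
```

Logging the decoded string instead of the NSData itself makes it easy to confirm that the whole file was read.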
    
     
