go through html code

Discussion in 'iOS Programming' started by DennisBlah, Feb 4, 2014.

  1. DennisBlah macrumors 6502

    Joined:
    Dec 5, 2013
    Location:
    The Netherlands
    #1
    Hi there, I'm trying to walk through some HTML source to find out which tags it contains.
    Basically, I'm looking for form input tags, and I want to get the ID and/or class and/or default value and so on. Then I will save them to a table view.

    So far I have this:
    Code:
    long length = [curPageSource length];
    BOOL openCode = NO;
    long openCodeLoc = 0;
    BOOL closeCode = NO;
    long closeCodeLoc = 0;
    for(int chr = 0; chr < length; chr++) {
        char curCharacter = [curPageSource characterAtIndex: chr];
        if(!openCode && curCharacter == '<') {
            NSString *checkCode = [curPageSource substringWithRange:NSMakeRange(chr, 10)];
            if([checkCode rangeOfString:@"<input"].location != NSNotFound) {
                openCode = YES;
                openCodeLoc = chr;
                NSLog(@"Found input");
            }
        }

        if(openCode && curCharacter == '>') {
            closeCode = YES;
            closeCodeLoc = chr;
        }

        if(openCode && closeCode) {
            NSLog(@"Going through the found code...");
            openCodeLoc = 0;
            closeCodeLoc = 0;
            openCode = NO;
            closeCode = NO;
        }
    }
    
    The first few matches go fine, but eventually it ends in a crash:
    -[__NSCFString substringWithRange:]: Range {23491, 10} out of bounds; string length 23498

    Why does this happen, and how can I fix it?
     
  2. ArtOfWarfare macrumors 604

    Joined:
    Nov 26, 2007
    #2
    Your problem is that your loop runs over every character, and whenever it hits a '<' it tries to grab 10 characters starting at that position, even when fewer than 10 remain before the end of the string.

    Here are the two parts of the code causing your problem:

    Code:
    for(int chr = 0; chr < length; chr++)
    [curPageSource substringWithRange:NSMakeRange(chr, 10)];
    Either stop the loop 9 characters before the end (so a full 10-character range always fits), or have NSMakeRange grab fewer characters when you are near the end of the string.
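
    A minimal sketch of the second option, assuming the page source is an NSString as in the original post. The helper names ClampedSubstring and CountInputTags are made up for illustration, and the prefix check is a simplification of the original rangeOfString test:

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper: returns up to maxLength characters starting at index,
    // clamped so the range never extends past the end of the string.
    static NSString *ClampedSubstring(NSString *source, NSUInteger index, NSUInteger maxLength) {
        if (index >= source.length) {
            return @"";
        }
        NSUInteger available = source.length - index;
        NSUInteger count = MIN(maxLength, available);
        return [source substringWithRange:NSMakeRange(index, count)];
    }

    // Hypothetical helper: counts positions where an "<input" tag opens,
    // using the clamped lookup instead of a fixed 10-character range.
    static NSUInteger CountInputTags(NSString *source) {
        NSUInteger found = 0;
        for (NSUInteger i = 0; i < source.length; i++) {
            if ([source characterAtIndex:i] != '<') {
                continue;
            }
            NSString *checkCode = ClampedSubstring(source, i, 10);
            if ([checkCode hasPrefix:@"<input"]) {
                found++;
            }
        }
        return found;
    }
    ```

    With the range clamped this way, a '<' in the last few characters of the string just yields a shorter substring instead of raising an out-of-bounds exception.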

    Here's some example html that will instantly fail:

    Code:
    <HTML>
    It'll enter your loop, see that you have a <, then try grabbing the next ten characters and crash because there aren't ten left to grab.

    You could add that as a unit test (a good topic to learn about).

    You could also learn about regular expressions (regex), which would let you replace all of the code you have so far with about one line that actually works. (In Cocoa/Cocoa Touch, the class for using regex patterns is called NSRegularExpression.)
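
    A rough sketch of what that might look like, assuming the page source is available as an NSString. The helper name InputTagsInHTML and the pattern are illustrative only; the pattern will misfire on attribute values that contain a '>' character, so a real HTML parser is more robust:

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper: returns the source text of every <input ...> tag.
    static NSArray *InputTagsInHTML(NSString *html) {
        NSError *error = nil;
        // "<input", a word boundary, then everything up to the next '>'.
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:@"<input\\b[^>]*>"
                                                      options:NSRegularExpressionCaseInsensitive
                                                        error:&error];
        if (regex == nil) {
            NSLog(@"Invalid pattern: %@", error);
            return @[];
        }

        NSMutableArray *tags = [NSMutableArray array];
        NSArray *matches = [regex matchesInString:html
                                          options:0
                                            range:NSMakeRange(0, html.length)];
        for (NSTextCheckingResult *match in matches) {
            // match.range always lies inside html, so this cannot go out of bounds.
            [tags addObject:[html substringWithRange:match.range]];
        }
        return tags;
    }
    ```

    Each returned string is one complete tag, which could then be searched for id, class, and value attributes before filling the table view.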
     
  3. waterskier2007 macrumors 68000

    Joined:
    Jun 19, 2007
    Location:
    White Lake, MI
    #3
    I didn't really check out your code, but if you are looking at doing HTML parsing, you may want to check out hpple.
     
  4. DennisBlah thread starter macrumors 6502

    Joined:
    Dec 5, 2013
    Location:
    The Netherlands
    #4
    Thanks for the replies and the link to the GitHub project. I already found the error: I didn't reset the positions once a tag had been found and walked through :)
    Thanks again.
     
