How to convert/decode HTML entities

Discussion in 'iOS Programming' started by pashik, Nov 25, 2008.

  1. macrumors member

    Joined:
    Jul 16, 2008
    #1
    Hello.

    I parse XML using NSXMLParser.
    But I get XML-encoded strings that look like

    Code:
    <![CDATA[We & #8217;ve been ....]]>
    
    (I inserted a space between & and #8217; otherwise the forum converts it to the actual symbol here)

    and the text extracted from this looks like
    Code:
    We & #8217;ve been ....
    instead of We’ve been ....

    How can I avoid/remove/convert all such HTML-encoded entities?

    Thanks for any tips.

    Here is the code I use:
    Code:
    // Download the feed and decode it as UTF-8.
    NSData *dat = [NSData dataWithContentsOfURL:[NSURL URLWithString:@"someurl"]];
    NSString* data = [[[NSString alloc] initWithData:dat encoding:NSUTF8StringEncoding] autorelease];

    // XMLParser is a custom class, not part of Foundation.
    XMLParser* parser = [[XMLParser alloc] init];
    [parser parseXMLData: data];

    // Presumably the body of parseXMLData:, where the string is turned back into NSData.
    NSData* ndata = [data dataUsingEncoding: NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:ndata];
    
     
  2. Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #2
    NSString has methods to both encode and decode HTML entities. Look at the NSString documentation.
     
  3. thread starter macrumors member

    Joined:
    Jul 16, 2008
    #3
    There is only the NSStringEncoding encoding parameter for working with strings.
    Can you tell me which method/function decodes HTML entities?
     
  4. Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #4
    Did you even look at the NSString documentation as I suggested? The ability to read and understand the documentation is a key programming skill, and one that you appear to be lacking.

    Anyway, the method I was referring to is this one.
     
  5. thread starter macrumors member

    Joined:
    Jul 16, 2008
    #5
    It's strange, because I really did look at it, and all I see there is string encoding parameters and CFURLCreateStringByReplacingPercentEscapes.

    But anyway, thanks for the link.
     
  6. macrumors regular

    xsmasher

    Joined:
    Jul 18, 2008
    #6
    That is for URL encoding/decoding, not HTML entity encoding/decoding.
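
    A quick way to see the difference, as a rough sketch: percent-decoding turns %20 into a space but leaves a numeric entity untouched. `PercentDecode` is a hypothetical helper name, and `stringByRemovingPercentEncoding` is the newer Foundation method (iOS 7 / OS X 10.9 and later) that does the same job as the older percent-escape methods mentioned in this thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: undoes URL percent-encoding only.
// HTML/XML entities such as &#8217; are not touched by this call.
static NSString *PercentDecode(NSString *s) {
    return [s stringByRemovingPercentEncoding];
}
```

    Passing in @"We%20&#8217;ve" should give back a string with the space restored and the &#8217; still in place.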

    Here's HerbertHansen's solution, from the Apple boards:
    http://discussions.apple.com/message.jspa?messageID=8064367#8064367

    Code:
    @implementation MREntitiesConverter
    @synthesize resultString;
    - (id)init
    {
    	// Assign the result of [super init] to self before using it.
    	self = [super init];
    	if(self) {
    		resultString = [[NSMutableString alloc] init];
    	}
    	return self;
    }
    // NSXMLParser delegate callback: character references arrive here already decoded.
    - (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)s {
    	[self.resultString appendString:s];
    }
    - (NSString*)convertEntiesInString:(NSString*)s {
    	if(s == nil) {
    		NSLog(@"ERROR : Parameter string is nil");
    		return nil;
    	}
    	// Start from an empty buffer so repeated calls do not accumulate text.
    	[self.resultString setString:@""];
    	// Wrap the text in a dummy element so it forms a complete XML document.
    	NSString* xmlStr = [NSString stringWithFormat:@"<d>%@</d>", s];
    	NSData *data = [xmlStr dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
    	NSXMLParser* xmlParse = [[NSXMLParser alloc] initWithData:data];
    	[xmlParse setDelegate:self];
    	[xmlParse parse];
    	[xmlParse release];
    	// Return an autoreleased copy, as the method name implies by convention.
    	return [NSString stringWithString:resultString];
    }
    - (void)dealloc {
    	[resultString release];
    	[super dealloc];
    }
    @end
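
    The implementation above is posted without its @interface, so here is a self-contained sketch of the same technique with the declaration included. It is not HerbertHansen's original header: the names `EntityDecoder`, `decode:` and `DecodeEntities` are made up for illustration, the code assumes manual reference counting (no ARC) like the rest of this thread, and it adopts the formal NSXMLParserDelegate protocol that newer SDKs declare.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class; written for manual reference counting (no ARC).
@interface EntityDecoder : NSObject <NSXMLParserDelegate> {
    NSMutableString *buffer;
}
- (NSString *)decode:(NSString *)s;
@end

@implementation EntityDecoder
- (id)init {
    self = [super init];
    if (self) {
        buffer = [[NSMutableString alloc] init];
    }
    return self;
}
// The parser hands over text with character references already resolved.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)s {
    [buffer appendString:s];
}
- (NSString *)decode:(NSString *)s {
    if (s == nil) return nil;
    [buffer setString:@""];
    // Wrap the text in a dummy element so it is a complete XML document.
    NSString *xml = [NSString stringWithFormat:@"<d>%@</d>", s];
    NSXMLParser *p = [[NSXMLParser alloc]
        initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    [p setDelegate:self];
    BOOL ok = [p parse];
    [p release];
    // If the text is not well-formed XML, hand back the input unchanged.
    return ok ? [NSString stringWithString:buffer] : s;
}
- (void)dealloc {
    [buffer release];
    [super dealloc];
}
@end

// Hypothetical convenience wrapper around the class above.
static NSString *DecodeEntities(NSString *s) {
    EntityDecoder *d = [[[EntityDecoder alloc] init] autorelease];
    return [d decode:s];
}
```

    Calling DecodeEntities on the string from the first post (the raw CDATA text with the 8217 entity in it) should return the text with a real right single quote in its place. Returning the input unchanged when parsing fails is a design choice of this sketch, not something the original solution does.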
    
     
