
How to get data containing HTML tags from an XML file while parsing with NSXMLParser




psudheer28
Aug 25, 2010, 04:41 AM
Hi all,

I am using NSXMLParser to parse an XML file in my application. Thanks in advance for any help.

My XML file looks like this:

<item>
  <ID>123456</ID>
  <category>Films</category>
  <Heading>HollyWood films</Heading>
  <Author>samule</Author>
  <imageFull>http://tree_one.jpg</imageFull>
  <contentFull>
    New York, the stars will fly to Las Vegas for another one. On New Year's eve no shoot because it's been left free for partying.
    <strong>Costly choices</strong>
    <b>A source</b>
    adds that Sajid wants to make up for the missed family
    <br>time by allowing them to have
    a blast without bothering about anything.
  </contentFull>
  <PubDate>Monday, 14 December 2009</PubDate>
</item>

My question: when the parser reaches the <contentFull> tag, the content is not copied into the string.
I think the inner HTML tags are preventing the content from being read.

How can I solve this, so that the HTML tags are not treated as XML elements and the content keeps the formatting given in the XML file (bold, break, strong, etc.)?

Please guide me. Is this possible with NSXMLParser?



ianray
Aug 25, 2010, 05:11 PM
No XML parser is going to be happy with inter-mixed XML and HTML elements.

Are you parsing an RSS feed? RSS feeds typically include HTML as CDATA, for example:


<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Foo</title>
<entry>
<content type="html" xml:base="http://example.com/" xml:lang="en"><![CDATA[
<strong>Costly choices</strong>
<b>A source</b> adds that Sajid wants to make up for the missed family
<br>time by allowing them to have a blast without bothering about anything.
]]></content>
</entry>
</feed>
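
Applied to the feed in the question, the same idea would mean wrapping the body of contentFull in a CDATA section when the XML is generated, so that the parser treats the embedded HTML as opaque text. Below is a minimal sketch of the receiving side, assuming the feed can be changed and the code is compiled with ARC. The class name CDATACollector and the helper ContentFromXML are hypothetical names for illustration; parser:foundCDATA: and parser:foundCharacters: are NSXMLParserDelegate methods.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that collects the text of <contentFull>,
// whether it arrives as plain characters or as a CDATA block.
@interface CDATACollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *content;
@property (nonatomic, assign) BOOL insideContent;
@end

@implementation CDATACollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"contentFull"]) {
        self.content = [NSMutableString string];
        self.insideContent = YES;
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"contentFull"]) {
        self.insideContent = NO;
    }
}

// The CDATA block is handed over as UTF-8 encoded NSData.
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock {
    if (!self.insideContent) return;
    NSString *html = [[NSString alloc] initWithData:CDATABlock
                                           encoding:NSUTF8StringEncoding];
    if (html) [self.content appendString:html];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if (self.insideContent) [self.content appendString:string];
}

@end

// Hypothetical helper: returns the collected HTML, or nil if parsing fails.
static NSString *ContentFromXML(NSString *xml) {
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    CDATACollector *collector = [[CDATACollector alloc] init];
    parser.delegate = collector;
    return [parser parse] ? collector.content : nil;
}
```

Passing ContentFromXML an item whose contentFull body is wrapped in <![CDATA[ ... ]]> should return the inner HTML with its tags (including a lone <br>) intact, ready to hand to something that renders HTML.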


Hope this helps :)

psudheer28
Aug 25, 2010, 11:28 PM
Hi ianray,

Thanks for the reply.

I don't know XML well, but I am using that XML file for parsing.

I googled this and found some information about CDATA, but I didn't understand how to use it.

Does the format of the XML file have to change, or
do I have to change my code?

This is my code for XML parsing:

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict {
    if (nil != qualifiedName) {
        elementName = qualifiedName;
    }
    if ([elementName isEqualToString:@"item"]) {
        self.currentItem = [[[BlogRss alloc] init] autorelease];
    } else if ([elementName isEqualToString:@"ID"] ||
               [elementName isEqualToString:@"category"] ||
               [elementName isEqualToString:@"imageFull"] ||
               [elementName isEqualToString:@"Heading"] ||
               [elementName isEqualToString:@"Author"] ||
               [elementName isEqualToString:@"contentFull"] ||
               [elementName isEqualToString:@"PubDate"]) {
        self.currentItemValue = [NSMutableString string];
    } else {
        self.currentItemValue = nil;
    }
}



- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if (nil != qName) {
        elementName = qName;
    }
    if ([elementName isEqualToString:@"ID"]) {
        self.currentItem.ID = self.currentItemValue;
    } else if ([elementName isEqualToString:@"category"]) {
        self.currentItem.category = self.currentItemValue;
    } else if ([elementName isEqualToString:@"Heading"]) {
        self.currentItem.heading = self.currentItemValue;
    } else if ([elementName isEqualToString:@"Author"]) {
        self.currentItem.author = self.currentItemValue;
    } else if ([elementName isEqualToString:@"imageFull"]) {
        self.currentItem.imageUrl = self.currentItemValue;
    } else if ([elementName isEqualToString:@"contentFull"]) {
        self.currentItem.content = self.currentItemValue;
    } else if ([elementName isEqualToString:@"PubDate"]) {
        self.currentItem.pubDate = self.currentItemValue;
    } else if ([elementName isEqualToString:@"item"]) {
        [selectedNewsArray addObject:self.currentItem];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if (nil != self.currentItemValue) {
        [self.currentItemValue appendString:string];
    }
}



thank you

Sykte
Aug 26, 2010, 08:55 AM
In your didStartElement you init a new currentItemValue. Once you are inside contentFull, you will need to ignore the <strong>, <b>, and <br> start elements and keep appending to the string until you hit didEndElement for </contentFull>.


edit:

I wanted to add that KVC is your friend when working with XML. The example you posted is eek.
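
A minimal sketch of the strategy described in this post, assuming ARC. In the code from the question, the final else branch of didStartElement sets currentItemValue to nil as soon as <strong> is reached, which is why the content goes missing; here a depth counter keeps the buffer alive while nested elements come and go. The class name ContentCollector and the helper TextOfContentFull are hypothetical, and the approach only helps when the embedded markup is itself well-formed XML.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: collects all text inside <contentFull>,
// ignoring any elements nested within it.
@interface ContentCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *currentItemValue;
@property (nonatomic, assign) NSInteger contentDepth; // > 0 while inside <contentFull>
@property (nonatomic, copy) NSString *content;
@end

@implementation ContentCollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if (self.contentDepth > 0) {
        // Nested element (strong, b, ...): keep the buffer, just go one level deeper.
        self.contentDepth += 1;
    } else if ([elementName isEqualToString:@"contentFull"]) {
        self.currentItemValue = [NSMutableString string];
        self.contentDepth = 1;
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if (self.contentDepth == 0) return;
    self.contentDepth -= 1;
    if (self.contentDepth == 0) {
        // Back at </contentFull>: the buffer is complete.
        self.content = self.currentItemValue;
        self.currentItemValue = nil;
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if (self.contentDepth > 0) [self.currentItemValue appendString:string];
}

@end

// Hypothetical helper: returns the text of <contentFull>, or nil if parsing fails.
static NSString *TextOfContentFull(NSString *xml) {
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    ContentCollector *collector = [[ContentCollector alloc] init];
    parser.delegate = collector;
    return [parser parse] ? collector.content : nil;
}
```

Note that this yields the text with the tags dropped, not the formatting itself. To keep the markup, the start and end callbacks would have to append the tag names back into the buffer, or the feed would have to deliver the HTML as CDATA.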

ranguvar
Aug 26, 2010, 10:10 AM
You might want to read this (http://cocoawithlove.com/2008/10/using-libxml2-for-parsing-and-xpath.html). That way, you can use XPath and extract all the data you want really quick.

ianray
Aug 26, 2010, 12:50 PM
Quote (Sykte):
In your didStartElement you init a new currentItemValue. Once you are inside contentFull, you will need to ignore the <strong>, <b>, and <br> start elements and keep appending to the string until you hit didEndElement for </contentFull>.

If I'm not mistaken, that strategy will not work. A lone "br" tag is not well-formed XML, and NSXMLParser will fail:

2010-08-26 20:49:30.634 PhonePlayground[13456:207] didStartElement contentFull
2010-08-26 20:49:30.634 PhonePlayground[13456:207] didStartElement b
2010-08-26 20:49:30.635 PhonePlayground[13456:207] didEndElement b
2010-08-26 20:49:30.635 PhonePlayground[13456:207] didStartElement br
2010-08-26 20:49:30.637 PhonePlayground[13456:207] parseErrorOccurred Error Domain=NSXMLParserErrorDomain Code=76 "The operation couldn't be completed. (NSXMLParserErrorDomain error 76.)"
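
As far as I can tell, code 76 is NSXMLParserTagNameMismatchError: the parser reaches </contentFull> while <br> is still open. A small sketch to reproduce this outside the app, assuming ARC; the helper names ParseErrorForXML and DemonstrateLoneBreakTag are hypothetical. No delegate is set, since the well-formedness check does not depend on one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns nil if the XML parses cleanly,
// otherwise the error reported by NSXMLParser.
static NSError *ParseErrorForXML(NSString *xml) {
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    return [parser parse] ? nil : parser.parserError;
}

// Compares a lone <br> with a self-closed <br/> in otherwise identical input.
static void DemonstrateLoneBreakTag(void) {
    NSError *lone = ParseErrorForXML(
        @"<contentFull><b>A source</b><br>time</contentFull>");
    NSError *selfClosed = ParseErrorForXML(
        @"<contentFull><b>A source</b><br/>time</contentFull>");
    NSLog(@"lone <br>: %@", lone);
    NSLog(@"self-closed <br/>: %@", selfClosed);
}
```

So the embedded HTML either has to be well-formed XML itself (<br/> instead of <br>) or has to be shielded from the parser with CDATA.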