NSXMLParser and Windows encoding.. How?

Discussion in 'iOS Programming' started by Soulstorm, Oct 16, 2009.

  1. macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #1
    I have an application that needs to read Greek characters encoded as NSWindowsCP1253StringEncoding. NSXMLParser refuses to parse a string created with this encoding, but I need to support it, since the pages my application loads its data from use it.

    Creating an NSString with dataUsingEncoding didn't work...

    Is there any way to convert NSWindowsCP1253StringEncoding to NSUTF8StringEncoding?
     
  2. macrumors 68030

    PhoneyDeveloper

    Joined:
    Sep 2, 2008
    #2
    Creating an NSString with dataUsingEncoding didn't work...

    Why not? How not?

    If that doesn't work then your text isn't in the specified encoding, or you did something wrong.
     
  3. thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #3
    [rant]You are right, I should be more specific. I was angry at myself because Apple rejected my app update for the second time, and I wasn't thinking at the moment :)[/rant]

    Suppose the data has already been downloaded into an NSData object. Here is my code:

    Code:
    NSString *str = [[[NSString alloc]initWithData:self.xmlData encoding:NSWindowsCP1253StringEncoding]autorelease];
    rssParser = [[NSXMLParser alloc]initWithData:[str dataUsingEncoding:NSUTF8StringEncoding]];
    And here is how I handle the errors:

    Code:
    - (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
    	NSString * errorString = [NSString stringWithFormat:@"Unable to download story feed from web site (Error code %ld )", (long)[parseError code]];
    	NSLog(@"error parsing XML: %@", errorString);
    	
    	UIAlertView * errorAlert = [[UIAlertView alloc] initWithTitle:@"Error loading content" message:errorString delegate:self cancelButtonTitle:@"OK" otherButtonTitles:nil];
    	[errorAlert show];
    	[errorAlert release];
    }
    The parser reports error code 31 (NSXMLParserUnknownEncodingError). I am positive that the content really is in this encoding, because the feed I pull from the site starts with the following header:

    Code:
    <?xml version="1.0" encoding="windows-1253" ?>
    <rss version="2.0">
    ..............................
    <channel>
    
     
  4. Moderator

    dejo

    Staff Member

    Joined:
    Sep 2, 2004
    Location:
    The Centennial State
    #4
    Have you tried using NSString's canBeConvertedToEncoding: to ensure it can be converted without any loss of information?
     
  5. macrumors 68030

    PhoneyDeveloper

    Joined:
    Sep 2, 2008
    #5
    Why are you converting the string to utf-8? If you convert it to utf-8 but the data says it's in windows-1253 the data will almost certainly be wrong.

    I haven't used the xml parser but if it accepts an NSData* then just give it the NSData* that you get from the network.

    Alternatively, if the XML parser doesn't accept windows-1253 and you must convert the data to UTF-8, then I think you also need to modify the XML declaration so it says the encoding is UTF-8.

    I assume that your str string isn't nil.
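
    A minimal sketch of the second option (convert to UTF-8 and make the declaration agree), assuming the declaration spells the encoding exactly as encoding="windows-1253". The helper name UTF8DataFromWindows1253XML is hypothetical, and the code uses the thread's manual retain/release style (drop the autorelease call under ARC):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): decode Windows-1253 bytes,
// rewrite the declared encoding to match, and return UTF-8 bytes.
static NSData *UTF8DataFromWindows1253XML(NSData *rawData) {
    NSString *xml = [[[NSString alloc] initWithData:rawData
                                           encoding:NSWindowsCP1253StringEncoding] autorelease];
    if (xml == nil) {
        // initWithData:encoding: returns nil when the bytes cannot be decoded.
        return nil;
    }
    // Assumption: the declaration contains exactly this attribute text.
    NSString *fixed = [xml stringByReplacingOccurrencesOfString:@"encoding=\"windows-1253\""
                                                     withString:@"encoding=\"UTF-8\""];
    return [fixed dataUsingEncoding:NSUTF8StringEncoding];
}
```

    The result could then be handed to the parser with something like [[NSXMLParser alloc] initWithData:UTF8DataFromWindows1253XML(self.xmlData)], after checking that it isn't nil.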
     
  6. Moderator

    dejo

    Staff Member

    Joined:
    Sep 2, 2004
    Location:
    The Centennial State
    #6
    What happens if you just do this?:

    Code:
    rssParser = [[NSXMLParser alloc] initWithData:self.xmlData];
     
  7. thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #7
    The parser tells me that there is an error of type 31, which means an unknown encoding. That's why I am trying to convert it to UTF-8.

    As for NSString's canBeConvertedToEncoding:, how can I use that? I am receiving NSData from the site, and converting that data to an NSString requires me to specify the encoding, which defeats the purpose, right?

    This is my code:
    Code:
    //
    //  FeedURLConnection.m
    //  RSSTest2
    //
    //  Created by Christos Sotiriou on 10/10/09.
    //  Copyright 2009 Tei of Pireus. All rights reserved.
    //
    
    #import "FeedURLConnection.h"
    #import "NewsPapersSingleton.h"
    #import <CFNetwork/CFNetwork.h>
    
    @implementation FeedURLConnection
    @synthesize stories, xmlFeedConnection, xmlData, url;
    
    - (id) init
    {
    	self = [super init];
    	if (self != nil) {
    		
    	}
    	return self;
    }
    
    - (id) initWithURL:(NSString *)urlString
    {
    	self = [super init];
    	if (self != nil) {
    		self.url = urlString;
    	}
    	return self;
    }
    
    - (void)connectAndParse
    {
    	NSURLRequest *feedURLRequest = [NSURLRequest requestWithURL:[NSURL URLWithString:self.url]];
    	//NSURLResponse *response;
    	//NSData *data = [NSURLConnection sendSynchronousRequest:feedURLRequest returningResponse:&response error:NULL];
    	//[self.xmlData setData:data];
    	self.xmlFeedConnection = [[[NSURLConnection alloc]initWithRequest:feedURLRequest delegate:self]autorelease];
    }
    #pragma mark -
    #pragma mark NSURLConnection delegate methods
    - (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response {
    	//NSLog(@"Did Receive Response with name: %@", [response textEncodingName]);
        self.xmlData = [NSMutableData data];
    	myResponce = [response retain];
    }
    
    - (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {
    	//NSLog(@"did receive data! %@ with length: %i", [[[NSString alloc]initWithData:data encoding:NSASCIIStringEncoding]autorelease], [data length]);
    	
        //[xmlData appendData:[self dataFromData:data withEncoding:[myResponce textEncodingName]]];
    	[xmlData appendData:data];
    	//if ([[[NSString alloc]initWithData:data encoding:NSUTF8StringEncoding]canBeConvertedToEncoding:NSUTF8StringEncoding]) {
    	//	NSLog(@"yes, it can!");
    	//}
    	
    }
    
    //I wonder what this does... I found it on Apple
    - (NSData *)dataFromData:(NSData *)data withEncoding:(NSString *)encoding
    {
    	NSStringEncoding nsEncoding = NSUTF8StringEncoding;
    	if (encoding) {
    		CFStringEncoding cfEncoding = CFStringConvertIANACharSetNameToEncoding((CFStringRef)encoding);
    		if (cfEncoding != kCFStringEncodingInvalidId) {
    			nsEncoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    		}
    	}
    	NSString *formattedString = [[[NSString alloc]initWithData:data encoding:nsEncoding]autorelease];
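    	// Log through a format string so '%' characters in the feed are not interpreted as format specifiers.
    	NSLog(@"%@", formattedString);
    	// Return an autoreleased object, as Cocoa naming conventions expect from this method name.
    	return [formattedString dataUsingEncoding:nsEncoding];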
    }
    
    - (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error {
        [UIApplication sharedApplication].networkActivityIndicatorVisible = NO;   
        if ([error code] == kCFURLErrorNotConnectedToInternet) {
            // if we can identify the error, we can present a more precise message to the user.
            NSDictionary *userInfo = [NSDictionary dictionaryWithObject:NSLocalizedString(@"No Connection Error",                             @"Error message displayed when not connected to the Internet.") forKey:NSLocalizedDescriptionKey];
            NSError *noConnectionError = [NSError errorWithDomain:NSCocoaErrorDomain code:kCFURLErrorNotConnectedToInternet userInfo:userInfo];
            [self handleError:noConnectionError];
        } else {
            // otherwise handle the error generically
            [self handleError:error];
        }
        self.xmlFeedConnection = nil;
    	[[NSNotificationCenter defaultCenter]postNotificationName:FEED_TABLEVIEW_NEEDS_REFRESH_NOTIFICATION object:self];
    }
    
    - (void)handleError:(NSError *)error {
        NSString *errorMessage = [error localizedDescription];
        UIAlertView *alertView = [[UIAlertView alloc] initWithTitle:NSLocalizedString(@"Error Title", @"Title for alert displayed when download or parse error occurs.") message:errorMessage delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil];
        [alertView show];
        [alertView release];
    }
    
    - (void)connectionDidFinishLoading:(NSURLConnection *)connection {
        self.xmlFeedConnection = nil;
        //[UIApplication sharedApplication].networkActivityIndicatorVisible = NO;   
        // Spawn a thread to fetch the earthquake data so that the UI is not blocked while the application parses the XML data.
        //
        // IMPORTANT! - Don't access UIKit objects on secondary threads.
        //
        //[NSThread detachNewThreadSelector:@selector(parseXMLFileAtURL:) toTarget:self withObject:sel];
        // earthquakeData will be retained by the thread until parseEarthquakeData: has finished executing, so we no longer need
        // a reference to it in the main thread.
        //self.xmlData = nil;
    	//NSLog(@"content: %@", [[[NSString alloc]initWithData:self.xmlData encoding:NSUTF8StringEncoding]autorelease]);
       [NSThread detachNewThreadSelector:@selector(parseXMLFileAtURL:) toTarget:self withObject:self.url];
    }
    
    #pragma mark -
    #pragma mark NSXMLParser Delegations
    
    - (void)parserDidStartDocument:(NSXMLParser *)parser{	
    	NSLog(@"found file and started parsing");
    	
    }
    
    - (void)parseXMLFileAtURL:(NSString *)URL
    {	
    	NSAutoreleasePool *pool = [[NSAutoreleasePool alloc]init];
    	
    	stories = [[NSMutableArray alloc] init];
    	
        //you must then convert the path to a proper NSURL or it won't work
        //NSURL *xmlURL = [NSURL URLWithString:URL];
    	
        // here, for some reason you have to use NSClassFromString when trying to alloc NSXMLParser, otherwise you will get an object not found error
        // this may be necessary only for the toolchain
    	//NSString *xmlString = @"SADfsd";
    	//[xmlString dataUsingEncoding:NSWindowsCP1250StringEncoding];
        //rssParser = [[NSXMLParser alloc] initWithContentsOfURL:xmlURL];
    	//NSString *str = [[[NSString alloc]initWithData:self.xmlData encoding:NSWindowsCP1253StringEncoding]autorelease];
    	//NSLog(str);
    	
    	//rssParser = [[NSXMLParser alloc]initWithData:self.xmlData];
    	//NSData *data = [NSData dataWithContentsOfURL:[NSURL URLWithString:self.url]];
    	//NSString *result = [NSString stringWithContentsOfURL:[NSURL URLWithString:self.url] encoding:NSUTF8StringEncoding error:NULL];
    	
    	rssParser  = [[NSXMLParser alloc]initWithData:self.xmlData];
    	//rssParser = [[NSXMLParser alloc]initWithData:[result dataUsingEncoding:NSUTF8StringEncoding]];
        // Set self as the delegate of the parser so that it will receive the parser delegate methods callbacks.
        [rssParser setDelegate:self];
    	
        // Depending on the XML document you're parsing, you may want to enable these features of NSXMLParser.
        [rssParser setShouldProcessNamespaces:YES];
        [rssParser setShouldReportNamespacePrefixes:YES];
        [rssParser setShouldResolveExternalEntities:YES];
    	
        [rssParser parse];
    	
    	[pool release];
    }
    
    - (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
    	NSString * errorString = [NSString stringWithFormat:@"Unable to download story feed from web site (Error code %ld )", (long)[parseError code]];
    	NSLog(@"error parsing XML: %@", errorString);
    	
    	UIAlertView * errorAlert = [[UIAlertView alloc] initWithTitle:@"Error loading content" message:errorString delegate:self cancelButtonTitle:@"OK" otherButtonTitles:nil];
    	[errorAlert show];
    	[errorAlert release];
    }
    
    - (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict{			
        //NSLog(@"found this element: %@", elementName);
    	currentElement = [elementName copy];
    	if ([elementName isEqualToString:@"item"]) {
    		// clear out our story item caches...
    		item = [[NSMutableDictionary alloc] init];
    		currentTitle = [[NSMutableString alloc] init];
    		currentDate = [[NSMutableString alloc] init];
    		currentSummary = [[NSMutableString alloc] init];
    		currentLink = [[NSMutableString alloc] init];
    	}
    	
    }
    
    - (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName{     
    	//NSLog(@"ended element: %@", elementName);
    	if ([elementName isEqualToString:@"item"]) {
    		// save values to an item, then store that item into the array...
    		[item setObject:currentTitle forKey:@"title"];
    		[item setObject:currentLink forKey:@"link"];
    		[item setObject:currentSummary forKey:@"summary"];
    		[item setObject:currentDate forKey:@"date"];
    		
    		[stories addObject:[item copy]];
    		[item release];
    		//NSLog(@"adding story: %@", currentTitle);
    	}
    	
    }
    
    - (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string{
    	//NSLog(@"found characters: %@", string);
    	// save the characters for the current item...
    	if ([currentElement isEqualToString:@"title"]) {
    		[currentTitle appendString:string];
    	} else if ([currentElement isEqualToString:@"link"]) {
    		[currentLink appendString:string];
    	} else if ([currentElement isEqualToString:@"description"]) {
    		[currentSummary appendString:string];
    		//NSLog(string);
    	} else if ([currentElement isEqualToString:@"pubDate"]) {
    		[currentDate appendString:string];
    	}
    	
    }
    
    - (void)parserDidEndDocument:(NSXMLParser *)parser {
    	
    	//[activityIndicator stopAnimating];
    	//[activityIndicator removeFromSuperview];
    	
    	//NSLog(@"all done!");
    	//NSLog(@"stories array has %d items", [stories count]);
    	//[[NSNotificationCenter defaultCenter]postNotificationName:FEED_TABLEVIEW_NEEDS_REFRESH_NOTIFICATION object:self];
    	//[newsTable reloadData];
    	[self performSelectorOnMainThread:@selector(returnToMainThreadWithNotificationPosted) withObject:nil waitUntilDone:NO];
    }
    
    - (void)returnToMainThreadWithNotificationPosted
    {
    	NSLog(@"all done!");
    	NSLog(@"stories array has %lu items", (unsigned long)[stories count]);
    	[[NSNotificationCenter defaultCenter]postNotificationName:FEED_TABLEVIEW_NEEDS_REFRESH_NOTIFICATION object:self];
    }
    
    #pragma mark -
    - (void) dealloc
    {
    	[url release];
    	[stories release];
    	[xmlFeedConnection release];
    	[xmlData release];
    	[super dealloc];
    }
    
    @end
    
     
  8. thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #8
    Seems every developer has problems with Windows encodings, but I think I have found a workaround. I will post back with results.
     
  9. macrumors 68030

    PhoneyDeveloper

    Joined:
    Sep 2, 2008
    #9
    One suggestion: download a response from your server and write it out to a file. Then make that file work with the XML parser. You can open the file with a text editor like TextWrangler and it will tell you the encoding of the file. You can inspect the file to see if it looks OK or if there are obvious problems with characters that aren't of the specified encoding.
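
    A rough sketch of the dump step, assuming the response bytes are already in an NSData. The function name and the file name feed-dump.xml are made up for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: write the raw, untouched response bytes to a file
// in the temporary directory so they can be opened in a text editor.
// Returns the path on success, or nil if the write failed.
static NSString *DumpFeedDataForInspection(NSData *data) {
    NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"feed-dump.xml"];
    BOOL written = [data writeToFile:path atomically:YES];
    return written ? path : nil;
}
```

    Logging the returned path, for example from connectionDidFinishLoading:, shows where to find the file. Running in the simulator should make it easy to reach from the Mac.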
     
  10. thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #10
    Thanks. Apple has an example for OS X called "XMLBrowser". It requests an XML document from a server and then explores its properties. I tried it with the page that I'm having problems with, and it shows no problem whatsoever.

    However, Apple uses NSXMLDocument in this example, which isn't available on the iPhone. Nevertheless, NSXMLParser should implement the same encoding mechanics.

    In the data I accept from the server, I will try to erase the header that specifies the encoding of the file, and I will check if that makes a difference.
     
  11. macrumors 68030

    PhoneyDeveloper

    Joined:
    Sep 2, 2008
    #11
    Does the parser throw the error before parsing anything or does it throw the error somewhere in the middle?
     
  12. Moderator

    dejo

    Staff Member

    Joined:
    Sep 2, 2004
    Location:
    The Centennial State
    #12
    Well, if the data from the site is using NSWindowsCP1253StringEncoding, you shouldn't need to check if you can convert it to an NSString with the same encoding. There should be no problem there. If there is, then I would suspect the remote file is not in that encoding, although there may be other reasons for it not converting. You should check your string to make sure it's not nil before proceeding because, as the doc says, it "returns nil if the initialization fails for some reason (for example if data does not represent valid data for encoding)."

    Then, if you still need to convert to NSUTF8StringEncoding, you should be able to ensure that's possible without loss of information by checking canBeConvertedToEncoding:, like so:
    Code:
    if ([str canBeConvertedToEncoding:NSUTF8StringEncoding]) ...
    Most of this is just speculation since, without access to the remote file itself, there's not much I can do to actually test out what I'm suggesting.
     
  13. thread starter macrumors 68000

    Soulstorm

    Joined:
    Feb 1, 2005
    #13
    It seems that the fault was in the header. No matter what encoding the data was actually in, the parser always went by the line "<?xml version="1.0" encoding="windows-1253" ?>". Removing that line from the string before the parser touched it solved the problem.
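
    The final code was not posted, so the following is only a guess at what such a fix could look like. It assumes the declaration, if present, sits at the very start of the document; the function name is hypothetical, and it uses manual retain/release (drop the autorelease call under ARC). With no declaration and no byte-order mark, an XML parser is expected to treat the bytes as UTF-8, which is why the string is re-encoded that way:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decode the downloaded bytes with the encoding the
// server really uses, drop the leading <?xml ... ?> declaration, and
// return the remaining document as UTF-8 bytes.
static NSData *UTF8DataByStrippingXMLDeclaration(NSData *rawData, NSStringEncoding sourceEncoding) {
    NSString *xml = [[[NSString alloc] initWithData:rawData
                                           encoding:sourceEncoding] autorelease];
    if (xml == nil) {
        return nil; // bytes were not valid for sourceEncoding
    }
    if ([xml hasPrefix:@"<?xml"]) {
        // Find the end of the declaration and keep everything after it.
        NSRange declarationEnd = [xml rangeOfString:@"?>"];
        if (declarationEnd.location != NSNotFound) {
            xml = [xml substringFromIndex:NSMaxRange(declarationEnd)];
        }
    }
    return [xml dataUsingEncoding:NSUTF8StringEncoding];
}
```

    In parseXMLFileAtURL: this would replace the direct use of self.xmlData, along the lines of UTF8DataByStrippingXMLDeclaration(self.xmlData, NSWindowsCP1253StringEncoding).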
     
