Encoding on NSURLConnection




joaogalli
May 25, 2011, 03:49 PM
Hi,

I am trying to load a file with NSURLConnection, but the text always comes out garbled. I have tried many NSStringEncoding values, but none has worked.
I uploaded two example files to my S3 bucket that contain an accented character in <a>joão</a>. If anyone manages to read them correctly in iOS, please help me.

The two files differ only in their declared charset:
https://s3.amazonaws.com/cardcell/web.xml
https://s3.amazonaws.com/cardcell/web2.xml

This is the code on the delegate method:

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
// receivedData is an NSMutableData that appends all received data.
NSString *dataString = [[NSString alloc] initWithBytes: [receivedData bytes] length: [receivedData length] encoding: NSUTF8StringEncoding];
NSLog(@"dataReceived String: \"%s\"\n", dataString);
}


This is the type of result I get:
Data received: `ˇ



chown33
May 25, 2011, 05:15 PM
The data you stored on S3 is simply wrong.

I used the 'curl' command in Terminal to get the data, piped to 'hexdump' to show me what the actual bytes are.


curl -s "https://s3.amazonaws.com/cardcell/web2.xml" | hexdump -C

00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version="1|
00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 49 53 |.0" encoding="IS|
00000020 4f 2d 38 38 35 39 2d 31 22 20 73 74 61 6e 64 61 |O-8859-1" standa|
00000030 6c 6f 6e 65 3d 22 6e 6f 22 3f 3e 0a 3c 77 65 62 |lone="no"?>.<web|
00000040 2d 61 70 70 20 78 6d 6c 6e 73 3d 22 68 74 74 70 |-app xmlns="http|
00000050 3a 2f 2f 6a 61 76 61 2e 73 75 6e 2e 63 6f 6d 2f |://java.sun.com/|
00000060 78 6d 6c 2f 6e 73 2f 6a 32 65 65 22 20 78 6d 6c |xml/ns/j2ee" xml|
00000070 6e 73 3a 78 73 69 3d 22 68 74 74 70 3a 2f 2f 77 |ns:xsi="http://w|
00000080 77 77 2e 77 33 2e 6f 72 67 2f 32 30 30 31 2f 58 |ww.w3.org/2001/X|
00000090 4d 4c 53 63 68 65 6d 61 2d 69 6e 73 74 61 6e 63 |MLSchema-instanc|
000000a0 65 22 20 76 65 72 73 69 6f 6e 3d 22 32 2e 34 22 |e" version="2.4"|
000000b0 20 78 73 69 3a 73 63 68 65 6d 61 4c 6f 63 61 74 | xsi:schemaLocat|
000000c0 69 6f 6e 3d 22 68 74 74 70 3a 2f 2f 6a 61 76 61 |ion="http://java|
000000d0 2e 73 75 6e 2e 63 6f 6d 2f 78 6d 6c 2f 6e 73 2f |.sun.com/xml/ns/|
000000e0 6a 32 65 65 20 68 74 74 70 3a 2f 2f 6a 61 76 61 |j2ee http://java|
000000f0 2e 73 75 6e 2e 63 6f 6d 2f 78 6d 6c 2f 6e 73 2f |.sun.com/xml/ns/|
00000100 6a 32 65 65 2f 77 65 62 2d 61 70 70 5f 32 5f 34 |j2ee/web-app_2_4|
00000110 2e 78 73 64 22 3e 0a 20 20 20 3c 61 3e 6a 6f c3 |.xsd">. <a>jo.|
00000120 a3 6f 3c 2f 61 3e 20 0a 3c 2f 77 65 62 2d 61 70 |.o</a> .</web-ap|
00000130 70 3e 0a |p>.|
00000133

curl -s "https://s3.amazonaws.com/cardcell/web.xml" | hexdump -C

00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version="1|
00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54 |.0" encoding="UT|
00000020 46 2d 38 22 20 73 74 61 6e 64 61 6c 6f 6e 65 3d |F-8" standalone=|
00000030 22 6e 6f 22 3f 3e 0a 3c 77 65 62 2d 61 70 70 20 |"no"?>.<web-app |
00000040 78 6d 6c 6e 73 3d 22 68 74 74 70 3a 2f 2f 6a 61 |xmlns="http://ja|
00000050 76 61 2e 73 75 6e 2e 63 6f 6d 2f 78 6d 6c 2f 6e |va.sun.com/xml/n|
00000060 73 2f 6a 32 65 65 22 20 78 6d 6c 6e 73 3a 78 73 |s/j2ee" xmlns:xs|
00000070 69 3d 22 68 74 74 70 3a 2f 2f 77 77 77 2e 77 33 |i="http://www.w3|
00000080 2e 6f 72 67 2f 32 30 30 31 2f 58 4d 4c 53 63 68 |.org/2001/XMLSch|
00000090 65 6d 61 2d 69 6e 73 74 61 6e 63 65 22 20 76 65 |ema-instance" ve|
000000a0 72 73 69 6f 6e 3d 22 32 2e 34 22 20 78 73 69 3a |rsion="2.4" xsi:|
000000b0 73 63 68 65 6d 61 4c 6f 63 61 74 69 6f 6e 3d 22 |schemaLocation="|
000000c0 68 74 74 70 3a 2f 2f 6a 61 76 61 2e 73 75 6e 2e |http://java.sun.|
000000d0 63 6f 6d 2f 78 6d 6c 2f 6e 73 2f 6a 32 65 65 20 |com/xml/ns/j2ee |
000000e0 68 74 74 70 3a 2f 2f 6a 61 76 61 2e 73 75 6e 2e |http://java.sun.|
000000f0 63 6f 6d 2f 78 6d 6c 2f 6e 73 2f 6a 32 65 65 2f |com/xml/ns/j2ee/|
00000100 77 65 62 2d 61 70 70 5f 32 5f 34 2e 78 73 64 22 |web-app_2_4.xsd"|
00000110 3e 0a 20 20 20 20 3c 61 3e 6a 6f c3 a3 6f 3c 2f |>. <a>jo..o</|
00000120 61 3e 0a 3c 2f 77 65 62 2d 61 70 70 3e 0a |a>.</web-app>.|
0000012e

The XML declaration at the top of web2.xml says its content is in 8859-1 encoding, but when you actually look at the data, the bytes are not 8859-1: the accented character is stored as c3 a3, identical to the UTF8-encoded bytes in web.xml. In 8859-1 it would be the single byte e3.

If you're going to declare the XML as 8859-1, you must actually encode the data as 8859-1.
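
To see the mismatch from the iOS side, here is a minimal sketch (not code from this thread) that decodes the five bytes between the <a> tags, 6a 6f c3 a3 6f, once as UTF-8 and once as ISO-8859-1. It assumes ARC and a plain Foundation command-line build; the variable names are made up for illustration.

```objc
#import <Foundation/Foundation.h>

int main(void)
{
    @autoreleasepool {
        // The bytes both S3 files contain between <a> and </a>.
        const unsigned char bytes[] = {0x6a, 0x6f, 0xc3, 0xa3, 0x6f};
        NSData *data = [NSData dataWithBytes:bytes length:sizeof bytes];

        // Decoded as UTF-8, c3 a3 is one character, so the string has 4 characters.
        NSString *asUTF8 = [[NSString alloc] initWithData:data
                                                 encoding:NSUTF8StringEncoding];
        // Decoded as ISO-8859-1, every byte is its own character, so it has 5.
        NSString *asLatin1 = [[NSString alloc] initWithData:data
                                                   encoding:NSISOLatin1StringEncoding];

        NSLog(@"UTF-8:      %@ (%lu characters)", asUTF8, (unsigned long)[asUTF8 length]);
        NSLog(@"ISO-8859-1: %@ (%lu characters)", asLatin1, (unsigned long)[asLatin1 length]);

        // A file that is genuinely 8859-1 would store the accented letter as
        // the single byte e3, so the re-encoded data is 4 bytes long, not 5.
        NSData *realLatin1 = [asUTF8 dataUsingEncoding:NSISOLatin1StringEncoding];
        NSLog(@"Re-encoded as ISO-8859-1: %lu bytes", (unsigned long)[realLatin1 length]);
    }
    return 0;
}
```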

joaogalli
May 25, 2011, 07:29 PM
Thanks for your time, but I'm sorry, I don't think that is the problem.

First, both files give me the same error in iOS (on Android and desktop Java they work fine).

Second, these files are only examples of my problem; the errors they produce in iOS are the same as those from the web services I actually have to read with NSURLConnection.

Using NSURL as in the following code gives me a correctly formed string, but I need to set HTTP headers on the connection, and I don't think NSURL supports them.

NSURL *nsurl = [[NSURL alloc] initWithString:@"http://www..."];

NSString *data = [[NSString alloc] initWithContentsOfURL:nsurl encoding: NSUTF8StringEncoding error: nil];

NSLog(@"Data received: %s", data);

chown33
May 25, 2011, 08:20 PM
NSURL *nsurl = [[NSURL alloc] initWithString:@"http://www..."];

NSString *data = [[NSString alloc] initWithContentsOfURL:nsurl encoding: NSUTF8StringEncoding error: nil];

NSLog(@"Data received: %s", data);


This code is wrong.

The NSLog uses a %s formatting code, but 'data' is not a C string, which is the required type for a %s formatting code. You need to use %@ as the formatting code, because 'data' is an NSString*.

You have the same bug in your originally posted code.
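
As a minimal sketch of the difference (the string literal and variable name here are placeholders, not your data), this is how the two format specifiers are meant to be used:

```objc
#import <Foundation/Foundation.h>

int main(void)
{
    @autoreleasepool {
        NSString *text = @"jo\u00e3o";

        // %@ takes an Objective-C object and prints its description.
        NSLog(@"As object:   %@", text);

        // %s takes a NUL-terminated C string (char *), never an NSString *.
        // NSLog interprets %s bytes in the system default encoding, so
        // non-ASCII text may still print incorrectly even with a real C string.
        NSLog(@"As C string: %s", [text UTF8String]);
    }
    return 0;
}
```

Passing an NSString* where %s expects a char* makes NSLog read the object's memory as if it were text, which is where garbage output comes from.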

joaogalli
May 25, 2011, 08:56 PM
This code is wrong.

The NSLog uses a %s formatting code, but 'data' is not a C string, which is the required type for a %s formatting code. You need to use %@ as the formatting code, because 'data' is an NSString*.

You have the same bug in your originally posted code.

Thanks buddy, but this is not the error. Please read what I have written: My problem is with NSURLConnection, not with the code I posted. This posted code works fine but I can't use headers with it.

chown33
May 26, 2011, 12:21 AM
From your original post:


- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
// receivedData is an NSMutableData that appends all received data.
NSString *dataString = [[NSString alloc] initWithBytes: [receivedData bytes] length: [receivedData length] encoding: NSUTF8StringEncoding];
NSLog(@"dataReceived String: \"%s\"\n", dataString);
}


This is the type of result I get:
Data received: `ˇ†

The reason you get Data received: `ˇ† printed by NSLog is because you have a %s in the NSLog formatting code, but the dataString argument is not a C string. That is, you have the wrong type for %s. If you change the %s to %@, you should no longer get garbage printed by NSLog.

If that is the error you are referring to (the garbage-like characters in NSLog), then there is no other reason I can see. The %s is simply wrong when dataString is an NSString* type.

If there is some other error, you have not yet identified what it is.


You say you need to "use HTTP headers", but you haven't said which headers you think you need to use, nor why you need them. Exactly what are the headers you think you need, and exactly what difference do you think they will make?

The S3 reply headers, as returned by the 'curl' command, are as follows for the web2.xml file:
HTTP/1.1 200 OK
x-amz-id-2: 8DxcirzjbOuEfSceEYFzivVSAWA9wukZA69ynxFbGQgNGJkk/AvvOmjxqv+D5LbD
x-amz-request-id: D3E8B79DA72E2479
Date: Thu, 26 May 2011 05:07:46 GMT
Last-Modified: Tue, 24 May 2011 19:46:45 GMT
ETag: "8afb04f3c7b88b358e87c9d0dbd055e0"
Accept-Ranges: bytes
Content-Type: text/xml
Content-Length: 307
Server: AmazonS3

As you can see, the Content-Type header says the data is XML, but it doesn't identify a charset. This is exactly what one would expect if the data is uploaded to S3 without a Content-Type header that specifies a charset.

The command that produced the above is:
curl -i "https://s3.amazonaws.com/cardcell/web2.xml"
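
On the iOS side, a response without a charset in its Content-Type gives you nothing to go on, so your code has to pick an encoding itself. A rough sketch, assuming ARC; the function name EncodingForResponse and the UTF-8 fallback are my own choices for illustration, not anything S3 or Foundation mandates:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: map the charset named in the response's Content-Type
// header to an NSStringEncoding, falling back to UTF-8 when none is usable.
static NSStringEncoding EncodingForResponse(NSURLResponse *response)
{
    // textEncodingName is nil when the server sent no charset,
    // as with "Content-Type: text/xml" above.
    NSString *name = [response textEncodingName];
    if (name == nil) {
        return NSUTF8StringEncoding; // assumed default
    }

    CFStringEncoding cfEncoding =
        CFStringConvertIANACharSetNameToEncoding((__bridge CFStringRef)name);
    if (cfEncoding == kCFStringEncodingInvalidId) {
        return NSUTF8StringEncoding; // unrecognized charset name
    }
    return CFStringConvertEncodingToNSStringEncoding(cfEncoding);
}
```

You would call something like this from connection:didReceiveResponse: and keep the result for when you build the string in connectionDidFinishLoading:.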


I am quite familiar with S3, and I can't tell from what you've posted whether you think you need request headers or reply headers.

I don't know which request headers could possibly make a difference here, since S3 is primarily a storage service. It only returns what you store (except for values it calculates by itself). Since request headers don't make any sense here, I can only assume you mean reply headers.

S3 will store some headers that are uploaded with the data, but not all of them. It will also calculate a small number of headers, such as the ETag header. But if you don't upload other headers with the data, then you won't get them back when you request the data later.

So again, what are the headers you think you need, and why?


If you need to specify request headers, then initWithContentsOfURL: is the wrong tool. The solution is simple: don't use it. That method gives no control over the headers at all.

You need to use the NSMutableURLRequest class, which has a method for setting headers. Refer to the "URL Loading System Programming Guide":
http://developer.apple.com/library/ios/#documentation/Cocoa/Conceptual/URLLoadingSystem/Concepts/URLOverview.html%23//apple_ref/doc/uid/20001834-BAJEAIEE
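
A minimal sketch of how that could fit together with your delegate code, assuming ARC. The class name XMLFetcher and the Accept header are placeholders I made up; substitute whatever headers your web service actually requires.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for illustration only.
@interface XMLFetcher : NSObject <NSURLConnectionDataDelegate>
@property (nonatomic, strong) NSMutableData *receivedData;
- (void)fetch:(NSURL *)url;
@end

@implementation XMLFetcher

- (void)fetch:(NSURL *)url
{
    self.receivedData = [NSMutableData data];

    NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:url];
    // Example request header; the name and value are assumptions.
    [request setValue:@"text/xml" forHTTPHeaderField:@"Accept"];

    // Must be started on a thread with a running run loop (e.g. the main thread).
    [NSURLConnection connectionWithRequest:request delegate:self];
}

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
{
    [self.receivedData appendData:data];
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    NSString *dataString = [[NSString alloc] initWithData:self.receivedData
                                                 encoding:NSUTF8StringEncoding];
    // %@, not %s, because dataString is an NSString*.
    NSLog(@"dataReceived String: \"%@\"", dataString);
}

@end
```

The only parts that matter here are the setValue:forHTTPHeaderField: call on the mutable request and the %@ in the NSLog; the rest is the same accumulate-then-decode pattern you already have.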