macOS Objective C syntax question on pointers

neutrino23 · Oct 16, 2011

Hi,
I'm going through Learn Objective C on the Mac and came across this line of code:

const char *words[4] = { "aardvark", "abacus", "allude", "zygote"};

I understand pointers and indirection quite well, but the C syntax is still not clear to me.

I think that
const char *word
would create a pointer to a constant.

And I think that
const char *words[]
would tell the compiler to make a pointer to an array of constants.

So why put the index 4 in the brackets? If I run the test program without the number 4 it still runs, but that may just be luck.

Does this code

const char *words[4]

create an array of four addresses pointing to four items?

Or does it create one address with some information telling it that it points to four items?

Thanks.

lee1210 · Oct 16, 2011

So let's break this apart:
const is a qualifier. Here it applies to char, so the characters being pointed to can't be modified through these pointers.
char * is the type, a pointer to a character.
[4] indicates that this is an array of four pointers to a character. If you omit the 4 and use an initializer, the compiler can figure out the dimension for you.

So you end up with an array of 4 pointers to const char, initialized to point to the four string literals.

Because const applies to the char and not to the pointer, the stored address can still be changed, but you can't modify what it points to through that pointer. If you tried:
words[1] = "test";
it would compile fine, because you're only changing the address stored in words[1]. If you tried:
*words[1] = 't';
you'd get a compiler error, because the char that words[1] points to was declared const. To make the pointers themselves const, you'd write const char * const words[4].

-Lee

chown33 · Oct 16, 2011

neutrino23 said:
So why put the index 4 in the brackets? If I run the test program without the number 4 it still runs, but that may just be luck.

It works without the 4 because the number of initializers establishes the length of the array.

Add initializers and the array length is automatically extended by the compiler. Remove them, and the length is reduced.

Try mismatched initializers and length. E.g. 4 initializers with a declared length of 3, or a declared length of 7.

Also try no initializers, and omitting the declared length.
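
Two of the experiments suggested above can be sketched like this. It's only a minimal illustration: the names ARRAY_LEN, inferred_length and trailing_nulls are made up for this example and don't come from the book. It relies on two standard C rules: an omitted length is taken from the number of initializers, and elements without an initializer are zero-initialized (null pointers here).

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Length omitted: the compiler counts the initializers. */
size_t inferred_length(void)
{
    const char *words[] = { "aardvark", "abacus", "allude", "zygote" };
    return ARRAY_LEN(words);
}

/* Declared length 7 with only 4 initializers: the remaining
   elements are implicitly initialized to null pointers. */
size_t trailing_nulls(void)
{
    const char *words[7] = { "aardvark", "abacus", "allude", "zygote" };
    size_t count = 0;
    for (size_t i = 0; i < ARRAY_LEN(words); ++i) {
        if (words[i] == NULL)
            ++count;
    }
    return count;
}
```

Here inferred_length() returns 4 and trailing_nulls() returns 3. The other cases (more initializers than the declared length, or neither a length nor an initializer) are left out because the compiler diagnoses them, which is worth seeing for yourself.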

neutrino23 said:
Does this code

Code:

const char *words[4]

create an array of four addresses pointing to four items?

Your code is incomplete as given. It's missing a semicolon, which is important because it tells the compiler (and programmers) that you're finished with the statement. Without a semicolon, it's entirely possible, and syntactically legal, to have an initializer (or something else) on a following line, like so:

Code:

const char *words[4]
  = { "aardvark", "abacus", "allude", "zygote"};

So, given this code:

Code:

const char *words[4];

You defined an array that holds 4 elements.
Each element is a pointer to a const char.
No meaningful pointers are stored in the array yet.
The entire array is uninitialized.

The array's actual initial contents depend on the storage class. An auto array has indeterminate initial contents (i.e. the spec doesn't say what they are, and you must not rely on them). A static array has zeroed initial contents, so every element starts out as a null pointer.

C storage class:
http://en.wikipedia.org/wiki/C_syntax#Storage_duration_specifiers
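
A small sketch of the storage-class difference, assuming a file-scope static array and a local (auto) array. The names static_words, count_null_static and last_word_auto are invented for this example. Note that the auto array is fully assigned before anything is read from it, because reading an unassigned element would not be safe.

```c
#include <stddef.h>

/* Static storage duration: zero-initialized before the program runs,
   so every element starts out as a null pointer. */
static const char *static_words[4];

size_t count_null_static(void)
{
    size_t count = 0;
    for (size_t i = 0; i < 4; ++i) {
        if (static_words[i] == NULL)
            ++count;
    }
    return count;
}

/* Automatic storage duration: elements are indeterminate until assigned,
   so this function assigns every element before reading any of them. */
const char *last_word_auto(void)
{
    const char *auto_words[4];
    auto_words[0] = "bark";
    auto_words[1] = "like";
    auto_words[2] = "a";
    auto_words[3] = "tree";
    return auto_words[3];   /* the string literal itself outlives the array */
}
```

count_null_static() returns 4, since nothing was ever stored in the static array.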

Sydde · Oct 16, 2011

chown33 said:
given this code:

Code:

const char *words[4];

You defined an array that holds 4 elements.
Each element is a pointer to a const char.
No meaningful pointers are stored in the array yet.
The entire array is uninitialized.

And you cannot subsequently assign a value to any of the pointers (without a cast) because the const prevents them from being changed?

chown33 · Oct 16, 2011

Sydde said:
And you cannot subsequently assign a value to any of the pointers (without a cast) because the const prevents them from being changed?

Sounds like a question to ask a C compiler:

Code:

#include <stdio.h>

int 
main( int argc, char *argv[] )
{
	const char *words[4] = { "aardvark", "abacus", "allude", "zygote"};

	for ( int i = 0;  i < 4;  ++i )
	{  printf( "%d: %s\n", i, words[ i ] );  }


	const char *foo[4];

	foo[ 0 ] = "bark";
	foo[ 1 ] = "like";
	foo[ 2 ] = "a";
	foo[ 3 ] = "tree";

	for ( int i = 0;  i < 4;  ++i )
	{  printf( "%d: %s\n", i, foo[ i ] );  }


	const char bar[] = "inviolable";

	foo[ 0 ] = bar;
	*foo[ 0 ] = 'u';

	return 0;
}

Where does it say an error occurs?

If you comment out the error, what does the rest of the program do?
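
For comparison, here is a hedged sketch of the two places const can go in a pointer declaration. The function names are hypothetical, and the lines that a compiler would reject are left commented out so the example builds.

```c
#include <string.h>

/* Pointer to const char: the pointer may be re-aimed,
   but the chars cannot be changed through it. */
size_t retarget_pointer_to_const(void)
{
    const char *p = "aardvark";
    p = "test";            /* OK: changes the address stored in p */
    /* *p = 'T'; */        /* error if uncommented: *p is read-only */
    return strlen(p);
}

/* Const pointer to (non-const) char: the pointer is fixed,
   but the chars it points to may be changed. */
char modify_through_const_pointer(void)
{
    char buffer[] = "zygote";   /* writable array, not a string literal */
    char *const q = buffer;
    q[0] = 'Z';                 /* OK: the chars are not const */
    /* q = buffer + 1; */       /* error if uncommented: q itself is const */
    return buffer[0];
}
```

Reading the declaration right to left helps: const char *p is "p is a pointer to char that is const", while char *const q is "q is a const pointer to char".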

neutrino23 · Oct 17, 2011

Thanks, this helps a lot. I followed the links and did more reading elsewhere and I think I'm getting the hang of this. What was throwing me off was how things behave differently depending on context. For example

Code:

int *px[4];
int y;
int xx[100];

px[0] = &xx[0];

xx[5] = 1;

y = px[5];

This last line transfers the address in px[5] to y (with a warning from the compiler). Now if I do something similar:

Code:

int *px;
int y;
int xx[100];

px = &xx[0];

xx[5] = 1;

y = px[5];

In this case the last line transfers the value at the address so y now contains 1. So the exact same code behaves differently depending on context. Seems like this would be easy to mix up if there were a lot of lines of code in between the declaration and the use of px.

One tutorial I ran across showed this syntax for pointers to arrays.

int *px[4]; This would create an array of pointers.

int (*px)[4]; This would create a pointer to an array of 4 integers.

Since C doesn't bounds check arrays and since this doesn't allocate space for the array does the index do any good? Why not just declare it as

int (*px)[1]; then you can use it as a pointer to any integer array.

----
chown33
You are right. I missed the semicolon. I do that often and Xcode chastises me every time. Eventually it will become second nature to add the semicolon.

jiminaus · Oct 17, 2011

neutrino23 said:
Code:

int *px[4];
int y;
int xx[100];

px[0] = &xx[0];

xx[5] = 1;

y = px[5];

This last line transfers the address in px[5] to y (with a warning from the compiler).

No. The behaviour of this code is undefined. You've accessed past the end of an array: you've read element 5 of px, but px is only 4 elements long (valid indexes are 0 to 3).

BTW px[0] = &xx[0] can be simplified to px[0] = xx;

neutrino23 said:
Now if I do something similar:

Code:

int *px;
int y;
int xx[100];

px = &xx[0];

xx[5] = 1;

y = px[5];

In this case the last line transfers the value at the address so y now contains 1.

Again, px = &xx[0] can be simplified to px = xx. After this, px points to the first element of xx, so px[5] and xx[5] name the same int. In that sense px is an alias for xx.

neutrino23 said:
So the exact same code behaves differently depending on context.

Get very used to this. It will be even more so once you start dealing with objects and classes. What happens will be dependent solely on the type of the object being operated on.

neutrino23 said:
One tutorial I ran across showed this syntax for pointers to arrays.
int *px[4]; This would create an array of pointers.
int (*px)[4]; This would create a pointer to an array of 4 integers.
Since C doesn't bounds check arrays and since this doesn't allocate space for the array does the index do any good? Why not just declare it as
int (*px)[1]; then you can use it as a pointer to any integer array.

int *px can point into any int array. In everyday C there's no separate type needed for a pointer to a single thing versus a pointer to the first of several things. A pointer used for an array of things just points to the first thing in the array.

BTW Don't get hung up on the crazy stuff like int (*px)[4]. You won't use it in practical Objective-C, because you're much more likely to use higher-level constructs like NSArray than C arrays.

chown33 · Oct 17, 2011

neutrino23 said:
Code:

int *px[4];       /* highlighted in red */
int y;
int xx[100];

px[0] = &xx[0];   /* px[0] highlighted in blue */

xx[5] = 1;

y = px[5];

This last line transfers the address in px[5] to y (with a warning from the compiler). Now if I do something similar:

Code:

int *px;          /* highlighted in red */
int y;
int xx[100];

px = &xx[0];      /* px highlighted in blue */

xx[5] = 1;

y = px[5];

In this case the last line transfers the value at the address so y now contains 1. So the exact same code behaves differently depending on context.

The problem with the statement "the exact same code behaves differently depending on context" is that it's not the exact same code.

The type of px is different (the declarations highlighted in red). So is the actual code (the assignments highlighted in blue). There's no way you can say "the exact same code" when simply by looking at the code you can see differences.

You can say "similar code", which you did, but then why would you expect identical results? And that's not even taking into account the semantic problems (i.e. the compiler warned you about it), or the array-overflow problem jiminaus pointed out.

If you think the code does the same thing, you're wrong. A hint should be given by the compiler warning.

You could fix the array-overflow by declaring px to have 10 elements, and leaving the 5 as-is. That still wouldn't fix the semantic problems related to the compiler warning. So if you want to understand why the "similar code" doesn't produce the same result, you should probably look very carefully at the compiler warning. Perhaps even post the exact text of the warning, so we can all see it.

int *px[4]; This would create an array of pointers.

More specifically, an array of 4 pointers-to-int. There is always a type associated with a pointer, even if it's the void type. There's no such thing in C as a plain untyped pointer.

int (*px)[4]; This would create a pointer to an array of 4 integers.

In the real world, I've never seen code like that. Every real-world C programmer knows that any pointer can be subscripted, and there's no bounds-checking. No sensible person writing sane code intentionally complicates it for no good reason.

int (*px)[1]; then you can use it as a pointer to any integer array.

Nobody writes code like that, either.

A pointer to any type is implicitly a pointer to an array of that type. There is neither a need nor a reason to write complex type expressions in that fashion.
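
As a side note on what the index in int (*px)[4] actually changes: it is part of the pointer's type, so it sets the size of the thing pointed at and the stride of pointer arithmetic. The sketch below is only an illustration with made-up function names. It also suggests why int (*px)[1] is not a general-purpose substitute, since that pointer would step one int at a time, exactly like a plain int *.

```c
#include <stddef.h>

/* Size of the object each kind of pointer points at. */
size_t pointee_size_plain(void)
{
    int *p = NULL;
    return sizeof *p;          /* sizeof does not evaluate *p */
}

size_t pointee_size_array4(void)
{
    int (*pa)[4] = NULL;
    return sizeof *pa;         /* an entire int[4] */
}

/* Stepping a pointer-to-array moves by whole rows. */
int second_row_first_item(void)
{
    int grid[2][4] = { { 0, 1, 2, 3 }, { 10, 11, 12, 13 } };
    int (*row)[4] = grid;      /* grid decays to a pointer to its first row */
    ++row;                     /* now points at grid[1] */
    return (*row)[0];
}
```

second_row_first_item() returns 10. Two-dimensional arrays like grid are the main place this pointer type turns up, usually without anyone writing it out by hand.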

Code:

int * px;

px is a pointer to an int.
px can always be subscripted. The range of valid subscripts is a different question.
*px is identical to px[0]. (Identical storage location, identical type.)
*(px+1) is identical to px[1].
*(px-1) is identical to px[-1].
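
The identities listed above can be checked directly. This is a minimal sketch with an invented function name; px is aimed at the middle of an array so that the -1, 0 and +1 subscripts are all valid.

```c
#include <stdbool.h>

/* Returns true when subscripting and pointer arithmetic agree. */
bool subscripts_match(void)
{
    int xx[10] = { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 };
    int *px = &xx[5];          /* aim px at the middle of the array */

    return *px == px[0]
        && *(px + 1) == px[1]
        && *(px - 1) == px[-1]
        && &px[-1] == &xx[4]   /* same storage location, too */
        && px[-1] == 40;
}
```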

Whether it's valid to refer to elements at locations other than *px depends entirely on the context of px, and what the code is written to do. For example, look at the C library function strcpy(). The types are char*, but there is clearly an implication that the pointer is to the first char of several chars, the actual number of chars being determined by the nul (0x00) terminator. Compare and contrast strcpy() with the memcpy() library function.
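
To make the strcpy() versus memcpy() contrast concrete, here is a hedged sketch. The wrapper names copy_string and copy_ints are hypothetical; only strcpy(), strlen() and memcpy() are real library functions. In both cases the pointer type says nothing about how many elements are valid, so the caller has to guarantee the destination is big enough.

```c
#include <string.h>

/* strcpy: length is implied by the nul terminator in the source. */
size_t copy_string(char *dst, const char *src)
{
    strcpy(dst, src);          /* caller must guarantee dst is large enough */
    return strlen(dst);
}

/* memcpy: length is stated explicitly; no terminator is looked for. */
int copy_ints(int *dst, const int *src, size_t count)
{
    memcpy(dst, src, count * sizeof *src);
    return dst[count - 1];     /* last element copied; count must be > 0 */
}
```

For example, copying "zygote" into a 16-char buffer with copy_string() returns 6, and copying three ints { 1, 2, 3 } with copy_ints() returns 3.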

neutrino23 · Oct 17, 2011

Thank you. This is great.

What I meant is that the last line y = px[5]; behaves differently depending on the previous code. This is undoubtedly second nature to you. To me this is part of the journey of learning the language. Historically I've worked with languages that were much less context sensitive so I tend to stumble on these things until they are pointed out to me. These replies have been a great revelation for me. Thanks again.

I apologize for the use of 5 as an index. That was clearly wrong.

chown33 · Oct 17, 2011

neutrino23 said:
What I meant is that the last line y = px[5]; behaves differently depending on the previous code.

Because px has different types in your two examples.

I suggest that you write some comments for each line of the first example, which should be your explanation of what you understand that line of code to be doing.

This is undoubtedly second nature to you. To me this is part of the journey of learning the language. Historically I've worked with languages that were much less context sensitive so I tend to stumble on these things until they are pointed out to me.

It's not a question of it being second nature. That comes from experience.

It's not a question of being "context sensitive", either. There's only one context in each case, and it's determined entirely by how the variables are declared for each case. That is, the meaning of px[5] depends entirely on what the type of px is. Honestly, I don't know of any computer language where the meaning of an expression doesn't depend on what the type of the variable is. If that's what you mean by "context sensitive", then every language I can think of is context sensitive. They're just more or less tolerant of the programmer doing something stupid. C belongs to the set of languages that generally allow the programmer to do something stupid, because one of the language's fundamental design assumptions is that the programmer knows what they're doing.

It's really a question of types, and the types are different for your two cases. If you don't see why they're different, then we need to understand how you think they're the same. To do that, we need you to explain what you think each line of the first example does.

It's not uncommon for beginners to make mistakes in declaring C types, especially when it involves pointers. To an experienced C programmer, the intended meaning is clear from the declaration. All we have from you so far is the C declarations. We don't know what your intended meaning is. Hence the need for explaining your intent, so we can determine exactly where the mismatch between intended meaning and actual declaration lies.

As an example:

Code:

int *px[4];
  // declare px as an array of 4 pointers-to-int.
  // For example, px[1] is a pointer-to-int.

Notice that it's NOT "declare px as a pointer to an array of 4 ints".

If your intent wasn't to declare an array of 4 pointers-to-int, then that's the problem. If that was your intent, then the problem lies elsewhere, and we need your explanation of intent to figure out where.
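
One possible reading of the two intents, side by side. This is a sketch under assumptions, not a statement of what the original code was meant to do, and the function names are invented. With an array of pointers, one more subscript is needed to reach an int; with a single pointer, px[5] is already an int.

```c
/* px is an array of 4 pointers-to-int; only px[0] is used here. */
int read_through_array_of_pointers(void)
{
    int xx[100] = { 0 };
    int *px[4] = { 0 };        /* all four pointers start out null */

    px[0] = xx;                /* same as &xx[0] */
    xx[5] = 1;

    /* px[0] is an int*, so px[0][5] is an int: the same element as xx[5]. */
    return px[0][5];
}

/* px is a single pointer-to-int, so px[5] is already an int. */
int read_through_plain_pointer(void)
{
    int xx[100] = { 0 };
    int *px = xx;

    xx[5] = 1;
    return px[5];
}
```

Both functions return 1, and neither needs a cast or produces a pointer-to-integer conversion warning.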
