
Why *pointer = &foo and not pointer=&foo?




zippyfly
Aug 13, 2009, 02:31 AM
int foo = 7;
int *pointer = &foo;

Why not

int pointer = &foo;

since the address of foo is an int, so you should assign it to an int variable. Then to dereference it, you apply *pointer to get the final value of foo, which is 7.

This code

int *pointer = &foo;


seems to me to be putting the address of foo into the final dereferenced memory location of *pointer, which doesn't make sense since you assign it using

*pointer = 7;

right?

This is one of the most unintuitive aspects of pointers that keeps annoying me, so if anyone can help give me any insight into something I've overlooked, to make it make sense, it would be much appreciated!

:-)



admanimal
Aug 13, 2009, 02:47 AM
You are on the right track, but you have to remember that


int *pointer = &foo;


is not just doing an assignment; it's also declaring pointer as a pointer to an int so that it can be dereferenced later. So the * in that line is not dereferencing pointer, it's declaring it as a pointer to an int. It is equivalent to doing this:


int *pointer;
pointer = &foo;


So if you do this:

int pointer = &foo;


it will usually only generate a warning, since you are right that the address of foo is often (but not always) just an integer anyway. However, then trying to do this:


*pointer = 5;


is an error because pointer was not declared as a pointer and therefore cannot be an argument of the unary * operator.

So whether you do this


int pointer = &foo;


or


int *pointer = &foo;


the same value (the address of foo) gets stored in pointer, but the compiler will limit what you can do with pointer in the first case, just to keep you honest with your variables, since there are times when a pointer is not just an integer.

zippyfly
Aug 13, 2009, 03:00 AM
Excellent. Thanks a ton. Makes more sense to me now because it is an int, just that it needs to be declared to be used as a pointer first.

Now why can't most programming books (authors) explain it as clearly as you did?!

:)

Bakerman
Aug 13, 2009, 03:45 AM
You shouldn't treat
int *pointer;
as an int. In your example, it is pure coincidence that the address of foo happens to be an int; this is platform dependent. It is easiest to consider int* as a completely separate type (which the designers of C# made a point of; there int and int* cannot be declared in the same statement).

Cromulent
Aug 13, 2009, 03:48 AM
since the address of foo is an int

Bad assumption.

The address could be absolutely anything and is defined by the architecture and implementation that the C compiler is running on.

zippyfly
Aug 13, 2009, 04:13 AM
OK, thanks.

It's still not intuitive then.

So I guess if I wanted to hard code an address, I'd then need to know the exact representation of an address that a specific compiler/platform uses, and assign that to the pointer variable, and then dereferencing it, whether such a representation is integer, hex, or whatever (?)

I'm just bothered that, logically, the * operator acts on a variable, and I should be able to directly adjust that variable, instead of blindly passing an ethereal item that is acquired only through the & operator. I want to know what exactly is that "address thing" that is obtained via &.

It just doesn't make sense that I can't just increment the address to move up the memory (assuming I want to edit something contiguous) if this "address thing" is some mysterious dark matter.

I hope you see where I am finding it hard to get my head around the concept....

ChOas
Aug 13, 2009, 04:21 AM
Does this article bring enlightenment?

http://boredzo.org/pointers/

Cromulent
Aug 13, 2009, 05:11 AM
I'm just bothered that, logically, the * operator acts on a variable, and I should be able to directly adjust that variable, instead of blindly passing an ethereal item that is acquired only through the & operator. I want to know what exactly is that "address thing" that is obtained via &.

This is where learning the basics of assembler comes in handy. You don't actually need to be able to do much with it, but it will certainly expand your knowledge of things such as this.

I suggest you look at the Intel CPU developer manuals. They are available as a free PDF download from Intel's website.

gnasher729
Aug 13, 2009, 06:23 AM
int foo = 7;
int *pointer = &foo;

Why not

int pointer = &foo;

since the address of foo is an int

What are you on about? The address of an int is not an int. Just like you are not your name. Look, I type your name, eight characters, then I hit backspace eight times, and your name is gone. You are still there. Lucky you. That's because your name and you are not the same thing. Just like &foo is not an int, it is a pointer to an int.

hope you see where I am finding it hard to get my head around the concept....

What is the difference between your house and the address of your house? If I want to buy a house, I likely have to pay a few hundred thousand. If I want to buy the address of a house, there are companies that will sell me a million addresses for a few dollars. When you write a letter, you don't put the house on it, you put the address of the house on the letter.

The pointer is like a signpost pointing to the int. If you have a pointer that points to a single element of an array of ints, then you can "add" to the pointer, which makes it point to the next int in that array. Or you can assign to it the address of a different int, so the signpost now points in a different direction. The "&" address operator takes an object and creates a signpost pointing to that object; that is why you can't write &(x+1), because x+1 is not an object, it is just a value. The "*" dereference operator takes a pointer and goes to the place the signpost points to.

imaxx
Aug 13, 2009, 09:01 AM
the size of address (to any kind of data, be an integer, a char, a double etc.) is always of the size of an int. That's why in many bad code examples you see a pointer casted to an int.
A pointer (in a 32 bit world) is a number ranging from 0 to 4Gb that indicates the VIRTUAL address where your variable is stored in memory. However, despite the same size, their value has a different MEANING. One is a number, the other is a reference to a memory area.

lee1210
Aug 13, 2009, 09:04 AM
OK, thanks.

It's still not intuitive then.


It rarely is, but once it clicks you'll "get it" for good.


So I guess if I wanted to hard code an address, I'd then need to know the exact representation of an address that a specific compiler/platform uses, and assign that to the pointer variable, and then dereferencing it, whether such a representation is integer, hex, or whatever (?)


You will never hard-code an address while programming C. If you are writing assembly and you have your data laid out at "known" locations, you might, but in C the only way you should get the address to things is via the & operator, or the return from the malloc family of functions (or other functions that return pointers). Think about this: where would you get this address from to hard-code? How do you know the next time you compile things won't have moved about on the stack/heap and that address points somewhere else entirely? What if you compile your code on a platform where sizeof(void *) is different from where you wrote this in the first place?


I'm just bothered that, logically, the * operator acts on a variable, and I should be able to directly adjust that variable, instead of blindly passing an ethereal item that is acquired only through the & operator. I want to know what exactly is that "address thing" that is obtained via &.

It just doesn't make sense that I can't just increment the address to move up the memory (assuming I want to edit something contiguous) if this "address thing" is some mysterious dark matter.


That's what the %p format specifier is for. You can easily print a pointer and see what it is, and add to it, etc.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int pos = 0;
    int *myIntList = NULL;
    int *middleOfList = NULL;
    int *singleIntPtr = NULL;

    /* Allocate enough room for 100 ints. Assign the address to the myIntList pointer. */
    myIntList = (int *)malloc(100 * sizeof(int));
    if (myIntList == NULL) {
        return 1; /* allocation failed */
    }
    for (pos = 0; pos < 100; pos++) {
        myIntList[pos] = pos;
    }
    middleOfList = myIntList + 50;
    for (pos = 0; pos < 100; pos++) {
        singleIntPtr = myIntList + pos;
        /* %p expects a void *, so cast the pointer before printing it. */
        printf("The value at address %p is: %d\n", (void *)singleIntPtr, *singleIntPtr);
    }
    printf("\n");
    for (pos = 0; pos < 50; pos++) {
        singleIntPtr = middleOfList + pos;
        printf("The value at address %p is: %d\n", (void *)singleIntPtr, *singleIntPtr);
    }
    printf("\n");
    for (pos = 0; pos < 10; pos++) { /* Let's print without using an intermediate pointer */
        printf("The value at address %p is: %d\n", (void *)(myIntList + pos), *(myIntList + pos));
    }
    free(myIntList);
    return 0;
}


As you can see, you can inspect what's in the pointer without issue; it's not some murky, intangible thing. It's an address in memory, but decidedly NOT an int. It is some number of bytes, different per platform, that indicates a position in memory. You can add to it, but pointer addition behaves differently from regular addition. If you add 1 to an int, the result is just 1 greater than the value stored in the int. If you add 1 to an int *, the resulting address is sizeof(int) bytes (often 4) greater than the address stored in that int *. The same holds for every pointer type: adding 1 to a double * gives an address sizeof(double) bytes greater, adding 1 to an int ** gives an address sizeof(int *) bytes greater, etc. You can apply ++, etc. to iterate through an array if you have a pointer to its base, but using the [] operator is generally much easier and clearer to someone who is reading your code.

int *myList = NULL;
int x,y;
myList = (int *)malloc((size_t)(10*sizeof(int)));
myList[5]=101;
x = myList[5];
y = *(myList+5);


At the end of this code, x and y will both contain the value 101. myList[5] and *(myList+5) are equivalent, but in my mind myList[5] is a much easier way to get the 6th element of myList.


I hope you see where I am finding it hard to get my head around the concept....

Don't worry, most people have the same kind of problems. It is not a concept that comes up in other areas, so it's something that's "brand new" when you start programming. Read more examples, ask more questions, and hopefully it will become clear.

-Lee

lee1210
Aug 13, 2009, 09:11 AM
Sorry for the double post, but after I finished my first reply I saw this and couldn't let it stand...

the size of address (to any kind of data, be an integer, a char, a double etc.) is always of the size of an int. That's why in many bad code examples you see a pointer casted to an int.

This is wrong. If you compile and run this code on a system running a 64-bit OS in 64-bit mode, you will get a different answer than on a 32-bit OS:

#include <stdio.h>

int main(int argc, char *argv[]) {
    if (sizeof(int) == sizeof(void *)) {
        printf("On this platform, an int is the same size as a pointer.\n");
    } else {
        printf("On this platform, an int is not the same size as a pointer.\n");
    }
    return 0;
}


This is the reason code that assumes sizeof(int) == sizeof(void *) breaks when moving to 64-bit platforms.

No offense intended to imaxx, but this misconception leads to all sorts of problems.

-Lee

Flynnstone
Aug 13, 2009, 11:45 AM
You will never hard-code an address while programming C.
-Lee

For embedded systems, this is done quite often when accessing a particular port.

It sounds like this is what he's up to. But ... you really need to know what you're doing. And you need to remember "volatile".

lee1210
Aug 13, 2009, 11:51 AM
For embedded systems, this is done quite often when accessing a particular port.

It sounds like this is what he's up to. But ... you really need to know what you're doing. And you need to remember "volatile".

Perhaps I should have added "if you ever do need to do this, by then you ought to know this stuff like the back of your hand". I didn't mean to exclude the niche (though important) uses for this, and "never" was a bit too strong, but typing "in the vast majority of cases you will never do this" is a lot longer than just saying "you will never do this" =).

-Lee

zippyfly
Aug 13, 2009, 01:32 PM
Thanks everyone. And I'm off to read http://boredzo.org/pointers/ now.

I do understand the usage of pointers; it's just that because there are some quirks about it that seem counter-intuitive to me right now, I am not 100% sure. Maybe 90% or even 99% sure. But due to the trace of doubt, I might have to stop, open a book and double check. I wouldn't, for instance, need to double check for other items that I totally get (say, looping, or even the OOP aspects of class extension, etc.).

So I am just trying to pinpoint those nits about pointers (as mentioned in this thread).

(BTW I am totally not there doing embedded stuff, haha, but your bringing it up makes sense to me and I can see where it is applicable, which is why I tend to ask the questions I do because I would imagine there might be some use for hard coded addressing, and it turns out there is).

Again, thanks. Hope I am more clear after reading that Web page. We shall see...

zippyfly
Aug 13, 2009, 08:22 PM
OK, I got a question, since it's been brought up that we don't know where things are stored in memory:

I was checking out an example in a tutorial and the following was used:

int abc[total];
int *myPtr;

...

for (myPtr = &abc[0]; myPtr < &abc[total]; myPtr++)

...

Question is, how do we know that when we increment myPtr++ that the next item is the next in the array? Seeing that we don't know where things are stored and they might not be contiguous. Or are they contiguous? Incrementing the pointer seems to just be "adding one" to the memory address?

I also don't quite understand how the condition part works

myPtr < &abc[total];


???

Thanks.

mdeh
Aug 13, 2009, 08:33 PM
I also don't quite understand how the condition part works



This may not be the answer you wish to hear, but really top-notch members have already tried to help you. I was in a similar position as you trying to understand everything, and in the end, I went to the books, as (and I may be wrong) what you need is an orderly progression, and then it will make sense.

So, if you really want to start somewhere, here is a suggestion. Steve Kochan's book about Objective-C covers both C and Obj-C and soon these questions, although still relevant, will be put more in context and therefore may not drive you as crazy as they appear to be doing.
Just my 2c worth.

lee1210
Aug 13, 2009, 08:37 PM
&abc[total] is the address just off the end of abc, as there are total elements, from 0 to total-1. So the condition in the for loop checks that the memory address of myPtr is less than just off the end of abc, meaning it's still "in" the memory of abc.

Arrays are stored contiguously in memory, so incrementing myPtr using ++ moves you to the next element. It does NOT increment the address in myPtr by 1 byte. As discussed in my previous post, adding to pointers is done based on the size of the element pointed to, so in this case each myPtr++ advances the address by sizeof(int) bytes. Effectively, this steps through every element of abc using pointer addition.

Don't worry about where, specifically, in memory things are stored. That's what the pointer is for. It holds the address so you know how to get to the memory you need.

-Lee

zippyfly
Aug 13, 2009, 10:23 PM
This may not be the answer you wish to hear, but really top-notch members have already tried to help you. I was in a similar position as you trying to understand everything, and in the end, I went to the books, as (and I may be wrong) what you need is an orderly progression, and then it will make sense.

Hi. Yes, I know and agree. Also I really appreciate all the help everyone has offered. But I'm certain that I can't be the only one hung up about this stuff so perhaps this "exploration" would help others in the future, as part of the Web literature.

I have and have read Steve Kochan's excellent book. 2nd Edition. As I mentioned, the pointer stuff right now for me is just "brute force" memorization. Like many things in life that don't make sense, we just have to "live with it" and do it the way we are told.

But I'm also sure that (most things) in computer science are infinitely more logical than the rest of the world, so I'm struggling to figure out that logic.

I'll give another example:

#include <Foundation/Foundation.h>

// define function

void printMessage (NSString *msg)
{
NSLog(msg);
}

int main (int argc, const char * argv[]) {

// call it

printMessage(@"Hello folks! Thanks for your help!");

return 0;
}

So, we're passing the string literal to the function.

Whereas in other places, if I dereference the pointer, I get the ultimate value it is pointing to, and the pointer without the * operator holds the address.

So in this case printing to NSLog without the * prints the string object. But I just don't get why we shouldn't be using *msg instead, to obtain the final value that the object pointer is holding.

Writing [msg printMe] to call the printMe method of object msg makes sense to me in ObjC, but not the above C syntax.

Cinder6
Aug 13, 2009, 10:33 PM
printMessage(), in your example, is sending a pointer. When the compiler encounters that string literal, it sets aside an area in memory for it, then slaps a pointer into the function call.

It may not feel like you're sending it a pointer, but you are, via the compiler.

zippyfly
Aug 13, 2009, 10:38 PM
Yup, I understand that. The function is defined as taking an object pointer argument.

My gripe is why it's not

void printMessage (NSString *msg)
{
NSLog(*msg);
}

and is rather

NSLog(msg);

Because like you said, the address is being passed. So, not using * should be printing the address, and not the final value it resolves to.

autorelease
Aug 13, 2009, 11:07 PM
My gripe is why it's not

void printMessage (NSString *msg)
{
NSLog(*msg);
}

and is rather

NSLog(msg);

Because you never dereference object pointers when working with Objective-C. That happens deep inside the library.

The reason why NSLog takes an (NSString *) as opposed to an NSString is that it's much more efficient to pass the address of a large object to a function (called "passing by reference") than to take the entire contents of the object and copy them onto the stack (called "passing by value"). An NSString or other object might take up a huge amount of memory, but a pointer is always 4 (or 8) bytes.

One of the main difficulties stems from the fact that they gave the * symbol multiple meanings. The first is actually part of the type name:
int *p = &foo;
declares a variable called p with a type of (int *).
The second use is the dereference operator, which says "interpret this variable as a memory address, and operate on the value at that address":
*p = someothernumber;

So the first use of * acts like a 'pointer' keyword:
int pointer p = &foo;
And the second use of * acts like a 'valueAt' function:
valueAt(p) = someothernumber;

Guiyon
Aug 13, 2009, 11:10 PM
...and is rather

NSLog(msg);

Because like you said, the address is being passed. So, not using * should be printing the address, and not the final value it resolves to.

Not really, because when you're calling NSLog you don't care what the final value resolves to, it's not printing whatever is at that address directly. Instead it's handing the address of the object off to the Objective-C runtime in order to extract out the printable information. In Objective-C, all object references are pointers so you are simply telling NSLog that you have an object located at that address which you would like to print.

Edit: autorelease beat me to it, with a much better explanation no less.

lee1210
Aug 13, 2009, 11:11 PM
http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Miscellaneous/Foundation_Functions/Reference/reference.html#//apple_ref/doc/uid/20000055-BCIJAAIA

NSLog takes an NSString * for a format string, then a variable number of arguments that will be filled in at the positions of format specifiers in the format string. There is nothing, ever, that takes an NSString instead of an NSString *. That's not how the object system in Objective-C works.

You're also changing course pretty dramatically. There is never a local object in Objective-C. Objects live on the heap. Period. You can only have local pointers to them. You do not dereference them. You access them via message passes, and pass around references to them via pointers.

-Lee

Cinder6
Aug 13, 2009, 11:31 PM
At this point, I'm inclined to say that the answer to many of your questions is: "That's the way it's done. It's also much cheaper to pass around small pointers than large objects."

zippyfly
Aug 14, 2009, 02:10 AM
OK thanks yet again. I can sense the frustration from some of you, and apologize for that. It's also the main reason why I've not really pursued this train of thought for so long, but just decided to keep at it this time despite annoying people.

And it has been worth it because there were a number of key aspects you guys just highlighted, which pretty much escaped me in my infinite loop of questioning why it is done the way it is (since I was oblivious to the "goings on" in the background).

So, I guess I can probably say, "I get it now!" Thanks to all of you.

The key things that I didn't really know before (and kept hammering on the syntax of the * operator usage being inconsistent) are:

"One of the main difficulties stems from the fact that they gave the * symbol multiple meanings."

and

"...it's not printing whatever is at that address directly. Instead it's handing the address of the object off to the Objective-C runtime in order to extract out the printable information."

and

"It's also much cheaper to pass around small pointers than large objects."

(same as "it's much more efficient to pass the address of a large object to a function.")

These things might be obvious to you old hands, but I didn't know this. I didn't, for example, realize that it's done this way to gain higher efficiency, or that the method doesn't actually handle the task directly and keeps passing it on to lower level routines (which is why the passing by reference is involved, so as to optimize this process).

I guess I'm the type of person that needs these explanations about something before I will be happy to move on to the next thing.

Once again, thanks. I am currently having a lot of fun building GUI apps, following along in Cocoa® Programming for Mac® OS X, Third Edition (Aaron H.).

Cheers!

imaxx
Aug 14, 2009, 03:48 AM
This is wrong.


Hehe, my fault, I removed a whole lengthy chunk of my post explaining the difference between 32-bit and 64-bit pointers.
No offence, after all you just implicitly affirmed that a skilled reverse code engineer doesn't know such a difference ;)

Cromulent
Aug 14, 2009, 08:11 AM
Ultimately, pointers and memory management are quite a complex area. Using direct addresses on modern computers is often meaningless, as the hardware supports virtual memory. This basically means that each application has its own address space. This affects direct memory access, because the same memory address used in two different applications actually refers to completely different areas of physical memory.

Also in modern operating systems programs are restricted to the memory that they themselves have been allocated and can not access memory outside of their memory space (with the exception of shared memory - but this must be explicitly allocated by the process).

As you can probably tell there is far more under the surface than you can actually see when using C which is why C is still considered a high level programming language. You do not need to get involved in such areas and can just make assumptions based on the C standard and let the implementer and operating system developer work out this kind of thing for you.

Hopefully that is somewhat helpful.

Cinder6
Aug 14, 2009, 12:18 PM
This is why I think a lot of schools are crazy backwards in the order they teach languages. At mine, it went Java -> Assembly -> C, which just seemed...odd. I bet people would have gotten much better grades (my assembly class's average grade was a 60, I think) had the progression been C -> Assembly -> Java.

zippyfly
Aug 14, 2009, 11:26 PM
I would agree although I'd reckon doing very simple Assembler first might give a good idea of how things work at the CPU level, then going up to either C or Java. Although Java (which I learned some 10 years ago, also by reading books) does not have pointers! ;-)

Anyway eventually I will dabble into modern Assembler to understand how things really translate into object code but will stick with Obj-C so long as OS X continues kicking ass in the real world ;-)