pointers - please explain

Discussion in 'Mac Programming' started by satans_banjo, Dec 16, 2005.

  1. macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #1
    okay - i've scoured the internet for an explanation i understand. what are pointers, why would i use them and is there anything else i need to know?
     
  2. Moderator

    balamw

    Staff Member

    Joined:
    Aug 16, 2005
    Location:
    New England
    #2
    Have a look at this: http://www.cprogramming.com/tutorial/lesson6.html

    B
     
  3. thread starter macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #3
    yeah ive already looked at that. i'm wondering why anyone would use pointers? have you got any examples of applications for pointers (by applications i dont mean programs, i mean practical uses)
     
  4. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #4
    Try not to think about pointers as anything special, don't let the &blabla and *blabla syntax confuse you.

    Think of a straight list of items (the memory in a computer), numbered from 0 to 127.

    Normal variables like an int or a float each take up a certain number of those 128 items in the list.

    For example, an int takes 4 slots in the list (on most current systems). So somewhere in that list of 128 there is a group of 4 items that holds the value of the int. But where does it start? We don't know, so a pointer holds the number of the first (usually) slot in the list. So (int *myInteger) might mean: starting at slot 124, an int variable is held.

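    BUT, the * in a declaration only says that the variable holds a location. To get at the value stored there, the pointer needs to be "dereferenced" by writing *myInteger in an expression.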

    Also, when you create an int without a pointer, internally it's actually still stored at some location, such as 124 in the example; the compiler just keeps track of that for you. The reason you use int *myInt is so that YOU can access the location in memory of the variable. You can also pass a pointer around, and change the pointer (the location in memory it refers to) without changing the actual value of the original.

    A double pointer ** just means that this pointer points to a location in memory, and that location in turn holds a pointer to another location. You can think of it as links in a web page: they just point to different places, and at the end of the chain is the first slot of the actual item.

    It's faster to pass a pointer (to a function) than to pass the variable itself, because some things such as structures are large and expensive to copy, while a pointer is just 4 bytes long (on a 32-bit system). Passing the memory location is like saying "it's located here" instead of handing the whole large thing to someone.
     
  5. thread starter macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #5
    so if i declare a pointer to an int, then printf the pointer, would that return the memory address of the int?
     
  6. macrumors 6502a

    Joined:
    Jun 17, 2003
    #6
    Every application you use relies heavily on pointers.

    A pointer (or memory reference) ultimately allows the programmer to accomplish one of two things:

    1. dynamically allocate memory for data elements and operate over them where the number of elements is not known until application runtime.
    2. build arbitrary memory data structures to represent and/or store some application-specific information.

    An example of the first is in a Mail application. The developer does not know how many pieces of mail you will have in your inbox. So instead of building an array to hold the messages the developer builds up some form of list structure. Generally this means that the code only holds onto a pointer to the start of the list, then knows how to traverse the list, add elements to the list and remove elements from the list. The upper size limit for the list is now dependent on the amount of memory available.

    An example of the second item is a list as roughly described above. Alternatively, think of a tree data structure. A simple binary tree consists of nodes that each hold a data value and a "left" and a "right" pointer. Data is stored in the tree so that it stays sorted according to particular criteria. One common scheme is that elements with a "lower" value are stored in the left branches of the tree and "higher" values are stored to the right. This type of scheme allows much faster access to individual data elements than a flat, unsorted structure does. A tree, though not a binary tree, is also the primary internal structure used to represent a web page while a browser renders it. Safari, along with every other graphical web browser, uses some form of tree while rendering a page.

    If you are studying computer science then the next class after you learn how to use pointers is usually a class on different types of data structures and the trade-offs with each structure.

    The key concept and often one of the most difficult for people to grasp is that a byte of memory in a computer just stores a number. That number can represent an integer, it can represent a character, it can represent the colour of a pixel, or it can be some other memory location since memory locations are numbered from 0 starting at the beginning of memory and increasing by one for each byte. A pointer is just a piece of memory that stores the address of another piece of memory.
     
  7. thread starter macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #7
    so i've gathered this so far:

    Code:
    int variable;          // declares a variable
    variable = 4;          // assigns the value 4 to the variable
    int *pointer;          // declares a pointer to an int
    pointer = &variable;   // stores the memory address of variable in the pointer
    *pointer = 5;          // follows the pointer to variable and changes its value to 5
    
    so is that correct?
     
  8. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #8
    Yes, exactly.
     
  9. macrumors 6502a

    Joined:
    Jun 17, 2003
    #9
    If you declare:

    Code:
    int a = 10;
    int *b;
    
    then
    printf("%d\n", a) will output "10".
    printf("%p\n", (void *)b) will output whatever memory address is stored in the pointer "b", that is, the address b points to. (Until something is assigned to b, that value is garbage. Use %p rather than %d to print pointers.)
    printf("%d\n", *b) will output whatever value you have assigned to the int variable that b points to. (Assuming you have previously pointed b at valid storage, for example memory from malloc, and assigned it a value.)
    printf("%p\n", (void *)&a) will output the memory address of the variable "a". (The storage for variable a is automatically allocated by the compiler.)
    printf("%p\n", (void *)&b) will output the memory address of the pointer variable "b". (This is the location where the value of the pointer is stored, not the location of the integer that b points to.)

    A good way to understand all of this is to write up a small program such as the following and play with the print statements to see what is going on:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void) {
        int a = 10;
        int *b;

        b = malloc(sizeof *b);           // allocate memory for the int pointed to by b
        if (b == NULL) {
            return 1;                    // allocation failed
        }

        *b = 20;                         // set value of int pointed to by b

        printf("value of 'a': %d\n", a);
        printf("value of 'b': %p\n", (void *)b);
        printf("value of '*b': %d\n", *b);
        printf("value of '&a': %p\n", (void *)&a);
        printf("value of '&b': %p\n", (void *)&b);

        free(b);                         // release the allocated int
        return 0;
    }

    The output will be something like:

    Code:
    value of 'a': 10
    value of 'b': 0x5000f0
    value of '*b': 20
    value of '&a': 0xbffff398
    value of '&b': 0xbffff39c

    In this run, the pointer "b" holds the memory address 0x5000f0. This means that "20" is stored in the memory location 0x5000f0. The pointer itself is stored in memory location 0xbffff39c. (Addresses are printed with %p, usually in hexadecimal. If you print them with %d instead, which is not strictly valid for pointers, high addresses like these show up as negative numbers such as -1073744996, because the same bits are being read as a signed int.)

    So, in this code, variable "a" is stored at memory address 0xbffff398, and the value at this memory location is "10". Variable "b" is stored at memory address 0xbffff39c and has the value 0x5000f0. The memory address 0x5000f0 is the location pointed to by "b" and contains the value "20".
     
  10. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #10
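    Yes, as long as pointer was declared as int *pointer and already points at a real int.
    So if it's,

    int *pointer = &variable;
    *pointer = 5;

    then yes, that changes the value of variable. (With a bare int *pointer; that was never given an address, *pointer = 5 writes to some random location and will probably crash.) But if it's,

    int *pointer = &variable;
    pointer = (int *)5;

    then this just changes the memory address that it's pointing to, and variable is untouched. (The cast is there because C won't quietly convert a plain integer into a pointer, and an address like 5 is not something you could safely dereference.)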
     
  11. thread starter macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #11
    ah okay. i've grasped the concept. i guess some practice will help me learn the syntax a bit better

    thanks to everyone who posted. you know you're on a great forum when people can explain a difficult programming concept to someone like me (i dont do much programming) and so quickly too!:)

    EDIT: one last question: out of curiosity, do any developers target a particular memory address for a specific variable? for example, would they change the value of &variable to make the program run more predictably?
     
  12. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #12
    They might pass a reference to the variable if the variable is too big to pass efficiently. For example, in Cocoa an object could be very big. Suppose you could write doSomethingWithObject(NSObject) (Objective-C doesn't actually let you pass objects by value, but a large C struct works this way): it would pass a copy of the variable, which would be very costly if it were big, since variables passed to a function are copied. doSomethingWithObject(NSObject *) passes the reference (pointer), so it's only copying 4 bytes on a 32-bit system, where copying the whole thing might mean 1 or 2 MB or more. Plus, you can't change the original thing passed to you if it's copied; you'd have to pass it back again and waste memory.

    It's like placing a link in a mail message instead of including the whole web page in the mail: you can click it and go if you want, and the mail is smaller and faster to download.

    Sometimes, if variables are located next to each other, it can be faster because they fit in the processor's cache together, but in general it's an unnecessary hassle to worry about the particular address a variable is located at and to place it "there"; a good OS and runtime will handle this transparently. (And you can't change the value of &variable anyway: the compiler and OS decide where a variable lives.)
     
  13. thread starter macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #13
    that's the best explanation i've seen. i'll remember that
     
  14. macrumors 6502a

    Joined:
    Jun 17, 2003
    #14
    Also, different languages handle passing arguments differently. Some languages default to passing a copy and you need to use a different syntax to pass a reference. Other languages only pass references so in this case you need to do some extra work as the programmer if you want pass by copy semantics.
     
  15. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #15
    Yea, C# and Java pass by reference by default don't they.(I have limited exp.)

    Do you know any OO languages that pass by copy by default?
     
  16. macrumors 603

    jeremy.king

    Joined:
    Jul 23, 2002
    Location:
    Fuquay Varina, NC
    #16
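    With the exception of primitives. (String is passed as a reference like any other object, but because it is immutable it behaves much like a value.)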
     
  17. macrumors 6502a

    Joined:
    Jun 17, 2003
    #17
    Arguably C++ is pass by value as a default. But that is a bit misleading.

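    Java passes atomic (primitive) types by value. For objects it passes the reference itself by value, which in practice behaves like pass by reference for the object's contents, although reassigning a parameter does not affect the caller. The exception is Remote Method Invocation: using RMI, ordinary objects are serialized and copied, so the semantics for both objects and atomic types are pass by value.

    C# works like Java for class instances, passes structs and other value types by copy, and has explicit ref and out keywords for true pass by reference.

    Python also passes object references.

    Ruby only allows passing object references.

    PHP uses pass by value by default, with & for explicit references; in PHP 5, objects are passed as handles, so they behave like references.

    Lisp when using CLOS passes object references as well, if my memory serves.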

    Smalltalk uses pass by reference semantics.

    Oberon, like Pascal has explicit pass by value and pass by reference syntax.

    Object COBOL (yes such evil exists in the world) allows pass by reference, pass by value and pass by content semantics. Pass by content means that a pointer to a copy of the data item is passed. How's that for twisted? :)
     
  18. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #18
    Wow, thats a lot of info.
    Hmm, most pass by reference, thats pretty interesting...
     
  19. macrumors 6502a

    Joined:
    Jun 14, 2005
  20. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #20
    Didn't your mother ever tell U its rude to point. ;)
     
  21. macrumors 6502a

    Joined:
    Jun 17, 2003
    #21
    Most OO languages pass by reference. A key reason is that most OO languages are designed to eliminate pointers. There is a subtle distinction between a pointer and a reference: a pointer holds a memory location and can be incremented and decremented, whereas a reference is just a handle to an object. Under the covers a reference is often implemented using a memory address but does not have to be implemented that way. The key thing is that a programmer should not be able to modify a reference to get access to another object. But a programmer can, for example, increment a pointer to access the next memory block.

    Pass by value was also mostly promoted as a way to avoid reliance on side-effects. Better program structuring theory and more focus on avoiding side-effects in programming courses have greatly reduced the reliance on side-effects in modern code.
     
  22. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #22
    It's interesting, I wonder if it would even be possible to implement a programming language and runtime in, say, C# that implements C#... IOW, programming in C, one could make another C runtime, C compiler or other languages, so the C runtime is implemented in C. But could a Java runtime or C# runtime be implemented in those languages, if they hide pointers? Isn't it a kind of weakness?

    I'm not sure there's actually a reason to get rid of pointers, though it sounds nice at first... since the basic design of every computer is to use a buffer of memory why hide it?
     
  23. macrumors 6502a

    Joined:
    Jun 17, 2003
    #23
    Implementing a language runtime, or more commonly a compiler for a language since many languages are compiled directly to native machine code, in the language itself is common practice. For example, a C compiler is generally written in C. There is a well defined process for getting to the point where a compiler for a new language is implemented using the new language itself and you also have a compiled version of the compiler. The process is known as "bootstrapping".

    Can a Java runtime be implemented in Java? Yes, there have been several projects to do just that. The fact that pointers are hidden in the language makes some aspects of the implementation difficult, but it can be worked around. The more difficult issues in implementing a Java runtime in Java are that the language gives the programmer no explicit mechanisms to allocate or deallocate memory, no atomic locking mechanisms, and no ways to control OS-level threads. But these issues can also be worked around.

    In some ways the omission of pointers from a language could be seen as a weakness. But what is really happening in these languages is that the language designer is removing the need for a programmer to explicitly manage memory allocation. Memory allocation and pointer manipulation are among the biggest sources of bugs, complexity and inefficiency in code. So the argument is that if you remove memory allocation and pointers from the language and rely on an automatic memory management mechanism, then there is less chance for dangerous and difficult-to-track-down memory-related bugs. In effect, removing pointers makes the language easier to use for most tasks.

    The trade-off is that some types of programs are a little more difficult to write. In most cases the programs that are made more complex are things like runtimes and operating systems that very few programmers ever actually implement. If you have automatic memory management, specifically garbage collection, then it is generally difficult to provide raw pointers in a language. A reason for this is that many garbage collectors move objects around in memory. With references, these object movements can be transparent, since the reference does not have to refer to a fixed location in memory. Pointers, however, refer to a specific location in memory, so they will break when such a collector moves the object they point to.

    Another way to look at it is to say that we already have a language which is very well suited to implementing runtimes and operating systems. That language is C. A language like Java doesn't have the same level of flexibility as C, since Java is abstracted further away from the underlying machine than C is. However, Java is a much safer language in which programmers are generally more efficient in terms of the time it takes to produce working code to solve a problem.

    The real question is whether every language needs to be able to easily solve every possible programming problem, or is it better to accept limitations in some languages when those limitations are the result of providing features that would otherwise be impossible to provide in the language?

    BTW, though C# is a reference-based garbage-collected language, it does allow the use of pointers by the programmer, inside code marked unsafe. The trick that C# uses is to let the programmer explicitly pin an object at its current memory location (the fixed statement), so the garbage collector will not move it until it is unpinned.
     
  24. thread starter macrumors regular

    Joined:
    Sep 12, 2005
    Location:
    SE London
    #24
    the language i'm learning is C, but my main aim is to move on from C to ObjC/Cocoa
     
  25. macrumors 68000

    Fukui

    Joined:
    Jul 19, 2002
    #25
    Right, but I wonder if there couldn't be a kind of compromise: instead of "Must Garbage Collect" or "Must Hide Pointers", provide a layered approach, like a base implementation using functions, pointers, and no collection or bounds checking, then build an object layer on top of that, then add collection etc. Then there wouldn't be any translation "layers" like JNI or the C# bridge...

    I didn't know garbage collectors manipulate pointers... I thought they just keep references to the memory (pointer) and then null it once it had no references... but then again I guess that's why they hide the pointers, that's how they count references! I thought instead the runtime would check if the code had pointers to the location of an object or struct, then if all the pointers to that particular location were nulled, it would be freed...

    It's interesting, I wonder how they'll implement the garbage collector in Obj-C. Probably only objects could be collected, but then they don't hide the pointers...

    Thanks for the info.
     
