# How is the size of an integer decided?

Discussion in 'Mac Programming' started by celia, Jun 25, 2007.

1. ### celia macrumors newbie

Joined:
Jun 24, 2007
#1
Hi,

How is the size of an integer decided?
- Is it based on the processor, the compiler, or the OS?

Thanks.

2. ### gauchogolfer macrumors 603

Joined:
Jan 28, 2005
Location:
American Riviera
#2
Do you want to know what determines the maximum size of permitted integers (i.e. 32-bit versus 64-bit)?

3. ### garethlewis2 macrumors 6502

Joined:
Dec 6, 2006
#3
For Java, it is built into the language specification: an int is always 32 bits.

For OS X and Windows, if memory serves me correctly, it is defined in types.h which specifies how much memory is allocated for each type. It was done in terms of char which was defined as

#define char 1

So an int used to be defined as

#define int char*4

Probably no longer determined in terms of char, as what is a char? Is it 8 bits, 16 bits, or is it UTF-8, which uses a variable number of bytes per character? wchar_t is 16 bits on Windows, but that isn't set in stone.

4. ### celia thread starter macrumors newbie

Joined:
Jun 24, 2007
#4
Is it based on the 32/64-bit processor, or is it implementation-dependent?

Code:
```#define char 1

So an int used to be defined as

#define int char*4```
So it does not depend on the processor (32/64-bit)?

5. ### cblackburn macrumors regular

Joined:
Jul 5, 2005
Location:
London, UK
#5
Yes and no. C is/was designed to be a portable language so you could take your programs and run them on another computer easily. Hence the compiler can make an int whatever size it is told to; however, if you set an int to be 64 bits on a 32-bit machine, the computer has to break the 64-bit word down into two 32-bit words, perform the operation on both parts sequentially, then stitch the result back together, which is a *lot* slower than just doing it on a 32-bit integer.

Hence, usually, the size of an int is set to whatever the size of the processor ALU is; however, this is purely a performance choice.

Chris

6. ### Krevnik macrumors 68030

Joined:
Sep 8, 2003
#6
Uhm, this is a bit strange, since int is defined as a language keyword in C. It is defined by the compiler, not headers. Hence why you could (in theory) write an app with no headers if all it did was return a code based on some arguments/etc. (since main is supposed to return an int, but can return void)

The compiler defines int as the ALU size (processor bit-size), and I believe if you read K&R C and the ANSI C spec, this is the intended design. You /can/ override it with custom compilers, but then your compiler doesn't adhere to either C standard if you do.

In a C program, if you need to know the size of an int, you can use the sizeof(int) expression to do it (if for some reason you run on multiple architectures). Most platform APIs provide defined types which represent the preferred integer of the platform, to make it easier to write code for a platform that has both 32-bit and 64-bit APIs.

7. ### GeeYouEye macrumors 68000

Joined:
Dec 9, 2001
Location:
State of Denial
#7
The most careful thing you can do is use the header-based typedef'ed integers from <stdint.h>:

int32_t, uint32_t, int64_t, uint64_t, etc., assuming whatever you're coding will have the same libraries on all platforms. Otherwise, just watch out and use sizeof() liberally.

8. ### toddburch macrumors 6502a

Joined:
Dec 4, 2006
Location:
Katy, Texas
#8
I would have to say the size of an integer is predicated on the size of the CPU's registers. However, certainly, a compiler could define any length it wanted and subsequently leverage, or work-around, the actual hardware.

In the early days of PCs, registers were 2 bytes (16 bits), thus, a "word" was 2 bytes, and so was an integer.

As processors evolved, and addressing did too, registers moved to 4 bytes long, and thus, it made sense to define an integer as 4 bytes.

With 64 bit processors, registers (on the mainframe, at least) are 8 bytes. However, we have not seen integers evolve to 8 bytes, and I'm guessing they will be 4 bytes for some time to come. On an IBM 64-bit box, for example, with 8 byte registers, if the 4-byte instructions are used (AKA 31-bit instructions), only the right half of an 8-byte register is used. No muss, no fuss.

Todd

9. ### Krevnik macrumors 68030

Joined:
Sep 8, 2003
#9
This is how it works with the Core 2 Duo (x64) and the G5 (PPC64) as well. 32-bit mode only uses half of the 8-byte register available, but if you are running a 64-bit clean app (compiled for 64-bit), then your int will be 8 bytes, as will your pointers.

10. ### toddburch macrumors 6502a

Joined:
Dec 4, 2006
Location:
Katy, Texas
#10
Yeah, same on the mainframe. It's a compiler option to exploit full 64-bit or not. However, the terminology for an "integer" is still 4 bytes. If referring to 64-bit integers, we say "8-byte integers", or, "double-word integers". A word is still 4 bytes.

Todd

PS: Just as an FYI, I want to point out that I did not make a typo when I said "31-bit". PCs / Macs / other machines might be 32 bit, but IBM machines, up until lately, only addressed 31 bits. The high order bit (left-most bit) was used to indicate addressing mode, which could be 24-bit mode (<= 16MB) when off or 31-bit mode (<=2GB) when on. On the 64-bit machines, there are other mechanisms to set and query addressing mode.

(And, to be complete, yes, registers on a mainframe are certainly a full 32-bits, allowing for 32 bits of precision, but only 31 bits of memory can be addressed)

11. ### iSee macrumors 68040

Joined:
Oct 25, 2004
#11
Here's what the C99 spec has to say:

12. ### Krevnik macrumors 68030

Joined:
Sep 8, 2003
#12
This seems to speak to the mainframes you have been exposed to. Different architectures use different terminology. Windows still defines a word as 16-bit, and a 32-bit int is a double word. Mainframes I have worked on used 48-bit words (as you can imagine, anything that uses bitpacking and assumes 48-bit word boundaries is interesting to port to home architectures).

In C-speak, an int is an int and is the size of the CPU registers (as stated quite simply by iSee). char is a byte, short is a 16-bit integer, longs are 32-bits, and long longs are 64-bit. wchar is a 16-bit unsigned integer for UTF-16 support. Those definitions don't change just because the metal does. Those who do assembly work tend to be a bit more interested in the specific definitions of what a word is.

13. ### Sayer macrumors 6502a

Joined:
Jan 4, 2002
Location:
Austin, TX
#13
Platform bickering aside, the size of an integer is determined by the total number of possible values it may represent.

E.g. a byte, or 8 bits, can have one of 256 different possible values from 0-255 or 0x00 to 0xff in hex. This would be an unsigned int, btw, meaning it has only non-negative values. A signed integer can be positive or negative, and this affects the range of possible values such that you can represent from -128 to +127.

A "short" int in old-school Mac parlance, typically two bytes or 16 bits and typed as UInt16 on the Mac now, has a max. (unsigned) value of 65,535 or 0xffff in hex.

A "long" is a 32-bit value, or four bytes, and has a max. (unsigned) value of 4,294,967,295 or 0xffffffff in hex. The type in Mac programming (Carbon typ.) is UInt32.

A float is typ. 32 bits wide, a double is 64 bits I believe (may depend on implementation what the max. value is).

The PowerPC accesses memory in chunks sized as multiples of 4 bytes, so making memory structures aligned to four bytes is more efficient for the CPU. Example:

Code:
```struct myStruct {

    UInt32   myVal;            // 4 bytes
    UInt16   halfVal;
    UInt16   otherHalfVal;     // two 16s = 32 bits, or 4 bytes
    UInt16   someOtherHalfVal; // only 2 bytes
    UInt16   fillerVal;        // fills the leftover 16 bits so this struct stays
                               // 4-byte aligned; also gives room for expansion later on

};```

14. ### garethlewis2 macrumors 6502

Joined:
Dec 6, 2006
#14
For C, int is defined by your compiler. The compiler is compiled using a C environment and an int is defined using types.h.

This is why you cannot trust that an int is 32 or 64 bit. You must always, always use sizeof when mallocing memory. NSInteger in Leopard is just syntactic sugar that is globally set in your project.

It doesn't matter what your registers hold on the CPU, it hasn't done for a good 20 years. When ANSI made C an actual certified language, the sizeof an int was set to 4 bytes. This is true for all C environments. Those that code on ****** compilers on embedded environments use a variant of C known as ****tard C.

15. ### techgeek macrumors member

Joined:
Jun 11, 2004
Location:
UK
#15
No need to get abusive.
Some of us have to make a living in those embedded environments, thank you very much!

A char is 8 bits
An int is at least 16 bits
These values can be greater, but not smaller.

From the C99 standard:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

Section 5.2.4.2.1 Sizes of integer types <limits.h>

"...implementation defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign."

 number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8

 minimum value for an object of type int
INT_MIN -32767 // −((2^15) − 1)
 maximum value for an object of type int
INT_MAX +32767 // (2^15) − 1
 maximum value for an object of type unsigned int
UINT_MAX 65535 // (2^16) − 1

16. ### gekko513 macrumors 603

Joined:
Oct 16, 2003
#16
Using gcc:

int - 32 bit
long - 32 bit in a 32-bit environment, 64 bit in a 64-bit environment
long long - always 64 bit

17. ### Krevnik macrumors 68030

Joined:
Sep 8, 2003
#17
Scary thing is... you are right and showed us all we were mostly wrong.

To verify, I built a command-line app that was 32-bit and 64-bit that ran the following code:

Code:
```#include <stdio.h>

int main (int argc, const char * argv[]) {

    /* sizeof yields a size_t, so the matching format is %zu, not %d */
    printf("Variable Sizes...\n");
    printf("short: %zu bytes\n", sizeof(short));
    printf("long: %zu bytes\n", sizeof(long));
    printf("int: %zu bytes\n", sizeof(int));
    printf("long long: %zu bytes\n", sizeof(long long));

return 0;
}```
Results on 32-bit:

Code:
```Variable Sizes...
short: 2 bytes
long: 4 bytes
int: 4 bytes
long long: 8 bytes```
Results on 64-bit:

Code:
```Variable Sizes...
short: 2 bytes
long: 8 bytes
int: 4 bytes
long long: 8 bytes```

18. ### lazydog macrumors 6502a

Joined:
Sep 3, 2005
Location:
Cramlington, UK
#18
There's something I don't quite understand. sizeof() returns a value of type size_t which, if I remember correctly, is defined as unsigned int. So I guess the results above are fine for 32bit architecture, but for 64 bit size_t wouldn't be big enough for all cases.

Out of interest perhaps you could print out the result of sizeof( size_t )?

thanks

b e n

19. ### Krevnik macrumors 68030

Joined:
Sep 8, 2003
#19
size_t is 8 bytes. If you want the actual definition itself, it isn't unsigned int:

Code:
```#if defined(__GNUC__) && defined(__SIZE_TYPE__)
typedef __SIZE_TYPE__		__darwin_size_t;	/* sizeof() */
#else
typedef unsigned long		__darwin_size_t;	/* sizeof() */
#endif```
And elsewhere there is a typedef of __darwin_size_t to size_t. So the compiler can define __SIZE_TYPE__ (or the programmer) to force it to a particular size, but otherwise it is an unsigned long.

20. ### lazydog macrumors 6502a

Joined:
Sep 3, 2005
Location:
Cramlington, UK
#20
Krevnik

Thanks for taking the time to answer.

b e n

21. ### gekko513 macrumors 603

Joined:
Oct 16, 2003
#21
Even if size_t was only 32 bit it would be more than big enough to hold the values returned by sizeof(long long). Since sizeof(long long) is 8, it only really needs a four bit unsigned value to store the answer (binary 1000).

22. ### lazydog macrumors 6502a

Joined:
Sep 3, 2005
Location:
Cramlington, UK
#22
Well, as an example why size_t needs to be 64bit, you could have something like

Code:
```int big_array[ 16000000000 ] ;
…
size_t size = sizeof( big_array ) ;
```

b e n

23. ### gekko513 macrumors 603

Joined:
Oct 16, 2003
#23
Oh right. I misunderstood your question then.

24. ### MongoTheGeek macrumors 68040

Joined:
Sep 13, 2003
Location:
Its not so much where you are as when you are.
#24
You could cause some interesting breakages with that.

25. ### Krevnik macrumors 68030

Joined:
Sep 8, 2003
#25
Even malloc() takes a size_t, so size_t needs to be the same bit size as your memory pointers.