PDA

View Full Version : Segmentation fault

se6git
Feb 17, 2009, 02:13 PM
Trying to learn c from a very simple book (c programming in easy steps) which is working ok for a duffer like me.
I have though stumbled across an issue not discussed in my simple book. The book goes through a short example of recursive functions.
The code below works up to a point, that point being somewhere between 100,000 and 1,000,000. Entering "1000000" gets a Segmentation fault at 737975
#include <stdio.h>

void count_down_from(int x);

int main()
{
int num;
printf("Enter a positive integer to count down from: ");
scanf("%d", &num);
count_down_from(num);
return 0;
}

void count_down_from(int x)
{
printf("%d\n",x);
--x;
if(x<0) return;
else count_down_from(x);
}

Cromulent
Feb 17, 2009, 02:25 PM
The code below works up to a point, that point being somewhere between 100,000 and 1,000,000. Entering "1000000" gets a Segmentation fault at 737975

That number is too big to store in an int.

A signed int can only store numbers between -32,767 and 32,767 and an unsigned int can only store numbers between 0 and 65,535.

You need to use a long if you want to store numbers that size (or a long long if you want to store astronomically large numbers).

For reference a signed long can store numbers between -2,147,483,647 and 2,147,483,647 and an unsigned long can store numbers between 0 and 4,294,967,295.

autorelease
Feb 17, 2009, 03:01 PM
That number is too big to store in an int.

That probably isn't the problem. On a 32- or 64-bit machine, an int is 32 bits, and thus can store a (signed) number between -2,147,483,648 and 2,147,483,647.

You're probably getting a segfault because you're running out of stack space. Your count_down_from() function is written recursively, so if your compiler isn't doing tail-call optimization, your code will create a new stack frame for every call of count_down_from(). If this function is calling itself hundreds of thousands of times, you could wind up using several _megabytes_ of memory. On my computer, and probably yours too, the maximum stack size is 8MB, so it's reasonable that your code is blowing that away and dying.

Recursive functions definitely have there uses, but this is one example where an iterative version (a for or while loop) would be many orders of magnitude more efficient. Function calls don't come free, there is overhead involved when calling and returning from a function. Normally it's negligible, but when recursion is involved it can be a severe detriment to performance.

After you've got a good grasp on the concepts of C, I highly recommend that you (and anyone else learning to program) learn about the inner workings of computer systems (how the stack and the heap work, how caches work, how the processor pipeline works, assembly language, etc.) since it will make you a _much_ smarter and more efficient C programmer.

lee1210
Feb 17, 2009, 03:26 PM
This thread reminded me of another that deals with recursion, stack space, etc. that might be of interest:

-Lee

gnasher729
Feb 17, 2009, 06:52 PM
The code below works up to a point, that point being somewhere between 100,000 and 1,000,000. Entering "1000000" gets a Segmentation fault at 737975

Interesting that people still use the term "Segmentation fault" when these Intel processors haven't made use of segments for many years.

se6git
Feb 18, 2009, 12:29 AM
Thanks all. The limit on signed integers was my first thought but the fact that 100000 worked seemed to discount this.
The book mentions that recursive functions aren't the optimal way of performing the task done in the code provided but doesn't really say why. Now I know.
Just out of curiosity, is there any way of clearing the stack between recursions so that it doesn't fill?
One more thing, no pressure, can anyone recommend a good (i.e. big pictures and crayon) source of info on the workings of the stack, heap etc.?
I have to say that the answers given here where remarkably lucid. Similar inquiries on a unix/linux site would result in a nose bleed.

Sander
Feb 18, 2009, 04:00 AM
"Clearing the stack" would defeat the purpose. What goes on the stack is the "context" of the current function, so that when you return from the function and "pop" this context, you'll end up at the place it was called from.

If you could clear this, returning from functions would no longer work, and your recursion would break.

Think of it as one of those spikes onto which you can push papers. When you're asked to go do something, you write down what you're currently working on, push it onto the spike, and go do that new thing. When you're done, you take the top paper off the spike and read what you were working on, i.e. "restore your state" and continue where you left off. This works recursively: when you get another interruption during your interruption, you'll simply push another paper onto the spike, etc.

Cromulent
Feb 18, 2009, 06:40 AM
That probably isn't the problem. On a 32- or 64-bit machine, an int is 32 bits, and thus can store a (signed) number between -2,147,483,648 and 2,147,483,647.

Not quite true. The C standard states that an int is at least 16 bits in size. So if you are writing code that you want to be cross platform compatible you really should use a long for 32 bit integers.

But I know what you mean and I am being pedantic but it should be kept in mind.

autorelease
Feb 18, 2009, 12:16 PM
Not quite true. The C standard states that an int is at least 16 bits in size. So if you are writing code that you want to be cross platform compatible you really should use a long for 32 bit integers.

But I know what you mean and I am being pedantic but it should be kept in mind.

This is true, but on a modern PC, int is usually the size of a machine word, which is 32 bits.

To make things more confusing, a long int is 32 bits on a 32-bit machine and 64 bits on x86_64.

If you need to ensure that a variable is always a specific size across platforms (for example, if you're writing binary data to a file or communicating over a network), include <stdint.h>. It defines integer types with explicit sizes, like int16_t, int32_t, int64_t, etc., which are typedef'd to the appropriate primitive types for the respective platform.