macOS Random number generator for scientific computation

MrFusion · Jan 13, 2011

Hi all

I need to implement a random number generator at work for a Monte Carlo simulation. The generator has to provide millions of random numbers, without too much of a discernable pattern. Unfortunately, I don't know anything about random number generators or how to judge their quality. Perhaps someone could point me towards a decent one and, if available, a tutorial on how the algorithm works and how to implement it?

robbieduncan · Jan 13, 2011

You really want to implement this yourself? Surely there is a good-enough one out there you could use instead?

subsonix · Jan 13, 2011

I've heard good things about arc4random, not sure if that is enough for what you need.

http://www.manpagez.com/man/3/arc4random/

gnasher729 · Jan 13, 2011

MrFusion said:
Hi all

I need to implement a random number generator at work for a Monte Carlo simulation. The generator has to provide millions of random numbers, without too much of a discernable pattern. Unfortunately, I don't know anything about random number generators or how to judge their quality. Perhaps someone could point me towards a decent one and, if available, a tutorial on how the algorithm works and how to implement it?

Google for "Marsaglia". That should eventually lead you to a random number generator with a period of 10^45000 or so.

dmi · Jan 13, 2011

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html

If you need one for which it is computationally infeasible to discover a pattern,
http://csrc.nist.gov/groups/ST/toolkit/random_number.html

MrFusion · Jan 13, 2011

subsonix said:
I've heard good things about arc4random, not sure if that is enough for what you need.

http://www.manpagez.com/man/3/arc4random/

Yes, from what I have read this one is better than random(). From what I understood, however, it starts repeating after a few thousand numbers. That is unfortunately not good enough.

robbieduncan said:
You really want to implement this yourself? Surely there is a good-enough one out there you could use instead?

No, I don't necessarily have to implement it myself. I just need a decent one for millions of random numbers. random or arc4random is not going to cut it, from what I have read.

balamw · Jan 13, 2011

gnasher729 said:
Google for "Marsaglia".

Based on his Wikipedia page, I must say the man picks great titles for his papers. Will have to read some if I ever need an RNG.

I'm particularly curious about the Monty Python method.

B

subsonix · Jan 13, 2011

MrFusion said:
Yes, from what I have read this one is better than random(). From what I understood, however, it starts repeating after a few thousand numbers. That is unfortunately not good enough.

I think you will have a hard time implementing something better; artificial randomness is never completely random. I think there are hardware solutions that offer better results, perhaps look into that.

dmi · Jan 13, 2011

gnasher729 said:
Marsaglia

http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoap/1177005878
http://tbf.coe.wayne.edu/jmasm/vol2_no1.pdf

MrFusion · Jan 13, 2011

MrFusion said:
Hi all

I need to implement a random number generator at work for a Monte Carlo simulation. The generator has to provide millions of random numbers, without too much of a discernable pattern. Unfortunately, I don't know anything about random number generators or how to judge their quality. Perhaps someone could point me towards a decent one and, if available, a tutorial on how the algorithm works and how to implement it?

Thanks everyone for the quick replies!

Wikipedia says this about the Mersenne twister:
"The Mersenne Twister is designed with Monte Carlo simulations and other statistical simulations in mind. Researchers primarily want high quality numbers but also benefit from its speed and portability." Exactly what I need.

subsonix said:
I think you will have a hard time implementing something better; artificial randomness is never completely random. I think there are hardware solutions that offer better results, perhaps look into that.

I know it is only a pseudo-random generator when choosing software over hardware. But surely, one can do better than a few thousand random numbers?

subsonix · Jan 13, 2011

MrFusion said:
I know it is only a pseudo-random generator when choosing software over hardware. But surely, one can do better than a few thousand random numbers?

You might be right, the requirements for encryption are probably different.

MrFusion · Jan 13, 2011

subsonix said:
You might be right, the requirements for encryption are probably different.

No idea. How many random numbers does one need when encrypting/decrypting something?

subsonix · Jan 13, 2011

MrFusion said:
No idea. How many random numbers does one need when encrypting/decrypting something?

Well, at least you need it to be unpredictable, without patterns. All I'm saying is that you often see these types of functions described as "cryptographically good" or not, and that is not necessarily the same thing as being good in general. Same thing with hashing functions like MD5 or SHA-1, btw.

chown33 · Jan 13, 2011

MrFusion said:
Yes, from what I have read this one is better than random(). From what I understood, however, it starts repeating after a few thousand numbers. That is unfortunately not good enough.

Where did you read or hear that arc4random() starts repeating after a few thousand?

It's supposed to be a cryptographically strong algorithm, and repeats after that short an interval would be completely unacceptable for cryptography.

Hansr · Jan 13, 2011

Go with MT19937; its period is 2^19937 - 1, and it's the one we use for Monte Carlo sims. Marsaglia's Ziggurat RNG is also an option; it works a bit faster but IIRC had some issues with larger dimensional problems. MT19937 is useful up to 623 dimensions. MT2203 is useful up to 1024 dimensions but slower than MT19937.

AlmostThere · Jan 14, 2011

Mersenne Twister

C++ available as part of Boost
http://www.boost.org/doc/libs/1_45_0/boost/random/mersenne_twister.hpp

Java available as part of Colt
http://acs.lbl.gov/software/colt/api/cern/jet/random/engine/package-summary.html

Python - used by the standard Python library
http://docs.python.org/library/random.html
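
For anyone who wants to try the Mersenne Twister without installing anything, here is a minimal sketch using Python's standard `random` module, which is documented as using the Mersenne Twister as its core generator. The `estimate_pi` function, the sample count, and the seed value are my own choices for illustration, not something from this thread. It estimates pi from uniform points in the unit square and shows that the same seed reproduces the same result.

```python
import random


def estimate_pi(n_samples, seed):
    """Estimate pi from the fraction of random points inside the unit quarter circle."""
    # A private generator instance, so the seed affects only this run.
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x = rng.random()  # uniform float in [0.0, 1.0)
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    # Two runs with the same (arbitrary) seed use the same number stream.
    first = estimate_pi(1_000_000, seed=12345)
    second = estimate_pi(1_000_000, seed=12345)
    print("estimate:", first)
    print("reproducible:", first == second)
```

Each call consumes two random numbers per sample, so the run above uses two million numbers per estimate, which is tiny compared to the generator's period.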

mobilehaathi · Jan 14, 2011

I use GSL (GNU Scientific Library, http://www.gnu.org/software/gsl/) daily. Their PRNGs are pretty good (http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Generation.html). Best of luck!

Bill McEnaney · Jan 16, 2011

How about the ones in the GNU Scientific Library?

http://www.gnu.org/software/gsl/

MrFusion · Jan 17, 2011

Bill McEnaney said:
How about the ones in the GNU Scientific Library?

http://www.gnu.org/software/gsl/

A good option, but GSL uses the GPL rather than an MIT license.

gnasher729 · Jan 17, 2011

mobilehaathi said:
I use GSL (GNU Scientific Library, http://www.gnu.org/software/gsl/) daily. Their PRNGs are pretty good (http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Generation.html). Best of luck!

Just posted on comp.lang.c and other places:

http://groups.google.com/group/comp.lang.c/browse_thread/thread/3cb5ab44ff758e6b#

"Random number generator with a period exceeding 10^(40 million)".

Mactrillionaire · Jan 17, 2011

The inherent defect in pseudorandom generators is that they only approximate randomness in an aggregate sense over a very large sample size (i.e., in the sense of "law of large numbers"). From a security standpoint, the fatal flaw is that if you know the seed then you can predict the sequence every time. From a simulation standpoint, you also do not get any guarantees about similarity to real world samples.

True randomness is defined as "if all numbers in the range have been picked an equal number of times (including zero), pick any; otherwise, pick any number in the range that has been picked less than the maximum selections of any number selected" (i.e., any permutation of a set without a repeat selection until the range is exhausted). There are two big inherent problems with true randomness: First, most real world examples are not going to benefit in accuracy from true randomness as opposed to artificial randomness. Secondly, there is a huge security problem as you finish selecting each range permutation. As an example, suppose you are simulating a six-sided dice roll. On the first roll of a truly random die, you select any number with probability 1/6. On the second roll of a truly random die, you select another number not yet selected with probability of 1/5, followed by 1/4, 1/3, 1/2 and absolute certainty on the last roll of selecting the number not yet selected in this permutation. So, even approximating this approach would be unwise from a security standpoint since, at some point, the probability of being able to guess what's next approaches and later arrives at certainty. The other issue from a pragmatic standpoint is that any attempt to use a pseudorandom algorithm to arrive at a truly random selection will bottleneck your program because the probability that it will pick truly random numbers exponentially decreases as you move closer to exhausting the range, so you would have to sabotage the randomness in order to stop the program from becoming inefficient very quickly.

In conclusion, your best bet is to collect some samples of real world data and analyze the properties thereof. You should then compare the various pseudorandom algorithms available and see which algorithm typically produces samples with properties most similar to the properties of the real world data. From there, you can make the simulation even better by generating many sets of pseudorandom data and then selecting from those sets the set which is, in terms of analyzed properties, most like the real world data. I know this approach makes life a lot more complicated than most would like, but let's face it: A simulation with garbage data is hardly more helpful than relying on a bad guess about expected real world outcomes. In other words, the data has to be similar to what you'd expect to see in the real world or otherwise the simulation is just garbage data in, garbage results out.

MrFusion · Jan 17, 2011

Mactrillionaire said:
A simulation with garbage data is hardly more helpful than relying on a bad guess about expected real world outcomes. In other words, the data has to be similar to what you'd expect to see in the real world or otherwise the simulation is just garbage data in, garbage results out.

Agreed. I have real world results to compare with. The simulations are to better understand the underlying physical processes and which parameters are important in explaining the observed results. Each parameter set will also be run a few times and averaged.
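
A rough sketch of the "run a few times and average" idea, in Python. Everything here is a placeholder: `one_run` stands in for the real simulation (it just estimates the integral of x^2 on [0, 1]), and the names, seeds, and run counts are assumptions for illustration. Using consecutive integers as seeds is a convenience, not a guarantee of independent streams.

```python
import random
import statistics


def one_run(n_samples, seed):
    """Toy stand-in for a simulation: Monte Carlo estimate of the integral of x^2 on [0, 1]."""
    rng = random.Random(seed)
    return sum(rng.random() ** 2 for _ in range(n_samples)) / n_samples


def averaged_runs(n_runs, n_samples, base_seed=2011):
    """Repeat the run with a different seed each time; return the mean and its standard error."""
    results = [one_run(n_samples, base_seed + i) for i in range(n_runs)]
    mean = statistics.mean(results)
    std_err = statistics.stdev(results) / n_runs ** 0.5
    return mean, std_err


if __name__ == "__main__":
    mean, std_err = averaged_runs(n_runs=5, n_samples=100_000)
    print(f"estimate = {mean:.5f} +/- {std_err:.5f} (exact value is 1/3)")
```

The spread between runs gives a rough error bar, which helps when deciding whether a difference between two parameter sets is real or just noise.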

chown33 · Jan 17, 2011

Mactrillionaire said:
True randomness is defined as "if all numbers in the range have been picked an equal number of times (including zero), pick any; otherwise, pick any number in the range that has been picked less than the maximum selections of any number selected"

Citation please. Or is that just your definition of "true randomness"?

I don't see how that can be "true randomness", since it fails to cover many real-world randomness models. It's random-deck-of-cards randomness, which is to say, randomly picking from a shrinking set of discrete entries. That model doesn't begin to represent all the ways that randomness appears in nature (an N-sided die, for example).

As an introductory article, I recommend the Wikipedia article:
http://en.wikipedia.org/wiki/Randomness

As with any Wikipedia article, be sure to read the external links, citations, etc. at the end of the article.

gnasher729 · Jan 17, 2011

Mactrillionaire said:
True randomness is defined as "if all numbers in the range have been picked an equal number of times (including zero), pick any; otherwise, pick any number in the range that has been picked less than the maximum selections of any number selected" (i.e., any permutation of a set without a repeat selection until the range is exhausted).

That would actually be considered fatally non-random. Imagine a lottery using that as a random number generator for six numbers + bonus number out of 49. Every seventh week I could predict the six numbers + bonus number that will be drawn. Unfortunately, so could everyone else.
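
The lottery argument is easy to check with a short Python sketch. The function names and the seed are made up for illustration; the "generator" follows the no-repeats-until-exhausted rule quoted above, drawing 7 of 49 numbers per week, and the predictor only looks at the earlier draws.

```python
import random


def draw_weeks_without_repeats(pool_size=49, per_draw=7, seed=0):
    """Return weekly draws in which no number repeats until the pool is exhausted."""
    rng = random.Random(seed)
    remaining = list(range(1, pool_size + 1))
    rng.shuffle(remaining)
    draws = []
    while remaining:
        draws.append(sorted(remaining[:per_draw]))
        remaining = remaining[per_draw:]
    return draws


def predict_last_draw(previous_draws, pool_size=49):
    """Deduce the final draw by elimination, using only the earlier draws."""
    seen = {n for draw in previous_draws for n in draw}
    return sorted(set(range(1, pool_size + 1)) - seen)


if __name__ == "__main__":
    weeks = draw_weeks_without_repeats()
    guess = predict_last_draw(weeks[:-1])
    print("week 7 drawn:    ", weeks[-1])
    print("week 7 predicted:", guess)
```

With 49 numbers and 7 per draw the pool runs out after exactly seven weeks, so the seventh draw is fully determined by the first six, no matter how the shuffle was seeded.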

Mactrillionaire · Jan 17, 2011

chown33 said:
Citation please. Or is that just your definition of "true randomness"?

I don't see how that can be "true randomness", since it fails to cover many real-world randomness models. It's random-deck-of-cards randomness, which is to say, randomly picking from a shrinking set of discrete entries. That model doesn't begin to represent all the ways that randomness appears in nature (an N-sided die, for example).

As an introductory article, I recommend the Wikipedia article:
http://en.wikipedia.org/wiki/Randomness

As with any Wikipedia article, be sure to read the external links, citations, etc. at the end of the article.

In the strict sense of the word, TRUE randomness means "continuing to positively maintain the ongoing assertion of equivalent probability". Of course, as was mentioned, this very strict definition of randomness is inherently deterministic. Furthermore, there is no real world example that I can think to cite because the real world is not like true randomness even though there are a few cases which closely approximate it. Most people refer to randomness loosely in an inexact fashion which doesn't exactly help the matter of what they are trying to debate. A pseudorandom number generator never produces an unpredictable result, by the way. A pseudorandom number generator only produces results that when viewed aggregately can be said to be random. For instance, over binary range, 0101 and 1010 are both truly random sequences, whereas 0011 and 1100 are only aggregately random. The truly random samples "continue to positively maintain the ongoing assertion of equivalent probability" whereas the aggregately random samples only succeed in achieving quantitative parity with "the ongoing assertion of equivalent probability" after n selections.

gnasher729 said:
That would actually be considered fatally non-random. Imagine a lottery using that as a random number generator for six numbers + bonus number out of 49. Every seventh week I could predict the six numbers + bonus number that will be drawn. Unfortunately, so could everyone else.

It depends what kind of lottery you are talking about. If it is the one with numbered balls that shoot out of a tank, talking about mathematical randomness is barking up the wrong tree. Lottery selections like this aren't even close to being random. The physics of the ink on each individual ball has more to do with it than anything else.
