Random number generator for scientific computation

Discussion in 'Mac Programming' started by MrFusion, Jan 13, 2011.

  1. MrFusion macrumors 6502a

    Joined:
    Jun 8, 2005
    Location:
    West-Europe
    #1
    Hi all

    I need to implement a random number generator at work for a monte carlo
    simulation. The generator has to provide millions of random numbers, without too much of a discernable pattern. Unfortunately, I don't know anything about random number generators or how to judge their quality. Perhaps someone could point me towards a decent one and, if available, a tutorial on how the algorithm works and how to implement it?
     
  2. robbieduncan Moderator emeritus

    robbieduncan

    Joined:
    Jul 24, 2002
    Location:
    London
    #2
    You really want to implement this yourself? Surely there is a good-enough one out there you could use instead?
     
  3. subsonix macrumors 68040

    Joined:
    Feb 2, 2008
    #3
  4. gnasher729 macrumors P6

    gnasher729

    Joined:
    Nov 25, 2005
    #4
    Google for "Marsaglia". That should eventually lead you to a random number generator with a period of 10^45000 or so.
     
  5. MrFusion, Jan 13, 2011
    Last edited by a moderator: Jan 13, 2011

    MrFusion thread starter macrumors 6502a

    Joined:
    Jun 8, 2005
    Location:
    West-Europe
    #6
    Yes, from what I have read this one is better than random(). From what I understood, however, it starts repeating after a few thousands numbers. That is unfortunately not good enough.

    No, I don't necessarily have to implement it myself. I just need a decent one for millions of random numbers. random or arc4random is not going to cut it, from what I have read.
     
  6. balamw Moderator

    balamw

    Staff Member

    Joined:
    Aug 16, 2005
    Location:
    New England
    #7
    Based on his Wikipedia page, I must say the man picks great titles for his papers. Will have to read some if I ever need an RNG.

    I'm particularly curious about the Monty Python method.

    B
     
  7. subsonix macrumors 68040

    Joined:
    Feb 2, 2008
    #8
    I think you will have a hard time implementing something better, artificial randomness is never completely random. I think there's hardware solutions that offer better results, perhaps look into that.
     
  8. MrFusion, Jan 13, 2011
    Last edited by a moderator: Jan 13, 2011

    MrFusion thread starter macrumors 6502a

    Joined:
    Jun 8, 2005
    Location:
    West-Europe
    #10
    Thanks everyone for the quick replies!

    Wikipedia says this about the Mersenne twister:
    "The Mersenne Twister is designed with Monte Carlo simulations and other statistical simulations in mind. Researchers primarily want high quality numbers but also benefit from its speed and portability." Exactly what I need.

    I know it is only a pseudo-random generator when choosing software over hardware. But surely, one can do better than a few thousand random numbers?
     
  9. subsonix macrumors 68040

    Joined:
    Feb 2, 2008
    #11
    You might be right, the requirements for encryption is probably different.
     
  10. MrFusion thread starter macrumors 6502a

    Joined:
    Jun 8, 2005
    Location:
    West-Europe
    #12
    No idea. How many random numbers does one need when encrypting/decrypting something?
     
  11. subsonix macrumors 68040

    Joined:
    Feb 2, 2008
    #13
    Well, at least you need it to be non predictable, without patterns. All I'm saying is that you often see these types of functions referred to as "cryptographically good" or not, not necessarily being the same. Same thing with hashing functions like md5 or sha1 btw.
     
  12. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #14
    Where did you read or hear that arc4random() starts repeating after a few thousand?

    It's supposed to a cryptographically strong algorithm, and repeats after that short an interval would be completely unacceptable to cryptography.
     
  13. Hansr macrumors 6502a

    Joined:
    Apr 1, 2007
    #15
    Go with MT19937 the period is 2^19937 -1 that's the one we use for monte carlo sims. Marsaglia's Ziggurat RNG is also an option, it works a bit faster but IIRC had some issues with larger dimensional problems. MT19997 is useful up to 623 dimensions. MT 2203 is useful up to 1024 dimensions but slower than MT19997.
     
  14. Bill McEnaney macrumors 6502

    Joined:
    Apr 29, 2010
  15. MrFusion thread starter macrumors 6502a

    Joined:
    Jun 8, 2005
    Location:
    West-Europe
    #19
    A good option, but gsl uses gpl rather than a mit license.
     
  16. gnasher729 macrumors P6

    gnasher729

    Joined:
    Nov 25, 2005
    #20
  17. Mactrillionaire macrumors regular

    Joined:
    Oct 16, 2010
    #21
    The inherent defect in pseudorandom generators is that they only approximate randomness in an aggregate sense over a very large sample size (i.e., in the sense of "law of large numbers"). From a security standpoint, the fatal flaw is that if you know the seed then you can predict the sequence every time. From a simulation standpoint, you also do not get any guarantees about similarity to real world samples.

    True randomness is defined as "if all numbers in the range have been picked an equal number of times (including zero), pick any; otherwise, pick any number in the range that has been picked less than the maximum selections of any number selected" (i.e., any permutation of a set without a repeat selection until the range is exhausted). There are two big inherent problems with true randomness: First, most real world examples are not going to benefit in accuracy from true randomness as opposed to artificial randomness. Secondly, there is a huge security problem as you finish selecting each range permutation. As an example, suppose you are simulating a six-sided dice roll. On the first roll of a truly random die, you select any number with probability 1/6. On the second roll of a truly random die, you select another number not yet selected with probability of 1/5, followed by 1/4, 1/3, 1/2 and absolute certainty on the last roll of selecting the number not yet selected in this permutation. So, even approximating this approach would be unwise from a security standpoint since, at some point, the probability of being able to guess what's next approaches and later arrives at certainty. The other issue from a pragmatic standpoint is that any attempt to use a pseudorandom algorithm to arrive at a truly random selection will bottleneck your program because the probability that it will pick truly random numbers exponentially decreases as you move closer to exhausting the range, so you would have to sabotage the randomness in order to stop the program from becoming inefficient very quickly.

    In conclusion, your best bet is to collect some samples of real world data and analyze the properties thereof. You should then compare the various pseudorandom algorithms available and see which algorithm typically produces samples with properties most similar to the properties of the real world data. From there, you can make the simulation even better by generating many sets of pseudorandom data and then selecting from those sets the set which is, in terms of analyzed properties, most like the real world data. I know this approach makes life a lot more complicated than most would like, but let's face it: A simulation with garbage data is hardly more helpful than relying on a bad guess about expected real world outcomes. In other words, the data has to be similar to what you'd expect to see in the real world or otherwise the simulation is just garbage data in, garbage results out.
     
  18. MrFusion thread starter macrumors 6502a

    Joined:
    Jun 8, 2005
    Location:
    West-Europe
    #22
    Agreed. I have real world results to compare with. The simulations are to better understand the underlying physical processes and which parameters are important in explaining the observed results. Each parameter set will also be run a few times and averaged.
     
  19. chown33 macrumors 604

    Joined:
    Aug 9, 2009
    #23
    Citation please. Or is that just your definition of "true randomness"?

    I don't see how that can be "true randomness", since it fails to cover many real-world randomness models. It's random-deck-of-cards randomness, which is to say, randomly picking from a shrinking set of discrete entries. That model doesn't begin to represent all the ways that randomness appears in nature (an N-sided die, for example).

    As an introductory article, I recommend the Wikipedia article:
    http://en.wikipedia.org/wiki/Randomness

    As with any Wikipedia article, be sure to read the external links, citations, etc. at the end of the article.
     
  20. gnasher729 macrumors P6

    gnasher729

    Joined:
    Nov 25, 2005
    #24
    That would actually be considered fatally non-random. Imagine a lottery using that as a random number generator for six numbers + bonus number out of 49. Every seventh week I could predict the six numbers + bonus number that will be drawn. Unfortunately, so could everyone else :mad:
     
  21. Mactrillionaire, Jan 17, 2011
    Last edited by a moderator: Jan 17, 2011

    Mactrillionaire macrumors regular

    Joined:
    Oct 16, 2010
    #25
    In the strict sense of the word, TRUE randomness means "continuing to positively maintain the ongoing assertion of equivalent probability". Of course, as was mentioned, this very strict definition of randomness is inherently deterministic. Furthermore, there is no real world example that I can think to cite because the real world is not like true randomness even though there are a few cases which closely approximate it. Most people refer to randomness loosely in an inexact fashion which doesn't exactly help the matter of what they are trying to debate. A pseudorandom number generator never produces an unpredictable result, by the way. A pseudorandom number generator only produces results that when viewed aggregately can be said to be random. For instance, over binary range, 0101 and 1010 are both truly random sequences, whereas 0011 and 1100 are only aggregately random. The truly random samples "continue to positively maintain the ongoing assertion of equivalent probability" whereas the aggregately random samples only succeed in achieving quantitative parity with "the ongoing assertion of equivalent probability" after n selections.

    It depends what kind of lottery you are talking about. If it is the one with numbered balls that shoot out of a tank, talking about mathematical randomness is barking up the wrong tree. Lottery selections like this aren't even close to being random. The physics of the ink on each individual ball has more to do with it than anything else.
     

Share This Page