[Home]

I set out to find the state of the art in random number generation. It is more difficult than I expected. George Marsaglia is an expert in the field. One of his posts explains the problem and gives code for several good RNGs (latest versions of KISS, MWC256, and CMWC4096). He compares choosing the "best" random number generator to choosing the best in a "Miss America Contest". To be politically correct, I'll change his analogy to choosing the best automobile. If someone asked you which was the best car to get, how would you answer? The answer depends on the needs of the user. If he needs to move a lot of stuff around, you might recommend a pickup truck. If he needs to drive his family around, you might recommend an SUV, minivan, or 4-door sedan. Price and performance are also factors to consider. Then, there are the aesthetic issues. Which car, in terms of looks and design, is the most attractive to the user? So it goes on and on. And so it goes with random number generators.

The problem is that random number generators are not truly random. If you had access to a truly random source on your computer, you could just use that. Instead, you're trying to find an algorithm to generate random-looking numbers. As John von Neumann famously remarked, "Anyone who considers arithmetical methods of producing random numbers is, of course, in a state of sin." The quest is flawed from the start.

Generating random numbers for cryptography is very different than generating numbers for simulation. Most random number generators are designed to have good statistical properties; they are not designed to be unpredictable. I'm mainly concerned with RNGs with good statistical properties. For cryptographically secure random number generators, look at Yarrow, Blum Blum Shub, ISAAC, or one based on a secure digest or cipher. See, for example, Peter Gutmann's paper on Software Generation of Practically Strong Random Numbers. The Mersenne Twister FAQ also has suggestions for making it cryptographically secure.

The experts use the following criteria to measure a random number generator. See Pierre L'Ecuyer's Random Number Generation (2004, pdf) for a more thorough discussion of quality criteria.

**Good Theoretical Basis**. The RNG should be based on an algorithm with provably good statistical properties. It's fair to ask, "Why should I care about this if my random number generator passes all the standard statistical tests?" The problem is that statistical tests cover the tip of the iceberg. That is, statistical tests are very limited in scope due to computational limitations. You can't search all numbers produced by an RNG with a period of 2^131 for statistical anomalies. It's similar to cryptography. Just because no one has cracked your cipher, doesn't mean you have a good cipher. Statistical tests can prove that an RNG is bad; they can't prove that an RNG is good.**Long Period**. The RNG should have a "long" period. Every RNG based on a finite state eventually repeats the numbers it generates. The period should be long enough to ensure that the RNG does not cycle in practice. A period of 2^32 or less is too short. It seems like a period above 2^60 is sufficient.**"Pass" Empirical Statistical Tests**. There are many tests suites available: TestU01, DIEHARD, SPRNG, and NIST's Tests. Most statistical tests compute a p-value which should be uniform over [0,1). Poor RNGs will fail many tests with p-values at or very close to 0 or 1. However, I have yet to see an RNG which passes every test perfectly. As Marsaglia writes, "p happens". Pierre L'Ecuyer writes, "Building a RNG that*passes all statistical tests*is an impossible dream." He continues: "So we may say that*bad*RNGs are those that fail simple tests, whereas*good*RNGs fail only complicated tests that are hard to find and run."**Efficient**. Many researchers place a strong emphasis on efficiency. It's not clear to me why. Obviously, an RNG which is very cpu-intensive will be impractical. The emphasis you place on this criterion depends a great deal on your application for the RNG. If you require generating many numbers very quickly, then you'll place a high value on performance. Likewise if you need to implement the RNG in hardware and have very limited resources. But, for many applications, efficiency will be low on the list of priorities.**Repeatable**. For simulation, you would like an RNG that is repeatable. In other words, given the same starting state, it should generate the same numbers. For cryptographers this is less important; they need RNGs that are unpredictable.**Portable**. In order for an RNG to be generally useful, it should be portable. That means that it should be relatively easy to implement on a wide range of hardware, operating systems, and programming environments. RNGs which use only 32-bit integers are more portable than RNGs that use 64-bit integers.

- Pierre L'Ecuyer is one of the experts in the field. He has an excellent web site with lots of information on random number generators. Best of all, many of his published papers are available there. Especially good are, Random Number Generation (2004, pdf) and Uniform Random Number Generation (1994, ps). If you want to experiment with random number generators, check out TestU01. It has many random number generator implementations and many statistical tests.
- The latest version of George Marsaglia's DIEHARD test suite.
- George Marsaglia's USENET posts. Marsaglia's 1999 post gives code for several good RNGs. Another post in 2003 explains the difficulty in finding the "best" RNG and gives code for several RNGs with very long periods. For those obsessed with RNGs with very long periods, Marsaglia provides code for several RNGs with periods much longer than the famous Mersenne Twister. CMWC4096 has a period of 2^131086.
- pLab is a great source for information about random number generators.
- PRNG is portable C implementation of several random number generators.
- GNU Scientific Library (GSL) includes several very good random number generators including Mersenne Twister.

Unfortunately, it depends on what you need it for. For simple uses, a well-designed linear congruential generator may be fine. If you need a cryptographically secure RNG, choose one that was designed to be secure (most RNGs weren't).

Ideally, you should test the RNG yourself. I list several test suites above. Even better is if you have statistical tests specific to how you will use the RNG. It's good to verify that the implementation you're using is good and that the RNG doesn't exhibit any biases when applied to your type of application.

There are dozens of very good RNGs. None is perfect, but none has obvious flaws. None is clearly superior to the others. How do you choose? It's as if you had chosen the best automobile for your needs and the salesman asks whether you want it to be black, silver, white, blue, or red. Which is best?

In Random Number Generation, L'Ecuyer recommends combined MRG's, combined LFSR's, and Mersenne twisters. GSL's Manual recommends several RNGs including Mersenne Twister. George Marsaglia has reservations about the complexity of Mersenne Twister, but considers it to be a good RNG nonetheless. The criticisms of Mersenne Twister tend to be aesthetic -- there are many simpler and faster RNGs with good foundations in theory which perform as well in statistical tests.

Although some experts may not like the aesthetic design of Mersenne
Twister, I haven't seen any substantial criticisms.
**Matsumoto's Mersenne
Twister seems to be a safe choice**. It has strong support
from people knowledgeable in the field, a good theoretical basis, a
long period, good performance characteristics, and empirical
statistical support.

[Last Modified: 9 June 2004]