Search

5.9 — Random number generation

// NOTE: QUIZ QUESTION AT BOTTOM IS NOT ON LIVE SITE, DO NOT OVERWRITE

The ability to generate random numbers can be useful in certain kinds of programs, particularly in games, statistics modeling programs, and scientific simulations that need to model random events. Take games for example -- without random events, monsters would always attack you the same way, you’d always find the same treasure, the dungeon layout would never change, etc… and that would not make for a very good game.

So how do we generate random numbers? In real life, we often generate random results by doing things like flipping a coin, rolling a dice, or shuffling a deck of cards. These events involve so many physical variables (e.g. gravity, friction, air resistance, momentum, etc…) that they become almost impossible to predict or control, and produce results that are for all intents and purposes random.

However, computers aren’t designed to take advantage of physical variables -- your computer can’t toss a coin, throw a dice, or shuffle real cards. Computers live in a controlled electrical world where everything is binary (false or true) and there is no in-between. By their very nature, computers are designed to produce results that are as predictable as possible. When you tell the computer to calculate 2 + 2, you always want the answer to be 4. Not 3 or 5 on occasion.

Consequently, computers are generally incapable of generating random numbers. Instead, they must simulate randomness, which is most often done using pseudo-random number generators.

A pseudo-random number generator (PRNG) is a program that takes a starting number (called a seed), and performs mathematical operations on it to transform it into some other number that appears to be unrelated to the seed. It then takes that generated number and performs the same mathematical operation on it to transform it into a new number that appears unrelated to the number it was generated from. By continually applying the algorithm to the last generated number, it can generate a series of new numbers that will appear to be random if the algorithm is complex enough.

It’s actually fairly easy to write a PRNG. Here’s a short program that generates 100 pseudo-random numbers:

The result of this program is:

20433	22044	9937	30185	29341	
14783	29730	8430	3076	28768	
18053	16066	26537	100	30493	
4943	19511	19251	6669	32117	
31575	3373	32383	30496	12710	
23999	11929	5425	9938	12107	
28541	1938	3450	20283	16726	
6440	4938	26094	24391	12248	
24803	30416	16244	19590	6644	
24646	4873	2841	23831	23476	
17958	8827	17400	32129	32760	
25744	25405	13591	8859	15932	
19086	19666	19265	14179	1165	
27168	20996	29427	5857	3434	
18964	11980	564	4620	400	
17362	16934	11889	419	9714	
19808	29699	3694	25612	5512	
20256	10009	10247	1860	1846	
1487	14030	2615	16035	8107	
28736	267	29395	9438	20294	

Each number appears to be pretty random with respect to the previous one. As it turns out, our algorithm actually isn’t very good, for reasons we will discuss later. But it does effectively illustrate the principle of PRNG number generation.

Generating random numbers in C++

C (and by extension C++) comes with a built-in pseudo-random number generator. It is implemented as two separate functions that live in the cstdlib header:

srand() sets the initial seed value to a value that is passed in by the caller. srand() should only be called once.

rand() generates the next random number in the sequence. Taht number will be a pseudo-random integer between 0 and RAND_MAX, a constant in cstdlib that is typically set to 32767.

Here’s a sample program using these functions:

Here’s the output of this program:

17421	8558	19487	1344	26934	
7796	28102	15201	17869	6911	
4981	417	12650	28759	20778	
31890	23714	29127	15819	29971	
1069	25403	24427	9087	24392	
15886	11466	15140	19801	14365	
18458	18935	1746	16672	22281	
16517	21847	27194	7163	13869	
5923	27598	13463	15757	4520	
15765	8582	23866	22389	29933	
31607	180	17757	23924	31079	
30105	23254	32726	11295	18712	
29087	2787	4862	6569	6310	
21221	28152	12539	5672	23344	
28895	31278	21786	7674	15329	
10307	16840	1645	15699	8401	
22972	20731	24749	32505	29409	
17906	11989	17051	32232	592	
17312	32714	18411	17112	15510	
8830	32592	25957	1269	6793

PRNG sequences and seeding

If you run the rand() sample program above multiple times, you will note that it prints the same result every time! This means that while each number in the sequence is seemingly random with regards to the previous ones, the entire sequence is not random at all! And that means our program ends up totally predictable (the same inputs lead to the same outputs every time). There are cases where this can be useful or even desired (e.g. you want a scientific simulation to be repeatable, or you’re trying to debug why your random dungeon generator crashes).

But often, this is not what is desired. If you’re writing a game of hi-lo (where the user has 10 tries to guess a number, and the computer tells them whether their guess is too high or too low), you don’t want the program picking the same numbers each time. So let’s take a deeper look at why this is happening, and how we can fix it.

Remember that each number in a PRNG sequence is generated from the previous number, in a deterministic way. Thus, given any starting seed number, PRNGs will always generate the same sequence of numbers from that seed as a result! We are getting the same sequence because our starting seed number is always 5323.

In order to make our entire sequence randomized, we need some way to pick a seed that’s not a fixed number. The first answer that probably comes to mind is that we need a random number! That’s a good thought, but if we need a random number to generate random numbers, then we’re in a catch-22. It turns out, we really don’t need our seed to be a random number -- we just need to pick something that changes each time the program is run. Then we can use our PRNG to generate a unique sequence of pseudo-random numbers from that seed.

The commonly accepted method for doing this is to enlist the system clock. Each time the user runs the program, the time will be different. If we use this time value as our seed, then our program will generate a different sequence of numbers each time it is run!

C comes with a function called time() that returns the number of seconds since midnight on Jan 1, 1970. To use it, we merely need to include the ctime header, and then initialize srand() with a call to time(0).

Here’s the same program as above, using a call to time() as the seed:

Now our program will generate a different sequence of random numbers every time! Run it a couple of times and see for yourself.

Generating random numbers between a given range

Generally, we do not want random numbers between 0 and RAND_MAX -- we want numbers between two other values, which we’ll call min and max. For example, if we’re trying to simulate the user rolling a die, we want random numbers between 1 and 6.

Here’s a short function that converts the result of rand() into the range we want:

To simulate the roll of a die, we’d call getRandomNumber(1, 6).

What is a good PRNG?

As I mentioned above, the PRNG we wrote isn’t a very good one. This section will discuss the reasons why. It is optional reading because it’s not strictly related to C or C++, but if you like programming you will probably find it interesting anyway.

In order to be a good PRNG, the PRNG needs to exhibit a number of properties:

First, the PRNG should generate each number with approximately the same probability. This is called distribution uniformity. If some numbers are generated more often than others, the result of the program that uses the PRNG will be biased!

For example, let’s say you’re trying to write a random item generator for a game. You’ll pick a random number between 1 and 10, and if the result is a 10, the monster will drop a powerful item instead of a common one. You would expect a 1 in 10 chance of this happening. But if the underlying PRNG is not uniform, and generates a lot more 10s than it should, your players will end up getting more rare items than you’d intended, possibly trivializing the difficulty of your game.

Generating PRNGs that produce uniform results is difficult, and it’s one of the main reasons the PRNG we wrote at the top of this lesson isn’t a very good PRNG.

Second, the method by which the next number in the sequence is generated shouldn’t be obvious or predictable. For example, consider the following PRNG algorithm: num = num + 1. This PRNG is perfectly uniform, but it’s not very useful as a sequence of random numbers!

Third, the PRNG should have a good dimensional distribution of numbers. This means it should return low numbers, middle numbers, and high numbers seemingly at random. A PRNG that returned all low numbers, then all high numbers may be uniform and non-predictable, but it’s still going to lead to biased results, particularly if the number of random numbers you actually use is small.

Fourth, all PRNGs are periodic, which means that at some point the sequence of numbers generated will eventually begin to repeat itself. As mentioned before, PRNGs are deterministic, and given an input number, a PRNG will produce the same output number every time. Consider what happens when a PRNG generates a number it has previously generated. From that point forward, it will begin to duplicate the sequence between the first occurrence of that number and the next occurrence of that number over and over. The length of this sequence is known as the period.

For example, here are the first 100 numbers generated from a PRNG with poor periodicity:

112	9	130	97	64	
31	152	119	86	53	
20	141	108	75	42	
9	130	97	64	31	
152	119	86	53	20	
141	108	75	42	9	
130	97	64	31	152	
119	86	53	20	141	
108	75	42	9	130	
97	64	31	152	119	
86	53	20	141	108	
75	42	9	130	97	
64	31	152	119	86	
53	20	141	108	75	
42	9	130	97	64	
31	152	119	86	53	
20	141	108	75	42	
9	130	97	64	31	
152	119	86	53	20	
141	108	75	42	9

You will note that it generated 9 as the second number, and 9 again as the 16th number. The PRNG gets stuck generating the sequence in-between these two 9’s repeatedly: 9-130-97-64-31-152-119-86-53-20-141-108-75-42-(repeat).

A good PRNG should have a long period for all seed numbers. Designing an algorithm that meets this property can be extremely difficult -- most PRNGs will have long periods for some seeds and short periods for others. If the user happens to pick a seed that has a short period, then the PRNG won’t be doing a good job.

Despite the difficulty in designing algorithms that meet all of these criteria, a lot of research has been done in this area because of its importance to scientific computing.

rand() is a mediocre PRNG

The algorithm used to implement rand() can vary from compiler to compiler, leading to results that may not be consistent across compilers. Most implementations of rand() use a method called a Linear Congruential Generator (LCG). If you have a look at the first example in this lesson, you’ll note that it’s actually a LCG, though one with intentionally picked poor constants. LCGs tend to have shortcomings that make them not good choices for most kinds of problems.

One of the main shortcomings of rand() is that RAND_MAX is usually set to 32767 (essentially 16-bits). This means if you want to generate numbers over a larger range (e.g. 32-bit integers), rand() is not suitable. Also, rand() isn’t good if you want to generate random floating point numbers (e.g. between 0.0 and 1.0), which is often useful when doing statistical modelling. Finally, rand() tends to have a relatively short period compared to other algorithms.

That said, rand() is perfectly suitable for learning how to program, and for programs in which a high-quality PRNG is not a necessity.

For applications where a high-quality PRNG is useful, I would recommend Mersenne Twister, which produces great results and is relatively easy to use.

A note for Visual Studio users

The implementation of rand() in Visual Studio has a flaw -- the first random number generated doesn’t change much for similar seed values. This means that when using time() to seed your random number generator, the first result from rand() won’t be as random as it should be. This problem is compounded by calling getRandomNumber(), which compresses similar inputs into the same output number.

However, there’s an easy fix: call rand() once and discard the result before using rand() in your program.

Random numbers in C++11

C++11 added a ton of random number generation functionality to the C++ standard library, including the Mersenne Twister algorithm, as well as generators for different kinds of random distributions (uniform, normal, Poisson, etc…). This is accessed via the <random> header.

Here’s a short example showing how to generate random numbers in C++11 using Mersenne Twister:

You’ll note that Mersenne Twister generates random 32-bit unsigned integers (not 15-bit integers like rand()), giving a lot more range. There’s also a version (std::mt19937_64) for generating 64-bit unsigned integers!

There’s so much functionality in <random> that it really warrants its own section. We’ll look to cover that in a future lesson in more detail.

// MOVE THIS TO END OF CHAPTER QUIZ

// NOTE: WE HAVEN’T COVERED RANDOM NUMBERS YET

Question #3

Implement a game of hi-lo. First, your program should pick a random integer between 1 and 100. The user is given 7 tries to guess the number. If the user does not guess the correct number, the program should tell them whether they guessed too high or too low. If the user guesses the right number, the program should tell them they won. If they run out of guesses, the program should tell them they lost, and what the correct number is. At the end of the game, the user should be asked if they want to play again. If the user doesn’t enter ‘y’ or ‘n’, ask them again.

Here’s what your output should look like:

Let's play a game.  I'm thinking of a number.  You have 7 tries to guess what it is.
Guess #1: 64
Your guess is too high.
Guess #2: 32
Your guess is too low.
Guess #3: 54
Your guess is too high.
Guess #4: 51
Correct! You win!
Would you like to play again (y/n)? y
Let's play a game.  I'm thinking of a number.  You have 7 tries to guess what it is.
Guess #1: 64
Your guess is too high.
Guess #2: 32
Your guess is too low.
Guess #3: 54
Your guess is too high.
Guess #4: 51
Your guess is too high.
Guess #5: 36
Your guess is too low.
Guess #6: 45
Your guess is too low.
Guess #7: 48
Your guess is too low.
Sorry, you lose.  The correct number was 49.
Would you like to play again (y/n)? q
Would you like to play again (y/n)? f
Would you like to play again (y/n)? n
Thank you for playing.

Hints:
* Seed your random number generator with time(0).
* Visual Studio users: Due to a flaw in the Visual Studio implementation of rand(), call rand() once after seeding to discard the first result.
* Use the getRandomNumber() function from lesson 5.9 -- Random number generation to pick a random number.
* Write a function that allows the user to play a single game of hi-lo.
* Write a function that asks the user if they want to play again and handles the looping logic for an incorrect input.

Show Solution

7.x -- Chapter 7 summary and quiz
Index
7.10 -- Break and continue

69 comments to 5.9 — Random number generation

  • DarkmoonBlade

    java has a much easier to use random generator that works much better Math.random ()*90-1 random values from 1-90 i thought C++ was superior to java doesn't look like it theres no bull shit of srand() and system clock and you can easily get random double values with math.random
    .

  • dice3000

    Much better random variable initaliztaion :

    #include Windows header

    LARGE_INTEGER cicles;
    QueryPerformanceCounter(&cicles);
    srand (static_cast(cicles.QuadPart));

  • dice3000

    srand(time(0)) - this kind of inital seed values is absolutely bad, because it is a timestamp (amount of seconds since 1970.01.01). So if you are calling your main function multiple times in a row, your random numbers will follow the same pattern, since the seed values only changes by a couple of seconds, and that is very small change to a number 1375984340 (2013.03.08 20:52). What wee need for a seed value is computer microtime (microseconds) of current time, because its changes rapidly

  • thehuntedpie

    I love the use of modulus to choose a number range. However, wouldn't the statement "return endSeed % 32767; " actually return a value between 0 and 32766 *not* 32767 as the comment suggests?

    Since a modulus operation returns 0 if there is no remainder.

  • mori

    If you are on a unix/linux system you should use /dev/random or /dev/urandom, those do have more "physical" randomness, mouse movements, etc.

Leave a Reply

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">