6.2 Crypto.Util.randpool

For cryptographic purposes, ordinary random number generators are frequently insufficient, because if some of their output is known, it is frequently possible to derive the generator's future (or past) output. Given the generator's state at some point in time, someone could try to derive any keys generated using it. The solution is to use strong encryption or hashing algorithms to generate successive data; this makes breaking the generator as difficult as breaking the algorithms used.

Understanding the concept of entropy is important for using the random number generator properly. In the sense we'll be using it, entropy measures the amount of randomness; the usual unit is in bits. So, a single random bit has an entropy of 1 bit; a random byte has an entropy of 8 bits. Now consider a one-byte field in a database containing a person's sex, represented as a single character "M" or "F". What's the entropy of this field? Since there are only two possible values, it's not 8 bits, but one; if you were trying to guess the value, you wouldn't have to bother trying "Q" or "@".

Now imagine running that single byte field through a hash function that produces 128 bits of output. Is the entropy of the resulting hash value 128 bits? No, it's still just 1 bit. The entropy is a measure of how many possible states of the data exist. For English text, the entropy of a five-character string is not 40 bits; it's somewhat less, because not all combinations would be seen. "Guido"is a possible string, as is "In th"; "zJwvb" is not.

The relevance to random number generation? We want enough bits of entropy to avoid making an attack on our generator possible. An example: One computer system had a mechanism which generated nonsense passwords for its users. This is a good idea, since it would prevent people from choosing their own name or some other easily guessed string. Unfortunately, the random number generator used only had 65536 states, which meant only 65536 different passwords would ever be generated, and it was easy to compute all the possible passwords and try them. The entropy of the random passwords was far too low. By the same token, if you generate an RSA key with only 32 bits of entropy available, there are only about 4.2 billion keys you could have generated, and an adversary could compute them all to find your private key. See RFC 1750, "Randomness Recommendations for Security", for an interesting discussion of the issues related to random number generation.

The randpool module implements a strong random number generator in the RandomPool class. The internal state consists of a string of random data, which is returned as callers request it. The class keeps track of the number of bits of entropy left, and provides a function to add new random data; this data can be obtained in various ways, such as by using the variance in a user's keystroke timings.

class RandomPool([numbytes, cipher, hash])
An object of the RandomPool class can be created without parameters if desired. numbytes sets the number of bytes of random data in the pool, and defaults to 160 (1280 bits). hash can be a string containing the module name of the hash function to use in stirring the random data, or a module object supporting the hashing interface. The default action is to use SHA.

The cipher argument is vestigial; it was removed from version 1.1 so RandomPool would work even in the limited exportable subset of the code. I recommend passing hash using a keyword argument so that someday I can safely delete the cipher argument

RandomPool objects define the following variables and methods:

add_event(time[, string])
Adds an event to the random pool. time should be set to the current system time, measured at the highest resolution available. string can be a string of data that will be XORed into the pool, and can be used to increase the entropy of the pool. For example, if you're encrypting a document, you might use the hash value of the document; an adversary presumably won't have the plaintext of the document, and thus won't be able to use this information to break the generator.

The return value is the value of self.entropy after the data has been added. The function works in the following manner: the time between successive calls to the add_event() method is determined, and the entropy of the data is guessed; the larger the time between calls, the better. The system time is then read and added to the pool, along with the string parameter, if present. The hope is that the low-order bits of the time are effectively random. In an application, it is recommended that add_event() be called as frequently as possible, with whatever random data can be found.

bits
A constant integer value containing the number of bits of data in the pool, equal to the bytes attribute multiplied by 8.

bytes
A constant integer value containing the number of bytes of data in the pool.

entropy
An integer value containing the number of bits of entropy currently in the pool. The value is incremented by the add_event() method, and decreased by the get_bytes() method.

get_bytes(num)
Returns a string containing num bytes of random data, and decrements the amount of entropy available. It is not an error to reduce the entropy to zero, or to call this function when the entropy is zero. This simply means that, in theory, enough random information has been extracted to derive the state of the generator. It is the caller's responsibility to monitor the amount of entropy remaining and decide whether it is sufficent for secure operation.

stir()
Scrambles the random pool using the previously chosen encryption and hash function. An adversary may attempt to learn or alter the state of the pool in order to affect its future output; this function destroys the existing state of the pool in a non-reversible way. It is recommended that stir() be called before and after using the RandomPool object. Even better, several calls to stir() can be interleaved with calls to add_event().

The PersistentRandomPool class is a subclass of RandomPool that adds the capability to save and load the pool from a disk file.

class PersistentRandomPool(filename, [numbytes, cipher, hash])
The path given in filename will be automatically opened, and an existing random pool read; if no such file exists, the pool will be initialized as usual. If omitted, the filename defaults to the empty string, which will prevent it from being saved to a file. These arguments are identical to those for the RandomPool constructor.

save()
Opens the file named by the filename attribute, and saves the random data into the file using the pickle module.

The KeyboardRandomPool class is a subclass of PersistentRandomPool that provides a method to obtain random data from the keyboard:

randomize()
(Unix systems only) Obtain random data from the keyboard. This works by prompting the user to hit keys at random, and then using the keystroke timings (and also the actual keys pressed) to add entropy to the pool. This works similarly to PGP's random pool mechanism.