Browser APIs for Cryptographically Secure Password Generation

Jonathan Davis • Chief Systems Architect • FindDevTools Security Lab

The first attempt most developers make at writing a random string generator relies entirely on `Math.random()`. While mathematically intriguing for simple games or UI animations, using pseudo-random number generators to create passwords, tokens, or encryption salts introduces fatal vulnerabilities into your web application.

The Math.random() Trap

Pseudo-Random Number Generators (PRNGs) are deterministic. They use mathematical formulas and a starting "seed" to produce a sequence of numbers that only appear random. If an attacker can determine the initial seed—often based on the global timestamp—or observe enough of the output sequence, they can predict every subsequent number generated. For a password, this completely shatters any assumed entropy.

The Web Crypto API Solution

The only secure method for generating random material in the browser is through the Web Cryptography API, specifically the `window.crypto.getRandomValues()` method. This API does not use a predictable algorithm; instead, it requests true entropy directly from the underlying operating system (such as `/dev/urandom` on Unix or `CryptGenRandom` on Windows).

These operating system sources harvest physical entropy from the machine's environment—mouse movements, keyboard timing, fan speeds, and CPU thermal fluctuations. This creates Cryptographically Secure Pseudo-Random Number Generators (CSPRNG), making predictions impossible.

Implementing the generator

When the FindDevTools Secure Password Generator creates a string, it initializes an empty `Uint8Array`. It populates this array using `crypto.getRandomValues()`. It then iterates through the random byte array, using modulo arithmetic against a predefined array of target characters (numbers, symbols, casing) to select the exact string index. This ensures perfect statistical distribution across the character set.

This is a 1000+ word deep dive...

Technical Deep Dive & Specification Reference

process that were well-enough correlated to those used originally to make exhaustive search practical. The use of multiple random inputs with a strong mixing function is recommended and can overcome weakness in any particular input. The timing and content of requested "random" user keystrokes can yield hundreds of random bits, but conservative assumptions need to be made. For example, one reasonably conservative assumption would be that an inter-keystroke interval provides at most a few bits of randomness, but only when the interval is unique in the sequence of intervals up to that point. A similar assumption would be that a key code provides a few bits of randomness, but only when the code is unique in the sequence.

Thus, an interval or key code that duplicated a previous value would be assumed to provide no additional randomness. The results of mixing these timings with typed characters could be further combined with clock values and other inputs. This strategy may make practical portable code for producing good random numbers for security, even if some of the inputs are very weak on some of the target systems. However, it may still fail against a high-grade attack on small, single-user, or embedded systems, especially if the adversary has ever been able to observe the generation process in the past. A hardware-based random source is still preferable.

4. De-skewing Is there any specific requirement on the shape of the distribution of quantities gathered for the entropy to produce the random numbers? The good news is that the distribution need not be uniform. All that is needed to bound performance is a conservative estimate of how Eastlake, et al. Standards Track [Page 12] RFC 4086 Randomness Requirements for Security June 2005 non-uniform it is. Simple techniques to de-skew a bit stream are given below, and stronger cryptographic techniques are described in Section 5.2.

4.1. Using Stream Parity to De-Skew As a simple but not particularly practical example, consider taking a sufficiently long string of bits and mapping the string to "zero" or "one". The mapping will not yield a perfectly uniform distribution, but it can be as close as desired. One mapping that serves the purpose is to take the parity of the string. This has the advantages that it is robust across all degrees of skew up to the estimated maximum skew and that it is trivial to implement in hardware.

The following analysis gives the number of bits that must be sampled: Suppose that the ratio of ones to zeros is ( 0.5 + E ) to ( 0.5 - E ), where E is between 0 and 0.5 and is a measure of the "eccentricity" of the distribution. Consider the distribution of the parity function of N bit samples. The respective probabilities that the parity will be one or zero will be the sum of the odd or even terms in the binomial expansion of (p + q)^N, where p = 0.5 + E, the probability of a one, and q = 0.5 - E, the probability of a zero. These sums can be computed easily as N N 1/2 * ( ( p + q ) + ( p - q ) ) and N N 1/2 * ( ( p + q ) - ( p - q ) ). (Which formula corresponds to the probability that the parity will be 1 depends on whether N is odd or even.) Since p + q = 1 and p - q = 2E, these expressions reduce to N 1/2 * [1 + (2E) ] and N 1/2 * [1 - (2E) ].

Neither of these will ever be exactly 0.5 unless E is zero, but we can bring them arbitrarily close to 0.5. If we want the probabilities to be within some delta d of 0.5, e.g., then Eastlake, et al. Standards Track [Page 13] RFC 4086 Randomness Requirements for Security June 2005 N ( 0.5 + ( 0.5 * (2E) ) ) < 0.5 + d. Solving for N yields N > log(2d)/log(2E). (Note that 2E is less than 1, so its log is negative.

Division by a negative number reverses the sense of an inequality.) The following table gives the length N of the string that must be sampled for various degrees of skew in order to come within 0.001 of a 50/50 distribution. +---------+--------+-------+ | Prob(1) | E | N | +---------+--------+-------+ | 0.5 | 0.00 | 1 | | 0.6 | 0.10 | 4 | | 0.7 | 0.20 | 7 | | 0.8 | 0.30 | 13 | | 0.9 | 0.40 | 28 | | 0.95 | 0.45 | 59 | | 0.99 | 0.49 | 308 | +---------+--------+-------+ The last entry shows that even if the distribution is skewed 99% in favor of ones, the parity of a string of 308 samples will be within 0.001 of a 50/50 distribution. But, as we shall see in section 5.2, there are much stronger techniques that extract more of the available entropy. 4.2. Using Transition Mappings to De-Skew Another technique, originally due to von Neumann [VON_NEUMANN], is to examine a bit stream as a sequence of non-overlapping pairs.

One could then discard any 00 or 11 pairs found, interpret 01 as a 0 and 10 as a 1. Assume that the probability of a 1 is 0.5+E and that the probability of a 0 is 0.5-E, where E is the eccentricity of the source as described in the previous section. Then the probability of each pair is shown in the following table: +------+-----------------------------------------+ | pair | probability | +------+-----------------------------------------+ | 00 | (0.5 - E)^2 = 0.25 - E + E^2 | | 01 | (0.5 - E)*(0.5 + E) = 0.25 - E^2 | | 10 | (0.5 + E)*(0.5 - E) = 0.25 - E^2 | | 11 | (0.5 + E)^2 = 0.25 + E + E^2 | +------+-----------------------------------------+ Eastlake, et al. Standards Track [Page 14] RFC 4086 Randomness Requirements for Security June 2005 This technique will completely eliminate any bias but requires an indeterminate number of input bits for any particular desired number of output bits. The probability of any particular pair being discarded is 0.5 + 2E^2, so the expected number of input bits to produce X output bits is X/(0.25 - E^2).

This technique assumes that the bits are from a stream where each bit has the same probability of being a 0 or 1 as any other bit in the stream and that bits are uncorrelated, i.e., that the bits come from identical independent distributions. If alternate bits are from two correlated sources, for example, the above analysis breaks down. The above technique also provides another illustration of how a simple statistical analysis can mislead if one is not always on the lookout for patterns that could be exploited by an adversary. If the algorithm were misread slightly so that overlapping successive bits pairs were used instead of non-overlapping pairs, the statistical analysis given would be the same. However, instead of providing an unbiased, uncorrelated series of random 1s and 0s, it would produce a totally predictable sequence of exactly alternating 1s and 0s.

4.3. Using FFT to De-Skew When real-world data consists of strongly correlated bits, it may still contain useful amounts of entropy. This entropy can be extracted through various transforms, the most powerful of which are described in section 5.2 below. Using the Fourier transform of the data or its optimized variant, the FFT, is interesting primarily for theoretical reasons. It can be shown that this technique will discard strong correlations.

If adequate data is processed and if remaining correlations decay, spectral lines that approach statistical independence and normally distributed randomness can be produced [BRILLINGER]. 4.4. Using Compression to De-Skew Reversible compression techniques also provide a crude method of de- skewing a skewed bit stream. This follows directly from the definition of reversible compression and the formula in Section 2 for the amount of information in a sequence. Since the compression is reversible, the same amount of information must be present in the shorter output as was present in the longer input.

By the Shannon information equation, this is only possible if, on average, the probabilities of the different shorter sequences are more uniformly distributed than were the probabilities of the longer sequences. Therefore, the shorter sequences must be de-skewed relative to the input. Eastlake, et al. Standards Track [Page 15] RFC 4086 Randomness Requirements for Security June 2005 However, many compression techniques add a somewhat predictable preface to their output stream and may insert a similar sequence periodically in their output or otherwise introduce subtle patterns of their own. They should be considered only rough techniques compared to those described in Section 5.2.

At a minimum, the beginning of the compressed sequence should be skipped and only later bits should used for applications requiring roughly-random bits. 5. Mixing What is the best overall strategy for obtaining unguessable random numbers in the absence of a strong, reliable hardware entropy source? It is to obtain input from a number of uncorrelated sources and to mix them with a strong mixing function. Such a function will preserve the entropy present in any of the sources, even if other quantities being combined happen to be fixed or easily guessable (low entropy). This approach may be advisable even with a good hardware source, as hardware can also fail.

However, this should be weighed against a possible increase in the chance of overall failure due to added software complexity. Once one has used good sources, such as some of those listed in Section 3, and mixed them as described in this section, one has a strong seed. This can then be used to produce large quantities of cryptographically strong material as described in Sections 6 and 7. A strong mixing function is one that combines inputs and produces an output in which each output bit is a different complex non-linear function of all the input bits. On average, changing any input bit will change about half the output bits.

But because the relationship is complex and non-linear, no particular output bit is guaranteed to change when any particular input bit is changed. Consider the problem of converting a stream of bits that is skewed towards 0 or 1 or which has a somewhat predictable pattern to a shorter stream which is more random, as discussed in Section 4. This is simply another case where a strong mixing function is desired, to mix the input bits and produce a smaller number of output bits. The technique given in Section 4.1, using the parity of a number of bits, is simply the result of successively XORing them. This is examined as a trivial mixing function, immediately below.

Use of stronger mixing functions to extract more of the randomness in a stream of skewed bits is examined in Section 5.2. See also [NASLUND]. Eastlake, et al. Standards Track [Page 16] RFC 4086 Randomness Requirements for Security June 2005 5.1. A Trivial Mixing Function For expository purposes we describe a trivial example for single bit inputs using the Exclusive Or (XOR) function.

This function is equivalent to addition without carry, as show in the table below. This is a degenerate case in which the one output bit always changes for a change in either input bit. But, despite its simplicity, it provides a useful illustration. +-----------+-----------+----------+ | input 1 | input 2 | output | +-----------+-----------+----------+ | 0 | 0 | 0 | | 0 | 1 | 1 | | 1 | 0 | 1 | | 1 | 1 | 0 | +-----------+-----------+----------+ If inputs 1 and 2 are uncorrelated and combined in this fashion, then the output will be an even better (less skewed) random bit than the inputs are. If we assume an "eccentricity" E as defined in Section 4.1 above, then the output eccentricity relates to the input eccentricity as follows: E = 2 * E * E output input.