Investigation of Strength and Security of Pseudo Random Number Generators

Security is a key factor in today’s fast-communicating world. Many cryptographic algorithms have been tested and put into efficient use. Random numbers are used in diverse forms such as nonces, secret keys and initialization vectors, and they find a place in encryption, digital signature and hashing algorithms. A deterministic algorithm takes an initial seed value as input and produces pseudo random numbers whose randomness is only apparent. This research work extensively surveys a large set of state-of-the-art PRNGs and categorizes them based on the methodology used to produce them. We compare their statistical results obtained from statistical test tools such as NIST SP 800-22 and TestU01. Security analyses of the PRNGs are then carried out quantitatively for their key space, key sensitivity, entropy, speed of bit generation and linear complexity. Finally, we conclude with some future directions for researchers working on improving PRNGs.


1. Introduction
Sequences of numbers with unpredictable output have great significance in various fields of technology such as Monte Carlo simulations [1, 2], cryptography [3, 4], gaming theory [5], statistics [6], etc. To be effective candidates, random numbers must be hard to predict and all the numbers generated must have equal probability. Today's fast-moving world demands on-the-fly security of the information being shared. This paper extensively analyses the random number generators used in various cryptographic algorithms.
Cryptography uses random numbers in various forms such as secret keys, nonces, initialization vectors, etc. Random numbers with completely unpredictable behaviour and uniform distribution are required for cryptosystems to withstand attacks. Random numbers can be generated in either of two ways. The first draws on a true source of randomness such as thermal noise, atmospheric pressure, SRAM start-up patterns, etc.; such generators are called True Random Number Generators (TRNGs) [7]. The second feeds a seed value into a deterministic algorithm; such generators are called Pseudo Random Number Generators (PRNGs) [8].
PRNGs can be constructed in multiple ways using various mathematical formulations. PRNGs are cost effective and faster to implement, which makes them more favourable for fast communications than their counterpart TRNGs. Many cryptographic protocols for key generation, key distribution, secure encryption and digital signatures either integrate a specialized random number generator algorithm or utilize the readily available random number generators embedded in digital systems.
IOP Publishing doi:10.1088/1757-899X/1055/1/012076
Since the security of these cryptographic protocols depends heavily on random numbers, random number generators must produce high-entropy numbers with unpredictable behaviour. A weakly constructed random number generator can allow an attacker to reconstruct the generated keys and break the cryptosystem [9]. Various security analyses [10,11] and statistical randomness tests [12,13] have been developed to extensively test generators before they are deployed.
A PRNG needs a seed value to initiate its sequence generation. Though this idea may seem simple, such generators are not secure on their own, because the output is completely predictable to anyone who knows the seed and the algorithm. To overcome this issue, researchers have proposed techniques such as processing the seed value [14], modifying the deterministic algorithm [15] and combining different algorithms [16] to make the system more complex. In this paper, various methods of producing pseudo random numbers are extensively studied, and detailed analyses of their statistical properties and security are presented.
The remainder of this paper is organized as follows: the history of generating pseudo random numbers is described in Section 2; a classification of PRNGs is presented and various published PRNG techniques are studied in Section 3; the statistical properties of all techniques are analysed in Section 4; the techniques are compared in terms of performance and security in Section 5; and the conclusion of this survey, with a brief discussion of future directions in improving PRNGs, is given in Section 6.

2. History of PRNG
A PRNG is a deterministic algorithm that generates a random number sequence from an initial seed value. The sequence generated is not truly random in nature, because each output depends on the previous state of the algorithm; hence the numbers are named pseudo random numbers. However, these sequences have been found useful in many fields of science owing to their simple nature. Traditionally, pseudo random numbers were generated with linear mathematical equations [17,18]. Some researchers proposed building secure PRNGs from block cipher algorithms [19]. These PRNGs are known to produce good quality random numbers suitable for secure communications. PRNGs must be analysed mathematically to ensure their randomness properties before being put into use. Later, PRNGs were constructed with seeds taken from random sources [20] such as sensors. These random numbers had improved randomness due to their random seeds. To make PRNGs more efficient and perform better in software, many chaotic system based PRNGs [21] were proposed.

3. Classification of PRNGs
The authors have made a detailed survey of various PRNGs and their testing methodologies. From this study, five major categories of random number generators are identified and listed below:
i. Chaotic map based PRNG
ii. Polynomial equation based PRNG
iii. Hardware based PRNG
iv. Nature inspired PRNG
v. Cryptographic algorithm based PRNG
This article surveys a wide variety of PRNGs that are implemented with various mathematical models. Each category of PRNG is explained in detail with its own advantages and disadvantages. The generators are analysed on parameters such as randomness, key space, entropy, security, performance, etc. These results are helpful for researchers who wish to focus on improving the weaknesses of the surveyed PRNGs. The survey also provides directions for future research.

3.1 Chaotic map based PRNG
Chaotic maps are well known for their chaotic behaviour [22]. Chaotic maps are controlled by parameters and produce their output iteratively. Depending on the number of dimensions in which they produce outputs, chaotic maps are categorized as one-dimensional, two-dimensional, three-dimensional and so on.
Researchers have used a wide variety of chaotic maps such as the Logistic map [22], Henon map [23], Skew Tent map [24], Zig-Zag map [25], Lorenz map, Sawtooth map [26], Bernoulli map, etc. These maps are naturally chaotic; however, when implemented on digital systems, they tend to lose their chaotic behaviour owing to the digitization of output values.
To overcome this chaotic degradation due to digitization, various techniques have been proposed to improve chaotic behaviour: modifying the basic structure or parameters of a chaotic map, mixing two or more chaotic maps, or deploying higher dimensional chaotic maps. These techniques are detailed below.

Modified chaotic maps.
Using only a chaotic map as the deterministic algorithm to generate pseudo random numbers has many flaws in terms of randomness, performance and key space. These flaws stem mainly from the digital implementation of chaotic maps, where the chaotic state is represented as a double-precision floating point number. When the state is stored as a 64-bit floating point value, any bits beyond that precision are truncated to fit the memory location in a computer. This leads to loss of chaotic behaviour, and the generated sequences tend to have shorter periods.
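The effect of finite precision on the period can be illustrated with a small experiment. The following is a minimal sketch, not taken from any of the surveyed works: it assumes a Logistic map with r = 3.99 evaluated on a fixed-point grid of 2^bits values (the function names and the grid model are illustrative choices), and it measures how long the quantized orbit runs before it falls into a cycle.

```python
def logistic_step(state: int, bits: int, r: float = 3.99) -> int:
    """One Logistic-map step x -> r*x*(1-x) on a grid of 2**bits points in [0, 1)."""
    scale = 1 << bits
    x = state / scale
    y = r * x * (1.0 - x)      # at most r/4, which is below 1 for r < 4
    return int(y * scale)      # truncation models the finite word length


def orbit_lengths(seed: int, bits: int) -> tuple:
    """Return (transient, cycle) lengths of the quantized orbit starting at seed."""
    first_seen = {}
    state, step = seed, 0
    while state not in first_seen:   # must terminate: only 2**bits states exist
        first_seen[state] = step
        state = logistic_step(state, bits)
        step += 1
    return first_seen[state], step - first_seen[state]


if __name__ == "__main__":
    for bits in (8, 12, 16, 20):
        seed = (1 << bits) // 3
        transient, cycle = orbit_lengths(seed, bits)
        print(f"{bits:2d} bits: transient={transient}, cycle={cycle}, grid={1 << bits}")
```

Because the state space is finite, every orbit must eventually repeat, and the observed cycle can never exceed the grid size; in practice it is usually far shorter, which is the degradation the modifications below try to counter.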
One way to improve the periodicity of random sequences is to modify the chaotic map by adding another input such as an arcsine function [27], a Physical Unclonable Function (PUF) [28] or a constant perturbation [29][30][31][32][33], by combining multiple dimensions into a single output [34][35][36], by modifying the mathematical structure with additional operations [15][37][38][39][40][41][42], or by introducing a time-varying delay function [43,44] or fractal order [45,46]. The alternative way to improve the periodicity of chaotic maps is to modify their parameters. Ordering of parameters improved the key length in [47]. Good parameters were chosen based on Kolmogorov entropy measures to maintain good chaos in the Sine map, Tent map and Logistic map [48]. Parameters can be varied dynamically by means of the AES S-box as in [49], by multiplying by a probability value as in [50], by choosing from a set of parameter values as in [51,52], or by mixing and confusing the parameters as in [53]. These modifications were reported to improve the periodicity of the generated sequences and to provide larger key spaces to withstand various attacks.

Combined chaotic maps.
A single chaotic map may be insufficient to produce good statistical results. Hence researchers have also developed the idea of combining more than one chaotic map to improve the chaos. The maps can be combined in many forms so that the output of one map perturbs the other, giving healthier chaos with minimal digital degradation. Cascading two or more chaotic maps minimized the degradation and was optimized for faster generation speed in [54][55][56]. Other techniques include combining two or more different maps as in [57][58][59][60][61][62][63], combining a chaotic system with a true random source such as a ring oscillator [64], coupling of maps [65][66][67][68][69], obtaining additional perturbation from a hash function [70], implementing separate maps in parallel and combining their outputs as in [71,72], adding nonlinear operations to the map as in [73], and multiplexing and shifting the outputs of multiple maps as in [74,75]. Combining chaotic maps with these techniques has improved their usability in cryptography for generating random keys and encrypting images with improved efficiency.
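The idea of one map perturbing another can be sketched as follows. This is only an illustrative construction under stated assumptions, not the scheme of any cited work: a Tent map (parameter mu = 1.9999) is assumed to perturb a Logistic map (r = 3.99) additively modulo 1, and one bit is extracted per iteration by thresholding at 0.5. All names and parameter values are hypothetical.

```python
def logistic(x: float, r: float = 3.99) -> float:
    """Logistic map on [0, 1]."""
    return r * x * (1.0 - x)


def tent(y: float, mu: float = 1.9999) -> float:
    """Tent map on [0, 1]; mu slightly below 2 avoids the binary collapse at mu = 2."""
    return mu * min(y, 1.0 - y)


def coupled_bits(x0: float, y0: float, n: int) -> list:
    """Generate n bits; the Tent map output perturbs the Logistic state each step."""
    x, y = x0, y0
    bits = []
    for _ in range(n):
        y = tent(y)
        x = (logistic(x) + y) % 1.0    # additive perturbation, kept in [0, 1)
        bits.append(1 if x >= 0.5 else 0)
    return bits


if __name__ == "__main__":
    print(coupled_bits(0.123456, 0.654321, 32))
```

The pair (x0, y0) plays the role of the seed; a sketch like this would still need the statistical and security analyses of Sections 4 and 5 before any cryptographic use.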

Higher dimensional chaotic maps.
With the advancement in computer systems in recent years, researchers have tried to put hyperchaotic systems to use for generating random numbers. A chaotic system with more than one positive Lyapunov exponent is termed hyperchaotic. Such hyperchaotic systems exhibit richer chaotic behaviour because their trajectories diverge in more than one direction. Various studies have been done to utilize these hyperchaotic systems in cryptography. A five-dimensional Hamiltonian chaotic system was proposed in [76], and the Logistic map was modified into four dimensions in [77] and into three dimensions in [78]. Some hyperchaotic systems were perturbed to overcome digital degradation as in [79,80]. Use of a memristive hyperchaotic system was proposed in [81]. Hyperchaotic systems also provided good random sequences that are more suitable for applications such as image and video encryption, where a higher generation rate is required.

3.2 Polynomial equation based PRNG
PRNGs were traditionally built upon mathematical polynomial equations. Some well-known generators of this kind are the Linear Congruential Generator (LCG), the Mersenne Twister and the Linear Feedback Shift Register (LFSR). These generators can be implemented easily and efficiently in both hardware and software. However, they have severe flaws such as short periods, non-uniform distribution of random bits, high correlation between outputs, etc. Because their simple nature makes them attractive candidates for cryptography, many researchers have proposed ideas for improving them.
Two LCGs were coupled with variable inputs and implemented on a Field Programmable Gate Array (FPGA) to improve their periodicity [82]. The Residue Number System (RNS) was applied to polynomials over a finite field in [83] to increase the periodicity. A chaotic system was used to generate initial vectors for an LFSR in [84,85], which increased the entropy of the random sequence. An LFSR was combined with a bit-reorganization layer and a non-linear function to efficiently encrypt videos with less data loss in [37]. The Mersenne Twister was fed with low-entropy seeds taken from the Static Random Access Memory (SRAM) of Radio Frequency Identification (RFID) tags to generate dynamic secret keys [86].
A de Bruijn block was combined with a Nonlinear Feedback Shift Register (NLFSR) to obtain a lightweight generator suitable for Wireless Sensor Networks (WSN) and RFID tags [87]. Composited de Bruijn sequences were used to generate random numbers in [88]. LFSRs were modified by a genetic algorithm to increase the length of output sequences in [89]. Sensor seeds were also proposed as a source of entropy for LCGs in [20,90]; they are more suitable for WSNs, where sensor data are readily available.
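To make the two basic constructions named in this section concrete, the sketch below implements a plain LCG and a 16-bit Fibonacci LFSR. The constants are assumptions chosen for illustration and do not come from the surveyed papers: the LCG uses the widely quoted multiplier 1664525 and increment 1013904223 with modulus 2^32, and the LFSR uses taps 16, 14, 13 and 11, a common textbook maximal-length choice.

```python
def lcg(seed: int, n: int, a: int = 1664525, c: int = 1013904223, m: int = 2**32) -> list:
    """Linear congruential generator: x_{k+1} = (a * x_k + c) mod m."""
    x, out = seed % m, []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out


def lfsr16_step(state: int) -> int:
    """One step of a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr16_period(seed: int = 0xACE1) -> int:
    """Count the steps until the register returns to its (nonzero) seed."""
    state, steps = lfsr16_step(seed), 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


if __name__ == "__main__":
    print(lcg(42, 3))
    print(lfsr16_period())
```

Both generators are fully determined by a small internal state, so their periods are bounded by the state size (at most 2^16 - 1 for the 16-bit register), and their linear structure is what the improvements surveyed above try to mask.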

3.3 Hardware based PRNG
Hardware components can be a good source of randomness. A Physical Unclonable Function (PUF) is a physical component that generates a unique fingerprint for every semiconductor device, such as a microprocessor. A ring oscillator, built from an odd number of NOT gates connected in a chain, produces alternating 1s and 0s that may be taken as the source of a random sequence.
A new hardware based chaotic system was proposed in [91], in which a Hamiltonian cycle was removed from the N-cube and one permutation was added to improve the chaotic behaviour of the system. PUF generated random sequences were shown to be resistant to machine learning attacks [92], as their unpredictable behaviour is not easily learnt by machines. A ring oscillator was coupled with hash functions to produce random sequences with good statistics in [93].

3.4 Nature inspired PRNG
Nature inspired computing takes its inspiration for solving a particular problem from a natural phenomenon. Such nature inspired algorithms try to mimic the natural phenomenon to provide a novel problem-solving technique. Some of the most commonly used concepts are Cellular Automata (CA), cellular neural networks, genetic algorithms, DNA computing, quantum computing, etc. In recent years, researchers have applied nature inspired algorithms to generate random sequences with better statistical properties. Cellular neural networks have been proposed to generate random sequences in six dimensions [94] with higher performance. A hybrid CA was proposed in [95], where non-linear feedback is coupled with 63 cells. This technique had higher throughput and was resistant to algebraic attacks. A self-programmable CA was utilized in [96] to generate bits at a faster rate with less energy. CA was combined with Langton's ants, where the ants produced the disturbance in the CA. The effect of varying the number of ants in the CA has been extensively examined in [97]. Another nature inspired concept, the Genetic Algorithm (GA), was applied to an LFSR to increase the length of the random output sequence in [89]. This technique was shown to generate more secure keys for encryption. Another GA was proposed in [98] that uses an entropy calculation as the fitness function to generate high-entropy random sequences. A Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) was used to produce an irrational number sequence [99], which was then fed to a hash function to generate the random sequence.
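As a minimal illustration of how a cellular automaton can act as a bit generator, the sketch below uses the elementary Rule 30 automaton and reads the centre cell at every generation. Rule 30 is chosen here only because it is a familiar textbook example; it is an assumption of this sketch and not the hybrid or self-programmable CA of [95] or [96]. The periodic boundary and the function names are likewise illustrative.

```python
def rule30_step(cells: list) -> list:
    """Apply Rule 30 once: new = left XOR (centre OR right), periodic boundary."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]


def rule30_bits(width: int, n: int) -> list:
    """Start from a single 1 in the centre and emit the centre cell n times."""
    cells = [0] * width
    cells[width // 2] = 1
    bits = []
    for _ in range(n):
        bits.append(cells[width // 2])
        cells = rule30_step(cells)
    return bits


if __name__ == "__main__":
    print(rule30_bits(width=101, n=32))
```

In a seeded generator the initial cell pattern would be derived from the seed rather than fixed; the fixed single-cell start is used here only to keep the example reproducible.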
Quantum walks, like random walks, have state transitions whose randomness comes from quantum superposition of states, reversible evolution or the collapse of wave functions. Quantum walks are also used to generate random sequences as in [100]. Some researchers proposed modifying quantum walks using a quantum hash function as in [101]. Quantum chaotic maps were proposed in [46] to produce superior quality random sequences with good statistical results. They were also shown to have longer periods. However, the use of quantum chaotic maps must be examined carefully, as 99% of their keys were shown to be weak in [102].

3.5 Cryptographic algorithm based PRNG
PRNGs can also be constructed from cryptographic algorithms such as hash functions, stream ciphers and block ciphers. These algorithms have multiple iterations and need input in the form of a message and a secret key. The result is a hash value in the case of a hash function or encrypted data in the case of a cipher. The hash value or encrypted data is considered to be a good random sequence.
The S-box of the Advanced Encryption Standard (AES) algorithm was used along with a piecewise Logistic map in [49] to improve the density of the probability distribution while maintaining a balance between efficiency and security. The Grøstl hash function was proposed for use with the Logistic map and Skew Tent map to improve the statistical randomness of the output sequence in [70]. Secure Hash Algorithm (SHA)-256 was used in [103] to generate a random bit sequence by obtaining seed data from SRAM. These sequences were tested to be cryptographically secure. The AES algorithm with additional input from a Sprott 94 G True Random Number Generator (TRNG) produced superior results in [104]. In a similar fashion, input from a TRNG such as a ring oscillator was fed to the Keccak hash function to produce a cryptographically strong random number sequence in [93].
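The general principle of deriving a bit stream from a hash function can be sketched as below. This is a simplified counter-mode construction assumed for illustration, in which each output block is SHA-256(seed || counter); it is neither the design of [103] nor a standardized generator, and the function name is hypothetical.

```python
import hashlib


def sha256_stream(seed: bytes, n_bytes: int) -> bytes:
    """Return n_bytes of output built from SHA-256(seed || 64-bit counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:n_bytes])


if __name__ == "__main__":
    print(sha256_stream(b"example seed", 16).hex())
```

The output is only as unpredictable as the seed, which is why the works above draw the seed from SRAM start-up data or a TRNG.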

4. Statistical analyses
Random numbers generated by all these algorithms need to be statistically tested to verify their randomness. An ideal PRNG must have a uniform distribution of random numbers and independence between the numbers generated. Many statistical tools are readily available to measure the randomness of the produced sequences. The most commonly used tools are the NIST SP 800-22 statistical test suite [13] and TestU01 [105]. These software packages are free to use and evaluate the randomness of a PRNG. The NIST SP 800-22 suite is a battery of 15 tests that analyse a PRNG from different angles. Each test produces a P-value, based on which it is determined whether the sequence has passed the test. TestU01 also performs an empirical analysis of RNGs. This section compares the statistical analysis results of the various PRNGs discussed above. A PRNG that passes only a few tests of a battery is not necessarily unfit for use; its properties can be improved in future by closely examining its weaknesses.
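As an example of how a P-value is obtained, the sketch below follows the frequency (monobit) test of NIST SP 800-22: the bits are mapped to +1 and -1 and summed to give S_n, the statistic is s_obs = |S_n| / sqrt(n), and the P-value is erfc(s_obs / sqrt(2)). The significance level of 0.01 is the suite's customary default; the function names are illustrative.

```python
import math


def monobit_p_value(bits: list) -> float:
    """P-value of the NIST SP 800-22 frequency (monobit) test."""
    n = len(bits)
    s_n = sum(1 if b else -1 for b in bits)   # ones count +1, zeros count -1
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def passes_monobit(bits: list, alpha: float = 0.01) -> bool:
    """The sequence passes when the P-value is at least the significance level."""
    return monobit_p_value(bits) >= alpha


if __name__ == "__main__":
    example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    print(round(monobit_p_value(example), 6))
```

A single test of this kind checks only the balance of ones and zeros; the remaining tests in the battery target other properties such as runs, rank and linear complexity.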

5. Security analysis
5.1 Key sensitivity
PRNGs must be very sensitive to any change in their input seed values. With the internal structure held constant, a minute alteration in the key or seed value should have a large impact on the output. This behaviour is preferred in cryptographic applications, since even small changes in the input strongly affect the resulting sequence, which makes such changes detectable. A PRNG must therefore be analysed for its sensitivity to changes in the key value. To check sensitivity, researchers modify the key value very slightly and compute the correlation between the original sequence and the modified sequence. If the correlation between them is close to 0, the sequences are effectively unrelated and the generator is highly sensitive to its initial key. Table 3 lists the PRNGs that have undergone the sensitivity test.
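The correlation check described above can be sketched as follows. The example assumes a Logistic map (r = 3.99) as a stand-in generator, a seed perturbation of 10^-12, and a short burn-in so that the perturbation has time to propagate; these values and the function names are illustrative assumptions rather than the settings of any surveyed paper.

```python
import math


def logistic_sequence(x0: float, n: int, r: float = 3.99, burn_in: int = 200) -> list:
    """Iterate the Logistic map, discard burn_in values, then return n values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    seq = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq


def pearson(a: list, b: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    original = logistic_sequence(0.3, 5000)
    modified = logistic_sequence(0.3 + 1e-12, 5000)
    print(f"correlation = {pearson(original, modified):+.4f}")
```

A coefficient near 0 indicates that the two outputs are unrelated despite the nearly identical seeds, whereas a coefficient near 1 would indicate poor key sensitivity.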

Key space
The key space of a PRNG refers to the total number of distinct keys available to it. For a PRNG to withstand many cryptanalytic attacks, it is recommended to have a key space greater than 2^128. The larger the key space, the higher the chance of being secure against differential attacks. The key space of a PRNG depends on its initial parameters and its number of internal states. Fig. 1 shows the key space of various PRNGs. It is evident that, with the advancement in digital platforms, it is easier to achieve a larger key space.
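A rough key space estimate can be computed as below. The sketch assumes, as is conventional in the chaos-based literature but not stated in this survey, that each real-valued seed or control parameter contributes about 10^15 distinguishable values in double precision, so that k parameters give a key space of (10^15)^k. The function name and default are illustrative.

```python
import math


def key_space_bits(num_params: int, decimal_digits: int = 15) -> float:
    """log2 of the key space (10**decimal_digits) ** num_params."""
    return num_params * decimal_digits * math.log2(10)


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        bits = key_space_bits(k)
        verdict = "meets" if bits > 128 else "below"
        print(f"{k} parameter(s): about 2^{bits:.1f} ({verdict} the 2^128 guideline)")
```

Under this assumption, two parameters give roughly 2^99.7 keys and three give roughly 2^149.5, so at least three independent double-precision parameters are needed to exceed 2^128.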

5.4 Entropy
The unpredictability of a PRNG can be measured in terms of its entropy. Entropy refers to the amount of information present in a given sequence. A PRNG generating n-bit random numbers can produce 2^n possible random numbers in total. An ideal PRNG must have maximum entropy in each random sequence generated to ensure its unpredictable behaviour. It can be concluded that a PRNG generating n-bit random numbers must have entropy close to n to be more unpredictable. Figure 2.
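The entropy measure can be made concrete with the Shannon formula H = -sum(p_i * log2(p_i)), where p_i is the relative frequency of symbol i. The sketch below applies it to byte-valued output (n = 8), for which the ideal value is 8 bits per symbol; the function name is an illustrative choice.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte sequence, in bits per byte (maximum 8)."""
    if not data:
        raise ValueError("data must not be empty")
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values()) + 0.0


if __name__ == "__main__":
    print(shannon_entropy(bytes(range(256))))   # every byte value exactly once
    print(shannon_entropy(b"\x00" * 64))        # a constant sequence
```

Note that this estimate reflects only the symbol frequencies of the observed output: a sequence can reach the maximum value and still be predictable, so entropy should be read together with the other analyses in this section.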

5.5 Speed
The rate at which a PRNG generates random bits determines its speed. A PRNG that generates a large number of bits in a short time is useful in high-speed network communications. These applications require random bits at rates on the order of gigabits per second.
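Throughput can be estimated by timing the generation of a fixed number of bytes. The following is a rough sketch only: it times Python's built-in `random.Random`, which is a Mersenne Twister, as a stand-in generator, and the helper names are hypothetical. Figures obtained this way depend heavily on the language, the hardware and the measurement method, so they are not comparable with the hardware rates reported in the literature.

```python
import random
import time


def mt_bytes(n_bytes: int, seed: int = 1) -> bytes:
    """Produce n_bytes from Python's Mersenne Twister (stand-in generator)."""
    rng = random.Random(seed)
    return rng.getrandbits(8 * n_bytes).to_bytes(n_bytes, "little")


def measure_bit_rate(generate, n_bytes: int = 1_000_000) -> float:
    """Return the observed generation rate in bits per second."""
    start = time.perf_counter()
    data = generate(n_bytes)
    elapsed = max(time.perf_counter() - start, 1e-9)   # guard against a zero reading
    return 8 * len(data) / elapsed


if __name__ == "__main__":
    rate = measure_bit_rate(mt_bytes)
    print(f"about {rate / 1e6:.1f} Mbit/s")
```

Any callable that takes a byte count and returns that many bytes can be passed to `measure_bit_rate`, which allows different generators to be compared on the same machine.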

5.6 Resistance to security attacks
PRNGs are subjected to various attacks with the goal of obtaining the next random number to be generated. Some of the well-known attacks on PRNGs include the Differential Attack (DA), Brute Force Attack (BFA), Side-Channel Attack (SCA), Chosen Plaintext Attack (CPA), etc. When a PRNG is designed, it is tested for its resistance to these attacks. Table 5 shows various PRNGs and their resistance to various attacks.

Conclusion
This paper detailed the analysis of various pseudo random number generators [106] that are put into use in cryptographic applications [107][108][109][110]. We first introduced the types of random number generators and their potential benefits and drawbacks. We then described how PRNGs evolved mathematically and found a prominent place in many areas of science. Next, we grouped the techniques for producing pseudo random numbers into five major categories based on chaotic systems, polynomial equations, hardware, nature-inspired concepts and cryptographic algorithms. An exhaustive review of a large variety of PRNGs under each category was then given. Every PRNG must be analysed for its statistical properties, key space, entropy, linear complexity, key sensitivity, etc. A section comparing the statistical properties of all discussed PRNGs has been presented. Finally, the security analyses of all discussed PRNGs were summarized under each security parameter. This survey can be effectively utilised to make a brief comparison of various categories of PRNGs and to identify the flaws that need improvement in future.