Hacker News
Anatomy of a pseudorandom number generator – visualising Cryptocat's buggy PRNG (sophos.com)
265 points by tptacek on July 9, 2013 | hide | past | favorite | 118 comments


  Cryptocat's hacktivist credibility was cemented in 2012 when
  its Canadian developer, Nadim Kobeissi, was stopped at the
  US border and interviewed about his Cryptocat-related
  programming activities.
s/was/claimed to be/. This software clearly is not Ft. Knox, and it's becoming less and less believable that US intelligence would ever feel the need to interrogate the author of an open source project with such brittle security about the cryptography techniques used therein.

I find it far more believable that the young developer with a penchant for dramatics and a reply for everything has a talent for PR. And a shallow understanding of the grave consequences of faulty crypto software.


I believe Nadim is being truthful in that someone at the border asked him about Cryptocat.

I think they probably saw his tweets "I'M CROSSING THE BORDER NOW OMG I HOPE THEY DON'T GIVE ME TROUBLE FOR BEING A BIGTIME HACKER ACTIVIST", they Googled him, and asked him about his website.


Perhaps the government knew it was flawed and wanted to encourage him by making him feel persecuted?


As the lead developer for Cryptocat, I must say this is really a great example of how to write a post-mortem for a security bug. Sophos bloggers are always worth reading.


Just remember that your software is not you. Have a thick skin, learn, and start again. A name change would probably help too...


I don't understand this. Why would anyone have to change their alias just because they wrote buggy software? Mistakes happen, people move forward, but they don't have to recreate themselves just because they mess up. That seems radically unhealthy.


My condolences to you. Now your product has become a target of mockery.

Also I love cats and feel sorry that the name of those lovely animals is used in a discredited entity.

The problem is that your product did not have just a security vulnerability; it had a number of blatantly unprofessional mistakes, showing the ignorance and carelessness of its authors.

This is the worst thing that can happen to an author of open source software, and especially of security software.

I hope at least the poor cat's devs realize what has happened.


I feel this is something that many a [insert security software in which critical bug was recently found here] has gone through. We've been following full disclosure principles and fixing bugs as they come for the past couple of years. It's really unfortunate that the comments tend to be so dismissive and personal — a quick look at our codebase or blog shows a serious and professional effort. That said, we definitely mess up.


While I admire the effort and overall mission, the problem is that when an application promotes 'secure communications', there are people who actually may use it as such.

Mistakes are understandable; however, I think in-depth code review and auditing in any environment involving cryptography is an absolute must. People's lives could potentially be jeopardized (either legally or physically) if they believed their communications were secure when in fact they were not.


>I feel this is something that many a [insert security software in which critical bug was recently found here] has gone through.

No, not really. Not this kind...

Alright I quit.


First, this is a great article showing the bias in Cryptocat's very awkward PRNG code.

However, the off-by-one bias was actually the least of the problems with Cryptocat's random numbers...

From reading Steve's write-up, the problem was that their keys were ridiculously undersized, because they called their own function incorrectly:

May 7, 2012 (switched from DH to ECC):

  myPrivateKey = Cryptocat.randomString(32, 0, 0, 1);
April 19, 2013:

  rand = Cryptocat.randomString(32, 0, 0, 1);
  myPrivateKey = BigInt.str2bigInt(rand, 10);
June 3, 2013:

  rand = Cryptocat.randomString(64, 0, 0, 1, 0);
  myPrivateKey = BigInt.str2bigInt(rand, 16);

  
> The bug that lasted 347 days was the confusion between a string and an array of integers. This made the ECC private keys ridiculously small because they passed a string of decimal digits into a function expecting an array of 17, 15 bit integers. Each character was considered an element in the array. So each of those "15 bit integers" were only the values 0 to 9 (3.32 bits). Also the least significant 3 bits are zeroed giving you a key space of 2*10^16 (2^54.15). -- http://tobtu.com/decryptocat.php

See for yourself here: https://github.com/cryptocat/cryptocat/commit/a17bb1599463e0...

Even now looking at this code, for me it just doesn't pass the 'WTFs per minute' test... It still looks fucking wrong to me... Cryptocat.randomString(64, 0, 0, 1, 0) returns a 64 character string of 0-9, why are they calling str2bigInt with a hex radix? Did they add the last '0' in the wrong place? Or maybe I'm looking at the wrong check in...
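For a sense of scale, the keyspace arithmetic from the quoted write-up is easy to check in a console (a throwaway sketch, not Cryptocat code):

```javascript
// Quick check of the keyspace numbers quoted above: a 16-character string
// of decimal digits gives 10^16 possibilities, i.e. only ~53 bits,
// versus the hundreds of bits an ECC private key is supposed to have.
const bitsOfEntropy = n => Math.log2(n);

console.log(bitsOfEntropy(1e16).toFixed(2)); // 53.15
console.log(bitsOfEntropy(2e16).toFixed(2)); // 54.15 (the 2*10^16 figure)
```

A keyspace of 2^54 is well within reach of a brute-force search, which is the whole point of the Decryptocat write-up.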

EDIT: Ok, it WAS still wrong as of June 4th... It wasn't until July 4th that it finally became:

  var rand = Cryptocat.randomString(64, 0, 0, 0, 1)
See: https://github.com/cryptocat/cryptocat/commit/8fb7f4b8e59c76...

EDIT EDIT: So just to be clear... v2.1.11 which was supposed to fix this problem has this 'incorrect radix' bug in it? Please tell me I'm wrong, or do we need another CVE and another release?


Maddening isn't it?


I am totally, completely in the dark as to which coding errors (there have been several in the past few months) attach to which CVEs and which versions, and about when these vulnerabilities were reported to the project and when they were subsequently disclosed. I can't be the only one confused, and I do wonder whether that confusion is strategic.

Pointing this out would be gratuitous, except that Cryptocat's project leader has very high standards when it comes to the disclosure behavior of that project's competitors:

https://github.com/SilentCircle/silent-phone-base/issues/5

(Cryptocat is "kaepora" in that thread; the context there is that Mark Dowd had reviewed ZRTPCPP, an open source implementation of the ZRTP protocol that Silent Circle funds and relies on, and found several memory corruption vulnerabilities. Mark had gone public with those vulnerabilities quickly as a result of a miscommunication with the Silent Circle team, but then redacted his disclosure and agreed to embargo the findings while Silent Circle coordinated fixes with all of the Silent Circle competitors that relied on that library; what you see in that thread is, apparently, the lead developer of Cryptocat, which doesn't use ZRTPCPP, repeatedly criticizing Silent Circle's handling of the incident.)

The project should meet the standards that it has loudly set for others, and I don't think it has. I'd be happy to be corrected on this.


Anyone saying, "oh, this is just the little guy trying so hard, take it easy on him" really ought to read the thread you linked. The author of CryptoCat's demeanor is surprisingly aggressive in his comments.


[edited to keep this thread constructive]


My concern is substantive:

In the process of handling an embargoed disclosure, Silent Circle groomed (perhaps aggressively) a Github pull request thread. Handling a coordinated disclosure on a Github pull request is tricky under the best of circumstances, but the Silent Circle team was dealing with an unplanned partial disclosure and was, with Dowd's cooperation, trying to get the genie back into the bottle long enough for (from what I can tell) their competitors to get patched.

This was, as is evident from the thread, a problem for Cryptocat's uninvolved, un-implicated lead developer.

Meanwhile: I cannot immediately tell from the record which of the independently discovered vulnerabilities in Cryptocat correspond to which CVEs or which advisories and which severity descriptions.

The problem isn't one of demeanor; it's that I can't tell what's going on with this project. Does this sound fiddly? Let's make it more concrete: did an exploitable vulnerability get reported to the Cryptocat project, mitigated in a public Github commit, and sit there for months without any acknowledgement from the project?

Cryptocat's project leader is an HN commenter, and if this confusion is my own doing --- totally possible --- I'd be happy for some clarification; not about Silent Circle, but on how someone would get a record of what vulnerabilities were disclosed in Cryptocat, when, and what the timeline was for each from discovery to public disclosure. It's plausible that there's a clear record somewhere and I just missed it.


It's so bad I want to put on my tin foil hat and believe the NSA made them do it.

How do you go back to the same mission-critical single line of code again, and AGAIN and keep getting it wrong?

Any change to that line of code raises extreme red flags. As a developer, if you're making a change to that line after it shipped, you're in complete "wow I fucked up bad" mode. Instead it looks like they're just changing it randomly, month after month, like it's some inconsequential debug flag. Really, this just blows my mind.


I don't understand much of crypto...

But I am game developer, and game developers (specially RPG fans) love random numbers.

In some games of mine, I suspected something was off with the PRNG, so I did something like what they did at the end of the article: I used the random number generator to draw pictures.

Biased generators were quite obvious, because they made obvious patterns (one of the worst offenders was the default C random function in the MinGW that came with Dev-C++: instead of drawing noise, or even weird noise, it drew diagonal lines... yes, LINES. It became clear to me that it was terrible...)

In the end I chose the Mersenne Twister, which is a very good PRNG... for games, that is. Although the Mersenne Twister generates random-looking numbers, its authors note it is bad for crypto; I don't remember exactly why, but it is predictable enough for an attacker to exploit.


MT is not suitable for crypto because after observing 637 values you can predict all further values. 637 is the size of MT's internal state vector, and the output of MT is basically just values from that state vector run through a tempering function. That tempering function is invertible, so by looking at 637 values you get the full state vector, at which point you can generate the next values as usual.
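To make the invertibility claim concrete, here is a sketch of MT19937's tempering function and its inverse (the shift/mask constants are from the published algorithm; the inversion structure is the usual bit-fixing trick):

```javascript
// MT19937's tempering function and its inverse. Because temper() is
// invertible, each observed output reveals one word of internal state;
// collect a full state's worth and you can predict everything after.
function temper(y) {
  y ^= y >>> 11;
  y = (y ^ ((y << 7) & 0x9d2c5680)) >>> 0;
  y = (y ^ ((y << 15) & 0xefc60000)) >>> 0;
  y ^= y >>> 18;
  return y >>> 0;
}

function untemper(y) {
  y = (y ^ (y >>> 18)) >>> 0;               // shift >= 16: one pass inverts
  y = (y ^ ((y << 15) & 0xefc60000)) >>> 0; // mask kills feedback: one pass
  let t = y;                                // shift 7: fix 7 bits per pass,
  for (let i = 0; i < 5; i++) {             // from the low bits upward
    t = (y ^ ((t << 7) & 0x9d2c5680)) >>> 0;
  }
  y = t;
  t = y;                                    // shift 11: fix 11 bits per pass,
  for (let i = 0; i < 3; i++) {             // from the high bits downward
    t = (y ^ (t >>> 11)) >>> 0;
  }
  return t;
}

console.log(untemper(temper(0xdeadbeef)) === 0xdeadbeef); // true
```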


That's a great answer, thanks for that!


You should definitely actually try to clone an MT instance sometime. It's illuminating, and just tricky enough to be fun without being frustrating.


624 values are enough.


> its author claim it is bad for crypto, Mersenne Twister, I don't remember why

Because after observing a certain number of outputs, you can completely predict its subsequent outputs.


Yep. Plus, MT uses a huge amount of memory. I don't see any reason to use it any more now that modern stream ciphers have gotten so fast.


It is the unfortunate de facto standard for built-in noncryptographic RNGs in high-level languages. PHP, Python, and Ruby all rely on MT generators (all 3 have real CSPRNGs too).


Quick, somebody start the NoTwister movement.


Even for RPG games, any given PRNG can often be too random for a game. You can make a lot more realistic and enjoyable behaviors by grabbing your random values from a distribution other than uniform 0.0-1.0. (Like a Gaussian distribution, which is everywhere in models of random things that are the sum of many other individual random processes.)


Random and uniform are different things. A PRNG isn't "too random"; that doesn't make sense.

If uniform isn't the distribution you want, sample from another one, but doing so usually still requires uniform output from a PRNG underneath.

Indeed using the inverse CDF of any distribution is a map from (0,1) to the desired sample space, following the distribution.
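As a concrete instance of the inverse-CDF mapping, the exponential distribution has a closed-form inverse CDF (a minimal sketch):

```javascript
// Inverse-CDF sampling: push a uniform (0,1) draw through the inverse
// CDF of the target distribution. For Exponential(rate), the CDF is
// F(x) = 1 - exp(-rate * x), so F^-1(u) = -ln(1 - u) / rate.
function sampleExponential(rate) {
  const u = Math.random();          // uniform in [0, 1)
  return -Math.log(1 - u) / rate;   // 1 - u is in (0, 1], so log is finite
}

// Sanity check: the mean of Exponential(2) is 1/2.
let sum = 0;
const n = 200000;
for (let i = 0; i < n; i++) sum += sampleExponential(2);
console.log((sum / n).toFixed(1)); // 0.5
```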

As for which PRNG for RPG games, MT is very good because while it isn't cryptographically secure, it otherwise has a very long period and delivers very nicely distributed numbers as well as being very fast.

For non-crypto applications such as Monte Carlo simulation, there's little reason to use much else.


By "too random" I mean that for games, one often wants "predictable randomness" rather than unpredictable randomness, and a stock PRNG or MT can be too unpredictable. (Which is generally good, but not always for games. Or for candy; the colors of M&M's aren't distributed equally.) Picking your random values from a distribution is one way to make them predictable and more life-like. e.g. if you have a target shooting AI, using a built-in rand() function to select an angle and radius on the target is probably okay for the angle, but using it on the radius will give you a bad-looking spread. Whereas a Gaussian-selected radius will look more natural with most of the shots clustered close to the center, and it lets you easily fine-tune the AI's accuracy by adjusting parameters of the Gaussian.
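A sketch of that Gaussian-radius idea, with Box-Muller supplying the normal draws (the function names are mine):

```javascript
// Box-Muller: turn two uniform draws into one normally distributed draw,
// then use it as a shot radius so hits cluster around the bullseye.
function gaussian(mean, stdDev) {
  const u1 = 1 - Math.random();   // in (0, 1], avoids log(0)
  const u2 = Math.random();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return mean + stdDev * z;
}

// Uniform angle, Gaussian radius: most shots land near the centre, and
// stdDev becomes a direct accuracy knob for the AI.
function aimShot(accuracyStdDev) {
  const angle = Math.random() * 2 * Math.PI;
  const radius = Math.abs(gaussian(0, accuracyStdDev));
  return { x: radius * Math.cos(angle), y: radius * Math.sin(angle) };
}
```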

Alternatively, some sets of random values don't look random to humans, so as a game developer you might need to tweak it and make it appear more random to the player. e.g. these are some truly random bits from Hotbits (not a PRNG: https://www.fourmilab.ch/hotbits/secure_generate.html): '001101010011001011001000010100011010011111000001111101010011000111010000111000001011111000111011'. My pattern-matching human brain is annoyed by runs longer than 3 or 4 digits, and patterns like 000111 and 111000. If you take time to filter out 'annoyances' you can make the player like your game more, and if you're careful you can do it without decreasing the entropy bits per byte too much.


I'm a complete noob when it comes to cryptography. I understand that having a PRNG that doesn't return numbers with an even distribution across a range is bad. An extreme example would be something like http://xkcd.com/221/.

But could someone explain how an attacker can take advantage of the fact that 0 is returned ~1% more often than other digits? Is this flaw alone sufficient to break Cryptocat? Or does it simply make brute forcing easier when combined with other crypto flaws?


At a high level of abstraction: If one runs a nuclear power plant, one does not make a practice of tolerating small oil spills. Small oil spills are almost harmless. So are small quantities of sparks. The combination of small oil spills and small quantities of sparks, however, is a severe problem and gets worse in a hurry if it compounds with certain other usually benign properties of nuclear power plants.

Unfortunately, the sort of nuclear power plant operators which tolerate oil spills are often sufficiently not on their A game to tolerate sparks.

This is a very handwavy explanation. In particular, God doesn't hate nuclear powerplants and try to introduce sparks into them at inopportune moments just to see if they happen to find an oil spill, but The Adversary often can and will do this to your cryptosystem.


Yes, the fact that 0 is returned ~1% more often than other digits is very significant. E.g. if the generated random number stream were used for XOR encryption, you could just collect a large number of encryptions of the same text, look at which value occurs most often at each character position, and that would be the plaintext character (because 0 is biased to occur most often and c XOR 0 == c).

Using other encryption methods the advantage is often not so obvious, but it exists. Also it should be noted that 1/250 is a pretty large bias and would probably not even need particularly many ciphertexts.
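A simulated version of this attack (the bias here is exaggerated relative to Cryptocat's ~1/250 so the effect shows up with few samples):

```javascript
// XOR-encrypt the same byte many times under a keystream biased toward 0,
// then pick the most frequent ciphertext value. The 10% extra weight on 0
// is an exaggeration for illustration; a smaller bias just needs more
// ciphertexts.
function biasedKeystreamByte() {
  if (Math.random() < 0.1) return 0;        // extra weight on zero
  return Math.floor(Math.random() * 256);   // otherwise uniform
}

const plaintext = 0x41; // 'A'
const counts = new Array(256).fill(0);
for (let i = 0; i < 100000; i++) {
  counts[plaintext ^ biasedKeystreamByte()]++;
}

// Because c XOR 0 == c, the modal ciphertext byte is the plaintext byte.
const recovered = counts.indexOf(Math.max(...counts));
console.log(recovered === plaintext); // true
```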


> I understand that having a PRNG that doesn't return numbers with even distribution across a range is bad.

That's (potentially dangerously) oversimplified. Uniformity is one of the consequences rather than the requirement itself. The core requirement is unpredictability. If the output is non-uniform, it means the output isn't fully unpredictable (which may or may not lead to practical attacks depending on the degree of the problem and how the PRNG is actually being used).

Meanwhile, you can also create a uniform PRNG that isn't secure at all (e.g. http://en.wikipedia.org/wiki/Mersenne_twister ).


Let me take a somewhat stretched case where encryption can be completely broken, to explain why this is a problem.

Suppose you are using the PRNG to make a stream cipher. Basically, your random key is a seed for the PRNG. You then generate lots of pseudo-random characters from that seed, and XOR them together (character-by-character) with your message to encrypt it.

Now, the fact that XOR is linear (it's just addition mod 2) means that when you XOR a random variable with a constant, you get a shifted distribution. Let's say your PRNG disproportionately outputs "0" at each character. Then the distribution of each character of the ciphertext will have its mode at p XOR 0 = p, where p is the corresponding character of the plaintext!

So by the law of large numbers, if we see the same message encrypted many times, we can determine with high probability exactly what each character of the plaintext is, and completely break encryption!


If you are interested, this webpage does some really neat stuff with visualizing PRNGs.

http://lcamtuf.coredump.cx/oldtcp/


There's a nice article "Cryptography is a science, not engineering" [0] that gives an overview of cryptography that might help with your question as well.

The essence is that in modern cryptography you're creating a provably secure system. One of the assumptions those proofs rest on is that your PRNG outputs a uniform distribution.

[0] http://www.daemonology.net/blog/2013-06-17-crypto-science-no...


The small bias described in the article is a pointless and embarrassing flaw, but it's not the bug that made messages decryptable (this most recent time).

The recent work on biases in RC4 contains a good example of how small biases can end up being exploitable. http://www.isg.rhul.ac.uk/tls/


Nice article, I can certainly vouch for creating a visualization to clue you in that there may be a problem.

One of the things I built in Java when I was hanging out on the cypherpunks list was a 'Noise Sphere' applet. Basically this is a way of testing a PRNG visually. It was fun to put various ideas through it to see how they panned out (most really sucked). One of the cool things was using the alpha-emitter RNG from a smart card and finding out that it too was slightly biased in the presence of foil on one side (never did find out why that was, but it did stick out).


H-Online have a nice article about an external enclosure that offered "hardware data encryption with 128-bit AES, access control via an RFID chip".

The visualization (of the encrypted data, not of any PRNG used) showed that the encrypted data had strong patterns. The data wasn't actually encrypted using AES; it was XORed with a fixed cipher block. (This is trivially easy to crack. Even I could do it.)

The error arose because the manufacturer of the controller chip was using confusing terminology.

> The IM7206 merely uses AES encryption when saving the RFID chip's ID in the controller's flash memory. The company explained that actual data encryption is based on a proprietary algorithm.

(http://www.h-online.com/security/features/Enclosed-but-not-e...)

Other people interested in testing PRNGs might be interested in the DIEHARD tests (http://en.wikipedia.org/wiki/Diehard_tests), but read all the cautions too.


Turns out that hardware RNGs are very easily biased and usually need some transformation to be useful. See http://en.wikipedia.org/wiki/Hardware_random_number_generato....


Really drives home what many people were saying about the authors not merely being bad at cryptography but programming in general...


I upvoted this because, of course, it confirms my own biases, but kind of wish I hadn't, because we're not helping with comments like these.


Sure but the author(s) of Cryptocat have been so uniquely resistant to help/input, over a fairly long period of time. Maybe your euphemism 'unserious' is worth adopting for every time one wants to say 'bad' or 'incompetent', in cases like this.


I think the project is worth criticizing, but that the concerns are serious enough to deserve better than snark.


Maybe at some point a project exhausts whatever potential goodwill/snarklessness is available and just becomes an object of snark - and at that point and beyond, perhaps the benefits of the project having a reputation of being a snarkmagnet outweigh negatives of the inevitable snark it generates.

Makes it a lousy, toxic topic for message boards but that seems a comparatively small price for a defense against the notion that Cryptocat is (or likely, ever will be) secure.


Whether the Cryptocat programmers suck or not, I've seen worse errors from better programmers. I'm not sure what metric to follow here when it comes to correlating these two things.


You are dangerously fooling yourself by minimizing the importance of those bugs. Cryptography software is not like regular software. It is critical software, like the kind used to run planes or nuclear power plants: people's lives depend on it.

People with no programming experience should be literally banned by law from writing critical software. You should take those bugs way more seriously.

PS: I have seen the fear on a programmer's face when asked to write crypto software. They know enough to be shit scared. You need that kind of person.


I help write microcontroller code for pressure equipment management. If something goes sufficiently wrong in heating/fails to properly vent, an explosion can occur, endangering everyone in the area.

Even unrelated code is heavily audited to make sure that it can't somehow impact the main control loop and cause an invalid state.

Cryptography software should be much the same.


With normal software you can load up on the unit and integration tests to make yourself more confident with your software. When the concern is with the integrity of a cryptographic system, things are not quite so simple. You can write tests, sure, but your overall confidence afterwards is going to be much different.


We go beyond unit tests to verify that the algorithms can't create certain states by any execution path, etc.

Formal verification of software properties is an interesting field.


A blog post about the process would be fascinating, and probably something that many on HN would be interested in.


The problem is no one would volunteer to write crypto code under those constraints.


I know people who write crypto software that must conform to formal verification of the algorithm, requires detailed design documentation before a single line of code, etc.


Funny thing is that the code is then compiled with a compiler that was not formally verified, so it's still a 'fingers crossed' situation.


The code generator we use (at my job; not crypto) was written in Coq and formally verified to generate code with certain properties, given properties of the input.

I assume that the crypto group I know, who developed most of the code generator I use, takes similar measures to make sure that their verified "theorems" translate correctly to code.

The unverified stage is actually the hardware, which is an open problem.


I think in many regulated industries Coq itself would also need to be certified.

What industry is this?


Pressure equipment, specifically sort of mid-scale items for laboratory and small-batch use.

Our low level code (ie, directly controlling machines) is written in the SPARK environment. This code tends not to get updated often, and has a high level of verification to it. It's what actually handles the pressure cut-offs (ie, hard limits on the machine), turning valves, etc.

Our middleware code is based on a specially developed VM that gets code generated for it in Coq, to ensure that it doesn't choke or become unresponsive. However, it's not directly responsible for safety control and has somewhat laxer restrictions.

Coq is useful for demonstrating that the middleware analytics will complete in a given time profile and not crash out the server on erroneous input.


Right, but it's much less of a leap of faith, IMHO. Besides, if you turn off optimisations you can be reasonably sure the compiler didn't do something unintended.


I used "volunteer" to mean "not get paid".


> People with no programming experience should be literally banned by law from writing critical software.

Getting the government involved in who gets to write crypto software...

What could possibly go wrong?


I don't think he meant to "involve the government." I do agree with his sentiment though, an inexperienced programmer should not be allowed to write critical software.

This is, of course, very difficult to implement in practice.


I don't know much about writing "critical software" - because it's not something I've ever done.

However, my guess would be that with proper processes, you should be able to let a junior programmer write the code, because bugs and errors and mistakes will come out in the wash. It's not like experienced programmers don't make mistakes! Perhaps it simply doesn't make sense to turn a junior guy loose, but my thinking is that where people's lives are at stake, depending on someone being a 'good coder' is a bad idea.


I agree. I think the problem isn't how the code is implemented; I think it's how it's designed and verified.

Most of the time, generalist developers can severely (sometimes even completely) mitigate the expense of competent design and verification by adopting trusted components and adapting the application requirements to those components (instead of the other way around, which is the usual way developers incorporate third party components).

If you don't do that, though, you're looking at the 10x-1x-10x problem: however much time it takes you to build your system, you're looking at a 1:20 ratio of effort for non-implementation work, and that's serialized.

(Lest anyone think I'm talking my own book here: we do what I think is an atypically good job at handling the ancillary crypto stuff that comes up in normal applications, but I don't think we're well qualified to do formal reviews of cryptosystems.)


Cryptography software is usually critical, but unlike the code for planes and nuclear power plants, I think it doesn't necessarily have to be. Typically cryptography software is advertised as secure against an adversary with unlimited resources. However, the designers of the software can choose whatever threat model they want, as long as they make that clear to their users.

I am thinking of a disclaimer like "We believe our product is secure enough that the minimum cost to an adversary of decrypting a message is at least $10^k" for some k, where the cost is an estimate of the total cost of factors like hours of cryptanalysis, number of cycles, etc.

This way, if the code is found to be completely broken at some point then the error is considered relative to the level of security the designers intended.


I agree this is a much saner way to think about it.


I may have bombed an interview a few weeks ago when they asked me to design a security protocol on the fly. I really didn't want to answer that question (because anything I'd say would be vulnerable in some way) and ended up rambling on about certificates for a while.

I guess next time I'll be ready to say "that's not something you do during an interview."


I don't think the problem has much to do with how good a programmer you are.


[deleted]


I agree that an off-by-one error is not a huge deal in most cases, but cryptography is one of those places where you cannot get it wrong. The programmers, or an employee well-versed in cryptography, should have run a chi-squared test immediately on the random data to make sure it was random. (That is the first thing I would try, and I only have a working knowledge of statistics and cryptography.)

Cryptography is unforgiving, and when your code could seriously endanger someone else, you have to be constantly vigilant.
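The chi-squared check is only a few lines. A sketch against a digit generator with a small fold-over bias toward 0 (the biased construction here is mine, mimicking the kind of skew described in the article):

```javascript
// Chi-squared test of digit frequencies. With 10 buckets (9 degrees of
// freedom), a statistic far above ~21.7 rejects uniformity at p = 0.01.
function chiSquaredDigits(gen, n) {
  const counts = new Array(10).fill(0);
  for (let i = 0; i < n; i++) counts[gen()]++;
  const expected = n / 10;
  return counts.reduce((s, o) => s + (o - expected) ** 2 / expected, 0);
}

// A digit source with a fold-over bias: drawing 0..250 then reducing mod
// 250 maps both 0 and 250 to 0, so the digit 0 comes up slightly often.
function biasedDigit() {
  let b;
  do { b = Math.floor(Math.random() * 256); } while (b > 250);
  return (b % 250) % 10;
}

const uniformDigit = () => Math.floor(Math.random() * 10);
console.log(chiSquaredDigits(biasedDigit, 1000000) > 21.7);  // true: bias detected
console.log(chiSquaredDigits(uniformDigit, 1000000) < 21.7); // almost always true
```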


I wouldn't say they're bad at programming - everyone makes mistakes. I think the issue is more a weakness in (or lack of) tests and oversight. The fact the code is open source at least gives it some accountability and the error did get spotted eventually.


Well a fencepost/off-by-one error is hardly unique to bad programmers. There are certainly instances of poor crypto implementation, but I don't think this particular example is worthy of the appellation of poor programming.


It's not just the errors that were made though. I mean, ignoring the off-by-one error:

  repeat
     byte250 = randomSalsaByte()
  until byte250 <= 250
I am not sure what sort of code-review process they had in place if someone saw that and thought "Yup, brilliant. Ship it." I am sure we have all written code like that at one point in our lives, but code that is shipping to the general public?


It's not the most elegant code I've ever seen - but I don't otherwise see a red flag. What's your problem with it ? I don't think the worry that a semi-infinite string of legitimate values >250 is going to come out of randomSalsaByte() is a real, practical problem. You may as well worry that the complete works of Shakespeare asciified might come out of the PRNG. In theory it could happen. In practice, it won't happen even once in 10^100 universe lifetimes. The transistors in the CPU are going to spontaneously fail whilst executing the code zillions of times more often.


The issue isn't the potentially infinite loop, it's the fact it creates an uneven distribution of values, giving 0 a 1/250 greater chance of being selected, and thus makes the results of the PRNG more predictable. The article makes this pretty clear...


What part of "ignoring the off by one error" in the parent don't you get ? Edit: Sorry, unnecessarily snarky, and I could have made my (now uneditable) post clearer by saying ">=250" rather than ">250".


The article also makes clear that it's a standard fencepost error. Freakishly poor crypto, but not bush-league programming.


How else do you generate an unbiased random number between 0 and 250? As far as I know, that's the standard algorithm.


My concern is with why they are doing that in the first place. It is possible that I am missing something here, I haven't looked at their actual code, but it seems to me that this concern would have been entirely avoided if they went with 0-F instead of 0-9.


Yes, I definitely agree with that, and I don't understand why they did it that way. Generating a float in the [0,1) range by producing decimal digits seems utterly perverse. The sane, naive, and correct way to do it is to generate an integer between 0 and 2^53 and then divide by 2^53.


Or use an array of integers 0 <= n < 2^x. Some of the Javascript libraries involved are even using such a format.


Perhaps grab 5 bytes instead?


I've learned to always run a simple test where I print into the console 1 million iterations of the distribution. Takes about 10 seconds and this is just for games, not crypto.


While we're at it, did anyone review the Salsa20 implementation, or is anyone willing to review it?

https://github.com/cryptocat/cryptocat/blob/master/src/core/...


This reminds me of the IBM RANDU PRNG on which I gave an undergrad presentation (w/ visualization). RANDU was legendarily bad; all you had to do was plot triples in a 3-D volume to see that all of the generated points would fall within one of 15 (Edit: 15, not 11) planes.


>The code above is certainly an inelegant solution, since it is, in theory, at least, a potentially infinite loop.

As far as I know, it is the only solution for generating unbiased random numbers in a range that does not divide the native range of the PRNG you're using.

What I don't understand is why they're working in decimal in the first place. The obvious solution is to generate a random integer in the range [0, 2^53) (which does not require an unbounded-time loop, since 2^53 divides 2^64) and then divide by 2^53. I am open to the possibility that there's some quirk of floating point arithmetic that means that won't work [edit: less so now that I see it's how Java implements it¹], but I can hardly imagine that it's worse than this crazy business of generating 16 decimal digits because, hey, log₁₀(2^53) is pretty close to 16.

¹http://docs.oracle.com/javase/7/docs/api/java/util/Random.ht...
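A sketch of that approach, assuming a source of uniform 32-bit words (the bit split mirrors how java.util.Random.nextDouble combines 26 + 27 bits):

```javascript
// Build a uniform double in [0, 1) from 53 random bits:
// 26 high bits and 27 low bits, then divide by 2^53.
function randomUnit(random32) {
  const hi = random32() >>> 6;   // keep top 26 bits of a 32-bit word
  const lo = random32() >>> 5;   // keep top 27 bits of a second word
  return (hi * Math.pow(2, 27) + lo) / Math.pow(2, 53);
}
```

The combined integer is below 2^53, so every intermediate value is exactly representable as a double and the result is a bias-free uniform in [0, 1) with no loop required.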


You could accumulate numbers until you reached the least common multiple of the source and desired ranges.


How is that any different, am I missing something?


Never mind. I'm confused. Not the first time :-)


I think the criticism of the loop is a bit unfair. If your random number generator returns >250 a thousand times in a row, you have bigger problems than slowness.


Purely mathematically, you cannot assert that a certain value will appear in a certain timeframe - it's random!


True, but purely mathematically every cipher is vulnerable to the "get lucky and guess the key right the first time" attack (OTP excepted of course).

Almost all crypto operates in the realm of absurdly close to certain.


If it's random, then the probability that the same output will recur forever is zero. Here the probability of having to draw again is p = 6/256. You aren't going to roll into that 6/256 case enough times in a row to notice within the age of the universe.

This is the standard method for selecting a uniform value from a range when the source range is not an exact multiple of it. The worst-case probability of having to draw again can always be kept under 0.5 by drawing from the smallest power of two that covers the range. For example, to get a number in the range 0 <= n < 129, draw 8 bits and reject anything >= 129: the rejection probability is 127/256, just under 0.5.
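A sketch of that worst-case bound, using a hypothetical helper that computes the rejection probability for a given range size:

```javascript
// Rejection probability when drawing the smallest number of bits b
// that covers a range of size n (2^(b-1) < n <= 2^b):
// we reject (2^b - n) of the 2^b equally likely outcomes.
function rejectionProbability(n) {
  const bits = Math.ceil(Math.log2(n));
  const span = Math.pow(2, bits);
  return (span - n) / span;
}

console.log(rejectionProbability(129));  // 127/256, roughly 0.496
console.log(rejectionProbability(250));  // 6/256, the Cryptocat case
```

The worst case is a range just above a power of two (like 129), and even there the redraw probability stays below 0.5, so the expected number of draws is under 2.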


Seems like at a minimum, if you are writing your own random number generator, you should have a test harness that runs a chi-squared test and compares the output against a known-good random number source. I wonder how many other automated tests one could stack up against a crypto codebase to do that kind of basic checking.
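A sketch of such a harness for a uniform byte generator: with 256 buckets the statistic has 255 degrees of freedom and should land near 255, while a generator that can never emit one value scores orders of magnitude higher.

```javascript
// Pearson chi-squared statistic for n draws against a uniform
// distribution over `buckets` equally likely outcomes.
function chiSquaredUniform(draw, buckets, n) {
  const counts = new Array(buckets).fill(0);
  for (let i = 0; i < n; i++) counts[draw()]++;
  const expected = n / buckets;
  return counts.reduce(
    (s, c) => s + (c - expected) * (c - expected) / expected, 0);
}

// A healthy uniform source over 256 values: statistic near 255.
const stat = chiSquaredUniform(() => Math.floor(Math.random() * 256),
                               256, 1000000);
console.log(stat.toFixed(1));
```

Run the same harness against a source that silently drops one value (the shape of this bug) and the empty bucket alone contributes n/256 to the statistic, which is unmissable.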


The problem is having to build your own random number generator at all, and it's a consequence of doing cryptography in browser Javascript.

Designing a robust CSPRNG is as hard a problem as designing a block cipher mode or a MAC; it's "systemsier" than designing your own block cipher core, but about on par with coming up with a simple authenticated encryption mode. By way of example, CSPRNG design gets extensive treatment in its own chapter of Ferguson and Schneier's _Practical Cryptography_, and it's one of the few chapters that highlights the authors' own work.

Cryptocat uses JS inside of a Chrome extension to avoid the chicken/egg secure load problem (I'm ambivalent about this approach). But Chrome extension JS doesn't avoid the other problems of browser Javascript crypto, and here's an example of why: it forced Cryptocat to attempt the design of a custom CSPRNG.


You are running the chi-square test against the claimed distribution of the RNG, not against another RNG. That is just about enough to use the RNG for simulating dice rolls (sometimes, if the players are not too passionate). RNGs can have much subtler flaws: for example, the ANSI C rand() with the commonly suggested LCG parameters produces points that lie in just a few thousand planes if you assemble consecutive outputs into vectors. (That is bad if you want to run Monte Carlo integration, because your error estimate improves but the accuracy of the result does not.)

So even for simulation you can have rather subtle effects, though there you only get bitten by bad luck. The task for cryptographically secure random number generators is quite a bit harder, because an attacker actively tries to exploit any weakness in your RNG, while your tests only tell you which specific attacks do not work.


The ANSI C rand() doesn't actually specify any particular algorithm. The complete set of requirements for rand() is just this: rand() computes a sequence of pseudo-random integers in the range 0 to RAND_MAX; RAND_MAX must be at least 32767; if srand() is called with the same seed value, the same sequence of values will be produced; calling rand() before calling srand() acts as if srand() had been called with a seed of 1; and no other library function may perturb the pseudo-random sequence.

Historical C standard library implementations did tend to use the same or similar LCRNGs, but that was never a requirement, and is no longer the case.
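For illustration, the LCG that many of those historical implementations shipped, following the C standard's non-normative example (not a requirement of the standard); state here wraps modulo 2^32, which is one common choice, and BigInt keeps the arithmetic exact:

```javascript
// The portable rand() example from the C standard, roughly:
//   next = next * 1103515245 + 12345;
//   return (next / 65536) % 32768;
// Assumes 32-bit unsigned wraparound on the state.
function makeRand(seed) {
  let next = BigInt(seed);
  return function rand() {
    next = (next * 1103515245n + 12345n) % (1n << 32n);
    return Number((next / 65536n) % 32768n);
  };
}

const rand = makeRand(1);
console.log(rand());  // 16838, the classic first value for seed 1
```

Low quality and fully predictable from a single seed, which is exactly why it was never meant for cryptographic use.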


Just run the NIST statistical test suite, Marsaglia's Diehard, or similar. Of course those are only statistical tests, which are necessary but not sufficient for a CSPRNG, but they can help find problems like this.


John Cook has some interesting material on testing PRNGs in his chapter for the book Beautiful Testing. His material is (was?) online free, here: http://www.johndcook.com/blog/2010/12/06/how-to-test-a-rando...


I thought cryptocat was independently audited. Who audited it and why didn't they find these problems?


Cryptocat was audited by Veracode, but my understanding is that they only audited the application functionality (i.e., the interface and everything related to it), not the cryptography, which was out of scope.


Interesting though this is, all it really shows is that there is a bias in the random number generation.

I would be more interested to know if this is sufficient to break Cryptocat. Would this really make a brute force attempt much easier?


This isn't a disclosure of a new vulnerability; it's a followup on an earlier article which documented a much more significant problem with key management in Cryptocat.


I can mostly grok the scope of the error and the theory behind it...what's more interesting to me is to hear, preferably from the Cryptocat developers, what led to this error? An accidental regression? A deliberate decision? Cryptocat and what it aims to do is respectable, but I think what concerns people outside of the project is the prospect of unknown unknowns in the rest of the code: are the recent bug revelations indicative of endemic blind spots?


Off-by-one errors are very common (and they have a long history). My best guess is "an honest mistake."

As for the "endemic blind spots," any crypto software needs professional reviews and a good deal of time before it can be used in critical situations.


Yes. A pro friend of mine is fond of saying: budget 10x for design, 1x for implementation, and 10x for verification.


I don't know if the name is supposed to be a joke, but it's a bit weird that the NSA uses it internally:

http://blogs.marketwatch.com/thetell/2013/06/06/nsa-targets-...


Very interesting article, thank you.


This should be possible to test automatically. And if that's not possible/easy to do, it should be standard procedure to generate colour maps that can be checked by humans.


We talk about how JavaScript delivered from a webserver can never be a secure crypto platform, but of course in reality the problems are far more mundane.


s/mundane/interesting/


Why doesn't the article include updated graphs with the bug fixed?


Wow, someone used <= instead of <, and they wrote a five-page report with all those images and charts. The bug is obvious and simple; what's with all those explanations?


this is crypto. there were "less serious" bugs, like buffers left initialized to zeros (see Debian), that led to catastrophic results, which is what the explanation is about.


I am not saying whether the bug is critical or not. It's just pages of explanations, charts, and so on; I was half expecting a video explanation by the end.

If they are targeting non-crypto people (which includes me), explaining how a bad random algorithm undermines the cryptography would be better than showing that the algorithm is bad in five different ways.

Also, any links to articles about that Debian bug?


Random numbers are used for different things, but mostly for generating keys. An attacker who knows everything about your system (all the hardware and software, all the source code, everything) should not be able to predict the next bit output by a PRNG.

Errors include:

i) using a source that is not actually random. As mentioned elsewhere, some hardware devices provide skewed numbers, and even de-skewing doesn't help much.

ii) using a poor seed.

The Debian bug is an example of ii -

> This vulnerability was caused by the removal of two lines of code from the original version of the OpenSSL library. These lines were used to gather some entropy data by the library, needed to seed the PRNG used to create private keys, on which the secure connections are based. Without this entropy, the only dynamic data used was the PID of the software. Under Linux the PID can be a number between 1 and 32,768, that is a too small range of values if used to seed the PRNG and will cause the generation of predictable numbers. Therefore any key generated can be predictable, with only 32,767 possible keys for a given architecture and key length, and the secrecy of the network connections created with those keys is fully compromised.

(http://en.wikinews.org/wiki/Predictable_random_number_genera...)

(http://www.schneier.com/blog/archives/2008/05/random_number_b.html)
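A sketch of why a PID-only seed is fatal: the attacker simply enumerates all 32,768 candidate seeds and regenerates every possible key. `keyFromSeed` below is a hypothetical stand-in for the real key derivation, and `toyKey` is a toy substitute used only for the demonstration:

```javascript
// With only the PID (1..32768) feeding the PRNG, the entire keyspace
// can be enumerated and matched against an observed public key.
function bruteForce(keyFromSeed, observedPublicKey) {
  for (let pid = 1; pid <= 32768; pid++) {
    if (keyFromSeed(pid) === observedPublicKey) return pid;  // seed recovered
  }
  return null;
}

// Toy stand-in derivation (an odd multiplier mod 2^32 is injective here,
// so each seed maps to a distinct "key").
const toyKey = (seed) => (seed * 2654435761 % 4294967296).toString(16);
console.log(bruteForce(toyKey, toyKey(31337)));  // recovers 31337
```

32,768 trial derivations is a trivial amount of work, which is why the Debian keys could be (and were) precomputed wholesale.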


excuse me for the inaccuracy - it wasn't zero-initialization, it was a maintainer who thought he was fixing a bug because he made a compiler warning disappear, which resulted in a whole lot of zeros where there should have been some random data.

http://www.schneier.com/blog/archives/2008/05/random_number_... http://research.swtch.com/openssl



I wonder - if you took in aggregate the time people spent complaining about Cryptocat could a more secure alternative have already been built? :-)

Anyway, I really enjoyed this article. Informative, interesting, and free of snark.


No, stop trivializing cryptography software.



