Wed Jan 3 13:37:00 2007

With all due respect

When people announce that they have found a vulnerability in something that should be really secure, it should always trigger interest with reservations. Yesterday, a contracting consultant currently working for us announced that he identified a bug in the reference implementation of the Twofish crypto cipher. While this would not have been the first time in our halls that an issue in a cryptographic reference implementation was identified, it astonished me.

At first sight, the code he showed did indeed look like a variable initialization was missing:

	int i,j;

	for (i=r=0;i<2;i++)
		r ^= (i) ? k0 : k1;			/* merge in 32 more key bits */
		for (j=0;j<4;j++)			/* shift one byte at a time */
	return r;

The initialization of the variable r wasn't really obvious and I missed it too. Frankly, my annoyance was significant since we are currently helping a customer to design a security protocol with Twofish being one of the cornerstones, due to the cipher's strength and conservative design. And we went to great lengths to get it right, getting a cryptographer on board for the project and verifying everything over and over again. The prospect of looking at a broken reference implementation wasn't really what I wanted to hear.

After reviewing the code today, it quickly became obvious that there was no bug in the function; it initializes r when entering the loop. But, while looking at the reference implementation, I noticed the following comment at the beginning of the file:

		*	Pedagogical version (non-optimized)
		*	Tab size is set to 4 characters in this file

While browsing the source code, I immediately stumbled over a few lines such as:

/* works for big and little endian! */
d[i/8] |= b << (4*((i^1)&7));		
outBuffer[n/8] = (outBuffer[n/8] & ~ bit) | 
                 (ctBit ^ ((((BYTE *) x)[0] & 0x80) >> (n&7)));

And I have to say: with all due respect, this is everything but pedagogical code. What is the point of presenting the pedagogical reference implementation of a cryptographic algorithm in a language such as C, where out of bounds array access isn't really noticed and one can write code such as the examples shown above? To teach students that really important code must look like this? If this is not optimized for actual use, why not present the code in a readable form, computing everything one step at a time and preferably in a programming language that doesn't look like a someone rolled an armadillo over his keyboard?

Schneier and Ferguson correctly state in their book "Practical Cryptography" that complexity kills most security systems and that the only known way of handling complexity is to modularize the problem into chunks that the human mind can handle. This advice should be adjusted to read: chunks that an average human mind can handle. I know that many people are proud of their brilliance, rightfully or not, but that's not the point.

Cryptography is fragile and delicate enough by nature. But it is also a very important building block of many security systems. Therefore, many programmers must implement cryptography in their respective programming languages and some of them may not be able to correctly understand either the C gibberish in the reference implementation or the LaTeX special math character party in the official Twofish paper. May be it's just my world view as a consultant, but we always try to explain demanding content as simple as possible while still being correct, which is in fact the real challenge.

A common answer is: If people don't understand it completely, they should not implement crypto. This seems simple enough, but the number of people in the world who do not fall under this rule is too small compared to the amount of software that needs protection. We need security mechanisms most programmers (and auditors) can handle; otherwise we will keep producing insecure software. Not even the most brilliant person can write all his programs himself without the need to trust another, potentially unknown programmer. And the other programmer will, be definition, be less brilliant.

Schneier and Ferguson keep repeating the following important sentence in their book: "We have enough high performance insecure systems, we don't need another one." They are totally right. But with all due respect, please consider this rule yourself next time.

Posted by FX | Permanent link | File under: rants