Saturday, 29 January, 2005

There's been quite a lot of controversy over the Microsoft Office security flaw which allows password-protected documents to be recovered using a very low-tech attack. I feel that I need to explain why this flaw is so serious and why it's a mistake that a company like Microsoft can't afford to make.

First, a quick word about the logical operator called "XOR". If we XOR two bits together, the rule for working out the answer is as follows: if either, but not both, of the bits is set to one then the answer is one. If not, the answer is zero. So, 0 XOR 0 = 0, 1 XOR 0 = 1, 1 XOR 1 = 0, 0 XOR 1 = 1. We don't have to do XOR on just single bits; we can perform the operation on n-bit words. We start by taking the first bit from each of the words and XORing them together to make the first bit of the new word. We simply repeat the process for each bit position until we've XORed every pair of bits in the two n-bit words. XOR has a couple of nice mathematical properties: it is commutative and associative.
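To make the rule concrete, here is a small Python sketch. The `^` operator is Python's built-in bitwise XOR; the helper name `xor_bytes` is my own invention for illustration, not something from any library.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings, position by position."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))

# Single bits: this reproduces the four cases listed above.
truth_table = {(a, b): a ^ b for a in (0, 1) for b in (0, 1)}

# An 8-bit word: each bit of the result is the XOR of the
# corresponding pair of input bits.
word = 0b11001010 ^ 0b10100110   # 0b01101100
```

The order of the operands makes no difference to any of these results, which is the commutative property at work.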

Another nice property of XOR is that if we XOR two identical streams together the result is a stream of zero bits. This is because 0 XOR 0 = 0 and 1 XOR 1 = 0: since the two words being XORed are identical they will always contain the same bits, so zero is the only possible result of an XOR between them. This property is useful in cryptography because it allows us to easily design ciphers whose decryption and encryption algorithms are identical.
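A quick Python sketch of both claims. The message and key here are made up for the example, and the key is deliberately silly (it is not a secure key); `xor_bytes` is just an illustrative helper.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings, position by position."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))

message = b"attack at dawn"
key = bytes(range(1, len(message) + 1))  # illustrative only, not secure

# Identical streams cancel to all zero bytes.
all_zero = xor_bytes(message, message)

# The same operation encrypts and decrypts:
# (message XOR key) XOR key = message XOR (key XOR key) = message.
ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)
```

Here `all_zero` comes out as fourteen zero bytes and `recovered` equals `message`, with no separate decryption routine needed.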

When you save a file and password protect it in Microsoft Word, it is encrypted using an algorithm called RC4 with the password you supply. This sounds reasonably secure, right? Wrong! RC4 belongs to a class of ciphers called "stream ciphers". Essentially, they're secure random number generators seeded with the key, and they produce a very long stream of bytes that we call a keystream. You then XOR this keystream with the plain-text to produce a piece of random-looking cipher-text.
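For the curious, here is a minimal sketch of textbook RC4 in Python. It uses the key bytes directly and makes no attempt to model how Office actually turns a password into a key or lays out the file, so treat it purely as an illustration of a stream cipher. The function names are mine.

```python
def rc4_keystream(key: bytes, length: int) -> bytes:
    """Generate `length` keystream bytes from `key` using textbook RC4."""
    # Key scheduling: shuffle the numbers 0..255 under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Output generation: keep shuffling, emitting one byte per step.
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    keystream = rc4_keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))
```

Calling `rc4_crypt(key, rc4_crypt(key, data))` gives back `data`, because the same keystream is XORed in twice. Notice also that the keystream depends only on the key, never on the data being encrypted.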

The problem comes if you encrypt two or more different plain-texts with the same encryption key. Given the same key, RC4 will generate an identical keystream every time it is run. This means that both cipher-texts will have been XORed against the same keystream. This is a huge problem because if we XOR both cipher-texts together we completely remove the influence of the keystream from the result, and what we end up with is the XOR of the two plain-texts. You can see this fairly easily: if you consider (A XOR B) XOR (C XOR B), where A and C are the plain-texts and B is the keystream, then by the fact that XOR is commutative and associative we see that the result must be A XOR C because the B's cancel. This can easily be broken by guessing the first section of one of the plain-texts, XORing this guess with the XOR of the two cipher-texts, and seeing if the text that comes out is reasonable.
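Here is a toy demonstration of that attack in Python. To keep it short I've used a SHA-256 digest as a stand-in for the RC4 keystream (the attack doesn't care where the keystream came from, only that it was used twice), and both plain-texts are made up.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, stopping at the end of the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-in for the keystream RC4 would produce from one reused key.
keystream = hashlib.sha256(b"same key both times").digest()

plain1 = b"Dear Sir, I am writing to"
plain2 = b"Meeting notes: budget cut"
cipher1 = xor_bytes(plain1, keystream)
cipher2 = xor_bytes(plain2, keystream)

# The attacker sees only cipher1 and cipher2. XORing them together
# cancels the keystream and leaves plain1 XOR plain2.
combined = xor_bytes(cipher1, cipher2)

# Guess how one document starts; XOR the guess against the combined
# stream to reveal the same positions of the other document.
guess = b"Dear Sir"
revealed = xor_bytes(combined, guess)   # b"Meeting "
```

A correct guess yields readable text from the other document, which in turn suggests the next few characters to guess. A wrong guess yields gibberish, so the attacker knows to try something else. At no point is the key or the password needed.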

So what should Microsoft do? Well, in my opinion you can't really patch this fault. It'd break backward compatibility with the releases of Office already out there. I think it would be wise for them to put a message in their Office suite warning the user of the problem, reading something like this: "This security feature is only intended to prevent casual snooping and will not stave off intelligent attack." In the next version of Office they should upgrade their encryption to AES-128 and use it in a mode other than ECB. They must not forget to include a secure integrity check such as HMAC, or they could even be brave and use an authenticated-encryption mode of operation for AES.

Really, this is pretty basic stuff that you'd just expect Microsoft to know. It makes me think that this "Trustworthy Computing initiative" is just a marketing campaign. I mean, how can you have such an initiative and get the cryptography so horribly wrong?

Simon.

20:40:00 GMT | #Randomness | Permalink