Where did things go wrong?

Let’s solve the “puzzle” from the last post, shall we?

The issue was with the trust level. The code assumed that the storage layer, whether memory, a filesystem or a remote web server, is trustworthy. When data was read back from the datastore, the code didn’t check whether it was correct. Yet by design we know that the data must exactly match the name of the blob we asked for.

We could argue about the in-memory implementation, which should be secure as long as nobody messes with the memory of our process (sometimes that happens unintentionally). But when files are stored on a local disk, it’s much easier to alter them. The least secure of the three implementations was the web connector. Connecting it to a bogus server would break our security, since any blob could be replaced with anything else.

So what’s the fix? You can see all the changes I made here, but they can be summarized as:

  1. When reading data through any implementation of the public interface, always check that the contents match the given blob name
  2. In the web connector, check the name that is generated when calling SaveAutoNamed - this ensures that the server we’re connecting to produces valid blob names
  3. A bonus one (not directly connected to the trust problem, but still a security issue) - when comparing blob names, use a constant-time comparison - this could prevent raw blob names from leaking through timing differences in case we want to hide them at higher levels
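The checks above can be sketched roughly as follows. This is only an illustration, not the actual change: it assumes the blob name is the SHA-256 digest of the contents, and `blobName`, `verifyBlob` and `ErrNameMismatch` are hypothetical names made up for this example.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"errors"
	"fmt"
)

// ErrNameMismatch is a hypothetical error for contents that do not match the name.
var ErrNameMismatch = errors.New("blob contents do not match blob name")

// blobName derives a name from the contents.
// Assumption for this sketch: the name is the SHA-256 digest of the data.
func blobName(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

// verifyBlob recomputes the name from the data and compares it with the
// expected name in constant time.
func verifyBlob(name, data []byte) error {
	if subtle.ConstantTimeCompare(blobName(data), name) != 1 {
		return ErrNameMismatch
	}
	return nil
}

func main() {
	data := []byte("hello")
	name := blobName(data)

	fmt.Println(verifyBlob(name, data))               // genuine contents
	fmt.Println(verifyBlob(name, []byte("tampered"))) // altered contents
}
```

The same `verifyBlob` call could cover the second point as well: after an upload, pass the name returned by the server together with the data that was sent, and treat a mismatch as a failed save.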

After those changes, the datastore interface is required to either return valid data for the given blob name or fail with an error, so the consumer of this interface can assume that no error means valid data.
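One way to picture that contract is a wrapper that sits in front of any untrusted backend. The sketch below is a simplification under stated assumptions: the `DataStore` interface, the `mapStore` backend, the `verified` wrapper and hex-encoded SHA-256 names are all hypothetical and are not taken from the real code.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"fmt"
)

// DataStore is a hypothetical, read-only slice of a datastore interface.
type DataStore interface {
	Open(name string) ([]byte, error)
}

// mapStore is an untrusted in-memory backend used only for this demo.
type mapStore map[string][]byte

func (m mapStore) Open(name string) ([]byte, error) {
	data, ok := m[name]
	if !ok {
		return nil, errors.New("blob not found")
	}
	return data, nil
}

// verified wraps a backend so that a nil error always means valid data.
type verified struct{ inner DataStore }

func (v verified) Open(name string) ([]byte, error) {
	data, err := v.inner.Open(name)
	if err != nil {
		return nil, err
	}
	sum := sha256.Sum256(data)
	want, err := hex.DecodeString(name)
	if err != nil || subtle.ConstantTimeCompare(sum[:], want) != 1 {
		return nil, errors.New("blob contents do not match blob name")
	}
	return data, nil
}

// nameOf returns the hex-encoded SHA-256 digest (the assumed naming scheme).
func nameOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	backend := mapStore{}
	name := nameOf([]byte("original"))
	backend[name] = []byte("original")
	store := verified{inner: backend}

	data, err := store.Open(name)
	fmt.Println(string(data), err)

	backend[name] = []byte("malicious") // someone alters the stored blob
	_, err = store.Open(name)
	fmt.Println(err)
}
```

With such a wrapper the caller never has to re-check anything: a tampered blob surfaces as an error instead of as data.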

Let’s play the bad guy

OK, let’s now think about how the trust issue could be exploited. If we assume that the attacker has full access to the storage (e.g. to the files of a file-based datastore), they could serve arbitrary contents to whoever is using the datastore.

With plaintext data the result is pretty devastating - for example, replacing some JavaScript on a statically served web page with a malicious version. Once the attacker can run their own code on the victim’s computer, the fight is over.

One may think that if the data stored in the datastore is encrypted, we’re in a much better position: the attacker wouldn’t be able to inject their own ciphertext without knowing the key, right? Nothing could be further from the truth. Feeding the victim ciphertext of the attacker’s choosing is known as a chosen-ciphertext attack, and encryption on its own does not guarantee the integrity of the data. The results could be as devastating as with plaintext stored in the datastore.
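To make this concrete, here is a small sketch of how ciphertext can be altered without the key. It assumes unauthenticated AES-CTR and an attacker who knows the original plaintext. Both are assumptions chosen for illustration, since nothing above says which cipher is used, and the helper names `ctrXOR` and `tamper` are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// ctrXOR encrypts or decrypts with AES-CTR (the operation is symmetric).
func ctrXOR(key, iv, in []byte) []byte {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	out := make([]byte, len(in))
	cipher.NewCTR(block, iv).XORKeyStream(out, in)
	return out
}

// tamper turns the known plaintext `from` into `to` by flipping ciphertext
// bits. It needs no key; `from` and `to` must have the same length.
func tamper(ciphertext, from, to []byte) []byte {
	out := append([]byte(nil), ciphertext...)
	for i := range from {
		out[i] ^= from[i] ^ to[i]
	}
	return out
}

func main() {
	key := make([]byte, 32)           // all-zero demo key, never do this for real
	iv := make([]byte, aes.BlockSize) // all-zero demo IV, same warning

	stored := ctrXOR(key, iv, []byte("load('good.js')"))

	// The attacker edits the stored ciphertext without ever seeing the key.
	forged := tamper(stored, []byte("load('good.js')"), []byte("load('evil.js')"))

	// The victim decrypts with the real key and gets the attacker's text.
	fmt.Printf("%s\n", ctrXOR(key, iv, forged)) // prints: load('evil.js')
}
```

The decryption succeeds without any complaint, which is exactly the problem: without an integrity check on the stored bytes, the reader cannot tell the forged blob from the real one.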

That’s it for today. Hope to write something more soon.