Mar 25, 2013

Secure cloud backups with Tahoe-LAFS

There's plenty of public cloud storage these days, but trusting any of them with any kind of data seem reckless - service is free to corrupt, monetize, leak, hold hostage or just drop it then.
Given that these services are provided at no cost, and generally without much ads, guess reputation and ToS are the things stopping them from acting like that.
Not trusting any single one of these services looks like a sane safeguard against them suddenly collapsing or blocking one's account.
And not trusting any of them with plaintext of the sensitive data seem to be a good way to protect it from all the shady things that can be done to it.

Tahoe-LAFS is a great capability-based secure distributed storage system, where you basically do "tahoe put somefile" and get capability string like "URI:CHK:iqfgzp3ouul7tqtvgn54u3ejee:...u2lgztmbkdiuwzuqcufq:1:1:680" in return.

That string is sufficient to find, decrypt and check integrity of the file (or directory tree) - basically to get it back in what guaranteed to be the same state.
Neither tahoe node state nor stored data can be used to recover that cap.
Retreiving the file afterwards is as simple as GET with that cap in the url.

With remote storage providers, tahoe node works as a client, so all crypto being client-side, actual cloud provider is clueless about the stuff you store, which I find to be quite important thing, especially if you stripe data across many of these leaky and/or plain evil things.

Finally got around to connecting a third backend (box.net) to tahoe today, so wanted to share a few links on the subject: