Feb 11, 2010

My "simple" (ok, not quite) backup system

There's a saying: "there are two kinds of sysadmins - the ones that aren't making backups yet, and the ones that already do". I'm not sure if the essence of the phrase wasn't lost in translation (ru->eng), but the point is that it's just a matter of time until you start backing up your data.

Luckily for me, I got there quite fast, and consider making daily backups a must-have practice for any developer/playground machine or under-development server. It has saved me on countless occasions, and there were quite a few times when I just needed to check whether everything in my system was still in place and the same as before.

Here I'll try to describe how my backup system operates and the reasons for building it this way.

Ok, what do I need from the backup ecosystem?

  • Obviously, it'd be a bother to back up each machine manually every day, so there's cron.
  • Backing up to the same machine obviously isn't a good idea, so the backup has to be transferred to a remote system, preferably several, in different environments.
  • Another thing to consider is the size of such backups and an efficient method of storing, transferring and accessing them.
  • Then there's the security issue - full fs read access is required to create the backup, and that can easily be abused.

The first two points suggest that you either need privileged remote access to the machine (like root ssh, which is a security issue) or have to make backups (local fs replicas) locally, then transfer them to the remote with unprivileged access (restricted to just these backups).

Local backups make the third point (space efficiency) more difficult, since you either have to make full backups locally (and transferring those is not efficient at all) or keep some metadata about the state of all the files (like "md5deep -r /", but with file metadata checksums as well), so you can generate increments efficiently.
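Just to illustrate the latter idea (it's not what I ended up using), such per-file state tracking could look roughly like this - all names here are made up:

import os, hashlib

def file_state(path):
    # Digest of both contents and the metadata a backup should preserve
    # (mode, uid/gid, size, mtime), so either kind of change shows up.
    st = os.lstat(path)
    digest = hashlib.md5('{0.st_mode} {0.st_uid} {0.st_gid}'
        ' {0.st_size} {0.st_mtime}'.format(st).encode())
    if os.path.isfile(path) and not os.path.islink(path):
        with open(path, 'rb') as src:
            for chunk in iter(lambda: src.read(2**20), b''):
                digest.update(chunk)
    return digest.hexdigest()

def changed_paths(root, old_state):
    # Yield paths that differ from a previously stored {path: digest} dict,
    # i.e. what would have to go into the next increment.
    for path, dirs, files in os.walk(root):
        for name in files:
            p = os.path.join(path, name)
            if old_state.get(p) != file_state(p):
                yield p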

The traditional hacky way to avoid checksumming is to look at inode mtimes only, but that is unreliable, especially since I use stuff like "cp -a" and "rsync -a" (which synchronize timestamps) on a daily basis and play with timestamps any way I like.

Space efficiency is usually achieved via incremental archives. Not really my thing, since they have terrible accessibility - tar (and any other streaming format like cpio) especially, dar less so, since it has random access and file subset merge features, but it's still bad at keeping increments (the reference archive has to be preserved, for one thing) and is not readily browseable - you have to unpack it to some tmp path before doing anything useful with the files. There's also SquashFS, which is sort of a "browsable archive", but it has no increment-tracking features at all ;(

Another way to preserve space is to forget about these archive formats and just use the filesystem to store the backed-up tree. Compression is also an option here, with ZFS or Btrfs or some FUSE layer like fusecompress, and keeping increments is also simple with either hardlinks or snapshots.
Obviously, accessibility (and simplicity, btw) is a non-issue here - you can use diff, rsync and the rest of the usual tools to do anything you want with it, which I see as a great feature. And should you need to transfer it in a container - just tar it right to the medium in question.
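As a sketch of what hardlink-based increments boil down to (not my actual setup - paths, host and naming are made up for illustration), rsync's --link-dest does most of the work: unchanged files in a new snapshot are just hardlinks into the previous one:

#!/usr/bin/env python
# Sketch: dated snapshot dirs where unchanged files are hardlinks into the
# previous snapshot; rsync --link-dest does the heavy lifting.
import os, subprocess, time

storage = '/srv/backups/somehost'
snapshot = os.path.join(storage, time.strftime('%Y-%m-%d'))
latest = os.path.join(storage, 'latest')  # symlink to the previous snapshot

cmd = ['rsync', '-aHAX', '--numeric-ids', '--delete']
if os.path.exists(latest):
    cmd.append('--link-dest=' + os.path.realpath(latest))
cmd += ['somehost:/', snapshot + '/']
subprocess.check_call(cmd)

# Repoint "latest" at the freshly made snapshot for the next run.
if os.path.lexists(latest):
    os.unlink(latest)
os.symlink(snapshot, latest)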
Of course, I liked this way a lot more than the archives, and decided to stick with it.
So, at this point the task was refined to just rsyncing from the backed-up machine to backup storage.
Since I have two laptops, which might not always be reachable from the backup host, and I should be able to initiate a backup whenever I need to without much effort, it's best if backups are initiated from the backed-up machine.

That said...

  • I don't want to have any kind of access to backup storage from this machine, or to know anything about the backup storage layout, so direct rsync to storage is out of the question.
  • At the same time, I don't need any-time root - or any other kind of - access to the local machine from the backup host; I only need it when I request a backup locally (or the local cron does it for me).
  • In fact, even then, I don't need the backup host to have anything but read-only access to the local filesystem. This effectively obsoletes the idea of unprivileged access just-to-local-backups, since they'd be the same read-only (...replicas of...) local filesystem, so there's just no need to make them.

The obvious tool for the task is an rsync pull, initiated from the backup host (and triggered by the backed-up host), with some sort of one-time pass given out by the backed-up machine.

And the local rsync should be limited to read-only access, so it can't be used by a backup-host impostor to zero or corrupt the local rootfs. Ok, that's quite a paranoid scenario, especially if you can identify the backup host by something like an ssh key fingerprint, but it's still a good policy.

The ways I've considered to give local rsync read-only, but otherwise unrestricted, access were:

  • Locally-initiated rsync with parameters passed from the backup host, like "rsync -a / host:/path/to/storage". Not a good option, since that requires parameter checking, and that's proven to be an error-prone, soul-sucking task (just look at sudo or suid-perl), plus it'd need some one-time and restricted access mechanism on the backup host.
  • Local rsyncd with one-time credentials. Not a bad way. Simple, for one thing, but the link between the hosts can be insecure (wireless) and the rsync protocol does not provide any encryption for the transferred data - and that's the whole filesystem piped through. Also, there's no obvious way to make sure it'd serve only one connection (from the backup host, just to read the fs once) - credentials can be sniffed and reused.
  • Same as before, but via a locally-initiated reverse ssh tunnel to rsyncd.
  • One-shot local sshd with rsync-only command restriction, one-time generated keypair and remote ip restriction.

The last two options seem to be the best, being pretty much the same thing, with the last one more robust and secure, since there's no need to tamper with rsyncd and it's truly one-shot.

The caveat, however, is how to give the rsync process read-only access. Luckily, there's the dac_read_search posix capability, which allows just that - all that's needed is to make it inheritable-effective for the rsync binary in question, which can be a separate, statically-linked one, used just for these backup purposes.
A separate one-shot sshd is also friendly to nice/ionice settings and traffic shaping (since it's listening on a separate port), which is quite crucial for wireless upload bandwidth, since that has a major impact on interactive connections - the output pfifo gets swarmed by ssh-data packets, and every other connection's actions (say, an ssh session keypress) lag while their packets wait in that queue... but that's a bit of an unrelated note (see LARTC if you don't know what it's all about, mandatory reading).
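Setting those file caps on a dedicated rsync binary comes down to a single setcap call (a sketch - the binary path is made up, and setcap comes from the libcap package):

# "=ei" marks the cap as inheritable+effective on the file, matching the
# "inheritable-effective" setup described above; the path is just an example.
import subprocess
subprocess.check_call(['setcap', 'cap_dac_read_search=ei', '/usr/local/bin/rsync-backup'])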

And that actually concludes the overall plan, which comes down to these steps (a rough sketch of the backed-up-host side follows the list):

  • Backed-up host:
    • Generates an ssh keypair (ssh-keygen).
    • Starts a one-shot sshd ("-d" option) with authorization only for the generated public key, plus command ("ForceCommand" option), remote ip ("from=" option) and other (no tunneling, key-only auth, etc) restrictions.
    • Connects (ssh, naturally) to the backup host's unprivileged user or restricted shell, sends it the generated (private) key for sshd auth, and waits.
  • Backup host:
    • Receives the private ssh key from the backed-up host.
    • rsync backed-up-host:/ /path/to/local/storage
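A very rough sketch of the backed-up-host side, under a pile of assumptions - hostname, port, paths and the remote "request-backup" entry point are all made up, and error handling is omitted:

#!/usr/bin/env python
# Sketch of the backed-up-host side: one-time keypair, one-shot restricted sshd,
# then a call to the backup host asking it to pull. Not the real scripts.
import os, subprocess, tempfile

backup_host, sshd_port = 'backup.example.org', 8022
workdir = tempfile.mkdtemp(prefix='backup-sshd.')
key = os.path.join(workdir, 'key')

# One-time keypair, used for this backup run only.
subprocess.check_call(['ssh-keygen', '-q', '-t', 'rsa', '-N', '', '-f', key])

# authorized_keys with the "from=" restriction; the rest of the restrictions
# (ForceCommand, no tunneling, key-only auth) go into the sshd_config below.
with open(os.path.join(workdir, 'authorized_keys'), 'w') as dst:
    dst.write('from="{}" {}'.format(backup_host, open(key + '.pub').read()))

with open(os.path.join(workdir, 'sshd_config'), 'w') as dst:
    dst.write(
        'Port {}\n'
        'AuthorizedKeysFile {}/authorized_keys\n'
        'PasswordAuthentication no\n'
        'AllowTcpForwarding no\n'
        'ForceCommand /usr/local/bin/backup-shell\n'.format(sshd_port, workdir))

# "-d" makes sshd serve a single connection and exit - the one-shot part.
sshd = subprocess.Popen(['/usr/sbin/sshd', '-d', '-f',
    os.path.join(workdir, 'sshd_config')])

# Hand the private key (plus the port to connect back to) to the backup host and
# wait for it to pull the fs; "request-backup" is a hypothetical script over there.
subprocess.check_call(['ssh', 'backup@' + backup_host,
    'request-backup --port {}'.format(sshd_port)], stdin=open(key))
sshd.wait()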

Minor details:

  • ssh pubkey authentication is used to open a secure channel to the backup host, precluding mitm attacks and keeping the whole thing non-interactive and cron-friendly.
  • The sshd has lowered nice/ionice and bandwidth priority, so it won't interfere with host operation in any way.
  • The backup host receives the link destination for rsync along with the private key, so it won't have to guess who requested the backup and which port it should use.
  • ForceCommand can actually point to the same "backup initiator" script, which will act as a shell with the full rsync command in the SSH_ORIGINAL_COMMAND env var, so additional checks or privilege manipulations can be performed immediately before the sync (see the wrapper sketch right after this list).
  • Minimal set of tools used: openssh, rsync and two (fairly simple) scripts on both ends.
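What such a ForceCommand wrapper might look like, as a minimal sketch (the allowed-command check is intentionally primitive, the cap/priority handling is only hinted at in comments, and the rsync-backup path is made up):

#!/usr/bin/env python
# ForceCommand "backup shell": sshd puts whatever the client asked to run
# into SSH_ORIGINAL_COMMAND, so it can be checked (and caps/priorities set up)
# right before exec'ing the real rsync server side.
import os, sys

cmd = os.environ.get('SSH_ORIGINAL_COMMAND', '')
# An rsync pull shows up on the remote end as "rsync --server --sender ...";
# anything else gets refused outright.
if not cmd.startswith('rsync --server --sender '):
    sys.exit('Not an rsync pull, refusing to run: {!r}'.format(cmd))

# nice/ionice and cap tweaks (e.g. making cap_dac_read_search effective)
# would go here, just before handing control over to rsync.
os.nice(10)
os.execv('/usr/local/bin/rsync-backup', cmd.split())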

Phew... and I started writing this just as an example of posix capabilities usage for the previous entry.
Guess I'll leave implementation details for the next one.

Feb 01, 2010

POSIX capabilities for python

I bet everyone who has done any sysadmin tasks on linux/*bsd/whatever has stumbled upon the need to elevate privileges for some binary or script.

And most of the time if there's any need for privileges at all, it's for the ones that only root has: changing uid/gid on files, full backup, moving stuff owned by root/other-uids, signaling daemons, network tasks, etc.

Most of these tasks require only a fragment of root's power, so capabilities(7) are a nice way to get what you need without compromising anything else. A great feature of caps is that they aren't inherited on exec, which seems to rule out most vulnerabilities for scripts, which don't usually suffer from C-like code shortcomings anyway, provided the interpreter itself is up-to-date.

However, I've found that support for capabilities in linux (gentoo in my case, but that seems to hold true for other distros) is quite lacking. While they've been around for quite a while, even the simplest ping util still has a suid bit instead of a single cap_net_* capability, daemons get root just to bind a socket on a privileged port, and service scripts get it just to send a signal to some pid.

For my purposes, I needed to back up the fs with rsync, synchronize data between laptops and control autofs/mounts, all that from py scripts, and using full root for any of these tasks isn't necessary at all.

First problem is to give limited capabilities to a script.

One way to get them is to get everything from sudo or suid bit (aka get root), then drop everything that isn't needed, which is certainly better than having root all the time, but still excessive, since I don't need full and inheritable root at any point.

Another way is to inherit caps from a cap-enabled binary. Just like suid, but you don't need to get all of them, they won't have to be inheritable, and it doesn't have to be root-or-nothing. This approach looks way nicer than the first one, so I decided to stick with it.

For a py script, it means the interpreter has to inherit some caps from something else, since it wouldn't be wise to give caps to all py scripts indiscriminately. "some_caps=i" (in the libcap text representation format, see cap_to_text(3)) or even "all=i" is certainly better.

To get caps from nothing, a simple C wrapper would suffice, but I'm a bit too lazy to write one for every script I run, so I wrote one that gets all the caps and drops them to the subset in the script file's inherited set. More on this (a bit unrelated) subject here.

That leads to the point where py code starts with some permitted, but not immediately effective, set of capabilities.

Tinkering with caps in C is possible via libcap and libcap-ng, and the only module for py seems to be the cap-ng bindings. And they do suck.

Not only is it a direct translation of the C calls, but the interface is sorely lacking as well. Say you need something extremely simple: to remove a cap from some set, to activate permitted caps as effective, or to copy them to the inherited set... well, there's no way to do that. What a tool. Funny thing, libcap can't do that in any obvious way either!

So here goes my solution - I dumped the whole cap-manipulation interface of both libs apart from the dump/restore to/from string functions, wrote a simple py-C interface to it and wrapped it all in a python OO interface - the Caps class.

And the resulting high-level py code to make permitted caps effective goes like this:

Caps.from_process().activate().apply()

To make permitted caps inheritable:

caps.propagnate().apply()

And the rest of the ops look just like this:

caps['inheritable'].add('cap_dac_read_search')
caps.apply()

Well, friendly enough for me, and less than a hundred lines of py code (which do all the work apart from load/save) for that.
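Pulling it together, a small sketch of how this could be used in a backup-ish script, assuming only the Caps interface shown above and an interpreter/wrapper started with cap_dac_read_search in its permitted set (the module path is a guess - the code lives in the toolkit mentioned below):

from fgc.caps import Caps  # assumed import path, adjust to the actual layout

caps = Caps.from_process()
caps.activate().apply()  # permitted -> effective, so DAC read checks are bypassed

# Readable now despite the root-only mode - that's cap_dac_read_search at work.
with open('/etc/shadow', 'rb') as src:
    data = src.read()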

While the code is part of a larger toolkit (fgc), it doesn't depend on any other part of it - just C module and py wrapper.

Of course, I was wondering why no one actually wrote something like this before, but it looks like not many people actually use caps at all, even though they're worth it - supported by the fact that I've managed to find a bug in the .32 and .33-rc* kernels preventing perhaps one of the most useful caps (cap_dac_read_search) from working ;(

Well, whatever.

Guess I'll write more about practical side and my application of this stuff next time.

Jan 30, 2010

Wheee, I've got a blog ;)

There are times when even the almighty google can't give a clear solution to some simple-enough problem, and that seems to be happening more and more frequently, so I thought I'd better write it all down somewhere, so here goes...

The idea formed quite a while ago, but I've always either dismissed it as useless or was too lazy to implement it.

Not that it's at all difficult to start a blog these days, but hosting it on some major platform like blogspot doesn't seem right to me, since I've gotten too used to being able to access the code and tweak anything I don't like (yes, open source has got me), and that would be pretty much impossible there.

The other extreme is writing my own platform from scratch.
Not a bad thought altogether, but too much of a wasted effort, especially since I don't really like web development, web design and associated voodoo rituals.
Besides, it'd be too buggy anyway ;)
So, I thought I'd get my hands on some simple blog engine that works out of the box and fit it to my purposes as needed.
Since I don't like php, 95% of such engines were out of the question.
Surprisingly few platforms are written in py or lisp, and I wasn't fond of the idea of weaving a separate cgi/fcgi module into my site.
Although that wouldn't have been much of a problem with twisted, since control over request handling there is absolute and expressed in simple py code, I stumbled upon my long-neglected google-apps account and the bloog project.
Having played with gapps about two years ago, I really liked the idea: you get all the flexibility you want without having to care about things like the db and some buggy api for it in the app, authorization and bot protection, the content serving mechanism, availability, even page generation, since google has django for that. In a nutshell, I got a very simple representation layer between gdb and django, easy to bend any way I want. As a bonus, bloog is not just a simple and stupid tool, but quite a nice and uniform restful api with a YUI-based client.
Two evenings of customization and I'm pretty happy with the result and completely familiar with the inner workings of the thing. Thanks to Bill Katz for sharing it.

All in all, it's an interesting experience. The blogosphere seems to have evolved into some kind of sophisticated ecosystem, with its own platforms, resources, syndication rules, etc. While I'm pretty sure I won't blend in, at least I can study it a bit.

So ends the first entry. Quite a bit more of it than I expected, actually.
More to come? I wonder.