Jan 29, 2017

Proxying ssh user connections to gitolite host transparently

Recently bumped into an apparently not-so-well-supported scenario: accessing a gitolite instance transparently on a host that is only reachable through some other gateway host (often called a "bastion" host in ssh context).

Something like this:

+---------------+
|               |   git@myhost.net:myrepo
|  dev-machine  ---------------------------+
|               |                          |
+---------------+                          |
                              +------------v------+
      git@gitolite:myrepo     |                   |
  +----------------------------  myhost.net (gw)  |
  |                           |                   |
+-v-------------------+       +-------------------+
|                     |
|    gitolite (gl)    |
|  host/container/vm  |
|                     |
+---------------------+

Here the gitolite instance might be running on a separate machine, or on the same "myhost.net", but inside a container or vm with a separate sshd daemon.

From any dev-machine you want to simply use git@myhost.net:myrepo to access repositories, but naturally that won't work, because in a normal configuration you'd hit the sshd on the gw host (myhost.net) and not the one on the gl host.

There are quite a few common options to work around this:

  • Use separate public host/IP for gitolite, e.g. git.myhost.net (!= myhost.net).

  • TCP port forwarding or similar tricks.

    E.g. simply forward ssh port connections in a "gw:22 -> gl:22" fashion, and have gw-specific sshd listen on some other port, if necessary.

    This can be made fairly painless with something like this in ~/.ssh/config for the odd-port sshds:

    Host myhost.net
      Port 1234
    Host git.myhost.net
      Port 1235
    

    Can also be configured in git via remote urls like ssh://git@myhost.net:1235/myrepo.

  • Use ssh port forwarding to essentially do the same thing as above, but with the resulting git port accessible on localhost.

  • Configure ssh to use ProxyCommand, which will log in to the gw host and set up forwarding through it, e.g. like in the snippet below.
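    Assuming the gl host is reachable from gw under some internal name/address - "gl.int" here is a made-up placeholder - something like this in ~/.ssh/config on a dev-machine should do it:

    Host git.myhost.net
      HostName gl.int
      ProxyCommand ssh -W %h:%p myhost.net

    (newer OpenSSH versions can use the ProxyJump option to the same effect)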

All of these, while suitable for some scenarios, are still nowhere near what I'd call "transparent", and require some additional configuration for each git client beyond just git remote add origin git@myhost.net:myrepo.

One advantage of such lower-level forwarding is that ssh authentication to gitolite is handled only on the gitolite host - the gw host has no clue about it.

If dropping this is not a big deal (e.g. because the gw host has root access to everything in the gl container anyway), there is a rather easy way, described below, to forward only git@myhost.net connections from gw to the gl host, authenticating them on gw instead.


Gitolite works by building an ~/.ssh/authorized_keys file with essentially a command="gitolite-shell gl-key-id" <gl-key> line for each public key pushed to the gitolite-admin repository.

Hence to proxy connections from gw, a similar key list should be available there, with key commands ssh'ing into the gitolite user/host and running the above commands there (with the original git command also passed along, taken from the SSH_ORIGINAL_COMMAND env-var).

To keep such a list up-to-date, a post-update trigger/hook for the gitolite-admin repo is needed, which can use the same git@gw login (with a special "gl auth admin" key) to update the key list on the gw host.
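To illustrate the idea, the gw-side proxy command boils down to roughly the following (a simplified sketch only, not the actual script linked below - gl_host_login and the argument handling here are placeholders):

#!/usr/bin/env python3
# Rough sketch of a gw-side proxy command - each authorized_keys entry on gw
# would run it as "gitolite-proxy <key-id>", same as gitolite-shell on the gl host.
import os, sys

gl_host_login = 'git@gl'  # gitolite user/host that connections get relayed to

key_id = sys.argv[1]
git_cmd = os.environ.get('SSH_ORIGINAL_COMMAND', '')

# Relay key-id and the original git command to the gl host over ssh, where a
# matching authorized_keys entry runs gitolite-shell for them.
os.execlp('ssh', 'ssh', '-qT', gl_host_login, key_id, git_cmd)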

Steps to implement/deploy whole thing:

  • Run useradd -m git on gw, and ssh-keygen -t ed25519 on both gw and gl hosts for the git/gitolite user.

  • Set up all connections for git@gw to be processed via a single "gitolite proxy" command, disallowing anything else, exactly like gitolite does for its users on the gl host.

    gitolite-proxy.py script (python3) that I came up with for this purpose can be found here: https://github.com/mk-fg/gitolite-ssh-user-proxy/

    It's rather simple and does two things:

    • When run with the --auth-update argument, it receives the gitolite authorized_keys list and builds the local ~/.ssh/authorized_keys from it and the authorized_keys.base file.

    • Similar to gitolite-shell, when run as gitolite-proxy key-id, it ssh'es into the gl host, passing the key-id and git command to it.

      This is done in a straightforward os.execlp('ssh', 'ssh', '-qT', ...) manner, no extra processing or any error-prone stuff like that.

    When installing it (to e.g. /usr/local/bin/gitolite-proxy as used below), be sure to set/update "gl_host_login = ..." line at the top there.

    For --auth-update, ~/.ssh/authorized_keys.base (note .base) file on gw should have this single line (split over two lines for readability, must be all on one line for ssh!):

    command="/usr/local/bin/gitolite-proxy --auth-update",no-port-forwarding
      ,no-X11-forwarding,no-agent-forwarding,no-pty ssh-ed25519 AAA...4u3FI git@gl
    

    Here ssh-ed25519 AAA...4u3FI git@gl is the key from ~git/.ssh/id_ed25519.pub on gitolite host.

    Also run:

    # install -m0600 -o git -g git ~git/.ssh/authorized_keys{.base,}
    # install -m0600 -o git -g git ~git/.ssh/authorized_keys{.base,.old}
    
    This is to have the initial auth files in place, not yet populated with gitolite-specific keys/commands.
    Note that only these two files need to be writable for the git user on the gw host.

  • From the gitolite (gl) host and user, run: ssh -qT git@gw < ~/.ssh/authorized_keys

    This is to test the gitolite-proxy setup above - it should populate ~git/.ssh/authorized_keys on the gw host and print back the gw host key and the proxy script to run as command="..." for it (ignore these, they will be installed by the trigger).

  • Add a trigger that runs after gitolite-admin repository updates on the gl host.

    • On the gl host, put this into ~git/.gitolite.rc right before the ENABLE line:

      LOCAL_CODE => "$rc{GL_ADMIN_BASE}/local",
      POST_COMPILE => ['push-authkeys'],
      
    • Commit/push the push-authkeys trigger script (also from the gitolite-ssh-user-proxy repo) to the gitolite-admin repo as local/triggers/push-authkeys, updating the gw_proxy_login line in there.

    gitolite docs on adding triggers: http://gitolite.com/gitolite/gitolite.html#triggers

Once the proxy command is in place on gw and the gitolite-admin hook has run at least once (to set up gw->gl access and the proxy command), the git@gw (git@myhost.net) ssh login spec can be used in exactly the same way as git@gl.

That is, fully transparent access to gitolite on a different host through that one user, while otherwise still allowing normal sshd use on the gw host, and without any forwarding tricks necessary on git clients.

The whole project, with maybe a slightly more refined process description and/or whatever fixes, can be found on github here: https://github.com/mk-fg/gitolite-ssh-user-proxy/

Huge thanks to sitaramc (the gitolite author) for suggesting on the ML how to best set up gitolite triggers for this purpose.

Sep 01, 2015

Transparent and easy encryption for files in git repositories

Have been installing things into OS containers (/var/lib/machines) lately, and looking for proper configuration management for these.

Large-scale container setups use hard-to-integrate things like etcd, where you have to template configuration from values stored there, which is not very convenient and has a very poor effort-to-results ratio (due to maintenance of that system itself) for a "10 service containers on 3 hosts" case.

Besides, such a centralized value store is a bit backwards for the one-container-per-service case, where most values in such a "central db" are specific to one container, and it's much easier to edit the end-result configs than to edit db values, then templates, and then check how it all gets rendered on every trivial tweak.

The usual solution I have for these setups is simply putting all confs under git control, but leaving all the secrets (e.g. keys, passwords, auth data) out of the repo, in case it gets pulled on other hosts, by different people, and for purposes which don't need these sensitive bits and might leak them (e.g. giving access to contracted app devs).

For more transient container setups, however, something should definitely keep track of these "secrets", as "rm -rf /var/lib/machines/..." is a much more realistic possibility there, and has its uses.


So my (non-original) idea here was to have one "master key" per host - just one short string - with which to encrypt all secrets for that host, which can then be shared between hosts and specific people (making these public might still be a bad idea), if necessary.

This key should then be simply stored in whatever key-management repo, written on a sticker and glued to a display, or something.

Git can be (ab)used for such encryption via its "filter" facilities, which are generally used for the opposite thing (normalization to one style), but are easy to adapt for this case too.

Git filters work by running a "clean" operation on selected paths (which can be wildcard patterns like "*.c") every time git itself stores these, and a "smudge" operation when showing them to the user and checking them out to a local copy (where they get edited).

In the case of encryption, "clean" would not be normalizing CR/LF line endings, but rather wrapping contents (or parts of them) into an encrypted binary blob, "smudge" would do the opposite, and gitattributes patterns would match the files to be encrypted.
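For reference, wiring up such a filter by hand looks something like this - the "crypt" filter name, paths and encrypt/decrypt commands below are just made-up placeholders:

% cat .gitattributes
secrets/*.conf filter=crypt

% git config filter.crypt.clean "my-encrypt-tool encrypt"
% git config filter.crypt.smudge "my-encrypt-tool decrypt"

(the tool described below sets all of this up by itself)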


Looking for projects that already do that, I found quite a few, but still decided to write my own tool, because none seemed to have all the things I wanted:

  • Use sane encryption.

    It's AES-CTR in the absolute best case, and AES-ECB (wtf!?) in some; sometimes openssl is called with the "password" on the command line (trivially visible in /proc).

    OpenSSL itself is a red flag - hard to believe that someone who knows how bad its API and primitives are still uses it willingly, for non-TLS, at least.

    Expected to find at least one project using AEAD through NaCl or something, but no such luck.

  • Have tool manage gitattributes.

    You don't add a file to a git repo by typing /path/to/myfile managed=version-control some-other-flags into some config, so why should you have to do it here?

  • Be easy to deploy.

    Ideally it'd be a script, not some c++/autotools project that needs build tools installed or packages built for every setup.

    Though a bash script is maybe taking it a bit too far, given how messy bash gets for anything non-trivial, secure and reliable across different environments.

  • Have "configuration repository" as intended use-case.

So I wrote the git-nerps python script to address all of these.

Crypto there is trivial yet solid PyNaCl stuff, marking files for encryption is as easy as git-nerps taint /what/ever/path, and bootstrapping the thing requires nothing more than python, git, PyNaCl (which are the norm in any of my setups) and git-nerps key-gen in the repo.

README for the project has info on every aspect of how the thing works and more on the ideas behind it.

I expect it'll grow a few more use-case-specific features and convenience-wrapper commands once I get to use it in more realistic cases than it has seen so far.


[project link]

Feb 04, 2013

codetag + tmsu: Tag all the Things (and Go!)

Was hacking something irrelevant together again and, as often happens with such things, realized that I had implemented something like that before.
It can be something simple - a locking function in python, an awk pipe to get some monitoring data, a chunk of argparse-based code to process multiple subcommands, a TLS wrapper for requests, a dbapi wrapper, a multi-module parser/generator for human-readable dates, a logging buffer, etc...
The point is - some short snippet of code is needed as a base for implementing something new, or maybe even to re-use as-is, yet it's not noteworthy enough on its own to split into a module or generally do anything specific about it.

This happens a lot to me - over the years, a lot of such ad-hoc yet reusable code gets written, and I can usually remember enough implementation details (e.g. which modules were used there, how the methods/classes were called and such), but running "grep" over the whole source dir takes a shitload of time.

Some things make it faster - ack or pss tools can scan only relevant files (like e.g. "grep ... **/*.py" will do in zsh), but these still run for minutes, as even a simple "find" does - there are several django source trees in the appengine sdk, php projects with 4k+ files inside, maybe even a whole linux kernel source tree or two in there...

Traversing all of these on a regular fs each time, to find something that could be rewritten in a few minutes, will never be an option for me, but luckily there are cool post-fs projects like tmsu, which allow transcending the single-hierarchy-index limitation of a traditional unix fs in a much more elegant and useful way than a gazillion symlinks and dentries.

tmsu allows attaching arbitrary tags to any files, then querying these files back using a set of tags, which it does really fast using an sqlite db and clever indexes there.
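Tagging a file with it looks like this (path here is made-up), and "tmsu files" queries tagged files back:

% tmsu tag myproject/db.py lang:py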

So, just tagging all the "*.py" files with "lang:py" allows this:

% time tmsu files lang:py | grep myclass
tmsu files lang:py  0.08s user 0.01s system 98% cpu 0.094 total
grep --color=auto myclass  0.01s user 0.00s system 10% cpu 0.093 total

That's 0.1s instead of several minutes for all the python code in the development area on this machine.

tmsu can actually do even cooler tricks than that with fuse-tagfs mounts, but that's all kinda wasted until all the files are tagged properly.
Which, of course, is a simple enough problem to solve.
So here's my first useful Go project - codetag.

I've added taggers for things that are immediately useful for me to tag files by - implementation language, code hosting (github, bitbucket, local project - as I sometimes remember that a snippet was in some public tool), scm type (git, hg, bzr, svn) - but adding a new one is just a matter of writing a "Tagger" function which, given the path and config, returns a list of string tags; plus taggers are only used if explicitly enabled in the config.

Other features include proper python-like logging and rsync-like filtering (but using more powerful re2 regexps instead of simple glob patterns).
Up-to-date list of these should be apparent from the included configuration file.

Being a proper compiled language, Go allows making the thing into a single static binary, which is quite neat, as I realized that I now have a tool to tag all the things everywhere - media files on servers' remote fs'es (like music and movies), hundreds of configuration files by the app they belong to (think tmsu files daemon:apache to find/grep all the horrible ".htaccess" things and its "*.conf" includes), distfiles by os package name, etc - all of which can be useful.

So, to paraphrase well-known meme, Tag All The Things! ;)

github link

Feb 07, 2012

Phasing out fossil completely

Having used git excessively for the last few days, I decided to ditch fossil scm at last.
All the stuff will be in git and mirrored on github (maybe later on bitbucket as well).
Will probably re-import the meta stuff (issues, wikis) from there into the main tree, but still can't find a nice-enough tool for that.
The closest thing seems to be Artemis, but it's for mercurial, so I'll probably need to port it to git first; shouldn't be too hard.

Also, I'm torn at this point between thoughts along the lines of "the selection of modern DVCSes spoils us" and "damn, why is there no clear popular + works-for-everything thing", but that's probably normal, as I have (or had) similar thoughts about a lot of technologies.

Feb 03, 2012

On github as well now

Following another hiatus from a day job, I finally have enough spare time to read some of the internets and do something about them.

For quite a while I had lots of quite small scripts and projects, which I kinda documented here (and on the site pages before that).
I've always kept them in some kind of scm - be it a system-wide repo for configuration files, the ~/.cFG repo for DE and misc user configuration and ~/bin scripts, or the ~/hatch repo I keep for misc stuff - but as their number grows, along with their size and complexity, I think maybe some of this stuff deserves a repo of its own, maybe some attention, and in the best-case scenario will even be useful to someone other than me.

So I thought I'd gradually push all this stuff out to github and/or bitbucket (still need to learn, or at least look at, hg for that!). github being the most obvious and easiest choice, I just created a few repos there and started the migration. More to come.

I still don't really trust a silo like github to keep anything reliably (besides, it lags like hell here, especially compared to the local servers I'm used to), so I need to devise some mirroring scheme asap.
The initial idea is to take some flexible tool (hg seems ideal, being python and a proper scm) and build hooks into local repos to push stuff out to mirrors from there, ideally to both bitbucket and github, also exploiting their metadata APIs to fetch stuff like tickets/issues and their commit history into a separate repo branch as well.

The effort should be somewhat justified by the fact that such repos will double as geo-distributed backups and shareable links, and I can learn more SCM internals along the way.

For now - me on github.

May 02, 2011

Fossil to Git export and mirroring

The biggest issue I have with fossil scm is that it's not git - there are just too many advanced tools which I've got used to with git over time, and which will probably never be implemented in fossil just because of its "lean single binary" philosophy.
And things get even worse when you need to bridge git-fossil repos - the common denominator here is git, so it's either a constant "export-merge-import" cycle or some hacks, since fossil doesn't support incremental export to a git repo out of the box (though it does have support for full import/export), and git doesn't seem to have a plugin to track fossil remotes (yet?).
I thought of migrating away from fossil, but there's just no substitute (although there are quite a lot of attempts to implement one) for distributed issue tracking and documentation right in the same repository, in a plain, easy-to-access format, with a sensible web frontend for those who don't want to install/learn an scm and clone the repo just to file a ticket.
None of the git-based tools I've been able to find seem to meet these (seemingly) simple criteria, so dual-stack it is then.
The solution I came up with is real-time mirroring of all the changes in fossil repositories to git.
It's quite a simple script, which:
  • watches fossil-path with inotify(7) for IN_MODIFY events (needs pyinotify for that)
  • checks for new revisions in the fossil (source) repo against the tip of the git one
  • compares these by timestamps, which are kept in perfect sync (by fossil-export as well)
  • exports revisions from fossil as full artifacts (blobs), importing these into git via git-fast-import
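The event-watching part of it boils down to roughly the following pyinotify loop (a sketch only - sync_fossil_to_git here is a made-up stand-in for the actual check/export/fast-import logic, and the watched path is arbitrary):

import pyinotify

def sync_fossil_to_git(path):
    # stand-in for the actual check/export/fast-import logic
    print('fossil repo changed:', path)

class Handler(pyinotify.ProcessEvent):
    def process_IN_MODIFY(self, event):
        if event.pathname.endswith('.fossil'):
            sync_fossil_to_git(event.pathname)

wm = pyinotify.WatchManager()
wm.add_watch('/srv/fossil', pyinotify.IN_MODIFY)
pyinotify.Notifier(wm, Handler()).loop()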

It's also capable of doing oneshot updates (in which case it doesn't need anything but python-2.7, git and fossil), bootstrapping git mirrors as new fossil repos are created, and catching up with their sync on startup.

While the script uses quite low-level (but standard and documented here and there) scm internals, it was actually very easy to write (~200 lines, mostly simple processing/generation code), because both scms in question are built upon principles of simple and robust design, which I deeply admire.

The resulting mirrors of fossil repos retain all the metadata like commit messages, timestamps and authors.
A limitation is that it only tracks one branch, specified at startup ("trunk" by default), and doesn't care about tags at the moment, but I'll probably fix the latter when I do some tagging next time (and hence will have a real-world test case).
It's also trivial to make the script do two-way synchronization, since fossil supports incremental updates via "fossil import --incremental" right from a git-fast-export stream, so it's just a simple pipe, which can be run on demand without any special tools.
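E.g. something like this (repo path is made-up), no special tools involved:

% git fast-export trunk | fossil import --git --incremental /path/to/repo.fossil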

Script itself.

fossil_echo --help:

usage: fossil_echo [-h] [-1] [-s] [-c] [-b BRANCH] [--dry-run] [-x EXCLUDE]
                      [-t STAT_INTERVAL] [--debug]
                      fossil_root git_root

Tool to keep fossil and git repositories in sync. Monitors fossil_root for
changes in *.fossil files (which are treated as source fossil repositories)
and pushes them to corresponding (according to basename) git repositories.
Also has --oneshot mode to do a one-time sync between specified repos.

positional arguments:
  fossil_root           Path to fossil repos.
  git_root              Path to git repos.

optional arguments:
  -h, --help            show this help message and exit
  -1, --oneshot         Treat fossil_root and git_root as repository paths and
                        try to sync them at once.
  -s, --initial-sync    Do an initial sync for every *.fossil repository found
                        in fossil_root at start.
  -c, --create          Dynamically create missing git repositories (bare)
                        inside git-root.
  -b BRANCH, --branch BRANCH
                        Branch to sync (must exist on both sides, default:
                        trunk).
  --dry-run             Dump git updates (fast-import format) to stdout,
                        instead of feeding them to git. Cancels --create.
  -x EXCLUDE, --exclude EXCLUDE
                        Repository names to exclude from syncing (w/o .fossil
                        or .git suffix, can be specified multiple times).
  -t STAT_INTERVAL, --stat-interval STAT_INTERVAL
                        Interval between polling source repositories for
                        changes, if there's no inotify/kevent support
                        (default: 300s).
  --debug               Verbose operation mode.

Apr 18, 2011

Key-Value storage with history/versioning on top of scm

Working with a number of non-synced servers remotely (via fabric) lately, I've found the need to push updates to a set of (fairly similar) files.

It's a bit of a different story for each server, of course - crontabs for a web backend with a lot of periodic maintenance, data-shuffling and cache-related tasks, firewall configurations, common html templates... well, you get the idea.
I'm not the only one making changes there, and without any change/version control for these sets of files, the state of each file/server combo is essentially unique, and an accidental change can only be reverted from a weekly backup.
Not really a sensible infrastructure as far as I can tell (or have gotten used to), but since I'm a total noob here, having worked for only a couple of weeks, global changes are out of the question, plus I've got my hands full with other tasks as it is.
So, I needed to change files, keeping the old state of each one in case a rollback is necessary, and actually check the remote state before updating files, since someone might've introduced either the same or a conflicting change while I was preparing mine.
The problem of conflicting changes can be solved by keeping some reference (local) state and just applying patches on top of it. If the file in question is important enough, having such state is doubly handy, since you can pull the remote state in case of changes there, look through the diff (if any) and then decide whether the patch is still valid or not.
The problem of rollbacks was solved long ago by various versioning tools.
Combined, the two issues kinda beg for some sort of storage with a history of changes for each value there, and since it's basically text, diffs and patches between any points of this history would also be nice to have.
That's the domain of SCMs, but my use-case is a bit more complicated than their usual usage, in that I need to create new revisions non-interactively - ideally via something like a key-value api (set, get, get_older_version), with the usual interactive interface to the history at hand in case of any conflicts or complications.
Being most comfortable with git, I looked for non-interactive db solutions on top of it, and the simplest one I've found was gitshelve. GitDB seems to be more powerful, but unnecessarily complex for my use-case.
Then I just implemented patch (update a key from a diff stream) and diff (generate a diff stream from a key and a file) methods on top of gitshelve, plus a writeback operation, and thus got a fairly complete implementation of what I needed.
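Just to illustrate the concept (this is not gitshelve's actual interface - function names and plumbing here are made up), such a versioned key-value api on top of a plain git work tree can be as simple as:

import os, subprocess

def _git(repo, *argv):
    return subprocess.check_output(
        ['git'] + list(argv), cwd=repo, universal_newlines=True)

def kv_set(repo, key, value):
    # store value in a file named after the key and commit it
    with open(os.path.join(repo, key), 'w') as dst: dst.write(value)
    _git(repo, 'add', '--', key)
    _git(repo, 'commit', '-m', 'update {}'.format(key))  # errors if value is unchanged

def kv_get(repo, key, rev='HEAD'):
    return _git(repo, 'show', '{}:{}'.format(rev, key))

def kv_get_older_version(repo, key, n=1):
    # value as of n commits ago
    return kv_get(repo, key, rev='HEAD~{}'.format(n))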

Looking at such storage from a DBA perspective, it looks pretty good - integrity and atomicity are assured by git locking, all sorts of replication and merging are possible in a quite efficient and robust manner via git-merge and friends, and the cli interface and transparency of operation are just superb. Raw storage performance is probably far from db level though, but that's not an issue in my use-case.

Here are gitshelve and state.py, as used in my fabric stuff. The fabric imports can just be dropped there without much problem (I use the fabric api to vary keys depending on host).

It's a pity I'm far more used to git than to pure-py solutions like mercurial or bazaar, since it'd probably have been much cleaner and simpler to implement such storage on top of them - they probably expose a python interface directly.
Guess I'll put rewriting the thing on top of hg on my long todo list.

Apr 25, 2010

Exherbo / paludis fossil syncer

So far I like the exherbo way of package management and base system layout.
I haven't migrated my desktop environment to it yet, but I expect that it shouldn't be a problem, since I don't mind either porting all the stuff I need from gentoo or writing exheres for it from scratch.
The first challenge I've faced, though, was due to my recent addiction to fossil scm, which seems neither to be in any of the exherbo repos listed in the "unavailable" meta-repository, nor to have a syncer for paludis, so I wrote my own dofossil syncer and created the repo.
The syncer should support both fossil+http:// and fossil+file:// protocols, and it tries to rebuild repository data from the artifact storage, should it encounter any errors in the process.

Repository, syncer and some instructions are here.

Thought I'd give google some keywords, should someone be looking for the same thing, although I'll probably try to push it into paludis and/or the "unavailable" repo when (and if) I get a more solid grasp on exherbo concepts.

Apr 17, 2010

Thoughts on VCS, supporting documentation and Fossil

I've been a happy git user for several years now, and the best thing about it is that I've learned how VCSes, and git in particular, work under the hood.
It expanded (and in most aspects probably formed) my view on time-series data storage - very useful knowledge for a wide range of purposes, from log or configuration storage to snapshotting, backups and filesystem synchronisation. Another similar revelation in this area was probably rrdtool, but on a much smaller scale.
A few years back, I kept virtually no history of my actions, only keeping my work in CVS/SVN, and even that just for ease of collaboration.
Today, I can easily trace, sync and transfer virtually everything that changes and is important in my system - the code I'm working on, all the configuration files (even auto-generated ones), task and thought lists, state-description files like lists of installed packages (local sw state) and the gentoo-portage tree (global sw state), even all the logs and binary blobs like rootfs in rsync-hardlinked backups for the past few months.

Git is a great help in these tasks, but what I feel is lacking there is, first, a common timeline (spanning both into the past and the future) for all these data series, and second, documentation.

Solution to the first one I've yet to find.

The second one is partially solved by commit messages, inline comments and even this blog for past issues, and by simple todo-lists (some I keep in plaintext, some in the tudu app) for the future.
The biggest problem I see here is the lack of consistency between all these: todo tasks end up as dropped lines in the git log without any link to past issues or a reverse link to the original idea or vision, and that's just for the changes.

Documentation for anything more than local implementation details and their history is virtually non-existent, and most times it takes a lot of effort and time to retrace the original line of thought, reasoning and purpose behind the stuff I've done (and why I've done it that way) in the past, often with considerable gaps and eventual re-invention of wheels and re-treading of pitfalls I've already been through, due to faulty biological memory.

So, today I've decided to scour the available project and task management software to find something that ties vcs repositories and their logs together with future tickets and some sort of expanded notes, where needed.

The starting point was actually trac, which I've used quite extensively in the past and present, and am quite fond of for its outward simplicity combined with fully-featured capabilities as both a wiki engine and an issue tracker. Better yet, it's py and can work with a vcs.
The downside is that it's still a separate service, and a web-based one at that, meaning that it's online-only and that the content is anchored to the server I deploy it to (not to mention the underlying vcs). Hell, it's centralized and laggy, and ever since git's branching and merging ideas of decentralized work took root in my brain, I have an issue with that.

It just looks like a completely wrong approach for my task, yet I thought I could probably tolerate it if there were no better options - and then I stumbled upon Fossil VCS.

The name actually rang a bell, but from the 9p universe, where it's the name of a vcs-like filesystem which was (along with venti, which it can be layered on top of) one of the two primary reasons I even looked into plan9 (the other being its 9p/styx protocol).
The similarly-named VCS hasn't disappointed me either, at least conceptually. The main win is the integrated ticket system and wiki, providing just the thing I need in a distributed, versioned vcs environment.

Fossil's overall design principles and concepts (plus this) are well-documented on its site (which is just a fossil repo itself), and the catch-points for me were:

  • Tickets and wiki, of course. These can be edited locally, synced, distributed, and have local settings and appearance, based on a tcl-ish domain-specific language.
  • Distributed nature, yet a rational position of the authors on the centralization and synchronization topic.
  • The all-in-one-static-binary approach! Installing hundreds of git binaries onto every freebsd-to-debian-based system was a pain, plus I've ended up with a 1.4-1.7 version span, and some features (like "add -p") depend on a whole lot of stuff there, like perl and a damn lot of its modules. The unix way is cool, but this is really more portable and distributed-way-friendly.
  • The repository is a single package, and not just a binary blob, but a freely-browsable sqlite db. It certainly is a hell of a lot more convenient than a path with over nine thousand sha1-named blobs, even if the actual artifact storage here is built basically the same way. And the performance should actually be better than the fs - with just index selects, BTree-based sqlite is as fast as a filesystem, while keeping different indexes on an fs means sym-/hardlinking, which is a pain that is never done right.
  • An as-simple-as-possible internal blob format.
  • The actual symbolism and terminology. Git is a faceless tool, Fossil has some sort of a style, and that's nice ;)

Yet there are some things I don't like about it:

  • HTTP-only sync. In what kind of twisted world can that be better than ssh+pam or direct access? It can be fixed with a wrapper, I guess, but really, wtf...
  • The SQLite container around the generic artifact storage. Artifacts are pure data with a single sha1sum key each, and that is simple, solid and easy to work with anytime, but wrapped into an sqlite db it suddenly depends on this db format, libs, command-line tool or language bindings, etc. All the other tables can be rebuilt just from these blobs, so they should be as accessible as possible, but I guess that would violate the whole single-file design concept and would require a lot of separate management code - a pity.

But that's nothing more than a few hours' tour of the docs and basic hello-world tests; I guess it will all look different after I use it for a while, which I intend to do right now. In the worst case it's just a distributed issue tracker + wiki with a cli interface and great versioning support in a one-file package (including a webserver), which is more than I can say about trac, anyway.
