Having used git excessively for the last few days, decided to ditch
fossil scm at last.
Will probably re-import meta stuff (issues, wikis) from there into the main
tree, but still can't find a nice-enough tool for that.
Closest thing seems to be
Artemis,
but it's for mercurial, so I'll probably need to port it to git first -
shouldn't be too hard.
Also, I'm torn at this point between thoughts along the lines of "the selection
of modern DVCSes spoils us" and "damn, why is there no clear, popular,
works-for-everything thing", but it's probably normal, as I have (or had)
similar thoughts about lots of technologies.
Following another hiatus from a day job, I finally have enough spare time to
read some of the internets and do something about them.
For quite a while I had lots of quite small scripts and projects, which I
kinda documented here (and on the site pages before that).
I always kept them in some kind of scm - be it a system-wide repo for
configuration files, a ~/.cfg repo for DE and misc user configuration and ~/bin
scripts, or the ~/hatch repo I keep for misc stuff - but as their number grows,
as well as their size and complexity, I think maybe some of this stuff deserves
a repo of its own, maybe some attention, and, best-case scenario, will even be
useful to someone other than me.
So I thought to gradually push all this stuff out to github and/or bitbucket
(still need to learn or at least look at hg for that!). github being the most
obvious and easiest choice, I just created a few repos there and started the
migration. More to come.
Still don't really trust a silo like github to keep anything reliably (besides
it lags like hell here, especially compared to local servers I'm kinda used
to), so need to devise some mirroring scheme asap.
Initial idea is to take some flexible tool (hg seems to be ideal, being python
and a proper scm) and build hooks into local repos to push stuff out to
mirrors from there, ideally both bitbucket and github, also exploiting their
metadata APIs to fetch stuff like tickets/issues and commit history of these
into a separate repo branch as well.
Effort should be somewhat justified by the fact that such repos will be
geo-distributed backups with shareable links, and I can learn more SCM
internals along the way.
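Rough sketch of the hook part of that idea (assuming an hg hook here, with
"github" and "bitbucket" being remotes defined in the repo's [paths] section -
all of these names are just examples, not an actual implementation):

import subprocess

# Hypothetical in-process mercurial hook, enabled with something like
#   [hooks]
#   commit.mirror = python:/path/to/mirror_hook.py:push_mirrors
# "github" / "bitbucket" are assumed to be remotes in the repo's [paths].
def push_mirrors(ui, repo, **kwargs):
    for remote in ('github', 'bitbucket'):
        # shell out to the hg binary to keep the hook itself trivial
        subprocess.call(['hg', '--cwd', repo.root, 'push', '--new-branch', remote])
    return False  # a truthy return value would mean "hook failed"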
For now - me on github.
Should've done it a long time ago, actually. I was totally sure it'd be a much
harder task, but then recently I've had some spare time and decided to do
something about this binary crap, and, looking for possible solutions, stumbled
upon apparmor.
A while ago I've used
SELinux (which was the
reason why I thought it'd have to be hard) and kinda considered LSM-based
security as the kind of heavy-handed no-nonsense shit you choose NOT to deal
with if you have the choice, but apparmor totally proved this to be a silly
misconception, which I'm insanely happy about.
With apparmor, it's just one file with a set of permissions, which can be
loaded/enforced/removed at runtime, no xattrs (and associated maintenance
burden) or huge and complicated policies like SELinux has.
For good whole-system security SELinux still seems to be a better approach, but
not for confining a few crappy apps on an otherwise general-purpose system.
On top of that, it's also trivially easy to install on a generic system - only
the kernel LSM and one userspace package are needed.
Case in point - skype apparmor profile, which doesn't allow it to access
anything but ~/.Skype, /opt/skype and a few other system-wide things:
#include <tunables/global>

/usr/bin/skype {
  #include <abstractions/base>
  #include <abstractions/user-tmp>
  #include <abstractions/pulse>
  #include <abstractions/nameservice>
  #include <abstractions/ssl_certs>
  #include <abstractions/fonts>
  #include <abstractions/X>
  #include <abstractions/freedesktop.org>
  #include <abstractions/kde>
  #include <abstractions/site/base>
  #include <abstractions/site/de>

  /usr/bin/skype mr,
  /opt/skype/skype pix,
  /opt/skype/** mr,
  /usr/share/fonts/X11/** m,

  @{PROC}/*/net/arp r,
  @{PROC}/sys/kernel/ostype r,
  @{PROC}/sys/kernel/osrelease r,

  /dev/ r,
  /dev/tty rw,
  /dev/pts/* rw,
  /dev/video* mrw,

  @{HOME}/.Skype/ rw,
  @{HOME}/.Skype/** krw,

  deny @{HOME}/.mozilla/ r, # no idea what it tries to get there
  deny @{PROC}/[0-9]*/fd/ r,
  deny @{PROC}/[0-9]*/task/ r,
  deny @{PROC}/[0-9]*/task/** r,
}
"deny" lines here are just to supress audit warnings about this paths,
everything is denied by default, unless explicitly allowed.
Compared to "default" linux DAC-only "as user" confinement, where it has access
to all your documents, activities, smartcard, gpg keys and processes, ssh keys
and sessions, etc - it's a huge improvement.
An even more useful confinement is firefox and its plugin-container process
(which can - and does, in my configuration - have a separate profile), where
the known-to-be-extremely-exploitable adobe flash player runs.
Before apparmor, I mostly relied on the FlashBlock extension to keep Flash in
check somehow, but at some point I noticed that plugin-container with
libflashplayer.so seems to be running regardless of FlashBlock and whether
flash is displayed on pages or not. I don't know if it's just a warm-start,
check-run or something, but it still looks like a possible hole.
I'm actually quite surprised that I failed to find functional profiles for
common apps like firefox and pulseaudio on the internets, aside from some blog
posts like this one.
In theory, Ubuntu and SUSE should have these, since apparmor is developed and
deployed there by default (afaik), so maybe google just hasn't picked these
files up in the package manifests, and all I needed was to go over the packages
by hand. Not sure that would've been much faster or more productive than
writing the profiles myself though.
Update 2015-11-25: with "ask-password" caching implemented as of systemd-227
(2015-10-07), a better way would be to use that in-kernel caching, though it
likely requires systemd running in initramfs (e.g. dracut has had that for a while).
Up until now I've used lvm on top of a single full-disk dm-crypt partition.
It seems easiest to work with - no need to decrypt individual lvs, no
confusion between what's encrypted (everything but /boot!) and what's not, etc.
The main problem with it though is that it's harder to have non-encrypted parts,
everything is encrypted with the same keys (unless there are several dm-crypt
layers), and it's bad for SSDs - dm-crypt still (as of 3.0) doesn't pass any
TRIM requests through, leading to a nasty
write amplification effect, even more so with the full
disk given to dm-crypt+lvm.
While there's hope that the SSD issues
will be kinda-solved
(with an optional security trade-off) in 3.1, it's still much easier to keep
different distros or some decrypted-when-needed partitions with dm-crypt after
lvm, so I've decided to go with the latter for the new 120G SSD.
Also, such a scheme allows re-creating encrypted lvs, issuing TRIM for the old
ones, thus recycling the blocks even w/o support for this in dm-crypt.
Same as with the
previous initramfs,
I've had a simple "openct" module (udev there makes it even easier) in
dracut to find an inserted smartcard
and use it to obtain the encryption key, which is used once to decrypt the only
partition on which everything resides.
Since the only goal of dracut is to find root and get-the-hell-outta-the-way,
it won't even try to decrypt all the /var and /home stuff without serious
ideological changes.
The problem is actually solved in generic distros by
plymouth, which gets the
password(s), caches them, and provides them to dracut and systemd (or whatever
comes as the real "init"). I don't need a splash screen, and actually hate it
for hiding all the info that scrolls in its place, so plymouth is a no-go for me.
Having a hack to obtain and cache the key for dracut by non-conventional means
anyway, I just needed to pass it further to systemd, and since they share a
common /run tmpfs these days, that basically means not rm'ing it in dracut after use.
Luckily, the system-wide password handling mechanism in systemd is well-documented
and easily extensible beyond plymouth and the default console prompt.
So whole key management in my system goes like this now:
- dracut.cmdline: create udev rule to generate key.
- dracut.udev.openct: find smartcard, run rule to generate and cache
key in /run/initramfs.
- dracut.udev.crypt: check for cached key or prompt for it (caching
result), decrypt root, run systemd.
- systemd: start post-dracut-crypt.path unit to monitor
/run/systemd/ask-password for password prompts, along with default
.path units for fallback prompts via wall/console.
- systemd.udev: discover encrypted devices, create key requests.
- systemd.post-dracut-crypt.path: start post-dracut-crypt.service to
read cached passwords from /run/initramfs and use these to satisfy
requests.
- systemd.post-dracut-crypt-cleanup.service (after local-fs.target is
activated): stop post-dracut-crypt.service, flush caches, generate
new one-time keys for decrypted partitions.
The end result is a passwordless boot with this new layout, which seems to be
spoofable only by getting root during that process somehow, with altering the
unencrypted /boot to run some extra code and then reverting it back being the
most obvious possibility.
It's kinda weird that there doesn't seem to be any caching in place already -
surely not everyone with dm-crypt is using plymouth?
The most complicated piece here is probably the password agent (in python),
which could actually have been simpler if I hadn't followed the proper
guidelines and instead thought a bit around them.
For example, the whole inotify handling thing (I've used it via
ctypes) could be dropped in favor of a .path unit
with a DirectoryNotEmpty= activation condition - it's there already;
PolicyKit authorization just isn't working
at such an early stage; there doesn't seem to be much need to check request
validity since sending replies to sockets is racy anyway; etc.
Still, a good exercise.
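For reference, the reply side of the agent boils down to something like this
(a bare sketch of the protocol as I understand it - the real script linked
below also watches the directory via inotify and pulls cached keys from
/run/initramfs):

import configparser, glob, os, socket

def answer_requests(passphrase, ask_dir='/run/systemd/ask-password'):
    # Each pending prompt is an ini-style ask.* file pointing at an
    # AF_UNIX datagram socket; a "+<password>" datagram satisfies it
    # ("-" would signal failure instead).
    for req in glob.glob(os.path.join(ask_dir, 'ask.*')):
        ini = configparser.ConfigParser()
        ini.read(req)
        reply = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            reply.sendto(b'+' + passphrase.encode(), ini['Ask']['Socket'])
        finally:
            reply.close()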
Python password agent for systemd.
Unit files to start
and stop it on demand.
Two questions:
- How to tell which pids (or groups of forks) eat most swap right now?
- How much RAM one apache/php/whatever really consumes?
Somehow people keep pointing me at "top" and "ps" tools to do this sort of
thing, but there's an obvious problem:
#include <stdlib.h>
#include <unistd.h>

/* long, to avoid overflowing int in "2 * G" below */
#define G (1024L * 1024 * 1024)

int main (void) {
    /* allocate 2 GiB but never touch it - no pages actually get mapped in */
    (void *) malloc(2 * G);
    sleep(10);
    return 0;
}
This code will immediately float to 1st position in top, sorted by "swap" (F p
<return>), showing 2G even with no swap in the system.
The second question/issue is also common but somehow not universally recognized,
which becomes kinda obvious when scared admins (or whoever happens to ssh into a
web backend machine) see N pids of something summing up to more than the total
amount of RAM in the system, like 50 httpd processes at 50M each.
It gets even worse when tools like "atop" helpfully aggregate the numbers
("atop -p"), showing that there are 6 sphinx processes, eating 15G on a
machine with 4-6G physical RAM + 4-8G swap, causing local panic and mayhem.
The answer is, of course, that sphinx, apache and pretty much anything using
worker processes share a lot of memory pages between their processes, and not
just because of shared objects like libc.
Guess it's just general ignorance of how memory works in linux (or other
unix-like OSes) among those who never had to write a fork() or deal with malloc
in C, which kinda makes lots of these concepts look fairly trivial.
So, more out of curiosity than real need, I decided to find a way to
answer these questions.
proc(5) reveals this data more-or-less via "maps" / "smaps" files, but that
needs some post-processing to give per-pid numbers.
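That post-processing isn't much work by itself - something like this simplified
sketch (just the idea, not the actual script) is enough to get private/shared/swap
totals per mapped path for one pid:

from collections import defaultdict

def smaps_totals(pid):
    # Sum the Private_*/Shared_*/Swap kB counters from /proc/<pid>/smaps
    # (field names as documented in proc(5)), grouped by mapped path.
    stats = defaultdict(lambda: dict(private=0, shared=0, swap=0))
    path = '[anon]'
    with open('/proc/{}/smaps'.format(pid)) as src:
        for line in src:
            if not line[:1].isupper():
                # mapping header: "start-end perms offset dev inode [path]"
                fields = line.split(None, 5)
                path = fields[5].strip() if len(fields) > 5 else '[anon]'
                continue
            key, _, val = line.partition(':')
            if key.startswith(('Private_', 'Shared_')) or key == 'Swap':
                kind = 'swap' if key == 'Swap' else key.split('_')[0].lower()
                stats[path][kind] += int(val.split()[0])  # values are in kB
    return dict(stats)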
The closest tools I was able to find were pmap from the procps package and the
ps_mem.py
script from a coreutils
maintainer. The former seems to give only mapped memory region sizes, the latter
cleverly shows shared memory divided by the number of similar processes,
omitting per-process numbers and swap.
Oh, and of course there are glorious valgrind and gdb, but both seem to be
active debugging tools, not much suitable for normal day-to-day operation
conditions and a bit too complex for the task.
So I thought I'd write my own tool for the job to put the matter at rest once
and for all, so I can later point people at it and just say "see?" (although I
bet it'll never be that simple).
Idea is to group similar processes (by cmd) and show details for each one, like
this:
agetty:
  -stats:
    private: 252.0 KiB
    shared: 712.0 KiB
    swap: 0
  7606:
    -stats:
      private: 84.0 KiB
      shared: 712.0 KiB
      swap: 0
    -cmdline: /sbin/agetty tty3 38400
    /lib/ld-2.12.2.so:
      -shared-with: rpcbind, _plutorun, redshift, dbus-launch, acpid, ...
      private: 8.0 KiB
      shared: 104.0 KiB
      swap: 0
    /lib/libc-2.12.2.so:
      -shared-with: rpcbind, _plutorun, redshift, dbus-launch, acpid, ...
      private: 12.0 KiB
      shared: 548.0 KiB
      swap: 0
    ...
    /sbin/agetty:
      -shared-with: agetty
      private: 4.0 KiB
      shared: 24.0 KiB
      swap: 0
    /usr/lib/locale/locale-archive:
      -shared-with: firefox, redshift, tee, sleep, ypbind, pulseaudio [updated], ...
      private: 0
      shared: 8.0 KiB
      swap: 0
    [anon]:
      private: 20.0 KiB
      shared: 0
      swap: 0
    [heap]:
      private: 8.0 KiB
      shared: 0
      swap: 0
    [stack]:
      private: 24.0 KiB
      shared: 0
      swap: 0
    [vdso]:
      private: 0
      shared: 0
      swap: 0
  7608:
    -stats:
      private: 84.0 KiB
      shared: 712.0 KiB
      swap: 0
    -cmdline: /sbin/agetty tty4 38400
    ...
  7609:
    -stats:
      private: 84.0 KiB
      shared: 712.0 KiB
      swap: 0
    -cmdline: /sbin/agetty tty5 38400
    ...
So it's obvious that there are 3 agetty processes, which ps will report as 796
KiB RSS:
root 7606 0.0 0.0 3924 796 tty3 Ss+ 23:05 0:00 /sbin/agetty tty3 38400
root 7608 0.0 0.0 3924 796 tty4 Ss+ 23:05 0:00 /sbin/agetty tty4 38400
root 7609 0.0 0.0 3924 796 tty5 Ss+ 23:05 0:00 /sbin/agetty tty5 38400
Each of these, in fact, consumes only 84 KiB of RAM, with 24 KiB more shared
between all agettys as the /sbin/agetty binary; the rest of the stuff like ld
and libc is shared system-wide (the shared-with list contains pretty much every
process in the system), so it won't be freed by killing agetty, and starting 10
more of them will consume ~1 MiB, not ~10 MiB, as "ps" output might suggest.
"top" will show ~3M of "swap" (same with "SZ" in ps) for each agetty, which is
also obviously untrue.
More machine-friendly (flat) output might remind one of sysctl:
agetty.-stats.private: 252.0 KiB
agetty.-stats.shared: 712.0 KiB
agetty.-stats.swap: 0
agetty.7606.-stats.private: 84.0 KiB
agetty.7606.-stats.shared: 712.0 KiB
agetty.7606.-stats.swap: 0
agetty.7606.-cmdline: /sbin/agetty tty3 38400
agetty.7606.'/lib/ld-2.12.2.so'.-shared-with: ...
agetty.7606.'/lib/ld-2.12.2.so'.private: 8.0 KiB
agetty.7606.'/lib/ld-2.12.2.so'.shared: 104.0 KiB
agetty.7606.'/lib/ld-2.12.2.so'.swap: 0
agetty.7606.'/lib/libc-2.12.2.so'.-shared-with: ...
...
Script. No dependencies
needed, apart from python 2.7 or 3.X (works with both w/o conversion).
Some optional parameters are supported:
usage: ps_mem_details.py [-h] [-p] [-s] [-n MIN_VAL] [-f] [--debug] [name]

Detailed process memory usage accounting tool.

positional arguments:
  name                  String to look for in process cmd/binary.

optional arguments:
  -h, --help            show this help message and exit
  -p, --private         Show only private memory leaks.
  -s, --swap            Show only swapped-out stuff.
  -n MIN_VAL, --min-val MIN_VAL
                        Minimal (non-inclusive) value for tracked parameter
                        (KiB, see --swap, --private, default: 0).
  -f, --flat            Flat output.
  --debug               Verbose operation mode.
For example, to find what hogs more than 500K swap in the system:
# ps_mem_details.py --flat --swap -n 500
memcached.-stats.private: 28.4 MiB
memcached.-stats.shared: 588.0 KiB
memcached.-stats.swap: 1.5 MiB
memcached.927.-cmdline: /usr/bin/memcached -p 11211 -l 127.0.0.1
memcached.927.[anon].private: 28.0 MiB
memcached.927.[anon].shared: 0
memcached.927.[anon].swap: 1.5 MiB
squid.-stats.private: 130.9 MiB
squid.-stats.shared: 1.2 MiB
squid.-stats.swap: 668.0 KiB
squid.1334.-cmdline: /usr/sbin/squid -NYC
squid.1334.[heap].private: 128.0 MiB
squid.1334.[heap].shared: 0
squid.1334.[heap].swap: 660.0 KiB
udevd.-stats.private: 368.0 KiB
udevd.-stats.shared: 796.0 KiB
udevd.-stats.swap: 748.0 KiB
...or what eats more than 20K in agetty pids (should be useful to see which .so
or binary "leaks" in a process):
# ps_mem_details.py --private --flat -n 20 agetty
agetty.-stats.private: 252.0 KiB
agetty.-stats.shared: 712.0 KiB
agetty.-stats.swap: 0
agetty.7606.-stats.private: 84.0 KiB
agetty.7606.-stats.shared: 712.0 KiB
agetty.7606.-stats.swap: 0
agetty.7606.-cmdline: /sbin/agetty tty3 38400
agetty.7606.[stack].private: 24.0 KiB
agetty.7606.[stack].shared: 0
agetty.7606.[stack].swap: 0
agetty.7608.-stats.private: 84.0 KiB
agetty.7608.-stats.shared: 712.0 KiB
agetty.7608.-stats.swap: 0
agetty.7608.-cmdline: /sbin/agetty tty4 38400
agetty.7608.[stack].private: 24.0 KiB
agetty.7608.[stack].shared: 0
agetty.7608.[stack].swap: 0
agetty.7609.-stats.private: 84.0 KiB
agetty.7609.-stats.shared: 712.0 KiB
agetty.7609.-stats.swap: 0
agetty.7609.-cmdline: /sbin/agetty tty5 38400
agetty.7609.[stack].private: 24.0 KiB
agetty.7609.[stack].shared: 0
agetty.7609.[stack].swap: 0
I've delayed updating the whole libnotify / notification-daemon /
notify-python stack for a while now, because notification-daemon got too
GNOME-oriented around 0.7, making it a lot simpler, but sadly dropping
lots of good stuff I've used there.
The default nice-looking theme is gone in favor of black blobs (although colors
are probably subject to gtkrc); it's one-notification-at-a-time only, which
makes reading them intolerable; configurability was dropped as well - guess the
blobs follow some gnome-panel settings now.
Older notification-daemon versions won't build with newer libnotify.
Same problem with notify-python, which seems to be unnecessary now, since its
functionality is accessible via introspection and
PyGObject (the part known as
PyGI before the merge - gi.repository.Notify).
Looking for more-or-less drop-in replacements, I've found the
notipy project, which looked like what I
needed, and the best part is that it's python - no need to filter notification
requests in a proxy anymore, eliminating some associated complexity.
The project has somewhat different goals however - simplicity, fewer deps
and concept separation - so I incorporated (more or less) notipy as a simple
NotificationDisplay class into notification-proxy, making it into
notification-thing
(first name that came to mind, not that it matters).
All the rendering now is in python using the PyGObject (gi) / gtk-3.0 toolkit,
which seems to be a good idea, given that I still have no reason to keep Qt in
my system and gtk-2.0 is obsolete.
Exploring the newer Gtk stuff like
css styling and honest
auto-generated interfaces was fun, although the whole mess seems to be much
harder than expected. Simple things like adding a border, margins or some
non-solid background to existing widgets seem to be very complex and totally
counter-intuitive, unlike, say, doing the same (even in a totally cross-browser
fashion) with html. I also failed to find a way to just draw what I want on
arbitrary widgets - looks like it was removed (in favor of GtkDrawable) on
purpose.
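To be fair, hooking the css itself up via gi is simple enough - a minimal
sketch (the selectors and properties here are just an illustration and seem to
vary between gtk-3.x versions):

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk, Gdk

# Apply a screen-wide stylesheet - a border plus a background for all
# windows (example selectors, adjust to taste / gtk version).
css = b'''
window { border: 2px solid #657b83; }
window { background-color: #fdf6e3; }
'''
provider = Gtk.CssProvider()
provider.load_from_data(css)
Gtk.StyleContext.add_provider_for_screen(
    Gdk.Screen.get_default(), provider,
    Gtk.STYLE_PROVIDER_PRIORITY_APPLICATION)

win = Gtk.Window(title='css test')
win.connect('destroy', Gtk.main_quit)
win.show_all()
Gtk.main()

The plumbing isn't the problem - figuring out which selectors and properties a
given widget actually honors is.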
My (uneducated) guess is that gtk authors geared toward a "one way to do one
thing" philosophy, but unlike the Python motto, they had to ditch the "one
*obvious* way" part. But then, maybe it's just me being too lazy to read the
docs properly.
Looking over the Desktop Notifications
Spec in the process, I've
noticed that there are more good ideas in there that I'm not using, so I guess
I might need to revisit my local notification setup in the near future.
Usually I was using fabric to clone similar stuff to many machines, but since
I've been deploying
csync2 everywhere to
sync some web templates, and I'm not the only one introducing changes, it
occurred to me that it'd be great to use it for scripts as well.
The problem I see there is security - most scripts I need to sync are cronjobs
executed as root, so updating some script on one (compromised) machine with
"rm -Rf /*" and running csync2 to push this change to other machines will
cause a lot of trouble.
So I came up with a simple way to provide one-time keys to csync2 hosts, which
will be valid only when I want them to be.
The idea is to create a FIFO in place of the key file on remote hosts, then just
pipe a key into each one while the script is running on my dev
machine. The simplest form of such a "pipe" I could come up with is "ssh host
'cat >remote.key.fifo'" - no fancy sockets, queues or protocols.
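Stripped of the twisted machinery of the actual script (linked below), the core
of it is roughly this (a sketch; the key path matches the csync2 config further
down, hosts/keys come from wherever):

import subprocess

def push_key(hosts, key, fifo='/var/lib/csync2/privileged.key'):
    # Spawn one "cat > fifo" per host over ssh and feed each the one-time
    # key; cat blocks until csync2 on that host opens the fifo for reading.
    procs = []
    for host in hosts:
        proc = subprocess.Popen(
            ['ssh', host, 'cat >{}'.format(fifo)], stdin=subprocess.PIPE)
        proc.stdin.write(key.encode())
        proc.stdin.close()
        procs.append(proc)
    return [proc.wait() for proc in procs]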
That way, even if one host is compromised, changes can't be propagated to
other hosts without access to the fifos there and knowing the right
key. Plus, accidentally running sync for that "privileged" group will just
result in a hang until the script pushes data to the fifo - nothing will
break down or crash horribly, it'll just wait.
Key can be spoofed of course, and sync can be timed to the moment the keys are
available, so the method is far from perfect, but it's insanely fast and
convenient.
The implementation is a
fairly simple
twisted eventloop, spawning ssh
processes (guess
twisted.conch or stuff like
paramiko could be used for the ssh implementation there, but
neither performance nor flexibility is an issue with the ssh binary).
The script also (by default) figures out the hosts to connect to from the
provided group name(s) and the local copy of the csync2 configuration file, so
I don't have to keep a separate list of these or specify them each time.
As always, twisted makes it insanely simple to write such IO-parallel loop.
csync2 can be configured like this:
group sbin_sync {
  host host1 host2;
  key /var/lib/csync2/privileged.key;
  include /usr/local/sbin/*.sh
}
And then I just run it with something like "./csync2_unlocker.py sbin_sync"
when I need to replicate updates between hosts.
Source.
I think in an ideal world this shouldn't be happening - it really is a job for
a proper database engine.
Some filesystems (reiserfs,
pomegranate) are fairly good at dealing with
such use-cases, but not the usual tools for working with fs-based data,
which generally suck up all the time and resources on such a mess.
In my particular case, there's a (mostly) legacy system which keeps such a
tons-of-files db with ~5M files, taking about 5G of space, which has to be
backed up somehow. Every file can be changed, added or unlinked; total
consistency between parts (like snapshotting the same point in time for every
file) is not necessary. Contents are (typically) php serializations (yuck!).
Tar and rsync are prime examples of tools that aren't quite fit for the task -
both eat huge amounts of RAM (gigs) and time to do this, especially when you
have to make these backups incremental, and ideally this path should be
backed up every single day.
Both seem to build some large and not very efficient list of existing files in
memory and then do a backup against that. Neither is really good at capturing
the state - increments either take huge amounts of space/inodes (with rsync
--link-dest) or lose info on removed entries (tar).
Nice off-the-shelf alternatives are
dar, which
is not an fs-to-stream packer, but rather a squashfs-like image builder with the
ability to make proper incremental backups, and of course
mksquashfs itself, which supports appending these days.
These sound nice, but somehow I failed to check for "append" support in
squashfs (although I remember hearing about it before), plus there still
doesn't seem to be a way to remove paths.
So I ended up hacking together my own packer on top of berkdb instead.
Results turned out to be really good - 40min to back all this stuff up from
scratch and under 20min to do an incremental update (mostly comparing the
timestamps plus adding/removing new/unlinked keys). The implementation on top
of berkdb also turned out to be fairly straightforward (just 150 lines total!),
with just a little bit of optimization magic to put higher-level paths before
the ones nested inside (by adding \0 and \1 bytes before the basename for
file/dir).
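That "magic" amounts to a key encoding along these lines (a sketch of the idea,
not the exact code from the script):

import os

def db_key(path, isdir):
    # Both \0 and \1 sort before any byte that can appear in a filename,
    # so the entry for a directory ("<parent>/\x01<name>") always sorts
    # before everything nested inside it ("<dir>/..."), and an ordered
    # btree scan yields directories before their contents on extract.
    parent, name = os.path.split(path.rstrip('/'))
    return '{}/{}{}'.format(parent, '\1' if isdir else '\0', name)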
I still need to test it against dar and squashfs when I have more time on my
hands (as if that will ever happen), but even such a makeshift python
implementation (which does include full "extract" and "list" functionality)
has proven to be sufficient and ended up in a daily crontab.
So much for the infamous "don't keep the files in a database!" argument, btw -
wonder if the original developers of this "db" used that hype to justify this
mess...
Obligatory proof-of-concept code link.
Update: tried mksquashfs, but quickly pulled the plug as it started to eat
more than 3G of RAM - sadly unfit for the task as well. dar also ate ~1G and
had been at it for a few hours; guess no tool cares about such fs use-cases at all.
The biggest issue I have with
fossil scm is that
it's not
git - there are just too many advanced tools
which I got used to with git over time, and which will probably never be
implemented in fossil just because of its "lean single binary" philosophy.
And things get even worse when you need to bridge git-fossil repos - the common
denominator here is git, so it's either a constant "export-merge-import" cycle
or some hacks, since fossil doesn't support incremental export to a git repo
out of the box (but it does have
support for full import/export), and git
doesn't seem to have a plugin to track fossil remotes (yet?).
I thought of migrating away from fossil, but there's just no substitute
(although there are
quite a lot of attempts to implement one) for distributed issue tracking and
documentation kept right in the same repository, in a plain, easy-to-access
format, with a sensible web frontend for those who don't want to install/learn
an scm and clone the repo just to file a ticket.
None of the git-based tools I've been able to find seem to meet these
(seemingly) simple criteria, so dual-stack it is then.
The solution I came up with is real-time mirroring of all the changes in fossil
repositories to git, which works by:
- watching fossil-path with inotify(7) for IN_MODIFY events (needs
pyinotify for that)
- checking for new revisions in the fossil (source) repo against the tip of
the git mirror
- comparing these by timestamps, which are kept in perfect sync (by
fossil-export as well)
- exporting revisions from fossil as full artifacts (blobs) and
importing these into git via git-fast-import
It's also capable of doing oneshot updates (in which case it doesn't need
anything but python-2.7, git and fossil), bootstrapping git mirrors as new
fossil repos are created, and catching up with their sync on startup.
While the script uses quite low-level (but standard and documented here and
there) scm internals, it was actually very easy to write (~200 lines, mostly
simple processing/generation code), because both scms in question are built
upon principles of simple and robust design, which I deeply admire.
The limitation is that it only tracks one branch, specified at startup ("trunk"
by default), and doesn't care about tags at the moment, but I'll probably
fix the latter next time I do some tagging (and hence have a real-world test
case).
It's also trivial to make the script do two-way synchronization, since fossil
supports "fossil import --incremental" update right from git-fast-export, so
it's just a simple pipe, which can be run w/o any special tools on demand.
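E.g. pulling git history back into a fossil repo is just that pipe - sketched
here in python with made-up paths and branch for illustration, though a plain
shell one-liner does the same:

import subprocess

# Hypothetical repo paths/branch; this just pipes "git fast-export"
# into "fossil import --git --incremental" as described above.
git_export = subprocess.Popen(
    ['git', '--git-dir', '/srv/git/thing.git', 'fast-export', 'trunk'],
    stdout=subprocess.PIPE)
subprocess.check_call(
    ['fossil', 'import', '--git', '--incremental', '/srv/fossil/thing.fossil'],
    stdin=git_export.stdout)
git_export.stdout.close()
git_export.wait()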
Script itself.
fossil_echo --help:
usage: fossil_echo [-h] [-1] [-s] [-c] [-b BRANCH] [--dry-run] [-x EXCLUDE]
                   [-t STAT_INTERVAL] [--debug]
                   fossil_root git_root

Tool to keep fossil and git repositories in sync. Monitors fossil_root for
changes in *.fossil files (which are treated as source fossil repositories)
and pushes them to corresponding (according to basename) git repositories.
Also has --oneshot mode to do a one-time sync between specified repos.

positional arguments:
  fossil_root           Path to fossil repos.
  git_root              Path to git repos.

optional arguments:
  -h, --help            show this help message and exit
  -1, --oneshot         Treat fossil_root and git_root as repository paths and
                        try to sync them at once.
  -s, --initial-sync    Do an initial sync for every *.fossil repository found
                        in fossil_root at start.
  -c, --create          Dynamically create missing git repositories (bare)
                        inside git-root.
  -b BRANCH, --branch BRANCH
                        Branch to sync (must exist on both sides, default:
                        trunk).
  --dry-run             Dump git updates (fast-import format) to stdout,
                        instead of feeding them to git. Cancels --create.
  -x EXCLUDE, --exclude EXCLUDE
                        Repository names to exclude from syncing (w/o .fossil
                        or .git suffix, can be specified multiple times).
  -t STAT_INTERVAL, --stat-interval STAT_INTERVAL
                        Interval between polling source repositories for
                        changes, if there's no inotify/kevent support
                        (default: 300s).
  --debug               Verbose operation mode.
xdiskusage(1) is a simple and useful
tool to visualize disk space usage (a must-have thing in any admin's
toolkit!).
Probably the best thing about it is that it's built on top of the "du" command,
so if there's a problem with free space on a remote X-less server, just "ssh
user@host 'du -k' | xdiskusage" and in a few moments you'll get an idea of
where the space has gone.
Lately though I've had problems building fltk, and noticed that xdiskusage is
the only app that uses it on my system, so I just got rid of both, in hopes
that I'll be able to find some lite gtk replacement (don't have qt either).
Maybe I do suck at googling (or just giving up too early), but
filelight (kde util),
baobab (gnome util) and
philesight (ruby) are pretty much the only
alternatives I've found. The first one drags in half of kde, the second one
half of gnome, and I don't really need ruby in my system either.
And for what? xdiskusage seems to be totally sufficient and much easier to
interpret (apparently it's a lot easier for me to compare lengths than angles)
than the stupid round graphs that filelight and its ruby clone produce, plus it
looks like a no-brainer to write.
There are some CLI alternatives as well, but this task is definitely outside
of the CLI domain.
So I wrote this tool. The real source is
actually coffeescript, here; the JS is compiled from it.
Initially I wanted to do this in python, but then took a break to read some
reddit and blogs, which just happened to push me in the direction of the
web. Good thing they did, too, as it turned out to be simple and
straightforward to work with graphics there these days.
I didn't use the (much-hyped) html5 canvas though, since svg seems to be a much
better fit in the html world, plus it's much easier to make it interactive
(titles, events, changes, etc).
Aside from the intended stuff, the tool also shows performance shortcomings in
the firefox and opera browsers - both are horribly slow at pasting large text
into a textarea (or an iframe with "design mode") and just slow at rendering
svg. Google chrome is fairly good at both tasks.
Not that I'll migrate all my firefox addons/settings and habits to chrome
anytime soon, but it's certainly something to think about.
Also, the JS calculations can probably be made a hundred times faster by caching
sizes of the traversed subtrees (right now they're recalculated a gozillion
times over, and that's basically all the work).
I was just too lazy to do it initially, and textarea pasting is still a lot
slower than the JS anyway, so it doesn't seem to be a big deal, but I guess
I'll do that eventually.