My blog_title

Feb 26, 2011

cgroups initialization, libcgroup and my ad-hoc replacement for it

Update 2019-10-02: This still works, but only for old cgroup-v1 interface, and is deprecated as such, plus largely unnecessary with modern systemd - see cgroup-v2 resource limits for apps with systemd scopes and slices post for more info.

Linux control groups (cgroups) rock.

If you've never used them at all, you bloody well should.

"git gc --aggressive" of a linux kernel tree killing your disk and cpu?

Background compilation makes desktop apps sluggish? Apps step on each others' toes? Disk activity totally kills performance?

I've lived with all of the above on the desktop in the (not so distant) past and cgroups just make all this shit go away - even forkbombs and super-multithreaded i/o can just be isolated in their own cgroup (from which there's no way to escape, not with any amount of forks) and scheduled/throttled (cpu hard-throttling - w/o cpusets - seems to be coming soon as well) as necessary.

Some problems with process classification and management of these limits seem to exist though.

Systemd does a great job of classification of everything outside of user session (i.e. all the daemons) - any rc/cgroup can be specified right in the unit files or set by default via system.conf.

And it also makes all this stuff neat and tidy because cgroup support there is not even optional - it's the basic mechanism on which systemd is built, used to isolate all the processes belonging to one daemon or the other in place of hopefully-gone-for-good crappy and unreliable pidfiles. No renegade processes, leftovers, pid mixups... etc, ever.

Bad news however is that all the cool stuff it can do ends right there.
Classification is nice, but there's little point in it from resource-limiting
perspective without setting the actual limits, and systemd doesn't do that
(recent thread on systemd-devel).
Besides, no classification for user-started processes means that desktop users
are totally on their own, since all resource consumption there branches off
the user's fingertips. And desktop is where responsiveness actually matters
for me (as in "me the regular user"), so clearly something is needed to create
cgroups, set limits there and classify processes to be put into these cgroups.

libcgroup project looks like the remedy at
first, but since I started using it about a year ago, it's been nothing but
the source of pain.
First task that stands before it is to actually create cgroups' tree, mount
all the resource controller pseudo-filesystems and set the appropriate limits
there.
libcgroup project has cgconfigparser for that, which is probably the most
brain-dead tool one can come up with. Configuration is stone-age pain in the
ass, making you clench your teeth, fuck all the DRY principles and write
N*100 line crap for even the simplest tasks with as much punctuation as
possible to cram in w/o making eyes water.
Then, that cool toy parses the config, giving no indication where you messed
it up but the dreaded message like "failed to parse file". Maybe it's not
harder to get right by hand than XML, but at least XML-processing tools give
some useful feedback.

Syntax aside, tool still sucks hard when it comes to apply all the stuff
there - it either does every mount/mkdir/write w/o error or just gives you the
same "something failed, go figure" indication. Something being already mounted
counts as failure as well, so it doesn't play along with anything, including
systemd.
Worse yet, when it inevitably fails, it starts a "rollback" sequence,
unmounting and removing all the shit it was supposed to mount/create.
After killing all the configuration you could've had, it will fail
anyway. strace will show you why, of course, but if that's the feedback
mechanism the developers had in mind...

Surely, classification tools there can't be any worse than that? Wrong, they certainly are.

Maybe C-API is where the project shines, but I have no reason to believe that, and luckily I don't seem to have any need to find out.

Luckily, cgroups can be controlled via regular filesystem calls, and are thoroughly documented (in Documentation/cgroups).
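
To illustrate the "regular filesystem calls" point, here's a minimal Python sketch of the two basic operations on the cgroup-v1 interface: mkdir creates a group, writing to a knob-file sets a limit, and writing a pid to "tasks" classifies a process. Function names are made up for this example and the /sys/fs/cgroup/cpu path in the note below assumes the controller is already mounted there.

```python
import os

def create_cgroup(controller_root, name, knobs=None):
    """Create a cgroup directory and write the given knob values into it."""
    path = os.path.join(controller_root, name)
    os.makedirs(path, exist_ok=True)  # mkdir is all it takes to create a cgroup
    for knob, value in (knobs or {}).items():
        with open(os.path.join(path, knob), "w") as dst:
            dst.write(f"{value}\n")
    return path

def classify(cgroup_path, pid=None):
    """Move a process into the cgroup by writing its pid to the "tasks" file."""
    pid = os.getpid() if pid is None else pid
    with open(os.path.join(cgroup_path, "tasks"), "w") as dst:
        dst.write(f"{pid}\n")
```

Something like create_cgroup("/sys/fs/cgroup/cpu", "tagged/cave", {"cpu.shares": 100}) followed by classify() on the returned path would then be roughly what the libcgroup tools do behind all that configuration.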

Anyways, my humble needs (for the purposes of this post) are:

isolate compilation processes, usually performed by "cave" client of paludis package mangler (exherbo) and occasionally shell-invoked make in a kernel tree, from all the other processes;
insulate specific "desktop" processes like firefox and occasional java-based crap from the rest of the system as well;
create all these hierarchies in a freezer and have a convenient stop-switch for these groups.

So, how would I initially approach it with libcgroup? Ok, here's the cgconfig.conf:

### Mounts

mount {
  cpu = /sys/fs/cgroup/cpu;
  blkio = /sys/fs/cgroup/blkio;
  freezer = /sys/fs/cgroup/freezer;
}

### Hierarchical RCs

group tagged/cave {
  perm {
    task {
      uid = root;
      gid = paludisbuild;
    }
    admin {
      uid = root;
      gid = root;
    }
  }

  cpu {
    cpu.shares = 100;
  }
  freezer {
  }
}

group tagged/desktop/roam {
  perm {
    task {
      uid = root;
      gid = users;
    }
    admin {
      uid = root;
      gid = root;
    }
  }

  cpu {
    cpu.shares = 300;
  }
  freezer {
  }
}

group tagged/desktop/java {
  perm {
    task {
      uid = root;
      gid = users;
    }
    admin {
      uid = root;
      gid = root;
    }
  }

  cpu {
    cpu.shares = 100;
  }
  freezer {
  }
}

### Non-hierarchical RCs (blkio)

group tagged.cave {
  perm {
    task {
      uid = root;
      gid = users;
    }
    admin {
      uid = root;
      gid = root;
    }
  }

  blkio {
    blkio.weight = 100;
  }
}

group tagged.desktop.roam {
  perm {
    task {
      uid = root;
      gid = users;
    }
    admin {
      uid = root;
      gid = root;
    }
  }

  blkio {
    blkio.weight = 300;
  }
}

group tagged.desktop.java {
  perm {
    task {
      uid = root;
      gid = users;
    }
    admin {
      uid = root;
      gid = root;
    }
  }

  blkio {
    blkio.weight = 100;
  }
}

Yep, it's huge, ugly and stupid.

Oh, and you have to do some chmods afterwards (more wrapping!) to make the "group ..." lines actually matter.

So, what do I want it to look like? This:

path: /sys/fs/cgroup

defaults:
  _tasks: root:wheel:664
  _admin: root:wheel:644
  freezer:

groups:

  base:
    _default: true
    cpu.shares: 1000
    blkio.weight: 1000

  tagged:
    cave:
      _tasks: root:paludisbuild
      _admin: root:paludisbuild
      cpu.shares: 100
      blkio.weight: 100

    desktop:
      roam:
        _tasks: root:users
        cpu.shares: 300
        blkio.weight: 300
      java:
        _tasks: root:users
        cpu.shares: 100
        blkio.weight: 100

It's parseable and readable YAML, not some parenthesis-semicolon nightmare of a C junkie (you may think that because of these, spaces don't matter there btw... well, think again!).

After writing that config-I-like-to-see, I just spent a few hours to write a script to apply all the rules there while providing all the debugging facilities I can think of and wiped my system clean of libcgroup, it's that simple.

Didn't have to touch the parser again or debug it either (especially with - god forbid - strace), everything just worked as expected, so I thought I'd dump it here jic.

Configuration file above (YAML) consists of three basic definition blocks:

"path" to where cgroups should be initialized.
Names for the created and mounted rc's are taken right from "groups" and
"defaults" sections.
Yes, that doesn't allow mounting "blkio" resource controller to "cpu"
directory, guess I'll go back to using libcgroup when I'd want to do
that... right after seeing the psychiatrist to have my head examined...  if
they'd let me go back to society afterwards, that is.

"groups" with actual tree of group parameter definitions.
Two special nodes here - "_tasks" and "_admin" - may contain (otherwise the
stuff from "defaults" is used) ownership/modes for all cgroup knob-files
("_admin") and "tasks" file ("_tasks"), these can be specified as
"user[:group[:mode]]" (with brackets indicating optional definition, of
course) with non-specified optional parts taken from the "defaults" section.
Limits (or any other settings for any kernel-provided knobs there, for that
matter) can either be defined on per-rc-dict basis, like this:

roam:
  _tasks: root:users
  cpu:
    shares: 300
  blkio:
    weight: 300
    throttle.write_bps_device: 253:9 1000000

Or just with one line per rc knob, like this:

roam:
  _tasks: root:users
  cpu.shares: 300
  blkio.weight: 300
  blkio.throttle.write_bps_device: 253:9 1000000
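
Both spellings carry the same information, so one way to handle them is to normalize a leaf cgroup node into a flat "rc -> knob-file -> value" mapping before touching the filesystem. The sketch below is an assumption about how that could be done, not the actual cgconf code, and flatten_knobs is a made-up name. It expects the already-parsed YAML dict of a single leaf group, where an empty section parses as None.

```python
def flatten_knobs(node):
    """Return {rc: {"rc.knob": value}} for both nested and dotted spellings."""
    result = {}
    for key, value in node.items():
        if key.startswith("_"):
            continue  # "_tasks", "_admin", "_default" are not kernel knobs
        if isinstance(value, dict) or value is None:
            # Per-rc dict form ("cpu: {shares: 300}"), possibly empty
            knobs = result.setdefault(key, {})
            for knob, knob_value in (value or {}).items():
                knobs[f"{key}.{knob}"] = knob_value
        else:
            # One-line form ("cpu.shares: 300"), rc is the part before first dot
            rc = key.split(".", 1)[0]
            result.setdefault(rc, {})[key] = value
    return result
```

An rc that ends up with an empty dict here still gets its cgroup created, just with no knobs written.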

Empty dicts (like "freezer" in "defaults") will just create cgroup in a named rc, but won't touch any knobs there.
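
The "user[:group[:mode]]" specs for "_tasks" and "_admin" can be resolved against the "defaults" values with a few lines of Python. This is only a sketch of the fallback rule as described, with a made-up parse_perms name and the assumption that the default spec is always a full three-part string.

```python
def parse_perms(spec, default):
    """Parse "user[:group[:mode]]", filling omitted parts from default."""
    fields = default.split(":")
    if len(fields) != 3:
        raise ValueError(f"default must be user:group:mode, got {default!r}")
    given = str(spec).split(":") if spec else []
    if len(given) > 3:
        raise ValueError(f"too many fields in {spec!r}")
    for idx, part in enumerate(given):
        if part:  # empty or missing parts keep the default value
            fields[idx] = part
    user, group, mode = fields
    return user, group, int(mode, 8)  # mode is written in octal, like chmod
```

With the config above, parse_perms("root:paludisbuild", "root:wheel:664") keeps the 664 mode from "defaults" and only swaps the group.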

And the "_default" parameter indicates that every pid/tid listed in the root "tasks" file of the resource controllers specified in this cgroup should belong to it. That is, it acts as the default cgroup for any tasks not classified into any other cgroup.

"defaults" section mirrors the structure of any leaf cgroup. RCs/parameters here will be used for created cgroups, unless overridden in "groups" section.

Script to process this stuff (cgconf) can be run with
--debug to dump a shitload of info about every step it takes (and why it does
that), plus with --dry-run flag to just dump all the actions w/o actually
doing anything.
cgconf can be launched as many times as needed to get the job done - it won't
unmount anything (what for? out of fear of data loss on a pseudo-fs?), will
just create/mount missing stuff, adjust defined permissions and set defined
limits without touching anything else, thus it will work alongside
everything that can also be using these hierarchies - systemd, libcgroup,
ulatencyd, whatever... just set what you need to adjust in .yaml and it will be
there after run, no side effects.
cgconf.yaml
(.yaml, generally speaking) file can be put alongside cgconf or passed via the
-c parameter.
Anyway, -h or --help is there, in case of any further questions.

That handles the limits and initial (default cgroup for all tasks) classification part, but then chosen tasks also need to be assigned to dedicated cgroups.
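
For reference, that initial classification (the "_default" group adopting everything from the root "tasks" file) boils down to a loop like the one below. It's a sketch under the cgroup-v1 assumption that "tasks" takes one pid per write, with a made-up adopt_all_tasks name, not the code from cgconf itself.

```python
import os

def adopt_all_tasks(controller_root, default_group):
    """Move every task listed in the root "tasks" file into default_group."""
    with open(os.path.join(controller_root, "tasks")) as src:
        pids = [line.strip() for line in src if line.strip()]
    target = os.path.join(controller_root, default_group, "tasks")
    moved = 0
    for pid in pids:
        try:
            # One pid per write(), hence the file is reopened for each of them
            with open(target, "w") as dst:
                dst.write(pid + "\n")
            moved += 1
        except OSError:
            pass  # task has exited meanwhile, or the kernel refused the move
    return moved
```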

libcgroup has pam_cgroup module and cgred daemon, neither of which can sensibly (re)classify anything within a user session, plus cgexec and cgclassify wrappers to basically do "echo $$ >/.../some_cg/tasks && exec $1" or just "echo" respectively.

These are dumb simple, nothing done there to make them any easier than echo, so even using libcgroup I had to wrap these.

Since I knew exactly which (few) apps should be confined to which groups, I just wrote simple wrapper scripts for each, putting these in a separate dir, in the head of PATH. Example:

#!/usr/local/bin/cgrc -s desktop/roam
/usr/bin/firefox

cgrc script here is a
dead-simple wrapper that parses the cgroup parameter, puts itself into the
corresponding cgroup within every rc where it exists (making a special
conversion for the not-yet-hierarchical blkio - there's a patchset for that
though: http://lkml.org/lkml/2010/8/30/30) and then exec's the specified
binary with all the passed arguments.
All the parameters after cgroup (or "-g ", for the sake of clarity) go to the
specified binary. "-s" option indicates that script is used in shebang, so
it'll read command from the file specified in argv after that and pass all the
further arguments to it.
Otherwise cgrc script can be used as "cgrc -g /usr/bin/firefox " or
"cgrc. /usr/bin/firefox ", so it's actually painless and effortless to use
this right from the interactive shell. Amen for the crappy libcgroup tools.
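
The core of such a wrapper can be sketched in a few lines of Python. Assumptions here: every controller is mounted as a subdirectory of one root (like /sys/fs/cgroup), the group is given as a full path like "tagged/desktop/roam", and the blkio "conversion" is the same slash-to-dot renaming that the cgconfig.conf example above uses. The names task_files and exec_in_cgroup are invented for the example, and the real cgrc option parsing is left out.

```python
import os

FLAT_CONTROLLERS = {"blkio"}  # assumption: controllers without nested groups

def task_files(root, group):
    """Yield the "tasks" file of group in every controller where it exists."""
    for rc in sorted(os.listdir(root)):
        name = group.replace("/", ".") if rc in FLAT_CONTROLLERS else group
        tasks = os.path.join(root, rc, name, "tasks")
        if os.path.exists(tasks):
            yield tasks

def exec_in_cgroup(root, group, argv):
    """Join group everywhere it exists, then replace this process with argv."""
    for tasks in task_files(root, group):
        with open(tasks, "w") as dst:
            dst.write(f"{os.getpid()}\n")
    os.execvp(argv[0], argv)  # children inherit the cgroup membership
```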

Another special use-case for cgroups I've found useful on many occasions is a "freezer" thing - no matter how many processes compilation (or whatever other cgroup-confined stuff) forks, they can be instantly and painlessly stopped and resumed afterwards.

cgfreeze dozen-liner script addresses this need in my case - "cgfreeze cave" will stop "cave" cgroup, "cgfreeze -u cave" resume, and "cgfreeze -c cave" will just show its current status, see -h there for details. No pgrep, kill -STOP or ^Z involved.
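
Under the hood that's just the cgroup-v1 "freezer.state" knob, which accepts FROZEN and THAWED on write and reports the current state on read. A rough Python equivalent of the three operations (function names are made up, and the cgroup_dir argument would be something like /sys/fs/cgroup/freezer/tagged/cave):

```python
import os

def freezer_state(cgroup_dir):
    """Return the current state string, e.g. FROZEN, FREEZING or THAWED."""
    with open(os.path.join(cgroup_dir, "freezer.state")) as src:
        return src.read().strip()

def set_frozen(cgroup_dir, frozen=True):
    """Stop (or resume) every task in the cgroup with a single write."""
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as dst:
        dst.write("FROZEN\n" if frozen else "THAWED\n")
```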

Guess I'll direct the next poor soul struggling with libcgroup here, instead of wasting time explaining how to work around that crap and facing the inevitable question "what else is there?" *sigh*.

All the mentioned scripts can be found here.

posted on 2011-02-26 20:28 YEKT

Dec 29, 2010

Sane playback for online streaming video via stream dumping

I rarely watch footage from various conferences online, usually because I have some work to do and video takes much more dedicated time than the same thing just written on a webpage, making it basically a waste of time, but sometimes it's just fun.

Watching familiar "desktop linux complexity" holywar right on the stage of "Desktop on the Linux..." presentation of 27c3 (here's the dump, available atm, better one should probably appear in the Recordings section) certainly was, and since there are few other interesting topics on schedule (like DJB's talk about high-speed network security) and I have some extra time, I decided not to miss the fun.

Problem is, "watching stuff online" is even worse than just "watching stuff" - either you pay attention or you just miss it, so I set up recording as a sort of "fs-based cache", at the very least to watch the recorded streams right as they get written, being able to pause or rewind, should I need to do so.

Natural tool to do the job is mplayer, with its "-dumpstream" flag.
It works well up until some network (remote or local) error or just an mplayer
glitch, which seems to happen quite often.
That's when mplayer crashes with funny "Core dumped ;)" line and if you're
watching the recorded stream atm, you'll certainly miss it at the time,
noticing the fuckup when whatever you're watching ends abruptly and the
real-time talk is already finished.
Somehow, I managed to forget about the issue and got bit by it soon enough.

So, mplayer needs to be wrapped in a while loop, but then you also need to give dump files unique names to keep mplayer from overwriting them, and actually do several such loops for several streams you're recording (different halls, different talks, same time), and then probably because of strain on the streaming servers mplayer tends to reconnect several times in a row, producing lots of small dumps, which aren't really good for anything, and you'd also like to get some feedback on what's happening, and... so on.
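
For a single stream, that wish-list (restart loop, unique names, dropping tiny dumps, some feedback) fits into a small blocking Python function. This is a simplified sketch, not the multi-stream script described further on: dump_loop and its parameters are invented here, and the dumper command is passed in as a callable so nothing about mplayer is hardcoded.

```python
import os
import subprocess
import time

def dump_loop(make_cmd, out_dir, min_size=1 << 20, max_runs=None, pause=1.0):
    """Run a dumper repeatedly, one uniquely-named file per run.

    make_cmd(path) must return the argv list that dumps the stream to path.
    Dumps smaller than min_size bytes are deleted, kept paths are returned.
    """
    kept, run = [], 0
    while max_runs is None or run < max_runs:
        run += 1
        stamp = time.strftime("%Y%m%d_%H%M%S")
        path = os.path.join(out_dir, f"dump_{stamp}_{run:03d}")
        code = subprocess.call(make_cmd(path))
        size = os.path.getsize(path) if os.path.exists(path) else 0
        if size < min_size:
            if os.path.exists(path):
                os.remove(path)  # reconnect debris, not worth keeping
            print(f"run {run}: exit={code}, {size}B dump discarded")
        else:
            kept.append(path)
            print(f"run {run}: exit={code}, kept {path} ({size}B)")
        time.sleep(pause)  # don't hammer the server on instant failures
    return kept
```

With mplayer, make_cmd could be something like lambda path: ["mplayer", "-dumpstream", "-dumpfile", path, url], where url is whatever stream you're after.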

Well, I just needed a better tool, and it looks like there aren't many simple non-gui dumpers for video+audio streams and not many libs to connect to http video streams from python, existing one being vlc bindings, which probably isn't any better than mplayer, provided all I need is just to dump a stream to a file, without any reconnection or rotation or multi-stream handling mechanism.

To cut the story short I ended up writing a bit more complicated eventloop
script to control several mplayer instances, aggregating (and marking each
accordingly) their output, restarting failed ones, discarding failed dumps and
making up sensible names for the produced files.
It was a quick ad-hoc hack, so I thought to implement it straight through
signal handling and poll loop for the output, but thinking about all the async
calls and state-tracking it'd involve I quickly reconsidered and just used
twisted to shove all this mess under the rug, ending up with quick and simple
100-liner.
Script code,
twisted is required.

And now, back to the ongoing talks of day 3.

posted on 2010-12-29 11:09 YEKT

desktop python caching web

Dec 25, 2010

Commandline pulseaudio mixer tool

Some time ago I decided to check out pulseaudio project, and after a few docs it was controlling all the
sound flow in my system, since then I've never really looked back to pure
alsa.
At first I just needed the sound-over-network feature, which I've extensively
used a few years ago with esd, and pulse offered full transparency here, not
just limited support. Hell, it even has simple-client netcat'able stream
support, so there's no need to look for a client on alien OS'es.
Controllable and centralized resampling was the next nice feat, because some
apps (notably, audacious and aqualung) seemed to waste quite a lot of
resources on it in the past, either because of unreasonably-high quality or
just suboptimal algorithms, I've never really cared to check. Alsa should be
capable of doing that as well, but for some reason it failed me in this regard
before.

One major annoyance though was the absence of a simple tool to control volume
levels.
pactl seems to be only good for muting the output, while the rest of pa-stuff
on the net seems to be based on either gtk or qt, while I needed something to
bind to hotkeys and quickly run inside a readily-available terminal.
Maybe it's just an old habit of using alsamixer for this, but replacing it
with heavy gnome/kde tools for such a simple task seems unreasonable, so I
thought: since it's modern daemon with a lot of well-defined interfaces, why
not write my own?

I considered writing a simple hack around pacmd/pacli, but they aren't much machine-oriented and regex-parsing is not fun, so I found that newer (git or v1.0-dev) pulse versions have a nice dbus interface to everything.

Only problem there is that it doesn't really work, crashing pulse on any attempt to get some list from properties. Had to track down the issue, good thing it's fairly trivial to fix (just a simple revert), and then just hacked-up simple non-interactive tool to adjust sink volume by some percentage, specified on command line.

It was good enough for hotkeys, but I still wanted some nice alsamixer-like bars and thought it might be a good place to implement control per-stream sound levels as well, which is really another nice feature, but only as long as there's a way to actually adjust these levels, which there wasn't.

A few hours of python and learning curses and there we go:

ALSA plug-in [aplay] (fraggod@sacrilege:1424)        [ #############---------- ]
MPlayer (fraggod@sacrilege:1298)                     [ ####################### ]
Simple client (TCP/IP client from 192.168.0.5:49162) [ #########-------------- ]

Result was quite responsive and solid, which I kinda didn't expect from any sort of interactive interface.
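
The bars themselves are the easy part. Here's a sketch of rendering one such line from a 0..1 volume level, independent of curses and dbus. render_bar is a hypothetical helper, not a function from the actual script, and the default width of 23 is just what the sample output above happens to use.

```python
def render_bar(label, level, label_width, bar_width=23):
    """Format one mixer line: padded label plus a [ ###--- ] level bar."""
    level = min(max(level, 0.0), 1.0)  # clamp anything out of range
    filled = int(round(level * bar_width))
    bar = "#" * filled + "-" * (bar_width - filled)
    return f"{label:<{label_width}} [ {bar} ]"
```

In a curses interface each such string would then go to the screen with one addstr() call per stream.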

Guess I may be not the only person in the world looking for a cli mixer, so I'd probably put the project up somewhere, meanwhile the script is available here.

The only deps are python-2.7 with curses support and dbus-python, which should come out of the box on any decent desktop system these days, anyway. List of command-line parameters to control sink level is available via traditional "-h" or "--help" option, although interactive stream levels tuning doesn't need any of them.

posted on 2010-12-25 21:55 YEKT

desktop python

Dec 15, 2010

os.listdir and os.walk in python without lists (by the grace of c-api generator) and recursion (custom stack)

As I got around to update some older crap in my shared codebase (I mean mostly fgc by that), I've noticed that I use os.walk (although in most cases indirectly) in quite a few places, and its implementation leaves a few things to be desired:

First of all, it's recursive, so it has to mirror fs nesting via python call stack, creating a frame object for every level.

I've yet to see (or... I'd rather not see it, ever!) path structure deep enough to cause OOM problems or depleting stack-depth though, but I suppose fs limitations should be well above python's here.

Second thing is that it uses os.listdir, which, contrary to glibc/posix design of opendir(3), returns a list with all nodes in the given path.

Most modern filesystems have fairly loose limits on the number of nodes in a single path, and I actually tested how they handle creation and stat hits/misses for paths with millions of entries (to check index-paths performance benefits) and the only filesystems with radically degrading performance in such cases were venerable ext2 (on linux, via jbd driver), ufs2 (on FreeBSD) and similar, so it's not altogether impossible to stumble upon such path with os.walk and get a 1e6+ element list.

And another annoyance I've found there is its weird interface - in nearly all cases I need to get nodes just one-by-one, so I'd be able to work with such pipeline with itertools or any other iterable-oriented stuff, but string and two lists is definitely not what I need in any case.

One good thing about the current os.walk I can see though, is that it shouldn't hold the dentry in cache any longer than necessary for a single scan, plus then it goes on to probe all the inodes there, which should be quite cache-friendly behavior, not taking into account further usage of these entries.

Anyway, to put my mind at ease on the subject, and as a kind of exercise, I thought I'd fix all these issues.

At the lowest level, that's os.listdir, which I thought I'd replace with a simple generator. Alas, generators in py c-api aren't very simple, but certainly nothing to be afraid of either. Most (and probably the only) helpful info on the subject (with non-native C ppl in mind) was this answer on stack overflow, giving the great sample code.

In my case half of the work was done with opendir(3) in the initialization function, and the rest is just readdir(3) with '.' and '..' filtering and to-unicode conversion with PyObject struct holding the DIR pointer. Code can be found here.

Hopefully, it will be another working example of a more complex yet thin c-api usage to augment the python, if not the most elegant or killer-feature-implementing kind.

Recursion, on the other hand, can be solved entirely in python, all that's needed is to maintain the custom processing stack, mirroring the natural recursion pattern. "Depth" ordering control can be easily implemented by making stack double-ended (as collections.deque) and the rest is just a simple logic exercise.

Whole python-side implementation is in fgc.sh module here, just look for "def walk" in the source.

End-result is efficient iteration and simple clean iterable interface.
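
The general shape of such a walk can be sketched with nothing but the stdlib. Note that this is not the fgc.sh code: it leans on os.scandir (a lazy directory iterator that newer python versions ship, doing the job of the c-api generator above) and the walk signature here is made up for the example.

```python
import os
from collections import deque

def walk(top, depth_first=True):
    """Yield every path under top one by one, without recursion."""
    stack = deque([top])
    while stack:
        # Taking from the right end goes deeper first, from the left - wider
        path = stack.pop() if depth_first else stack.popleft()
        try:
            with os.scandir(path) as entries:  # lazy iterator, no full list
                for entry in entries:
                    yield entry.path
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # unreadable or vanished directory, skip it
```

Only the pending directory paths are kept in memory, so neither nesting depth nor the number of entries in a single directory affects the python call stack.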

For some use-cases though, just a blind generator is suboptimal, including quite common ones like filtering - you don't need to recurse into some (and usually the most crowded) paths' contents if the filter already blocks the path itself.

And thanks to python's coroutine-like generators, it's not only possible, but trivial to implement - just check yield feedback value for the path, determining the further direction on its basis (fgc.sh.crawl function, along with the regex-based filtering).
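
A stripped-down sketch of that feedback mechanism is below. It's an illustration of the generator.send() technique, not the actual fgc.sh.crawl: the "send False to skip" convention and the filtered helper are assumptions made for this example.

```python
import os
from collections import deque

def crawl(top):
    """Yield paths, send(False) right after a directory to skip its contents."""
    stack = deque([top])
    while stack:
        path = stack.pop()
        try:
            entries = os.scandir(path)
        except OSError:
            continue
        with entries:
            for entry in entries:
                descend = yield entry.path
                if descend is False:
                    yield None  # this is what send() returns to the caller
                    continue
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)

def filtered(top, skip_names):
    """Example consumer: prune whole subtrees by basename."""
    paths = crawl(top)
    for path in paths:
        if os.path.basename(path) in skip_names:
            paths.send(False)  # never even opens the blocked directory
            continue
        yield path
```

The extra "yield None" is there so that send() doesn't swallow the next real path, which keeps the plain for-loop on the consumer side in sync.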

Don't get me wrong though, the whole thing doesn't really solve any real problem, thus is little more than a puritan exercise aka brainfart, although of course I'd prefer this implementation over the one in stdlib any day.

Oh, and don't mind the title, I just wanted to give more keywords to the eye-of-google, since generators with python c-api aren't the most elegant and obvious thing, and google doesn't seem to be very knowledgeable on the subject atm.

posted on 2010-12-15 19:11 YEKT

unix python

Dec 11, 2010

zcat, bzcat, lzcat, xzcat... Arrrgh! Autodetection rocks

Playing with dracut today, noticed that it can create lzma-compressed initrd's without problem, but its "lsinitrd" script uses zcat to access initrd data, thus failing for lzma or bzip2 compression.

Of course the "problem" is nothing new, and I've bumped against it a zillion times in the past, although it looks like today I was a bit less (or more?) lazy than usual and tried to seek a solution - some *cat tool, which would be able to read any compressed format without the need to specify it explicitly.

Finding nothing of the /usr/bin persuasion, I noticed that there's a fine libarchive project, which can do all sorts of magic just for this purpose, alas there seems to be no cli client for it to utilize this magic, so I got around to writing my own one.

These few minutes of happy-hacking probably saved me a lot of time in the long run, guess the result may as well be useful to someone else:

#include <archive.h>
#include <archive_entry.h>
#include <stdio.h>
#include <stdlib.h>

const int BS = 16384;

int main(int argc, const char **argv) {
    if (argc > 2) {
        fprintf(stderr, "Usage: %s [file]\n", argv[0]);
        exit(1); }

    struct archive *a = archive_read_new();
    archive_read_support_compression_all(a);
    archive_read_support_format_raw(a);

    int err;
    if (argc == 2) err = archive_read_open_filename(a, argv[1], BS);
    else err = archive_read_open_fd(a, 0, BS);
    if (err != ARCHIVE_OK) {
        fprintf(stderr, "Broken archive (1)\n");
        exit(1); }

    struct archive_entry *ae;
    err = archive_read_next_header(a, &ae);
    if (err != ARCHIVE_OK) {
        fprintf(stderr, "Broken archive (2)\n");
        exit(1); }

    (void) archive_read_data_into_fd(a, 1);

    archive_read_finish(a);
    exit(0);
}

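Build it with "gcc excat.c -o excat -larchive" (library goes after the source file, or linking may fail on toolchains that resolve symbols in command-line order) and use as "excat /path/to/something.{xz,gz,bz2,...}".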

List of formats, supported by libarchive can be found here, note that it can also unpack something like file.gz.xz, although I have no idea why'd someone want to create such a thing.
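
For comparison, the same autodetection idea can be sketched in pure Python by sniffing the well-known magic bytes at the start of the data. This is a different technique from what libarchive does internally, only covers the three formats python's stdlib can decompress, and reads everything into memory, so take it as an illustration rather than an excat replacement (the Python excat function below just borrows the name).

```python
import bz2
import gzip
import lzma

# Signature at the start of the data -> function that strips that layer
MAGIC = (
    (b"\x1f\x8b", gzip.decompress),
    (b"BZh", bz2.decompress),
    (b"\xfd7zXZ\x00", lzma.decompress),
)

def excat(data, max_layers=8):
    """Strip compression layers for as long as a known signature matches."""
    for _ in range(max_layers):
        for magic, decompress in MAGIC:
            if data.startswith(magic):
                data = decompress(data)
                break
        else:
            return data  # no known signature left, this is the payload
    return data
```

The outer loop is what handles the file.gz.xz kind of nesting, with max_layers as a guard against maliciously deep stacks.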

I've also created a project on sourceforge for it, in hopes that it'd save someone like me a bit of time with google-fu, but I doubt I'll add any new features here.

posted on 2010-12-11 06:04 YEKT

unix compression

Dec 09, 2010

Further improvements for notification-daemon

It's been a while since I augmented libnotify / notification-daemon stack to better suit my (maybe not-so-) humble needs, and it certainly was an improvement, but there's no limit to perfection and since then I felt the urge to upgrade it every now and then.

One early fix was to let messages with priority=critical through without delays and aggregation. I've learned it the hard way when my laptop shut down because of drained battery without me registering any notification about it.

Other good candidates for high priority seem to be real-time messages like emms track updates and network connectivity loss events which are either too important to ignore or just don't make much sense after delay. Implementation here is straightforward - just check urgency level and pass these unhindered to notification-daemon.
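
In the notification spec the urgency level arrives in the "hints" dict of the Notify call as a byte: 0 is low, 1 is normal and 2 is critical. The check itself might look like the sketch below, where bypasses_queue and the urgent_passthrough switch are names invented for the example, not taken from the proxy script.

```python
URGENCY_CRITICAL = 2  # per the spec: 0 = low, 1 = normal, 2 = critical

def bypasses_queue(hints, urgent_passthrough=True):
    """True if the notification should skip rate-limiting and aggregation."""
    urgency = int(hints.get("urgency", 1))  # treat a missing hint as normal
    return bool(urgent_passthrough and urgency == URGENCY_CRITICAL)
```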

Another important feature which seems to be missing in the reference daemon is the
ability to just clean up the screen of notifications. Sometimes you just need
to dismiss them to see the content beneath, or you just read them and don't
want them drawing any more attention.
The only available interface for that seems to be the CloseNotification method,
which can only close a notification message using its id, hence is only useful
from the application that created the note. Kinda makes sense to avoid apps
stepping on each other's toes, but since the ids in question are sequential, it
won't be much of a problem for an app to abuse this mechanism anyway.
Proxy script, sitting in the middle of dbus communication as it is, doesn't have
to guess these ids, as it can just keep track of them.
So, to clean up the occasional notification-mess I extended the
CloseNotification method to accept 0 as a "special" id, closing all the
currently-displayed notifications.
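
The bookkeeping behind that is tiny. Here's a dbus-free sketch of it, where NotificationTracker is a made-up class and close_backend stands for whatever callable forwards the real CloseNotification call to the daemon.

```python
class NotificationTracker:
    """Remember displayed notification ids so that id 0 can mean "all"."""

    def __init__(self, close_backend):
        self._close = close_backend  # callable(note_id), talks to the daemon
        self._shown = set()

    def notified(self, note_id):
        """Record an id that the daemon returned from Notify."""
        self._shown.add(note_id)

    def closed(self, note_id):
        """Forget an id once the daemon signals NotificationClosed for it."""
        self._shown.discard(note_id)

    def close_notification(self, note_id):
        """Close one notification, or everything on screen for id 0."""
        targets = sorted(self._shown) if note_id == 0 else [note_id]
        for target in targets:
            self._close(target)
            self._shown.discard(target)
        return targets
```

Using 0 as the special value is fairly safe, since the daemon is not supposed to hand out 0 as a real notification id.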

Binding it to a key is just a matter of (a bit inelegant, but powerful) dbus-send tool invocation:

% dbus-send --type=method_call\
   --dest=org.freedesktop.Notifications\
   /org/freedesktop/Notifications\
   org.freedesktop.Notifications.CloseNotification uint32:0

Expanding on the idea of occasional distraction-free needs, I found quite appealing the ability to "plug" the notification system when necessary - collecting the notifications into the same digest behind the scenes (yet passing urgent ones, if this behavior is enabled) - so I just added a flag akin to the "fullscreen" check, forcing notification aggregation regardless of rate when it's set.

Of course, some means of control over this flag was necessary, so another
extension of the interface was to add "Set" method to control
notification-proxy options. Method was also useful to occasionally toggle
special "urgent" messages treatment, so I empowered it to do so as well by
making it accept a key-value array of parameters to apply.
And since now there is a plug, I also found it handy to have a complementary
"Flush" method to dump last digested notifications.
Same handy dbus-send tool comes to rescue again, when these need to be toggled
or set via cli:

% dbus-send --type=method_call\
   --dest=org.freedesktop.Notifications\
   /org/freedesktop/Notifications\
   org.freedesktop.Notifications.Set\
   dict:string:boolean:plug_toggle,true

In contrast to cleanup, I occasionally found myself monitoring low-traffic IRC
conversations entirely through notification boxes - no point switching the
apps if you can read the whole lines right there, but there was a catch of
course - you have to immediately switch attention from whatever you're doing
to a notification box to be able to read it before it times out and
disappears, which of course is quite inconvenient.
Easy solution is to just override "timeout" value in notification boxes to
make them stay as long as you need to, so one more flag for the "Set" method
to handle plus one-liner check and there it is.
Now it's possible to read them with minimum distraction from the current
activity and dismiss via mentioned above extended CloseNotification method.

As if the above was not enough, sometimes I found myself willing to read and react to the stuff from one set of sources, while temporarily ignoring the traffic from the others, like when you're working at some hack, discussing it (and the current implications / situation) in parallel over jabber or irc, while heated discussion (but interesting none the less) starts in another channel.

Shutting down the offending channel in ERC, leaving BNC to monitor the conversation or just suppress notifications with some ERC command would probably be the right way to handle that, yet it's not always that simple, especially since every notification-enabled app then would have to implement some way of doing that, which of course is not the case at all.

Remedy is in the customizable filters for notifications, which can be a simple set of regex'es, dumped into some specific dot-file, but even as I started to implement the idea, I could think of several different validation scenarios like "match summary against several regexes", "match message body", "match simple regex with a list of exceptions" or even some counting and more complex logic for them.

Idea of inventing yet another perlish (poorly-designed, minimal, ambiguous, write-only) DSL for filtering rules didn't strike me as an exactly bright one, so I thought of looking for some lib implementation of clearly-defined and thought-through syntax for such needs, yet found nothing designed purely for such filtering task (could be one of the reasons why every tool and daemon hard-codes its own DSL for that *sigh*).

On that note I thought of some generic yet easily extensible syntax for such
rules, and came to realization that simple SICP-like subset of scheme/lisp
with regex support would be exactly what I need.
Luckily, there are plenty of implementations of such embedded languages in
python, and since I needed a really simple and customizable one, I've decided
to stick with extended 90-line "lis.py",
described by Peter Norvig here and extended
here. Out goes unnecessary file-handling,
plus regexes and some minor fixes and the result is "make it into whatever you
need" language.
Just added a stat and mtime check on a dotfile, reading and compiling the
matcher-function from it on any change. Contents may look like this:

(define-macro define-matcher (lambda
  (name comp last rev-args)
  `(define ,name (lambda args
    (if (= (length args) 1) ,last
      (let ((atom (car args)) (args (cdr args)))
      (,comp
        (~ ,@(if rev-args '((car args) atom) '(atom (car args))))
        (apply ,name (cons atom (cdr args))))))))))

(define-matcher ~all and #t #f)
(define-matcher all~ and #t #t)
(define-matcher ~any or #f #f)
(define-matcher any~ or #f #t)
(lambda (summary body)
  (not (and
    (~ "^erc: #\S+" summary)
    (~ "^\*\*\* #\S+ (was created on|modes:) " body))
    (all~ summary "^erc: #pulseaudio$" "^mail:")))

Which kinda shows what you can do with it, making your own syntax as you go
along (note that stuff like "and" is also a macro, just defined on a higher
level).
Even with weird macros I find it much more comprehensible than rsync filters,
apache/lighttpd rewrite magic or pretty much any pseudo-simple magic set of
string-matching rules I had to work with.
I considered using python itself to the same end, but found that its syntax
is both more verbose and less flexible/extensible for such a goal, plus it
allows doing far too much for a simple filtering script which can potentially
be evaluated by a process with elevated privileges, hence would need some sort
of sandboxing anyway.
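
The "stat and mtime check" mentioned above can be sketched roughly like this. It's only an illustration, not the code from the actual script: the ReloadingFilter name and the compile_func argument (standing in for whatever parses and evals the lis.py source) are made up for the example.

```python
import os


class ReloadingFilter:
    """Keeps a compiled matcher in sync with a dotfile, using its mtime.

    compile_func is a hypothetical callable that turns the file contents
    (e.g. lis.py source text) into a matcher function.
    """

    def __init__(self, path, compile_func):
        self.path = path
        self.compile_func = compile_func
        self._mtime = None
        self._matcher = None

    def get(self):
        """Return the current matcher, recompiling it if the file changed."""
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except OSError:
            # No dotfile (or unreadable one) means no filtering at all.
            self._mtime, self._matcher = None, None
            return None
        if mtime != self._mtime:
            with open(self.path) as src:
                self._matcher = self.compile_func(src.read())
            self._mtime = mtime
        return self._matcher
```

Calling get() before each notification is cheap, since it costs one stat() unless the file was actually edited.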

In my case all this stuff is bound to convenient key shortcuts via fluxbox wm:

# Notification-proxy control
Print :Exec dbus-send --type=method_call\
    --dest=org.freedesktop.Notifications\
    /org/freedesktop/Notifications org.freedesktop.Notifications.Set\
    dict:string:boolean:plug_toggle,true
Shift Print :Exec dbus-send --type=method_call\
    --dest=org.freedesktop.Notifications\
    /org/freedesktop/Notifications org.freedesktop.Notifications.Set\
    dict:string:boolean:cleanup_toggle,true
Pause :Exec dbus-send --type=method_call\
    --dest=org.freedesktop.Notifications\
    /org/freedesktop/Notifications\
    org.freedesktop.Notifications.CloseNotification\
    uint32:0
Shift Pause :Exec dbus-send --type=method_call\
    --dest=org.freedesktop.Notifications\
    /org/freedesktop/Notifications\
    org.freedesktop.Notifications.Flush

Pretty sure there's more room for improvement in this aspect, so I'd have to extend the system once again, which is fun all by itself.

Resulting (and maybe further extended) script is here, now linked against a bit revised lis.py scheme implementation.

posted on 2010-12-09 03:58 YEKT

python desktop unix notification rate-limiting lisp

Dec 07, 2010

MooseFS usage experiences

It's been three months since I've replaced gluster with moose and I've had a few questions about its performance so far.

Info on the subject in the internets is a bit scarce, so here goes my case. Keep in mind however that it's not a stress-benchmark of any kind and actually rather degenerate use-case, since loads aren't pushing hardware to any limits.

Guess I can say that it's quite remarkable in a way that it's really unremarkable - I just kinda forgot it's there, which is probably the best thing one can expect from a filesystem.

My setup is 4 physical nodes at most, with 1.3 TiB of data in fairly large
files (3361/52862 dirs/files, calculated average is about 250 MiB for a file).
Spanned fs hosts a storage for distfiles, media content, vm images and pretty
much anything that comes in the compressed form and worth keeping for further
usage. Hence the access to files is highly sequential in most cases (as in
reading gzip, listening to mp3, watching a movie, etc).
OS on the nodes is gentoo/exherbo/fedora mix and is subject to constant
software updates, breakages and rebuilds. Naturally, mfs is proven to be quite
resilient in these conditions, since it doesn't depend on boost or other
volatile crap and just consists of several binaries and configuration files,
which work fine even with defaults right out of the box.
One node is a "master", eating 100 MiB RAM (115 VSZ) on the few-month average
(according to atop logs). Others have metalogger slaves which cost virtually
nothing (<3 MiB VSZ), so it's not a big deal to keep metadata fully-replicated
just in case.
Chunkservers have 500 GiB - 3 TiB space on btrfs. These usually hang on 10 MiB
RAM, occasional 50-100 MiB in VSZ, though it's not swapped-out, just unused.
Cpu usage for each is negligible, even though mfsmaster + mfsmount +
mfschunkserver node is Atom D510 on miniITX board.

mfsmount maintains persistent connection to master and on-demand to chunkservers.

It doesn't seem to mind if some of them are down though, so I guess it's perfectly possible to upload files via mfsmount to one (the only accessible) node and they'll be replicated to others from there (more details on that below), although I'm unsure what will happen when you try to retrieve chunks stored exclusively on inaccessible nodes (guess it's easy enough to test, anyway).

I use only one mfsmount on the same machine as master, and re-export (mostly for reading) it over NFS, SFTP, WebDAV and plain HTTP to other machines.

Re-export is there because that way I don't need access to all machines in cluster, which can be in a separate network (i.e. if I access fs from work), plus stuff like NFS comes out of the box (no need for separate client) and has nice FS-Cache support, which saves a lot of bandwidth, webdav/sftp works for ms-os machines as well and server-based replication saves more precious bandwidth all by itself.

FS bandwidth in my case is a constant ~1 MiB read 24/7 plus any on-demand reading at speeds which are usually slower than any single hdd (over slower network links like 100 Mbps LAN and WiFi), and using only a few threads as well, so I'm afraid I can't give any real-world stress results here.

On local bulk-copy operations to/from the mfs mount though, disks always seem to be the bottleneck, with all other parameters far below any possible limitations, but in my case these are simple "wd green" low-speed/noise high-capacity disks or seagate/hitachi disks with AAM threshold set to the lowest level via "hdparm -M" (works well for sound, but I never really cared enough about how it affects speed to check).

Chunkservers' storage consists of indexed (AA/AABCD...) paths, according to chunk names, which can be easily retrieved from master. They rely on fs scanning to determine which chunks they have, so I've been able to successfully merge two nodes into one w/o storing the chunks on different filesystems/paths (which is also perfectly possible).

Chunkservers talk to each other on p2p-basis (doesn't imply that they don't need connection to master, but bandwidth there doesn't seem to be an issue at all) to maintain requested replication goal and auto-balance disk space between themselves, so the free percentage tries to be equal on all nodes (w/o compromising the goal, of course), so with goal=2 and 4 nodes I have 30% space usage on backend-fs on both 500 GiB node and 3 TiB one.
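
As a back-of-the-envelope check of that "equal percentage everywhere" behavior, here's a small sketch. It is not how mfs itself computes anything, and the two middle capacities are made-up values (only the 500 GiB and 3 TiB nodes are mentioned here); it also ignores the constraint that both copies of a chunk have to land on different nodes.

```python
def balanced_fill(capacities_gib, data_gib, goal):
    """Return (used fraction, per-node GiB used) if all nodes are equally full.

    capacities_gib - backend fs size of each chunkserver, in GiB
    data_gib       - amount of (unreplicated) data stored in the fs
    goal           - number of copies kept of every chunk
    """
    total = sum(capacities_gib)
    stored = data_gib * goal
    if stored > total:
        raise ValueError("not enough space for the requested goal")
    fraction = stored / total
    return fraction, [cap * fraction for cap in capacities_gib]


# 500 GiB and 3 TiB nodes are from the setup above, the other two are guesses.
capacities = [500, 2048, 3072, 3072]
fraction, per_node = balanced_fill(capacities, data_gib=1.3 * 1024, goal=2)
print("%.1f%% used on every node" % (fraction * 100))
for cap, used in zip(capacities, per_node):
    print("%5d GiB node holds about %.0f GiB of chunks" % (cap, used))
```

With these assumed sizes the result lands close to the 30% figure, and the smallest node ends up holding proportionally less data rather than filling up first.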

Balancing seems to be managed by every chunkserver in background (not quite sure if I've seen it in any docs, but there's a "chunk testing" process, which seems to imply that, and can be tuned btw), according to info about chunk and other currently-available nodes' space utilization from master.

Hence, adding/removing nodes is a bliss - just turn it on/off, no configuration changes for other nodes are necessary - master sees the change (new/lost connection) and all the chunkservers start relocating/getting the chunks to restore the balance and maintain the requested goal. In a few hours everything will be balanced again.

Whole approach seems superior to dumb round-robin of the chunks on creation or rehashing and relocating every one of them on single node failure, and suggests that it might be easy to implement custom replication and balancing scheme just by rsync'ing chunks between nodes as necessary (i.e. to make most of small ssd buffer, putting most-demanded files' chunks there).

And indeed I've utilized that feature twice to merge different nodes and filesystems, although the latter is not really necessary, since chunkserver can work with several storage paths on different filesystems, but it just seems irrational to keep several btrfs trees these days, as they can even span multiple devices.

But the best part, enabling me not to look further for alternatives, is the simple fact that I've yet to see any problem in the stability department - it still just works. mfsmount never refused to give or receive a file, node daemons never crashed or came back up with a weird inconsistency (which I don't think is easy to produce with such simple setup/design, anyway).

Connection between nodes has failed quite often - sometimes my NIC/switch/cables went to 30% packet loss for no apparent reason, sometimes I've messed up openswan and ipsec or some other network setup, shut down and hard-rebooted the nodes as necessary, but such failures were always unnoticeable here, without any need to restart anything on the mfs level - chunkservers just reconnect, forget obsolete chunks and keep going about their business.

Well, there *was* one exception: one time I've managed to hard-reboot a
master machine and noticed that mfsmaster has failed to start.
Problem was missing metadata.mfs file in /var/lib, which I believe is created
on mfsmaster stop and checkpointed every hour to .back file, so, knowing there
were no changes to fs in the last few minutes, I just removed the .back suffix
and everything started just fine.
Doing it The Right Way would've involved stopping any of the metalogger nodes
(or signaling it somehow) and retrieving this file from there, or just
starting master on that node, updating the mfsmaster ns entry, since they're
identical.
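
The quick fix above boils down to something like the following sketch. The data_dir argument is a placeholder for wherever mfsmaster keeps its metadata, and unlike my manual rename this copies the file, so the .back checkpoint stays around. It's only safe under the same assumption: no fs changes since the last checkpoint.

```python
import os
import shutil


def restore_metadata(data_dir, name="metadata.mfs"):
    """Recreate the metadata file from its ".back" checkpoint if it's missing.

    Returns True if the file was restored, False if it was already there.
    Raises FileNotFoundError if there's no checkpoint to restore from.
    """
    current = os.path.join(data_dir, name)
    backup = current + ".back"
    if os.path.exists(current):
        return False  # nothing to do, master should start as-is
    if not os.path.exists(backup):
        raise FileNotFoundError(backup)
    # copy2 keeps timestamps and leaves the checkpoint itself untouched
    shutil.copy2(backup, current)
    return True
```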

Of course, it's just a commodity hardware and lighter loads, but it's still way above other stuff I've tried here in virtually every aspect, so thumbs up for moose.

posted on 2010-12-07 22:22 YEKT

sysadmin nfs replication

Nov 12, 2010

Moar free time!

As of today, I'm unemployed once again.

Guess now I'll have time to debug and report a btrfs-systemd crash, read all the feeds, fix some long-standing issues on my home servers, update an antique web setup, write a few watch-notify scripts there, deploy/update configuration management systems, update/finish/publish a few of my spare-time projects, start playing with a lot of new ideas, check out networking tools like connman, wicd, nm and a bunch of other cool-stuff oss projects, write a few hooks for plotting and graphing stuff in real-time, adapt emacs mail/rss tools, update other elisp stuff, emacs itself, a few symbian-to-pc event hooks, check out gnustep environment, ltu lang articles, pybrain and a few other simple machine-learning implementations, some lua-ai for spring, play a lot of games I've missed in past few years, read a few dozen books I've already uploaded but never had the time to, study linear and geometric algebra... maybe find a new job, even, before I starve?

Nah, nobody in the world has that much time... ;)

posted on 2010-11-12 13:33 YEKT

rl epic

Nov 05, 2010

From Baselayout to Systemd setup on Exherbo

It's been more than a week since I've migrated from sysvinit and gentoo'ish baselayout scripts to systemd with its units, and aside from a few initial todos it's been surprisingly easy.

Nice guide for migration (which actually tipped me into trying systemd) can be found here, in this post I'd rather summarize my experiences.

Most distributions seem to take "the legacy" way of migration, starting all
the old initscripts from systemd just as sysvinit did before that.
It makes some sense, since all the actions necessary to start the service are
already written there, but most of them are no longer necessary with systemd -
you don't need pidfiles, daemonization, killing code, LSB headers and most
checks for other stuff... which kinda leaves nothing at all for 95% of
software I've encountered!
I haven't really tried to adapt fedora or debian init for systemd (since my
setup runs exherbo), so I may be missing some
crucial points here, but it looks like even in these systems initscripts,
written in simple unaugmented *sh, are unnecessary evil, each one doing the
same thing in its own crappy way.

With exherbo (or gentoo, for that matter), which has a bit more advanced init system, it's even harder to find some sense in keeping these scripts. Baselayout allows some cool stuff beyond simple LSB headers, but does so in its own way. A typical initscript here looks like this:

#!/sbin/runscript
depend() {
    use logger
    need clock hostname
    provide cron
}
start() {
    ebegin "Starting ${SVCNAME}"
    start-stop-daemon --start --pidfile ${FCRON_PIDFILE}\
      --exec /usr/sbin/fcron -- -c ${FCRON_CONF}
    eend $?
}
stop() {
    ebegin "Stopping ${SVCNAME}"
    start-stop-daemon --stop --pidfile ${FCRON_PIDFILE}
    eend $?
}

...with $SVCNAME taken from the script name and other vars from complementary "/etc/conf.d/someservice" file (with sensible defaults in initscript itself).

Such script already allows nice and logged output (with e* commands) and clearly-defined, relatively uncluttered sections for startup and shutdown. You don't have to parse commandline arguments (although it's perfectly possible), since baselayout scripts will do that, and every daemon is accounted for via "start-stop-daemon" wrapper - it has a few simple ways to check their status via passed --pidfile or --exec lines, plus it handles forking (if necessary), IO redirection, dropping privileges and stuff like that.
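
To show what a --pidfile status check amounts to (and why it's only as reliable as the pidfile itself), here's a rough sketch. It's not start-stop-daemon's actual logic, just the general idea, and pidfile_alive is a name made up for the example.

```python
import os


def pidfile_alive(pidfile):
    """Guess whether the daemon recorded in pidfile is still running."""
    try:
        with open(pidfile) as src:
            pid = int(src.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable or garbled pidfile
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the pid exists
    except ProcessLookupError:
        return False  # stale pidfile, the process is gone
    except PermissionError:
        return True  # pid exists, but belongs to another user
    return True
```

Note that it'll happily report "alive" for any unrelated process which happened to get the same pid after the daemon died.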

All these feats lead to much more consistent init and control over services' state:

root@damnation:~# rc-status -a
Runlevel: shutdown
  killprocs        [ stopped ]
  savecache        [ stopped ]
  mount-ro         [ stopped ]
Runlevel: single
Runlevel: nonetwork
  local            [ started ]
Runlevel: cryptinit
  rsyslog          [ started ]
  ip6tables        [ started ]
...
  twistd           [ started ]
  local            [ started ]
Runlevel: sysinit
  dmesg            [ started ]
  udev             [ started ]
  devfs            [ started ]
Runlevel: boot
  hwclock          [ started ]
  lvm              [ started ]
...
  wdd              [ started ]
  keymaps          [ started ]
Runlevel: default
  rsyslog          [ started ]
  ip6tables        [ started ]
...
  twistd           [ started ]
  local            [ started ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed
  sysfs            [ started ]
  rpc.pipefs       [ started ]
...
  rpcbind          [ started ]
  rpc.idmapd       [ started ]
Dynamic Runlevel: manual

One nice colored list of everything that should be running, is running, failed to start, crashed and whatever. One look and you know if unscheduled reboot has any surprises for you. Weird that such long-lived and supported distros as debian and fedora make these simple tasks so much harder (chkconfig --list? You can keep it! ;)

Furthermore, it provides as many custom and named runlevels as you want, as a way to flip the state of the whole system with a painless one-liner.

Now, systemd provides all of these features, in a cleaner nicer form and much more, but that makes migration from one to the other actually harder.

Systemd is developed/tested mainly on and for fedora, so absence of LSB headers in these scripts is a problem (no dependency information), and presence of other headers (which start other scripts w/o systemd help or permission) is an even more serious problem.

start-stop-daemon interference is also redundant and actually harmful and so is e* (and other special bl-commands and wrappers), and they won't work w/o baselayout framework.

Thus, it makes sense for systemd on exherbo to be totally independent of baselayout and its scripts, and to have a separate package option to install systemd- and baselayout-specific init stuff:

root@sacrilege:~# cave show -f acpid
* sys-power/acpid
   ::arbor   2.0.6-r2* {:0}
   ::installed  2.0.6-r2 {:0}
   sys-power/acpid-2.0.6-r2:0::installed
   Description
acpid is designed to notify user-space programs of ACPI events. It will
will attempt to connect to the Linux kernel via the input layer and
netlink. When an ACPI event is received from one of these sources, acpid
will examine a list of rules, and execute the rules that match the event.
   Homepage  http://tedfelix.com/linux/acpid-netlink.html
   Summary  A configurable ACPI policy daemon for Linux
   From repositories arbor
   Installed time Thu Oct 21 23:11:55 YEKST 2010
   Installed using paludis-0.55.0-git-0.54.2-44-g203a470
   Licences  GPL-2
   Options  (-baselayout) (systemd) build_options: -trace

  sys-power/acpid-2.0.6-r2:0::arbor
   Homepage  http://tedfelix.com/linux/acpid-netlink.html
   Summary  A configurable ACPI policy daemon for Linux
   Description
acpid is designed to notify user-space programs of ACPI events. It will
will attempt to connect to the Linux kernel via the input layer and
netlink. When an ACPI event is received from one of these sources, acpid
will examine a list of rules, and execute the rules that match the event.
   Options  -baselayout systemd
     build_options: -recommended_tests split strip jobs -trace -preserve_work
   Overridden Masks
     Supported platforms ~amd64 ~x86

So, basically, the migration to systemd consists of enabling the option and flipping the "eclectic init" switch:

root@sacrilege:~# eclectic init list
Available providers for init:
 [1] systemd *
 [2] sysvinit

Of course, in reality things are a little more complicated, and breaking init is quite an undesirable prospect, so I took advantage of virtualization capabilities of cpu on my new laptop and made a complete virtual replica of the system.

Things got a bit more complicated since dm-crypt/lvm setup I've described before, but overall creating such a vm is trivial:

A dedicated lv for whole setup.
luksFormat it, so it'd represent an encrypted "raw" partition.
pvcreate / vgcreate / lvcreate / mkfs on top of it, identical (although much smaller) to original system.
A script to mount all these and rsync the "original" system to this replica, with a few post-sync hooks to make some vm-specific changes - different vg name, no extra devices for media content, simpler passwords.

I have this script here, list of "exclusions" for rsync is actually taken from backup scripts, since it's designed to omit various heavy and non-critical paths like caches, spools and debugging info, plus there's not much point syncing most /home contents. All in all, whole setup is about 2-3G and rsync makes a fast job of updating it.

vm (qemu-kvm) startup is right there in the script and uses exactly the same kernel/initrd as the host machine, although I skip encryption part (via kernel cmdline) for faster bootup.

And the first launch gave quite a mixed result: systemd fired a bunch of basic stuff at once, then hung for about a minute before presenting a getty. After login, it turned out that none of the filesystems in /etc/fstab got mounted.

Systemd handles mounts in quite a clever (and fully documented) way - from each device in fstab it creates a "XXX.device" unit, "fsck@XXX.service", and either "XXX.mount" or "XXX.automount" from mountpoints (depending on optional "comment=" mount opts). All the autogenerated "XXX.mount" units without explicit "noauto" option will get started on boot.
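
The "XXX" part of these unit names is derived from the path by systemd's escaping rules, which can be approximated like this. It's a simplified sketch written from memory of those rules (slashes become dashes, anything outside of a small safe set becomes \xNN) and it doesn't try to resolve "." or ".." path components.

```python
import string

# Characters that systemd keeps as-is in unit names.
_SAFE = set(string.ascii_letters + string.digits + ":_.")


def escape_path(path):
    """Turn a filesystem path into the prefix of a .mount/.device unit name."""
    # Drop leading/trailing and duplicate slashes.
    trimmed = "/".join(part for part in path.split("/") if part)
    if not trimmed:
        return "-"  # root directory
    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")
        elif ch in _SAFE and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # literal dashes, spaces, non-ascii bytes, leading dot, etc.
            out.append("\\x%02x" % byte)
    return "".join(out)


for path in "/var/cache/fscache", "/dev/mapper/prime-swap", "/home/fraggod/.spring":
    print(path, "->", escape_path(path))
```

The backslash-escaped dash is why a swap device with "-" in its lvm name gets such a weird-looking unit name.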

And they do get started, hence that hang. Each .mount, naturally, depends on corresponding .device unit (with fsck in between), and these are considered started when udev issues an event.

In my case, even after exherbo-specific lvm2.service, which does vgscan and vgchange -ay stuff, these events are never generated, so .device units hang for 60 seconds and systemd marks them as "failed" as well as dependent .mount units.

It looks like my local problem, since I actually activate and use these in initrd, so I just worked around it by adding "ExecStart=-/sbin/udevadm trigger --subsystem-match=block --sysname-match=dm-*" line to lvm2.service. That generated the event in parallel to still-waiting .device units, so they got started, then fsck, then just mounted.

While this may look a bit like a problem, it's quite surprising how transparent and easy-to-debug whole process is, regardless of its massively-parallel nature - all the information is available via "systemctl" and its show/status commands, all the services are organized (and monitored) in systemd-cgls tree, and can be easily debugged with systemd monitoring and console/dmesg-logging features:

root@sacrilege:~# systemd-cgls
├ 2 [kthreadd]
├ 3 [ksoftirqd/0]
├ 6 [migration/0]
├ 7 [migration/1]
├ 9 [ksoftirqd/1]
├ 10 [kworker/0:1]
...
├ 2688 [kworker/0:2]
├ 2700 [kworker/u:0]
├ 2728 [kworker/u:2]
├ 2729 [kworker/u:4]
├ user
│ └ fraggod
│ └ no-session
│ ├ 1444 /bin/sh /usr/bin/startx
│ ├ 1462 xinit /home/fraggod/.xinitrc -- /etc/X11/xinit/xserverrc :0 -auth /home/fraggod/.serveraut...
...
│ ├ 2407 ssh root@anathema -Y
│ └ 2751 systemd-cgls
└ systemd-1
 ├ 1 /sbin/init
 ├ var-src.mount
 ├ var-tmp.mount
 ├ ipsec.service
 │ ├ 1059 /bin/sh /usr/lib/ipsec/_plutorun --debug --uniqueids yes --force_busy no --nocrsend no --str...
 │ ├ 1060 logger -s -p daemon.error -t ipsec__plutorun
 │ ├ 1061 /bin/sh /usr/lib/ipsec/_plutorun --debug --uniqueids yes --force_busy no --nocrsend no --str...
 │ ├ 1062 /bin/sh /usr/lib/ipsec/_plutoload --wait no --post
 │ ├ 1064 /usr/libexec/ipsec/pluto --nofork --secretsfile /etc/ipsec.secrets --ipsecdir /etc/ipsec.d -...
 │ ├ 1069 pluto helper # 0
 │ ├ 1070 pluto helper # 1
 │ ├ 1071 pluto helper # 2
 │ └ 1223 _pluto_adns
 ├ sys-kernel-debug.mount
 ├ var-cache-fscache.mount
 ├ net@.service
 ├ rpcidmapd.service
 │ └ 899 /usr/sbin/rpc.idmapd -f
 ├ rpcstatd.service
 │ └ 892 /sbin/rpc.statd -F
 ├ rpcbind.service
 │ └ 890 /sbin/rpcbind -d
 ├ wpa_supplicant.service
 │ └ 889 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -u -Dwext -iwlan0
 ├ cachefilesd.service
 │ └ 883 /sbin/cachefilesd -n
 ├ dbus.service
 │ └ 784 /usr/bin/dbus-daemon --system --address=systemd: --nofork --systemd-activation
 ├ acpid.service
 │ └ 775 /usr/sbin/acpid -f
 ├ openct.service
 │ └ 786 /usr/sbin/ifdhandler -H -p etoken64 usb /dev/bus/usb/002/003
 ├ ntpd.service
 │ └ 772 /usr/sbin/ntpd -u ntp:ntp -n -g -p /var/run/ntpd.pid
 ├ bluetooth.service
 │ ├ 771 /usr/sbin/bluetoothd -n
 │ └ 1469 [khidpd_046db008]
 ├ syslog.service
 │ └ 768 /usr/sbin/rsyslogd -n -c5 -6
 ├ getty@.service
 │ ├ tty1
 │ │ └ 1451 /sbin/agetty 38400 tty1
 │ ├ tty3
 │ │ └ 766 /sbin/agetty 38400 tty3
 │ ├ tty6
 │ │ └ 765 /sbin/agetty 38400 tty6
 │ ├ tty5
 │ │ └ 763 /sbin/agetty 38400 tty5
 │ ├ tty4
 │ │ └ 762 /sbin/agetty 38400 tty4
 │ └ tty2
 │ └ 761 /sbin/agetty 38400 tty2
 ├ postfix.service
 │ ├ 872 /usr/lib/postfix/master
 │ ├ 877 qmgr -l -t fifo -u
 │ └ 2631 pickup -l -t fifo -u
 ├ fcron.service
 │ └ 755 /usr/sbin/fcron -f
 ├ var-cache.mount
 ├ var-run.mount
 ├ var-lock.mount
 ├ var-db-paludis.mount
 ├ home-fraggod-.spring.mount
 ├ etc-core.mount
 ├ var.mount
 ├ home.mount
 ├ boot.mount
 ├ fsck@.service
 ├ dev-mapper-prime\x2dswap.swap
 ├ dev-mqueue.mount
 ├ dev-hugepages.mount
 ├ udev.service
 │ ├ 240 /sbin/udevd
 │ ├ 639 /sbin/udevd
 │ └ 640 /sbin/udevd
 ├ systemd-logger.service
 │ └ 228 //lib/systemd/systemd-logger
 └ tmp.mount

root@sacrilege:~# systemctl status ipsec.service
ipsec.service - IPSec (openswan)
  Loaded: loaded (/etc/systemd/system/ipsec.service)
  Active: active (running) since Fri, 05 Nov 2010 15:16:54 +0500; 2h 16min ago
  Process: 981 (/usr/sbin/ipsec setup start, code=exited, status=0/SUCCESS)
  Process: 974 (/bin/sleep 10, code=exited, status=0/SUCCESS)
  CGroup: name=systemd:/systemd-1/ipsec.service
   ├ 1059 /bin/sh /usr/lib/ipsec/_plutorun --debug --uniqueids yes --force_busy no --noc...
   ├ 1060 logger -s -p daemon.error -t ipsec__plutorun
   ├ 1061 /bin/sh /usr/lib/ipsec/_plutorun --debug --uniqueids yes --force_busy no --noc...
   ├ 1062 /bin/sh /usr/lib/ipsec/_plutoload --wait no --post
   ├ 1064 /usr/libexec/ipsec/pluto --nofork --secretsfile /etc/ipsec.secrets --ipsecdir ...
   ├ 1069 pluto helper # 0
   ├ 1070 pluto helper # 1
   ├ 1071 pluto helper # 2
   └ 1223 _pluto_adns

It's not just hacking at some opaque *sh hacks (like debian init or even interactive-mode baselayout) and takes so little effort that it's a really enjoyable process.

But making it mount and start all the default (and available) stuff is not the
end of it, because there are plenty of services not yet adapted to systemd.
I actually expected some (relatively) hard work here, because there are quite
a few initscripts in /etc/init.d, even on a desktop machine, but once again, I
was in for a nice surprise, since systemd just makes all the work go away. All
you need to do is to decide on the ordering (or copy it from baselayout
scripts) and put appropriate "Type=" and "ExecStart=" lines in .service
file. That's all there is, really.
After that, of course, complete bootup-shutdown test on a vm is in order, and
everything "just works" as it is supposed to.
Bootup on a real hardware is exactly the same as vm, no surprises here.
"udevadm trigger" seems to be necessary as well, proving validity of vm model.

Systemd boot time is way faster than sysvinit, as it is supposed to be, although I don't really care, since reboot is seldom necessary here.

As a summary, I'd recommend everyone to give systemd a try, or at least get familiar with its rationale and features (plus this three-part blog series: one, two, three). My units aren't perfect (and I'll probably update network-related stuff to use ConnMan), but if you're lazy, grab them here. Also, here is a repo with units for archlinux, which I loosely used as a reference point along with /lib/systemd contents.

posted on 2010-11-05 13:27 YEKT

desktop exherbo sysadmin unix systemd

Sep 12, 2010

Info feeds

Thanks to feedjack, I'm able to keep in sync with 120 feeds (many of them, like slashdot or reddit, being aggregators as well), as of today. Quite a lot of stuff I couldn't even imagine handling a year ago, and a good aggregator definitely helps, keeping all the info just one click away.

And every workstation-based (desktop) aggregator I've seen is a fail:

RSSOwl. Really nice interface and very powerful. That said, it eats more ram than a firefox!!! Hogs CPU till the whole system stutters, and eats more of it than every other app I use combined (yes, including firefox). Just keeping it in the background costs 20-30% of dualcore cpu. Changing "show new" to "show all" kills the system ;)
liferea. Horribly slow, interface hangs on any action (like fetching feed "in the background"), hogs cpu just as RSSOwl and not quite as feature-packed.
Claws-mail's RSSyl. Quite nice resource-wise and very responsive, unlike dedicated software (beats me why). Pity it's also very limited interface-wise and can't reliably keep track of many feeds by itself, constantly losing a few if not closed properly (most likely it's a claws-mail fault, since it affects stuff like nntp as well).
Emacs' gnus and newsticker. Good for a feed or two, epic fail in every way with more than a dozen of them.
Various terminal-based readers. Simply intolerable.

Server-based aggregator on the other hand is a bliss - as many hoards of stuff as you want, filtered, processed, categorized and re-exported to any format (same rss, but not a hundred of them, so any other reader works as well) and I don't give a damn about how many CPU-hours it spends doing so (yet it tends to be very few, since processing and storage is done via production-grade database and modules, not some crappy ad-hoc wheel re-invention).

And it's simple as a doorknob, so any extra functionality can be added with no effort.

Maybe someday I'll get around to use something like Google Reader, but it's still one hell of a mess, and it's no worse than similar web-based services out there. So much for the cloud services. *sigh*

posted on 2010-09-12 10:54 YEKT

web syndication
