Dec 29, 2015

Getting log of all function calls from specific source file using gdb

Maybe I'm doing debugging wrong, but when messing with code written by other people, the first question for me is usually not "what happens in function X" (easily answered by setting a breakpoint on it), but rather "which file/func should I look into".

I.e. given an observable effect - like "GET_REPORT messages get sent on HID level to a bluetooth device, and replies are ignored" - it's easy to guess that the culprit is either the linux kernel or bluetoothd (part of BlueZ).

The question then becomes "which calls in the app happen at the time of this observable effect", and luckily there's an easy, but not very well-documented (unless my google-fu is that bad), way to see that via gdb for C apps.

For scripts it's way easier, of course - e.g. in python you can run a thing via python -m trace and have it dump every single line of code as it gets executed.
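
E.g. (with a hypothetical myscript.py):

% python -m trace --trace myscript.py

That prints each executed line, prefixed with module name and line number.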

First of all, the app in question has to be compiled with the "-g" option and not stripped, of course. That should usually be easy to set via CFLAGS, using distro-specific knobs if rebuilding a package to include that (e.g. for Arch - have debug !strip in the OPTIONS line of /etc/makepkg.conf).

Running the app under gdb can then be done via something like gdb --args someapp arg1 arg2 (typing "r" there to start it), but if the goal is to get a log of all function calls from a specific file (and not just in a "call graph" way, like profilers such as gprof do), then first - interactivity has to go, and second - breakpoints have to be set on all these funcs and logged as the app runs.

Alas, there seems to be no way to add a breakpoint to every func in a file directly.

One common suggestion (which does NOT work, don't copy-paste!) I've seen is doing rbreak device\.c: ("rbreak" is a regexp version of "break") to match e.g. profiles/input/device.c:extract_hid_record (as well as all other funcs there) by a "filename:funcname" pattern. It doesn't work and shouldn't work though, as "rbreak" only matches function names, not the whole "filename:funcname" string.

So a trivial script is needed to a) get a list of funcs in a source file (just names are enough, as C has only one namespace), and b) put a breakpoint on each of them.

This is luckily quite easy to do via ctags, with this one-liner:

% ctags -x --c-kinds=fp profiles/input/device.c |
  awk 'BEGIN {print "set pagination off\nset height 0\nset logging on\n\n"}\
    {print "break", $1 "\ncommands\nbt 5\necho ------------------\\n\\n\nc\nend\n"}\
    END {print "\n\nrun"}' > gdb_device_c_ftrace.txt

This should generate a script for gdb, starting with "set pagination off" and whatever else is useful for logging, with a "commands" block after every "break" - running "bt 5" (displays 5 frames of backtrace) and echoing a nice-ish separator (a bunch of hyphens) - and ending with the "run" command to start the app.
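
E.g. for a single matched function like extract_hid_record, the resulting gdb_device_c_ftrace.txt would look like this (modulo blank lines):

set pagination off
set height 0
set logging on

break extract_hid_record
commands
bt 5
echo ------------------\n\n
c
end

run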

Resulting script can/should be fed into gdb with something like this:

% gdb -ex 'source gdb_device_c_ftrace.txt' -ex q --args\
  /usr/lib/bluetooth/bluetoothd --nodetach --debug

This will produce the needed log of all the calls to functions from that "device.c" file in "gdb.txt" (the default "set logging" destination), with output of the app interleaved with these in stdout/stderr (which can be redirected, or maybe closed with more gdb commands in the txt file or before it with "-ex"), and the whole thing is non-interactive.

From here, seeing where exactly the issue seems to occur, one'd probably want to look through the code of the funcs in question, run gdb interactively and inspect what exactly is happening there.

Definitely nowhere near the magic some people script gdb with, but I haven't found similar snippets neatly organized anywhere else, so here they go, in case someone might want to do the exact same thing.

Can also be used to log a bunch of calls from multiple files, of course, by giving "ctags" more files to parse.

Dec 07, 2015

Resizing first FAT32 partition to microSD card size on boot from Raspberry Pi

One other thing I've needed to do recently is to have the OS on a Raspberry Pi resize its /boot FAT32 partition to the full card size (i.e. "make it as large as possible") from right underneath itself.

RPis usually have a first FAT (fat16 / fat32 / vfat) partition, needed by the firmware to load config.txt and u-boot stuff from, and that is the only partition one can see in Windows when plugging a microSD card into a card-reader (which is a kinda arbitrary OS limitation).

Map of the usual /dev/mmcblk0 on RPi (as seen in parted):

Number  Start   End     Size    Type     File system  Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  106MB   105MB   primary  fat16        lba
 2      106MB   1887MB  1782MB  primary  ext4

Resizing that first partition is naturally difficult, as it is followed by the ext4 one with RPi's OS, but when you want to have a small (e.g. <2G) and easy-to-write "rpi.img" file for any microSD card, there doesn't seem to be a way around that - the image has to have initial partitions as small as possible to fit on any card.

Things get even more complicated by the fact that there don't seem to be any tools around for resizing FAT filesystems, so the fs has to be re-created from scratch.

There is quite an easy way around all these issues however, which can be summed-up as a sequence of the following steps:

  • Start while rootfs is mounted read-only or when it can be remounted as such, i.e. on early boot.

    Before=systemd-remount-fs.service local-fs-pre.target in systemd terms.

  • Grab sfdisk/parted map of the microSD and check if there's "Free Space" chunk left after last (ext4/os) partition.

    If there is, there's likely a lot of it, as SD cards increase in 2x size factors, so a 4G image written to a larger card will have 4+ gigs there, and in fact a lot more for 16G or 32G cards.

    Or there can be only a few megs there, in case of matching card size, where it's usually a good idea to make slightly smaller images, as actual cards do vary in size a bit.

  • "dd" whole rootfs to the end of the microSD card.

    This is safe with a read-only rootfs, and the dumb "dd" approach to copying it (as opposed to dmsetup + mkfs + cp) seems to be the simplest and least error-prone one (see the sketch after this list).

  • Update partition table to have rootfs in the new location (at the very end of the card) and boot partition covering rest of the space.

  • Initiate reboot, so that OS will load from the new rootfs location.

  • Starting on early boot again, remount rootfs rw if necessary, temporarily copy all contents of the boot partition (which should still be small) to rootfs.

  • Run mkfs.vfat on the new large boot partition and copy stuff back to it from rootfs.

  • Reboot once again, in case whatever boot timeouts got triggered.

  • Avoid running same thing on all subsequent boots.

    E.g. touch /etc/boot-resize-done and have ConditionPathExists=!/etc/boot-resize-done in the systemd unit file.
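
For illustration, the "dd" + repartition steps boil down to something like this (offsets here are made-up for a 16G card - the script derives real ones from the partition map):

# dd if=/dev/mmcblk0p2 of=/dev/mmcblk0 bs=1M seek=13400

...after which the partition table gets rewritten (e.g. via sfdisk) with rootfs at that new offset and the first FAT partition spanning all the freed space before it.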

That should do it \o/

resize-rpi-fat32-for-card (in fgtk repo) is a script I wrote to do all of this stuff, exactly as described.

systemd unit file for the thing (can also be printed by running script with "--print-systemd-unit" option):

[Unit]
DefaultDependencies=no
After=systemd-fsck-root.service
Before=systemd-remount-fs.service -.mount local-fs-pre.target local-fs.target
ConditionPathExists=!/etc/boot-resize-done

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/resize-rpi-fat32-for-card

[Install]
WantedBy=local-fs.target

The script uses lsblk -Jnb JSON output to get the rootfs device and partition, and whether it's mounted read-only, then parted -ms /dev/... unit B print free to grab a machine-readable map of the device.

sfdisk -J (also JSON output) could've been a better option than parted (an extra dep, which is only used to get that one map), except it doesn't conveniently list "free space" blocks and device size - pity.

If partition table doesn't have extra free space at the end, "fsstat" tool from sleuthkit is used to check whether FAT filesystem covers whole partition and needs to be resized.

After that, and only if needed, either "dd + sfdisk" or "cp + mkfs.vfat + cp back" sequence gets executed, followed by a reboot command.

Extra options for the thing:

  • "--skip-ro-check" - don't bother checkin/forcing read-only rootfs before "dd" step, which should be fine, if there's no activity there (e.g. early boot).

  • "--done-touch-file" - allows to specify location of file to create (if missing) when "no resize needed" state gets reached.

    Script doesn't check whether this file exists and always does proper checks of partition table and "fsstat" when deciding whether something has to be done, only creates the file at the end (if it doesn't exist already).

  • "--overlay-image" uses splash.go tool that I've mentioned earlier (be sure to compile it first, ofc) to set some "don't panic, fs resize in progress" image (resized/centered and/or with text and background) during the whole process, using RPi's OpenVG GPU API, covering whatever console output.

  • Misc other stuff for setup/debug - "--print-systemd-unit", "--debug", "--reboot-delay".

    An easy way to debug the thing with these might be to add StandardOutput=tty to the systemd unit's Service section and ... --debug --reboot-delay 60 options there, or possibly to add an extra ExecStart=/bin/sleep 60 after the script (changing its ExecStart= to ExecStart=-, so that the delay will still happen on errors).

    This should provide all the info on what's happening in the script (has plenty of debug output) to the console (one on display or UART).

One more link to the script: resize-rpi-fat32-for-card

Apr 11, 2015

Skype setup on amd64 without multilib/multiarch/chroot

Did a kinda-overdue migration of a desktop machine to amd64 a few days ago.
Exherbo has multiarch there, but I didn't see much point in keeping (and maintaining in various ways) a full-blown set of 32-bit libs just for Skype, which I found I still need occasionally.

The solution I've used before (documented in a past entry) - just grabbing the 32-bit Skype binary plus the full set of libs it needs from whatever distro - still works and applies here, not-so-surprisingly.

What I ended up doing is:

  • Grab the latest Fedora "32-bit workstation" iso (Fedora-Live-Workstation-i686-21-5.iso).

  • Install/run it on a virtual machine (plain qemu-kvm).

  • Download "Dynamic" Skype version (distro-independent tar.gz with files) from Skype site to/on a VM, "tar -xf" it.

  • ldd skype-4.3.0.37/skype | grep 'not found' to see which dependency-libs are missing.

  • Install missing libs - yum install qtwebkit libXScrnSaver

  • scp build_skype_env.bash (from skype-space repo that I have from old days of using skype + bitlbee) to vm, run it on a skype-dir - e.g. ./build_skype_env.bash skype-4.3.0.37.

    Should finish successfully and produce "skype_env" dir in the current path.

  • Copy that "skype_env" dir with all the libs back to pure-amd64 system.

  • Since the skype binary has "/lib/ld-linux.so.2" as a hardcoded interpreter (as it should be), and a pure-amd64 system shouldn't have one (not to mention the missing multiarch prefix) - patch it in the binary with patchelf:

    patchelf --set-interpreter ./ld-linux.so.2 skype
    
  • Run it (from that env dir with all the libs):

    LD_LIBRARY_PATH=. ./skype --resources=.
    

    Should "just work" \o/

One big caveat is that I don't care about any features there except for simple text messaging, which is probably not how most people use Skype, so I didn't test if e.g. audio works there.
Don't think sound should be a problem though, especially since iirc modern skype can use pulseaudio (or even uses it by default?).

Given that skype itself is a huge opaque binary, I do have an AppArmor profile for the thing (it uses the "~/.Skype/env/" dir for bin/libs) - home.skype.

Mar 25, 2015

gnuplot for live "last 30 seconds" sliding window of "free" (memory) data

Was looking at a weird what-looks-like-a-memleak issue somewhere in the system on changing the desktop background (a somewhat surprisingly complex operation, btw) and wanted to get a nice graph of the "last 30s of free -m output", with some labels and easy access to the data.

A simple enough task for gnuplot, but resulting in a somewhat complicated solution, as neither "free" nor gnuplot are perfect tools for the job.

First thing is that free -m -s1 doesn't actually give machine-readable data, and I was too lazy to find something better (should've used sysstat and sar!), so thought "let's just parse that with awk":

free -m -s $interval |
  awk '
    # exports - free(1) columns to pass through,
    # agg - full data log, dst - data file for gnuplot
    BEGIN {
      exports="total used free shared available"
      agg="free_plot.datz"
      dst="free_plot.dat"}
    # header line - remember positions of exported columns
    # (n+1 to account for the Mem: prefix on data lines)
    $1=="total" {
      for (n=1;n<=NF;n++)
        if (index(exports,$n)) headers[n+1]=$n }
    # data line - append values to agg, then rebuild dst
    # as a header line + last $points rows of agg
    $1=="Mem:" {
      first=1
      printf "" >dst
      for (n in headers) {
        if (!first) {
          printf " " >>agg
          printf " " >>dst }
        printf "%d", $n >>agg
        printf "%s", headers[n] >>dst
        first=0 }
      printf "\n" >>agg
      printf "\n" >>dst
      fflush(agg)
      close(dst)
      system("tail -n '$points' " agg " >>" dst) }'

That might be more awk than one ever wants to see, but I imagine there wouldn't be too much space to wiggle around it, as gnuplot is also somewhat picky about its input (either that, or you'd have to write the same kind of scripts in gnuplot itself).

I thought that visualizing a "live" stream of data/measurements would be a kinda typical task for any graphing/visualization solution, but meh, apparently not so much for gnuplot, as I haven't found a better way to do it than the "reread" command.

To be fair, that command seems to do what I want, just not in a very obvious way, seamlessly updating output in the same single window.

The next surprising quirk was "how to plot only the last 30 points from a big file", as it seems to be all-or-nothing with gnuplot, and googling around, I only found people doing it via the usual "tail" before the plotting.

Whatever - added that "tail" hack right into the awk script (as seen above), since column headers are needed there anyway.

Then I also want nice labels - i.e.:

  • How much available memory was there at the start of the graph.
  • How much of it is at the end.
  • Min for that parameter on the graph.
  • Same, but max.

stats won't give first/last values apparently, unless I missed those in the PDF (only available format for up-to-date docs, le sigh), so one solution I came up with is to do a dry-run plot command with set terminal unknown and "grab first value" / "grab last value" functions to "plot".
Which is not really a huge deal, as it's just a preprocessed batch of 30 points, not a huge array of data.

Ok, so without further ado...

src='free_plot.dat'
y0=100; y1=2000;
set xrange [1:30]
set yrange [y0:y1]

# -------------------- dry-run plot to grab stats and first/last values
set terminal unknown
stats src using 5 name 'y' nooutput

is_NaN(v) = v+0 != v
y_first=0
grab_first_y(y) = y_first = y_first!=0 && !is_NaN(y_first) ? y_first : y
grab_last_y(y) = y_last = y

plot src u (grab_first_y(grab_last_y($5)))
x_first=GPVAL_DATA_X_MIN
x_last=GPVAL_DATA_X_MAX

# -------------------- labels, using values from the dry-run above
set label 1 sprintf('first: %d', y_first) at x_first,y_first left offset 5,-1
set label 2 sprintf('last: %d', y_last) at x_last,y_last right offset 0,1
set label 3 sprintf('min: %d', y_min) at 0,y0-(y1-y0)/15 left offset 5,0
set label 4 sprintf('max: %d', y_max) at 0,y0-(y1-y0)/15 left offset 5,1

# -------------------- actual plot, on a real terminal this time
set terminal x11 nopersist noraise enhanced
set xlabel 'n'
set ylabel 'megs'

set style line 1 lt 1 lw 1 pt 2 pi -1 ps 1.5
set pointintervalbox 2

plot\
  src u 5 w linespoints linestyle 1 t columnheader,\
  src u 1 w lines title columnheader,\
  src u 2 w lines title columnheader,\
  src u 3 w lines title columnheader,\
  src u 4 w lines title columnheader,\

# -------------------- wait for a bit, then re-run the whole script
pause 1
reread

Probably the most complex gnuplot script I composed to date.

Yeah, maybe I should've just googled around for an app that does the same thing, though I like how this lore potentially gives the ability to plot whatever other stuff in a similar fashion.

That, and I love all the weird stuff gnuplot can do.

For instance, xterm apparently has some weird "plotter" interface that hardware terminals had in the past:

gnuplot and Xterm Tektronix 4014 Mode

And there's also the famous "dumb" terminal for pseudographics too.

Regular x11 output looks nice and clean enough though:

gnuplot x11 output

It updates smoothly, with line crawling left-to-right from the start and then neatly flowing through. There's a lot of styling one can do to make it prettier, but I think I've spent enough time on such a trivial thing.

Didn't really help much with debugging though. Oh well...

Full "free | awk | gnuplot" script is here on github.

Jan 12, 2015

Starting systemd service instance for device from udev

Needed to implement a thing that would react to a USB flash drive being inserted (into an autonomous BBB device) - get the device name, mount the fs there, rsync stuff to it, unmount.

To avoid whatever concurrency issues (i.e. multiple things screwing with the device in parallel), and to get proper error logging and other startup niceties, the most obvious thing is to wrap the script in a systemd oneshot service.

The only non-immediately-obvious problem for me here was how to properly pass the device to such a service.

With a bit of digging through google results (and even somehow finding one of the posts here among them), eventually found the "Pairing udev's SYSTEMD_WANTS and systemd's templated units" resolved thread, where what seems to be the current-best approach is described.

Adapting it for my case and pairing with generic patterns for device-instantiated services, resulted in the following configuration.

99-sync-sensor-logs.rules:

SUBSYSTEM=="block", ACTION=="add", ENV{ID_TYPE}="disk", ENV{DEVTYPE}=="partition",\
  PROGRAM="/usr/bin/systemd-escape -p --template=sync-sensor-logs@.service $env{DEVNAME}",\
  ENV{SYSTEMD_WANTS}+="%c"
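
The systemd-escape call there translates the device node path into a valid unit instance name, e.g. (for a hypothetical /dev/sda1):

% systemd-escape -p --template=sync-sensor-logs@.service /dev/sda1
sync-sensor-logs@dev-sda1.service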

sync-sensor-logs@.service:

[Unit]
BindsTo=%i.device
After=%i.device

[Service]
Type=oneshot
TimeoutStartSec=300
ExecStart=/usr/local/sbin/sync-sensor-logs /%I

This makes things stop if the script works for too long or if the device vanishes (due to BindsTo=), and properly delays script start until the device is ready.

The "sync-sensor-logs" script at the end gets passed the original unescaped device name as an argument.

It's also easy to apply all the systemd.exec(5) and systemd.service(5) parameters on top of this.

It does not need things like systemctl invocation or manual re-implementation of systemd's escaping either, though running "systemd-escape" still seems to be a necessary evil there.

A systemd-less alternative seems to be having a script that does per-device flock, timeout logic and a lot more checks for whether the device is ready and/or still there, so this approach looks way saner and clearer, with the caveat that one should probably be familiar with all these systemd features.

Oct 05, 2014

Simple aufs setup for Arch Linux ARM and boards like RPi, BBB or Cubie

Experimenting with all kinds of arm boards lately (nyms above stand for Raspberry Pi, Beaglebone Black and Cubieboard), I can't help but feel a bit sorry for the microsd cards in each one of them.

These are even worse for non-bulk writes than SSDs, having fewer erase cycles plus larger blocks, and yet when used for all fs needs of the board, even typing "ls" into a shell will usually emit a write (unless the shell doesn't keep history, which sucks).

A great explanation of how they work can be found on LWN (as usual).

Easy and relatively hassle-free way to fix the issue is to use aufs, but as doing it for the whole rootfs requires initramfs (which is not needed here otherwise), it's a lot easier to only use it for commonly-writable parts - i.e. /var and /home in most cases.

Home for "root" user is usually /root, so to make it aufs material as well, it's better to move that to /home (which probably shouldn't be a separate fs on these devices), leaving /root as a symlink to that.

It seems to be impossible to do that when logged-in as root (mv will error with EBUSY), but it's trivial from any other machine:

# mount /dev/sdb2 /mnt # mount microsd
# cd /mnt
# mv root home/
# ln -s home/root
# cd
# umount /mnt

As aufs2 is already built into the Arch Linux ARM kernel, the only thing left is to add an early-boot systemd unit for mounting it, e.g. /etc/systemd/system/aufs.service:

[Unit]
DefaultDependencies=false

[Install]
WantedBy=local-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=true

# Remount /home and /var as aufs
ExecStart=/bin/mount -t tmpfs tmpfs /aufs/rw
ExecStart=/bin/mkdir -p -m0755 /aufs/rw/var /aufs/rw/home
ExecStart=/bin/mount -t aufs -o br:/aufs/rw/var=rw:/var=ro none /var
ExecStart=/bin/mount -t aufs -o br:/aufs/rw/home=rw:/home=ro none /home

# Mount "pure" root to /aufs/ro for syncing changes
ExecStart=/bin/mount --bind / /aufs/ro
ExecStart=/bin/mount --make-private /aufs/ro

And then create the dirs used there and enable unit:

# mkdir -p /aufs/{rw,ro}
# systemctl enable aufs

Now, upon rebooting the board, you'll get aufs mounts for /home and /var, making all writes there go to the respective /aufs/rw dirs on tmpfs, while still reading all pre-existing contents from the underlying rootfs.

To make sure systemd doesn't waste extra tmpfs space thinking it can sync logs to /var/log/journal, I'd also suggest doing this (before rebooting with aufs mounts):

# rm -rf /var/log/journal
# ln -s /dev/null /var/log/journal

Can also be done via journald.conf with Storage=volatile.
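
I.e. in /etc/systemd/journald.conf:

[Journal]
Storage=volatile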

One obvious caveat with aufs is, of course, how to deal with things that do expect permanent storage in /var - examples being pacman (Arch package manager) on system updates, postfix, or any db.
For stock Arch Linux ARM though, it's only pacman on manual updates.

And depending on the app and how "ok" the loss of this data might be, the app's dir in /var (e.g. /var/lib/pacman) can either be moved + symlinked to /srv, or synced before shutdown or after the app is done writing (for manual oneshot apps like pacman).
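
E.g. for pacman, a sketch of the move + symlink option (assuming /srv is on the permanent rootfs, and doing it before enabling the aufs unit, or syncing afterwards):

# mv /var/lib/pacman /srv/pacman
# ln -s /srv/pacman /var/lib/pacman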

For moving stuff back to permanent fs, aubrsync from aufs2-util.git can be used like this:

# aubrsync move /var/ /aufs/rw/var/ /aufs/ro/var/

As even pulling that from shell history can be a bit tedious, I've made a simpler ad-hoc wrapper - aufs_sync - that can be used (with mountpoints similar to the ones presented above) like this:

# aufs_sync
Usage: aufs_sync { copy | move | check } [module]
Example (flushes /var): aufs_sync move var

# aufs_sync check
/aufs/rw
/aufs/rw/home
/aufs/rw/home/root
/aufs/rw/home/root/.histfile
/aufs/rw/home/.wh..wh.orph
/aufs/rw/home/.wh..wh.plnk
/aufs/rw/home/.wh..wh.aufs
/aufs/rw/var
/aufs/rw/var/.wh..wh.orph
/aufs/rw/var/.wh..wh.plnk
/aufs/rw/var/.wh..wh.aufs
--- ... just does "find /aufs/rw"

# aufs_sync move
--- does "aubrsync move" for all dirs in /aufs/rw

Just be sure to check if any new apps might write something important there (right after installing these) and do symlinks (to something like /srv) for their dirs, as even having "aufs_sync copy" on shutdown definitely won't prevent data loss for these on e.g. sudden power blackout or any crashes.

Sep 23, 2014

tmux rate-limiting magic against terminal spam/flood lock-ups

Update 2015-11-08: No longer necessary (or even supported in 2.1) - tmux' new "backoff" rate-limiting approach works like a charm with defaults \o/

Had the issue of some spammy binary locking up the terminal for a long time, but never bothered to do something about it... until now.

Happens with any terminal I've seen - just run something like this in the shell there:

# for n in {1..500000}; do echo "$spam $n"; done

And for at least several seconds, terminal is totally unresponsive, no matter how many screen's / tmux'es are running there.
It's usually faster to kill the term window via WM and re-attach to whatever was inside from a new one.

xterm seems to be one of the most resistant *terms to this; terminology, for example, much less so - which I guess just means that it's fancier and hence slower at drawing millions of output lines.

Anyhow, the tmux.conf magic:

set -g c0-change-trigger 150
set -g c0-change-interval 100

"man tmux" says that 250/100 are defaults, but it doesn't seem to be true, as just setting these "defaults" explicitly here fixes the issue, which exists with the default configuration.

The fix just limits the rate of tmux output to basically 150 newlines (which is like twice my terminal height anyway) per 100 ms, so xterm won't get overflown with a "rendering megs of text" backlog, remaining apparently-unresponsive (to any other output) for a while.

Since I always run tmux as a top-level multiplexer in xterm, this totally solved the annoyance for me. Just wish I'd done it much sooner - would've saved me a lot of time and probably some rage-burned neurons.

Jul 16, 2014

(yet another) Dynamic DNS thing for tinydns (djbdns)

Tried to find any simple script to update tinydns (part of djbdns) zones that'd be better than ssh dns_update@remote_host update.sh, but failed - they all seem to be hacky php scripts, doomed to run behind httpds, send passwords in url, query random "myip" hosts or something like that.

What I want instead is something that won't be making http, tls or ssh connections (and stirring all the crap behind these), but would rather just send udp or even icmp pings to remotes, which should be enough for update, given source IPs of these packets and some authentication payload.

So yep, wrote my own scripts for that - tinydns-dynamic-dns-updater project.

The tool sends UDP packets with 100 bytes of "( key_id || timestamp ) || Ed25519_sig" from clients, authenticating and distinguishing these server-side by their signing keys ("key_id" there is to avoid iterating over all of them, checking which one matches the signature).

Server zone files can have "# dynamic: ts key1 key2 ..." comments before records (separated from any static records after them by comments or empty lines), meaning that source IPs of packets with correct signatures (and more recent timestamps) will get recorded into the A/AAAA records that follow (depending on source address family), replacing what's already there but leaving everything else in the file intact.
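
A sketch of such a zone-file fragment (hypothetical key name and IP, "+fqdn:ip:ttl" being a regular tinydns A-record line):

# dynamic: 1405000000 mykey1
+myhost.example.org:203.0.113.15:300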

The zone file only gets replaced if something is actually updated, and it's possible to use a dynamic IP for the server as well, via a dynamic hostname on the client (which gets resolved for each delayed packet).

The lossy nature of UDP can be easily mitigated by passing e.g. "-n5" to the client script, so it'd send 5 packets (with exponential delays by default, configurable via --send-delay), plus just running the thing at fairly regular intervals from crontab.

Putting the server script into a socket-activated systemd service file also makes all the daemon-specific pains - privileged ports (and most other security/access things), startup/daemonization, restarts, auto-suspend timeouts and logging woes - just go away, so there's a --systemd flag for that too.

Given how easy it is to run a djbdns/tinydns instance, there really doesn't seem to be any compelling reason not to use your own dynamic dns stuff for every single machine or device that can run simple python scripts.

Github link: tinydns-dynamic-dns-updater

Jun 15, 2014

Running isolated Steam instance with its own UID and session

Finally got around to installing Steam platform to a desktop linux machine.
Been using Win7 instance here for games before, but as another fan in my laptop died, have been too lazy to reboot into dedicated games-os here.

Given that Steam is a closed-source proprietary DRM platform for mass software distribution, it seems to be either an ideal malware spread vector or just a recipe for disaster, so of course I'm not keen on giving it any access in a non-dedicated OS.

I also feel a bit guilty about giving the thing any extra PR, as it's the worst kind of always-on DRM crap in principle, and it has already pretty much monopolized the PC gaming market.
These days even many game critics push for filtering and essentially abuse of that immense leverage - not a good sign at all.
To its credit, of course, Steam is nice and convenient to use, as such things (e.g. google, fb, droids, apple, etc) tend to be.

So, isolation:

  • To avoid having Steam and any games anywhere near $HOME, giving it a separate UID is a good way to go.

  • That should also allow it to run in a separate desktop session - i.e. have its own cgroup, to easily contain, control and set limits for games:

    % loginctl user-status steam
    steam (1001)
      Since: Sun 2014-06-15 18:40:34 YEKT; 31s ago
      State: active
      Sessions: *7
        Unit: user-1001.slice
              └─session-7.scope
                ├─7821 sshd: steam [priv]
                ├─7829 sshd: steam@notty
                ├─7830 -zsh
                ├─7831 bash /usr/bin/steam
                ├─7841 bash /home/steam/.local/share/Steam/steam.sh
                ├─7842 tee /tmp/dumps/steam_stdout.txt
                ├─7917 /home/steam/.local/share/Steam/ubuntu12_32/steam
                ├─7942 dbus-launch --autolaunch=e52019f6d7b9427697a152348e9f84ad ...
                └─7943 /usr/bin/dbus-daemon --fork --print-pid 5 ...
    
  • AppArmor should help further isolate the processes from having any access beyond what's absolutely necessary for them to run, warn when they try to do strange things, and restrict them from doing outright stupid things.

  • Given the separate UID and cgroup, network access of all Steam apps can be easily controlled via e.g. iptables, to keep Steam and games from scanning and abusing other things on the LAN, for example.


Creating the steam user should be as simple as useradd steam, but switching to that UID from within a running DE should still allow access to the same X server, start a systemd session for it, and not carry over any extra env, permissions, dbus access, fd's and such from the main session.

By far the easiest way to do that I've found is to just ssh steam@localhost, putting proper pubkey into ~steam/.ssh/authorized_keys first, of course.
That should ensure that nothing leaks from the DE but whatever ssh passes along, and ssh is a rather paranoid security-oriented tool, so it can be trusted with that.
Steam comes with a bootstrap script (e.g. /usr/bin/steam) to install itself, which also starts the thing when it's installed, so Steam AppArmor profile (github link) is for that.
It should allow both bootstrapping/installing the stuff as well as running it, yet not let steam poke too much into other shared dirs or processes.

To allow access to X, xhost or an ~/.Xauthority cookie can be used, along with some extra env in e.g. ~/.zshrc:

export DISPLAY=':1.0'
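
If going the xhost route, access for that specific uid can be granted from the main DE session via xhost's "server interpreted" address syntax:

% xhost +si:localuser:steam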

In a similar fashion to ssh, I've used pulseaudio network streaming to the main DE sound daemon on localhost for sound (also in ~/.zshrc):

export PULSE_SERVER='{e52019f6d7b9427697a152348e9f84ad}tcp6:malediction:4713'
export PULSE_COOKIE="$HOME"/.pulse-cookie

(I have pulse network streaming set up anyway, for sharing sound from desktop to laptop - to e.g. play videos on the big screen there yet hear sound from the laptop's headphones)

Running Steam will also start its own dbus session (maybe it's the pulse client lib doing that, didn't check), but it doesn't seem to be used for anything, so there seems to be no need to share it with the main DE.


That should allow starting Steam after ssh'ing to steam@localhost, but the process can be made much easier (and more foolproof) with e.g. ~/bin/steam as:

#!/bin/bash

cmd=$1
shift

# wait up to ~1s for all of steam-user's "steam" processes to exit
steam_wait_exit() {
  for n in {0..10}; do
    pgrep -U steam -x steam >/dev/null || return 0
    sleep 0.1
  done
  return 1
}

case "$cmd" in
  '')
    ssh steam@localhost <<EOF
source .zshrc
exec steam "$@"
EOF
    loginctl user-status steam ;;

  s*) loginctl user-status steam ;;

  # "kill" - ask Steam to shut down cleanly first, then kill the whole session
  k*)
    steam_exited=
    pgrep -U steam -x steam >/dev/null
    [[ $? -ne 0 ]] && steam_exited=t
    [[ -z "$steam_exited" ]] && {
      ssh steam@localhost <<EOF
source .zshrc
exec steam -shutdown
EOF
      steam_wait_exit
      [[ $? -eq 0 ]] && steam_exited=t
    }
    sudo loginctl kill-user steam
    [[ -z "$steam_exited" ]] && {
      steam_wait_exit || sudo loginctl -s KILL kill-user steam
    } ;;

  *) echo >&2 "Usage: $(basename "$0") [ status | kill ]"
esac

Now just running steam in the main DE will start the thing in its own $HOME.

For further convenience, there's steam status and steam kill to easily monitor or shut down the running Steam session from the terminal.

Note the complicated shutdown thing - Steam doesn't react to INT or TERM signals cleanly, passing these to the running games instead, and should be terminated via its own cli option (and the rest can then be killed-off too).


With this setup, iptables rules for outgoing connections can use a user-slice cgroup match (in kernel 3.14 at least) or -m owner --uid-owner steam matches on the socket owner uid.

The only non-WAN things Steam connects to here are DNS servers and aforementioned pulseaudio socket on localhost, the rest can be safely firewalled.
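
E.g. a sketch of the uid-owner matching (assuming LAN on 192.168.0.0/16, with DNS allowed elsewhere):

# iptables -A OUTPUT -m owner --uid-owner steam -d 192.168.0.0/16 -j REJECT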


Finally, running KSP there on Exherbo, I quickly discovered that sound libs and plugins - alsa and pulse - in the ubuntu "runtime" that steam bootstrap sets up don't work well - either there's no sound or the game fails to load at all.

An easy fix is to copy the runtime it uses (the 32-bit one for me) and clean the alien stuff out of there where it's already present in the system, i.e.:

% cp -R .steam/bin32/steam-runtime my-runtime
% find my-runtime -type f\
  \( -path '*asound*' -o -path '*alsa*' -o -path '*pulse*' \) -delete

And then add something like this to ~steam/.zshrc:

steam() { STEAM_RUNTIME="$HOME"/my-runtime command steam "$@"; }

That should keep all the known-working Ubuntu libs that steam bootstrap fetches away from the rest of the system (where stuff like Mono just isn't needed, and others will cause trouble), while allowing to remove any of them from the runtime to use the same thing from the system.

And yay - Kerbal Space Program seems to work here way faster than on Win7.

KSP and Steam on Linux

Aug 08, 2013

Encrypted root on a remote vds

Most advice wrt encryption on remote hosts (VPS, VDS) doesn't seem to involve full-disk encryption as such, but is rather limited to encrypting /var and /home, so that the machine will boot from a non-crypted / and you'll be able to ssh to it, decrypt these parts manually, then start the services that use data there.

That seems to be in contrast with what's generally used on local machines - make a LUKS container right on top of the physical disk device, except for /boot (if it's not on a USB key), and don't let that encryption layer bother you anymore.

The two policies seem to differ in that the former is opt-in - you have to actively think about which data to put onto the encrypted part (e.g. /etc/ssl has private keys? move them to /var, shred from /etc), while the latter is opt-out - everything is encrypted, period.

So, in the spirit of that opt-out way, I thought it'd be a drag to go double-thinking wrt which data should be stored where, and that it'd be better to just go ahead and put everything possible into an encrypted container on a remote host as well, leaving only /boot with kernel and initramfs in the clear.

Naturally, to enter the encryption password without having it stored alongside the LUKS header, some remote login over the network is in order, and sshd seems to be the secure and easy way to go about it.
The initramfs in question should then also be able to set up networking, which luckily dracut can do. OpenSSH's sshd is a bit too heavy for it though, but there are much lighter sshd's like dropbear.

Searching around for someone to have tied the two things together already, I only found somewhat incomplete and non-packaged solutions, like this RH enhancement proposal and a set of hacky scripts and instructions in the dracut-crypt-wait repo on bitbucket.

The approach outlined in the RH bugzilla is to have the dracut "crypt" module operate normally and let cryptsetup query for the password on the linux console, but also start sshd in the background, where one can log in and use a simple tool to echo the password to that console (without having it echoed back).
dracut-crypt-wait instead does a clever hack of removing the "crypt" module hook, basically creating a "rescue" console on sshd, where the user has to manually do all the decryption necessary and then signal initramfs to proceed with the boot.

I thought the first way was the more elegant and clever one, allowing dracut to figure out which device to decrypt and to start cryptsetup with all the necessary, configured and documented parameters, while still allowing to type the passphrase into the console - best of both worlds - so went along with that one, creating the dracut-crypt-sshd project.

As the README there explains, using it is as easy as adding the module in dracut.conf (or passing it to dracut on the command line) and adding networking parameters to grub.cfg, e.g.:

menuentry "My Linux" {
        linux /vmlinuz ro root=LABEL=root
                rd.luks.uuid=7a476ea0 rd.lvm.vg=lvmcrypt rd.neednet=1
                ip=88.195.61.177::88.195.61.161:255.255.255.224:myhost:enp0s9:off
        initrd /dracut.xz
}

("ip=dhcp" might be simplier way to go, but doesn't yield default route in my case)

And there, you'll have sshd on that IP, port 2222 (configurable), with keys pre-generated during dracut build, which might be a good idea to put into "known_hosts" for that ip/port somewhere. "authorized_keys" is taken from /root/.ssh by default, but that's also configurable via dracut.conf, if necessary.

Apart from sshd, that module includes two tools for interaction with console - console_peek and console_auth (derived from auth.c in the bugzilla link above).
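
Connecting there is the usual ssh affair, e.g. with the IP/port from the grub.cfg example above:

% ssh -p2222 root@88.195.61.177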

Logging in to that sshd then yields sequence like this:

[214] Aug 08 13:29:54 lastlog_perform_login: Couldn't stat /var/log/lastlog: No such file or directory
[214] Aug 08 13:29:54 lastlog_openseek: /var/log/lastlog is not a file or directory!

# console_peek
[    1.711778] Write protecting the kernel text: 4208k
[    1.711875] Write protecting the kernel read-only data: 1116k
[    1.735488] dracut: dracut-031
[    1.756132] systemd-udevd[137]: starting version 206
[    1.760022] tsc: Refined TSC clocksource calibration: 2199.749 MHz
[    1.760109] Switching to clocksource tsc
[    1.809905] systemd-udevd[145]: renamed network interface eth0 to enp0s9
[    1.974202] 8139too 0000:00:09.0 enp0s9: link up, 100Mbps, full-duplex, lpa 0x45E1
[    1.983151] dracut: sshd port: 2222
[    1.983254] dracut: sshd key fingerprint: 2048 0e:14:...:36:f9  root@congo (RSA)
[    1.983392] dracut: sshd key bubblebabble: 2048 xikak-...-poxix  root@congo (RSA)
[185] Aug 08 13:29:29 Failed reading '-', disabling DSS
[186] Aug 08 13:29:29 Running in background
[    2.093869] dracut: luksOpen /dev/sda3 luks-...
Enter passphrase for /dev/sda3:
[213] Aug 08 13:29:50 Child connection from 188.226.62.174:46309
[213] Aug 08 13:29:54 Pubkey auth succeeded for 'root' with key md5 0b:97:bb:...

# console_auth
Passphrase:

#

The first command - "console_peek" - allows to see which password is being requested (if any), and the second one allows to log in.

Note that fingerprints of host keys are also echoed to the console on sshd start, in case one has access to the console but still needs sshd later.

I quickly found out that such an initramfs with sshd is also a great and robust rescue tool, especially if the "debug" and/or "rescue" dracut modules are enabled.

And as it includes fairly comprehensive network-setup options, it might be a good way to boot multiple different OS'es with the same (machine-specific) network parameters.

The probably-obligatory disclaimer for such a post: the crypto above won't save you from a malicious hoster or whatever three-letter agency that will coerce it into cooperation, should it take interest in your poor machine - they'll just extract keys from a RAM image (especially if it's a virtualized VPS) or backdoor the kernel/initramfs and force a reboot.

The threat model here is more trivial - being able to turn off and decommission a host without fear of disks/images later falling into some other party's hands, which might also happen if the hoster eventually goes bust or sells/scraps disks due to age or bad blocks.

Also, even a minor inconvenience like forcing the extraction of keys as outlined above might be helpful against the quite well-known "we came fishing to a datacenter, shut everything down, give us all the hardware in these racks" tactic employed by some agencies.

Absolute security is a myth, but these measures are fairly trivial and practical enough to be employed casually, cutting off at least some number of basic threats.

So, yay for dracut, the amazingly cool and hackable initramfs project, which made it that easy.

Code link: https://github.com/mk-fg/dracut-crypt-sshd
