Aug 05, 2016

D3 chart for common temperature/rh time-series data

More D3 tomfoolery!

It's been a while since I touched the thing, but I was recently asked to make a simple replacement for processing common-case time-series from temperature + relative-humidity (calling these "t" and "rh" here) sensors (DHT22, sht1x, or what have you), which had been painstakingly done in MS Excel (from tsv data) until now.

So here's the plot:

[image: d3 t/rh chart]
Interactive version can be run directly from mk-fg/fgtk: d3-temp-rh-sensor-tsv-series-chart.html
Bunch of real-world data samples for the script: d3-temp-rh-sensor-tsv-series-chart.zip

Misc feats of the thing, in no particular order:

  • Single-html d3.v4.js + ES6 webapp (assembled by html-embed script) that can be opened from localhost or any static httpd on the net.
  • Drag-and-drop or multi-file browse/pick box, for uploading any number of tsv files (in whatever order, possibly with gaps in data) instantly to JS on the page.
  • Line chart with two Y axes (one for t, one for rh).
  • Smaller "overview" chart below that, where one can "brush" needed timespan (i.e. subset of uploaded data) for all the other charts and readouts.
  • Mouseover "vertical line" display snapping to specific datapoints.
  • List of basic stats for picked range - min/max, timespan, value count.
  • Histograms for value distribution, to easily see typical values for picked timespan, one each for t and rh.
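
To give a rough idea of what the "basic stats for picked range" readout boils down to, here's a minimal Python sketch (the page itself is JS). The three-column "unix-timestamp, t, rh" tsv layout and the range_stats name are assumptions made for illustration, not the actual data format or code of the chart:

```python
import csv, io

def range_stats(tsv_text, ts_min, ts_max):
    """Return min/max, timespan and value count for rows within [ts_min, ts_max].
    Assumes hypothetical "timestamp<TAB>t<TAB>rh" rows, in any order."""
    rows = []
    for rec in csv.reader(io.StringIO(tsv_text), delimiter='\t'):
        if len(rec) < 3:
            continue  # skip blank or short lines
        ts, t, rh = float(rec[0]), float(rec[1]), float(rec[2])
        if ts_min <= ts <= ts_max:
            rows.append((ts, t, rh))
    if not rows:
        return None
    rows.sort()  # files can be uploaded in whatever order
    t_vals = [r[1] for r in rows]
    rh_vals = [r[2] for r in rows]
    return dict(
        count=len(rows), timespan=rows[-1][0] - rows[0][0],
        t_min=min(t_vals), t_max=max(t_vals),
        rh_min=min(rh_vals), rh_max=max(rh_vals))
```

Same filtering by timespan is what the "brush" on the overview chart does for all the other charts and readouts.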

Kinda love this sort of interactive vis stuff, and it only takes a bunch of hours to put it all together with d3, as opposed to something like rrdtool, its dead images and quirky mini-language.

Also, surprisingly common use-case for this particular chart, as having such sensors connected to some RPi is pretty much first thing people usually want to do (or maybe close second after LEDs and switches).

Will probably look a bit further to make it into an offline-first Service Worker app, just for the heck of it, see how well this stuff works these days.

No point to this post, other than forgetting to write stuff for months is bad ;)

May 15, 2016

Debounce bogus repeated mouse clicks in Xorg with xbindkeys

My current Razer E-Blue mouse had this issue since I got it - Mouse-2 / BTN_MIDDLE / middle-click (useful mostly as "open new tab" in browsers and "paste" in X) sometimes produces two click events (in rapid succession) instead of one.

It was more rare before, but lately it feels like it's harder to make it click once than twice.

Seems to be either a hardware problem with debouncing circuitry or logic in the controller, or maybe the button itself not mashing switch contacts against each other hard enough... or soft enough (i.e. non-elastic), actually, given that they shouldn't "bounce" against each other.

Since there's no need to double-click that wheel-button ever, it looks rather easy to debounce the click on Xorg input level, by ignoring repeated button up/down events after producing the first full "click".

Easiest solution of that kind that I've found was to use guile (scheme) script with xbindkeys tool to keep that click-state data and perform clicks selectively, using xdotool:

(define razer-delay-min 0.2)
(define razer-wait-max 0.5)
(define razer-ts-start #f)
(define razer-ts-done #f)
(define razer-debug #f)

(define (mono-time)
  "Return monotonic timestamp in seconds as real."
  (+ 0.0 (/ (get-internal-real-time) internal-time-units-per-second)))

(xbindkey-function '("b:8") (lambda ()
  (let ((ts (mono-time)))
    (when
      ;; Enforce min ts diff between "done" and "start" of the next one
      (or (not razer-ts-done) (>= (- ts razer-ts-done) razer-delay-min))
      (set! razer-ts-start ts)))))

(xbindkey-function '(Release "b:8") (lambda ()
  (let ((ts (mono-time)))
    (when razer-debug
      (format #t "razer: ~a/~a delay=~a[~a] wait=~a[~a]\n"
        razer-ts-start razer-ts-done
        (and razer-ts-done (- ts razer-ts-done)) razer-delay-min
        (and razer-ts-start (- ts razer-ts-start)) razer-wait-max))
    (when
      (and
        ;; Enforce min ts diff between previous "done" and this one
        (or (not razer-ts-done) (>= (- ts razer-ts-done) razer-delay-min))
        ;; Enforce max "click" wait time
        (and razer-ts-start (<= (- ts razer-ts-start) razer-wait-max)))
      (set! razer-ts-done ts)
      (when razer-debug (format #t "razer: --- click!\n"))
      (run-command "xdotool click 2")))))

Note that xbindkeys actually grabs "b:8" here, which is "mouse button 8". If it grabbed "b:2" instead, the "xdotool click 2" command would recurse into the same code, so the wheel-clicker should be bound to button 8 in X for this to work.

Rebinding buttons in X is trivial to do on-the-fly, using standard "xinput" tool - e.g. xinput set-button-map "My Mouse" 1 8 3 (xinitrc.input script can be used as an extended example).

Running "xdotool" to do actual clicks at the end seems a bit wasteful, as xbindkeys already hooks into similar functionality, but unfortunately there are no "send input event" calls exported to guile scripts (as of 1.8.6, at least).

Still, works well enough as it is, fixing that rather annoying issue.
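
For anyone who doesn't read scheme, the same click-state logic can be sketched in Python as a tiny state machine. This is only an illustration of the timing rules from the script above (the Debouncer class and its method names are made up here), with timestamps passed in explicitly so that it can be played with outside of X:

```python
class Debouncer:
    """Mirrors razer-delay-min / razer-wait-max checks from the guile script."""

    def __init__(self, delay_min=0.2, wait_max=0.5):
        self.delay_min, self.wait_max = delay_min, wait_max
        self.ts_start = None  # when current click was started (button down)
        self.ts_done = None   # when last accepted click was finished

    def _delay_ok(self, ts):
        # Enforce min time between last "done" and any new event
        return self.ts_done is None or ts - self.ts_done >= self.delay_min

    def press(self, ts):
        if self._delay_ok(ts):
            self.ts_start = ts

    def release(self, ts):
        """Return True if this button-release should produce a click."""
        if (self._delay_ok(ts)
                and self.ts_start is not None
                and ts - self.ts_start <= self.wait_max):
            self.ts_done = ts
            return True
        return False
```

Feeding it "press 0.0, release 0.05, press 0.06, release 0.08" should accept only the first release, as the bogus second press/release pair falls within delay_min of the first click.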

[xbindkeysrc.scm on github]

Mar 03, 2016

Python 3 killer feature - asyncio

I've been really conservative with the whole py2 -> py3 migration (shiny new langs don't seem to be my thing), but one feature that finally makes it worth the effort is well-integrated - by now (Python-3.5 with its "async" and "await" statements) - asyncio eventloop framework.

Basically, it's a twisted core, including eventloop hooked into standard socket/stream ops, sane futures implementation, all the Transports/Protocols/Tasks base classes and such concepts, standardized right there in Python's stdlib.

On one hand, baking this stuff into language core seems to be somewhat backwards, but I think it's actually a really smart thing to do - not only does it eliminate the whole "tech zoo" problem that nodejs ecosystem has, but it also gets rid of "require huge twisted blob or write my own half-assed eventloop base" that pops up in every second script, even the most trivial ones.
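
As a taste of how little boilerplate that takes, here's a minimal sketch with two coroutines running concurrently on the stdlib eventloop. The fetch/main names are made up for the example, and asyncio.run() used to start it only appeared in later Python versions (3.7+), so with 3.5 it'd be loop.run_until_complete() instead:

```python
import asyncio

async def fetch(name, delay):
    # Stand-in for some real socket/stream operation
    await asyncio.sleep(delay)
    return '{} done'.format(name)

async def main():
    # Both coroutines wait at the same time, results come back in argument order
    return await asyncio.gather(fetch('a', 0.02), fetch('b', 0.01))

# Usage: results = asyncio.run(main())
```

No third-party modules or hand-rolled select() loops involved, which is kinda the whole point.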

Makes it worth starting any py script with py3 shebang for me, at last \o/

Dec 29, 2015

Tool to interleave and colorize lines from multiple log (or any other) files

There's multitail thing to tail multiple logs, potentially interleaved, in one curses window, which is painful-to-impossible to browse through, as you'd do with simple "less".

There's lnav for parsing and normalizing a bunch of logs, and continuously monitoring these, also interactive.

There's rainbow to color specific lines based on regexp, which can't really do any interleaving.

And this has been bugging me for a while - there seems to be no easy way to get this:

[image: interleaved and colorized output]

This is an interleaved output from several timestamped log files, for events happening at nearly the same time (which can be used to establish the sequence between these and correlate output of multiple tools/instances/etc), browsable via the usual "less" (or whatever other $PAGER) in an xterm window.

In this case, logfiles are from "btmon" (bluetooth sniffer tool), "bluetoothd" (bluez) debug output and an output from gdb attached to that bluetoothd pid (showing stuff described in previous entry about gdb).

None of these tools put timestamps into their output by default, but this is easy to fix by piping it through any tool which would add them into every line, svlogd for example.

To be concrete (and to show one important thing about such log-from-output approach), here's how I got these particular logs:

# mkdir -p debug_logs/{gdb,bluetoothd,btmon}

# gdb -ex 'source gdb_device_c_ftrace.txt' -ex q --args\
        /usr/lib/bluetooth/bluetoothd --nodetach --debug\
        1> >(svlogd -r _ -ttt debug_logs/gdb)\
        2> >(svlogd -r _ -ttt debug_logs/bluetoothd)

# stdbuf -oL btmon |\
        svlogd -r _ -ttt debug_logs/btmon

Note that "btmon" runs via coreutils stdbuf tool, which can be critical for anything that writes to its stdout via libc's fwrite(), i.e. can have buffering enabled there. When stdout is a pipe, such buffering causes stuff to be output delayed and in batches, not how it'd appear in the terminal (where line buffering is used), resulting in incorrect timestamps, unless stdbuf or any other option disabling such buffering is used.
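
If svlogd is not around, "any tool which would add timestamps" can be as simple as the following Python sketch (not what was used for the logs here). The stamp_lines name and timestamp format are arbitrary choices for illustration, and the clock is a parameter only to make the thing easy to test:

```python
import time

def stamp_lines(lines, clock=time.time):
    """Prefix each line with UTC timestamp taken at the moment it is read."""
    for line in lines:
        ts = time.strftime('%Y-%m-%dT%H:%M:%S', time.gmtime(clock()))
        text = line.rstrip('\n')
        yield '{} {}'.format(ts, text)

# Usage as a pipe filter (flush matters for same reason as stdbuf above):
#   import sys
#   for out in stamp_lines(sys.stdin): print(out, flush=True)
```

Timestamps are only as good as the moment the line arrives at the filter, so buffering on the producer side still has to be disabled, same as with svlogd.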

With three separate logs from above snippet, natural thing you'd want is to see these all at the same time, so for each logical "event", there'd be output from btmon (network packet), bluetoothd (debug logging output) and gdb's function call traces.

It's easy to concatenate all three logs and sort them to get these interleaved, but then it can be visually hard to tell which line belongs to which file, especially if they are from several instances of the same app (not really the case here though).

Simple fix is to add per-file distinct color to each line of each log, but then you can't sort these, as color sequences get in the way, it's non-trivial to do even that, and it all adds-up to a script.

Seems to be hard to find any existing tools for the job, so wrote a script to do it - liac (in the usual mk-fg/fgtk github repo), which was used to produce output in the image above - that is, interleave lines (using any field for sorting btw), add tags for distinct ANSI colors to ones belonging to different files and optional prefixes.
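
Core idea can be sketched in a few lines of Python: merge on the raw sort field first, and only wrap lines in color sequences after that, so that colors don't get in the way of sorting. This is NOT the actual liac code, just a minimal illustration, assuming each file is already sorted by its first whitespace-separated field (timestamp) and that these compare fine as strings:

```python
import heapq

COLORS = [31, 32, 33, 34, 35, 36]  # ANSI foreground color codes

def interleave(logs):
    """logs: dict of name -> iterable of lines, each sorted by first field.
    Yields lines merged by that field, colored and prefixed per-file."""
    def tagged(n, name, lines):
        color = COLORS[n % len(COLORS)]
        for line in lines:
            line = line.rstrip('\n')
            key = line.split(None, 1)[0] if line.strip() else ''
            # Sort key stays separate from the colorized output string
            yield key, n, '\033[{}m{}: {}\033[0m'.format(color, name, line)
    streams = [
        tagged(n, name, lines)
        for n, (name, lines) in enumerate(logs.items())]
    for key, n, out in heapq.merge(*streams):
        yield out
```

heapq.merge() here never loads whole files into memory, which is nice for large logs, but it does rely on per-file ordering being correct to begin with.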

Thought it might be useful to leave a note for anyone looking for something similar.

[script source link]

Getting log of all function calls from specific source file using gdb

Maybe I'm doing debugging wrong, but messing with code written by other people, first question for me is usually not "what happens in function X" (done by setting a breakpoint on it), but rather "which file/func do I look into".

I.e. having an observable effect - like "GET_REPORT messages get sent on HID level to bluetooth device, and replies are ignored", it's easy to guess that it's either linux kernel or bluetoothd - part of BlueZ.

Question then becomes "which calls in app happen at the time of this observable effect", and luckily there's an easy, but not very well-documented (unless my google is that bad) way to see it via gdb for C apps.

For scripts, it's way easier of course, e.g. in python you can do python -m trace ... and it can dump even every single line of code it runs.

First of all, app in question has to be compiled with "-g" option and not "stripped", of course, which should be easy to set via CFLAGS, usually, defining these in distro-specific ways if rebuilding a package to include that (e.g. for Arch - have debug !strip in OPTIONS line from /etc/makepkg.conf).

Then running app under gdb can be done via something like gdb --args someapp arg1 arg2 (and typing "r" there to start it), but if the goal is to get a log of all function calls (and not just in a "call graph" way profilers like gprof do) from a specific file, first - interactivity has to go, second - breakpoints have to be set for all these funcs and then logged when app runs.

Alas, there seems to be no way to add a breakpoint to every func in a file.

One common suggestion (does NOT work, don't copy-paste!) I've seen is doing rbreak device\.c: ("rbreak" is a regexp version of "break") to match e.g. profiles/input/device.c:extract_hid_record (as well as all other funcs there), which would be "filename:funcname" pattern in my case, but it doesn't work and shouldn't work, as "rbreak" only matches the "funcname" part.

So trivial script is needed to a) get list of funcs in a source file (just name is enough, as C has only one namespace), and b) put a breakpoint on all of them.

This is luckily quite easy to do via ctags, with this one-liner:

% ctags -x --c-kinds=fp profiles/input/device.c |
  awk 'BEGIN {print "set pagination off\nset height 0\nset logging on\n\n"}\
    {print "break", $1 "\ncommands\nbt 5\necho ------------------\\n\\n\nc\nend\n"}\
    END {print "\n\nrun"}' > gdb_device_c_ftrace.txt

Should generate a script for gdb, starting with "set pagination off" and whatever else is useful for logging, with "commands" block after every "break", running "bt 5" (displays backtrace) and echoing a nice-ish separator (bunch of hyphens), ending in "run" command to start the app.
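
If awk quoting in that one-liner looks too dense, here's the same generator as a Python sketch, which takes "ctags -x" output as a string and returns gdb script text. The gdb_ftrace_script name and bt_depth parameter are made up for this example, and it only assumes that the function name is the first column of each "ctags -x" line:

```python
def gdb_ftrace_script(ctags_x_output, bt_depth=5):
    """Build gdb script that logs a short backtrace for every listed function."""
    funcs = [
        line.split()[0]
        for line in ctags_x_output.splitlines() if line.strip()]
    out = ['set pagination off', 'set height 0', 'set logging on', '']
    for func in funcs:
        out += [
            'break {}'.format(func),
            'commands',                       # run these on every breakpoint hit
            'bt {}'.format(bt_depth),         # short backtrace
            r'echo ------------------\n\n',   # separator, \n is expanded by gdb
            'c',                              # continue without stopping
            'end', '']
    out.append('run')
    return '\n'.join(out) + '\n'
```

Output of gdb_ftrace_script(subprocess.check_output(...).decode()) can then be written to gdb_device_c_ftrace.txt, same as with the shell version.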

Resulting script can/should be fed into gdb with something like this:

% gdb -ex 'source gdb_device_c_ftrace.txt' -ex q --args\
  /usr/lib/bluetooth/bluetoothd --nodetach --debug

This will produce the needed list of all the calls to functions from that "device.c" file into "gdb.txt" and have output of the app interleaved with these in stdout/stderr (which can be redirected, or maybe closed with more gdb commands in txt file or before it with "-ex"), and is non-interactive.

From here, seeing where exactly the issue seems to occur, one'd probably want to look thru the code of the funcs in question, run gdb interactively and inspect what exactly is happening there.

Definitely nowhere near the magic some people script gdb with, but haven't found similar snippets neatly organized anywhere else, so here they go, in case someone might want to do the exact same thing.

Can also be used to log a bunch of calls from multiple files, of course, by giving "ctags" more files to parse.

Dec 09, 2015

Transparent buffer/file processing in emacs on load/save/whatever-io ops

Following-up on my gpg replacement endeavor, also needed to add transparent decryption for buffers loaded from *.ghg files, and encryption when writing stuff back to these.

git filters (defined via gitattributes file) do same thing when interacting with the repo.

Such thing is already done by a few existing elisp modules, such as jka-compr.el for auto-compression-mode (opening/saving .gz and similar files as if they were plaintext), and epa.el for transparent gpg encryption.

While these modules do this The Right Way by adding "file-name-handler-alist" entry, googling for a small ad-hoc boilerplate, found quite a few examples that do it via hooks, which seem rather unreliable and with esp. bad failure modes wrt transparent encryption.

So, in the interest of providing right-er boilerplate for the task (and because I tend to like elisp) - here's fg_sec.el example (from mk-fg/emacs-setup) of how it can be implemented cleaner, in similar fashion to epa and jka-compr.

Code calls ghg -do when visiting/reading files (with contents piped to stdin) and ghg -eo (with stdin/stdout buffers) when writing stuff back.

Entry-point/hook there is "file-name-handler-alist", where regexp to match *.ghg gets added to call "ghg-io-handler" for every i/o operation (including path ops like "expand-file-name" or "file-exists-p" btw), with only "insert-file-contents" (read) and "write-region" (write) being overridden.

Unlike jka-compr though, no temporary files are used in this implementation, only temp buffers, and "insert-file-contents" doesn't put unauthenticated data into target buffer as it arrives, patiently waiting for subprocess to exit with success code first.

Fairly sure that this bit of elisp can be used for any kind of processing, by replacing "ghg" binary with anything else that can work as a pipe (stdin -> processing -> stdout), which opens quite a lot of possibilities.

For example, all JSON files can be edited as a pretty YAML version, without strict syntax and all the brackets of JSON, or the need to process/convert them purely in elisp's json-mode or something - just plug python -m pyaml and python -m json commands for these two i/o ops and it should work.

Suspect there's gotta be something that'd make such filters easier in MELPA already, but haven't been able to spot anything right away, maybe should put up a package there myself.

[fg_sec.el code link]

Dec 08, 2015

GHG - simpler GnuPG (gpg) replacement for file encryption

Have been using gpg for many years now, many times a day, as I keep lot of stuff in .gpg files, but still can't seem to get used to its quirky interface and practices.

Most notably, its "trust" thing, keyrings and arcane key editing, expiration dates, gpg-agent interaction and encrypted keys are all sources of dread and stress for me.

Last drop, following the tradition of many disastrous interactions with the tool, was me losing my master signing key password, despite it being written down on paper and working before. #fail ;(

Certainly my fault, but as I'll be replacing the damn key anyway, why not throw out the rest of that incomprehensible tangle of pointless and counter-productive practices and features I never use?

Took ~6 hours to write a replacement ghg tool - same thing as gpg, except with simple and sane key management (which doesn't assume you'll be entering anything, ever!!!), none of that web-of-trust or signing crap, good (and non-swappable) djb crypto, and only for file encryption.

Does everything I've used gpg for from the command-line, and has one flat file for all the keys, so no more hassle with --edit-key nonsense.

Highly suggest to anyone who ever had trouble and frustration with gpg to check ghg project out or write their own (trivial!) tool, and ditch the old thing - life's too short to deal with that constant headache.

Dec 07, 2015

Resizing first FAT32 partition to microSD card size on boot from Raspberry Pi

One other thing I've needed to do recently is to have Raspberry Pi OS resize its /boot FAT32 partition to full card size (i.e. "make it as large as possible") from right underneath itself.

RPis usually have first FAT (fat16 / fat32 / vfat) partition needed by firmware to load config.txt and uboot stuff off, and that is the only partition one can see in Windows OS when plugging microSD card into card-reader (which is a kinda arbitrary OS limitation).

Map of the usual /dev/mmcblk0 on RPi (as seen in parted):

Number  Start   End     Size    Type     File system  Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  106MB   105MB   primary  fat16        lba
 2      106MB   1887MB  1782MB  primary  ext4

Resizing that first partition is naturally difficult, as it is followed by ext4 one with RPi's OS, but when you want to have small (e.g. <2G) and easy-to-write "rpi.img" file for any microSD card, there doesn't seem to be a way around that - img has to have as small initial partitions as possible to fit on any card.

Things get even more complicated by the fact that there don't seem to be any tools around for resizing FAT fs'es, so it has to be re-created from scratch.

There is quite an easy way around all these issues however, which can be summed-up as a sequence of the following steps:

  • Start while rootfs is mounted read-only or when it can be remounted as such, i.e. on early boot.

    Before=systemd-remount-fs.service local-fs-pre.target in systemd terms.

  • Grab sfdisk/parted map of the microSD and check if there's "Free Space" chunk left after last (ext4/os) partition.

    If there is, there's likely a lot of it, as SD cards increase in 2x size factors, so 4G image written on larger card will have 4+ gigs there, in fact a lot more for 16G or 32G cards.

    Or there can be only a few megs there, in case of matching card size, where it's usually a good idea to make slightly smaller images, as actual cards do vary in size a bit.

  • "dd" whole rootfs to the end of the microSD card.

    This is safe with read-only rootfs, and dumb "dd" approach to copying it (as opposed to dmsetup + mkfs + cp) seems to be the simplest and least error-prone.

  • Update partition table to have rootfs in the new location (at the very end of the card) and boot partition covering rest of the space.

  • Initiate reboot, so that OS will load from the new rootfs location.

  • Starting on early-boot again, remount rootfs rw if necessary, temporarily copy all contents of boot partition (which should still be small) to rootfs.

  • Run mkfs.vfat on the new large boot partition and copy stuff back to it from rootfs.

  • Reboot once again, in case whatever boot timeouts got triggered.

  • Avoid running same thing on all subsequent boots.

    E.g. touch /etc/boot-resize-done and have ConditionPathExists=!/etc/boot-resize-done in the systemd unit file.

That should do it \o/
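
Arithmetic for the "move rootfs to the end, give the rest to boot" steps can be sketched in Python like this. It's a simplified illustration and not the code from the script: plan_layout name, 1 MiB alignment and bailing out when source and destination of the "dd" would overlap are all assumptions made here, and all values are in bytes:

```python
MiB = 2**20

def plan_layout(dev_size, boot_start, root_start, root_size, align=MiB):
    """Return new (start, end) byte offsets for boot and rootfs partitions,
    or None if rootfs copy at the end of the card would overlap the original."""
    # Last aligned offset where whole rootfs still fits before end of the card
    new_root_start = (dev_size - root_size) // align * align
    if new_root_start < root_start + root_size:
        return None  # not enough free space, overlap is not handled in this sketch
    return dict(
        boot=(boot_start, new_root_start - 1),
        root=(new_root_start, new_root_start + root_size - 1))
```

E.g. for a hypothetical 4096 MiB card with boot at 1 MiB and 1700 MiB rootfs at 101 MiB, this puts rootfs at 2396 MiB and lets boot partition span everything before that.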

resize-rpi-fat32-for-card (in fgtk repo) is a script I wrote to do all of this stuff, exactly as described.

systemd unit file for the thing (can also be printed by running script with "--print-systemd-unit" option):

[Unit]
DefaultDependencies=no
After=systemd-fsck-root.service
Before=systemd-remount-fs.service -.mount local-fs-pre.target local-fs.target
ConditionPathExists=!/etc/boot-resize-done

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/resize-rpi-fat32-for-card

[Install]
WantedBy=local-fs.target

It does use lsblk -Jnb JSON output to get rootfs device and partition, and get whether it's mounted read-only, then parted -ms /dev/... unit B print free to grab machine-readable map of the device.

sfdisk -J (also JSON output) could've been better option than parted (extra dep, which is only used to get that one map), except it doesn't conveniently list "free space" blocks and device size, pity.

If partition table doesn't have extra free space at the end, "fsstat" tool from sleuthkit is used to check whether FAT filesystem covers whole partition and needs to be resized.

After that, and only if needed, either "dd + sfdisk" or "cp + mkfs.vfat + cp back" sequence gets executed, followed by a reboot command.
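
Check for that "Free Space" chunk at the end can look roughly like the Python sketch below. Again, this is an illustration and not the script's code: the trailing_free_bytes name is made up, and it assumes colon-separated "number:start:end:size:fs..." lines terminated by ";" after two header lines, with "free" in the fs field for unallocated chunks, which is how I'd expect "parted -ms ... unit B print free" output to look:

```python
def trailing_free_bytes(parted_output):
    """Return size (bytes) of free space after the last partition, 0 if none."""
    lines = [l.strip().rstrip(';') for l in parted_output.strip().splitlines()]
    # Skip "BYT" and device-description header lines
    chunks = [l.split(':') for l in lines[2:] if l]
    if not chunks or len(chunks[-1]) < 5 or chunks[-1][4] != 'free':
        return 0
    return int(chunks[-1][3].rstrip('B'))

# Hypothetical sample, numbers are made up for the example
SAMPLE = '''BYT;
/dev/mmcblk0:4000000000B:sd/mmc:512:512:msdos:SD CARD:;
1:32256B:1048575B:1016320B:free;
1:1048576B:106954751B:105906176B:fat16::lba;
2:106954752B:1887436799B:1780482048B:ext4::;
1:1887436800B:3999999999B:2112563200B:free;
'''
```

With the sample above, trailing_free_bytes(SAMPLE) gives size of that last chunk, and dropping the last line makes it return 0.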

Extra options for the thing:

  • "--skip-ro-check" - don't bother checking/forcing read-only rootfs before "dd" step, which should be fine, if there's no activity there (e.g. early boot).

  • "--done-touch-file" - allows specifying location of file to create (if missing) when "no resize needed" state gets reached.

    Script doesn't check whether this file exists and always does proper checks of partition table and "fsstat" when deciding whether something has to be done, only creates the file at the end (if it doesn't exist already).

  • "--overlay-image" uses splash.go tool that I've mentioned earlier (be sure to compile it first, ofc) to set some "don't panic, fs resize in progress" image (resized/centered and/or with text and background) during the whole process, using RPi's OpenVG GPU API, covering whatever console output.

  • Misc other stuff for setup/debug - "--print-systemd-unit", "--debug", "--reboot-delay".

    Easy way to debug the thing with these might be to add StandardOutput=tty to systemd unit's Service section and ... --debug --reboot-delay 60 options there, or possibly adding extra ExecStart=/bin/sleep 60 after the script (and changing its ExecStart= to ExecStart=-, so delay will still happen on errors).

    This should provide all the info on what's happening in the script (has plenty of debug output) to the console (one on display or UART).

One more link to the script: resize-rpi-fat32-for-card

Nov 28, 2015

Raspberry Pi early boot splash / logo screen

Imagine you have RPi with some target app (e.g. kiosk-mode browser) starting in X session, and want to have a full-screen splash for the whole time that device will be booting, and no console output or getty's of any kind, and no other splash screens in there - only black screen to logo to target app.

In case of average Raspberry Pi boot, there is:

  • A firmware "color test" splash when device gets powered-on.

    Removed with disable_splash=1 in /boot/config.txt.

  • "Rainbow-colored square" over-current indicator popping up on the right, regardless of PSU or cables, it seems.

    avoid_warnings=1 in the same config.txt

  • As kernel boots - Raspberry Pi logo embedded in it.

    logo.nologo to /boot/cmdline.txt.

    Replacing that logo with proper splash screen is not really an option, as logos that work there have to be tiny - like 80x80 pixels tiny.

    Anything larger than that gives fbcon_init: disable boot-logo (boot-logo bigger than screen), so in-kernel logo isn't that useful, and it's a pain to embed it there anyway (kernel rebuild!).

  • Lots of console output - from kernel and init both.

    cmdline.txt: console=null quiet

  • Getty showing its login prompt.

    systemctl disable getty@tty1

  • More printk stuff, as various kernel modules get initialized and hardware detected.

    console=null in cmdline.txt should've removed all that.

    consoleblank=0 loglevel=1 rootfstype=ext4 helps if console=null is not an option, e.g. because "fbi" should set logo there (see below).

    Need for "rootfstype" is kinda funny, because messages from kernel trying to mount rootfs as ext2/ext3 seem to be emergency-level or something.

  • Removing all the stuff above should (finally!) get a peaceful black screen, but what about the actual splash image?

    fbi -d /dev/fb0 --once --noverbose\
      --autozoom /path/to/image.png </dev/tty1 >/dev/tty1
    

    Or, in early-boot-systemd terms:

    [Unit]
    DefaultDependencies=no
    After=local-fs.target
    
    [Service]
    StandardInput=tty
    StandardOutput=tty
    ExecStart=/usr/bin/fbi -d /dev/fb0\
      --once --noverbose --autozoom /path/to/image.png
    
    [Install]
    WantedBy=sysinit.target
    

    "fbi" is a tool from fbida project.

    console=null should NOT be in cmdline for this tool to work (see above).

    First time you run it, you'll probably get:

    ioctl VT_GETSTATE: Inappropriate ioctl for device (not a linux console?)
    

    A lot of people on the internets seem to suggest something like "just run it from Alt + F1 console", which definitely isn't an option for this case, but I/O redirection to /dev/tty1 (as shown above) seems to work.

  • Blank black screen and whatever flickering on X startup.

    Running X on a different VT from "fbi" seems to have nice effect that if X will have to be restarted for some reason (e.g. whole user session gets restarted due to target app's watchdog + StartLimitAction=), VT will switch back to a nice logo, not some text console.

    To fix blackness in X before-and-after WM, there're tools like feh:

    feh --bg-scale /path/to/image.png
    

    That's not instant though, as X usually takes its time starting up, so see more on it below.

  • Target app startup cruft - e.g. browser window without anything loaded yet, or worse - something like window elements being drawn.

    • There can be some WM tricks to avoid showing unprepared window, including "start minimized, then maximize", switching "virtual desktops", overlay windows, transparency with compositors, etc.

      Depends heavily on WM, obviously, and needs one that can be controlled from the script (which is rather common among modern standalone WMs).

    • Another trick is to start whole X without switching VT - i.e. X -novtswitch vt2 - and switch to that VT later when both X and app signal that they're ready, or have just been given enough time.

      Until switch happens, splash logo is displayed, courtesy of "fbi" tool.

    • On Raspberry Pi in particular, there're some direct-to-display VideoCore APIs, which allow overlaying anything on top of whatever Linux or X draw in their VTs while starting-up.

      This is actually a cool thing - e.g. starting omxplayer --no-osd --no-keys /path/to/image.png.mp4 (mp4 produced from still image) early on boot (it doesn't need X or anything!) will remove the need for most previous steps, as it will eclipse all the other video output.

      "omxplayer" maybe isn't the best tool for the job, as it's not really meant to display still images, but it's fast and lightweight enough.

      Better alternative I've found is to use OpenVG API via openvg lib, which has nice Go (golang) version, and wrote an overlay-image.go tool to utilize it for this simple "display image and hang forever" (to be stopped when boot finishes) purpose.

      Aforementioned Go tool has "-resize" flag to scale the image to current display size with "convert" and cache it with ".cache-WxH" suffix, and "-bg-color" option to set margins' color otherwise (for e.g. logo centered with solid color around it). Can be built (be sure to set $GOPATH first) with: go get github.com/ajstarks/openvg && go build .

  • Finally some destination state with target app showing what it's supposed to.

    Yay, we got here!
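
As most of the tweaks above are about adding stuff to that single line in /boot/cmdline.txt, doing it from a provisioning script can be sketched in Python like this. The merge_cmdline helper is made up for this example, and note that it drops all earlier options with the same key, which is what's needed for things like console= here, but might not be for every option:

```python
def merge_cmdline(cmdline, *opts):
    """Return kernel cmdline with opts added, replacing same-key options."""
    def key(opt):
        return opt.split('=', 1)[0]
    new_keys = set(key(o) for o in opts)
    kept = [o for o in cmdline.split() if key(o) not in new_keys]
    return ' '.join(kept + list(opts))

# Usage (reading/writing the actual file is left out):
#   merge_cmdline(old_line, 'console=null', 'quiet', 'logo.nologo')
```

Running it twice with the same options gives the same line, so there's no harm in re-applying it on every deployment.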

Not a very comprehensive or coherent guide, but might be useful to sweep all the RPi nasties under an exquisite and colorful rug ;)

Update 2015-11-30: Added link to overlay-image.go tool.

Update 2015-11-30: A bit different version (cleaned-up, with build-dep on "github.com/disintegration/gift" instead of optional call to "convert") of this tool has been added to openvg lib repo under "go-client/splash".

Nov 25, 2015

Replacing built-in RTC with i2c battery-backed one on BeagleBone Black from boot

BeagleBone Black (BBB) boards have - and use - RTC (Real-Time Clock - device that tracks wall-clock time, including calendar date and time of day) in the SoC, which isn't battery-backed, so loses track of time each time device gets power-cycled.

This represents a problem if keeping track of time is necessary and there's no network access (or a patchy one) to sync this internal RTC when board boots up.

Easy solution to that, of course, is plugging external RTC device, with plenty of cheap chips with various precision available, most common being Dallas/Maxim ICs like DS1307 or DS3231 (a better one of the line) with I2C interface, which are all supported by Linux "ds1307" module.

Enabling connected chip at runtime can be easily done with a command like this:

echo ds1307 0x68 >/sys/bus/i2c/devices/i2c-2/new_device

(see this post on Fortune Datko blog and/or this one on minix-i2c blog for ways to tell reliably which i2c device in /dev corresponds to which bus and pin numbers on BBB headers, and how to check/detect/enumerate connected devices there)

This obviously doesn't enable device straight from the boot though, which is usually accomplished by adding the thing to Device Tree, and earlier with e.g. 3.18.x kernels it had to be done by patching and re-compiling platform dtb file used on boot.

But since 3.19.x kernels (and before 3.9.x), easier way seems to be to use Device Tree Overlays (usually "/lib/firmware/*.dtbo" files, compiled by "dtc" from dts files), which is kinda like patching Device Tree, only done at runtime.

Code for such patch in my case ("i2c2-rtc-ds3231.dts"), with 0x68 address on i2c2 bus and "ds3231" kernel module (alias for "ds1307", but more appropriate for my chip):

/dts-v1/;
/plugin/;

/* dtc -O dtb -o /lib/firmware/BB-RTC-02-00A0.dtbo -b0 i2c2-rtc-ds3231.dts */
/* bone_capemgr.enable_partno=BB-RTC-02 */
/* https://github.com/beagleboard/bb.org-overlays */

/ {
  compatible = "ti,beaglebone", "ti,beaglebone-black", "ti,beaglebone-green";
  part-number = "BB-RTC-02";
  version = "00A0";

  fragment@0 {
    target = <&i2c2>;

    __overlay__ {
      pinctrl-names = "default";
      pinctrl-0 = <&i2c2_pins>;
      status = "okay";
      clock-frequency = <100000>;
      #address-cells = <0x1>;
      #size-cells = <0x0>;

      rtc: rtc@68 {
        compatible = "dallas,ds3231";
        reg = <0x68>;
      };
    };
  };
};

As per comment in the overlay file, can be compiled ("dtc" comes from "dtc-overlay" package on ArchLinuxARM) to the destination with:

dtc -O dtb -o /lib/firmware/BB-RTC-02-00A0.dtbo -b0 i2c2-rtc-ds3231.dts

And then loaded on early boot (as soon as rootfs with "/lib/firmware" gets mounted) with "bone_capemgr.enable_partno=" cmdline addition, which should be put into something like "/boot/uEnv.txt", for example (with part number matching dtbo path from command above):

optargs=bone_capemgr.enable_partno=BB-RTC-02

Docs in bb.org-overlays repository have more details and examples on how to write and manage these.

That should ensure that this second RTC appears as "/dev/rtc1" (rtc0 is an internal one) on system startup, but unfortunately it still won't be the first one and kernel will already pick up time from internal rtc0 by the time this one gets detected.

Furthermore, systemd-enabled userspace (as in e.g. ArchLinuxARM) interacts with RTC via systemd-timedated and systemd-timesyncd, which both use "/dev/rtc" symlink (and can't be configured to use other devs), which by default udev points to rtc0 as well, so rtc1 - no matter how early it appears - gets completely ignored there.

So the two issues are "system clock" that kernel keeps and userspace daemons both using the wrong RTC, which is the default in both cases.

"/dev/rtc" symlink for userspace gets created by udev, according to "/usr/lib/udev/rules.d/50-udev-default.rules", and can be overridden by e.g. "/etc/udev/rules.d/55-i2c-rtc.rules":

SUBSYSTEM=="rtc", KERNEL=="rtc1", SYMLINK+="rtc", OPTIONS+="link_priority=10", TAG+="systemd"

This sets "link_priority" to 10 to override SYMLINK directive for same "rtc" dev node name from "50-udev-default.rules", which has link_priority=-100.

Also, TAG+="systemd" makes systemd track device with its "dev-rtc.device" unit (auto-generated, see systemd.device(5) for more info), which is useful to order userspace daemons depending on that symlink to start strictly after it's there.

"userspace daemons" in question on a basic Arch are systemd-timesyncd and systemd-timedated, of which only systemd-timesyncd starts early on boot, before all other services, including systemd-timedated, sysinit.target and time-sync.target (for early-boot clock-dependent services).

So basically if proper "/dev/rtc" and system clock get initialized before systemd-timesyncd (or whatever replacement, like ntpd or chrony), correct time and rtc device will be used for all system daemons (which start later) from here on.

Adding that extra step can be done as a separate systemd unit (to avoid messing with shipped systemd-timesyncd.service), e.g. "i2c-rtc.service":

[Unit]
ConditionCapability=CAP_SYS_TIME
ConditionVirtualization=!container
DefaultDependencies=no
Wants=dev-rtc.device
After=dev-rtc.device
Before=systemd-timesyncd.service ntpd.service chrony.service

[Service]
Type=oneshot
CapabilityBoundingSet=CAP_SYS_TIME
PrivateTmp=yes
ProtectSystem=full
ProtectHome=yes
DeviceAllow=/dev/rtc rw
DevicePolicy=closed
ExecStart=/usr/bin/hwclock --hctosys

[Install]
WantedBy=time-sync.target

Note that Before= above should include whatever time-sync daemon is used on the machine, and there's no harm in listing nonexistent or unused units there jic.

Most security-related stuff and conditions are picked from systemd-timesyncd unit file, which needs roughly same access permissions as "hwclock" here.

With udev rule and that systemd service (don't forget to "systemctl enable" it), boot sequence goes like this:

  • Kernel inits internal rtc0 and sets system clock to 1970-01-01.
  • Kernel starts systemd.
  • systemd mounts local filesystems and starts i2c-rtc asap.
  • i2c-rtc, due to Wants/After=dev-rtc.device, starts waiting for /dev/rtc to appear.
  • Kernel detects/initializes ds1307 i2c device.
  • udev creates /dev/rtc symlink and tags it for systemd.
  • systemd detects tagging event and activates dev-rtc.device.
  • i2c-rtc starts, adjusting system clock to realistic value from battery-backed rtc.
  • systemd-timesyncd starts, using proper /dev/rtc and correct system clock value.
  • time-sync.target activates, as it is scheduled to, after systemd-timesyncd and i2c-rtc.
  • From there, boot goes on to sysinit.target, basic.target and starts all the daemons.

udev rule is what facilitates symlink and tagging, i2c-rtc.service unit is what makes boot sequence wait for that /dev/rtc to appear and adjusts system clock right after that.
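
Quick way to check that udev rule did its thing after boot is to see where the symlink points, e.g. with this small Python sketch (rtc_link_target is a name made up here, and dev_dir parameter is there only to allow trying it on some test directory instead of real /dev):

```python
import os

def rtc_link_target(dev_dir='/dev'):
    """Return name of the device that "rtc" symlink points to, None if no symlink."""
    link = os.path.join(dev_dir, 'rtc')
    if not os.path.islink(link):
        return None
    return os.path.basename(os.path.realpath(link))

# Usage: rtc_link_target() should return "rtc1" with the rule above applied
```

If it still returns "rtc0", the rule likely wasn't picked up or the i2c device didn't get detected at all, which is easy to tell apart by checking whether /dev/rtc1 exists.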

Haven't found an up-to-date and end-to-end description with examples anywhere, so here it is. Cheers!
