Nov 27, 2017

Checking/waiting-on linux network parameters from the scripts

It's one of those things that are non-trivial to get right with simple scripts.

I.e. how to check address on an interface? How to wait for it to be assigned? How to check gateway address? Etc.

A few common places to look for these things:


sysfs

Has an easily-accessible list of interfaces and a MAC for each in e.g. /sys/class/net/enp1s0/address (useful as a machine id sometimes).
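
E.g. a minimal py3 loop over those files (nothing here but the standard /sys/class/net layout):

import os, glob

# Interface-name to MAC mapping - about all the useful info available in there
for p in sorted(glob.glob('/sys/class/net/*/address')):
  iface = os.path.basename(os.path.dirname(p))
  with open(p) as src: mac =
  print(iface, mac or '-')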

Pros: easy/reliable to access from any scripts.
Cons: very little useful info there.


systemd-networkd-wait-online

As systemd-networkd is a common go-to network management tool these days, this one complements it very nicely.

Allows waiting until specific interface(s) (or all of them) get fully set up, and has a built-in timeout option too.

E.g. just run systemd-networkd-wait-online -i enp1s0 from any script or even ExecStartPre= of a unit file and you have it waiting for net to be available reliably, no need to check for specific IP or other details.

Pros: super-easy to use from anywhere, even ExecStartPre= of unit files.
Cons: for one very specific (but very common) all-or-nothing use-case.

Doesn't always work for interfaces that need complicated setup by an extra daemon, e.g. wifi, batman-adv or some tunnels.

Also, undocumented caveat: use ExecStartPre=-/.../systemd-networkd-wait-online ... ("=-" is the important bit) for anything that should start regardless of network issues, as the thing can sometimes exit with non-zero status when there's no network for a while (which does look like a bug, and might be fixed in future versions).

iproute2 json output

If iproute2 is recent enough (4.13.0-4.14.0 and above), then GOOD NEWS!
ip-address and ip-link there have -json output.
(as well as "tc", stat-commands, and probably many other things in later releases)

Parsing ip -json addr or ip -json link output is trivial anywhere except for sh scripts, so it's a very good option if required parts of "ip" are jsonified already.
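
For example, a few lines of py3 to grab all addresses on one interface from that json (iface name here is just an example):

import subprocess, json

def iface_addrs(iface):
  # Each link object in ip-json output has an "addr_info" list, with "local" being the address itself
  out = subprocess.run( ['ip', '-json', 'addr', 'show', 'dev', iface],
    stdout=subprocess.PIPE, check=True ).stdout
  return list( a['local'] for link in json.loads(out.decode())
    for a in link.get('addr_info', []) )

print(iface_addrs('enp1s0')) # prints list of IPv4/IPv6 addrs on that iface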

Pros: easy to parse, will be available everywhere in the near future.
Cons: still very recent, so quite patchy and not ubiquitous yet, not for sh scripts.

Scraping iproute2 (e.g. "ip" and "ss" tools) non-json outputs

Explicitly discouraged by iproute2 docs and a really hacky solution.

Such outputs there are quirky, don't line-up nicely, and clearly not made for this purpose, plus can break at any point, as suggested by the docs repeatedly.

But for terrible sh hacks, sometimes it works, and "ip monitor" even allows to react to netlink events as soon as they appear, e.g.:

[Unit]
After=systemd-networkd.service

[Service]
ExecStart=/usr/bin/bash -c "\
  exec 2>/dev/null; trap 'pkill -g 0' EXIT; trap exit TERM;\
  awk '/^([0-9]+:\s+'$$dev')?\s+inet\s/ {chk=1; exit} END {exit !chk}'\
  < <( ip addr show dev $$dev;\
    stdbuf -oL ip monitor address dev $$dev &\
    sleep 1 ; ip addr show dev $$dev ; wait )\
  && ip link set $$dev mtu 1280"


That's an example of how ugly it can get with events though - the two extra checks around ip-monitor are there for a reason (easy to miss an event otherwise).

(this specific hack is a workaround for systemd-networkd failing to set mtu in some cases where it has to be done only after other things)

"ip -brief" output is somewhat of an exception, more suitable for sh scripts, but only works for ip-address and ip-link parts and still mixes-up columns occasionally (e.g. ip -br link for tun interfaces).

Pros: allows to access all net parameters and monitor events, easier for sh than json.
Cons: gets ugly fast, hard to get right, brittle and explicitly discouraged.

APIs of running network manager daemons

E.g. NetworkManager has nice-ish DBus API (see two polkit rules/pkla snippets here for how to enable it for regular users), same for wpa_supplicant/hostapd (see wifi-client-match or wpa-systemd-wrapper scripts), dhcpcd has hooks.
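
E.g. a quick py3 check for NetworkManager's "connected to internet" state, going through busctl instead of pulling in any dbus modules (assumes NM on the system bus, with 70 being its NM_STATE_CONNECTED_GLOBAL value):

import subprocess

def nm_connected():
  # Reads "State" property of the org.freedesktop.NetworkManager object
  out = subprocess.run(
    [ 'busctl', 'get-property', 'org.freedesktop.NetworkManager',
      '/org/freedesktop/NetworkManager', 'org.freedesktop.NetworkManager', 'State' ],
    stdout=subprocess.PIPE, check=True ).stdout.split()
  return int(out[1]) == 70 # 70 = NM_STATE_CONNECTED_GLOBAL

print(nm_connected())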

systemd-networkd will probably get DBus API too at some point in the near future, beyond simple up/down one that systemd-networkd-wait-online already uses.

Pros: best place to get such info from, can allow some configuration.
Cons: not always there, can be somewhat limited or hard to access.

Bunch of extra modules/tools

Especially for python and such, there's plenty of tools like pyroute2 and netifaces, with occasional things making it into stdlib - e.g. socket.if_* calls (py 3.3+) or ipaddress module (py 3.3+).

Can make things easier in larger projects, where dragging along a few extra third-party modules isn't too much of a hassle.

Not a great option for drop-in self-contained scripts though, regardless of how good python packaging gets.

Pros: there's a module/lib for everything.
Cons: extra dependencies, with all the api/packaging/breakage/security hassle.

libc and getifaddrs() - low-level way

That same python has ctypes though, so why bother with all the heavy/fragile deps and crap, when it can use the libc API directly?

Drop-in snippet to grab all the IPv4/IPv6/MAC addresses (py2/py3):

import os, socket, ctypes as ct

class sockaddr_in(ct.Structure):
  _fields_ = [('sin_family', ct.c_short), ('sin_port', ct.c_ushort), ('sin_addr', ct.c_byte*4)]

class sockaddr_in6(ct.Structure):
  _fields_ = [ ('sin6_family', ct.c_short), ('sin6_port', ct.c_ushort),
    ('sin6_flowinfo', ct.c_uint32), ('sin6_addr', ct.c_byte * 16) ]

class sockaddr_ll(ct.Structure):
  _fields_ = [ ('sll_family', ct.c_ushort), ('sll_protocol', ct.c_ushort),
    ('sll_ifindex', ct.c_int), ('sll_hatype', ct.c_ushort), ('sll_pkttype', ct.c_uint8),
    ('sll_halen', ct.c_uint8), ('sll_addr', ct.c_uint8 * 8) ]

class sockaddr(ct.Structure):
  _fields_ = [('sa_family', ct.c_ushort)]

class ifaddrs(ct.Structure): pass
ifaddrs._fields_ = [ # recursive
  ('ifa_next', ct.POINTER(ifaddrs)), ('ifa_name', ct.c_char_p),
  ('ifa_flags', ct.c_uint), ('ifa_addr', ct.POINTER(sockaddr)) ]

def get_iface_addrs(ipv4=False, ipv6=False, mac=False, ifindex=False):
  if not (ipv4 or ipv6 or mac or ifindex): ipv4 = ipv6 = True
  libc = ct.CDLL('', use_errno=True)
  libc.getifaddrs.restype = ct.c_int
  ifaddr_p = head = ct.pointer(ifaddrs())
  ifaces, err = dict(), libc.getifaddrs(ct.pointer(ifaddr_p))
  if err != 0:
    err = ct.get_errno()
    raise OSError(err, os.strerror(err), 'getifaddrs()')
  while ifaddr_p:
    addrs = ifaces.setdefault(ifaddr_p.contents.ifa_name.decode(), list())
    addr = ifaddr_p.contents.ifa_addr
    if addr:
      af = addr.contents.sa_family
      if ipv4 and af == socket.AF_INET:
        ac = ct.cast(addr, ct.POINTER(sockaddr_in)).contents
        addrs.append(socket.inet_ntop(af, ac.sin_addr))
      elif ipv6 and af == socket.AF_INET6:
        ac = ct.cast(addr, ct.POINTER(sockaddr_in6)).contents
        addrs.append(socket.inet_ntop(af, ac.sin6_addr))
      elif (mac or ifindex) and af == socket.AF_PACKET:
        ac = ct.cast(addr, ct.POINTER(sockaddr_ll)).contents
        if mac:
          addrs.append('mac-' + ':'.join(
            map('{:02x}'.format, ac.sll_addr[:ac.sll_halen]) ))
        if ifindex: addrs.append(ac.sll_ifindex)
    ifaddr_p = ifaddr_p.contents.ifa_next
  return ifaces


Result is a dict of iface-addrs (presented as yaml here):

enp1s0:
  - fc65::19
  - fe80::c646:19ff:fe64:632f
enp2s0:
  - fe80::1ebd:b9ff:fe86:f439
lo:
  - ::1
ve: []

Or to get IPv6+MAC+ifindex only - get_iface_addrs(ipv6=True, mac=True, ifindex=True):

enp1s0:
  - mac-c4:46:19:64:63:2f
  - 2
  - fc65::19
  - fe80::c646:19ff:fe64:632f
enp2s0:
  - mac-1c:bd:b9:86:f4:39
  - 3
  - fe80::1ebd:b9ff:fe86:f439
lo:
  - mac-00:00:00:00:00:00
  - 1
  - ::1
ve:
  - mac-36:65:67:f7:99:dc
  - 5
wg: []

Tend to use this as a drop-in boilerplate/snippet in python scripts that need IP address info, instead of adding extra deps - libc API should be way more stable/reliable than these anyway.
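
E.g. for the "wait until address gets assigned" case from the start of this post, a dumb polling loop on top of that snippet (timeout values here are arbitrary):

import time

def wait_for_addr(iface, timeout=60.0, poll_interval=1.0):
  # Poll get_iface_addrs() (defined above) until iface gets at least one address
  deadline = time.monotonic() + timeout
  while True:
    addrs = get_iface_addrs().get(iface)
    if addrs: return addrs
    if time.monotonic() > deadline:
      raise RuntimeError('No address on {} after {}s'.format(iface, timeout))
    time.sleep(poll_interval)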

Same can be done in any other full-featured scripts, of course, not just python, but bash scripts are sorely out of luck.

Pros: first-hand address info, stable/reliable/efficient, no extra deps.
Cons: not for 10-liner sh scripts, not much info, bunch of boilerplate code.

libmnl - same way as iproute2 does it

libc.getifaddrs() doesn't provide much info beyond very basic ip/mac addrs and iface indexes, and the rest should be fetched from kernel via netlink sockets.

libmnl wraps those, and is used by iproute2, so it comes out of the box on any modern linux, and its API can be used in the same way as libc above from full-featured scripts like python:

import os, socket, resource, struct, time, ctypes as ct

class nlmsghdr(ct.Structure):
  _fields_ = [
    ('len', ct.c_uint32),
    ('type', ct.c_uint16), ('flags', ct.c_uint16),
    ('seq', ct.c_uint32), ('pid', ct.c_uint32) ]

class nlattr(ct.Structure):
  _fields_ = [('len', ct.c_uint16), ('type', ct.c_uint16)]

class rtmsg(ct.Structure):
  _fields_ = ( list( (k, ct.c_uint8) for k in
      'family dst_len src_len tos table protocol scope type'.split() )
    + [('flags', ct.c_int)] )

class mnl_socket(ct.Structure):
  _fields_ = [('fd', ct.c_int), ('sockaddr_nl', ct.c_int)]

def get_route_gw(addr=''):
  libmnl = ct.CDLL('', use_errno=True)
  def _check(chk=lambda v: bool(v)):
    def _check(res, func=None, args=None):
      if not chk(res):
        errno_ = ct.get_errno()
        raise OSError(errno_, os.strerror(errno_))
      return res
    return _check
  libmnl.mnl_nlmsg_put_header.restype = ct.POINTER(nlmsghdr)
  libmnl.mnl_nlmsg_put_extra_header.restype = ct.POINTER(rtmsg)
  libmnl.mnl_attr_put_u32.argtypes = [ct.POINTER(nlmsghdr), ct.c_uint16, ct.c_uint32]
  libmnl.mnl_socket_open.restype = mnl_socket
  libmnl.mnl_socket_open.errcheck = _check()
  libmnl.mnl_socket_bind.argtypes = [mnl_socket, ct.c_uint, ct.c_int32]
  libmnl.mnl_socket_bind.errcheck = _check(lambda v: v >= 0)
  libmnl.mnl_socket_get_portid.restype = ct.c_uint
  libmnl.mnl_socket_get_portid.argtypes = [mnl_socket]
  libmnl.mnl_socket_sendto.restype = ct.c_ssize_t
  libmnl.mnl_socket_sendto.argtypes = [mnl_socket, ct.POINTER(nlmsghdr), ct.c_size_t]
  libmnl.mnl_socket_sendto.errcheck = _check(lambda v: v >= 0)
  libmnl.mnl_socket_recvfrom.restype = ct.c_ssize_t
  libmnl.mnl_nlmsg_get_payload.restype = ct.POINTER(rtmsg)
  libmnl.mnl_attr_validate.errcheck = _check(lambda v: v >= 0)
  libmnl.mnl_attr_get_payload.restype = ct.POINTER(ct.c_uint32)

  if '/' in addr: addr, cidr = addr.rsplit('/', 1)
  else: cidr = 32

  buf = ct.create_string_buffer(min(resource.getpagesize(), 8192))
  nlh = libmnl.mnl_nlmsg_put_header(buf)
  nlh.contents.type = 26 # RTM_GETROUTE
  nlh.contents.flags = 1 # NLM_F_REQUEST
  # nlh.contents.flags = 1 | (0x100|0x200) # NLM_F_REQUEST | NLM_F_DUMP
  nlh.contents.seq = seq = int(time.time())
  rtm = libmnl.mnl_nlmsg_put_extra_header(nlh, ct.sizeof(rtmsg))
   = socket.AF_INET

  addr, = struct.unpack('=I', socket.inet_aton(addr))
  libmnl.mnl_attr_put_u32(nlh, 1, addr) # 1=RTA_DST
  rtm.contents.dst_len = int(cidr)

  nl = libmnl.mnl_socket_open(0) # NETLINK_ROUTE
  libmnl.mnl_socket_bind(nl, 0, 0) # nl, 0, 0=MNL_SOCKET_AUTOPID
  port_id = libmnl.mnl_socket_get_portid(nl)
  libmnl.mnl_socket_sendto(nl, nlh, nlh.contents.len)

  addr_gw = None

  @ct.CFUNCTYPE(ct.c_int, ct.POINTER(nlattr), ct.c_void_p)
  def data_ipv4_attr_cb(attr, data):
    nonlocal addr_gw
    if attr.contents.type == 5: # RTA_GATEWAY
      libmnl.mnl_attr_validate(attr, 3) # MNL_TYPE_U32
      addr = libmnl.mnl_attr_get_payload(attr)
      addr_gw = socket.inet_ntoa(struct.pack('=I', addr[0]))
    return 1 # MNL_CB_OK

  @ct.CFUNCTYPE(ct.c_int, ct.POINTER(nlmsghdr), ct.c_void_p)
  def data_cb(nlh, data):
    rtm = libmnl.mnl_nlmsg_get_payload(nlh).contents
    if == socket.AF_INET and rtm.type == 1: # RTN_UNICAST
      libmnl.mnl_attr_parse(nlh, ct.sizeof(rtm), data_ipv4_attr_cb, None)
    return 1 # MNL_CB_OK

  while True:
    ret = libmnl.mnl_socket_recvfrom(nl, buf, ct.sizeof(buf))
    if ret <= 0: break
    ret = libmnl.mnl_cb_run(buf, ret, seq, port_id, data_cb, None)
    if ret <= 0: break # 0=MNL_CB_STOP
    break # MNL_CB_OK for NLM_F_REQUEST, don't use with NLM_F_DUMP!!!
  if ret == -1: raise OSError(ct.get_errno(), os.strerror(ct.get_errno()))

  return addr_gw


This specific boilerplate will fetch the gateway IP address towards some destination (i.e. towards the internet) - used it in a systemd-watchdog script recently.
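
Usage is then just something like this (address below is an arbitrary public IP, nothing special):

gw = get_route_gw('') # any routable destination works here
print(gw or 'no route')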

It might look a bit too complex for such an apparently simple task at this point, but it allows doing absolutely anything network-related - everything "ip" (iproute2) does, including configuration (addresses, routes), creating/setting-up new interfaces ("ip link add ..."), all the querying (ip-route, ip-neighbor, ss/netstat, etc), waiting and async monitoring (ip-monitor, conntrack), etc etc...

Pros: can do absolutely anything, directly, stable/reliable/efficient, no extra deps.
Cons: definitely not for 10-liner sh scripts, boilerplate code.


iproute2 with -json output flag should be good enough for most cases where systemd-networkd-wait-online is not sufficient, esp. once more commands there (like ip-route and ip-monitor) support it in the future (thanks to Julien Fortin and all other people working on this!).

For more advanced needs, it's usually best to query/control whatever network management daemon or go to libc/libmnl directly.

Oct 11, 2017

Force-enable HDMI to specific mode in linux framebuffer console

Bumped into this issue when running latest mainline kernel (4.13) on ODROID-C2 - default fb console for HDMI output has to be configured differently there (and also needs a dtb patch to hook it up).

Old vendor kernels for that have/use a bunch of cmdline options for HDMI - hdmimode, hdmitx (cecconfig), vout (hdmi/dvi/auto), overscan_*, etc - all custom and non-mainline.

With mainline DRM (as in Direct Rendering Manager) and framebuffer modules, the video= option seems to be the way to set/force a specific output and resolution instead.

When display is connected on boot, it can work without that if stars align correctly, but that's not always the case as it turns out - only 1 out of 3 worked that way.

But even if display works on boot, plugging HDMI after boot never works anyway, and that's the most (only) useful thing for it (debug issues, see logs or kernel panic backtrace there, etc)!

DRM module (meson_dw_hdmi in case of C2) has its HDMI output info in /sys/class/drm/card0-HDMI-A-1/, which is where one can check on connected display, dump its EDID blob (info, supported modes), etc.

cmdline option to force this output to be used with specific (1080p60) mode:

video=HDMI-A-1:1920x1080@60e

More info on this spec is in Documentation/fb/modedb.txt, but the gist is "<output>:<WxH>@<rate><flags>", where the "e" flag at the end means "force the display to be enabled", to avoid all that hotplug jank.

That should set the mode for console (see e.g. fbset --info), but at least with meson_dw_hdmi this is insufficient, and it's happy to tell why when loaded with an extra drm.debug=0xf cmdline option - it doesn't have any supported modes, and returns MODE_BAD for all non-CEA-861 modes that are default in fb-modedb.

Such modelines are usually supplied from EDID blobs by the display, but if there isn't one connected, blob should be loaded from somewhere else (and iirc there are ways to define these via cmdline).

Luckily, kernel has built-in standard EDID blobs, so there's no need to put anything to /lib/firmware, initramfs or whatever:

drm_kms_helper.edid_firmware=edid/1920x1080.bin video=HDMI-A-1:1920x1080@60e

And that finally works.

Not very straightforward, and doesn't seem to be documented in one place anywhere with examples (ArchWiki page on KMS probably comes closest).

Apr 27, 2017

WiFi hostapd configuration for 802.11ac networks

Running Wireless AP on linux is pretty much always done through handy hostapd tool, which sets the necessary driver parameters and handles authentication and key management aspects of an infrastructure mode access point operation.

Its configuration file has plenty of options, which get initialized to rather conservative defaults, resulting in suboptimal bandwidth with anything from this decade, e.g. 802.11n or 802.11ac cards/dongles.

Furthermore, it seems to assume a decent amount of familiarity with IEEE standards on WiFi protocols, which are mostly paywalled (though can easily be pirated ofc, just use google).

Specifically, channel selection for VHT (802.11ac) there is a bit of a nightmare, as hostapd code not only has an (undocumented afaict) whitelist for these, but also needs more than one parameter to set them.

I'm not an expert on wireless links and wifi specifically, just had to setup one recently (and even then, without going into STBC, Beamforming and such), so don't take this info as some kind of authoritative "how it must be done" guide - just my 2c and nothing more.

Anyway, first of all, to get VHT ("Very High Throughput") aka 802.11ac mode at all, following hostapd config can be used as a baseline:



# (example values - use your own ssid/passphrase/country/interface)
ssid=my-test-ap
wpa_passphrase=set-some-passphrase
country_code=US
# ieee80211d=1
# ieee80211h=1

interface=wlan0
driver=nl80211
wpa=2
wpa_key_mgmt=WPA-PSK
rsn_pairwise=CCMP

hw_mode=a
channel=36
ieee80211n=1
ieee80211ac=1
vht_oper_chwidth=1
vht_oper_centr_freq_seg0_idx=42

There, important bits are obviously stuff at the top - ssid and wpa_passphrase.

But also country_code, as it will apply all kinds of restrictions on 5G channels that one can use.

ieee80211d/ieee80211h are related to these country_code restrictions, and are probably necessary for some places and when/if DFS (dynamic frequency selection) is used, but more on that later.

If that config doesn't work (started with e.g. hostapd myap.conf), and not just due to some channel conflict or regulatory domain (i.e. country_code) error, probably worth running hostapd command with -d option and seeing where it fails exactly, though most likely after nl80211: Set freq ... (ht_enabled=1, vht_enabled=1, bandwidth=..., cf1=..., cf2=...) log line (and list of options following it), with some "Failed to set X: Invalid argument" error from kernel driver.

When that's the case, if it's not just a bogus channel (see below), probably worth stopping right here and seeing why the driver rejects this basic stuff - could be that it doesn't actually support running AP and/or VHT mode (esp. for proprietary ones) or something, which should obviously be addressed first.

VHT (Very High Throughput mode, aka 802.11ac, page 214 in 802.11ac-2013.pdf) is extension of HT (High Throughput aka 802.11n) mode and can use 20 MHz, 40 MHz, 80 MHz, 160 MHz and 80+80 MHz channel widths, which basically set following caps on bandwidth:

  • 20 MHz - 54 Mbits/s
  • 40 MHz - 150-300 Mbits/s
  • 80 MHz - 300+ Mbits/s
  • 160 MHz or 80+80 MHz (two non-contiguous 80MHz chans) - moar!!!

Most notably, 802.11ac only requires support for up to 80 MHz-wide channels, with 160 and 80+80 being optional, so those are pretty much guaranteed to be unsupported by 95% of cheap-ish dongles, even if they advertise "full 802.11ac support!", "USB 3.0!!!" or whatever - forget it.

"vht_oper_chwidth" parameter sets channel width to use, so "vht_oper_chwidth=1" (80 MHz) is probably safe choice for ac here.

Unless ACS - Automatic Channel Selection - is being used (which is maybe a good idea, but not described here at all), both "channel" and "vht_oper_centr_freq_seg0_idx" parameters must be set (and also "vht_oper_centr_freq_seg1_idx" for 80+80 vht_oper_chwidth=3 mode).

"vht_oper_centr_freq_seg0_idx" is "dot11CurrentChannelCenterFrequencyIndex0" from 802.11ac-2013.pdf ( on page 248 and 22.3.14 on page 296), while "channel" option is "dot11CurrentPrimaryChannel".

Relation between these for 80MHz channels is the following one:

vht_oper_centr_freq_seg0_idx = channel + 6

Where "channel" can only be picked from the following list (see hw_features_common.c in hostapd sources):

36 44 52 60 100 108 116 124 132 140 149 157 184 192

And vht_oper_centr_freq_seg0_idx can only be one of:

42 58 106 122 138 155
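
Quick py3 helper to pick a matching seg0 index for a channel, using only the two lists above (raises for 184/192, which these seg0 values don't cover):

# Centers of the 80 MHz blocks, i.e. valid vht_oper_centr_freq_seg0_idx values
vht80_seg0_list = [42, 58, 106, 122, 138, 155]

def vht80_seg0(channel):
  # seg0 is the center of the 80 MHz block that the primary channel falls into,
  #  i.e. channel + 6 when channel is the first 20 MHz chunk of that block
  for seg0 in vht80_seg0_list:
    if abs(seg0 - channel) <= 6: return seg0
  raise ValueError('No 80 MHz block around channel {}'.format(channel))

print(vht80_seg0(36), vht80_seg0(149)) # 42 155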

Furthermore, picking anything but 36/42 and 149/155 is probably restricted by DFS and/or driver, and if you have any other 5G APs around, can also be restricted by conflicts with these, as detected/reported by hostapd on start.

Which is kinda crazy - you've got your fancy 802.11ac hardware and maybe can't even use it because hostapd refuses to use any channels if there's other 5G AP or two around.

BSS conflicts (with other APs) are detected on start only and are easy to patch-out with hostapd-2.6-no-bss-conflicts.patch - just 4 lines to hw_features.c and hw_features_common.c there, should be trivial to adapt for any newer hostapd version.

But that still leaves all the DFS/no-IR and whatever regdb-special channels locked, which is safe for legal reasons, but also easy to patch-out in crda (loader tool for regdb) and wireless-regdb (info on regulatory domains, e.g. US and such) packages.

The crda patch is needed to disable the signature check on the loaded db.txt file - alternatively a different public key can be used there, but it's less hassle this way.

Note that using DFS/no-IR-marked frequencies with these patches is probably breaking the law, though no idea if and where these are actually enforced.

Also, if crda/regdb is not installed or country_code not picked, "00" regulatory domain is used by the kernel, which is the most restrictive subset (to be ok to use anywhere), and is probably never a good idea.

All these tweaks combined should already provide ~300 Mbits/s (half-duplex) on a single 80 MHz channel (any from the lists above).

Beyond that, I think the "vht_capab" set should be tweaked to enable STBC (space-time block coding, i.e. using multiple RX/TX antennas) and LDPC capabilities - which are all disabled by default - plus the beamforming stuff.

These are all documented in hostapd.conf, but dongles and/or rtl8812au driver I've been using didn't have support for any of that, so didn't go there myself.

There's also a bunch of wmm_* and tx_queue_* parameters, which seem to be for QoS (prioritizing some packets over others when at 100% capacity). Tinkering with these doesn't visibly affect iperf3 results, and maybe should be done in the linux QoS subsystem ("tc" tool) instead anyway. Plenty of snippets for tweaking them are available on mailing lists and such, but these should probably be adjusted for specific traffic/setup.

One last important bandwidth optimization for both AP and any clients (stations) is disabling all the power saving stuff with iw dev wlan0 set power_save off.

Failing to do that can completely wreck performance, and can usually be done via kernel module parameter in /etc/modprobe.d/ instead of running "iw".

No patches or extra configuration for wpa_supplicant (tool for infra-mode "station" client) are necessary - it will connect just fine to anything and pick whatever is advertised, if hw supports all that stuff.

Feb 13, 2017

Xorg input driver - the easy way, via evdev and uinput

Got to reading short stories in Column Reader from laptop screen before sleep recently, and for an extra-lazy points, don't want to drag my hand to keyboard to flip pages (or columns, as the case might be).

Easy fix - get any input device and bind stuff there to keys you'd normally use.
As it happens, had Xbox 360 controller around for that.

Hard part is figuring out how to properly do it all in Xorg - need to build xf86-input-joystick first (somehow not in Arch core), then figure out how to make it act like a dumb event source, not some mouse emulator, and then stuff like xev and xbindkeys will probably help.

This is way more complicated than it needs to be, and gets even more so when you factor-in all the Xorg driver quirks, xev's somewhat cryptic nature (modifier maps, keysyms, etc), fact that xbindkeys can't actually do "press key" actions (have to use stuff like xdotool for that), etc.

All the while reading these events from linux itself is as trivial as evtest /dev/input/event11 (or for event in dev.read_loop(): ...) and sending them back is just ui.write(e.EV_KEY, e.BTN_RIGHT, 1) via uinput device.

Hence whole binding thing can be done by a tiny python loop that'd read events from whatever specified evdev and write corresponding (desired) keys to uinput.
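
Bare-bones py3 version of such loop using python-evdev, with one hardcoded mapping (device path and button/key picks are just examples):

from evdev import InputDevice, UInput, ecodes as e

dev = InputDevice('/dev/input/event11') # e.g. xpad evdev node
ui = UInput() # uinput device with default key/button capabilities

for ev in dev.read_loop():
  # Forward gamepad BTN_TR press/release as "right arrow" key events
  if ev.type == e.EV_KEY and ev.code == e.BTN_TR:
    ui.write(e.EV_KEY, e.KEY_RIGHT, ev.value)
    ui.syn()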

So instead of +1 pre-naptime story, hacked together a script to do just that - evdev-to-xev (python3/asyncio) - which reads mappings from simple YAML and runs the loop.

For example, to bind right joystick's (on the same XBox 360 controller) extreme positions to cursor keys, plus triggers, d-pad and bumper buttons there:


  ## Right stick
  # Extreme positions are ~32_768
  ABS_RX <-30_000: left
  ABS_RX >30_000: right
  ABS_RY <-30_000: up
  ABS_RY >30_000: down

  ## Triggers
  # 0 (idle) to 255 (fully pressed)
  ABS_Z >200: left
  ABS_RZ >200: right

  ## D-pad
  ABS_HAT0Y -1: leftctrl leftshift equal
  ABS_HAT0Y 1: leftctrl minus
  ABS_HAT0X -1: pageup
  ABS_HAT0X 1: pagedown

  ## Bumpers
  BTN_TL 1: [h,e,l,l,o,space,w,o,r,l,d,enter]
  BTN_TR 1: right

  hold: 0.02
  delay: 0.02
  repeat: 0.5

Run with e.g.: evdev-to-xev -c xbox-scroller.yaml /dev/input/event11
(see also less /proc/bus/input/devices and evtest /dev/input/event11).
Running the thing with no config will print an example one with comments/descriptions.

Given how all iterations of X had to work with whatever input they had at the time, plus not just on linux, even when evdev was around, it's hard to blame it for having a bit of complexity on top of the way simpler input layer underneath.

In linux, aforementioned Xbox 360 gamepad is supported by "xpad" module (so that you'd get evdev node for it), and /dev/uinput for simulating arbitrary evdev stuff is "uinput" module.

Script itself needs python3 and python-evdev, plus evtest can be useful.
No need for any extra Xorg drivers beyond standard evdev.

The most similar tool to such a script seems to be actkbd, though afaict one'd still need to run xdotool from it to simulate input.

Github link: evdev-to-xev script (in the usual mk-fg/fgtk scrap-heap)

Feb 06, 2017

nftables dnat from loopback to somewhere else

Honestly didn't think NAT'ing traffic from "lo" interface was even possible, because traffic to host's own IP doesn't go through *ROUTING chains with iptables, and never used "-j DNAT" with OUTPUT, which apparently works there as well.

And then also, according to e.g. Netfilter-packet-flow.svg, unlike with nat-prerouting, nat-output goes after routing decision was made, so no point mangling IPs there, right?

Wrong, totally possible to redirect "OUT=lo" stuff to go out of e.g. "eth0" with the usual dnat/snat, with something like this:

table ip nat {
  chain in { type nat hook input priority -160; }
  chain out { type nat hook output priority -160; }
  chain pre { type nat hook prerouting priority -90; }
  chain post { type nat hook postrouting priority 110; }
}

add rule ip nat out oifname lo \
  ip saddr $own-ip ip daddr $own-ip \
  tcp dport {80, 443} dnat $somehost
add rule ip nat post oifname eth0 \
  ip saddr $own-ip ip daddr $somehost \
  tcp dport {80, 443} masquerade

Note the bizarre oifname lo ip saddr $own-ip ip daddr $own-ip thing.

One weird quirk - if "in" (arbitrary name, nat+input hook is the important bit) chain isn't defined, dnat will only work one-way, not rewriting IPs in response packets.

One explanation wrt the routing decision here might be the arbitrary priorities that nftables allows setting for hooks (and -160 is before iptables mangle stuff).

So, from-loopback-and-back forwarding, huh.
To think of all the redundant socats and haproxies I've seen and used for this purpose earlier...

Oct 16, 2016

Redirecting hosts or replacing files just for one app with mount namespaces

My problem was this: how do you do essentially a split-horizon DNS for different apps in the same desktop session.

E.g. have claws-mail mail client go to localhost for (because it has port forwarded thru "ssh -L"), while the rest of them (e.g. browser and such) keep using normal public IP.

Usually one'd use /etc/hosts for something like that, but it applies to all apps on the machine, of course.

Next obvious option (mostly because it's been around forever) is to LD_PRELOAD something that'd either override getaddrinfo() or open() for /etc/hosts, but that sounds like work and not included in util-linux (yet?).

Easiest and newest (well, new-ish, CLONE_NEWNS has been around since linux-3.8 and 2013) way to do that is to run the thing in its own "mount namespace", which sounds weird until you combine that with the fact that you can bind-mount files (like that /etc/hosts one).

So, the magic line is:

# unshare -m sh -c\
  'mount -o bind /etc/hosts.forwarding /etc/hosts
    && exec sudo -EHin -u myuser -- exec claws-mail'

Needs /etc/hosts.forwarding replacement-file for this app, which it will see as a proper /etc/hosts, along with root privileges (or CAP_SYS_ADMIN) for CLONE_NEWNS.

Crazy "sudo -EHin" shebang is to tell sudo not to drop much env, but still behave kinda as if on login, run zshrc and all that. "su - myuser" or "machinectl shell myuser@ -- ..." can also be used there.

Replacing files like /etc/nsswitch.conf or /etc/{passwd,group} that way, one can also essentially do any kind of per-app id-mapping - cool stuff.

Of course, these days sufficiently paranoid or advanced people might as well run every app in its own set of namespaces anyway, and have pretty much everything per-app that way, why the hell not.

Sep 25, 2016

nftables re-injected IPSec matching without xt_policy

As of linux-4.8, something like xt_policy is still - unfortunately - on the nftables TODO list, so to match traffic pre-authenticated via IPSec, some workaround is needed.

Obvious one is to keep using iptables/ip6tables to mark IPSec packets with old xt_policy module, as these rules interoperate with nftables just fine, with only important bit being ordering of iptables hooks vs nft chain priorities, which are rather easy to find in "netfilter_ipv{4,6}.h" files, e.g.:

enum nf_ip_hook_priorities {
  NF_IP_PRI_RAW = -300,
  NF_IP_PRI_MANGLE = -150,
  NF_IP_PRI_NAT_DST = -100,
  NF_IP_PRI_NAT_SRC = 100,

(see also Netfilter-packet-flow.svg by Jan Engelhardt for general overview of the iptables hook positions, nftables allows to define any number of chains before/after these)

So marks from iptables/ip6tables rules like these:

-A PREROUTING -m policy --dir in --pol ipsec --mode transport -j MARK --or-mark 0x101
-A OUTPUT -m policy --dir out --pol ipsec --mode transport -j MARK --or-mark 0x101

Will be easy to match in priority=0 input/output hooks (as NF_IP_PRI_RAW=-300) of nft ip/ip6/inet tables (e.g. mark and 0x101 == 0x101 accept)

But that'd split firewall configuration between iptables/nftables, adding more hassle to keep whole "iptables" thing initialized just for one or two rules.

xfrm transformation (like ipsec esp decryption in this case) seems to preserve all information about the packet intact, including packet marks (but not conntrack states, which track the esp connection), which - as suggested by Florian Westphal in #netfilter - can be utilized to match post-xfrm packets in nftables by this preserved mark field.

E.g. having this (strictly before ct state {established, related} accept for stateful firewalls, as each packet has to be marked):

define cm.ipsec = 0x101
add rule inet filter input ip protocol esp mark set mark or $cm.ipsec
add rule inet filter input ip6 nexthdr esp mark set mark or $cm.ipsec
add rule inet filter input mark and $cm.ipsec == $cm.ipsec accept

Will mark and accept both still-encrypted esp packets (IPv4/IPv6) and their decrypted payload.

Note that this assumes that all IPSec connections are properly authenticated and trusted, so be sure not to use anything like that if e.g. opportunistic encryption is enabled.

Much simpler nft-only solution, though still not a full substitute for what xt_policy does, of course.

Nov 28, 2015

Raspberry Pi early boot splash / logo screen

Imagine you have RPi with some target app (e.g. kiosk-mode browser) starting in X session, and want to have a full-screen splash for the whole time that device will be booting, and no console output or getty's of any kind, and no other splash screens in there - only black screen to logo to target app.

In case of average Raspberry Pi boot, there is:

  • A firmware "color test" splash when device gets powered-on.

    Removed with disable_splash=1 in /boot/config.txt.

  • "Rainbow-colored square" over-current indicator popping up on the right, regardless of PSU or cables, it seems.

    avoid_warnings=1 in the same config.txt

  • As kernel boots - Raspberry Pi logo embedded in it.

    logo.nologo to /boot/cmdline.txt.

    Replacing that logo with proper splash screen is not really an option, as logos that work there have to be tiny - like 80x80 pixels tiny.

    Anything larger than that gives fbcon_init: disable boot-logo (boot-logo bigger than screen), so in-kernel logo isn't that useful, and it's a pain to embed it there anyway (kernel rebuild!).

  • Lots of console output - from kernel and init both.

    cmdline.txt: console=null quiet

  • Getty showing its login prompt.

    systemctl disable getty@tty1

  • More printk stuff, as various kernel modules get initialized and hardware detected.

    console=null in cmdline.txt should've removed all that.

    consoleblank=0 loglevel=1 rootfstype=ext4 helps if console=null is not an option, e.g. because "fbi" should set logo there (see below).

    Need for "rootfstype" is kinda funny, because messages from kernel trying to mount rootfs as ext2/ext3 seem to be emergency-level or something.

  • Removing all the stuff above should (finally!) get a peaceful black screen, but what about the actual splash image?

    fbi -d /dev/fb0 --once --noverbose\
      --autozoom /path/to/image.png </dev/tty1 >/dev/tty1

    Or, in early-boot-systemd terms:

    ExecStart=/usr/bin/fbi -d /dev/fb0\
      --once --noverbose --autozoom /path/to/image.png

    "fbi" is a tool from fbida project.

    console=null should NOT be in cmdline for this tool to work (see above).

    First time you run it, you'll probably get:

    ioctl VT_GETSTATE: Inappropriate ioctl for device (not a linux console?)

    A lot of people on the internets seem to suggest something like "just run it from Alt + F1 console", which definitely isn't an option for this case, but I/O redirection to /dev/tty (as shown above) seem to work.

  • Blank black screen and whatever flickering on X startup.

    Running X on a different VT from "fbi" seems to have a nice effect: if X has to be restarted for some reason (e.g. whole user session gets restarted due to target app's watchdog + StartLimitAction=), the VT will switch back to a nice logo, not some text console.

    To fix blackness in X before-and-after WM, there're tools like feh:

    feh --bg-scale /path/to/image.png

    That's not instant though, as X usually takes its time starting up, so see more on it below.

  • Target app startup cruft - e.g. browser window without anything loaded yet, or worse - something like window elements being drawn.

    • There can be some WM tricks to avoid showing unprepared window, including "start minimized, then maximize", switching "virtual desktops", overlay windows, transparency with compositors, etc.

      Depends heavily on WM, obviously, and needs one that can be controlled from the script (which is rather common among modern standalone WMs).

    • Another trick is to start whole X without switching VT - i.e. X -novtswitch vt2 - and switch to that VT later when both X and app signal that they're ready, or just been given enough time.

      Until switch happens, splash logo is displayed, courtesy of "fbi" tool.

    • On Raspberry Pi in particular, there're some direct-to-display VideoCore APIs, which allow to overlay anything on top of whatever Linux or X draw in their VTs while starting-up.

      This is actually a cool thing - e.g. starting omxplayer --no-osd --no-keys /path/to/image.png.mp4 (mp4 produced from still image) early on boot (it doesn't need X or anything!) will remove the need for most previous steps, as it will eclipse all the other video output.

      "omxplayer" maybe isn't the best tool for the job, as it's not really meant to display still images, but it's fast and liteweight enough.

      Better alternative I've found is to use OpenVG API via openvg lib, which has nice Go (golang) version, and wrote an overlay-image.go tool to utilize it for this simple "display image and hang forever" (to be stopped when boot finishes) purpose.

      Aforementioned Go tool has "-resize" flag to scale the image to current display size with "convert" and cache it with ".cache-WxH" suffix, and "-bg-color" option to set margins' color otherwise (for e.g. logo centered with solid color around it). Can be built (be sure to set $GOPATH first) with: go get && go build .

  • Finally some destination state with target app showing what it's supposed to.

    Yay, we got here!

Not a very comprehensive or coherent guide, but might be useful to sweep all the RPi nasties under an exquisite and colorful rug ;)

Update 2015-11-30: Added link to overlay-image.go tool.

Update 2015-11-30: A bit different version (cleaned-up, with build-dep on "" instead of optional call to "convert") of this tool has been added to openvg lib repo under "go-client/splash".

Nov 25, 2015

Replacing built-in RTC with i2c battery-backed one on BeagleBone Black from boot

Update 2018-02-16:

There is a better and simpler solution for switching the default rtc than the one described here - to swap rtc0/rtc1 via aliases in a dt overlay.

I.e. using something like this in the dts:

fragment@0 {
  __overlay__ {

    aliases {
      rtc0 = "/ocp/i2c@4802a000/ds3231@68";
      rtc1 = "/ocp/rtc@44e3e000";
    };
  };
};

Which is also how it's done in the overlays repo these days.

BeagleBone Black (BBB) boards have - and use - RTC (Real-Time Clock - device that tracks wall-clock time, including calendar date and time of day) in the SoC, which isn't battery-backed, so it loses track of time each time the device gets power-cycled.

This represents a problem if keeping track of time is necessary and there's no network access (or a patchy one) to sync this internal RTC when board boots up.

Easy solution to that, of course, is plugging external RTC device, with plenty of cheap chips with various precision available, most common being Dallas/Maxim ICs like DS1307 or DS3231 (a better one of the line) with I2C interface, which are all supported by Linux "ds1307" module.

Enabling connected chip at runtime can be easily done with a command like this:

echo ds1307 0x68 >/sys/bus/i2c/devices/i2c-2/new_device

(see this post on Fortune Datko blog and/or this one on minix-i2c blog for ways to tell reliably which i2c device in /dev corresponds to which bus and pin numbers on BBB headers, and how to check/detect/enumerate connected devices there)

This obviously doesn't enable device straight from the boot though, which is usually accomplished by adding the thing to Device Tree, and earlier with e.g. 3.18.x kernels it had to be done by patching and re-compiling platform dtb file used on boot.

But since 3.19.x kernels (and before 3.9.x), easier way seem to be to use Device Tree Overlays (usually "/lib/firmware/*.dtbo" files, compiled by "dtc" from dts files), which is kinda like patching Device Tree, only done at runtime.

Code for such patch in my case ("i2c2-rtc-ds3231.dts"), with 0x68 address on i2c2 bus and "ds3231" kernel module (alias for "ds1307", but more appropriate for my chip):


/* dtc -O dtb -o /lib/firmware/BB-RTC-02-00A0.dtbo -b0 i2c2-rtc-ds3231.dts */
/* bone_capemgr.enable_partno=BB-RTC-02 */
/dts-v1/;
/plugin/;

/ {
  compatible = "ti,beaglebone", "ti,beaglebone-black", "ti,beaglebone-green";
  part-number = "BB-RTC-02";
  version = "00A0";

  fragment@0 {
    target = <&i2c2>;

    __overlay__ {
      pinctrl-names = "default";
      pinctrl-0 = <&i2c2_pins>;
      status = "okay";
      clock-frequency = <100000>;
      #address-cells = <0x1>;
      #size-cells = <0x0>;

      rtc: rtc@68 {
        compatible = "dallas,ds3231";
        reg = <0x68>;
      };
    };
  };
};

As per comment in the overlay file, can be compiled ("dtc" comes from same-name package on ArchLinuxARM) to the destination with:

dtc -O dtb -o /lib/firmware/BB-RTC-02-00A0.dtbo -b0 i2c2-rtc-ds3231.dts

Update 2017-09-19: This will always produce a warning for each "fragment" section, safe to ignore.

And then loaded on early boot (as soon as rootfs with "/lib/firmware" gets mounted) with "bone_capemgr.enable_partno=" cmdline addition, and should be put to something like "/boot/uEnv.txt", for example (with dtb path from command above):

setenv bootargs "... bone_capemgr.enable_partno=BB-RTC-02"

Docs in repository have more details and examples on how to write and manage these.

Update 2017-09-19:

Modern ArchLinuxARM always uses an initramfs image, which is the starting rootfs where kernel will look up stuff in /lib/firmware, so all *.dtbo files loaded via bone_capemgr.enable_partno= must be included there.

With Arch-default mkinitcpio, it's easy to do via FILES= in "/etc/mkinitcpio.conf" - e.g. FILES="/lib/firmware/BB-RTC-02-00A0.dtbo" - and re-running the usual mkinitcpio -g /boot/initramfs-linux.img command.

If using e.g. i2c-1 instead (with &bb_i2c1_pins), BB-I2C1 should also be included and loaded there, strictly before rtc overlay.

Update 2017-09-19:

bone_capemgr seem to be broken in linux-am33x packages past 4.9, and produces kernel BUG instead of loading any overlays - be sure to try loading that dtbo at runtime (checking dmesg) before putting it into cmdline, as that might make system unbootable.

Workaround is to downgrade kernel to 4.9, e.g. one from beagleboard/linux tree, where it's currently the latest supported release.

That should ensure that this second RTC appears as "/dev/rtc1" (rtc0 is the internal one) on system startup, but unfortunately it still won't be the first one and kernel will already pick up time from internal rtc0 by the time this one gets detected.

Furthermore, systemd-enabled userspace (as in e.g. ArchLinuxARM) interacts with RTC via systemd-timedated and systemd-timesyncd, which both use "/dev/rtc" symlink (and can't be configured to use other devs), which by default udev points to rtc0 as well, and rtc1 - no matter how early it appears - gets completely ignored there as well.

So two issues are with "system clock" that kernel keeps and userspace daemons using wrong RTC, which is default in both cases.

"/dev/rtc" symlink for userspace gets created by udev, according to "/usr/lib/udev/rules.d/50-udev-default.rules", and can be overidden by e.g. "/etc/udev/rules.d/55-i2c-rtc.rules":

SUBSYSTEM=="rtc", KERNEL=="rtc1", SYMLINK+="rtc", OPTIONS+="link_priority=10", TAG+="systemd"

This sets "link_priority" to 10 to override SYMLINK directive for same "rtc" dev node name from "50-udev-default.rules", which has link_priority=-100.

Also, TAG+="systemd" makes systemd track device with its "dev-rtc.device" unit (auto-generated, see systemd.device(5) for more info), which is useful to order userspace daemons depending on that symlink to start strictly after it's there.

"userspace daemons" in question on a basic Arch are systemd-timesyncd and systemd-timedated, of which only systemd-timesyncd starts early on boot, before all other services, including systemd-timedated, and (for early-boot clock-dependant services).

So basically if proper "/dev/rtc" and system clock gets initialized before systemd-timesyncd (or whatever replacement, like ntpd or chrony), correct time and rtc device will be used for all system daemons (which start later) from here on.

Adding that extra step can be done as a separate systemd unit (to avoid messing with shipped systemd-timesyncd.service), e.g. "i2c-rtc.service":

[Unit]
Wants=dev-rtc.device
After=dev-rtc.device
Before=systemd-timesyncd.service ntpd.service chrony.service

[Service]
DeviceAllow=/dev/rtc rw
ExecStart=/usr/bin/hwclock -f /dev/rtc --hctosys


Update 2017-09-19: -f /dev/rtc must be specified these days, as hwclock seem to use /dev/rtc0 by default, pretty sure it didn't used to.

Note that Before= above should include whatever time-sync daemon is used on the machine, and there's no harm in listing non-existant or unused units there jic.

Most security-related stuff and conditions are picked from systemd-timesyncd unit file, which needs roughly same access permissions as "hwclock" here.

With udev rule and that systemd service (don't forget to "systemctl enable" it), boot sequence goes like this:

  • Kernel inits internal rtc0 and sets system clock to 1970-01-01.
  • Kernel starts systemd.
  • systemd mounts local filesystems and starts i2c-rtc asap.
  • i2c-rtc, due to Wants/After=dev-rtc.device, starts waiting for /dev/rtc to appear.
  • Kernel detects/initializes ds1307 i2c device.
  • udev creates /dev/rtc symlink and tags it for systemd.
  • systemd detects tagging event and activates dev-rtc.device.
  • i2c-rtc starts, adjusting system clock to realistic value from battery-backed rtc.
  • systemd-timesyncd starts, using proper /dev/rtc and correct system clock value.
  • activates, as it is scheduled to, after systemd-timesyncd and i2c-rtc.
  • From there, boot goes on to, and starts all the daemons.

udev rule is what facilitates symlink and tagging, i2c-rtc.service unit is what makes boot sequence wait for that /dev/rtc to appear and adjusts system clock right after that.

Haven't found an up-to-date and end-to-end description with examples anywhere, so here it is. Cheers!

Oct 22, 2015

Arch Linux chroot and sshd from boot on rooted Android with SuperSU

Got myself Android device only recently, and first thing I wanted to do, of course, was to ssh into it.

But a quick look at the current F-Droid and Play apps shows that they're either based on a quite limited dropbear (though perfectly fine for one-off shell access) or "based on openssh code", where the infamous Debian OpenSSL patch comes to mind, not to mention that most look like ad-ridden proprietary crap.

Plus I'd want a proper package manager, shell (zsh), tools and other stuff in there, not just some baseline bash and busybox, and chroot with regular linux distro is a way to get all that plus standard OpenSSH daemon.

Under the hood, a modern Android (5.1.1 in my case, with CM 12) phone is not much more than a Java VM running on top of an SELinux-enabled (which I opted to keep for Android stuff) Linux kernel, on top of some multi-core ARMv7 CPU (quadcore in my case, rather identical to one in RPi2).

Steps I took to have sshd running on boot with such device:

  • Flush stock firmware (usually loaded with adware and crapware) and install some pre-rooted firmware image from xda-developers or such.

    My phone is a second-hand Samsung Galaxy S3 Neo Duos (GT-I9300I), and I picked resurrection remix CM-based ROM for it, built for this phone from t3123799, with Open GApps arm 5.1 Pico (minimal set of google stuff).

    That ROM (like most of them, it appears) comes with bash, busybox and - most importantly - SuperSU, which runs its unrestricted init, and is crucial to getting chroot-script to start on boot. All three of these are needed.

    Under Windows, there's Odin suite for flashing zip with all the goodies to USB-connected phone booted into "download mode".

    On Linux, there's Heimdall (don't forget to install adb with that, e.g. via pacman -S android-tools android-udev on Arch), which can dd img files for diff phone partitions, but doesn't seem to support "one zip with everything" format that most firmwares come in.

    Instead of figuring out which stuff from zip to upload where with Heimdall, I'd suggest grabbing a must-have TWRP recovery.img (small os that boots in "recovery mode", grabbed one for my phone from t2906840), flashing it with e.g. heimdall flash --RECOVERY recovery.twrp- and then booting into it to install whatever main OS and apps from zip's on microSD card.

    TWRP (or similar CWM one) is really useful for a lot of OS-management stuff like app-packs installation (e.g. Open GApps) or removal, updates, backups, etc, so I'd definitely suggest installing one of these as recovery.img as the first thing on any Android device.

  • Get or make ARM chroot tarball.

    I did this by bootstrapping a stripped-down ARMv7 Arch Linux ARM chroot on a Raspberry Pi 2 that I have around:

    # pacman -Sg base
     ### Look through the list of packages,
     ###  drop all the stuff that won't ever be useful in chroot on the phone,
     ###  e.g. pciutils, usbutils, mdadm, lvm2, reiserfsprogs, xfsprogs, jfsutils, etc
     ### pacstrap chroot with whatever is left
    # mkdir droid-chroot
    # pacstrap -i -d droid-chroot bash bzip2 coreutils diffutils file filesystem \
        findutils gawk gcc-libs gettext glibc grep gzip iproute2 iputils less \
        licenses logrotate man-db man-pages nano pacman perl procps-ng psmisc \
        sed shadow sysfsutils tar texinfo util-linux which
     ### Install whatever was forgotten in pacstrap
    # pacman -r droid-chroot -S --needed atop busybox colordiff dash fping git \
        ipset iptables lz4 openssh patch pv rsync screen xz zsh fcron \
        python2 python2-pip python2-setuptools python2-virtualenv
     ### Make a tar.gz out of it all
    # rm -f droid-chroot/var/cache/pacman/pkg/*
    # tar -czf droid-chroot.tar.gz droid-chroot

    Same can be obviously done with debootstrap or whatever other distro-of-choice bootstrapping tool, but likely has to be done on a compatible architecture - something that can run ARMv7 binaries, like RPi2 in my case (though VM should do too) - to run whatever package hooks upon installs.

    Easier way (that won't require having spare ARM box or vm) would be to take a pre-made image for any ARMv7 platform, from the list of supported platforms or such, but unless there's something very generic (which e.g. "ArchLinuxARM-armv7-latest.tar.gz" seems to be), there's likely to be some platform-specific cruft like kernel, modules, firmware blobs, SoC-specific tools and such... nothing that can't be removed at any later point with package manager or simple "rm", of course.

    While architecture in my case is ARMv7, which is quite common nowadays (2015), other devices can have Intel SoCs with x86 or newer 64-bit ARMv8 CPUs. For x86, bootstrapping can obviously be done on pretty much any desktop/laptop machine, if needed.

  • Get root shell on device and unpack chroot tarball there.

    adb shell (pacman -S android-tools android-udev or other distro equivalent, if missing) should get "system" shell on a USB-connected phone.

    (btw, "adb" access have to be enabled on the phone via some common "tap 7 times on OS version in Settings-About then go to Developer-options" dance, if not already)

    With SuperSU (or similar "su" package) installed, next step would be running "su" there to get unrestricted root, which should work.

    /system and /data should be on ext4 (check mount | grep ext4 for proper list), f2fs or such "proper" filesystem, which is important to have for all the unix permission bits and uid/gid info which e.g. FAT can't handle (loopback img with ext4 can be created in that case, but shouldn't be necessary in case of internal flash storage, which should have proper fs'es already).

    /system is a bad place for anything custom, as it will be completely flushed on most main-OS changes (when updating ROM from zip with TWRP, for instance), and is mounted with "ro" on boot anyway.

    Any subdir in /data seem to work fine, though one obvious pre-existing place - /data/local - is probably a bad idea, as it is used by some Android dev tools already.

    With busybox and proper bash on the phone, unpacking tarball from e.g. microSD card should be easy:

    # mkdir -m700 /data/chroots
    # cd /data/chroots
    # tar -xpf /mnt/sdcard/droid-chroot.tar.gz

    It should already work, too, so...

    # cd droid-chroot
    # mount -o bind /dev dev \
        && mount -o bind /dev/pts dev/pts \
        && mount -t proc proc proc \
        && mount -t sysfs sysfs sys
    # env -i TERM=$TERM SHELL=/bin/zsh HOME=/root $(which chroot) . /bin/zsh

    ...should produce a proper shell in a proper OS, yay! \o/

    Furthermore, to be able to connect there directly, without adb or USB cable, env -i $(which chroot) . /bin/sshd should work too.

    For sshd in particular, one useful thing to do here is:

    # $(which chroot) . /bin/ssh-keygen -A
 populate /etc/ssh with keys, which are required to start sshd.

  • Setup init script to run sshd or whatever init-stuff from that chroot on boot.

    Main trick here is to run it with unrestricted SELinux context (unless SELinux is disabled entirely, I guess).

    This makes /system/etc/init.d using "sysinit_exec" and /data/local/ with "userinit_exec" unsuitable for the task - only something like "init" ("u:r:init:s0") will work.

    SELinux on Android is documented in Android docs, and everything about SELinux in general applies there, of course, but some su-related roles like above "userinit_exec" actually come with CyanogenMod or whatever similar hacks on top of the base Android OS.

    Most relevant info on this stuff comes with SuperSU though (or rather its libsuperuser docs).

    That doc has info on how to patch policies, to e.g. transition to unrestricted role for chroot init, setup sub-roles for stuff in there (to also use SELinux in a chroot), which contexts are used where, and - most useful in this case - which custom "init" dirs are used at which stages of the boot process.

    Among other useful stuff, it specifies/describes /system/su.d init-dir, from which SuperSU runs scripts/binaries with unrestricted "init" context, and very early in the process too, hence it is most suitable for starting chroot from.

    So, again, from root (after "su") shell:

    # mount -o remount,rw /system
    # mkdir -m700 /system/su.d
    # cat >/system/su.d/ <<EOF
    #!/system/bin/sh
    exec /data/local/chroots.bash
    EOF
    # chmod 700 /system/su.d/
    # cat >/data/local/chroots.bash <<'EOF'
    #!/system/bin/sh
    export PATH=/sbin:/vendor/bin:/system/sbin:/system/bin:/system/xbin
    log=/data/local/chroots.log
    [[ $(du -m "$log" | awk '{print $1}') -gt 20 ]] && mv "$log"{,.old}
    exec >>"$log" 2>&1
    echo " --- Started $(TZ=UTC date) --- "
    log -p i -t chroots "Starting chroot: droid-chroot"
    /data/chroots/ &
    log -p i -t chroots "Finished chroots init"
    echo " --- Finished $(TZ=UTC date) --- "
    # chmod 700 /data/local/chroots.bash
    # cd /data/chroots
    # mkdir -p droid-chroot/mnt/storage
    # ln -s droid-chroot/
    # cat >droid-chroot/ <<'EOF'
    #!/system/bin/sh
    set -e -o pipefail
    usage() {
      bin=$(basename $0)
      echo >&2 "Usage: $bin [ stop | chroot ]"
      exit ${1:-0}
    }
    [[ "$#" -gt 1 || "$1" = -h || "$1" = --help ]] && usage
    cd /data/chroots/droid-chroot
    sshd_pid=$(cat run/ 2>/dev/null ||:)
    mountpoint -q dev || mount -o bind /dev dev
    mountpoint -q dev/pts || mount -o bind /dev/pts dev/pts
    mountpoint -q proc || mount -t proc proc proc
    mountpoint -q sys || mount -t sysfs sysfs sys
    mountpoint -q tmp || mount -o nosuid,nodev,size=20%,mode=1777 -t tmpfs tmpfs tmp
    mountpoint -q run || mount -o nosuid,nodev,size=20% -t tmpfs tmpfs run
    mountpoint -q mnt/storage || mount -o bind /data/media/0 mnt/storage
    case "$1" in
        [[ -z "$sshd_pid" ]] || kill "$sshd_pid"
        exit 0 ;;
        exec env -i\
          TERM="$TERM" SHELL=/bin/zsh HOME=/root\
          /system/xbin/chroot . /bin/zsh ;;
      *) [[ -z "$1" ]] || usage 1 ;;
    [[ -n "$sshd_pid" ]]\
      && kill -0 "$sshd_pid" 2>/dev/null\
      || exec env -i /system/xbin/chroot . /bin/sshd
    EOF
    # chmod 700 droid-chroot/

    To unpack all that wall-of-shell a bit:

    • Very simple /system/su.d/ is created, so that it can easily be replaced if/when /system gets flushed by some update, and also so that it won't need to be edited (needing rw remount) ever.

    • /data/local/chroots.bash is an actual init script for whatever chroots, with Android logging stuff (accessible via e.g. adb logcat, useful to check if script was ever started) and simplier more reliable (and rotated) log in /data/local/chroots.log.

    • /data/chroots/ is a symlink to init script in /data/chroots/droid-chroot/, so that this script can be easily edited from inside of the chroot itself.

    • /data/chroots/droid-chroot/ is the script that mounts all the stuff needed for the chroot and starts sshd there.

      Can also be run from adb root shell to do the same thing, with "stop" arg to kill that sshd, or with "chroot" arg to do all the mounts and chroot into the thing from whatever current sh.

      Basically everything to do with that chroot from now on can/should be done through that script.

    "cat" commands can obviously be replaced with "nano" and copy-paste there, or copying same (or similar) scripts from card or whatever other paths (to avoid pasting them into shell, which might be less convenient than Ctrl+S in $EDITOR).

  • Reboot, test sshd, should work.

Anything other than sshd can also be added to that init script, to make some full-featured dns + web + mail + torrents server setup start in chroot.

With more than a few daemons, it'd probably be a good idea to start just one "daemon babysitter" app from there, such as runit, daemontools or whatever. Maybe even systemd will work, though unlikely, given how it needs udev, lots of kernel features and apis initialized in its own way, and such.

Obvious caveat for running a full-fledged linux separately from main OS is that it should probably be managed through local webui's or from some local terminal app, and won't care much about power management and playing nice with Android stuff.

Android won't play nice with such a parasite OS either, cutting network or suspending the device whenever it feels convenient, without any regard for such unconventional stuff running there, though it can be configured not to do that.

As I'm unlikely to want this device as a phone ever (who needs these, anyway?), turning it into something more like wireless RPi2 with a connected management terminal (represented by Android userspace) sounds like the only good use for it so far.

Update 2016-05-16: Added note on ssh-keygen and rm for pacman package cache after pacstrap.
