Dec 07, 2015

Resizing first FAT32 partition to microSD card size on boot from Raspberry Pi

One other thing I've needed to do recently is to have Raspberry Pi OS resize its /boot FAT32 partition to full card size (i.e. "make it as large as possible") from right underneath itself.

RPis usually have a first FAT (fat16 / fat32 / vfat) partition, which the firmware needs to load config.txt and uboot stuff from, and it is the only partition one can see in Windows OS when plugging the microSD card into a card-reader (which is a kinda arbitrary OS limitation).

Map of the usual /dev/mmcblk0 on RPi (as seen in parted):

Number  Start   End     Size    Type     File system  Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  106MB   105MB   primary  fat16        lba
 2      106MB   1887MB  1782MB  primary  ext4

Resizing that first partition is naturally difficult, as it is followed by the ext4 one with RPi's OS, but when you want to have a small (e.g. <2G) and easy-to-write "rpi.img" file for any microSD card, there doesn't seem to be a way around that - the image has to have initial partitions as small as possible to fit on any card.

Things get even more complicated by the fact that there don't seem to be any tools around for resizing FAT fs'es, so it has to be re-created from scratch.

There is quite an easy way around all these issues however, which can be summed-up as a sequence of the following steps:

  • Start while rootfs is mounted read-only or when it can be remounted as such, i.e. on early boot.

    Before=systemd-remount-fs.service local-fs-pre.target in systemd terms.

  • Grab sfdisk/parted map of the microSD and check if there's "Free Space" chunk left after last (ext4/os) partition.

    If there is, there's likely a lot of it, as SD cards increase in 2x size factors, so 4G image written on larger card will have 4+ gigs there, in fact a lot more for 16G or 32G cards.

    Or there can be only a few megs there, in case of matching card size, where it's usually a good idea to make slightly smaller images, as actual cards do vary in size a bit.

  • "dd" whole rootfs to the end of the microSD card.

    This is safe with read-only rootfs, and the dumb "dd" approach to copying it (as opposed to dmsetup + mkfs + cp) seems to be the simplest and least error-prone.

  • Update partition table to have rootfs in the new location (at the very end of the card) and boot partition covering rest of the space.

  • Initiate reboot, so that OS will load from the new rootfs location.

  • Starting on early-boot again, remount rootfs rw if necessary, and temporarily copy all contents of the boot partition (which should still be small) to rootfs.

  • Run mkfs.vfat on the new large boot partition and copy stuff back to it from rootfs.

  • Reboot once again, in case whatever boot timeouts got triggered.

  • Avoid running same thing on all subsequent boots.

    E.g. touch /etc/boot-resize-done and have ConditionPathExists=!/etc/boot-resize-done in the systemd unit file.

That should do it \o/
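The arithmetic behind the "dd to the end of the card, then update partition table" steps can be sketched roughly like this. It is only an illustration under assumptions that are mine, not taken from the script: an msdos partition table, 1 MiB alignment, and a hypothetical plan_layout() helper. It also only covers the case where old and new rootfs locations don't overlap, since a plain front-to-back copy would overwrite not-yet-read source blocks otherwise.

```python
MiB = 2**20

def plan_layout(dev_size, boot_start, root_start, root_size, align=MiB):
    """Return new (boot_start, boot_size, root_start) in bytes,
    with rootfs moved to the end of the card, or None if nothing can be moved."""
    # Highest aligned offset where rootfs of this size still fits on the device
    new_root_start = (dev_size - root_size) // align * align
    if new_root_start <= root_start:
        return None  # no usable free space after rootfs
    if new_root_start < root_start + root_size:
        # Source and destination ranges overlap, a naive dd would corrupt the copy
        raise ValueError('old and new rootfs locations overlap')
    # Boot partition keeps its start and grows up to the new rootfs start
    return boot_start, new_root_start - boot_start, new_root_start
```

All values are byte offsets, e.g. as taken from a "unit B" parted map of the card.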

resize-rpi-fat32-for-card (in fgtk repo) is a script I wrote to do all of this stuff, exactly as described.

systemd unit file for the thing (can also be printed by running script with "--print-systemd-unit" option):

[Unit]
DefaultDependencies=no
After=systemd-fsck-root.service
Before=systemd-remount-fs.service -.mount local-fs-pre.target local-fs.target
ConditionPathExists=!/etc/boot-resize-done

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/resize-rpi-fat32-for-card

[Install]
WantedBy=local-fs.target

It does use lsblk -Jnb JSON output to get rootfs device and partition, and get whether it's mounted read-only, then parted -ms /dev/... unit B print free to grab machine-readable map of the device.
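Picking the rootfs device and partition out of lsblk JSON could look something like the sketch below. The find_rootfs() helper and the idea of matching on "/" mountpoint are my assumptions for illustration, not necessarily what the script does, and it handles both the older "mountpoint" key and the "mountpoints" list that newer util-linux versions emit.

```python
import json

def find_rootfs(lsblk_json):
    """Return (disk_name, partition_name) for the partition mounted at "/", or None."""
    for disk in json.loads(lsblk_json)['blockdevices']:
        for part in disk.get('children', []):
            # Older util-linux has a "mountpoint" string, newer a "mountpoints" list
            mounts = part.get('mountpoints') or [part.get('mountpoint')]
            if '/' in mounts:
                return disk['name'], part['name']
    return None

# Hand-written sample resembling lsblk -J output, not captured from a real device
SAMPLE_LSBLK = '''{"blockdevices": [
  {"name": "mmcblk0", "mountpoint": null, "children": [
    {"name": "mmcblk0p1", "mountpoint": "/boot"},
    {"name": "mmcblk0p2", "mountpoint": "/"}]}]}'''

rootfs = find_rootfs(SAMPLE_LSBLK)
```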

sfdisk -J (also JSON output) could've been better option than parted (extra dep, which is only used to get that one map), except it doesn't conveniently list "free space" blocks and device size, pity.
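For illustration, parsing that machine-readable parted map can be as simple as the sketch below. It assumes the colon-separated "BYT;" format as I understand it, where free-space rows carry "free" in the filesystem field. The sample text is hand-made from the table at the top of this post (with a made-up device size and model), and the function names are hypothetical.

```python
def _bytes(value):
    return int(value.rstrip('B'))

def parse_parted_map(text):
    """Parse "parted -ms DEV unit B print free" output into (device_size, chunks)."""
    lines = [ln.strip().rstrip(';') for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != 'BYT':
        raise ValueError('expected "BYT;" header line')
    dev_size = _bytes(lines[1].split(':')[1])  # second line describes the device
    chunks = []
    for ln in lines[2:]:
        num, start, end, size, fs = ln.split(':')[:5]
        free = fs == 'free'
        chunks.append(dict(
            number=None if free else int(num), free=free, fs=fs,
            start=_bytes(start), end=_bytes(end), size=_bytes(size)))
    return dev_size, chunks

def trailing_free(chunks):
    """Size of the free space chunk after the last partition, 0 if there is none."""
    return chunks[-1]['size'] if chunks and chunks[-1]['free'] else 0

# Hand-made sample, not captured from a real card
SAMPLE_PARTED = '''BYT;
/dev/mmcblk0:3980394496B:sd/mmc:512:512:msdos:SD SU04G:;
1:32256B:1048575B:1016320B:free;
1:1048576B:105906175B:104857600B:fat16::lba;
2:105906176B:1887436799B:1781530624B:ext4::;
1:1887436800B:3980394495B:2092957696B:free;
'''

dev_size, chunks = parse_parted_map(SAMPLE_PARTED)
```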

If partition table doesn't have extra free space at the end, "fsstat" tool from sleuthkit is used to check whether FAT filesystem covers whole partition and needs to be resized.
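The same "does the fs cover the whole partition" check can also be sketched without fsstat, by reading the size fields straight from the FAT boot sector. To be clear, this is a swapped-in technique and not how the script does it. Field offsets are from the standard FAT BPB layout, while the function names and the 1 MiB slack threshold are arbitrary choices of mine.

```python
import struct

def fat_fs_size(boot_sector):
    """Return size in bytes of a FAT filesystem, given its first (boot) sector."""
    if boot_sector[510:512] != b'\x55\xaa':
        raise ValueError('missing boot sector signature')
    bytes_per_sector, = struct.unpack_from('<H', boot_sector, 11)
    total_16, = struct.unpack_from('<H', boot_sector, 19)  # zero on larger filesystems
    total_32, = struct.unpack_from('<I', boot_sector, 32)
    return bytes_per_sector * (total_16 or total_32)

def fat_needs_resize(boot_sector, part_size, slack=2**20):
    """True if the partition is noticeably larger than the filesystem inside it."""
    return part_size - fat_fs_size(boot_sector) > slack
```

The boot sector here would be the first 512 bytes read from the partition device, and part_size the partition size in bytes from the partition table.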

After that, and only if needed, either "dd + sfdisk" or "cp + mkfs.vfat + cp back" sequence gets executed, followed by a reboot command.

Extra options for the thing:

  • "--skip-ro-check" - don't bother checking/forcing read-only rootfs before the "dd" step, which should be fine if there's no activity there (e.g. early boot).

  • "--done-touch-file" - allows specifying the location of a file to create (if missing) when the "no resize needed" state gets reached.

    Script doesn't check whether this file exists and always does proper checks of partition table and "fsstat" when deciding whether something has to be done, only creates the file at the end (if it doesn't exist already).

  • "--overlay-image" uses splash.go tool that I've mentioned earlier (be sure to compile it first, ofc) to set some "don't panic, fs resize in progress" image (resized/centered and/or with text and background) during the whole process, using RPi's OpenVG GPU API, covering whatever console output.

  • Misc other stuff for setup/debug - "--print-systemd-unit", "--debug", "--reboot-delay".

    Easy way to debug the thing with these might be to add StandardOutput=tty to systemd unit's Service section and ... --debug --reboot-delay 60 options there, or possibly adding extra ExecStart=/bin/sleep 60 after the script (and changing its ExecStart= to ExecStart=-, so delay will still happen on errors).

    This should provide all the info on what's happening in the script (has plenty of debug output) to the console (one on display or UART).

One more link to the script: resize-rpi-fat32-for-card

Oct 05, 2014

Simple aufs setup for Arch Linux ARM and boards like RPi, BBB or Cubie

Experimenting with all kinds of arm boards lately (nyms above stand for Raspberry Pi, Beaglebone Black and Cubieboard), I can't help but feel a bit sorry for the microsd cards in each one of them.

These are even worse for non-bulk writes than SSDs, having fewer erase cycles plus larger blocks, and yet when used for all fs needs of the board, even typing "ls" into a shell will usually cause a write (unless the shell doesn't keep history, which sucks).

A great explanation of how they work can be found on LWN (as usual).

Easy and relatively hassle-free way to fix the issue is to use aufs, but as doing it for the whole rootfs requires initramfs (which is not needed here otherwise), it's a lot easier to only use it for commonly-writable parts - i.e. /var and /home in most cases.

Home for "root" user is usually /root, so to make it aufs material as well, it's better to move that to /home (which probably shouldn't be a separate fs on these devices), leaving /root as a symlink to that.

It seems to be impossible to do when logged-in as root (mv will error with EBUSY), but is trivial from any other machine:

# mount /dev/sdb2 /mnt # mount microsd
# cd /mnt
# mv root home/
# ln -s home/root
# cd
# umount /mnt

As aufs2 is already built into Arch Linux ARM kernel, only thing that's left is to add early-boot systemd unit for mounting it, e.g. /etc/systemd/system/aufs.service:

[Unit]
DefaultDependencies=false

[Install]
WantedBy=local-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=true

# Remount /home and /var as aufs
ExecStart=/bin/mount -t tmpfs tmpfs /aufs/rw
ExecStart=/bin/mkdir -p -m0755 /aufs/rw/var /aufs/rw/home
ExecStart=/bin/mount -t aufs -o br:/aufs/rw/var=rw:/var=ro none /var
ExecStart=/bin/mount -t aufs -o br:/aufs/rw/home=rw:/home=ro none /home

# Mount "pure" root to /aufs/ro for syncing changes
ExecStart=/bin/mount --bind / /aufs/ro
ExecStart=/bin/mount --make-private /aufs/ro

And then create the dirs used there and enable unit:

# mkdir -p /aufs/{rw,ro}
# systemctl enable aufs

Now, upon rebooting the board, you'll get aufs mounts for /home and /var, making all the writes there go to respective /aufs/rw dirs on tmpfs, while all the contents can still be read from the underlying rootfs.

To make sure systemd doesn't waste extra tmpfs space thinking it can sync logs to /var/log/journal, I'd also suggest doing this (before rebooting with aufs mounts):
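A quick way to verify that after reboot is to look for aufs entries in /proc/mounts. Small sketch of that below, where the aufs_mounts() name is made up and the only thing relied upon is the usual "device mountpoint fstype options ..." field order of that file.

```python
def aufs_mounts(mounts_text):
    """Return mountpoints with "aufs" fstype from /proc/mounts-style text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mountpoint, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] == 'aufs':
            result.append(fields[1])
    return result
```

On the board itself it can be used as aufs_mounts(open('/proc/mounts').read()), which should list /var and /home if the unit did its job.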

# rm -rf /var/log/journal
# ln -s /dev/null /var/log/journal

Can also be done via journald.conf with Storage=volatile.

One obvious caveat with aufs is, of course, how to deal with things that do expect to have permanent storage in /var - examples are pacman (Arch package manager) on system updates, postfix or any db.
For stock Arch Linux ARM though, it's only pacman on manual updates.

And depending on the app and how acceptable the loss of this data might be, the app dir in /var (e.g. /var/lib/pacman) can be either moved + symlinked to /srv or synced before shutdown or after it's done with writing (for manual oneshot apps like pacman).

For moving stuff back to permanent fs, aubrsync from aufs2-util.git can be used like this:

# aubrsync move /var/ /aufs/rw/var/ /aufs/ro/var/

As even pulling that from shell history can be a bit tedious, I've made a simpler ad-hoc wrapper - aufs_sync - that can be used (with mountpoints similar to the ones presented above) like this:

# aufs_sync
Usage: aufs_sync { copy | move | check } [module]
Example (flushes /var): aufs_sync move var

# aufs_sync check
/aufs/rw
/aufs/rw/home
/aufs/rw/home/root
/aufs/rw/home/root/.histfile
/aufs/rw/home/.wh..wh.orph
/aufs/rw/home/.wh..wh.plnk
/aufs/rw/home/.wh..wh.aufs
/aufs/rw/var
/aufs/rw/var/.wh..wh.orph
/aufs/rw/var/.wh..wh.plnk
/aufs/rw/var/.wh..wh.aufs
--- ... just does "find /aufs/rw"

# aufs_sync move
--- does "aubrsync move" for all dirs in /aufs/rw

Just be sure to check if any new apps might write something important there (right after installing these) and do symlinks (to something like /srv) for their dirs, as even having "aufs_sync copy" on shutdown definitely won't prevent data loss for these on e.g. sudden power blackout or any crashes.
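To help with that kind of checking, the "aufs_sync check" listing from above can be filtered to drop aufs' own ".wh..wh.*" bookkeeping entries, leaving only paths that something has actually written to. A rough sketch of it, where the changed_paths() name is made up and this is not what aufs_sync itself does (it's just "find"):

```python
import os

AUFS_INTERNAL_PREFIX = '.wh..wh.'

def changed_paths(rw_root):
    """List paths under the rw branch, skipping aufs-internal entries."""
    found = []
    for top, dirs, files in os.walk(rw_root):
        # Pruning dirs in-place stops os.walk from descending into them
        dirs[:] = [d for d in dirs if not d.startswith(AUFS_INTERNAL_PREFIX)]
        for name in dirs + files:
            if not name.startswith(AUFS_INTERNAL_PREFIX):
                found.append(os.path.join(top, name))
    return sorted(found)
```

With the mountpoints used in this post, that'd be changed_paths('/aufs/rw').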

Sep 26, 2013

FAT32 driver in python

Wrote a driver for still common FAT32 recently, while solving the issue with shuffle on cheap "usb stick with microsd slot" mp3 player.

It's kinda amazing how crappy firmware in these things can be.

Guess one should know better than to get such crap with 1-line display, gapful playback, weak battery, rewind at non-accelerating ~3x speed, no ability to pick tracks while playing and plenty of other annoying "features", but the main issue I've had with the thing by far is missing shuffle functionality - it only plays tracks in static order in which they were uploaded (i.e. how they're stored on fs).

Seems like whoever built the thing made it deliberately hard to shuffle the tracks offline - just one sort by name would've made things a lot easier, and it's clear that device reads the full dir listing from the time it spends opening dirs with lots of files.


Most obvious way to do such "offline shuffle", given how the thing orders files, is to re-upload tracks in different order, which is way too slow and wears out flash ram.

Second obvious for me was to dig into FAT32 and just reorder entries there, which is what the script does.

It's based off an example of a simpler fat16 parser in the construct module repo and processes all the necessary metadata structures like PDRs, FATs (cluster maps) and directory tables with vfat long-name entries inside.

Given that directory table on FAT32 is just an array (with vfat entries linked to dentries after them though), it's kinda easy just to shuffle entries there and write data back to clusters from where it was read.
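Rough idea of that shuffle on a raw directory table, as a standalone sketch and not the actual code from the script (which goes through construct structs). It relies on the standard layout: 32-byte entries, vfat long-name entries marked by attribute byte 0x0F and placed right before the dentry they belong to, 0x00 in the first byte as end-of-table and 0xE5 for deleted entries. Keeping "." / ".." and volume label entries at the front, and dropping deleted entries, are simplifications of mine.

```python
import random

ENTRY_SIZE = 32
ATTR_LFN = 0x0F
ATTR_VOLUME_ID = 0x08

def shuffle_dir_table(table, rng=random):
    """Return directory table bytes of the same length, with file entries shuffled."""
    if len(table) % ENTRY_SIZE:
        raise ValueError('table size must be a multiple of 32 bytes')
    pinned, groups, pending = [], [], []
    for off in range(0, len(table), ENTRY_SIZE):
        entry = table[off:off + ENTRY_SIZE]
        if entry[0] == 0x00:  # end-of-directory marker
            break
        if entry[0] == 0xE5:  # deleted entry, not carried over
            pending = []
            continue
        if entry[11] == ATTR_LFN:  # long-name part, belongs to the next dentry
            pending.append(entry)
            continue
        group = b''.join(pending) + entry  # keep lfn entries glued to their dentry
        pending = []
        if entry[0] == 0x2E or entry[11] & ATTR_VOLUME_ID:
            pinned.append(group)  # "." / ".." and volume label stay at the front
        else:
            groups.append(group)
    rng.shuffle(groups)
    out = b''.join(pinned + groups)
    return out + bytes(len(table) - len(out))  # zero-pad back to original size
```

Result has the same length as the input, so it can be written back to the same clusters the table was read from.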


One less obvious solution to shuffle, coming from understanding how vfat lfn entries work, is that one can actually force fs driver to reorder them by randomizing filename length, as it'll be forced to move longer entries to the end of the directory table.

But that idea came a bit too late, and such parser can be useful for extending FAT32 to whatever custom fs (with e.g. FUSE or 9p interface) or implementing some of the more complex hacks.


It's interesting that fat dentries can (and are apparently known to) store unix-like modes and uid/gid instead of some other less-useful attrs, but the linux driver doesn't seem to make use of it.

OS'es also don't allow symlinks or hardlinks on fat, while technically it's possible, as long as you keep these read-only - just create dentries that point to the same cluster.

Should probably work for both files and dirs and allow to create multiple hierarchies of the same files, like several dirs where same tracks are shuffled with different seed, alongside dirs where they're separated by artist/album/genre or whatever other tags.
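As a sketch of what such a "hardlink" amounts to on the dentry level: copy the 32-byte short entry under a new 8.3 name, so that both entries reference the same first cluster and size. Offsets below are from the standard FAT32 dentry layout, function names are made up for illustration, and setting the read-only attribute bit on the copy is just my nod to the "keep these read-only" part. Fs checkers would likely flag such entries as cross-linked files.

```python
import struct

ATTR_READ_ONLY = 0x01

def first_cluster(entry):
    """Return first cluster number stored in a 32-byte FAT32 short dentry."""
    hi, = struct.unpack_from('<H', entry, 20)  # high 16 bits
    lo, = struct.unpack_from('<H', entry, 26)  # low 16 bits
    return hi << 16 | lo

def clone_dentry(entry, new_name):
    """Return a read-only copy of the dentry under a new 11-byte 8.3 name."""
    if len(entry) != 32 or len(new_name) != 11:
        raise ValueError('expected 32-byte entry and 11-byte space-padded name')
    # Cluster pointer and file size in the tail of the entry are kept as-is
    return new_name + bytes([entry[11] | ATTR_READ_ONLY]) + entry[12:]
```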

It's very fast and cheap to create these, as each is basically about "(name_length + 32B) * file_count" in size, which is like just 8 KiB for dir structure holding 100+ files.
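For a bit more precise estimate than the formula above: each vfat long-name entry is 32 bytes and holds up to 13 UTF-16 characters, plus there's one 32-byte short dentry per file. Sketch below assumes that every file gets long-name entries (short 8.3 uppercase names can do without them) and that names don't have characters outside the BMP, with dir_table_size() being a made-up helper name.

```python
import math

def dir_table_size(names):
    """Estimate directory table size in bytes for a list of file names."""
    # One short dentry plus ceil(len / 13) long-name entries per file, 32B each
    return sum(32 * (1 + math.ceil(len(name) / 13)) for name in names)

# 100 files with 26-character names take 3 entries (96 bytes) each
size = dir_table_size(['x' * 26] * 100)
```

That comes out to 9600 bytes, i.e. in the same ballpark as the rough number above.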

So plan is to extend this small hack to use mutagen to auto-generate such hierarchies in the future, or maybe hook it directly into beets as an export plugin, combined with transcoding, webui and convenient music-db there.

Also, can finally tick off "write proper on-disk fs driver" from "things to do in life" list ;)
