Aug 08, 2013
Most advice wrt encryption on remote hosts (VPS, VDS) doesn't seem to involve
full-disk encryption as such, but is rather limited to encrypting /var and
/home, so that the machine boots from a non-encrypted / and you'll be able to
ssh to it, decrypt these parts manually, then start services that use data there.
That seems to be in contrast with what's generally used on local machines - make
a LUKS container right on top of the physical disk device, except for /boot (if
it's not on a USB key), and don't let that encryption layer bother you anymore.
The two policies seem to differ in that the former is opt-in - you have to
actively think about which data to put onto the encrypted part (e.g. /etc/ssl
has private keys? move them to /var, shred from /etc) - while the latter is
opt-out: everything is encrypted, period.
So, in the spirit of that opt-out way, I thought it'd be a drag to double-think
about where each bit of data should be stored, and that it'd be better to just
go ahead and put everything possible into an encrypted container on a remote
host as well, leaving only /boot with kernel and initramfs in the clear.
Naturally, to enter the encryption password without having it stored alongside
the LUKS header, some remote login from the network is in order, and sshd seems
to be the secure and easy way to go about it.
The initramfs in question should then also be able to set up networking, which
luckily dracut can do. OpenSSH's sshd is a bit too heavy for it though, but
there are much lighter sshd's like
dropbear.
Searching around for someone to tie the two things up, I found a bit incomplete
and non-packaged solutions like this RH enhancement proposal and a set of
hacky scripts and instructions in the dracut-crypt-wait repo on bitbucket.
The approach outlined in RH bugzilla is to make the dracut "crypt" module
operate normally and let cryptsetup query for the password on the linux
console, but also start sshd in the background, where one can log in and use a
simple tool to echo the password to that console (without having it echoed back).
dracut-crypt-wait does a clever hack of removing the "crypt" module hook
instead, basically creating a "rescue" console on sshd, where the user has to
manually do all the decryption necessary and then signal initramfs to proceed
with the boot.
I thought the first way was rather more elegant and clever, letting dracut
figure out which device to decrypt and start cryptsetup with all the necessary,
configured and documented parameters, while still allowing to type the
passphrase into the console - best of both worlds - so I went along with that
one, creating the dracut-crypt-sshd project.
As README there explains, using it is as easy as adding it into dracut.conf (or
passing to dracut on command line) and adding networking to grub.cfg, e.g.:
menuentry "My Linux" {
    linux /vmlinuz ro root=LABEL=root
        rd.luks.uuid=7a476ea0 rd.lvm.vg=lvmcrypt rd.neednet=1
        ip=88.195.61.177::88.195.61.161:255.255.255.224:myhost:enp0s9:off
    initrd /dracut.xz
}
("ip=dhcp" might be a simpler way to go, but doesn't yield a default route in my case)
And there you'll have sshd on that IP, port 2222 (configurable), with keys
pre-generated during the dracut build, which it might be a good idea to put into
"known_hosts" for that ip/port somewhere. "authorized_keys" is taken from
/root/.ssh by default, but is also configurable via dracut.conf, if necessary.
Apart from sshd, that module includes two tools for interaction with console -
console_peek and console_auth (derived from auth.c in the bugzilla link above).
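console_auth presumably works the way auth.c from the bugzilla link does - pushing passphrase bytes into the console's input queue, so the cryptsetup prompt there receives them as if typed. A rough python sketch of that mechanism (an illustration, not the actual C tool; the tty path here is an assumption):

```python
import fcntl, getpass, termios

def push_line(tty_path, line):
    # TIOCSTI stuffs one byte into the terminal's input queue, as if it
    # was typed there; needs appropriate privileges for the target tty.
    with open(tty_path, 'wb', buffering=0) as tty:
        for b in (line + '\n').encode():
            fcntl.ioctl(tty, termios.TIOCSTI, bytes([b]))

if __name__ == '__main__':
    # Read passphrase without echo, then "type" it on the system console.
    push_line('/dev/console', getpass.getpass('Passphrase: '))
```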
Logging in to that sshd then yields sequence like this:
[214] Aug 08 13:29:54 lastlog_perform_login: Couldn't stat /var/log/lastlog: No such file or directory
[214] Aug 08 13:29:54 lastlog_openseek: /var/log/lastlog is not a file or directory!
# console_peek
[ 1.711778] Write protecting the kernel text: 4208k
[ 1.711875] Write protecting the kernel read-only data: 1116k
[ 1.735488] dracut: dracut-031
[ 1.756132] systemd-udevd[137]: starting version 206
[ 1.760022] tsc: Refined TSC clocksource calibration: 2199.749 MHz
[ 1.760109] Switching to clocksource tsc
[ 1.809905] systemd-udevd[145]: renamed network interface eth0 to enp0s9
[ 1.974202] 8139too 0000:00:09.0 enp0s9: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 1.983151] dracut: sshd port: 2222
[ 1.983254] dracut: sshd key fingerprint: 2048 0e:14:...:36:f9 root@congo (RSA)
[ 1.983392] dracut: sshd key bubblebabble: 2048 xikak-...-poxix root@congo (RSA)
[185] Aug 08 13:29:29 Failed reading '-', disabling DSS
[186] Aug 08 13:29:29 Running in background
[ 2.093869] dracut: luksOpen /dev/sda3 luks-...
Enter passphrase for /dev/sda3:
[213] Aug 08 13:29:50 Child connection from 188.226.62.174:46309
[213] Aug 08 13:29:54 Pubkey auth succeeded for 'root' with key md5 0b:97:bb:...
# console_auth
Passphrase:
#
The first command - "console_peek" - allows seeing which password is requested
(if any), and the second one allows supplying it.
Note that fingerprints of host keys are also echoed to console on sshd start,
in case one has access to console but still needs sshd later.
I quickly found out that such an initramfs with sshd is also a great and robust
rescue tool, especially if the "debug" and/or "rescue" dracut modules are enabled.
And as it includes fairly comprehensive network-setup options, it might be a
good way to boot multiple different OSes with the same (machine-specific)
network parameters.
The probably-obligatory disclaimer for such a post should mention that the
crypto above won't save you from a malicious hoster or whatever
three-letter agency that will coerce it into cooperation, should it take
interest in your poor machine - they'll just extract keys from a RAM image
(especially if it's a virtualized VPS) or backdoor the kernel/initramfs and
force a reboot.
The threat model here is more trivial - being able to turn off and decommission
a host without fear of the disks/images then falling into some other party's
hands, which might also happen if the hoster eventually goes bust or
sells/scraps disks due to age or bad blocks.
Also, even a minor inconvenience like forcing them to extract keys as outlined
above might be helpful against the well-known "we came fishing to a datacenter,
shut everything down, give us all the hardware in these racks" tactic employed
by some agencies.
Absolute security is a myth, but these measures are trivial and practical
enough to be employed casually, cutting off at least some of the basic threats.
So, yay for dracut, the amazingly cool and hackable initramfs project, which
made it that easy.
Code link: https://github.com/mk-fg/dracut-crypt-sshd
Jun 09, 2013
I've been using the Hyperboria darknet for about a month now, and after a late
influx of russian users (following this article) my node got plenty of peers,
so it's forwarding a bit of network traffic.
Being a darknet-proper, of course, you can't see what kind of traffic it is or
to whom it goes (though cjdns doesn't have anonymity as a goal), but I thought
it'd be nice to at least know when my internet lags due to someone launching a
DoS flood or abusing torrents.
Over the Internet (called "clearnet" here), cjdns peers over udp, but linux
conntrack seems to be good enough to track these "connections" just as if they
were stateful tcp flows.
Simple-ish traffic accounting on vanilla linux usually boils down to ulogd2,
which can use packet-capturing interfaces (raw sockets via libpcap, netfilter
ULOG and NFLOG targets), but that's kinda heavy-handed here - the traffic is
opaque and only the endpoints matter, so another one of its interfaces seems
to be a better option - conntrack tables/events.
The handy conntrack-tools (or /proc/net/{ip,nf}_conntrack) can track all the
connections, including simple udp-based ones (like cjdns uses), producing
entries like:
udp 17 179 \
src=110.133.5.117 dst=188.226.51.71 sport=52728 dport=8131 \
src=188.226.51.71 dst=110.133.5.117 sport=8131 dport=52728 \
[ASSURED] mark=16 use=1
First trick is to enable the packet/byte counters there, which is a simple, but
default-off sysctl knob:
# sysctl -w net.netfilter.nf_conntrack_acct=1
That will add "bytes=" and "packets=" values there for both directions.
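Note that the same src=/dst=/counter keys appear twice per entry - once for the original direction and once for the reply - so anything parsing these lines has to keep the two sets apart. A minimal sketch of such a parser (illustrative, not what ulogd2 does internally):

```python
def parse_conntrack(line):
    # First occurrence of a key goes to the "orig" direction, a repeat to
    # "reply"; entry-level keys like mark= just land in "orig", which is
    # good enough for pulling out per-direction bytes=/packets= counters.
    orig, reply = dict(), dict()
    for tok in line.split():
        if '=' not in tok:
            continue
        k, v = tok.split('=', 1)
        (orig if k not in orig else reply)[k] = v
    return orig, reply
```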
Of course, polling the table is a good way to introduce extra hangs into the
system (/proc files are basically hooks that tend to lock stuff to get
consistent reads) and to lose stuff in-between polls, so luckily there's an
event-based netlink interface and the ulogd2 daemon to monitor it.
One easy way to pick both incoming and outgoing udp flows in ulogd2 is to add
connmarks to these:
-A INPUT -p udp --dport $cjdns_port -j CONNMARK --set-xmark 0x10/0x10
-A OUTPUT -p udp --sport $cjdns_port -j CONNMARK --set-xmark 0x10/0x10
Then set up filtering by these in ulogd.conf:
...
stack=log:NFCT,mark:MARK,ip2str:IP2STR,print:PRINTFLOW,out:GPRINT
[log]
accept_proto_filter=udp
[mark]
mark=0x10
mask=0x10
[out]
file="/var/log/ulogd2/cjdns.log"
This should produce a parseable log of all the traffic flows with IPs and such.
A fairly simple script can then be used to push this data to graphite, munin,
ganglia, cacti or whatever time-series graphing/processing tool.
The linked script is for graphite's "carbon" interface.
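Carbon's plaintext protocol is trivial - one "metric value timestamp" line per datapoint over tcp, usually on port 2003 - so such a pusher fits in a couple of functions. A sketch of the idea (not the linked script; the key names in the regexp and the metric naming are illustrative assumptions, not ulogd2's exact output format):

```python
import re, socket, time

# Pull (src_ip, packets, bytes) for the original direction out of one
# key=value flow-log line; the key names here are assumptions.
flow_re = re.compile(
    r'orig\.ip\.saddr=(?P<src>[\d.]+).*?'
    r'orig\.raw\.pktcount=(?P<pkts>\d+).*?'
    r'orig\.raw\.pktlen=(?P<len>\d+)')

def parse_flow(line):
    m = flow_re.search(line)
    if not m:
        return None
    return m.group('src'), int(m.group('pkts')), int(m.group('len'))

def send_metric(sock, name, value, ts=None):
    # carbon plaintext protocol: "<metric path> <value> <unix time>\n"
    sock.sendall('{} {} {}\n'.format(
        name, value, int(ts or time.time())).encode())

# Usage sketch, assuming carbon listens on localhost:2003:
#   sock = socket.create_connection(('localhost', 2003))
#   for line in open('/var/log/ulogd2/cjdns.log'):
#       flow = parse_flow(line)
#       if flow:
#           send_metric(sock, 'cjdns.peers.{}.bytes'.format(
#               flow[0].replace('.', '_')), flow[2])
```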
Jun 06, 2013
Wanted to share three kinda-big-deal fixes I've added to my firefox:
- A patch to remove the sticky-on-top focus-grabbing "Do you want to activate
  plugins on this page?" popup.
- A patch to prevent plugins (e.g. Adobe Flash) from ever grabbing firefox
  hotkeys like "Ctrl + w" (close tab) or F5, forcing a click outside
  e.g. the YouTube video window to get back to ff.
- An easy "toggle js" fix for JavaScript on pages grabbing controls like
  keyboard and mouse (e.g. overriding F5 to retweet instead of reloading the
  page, preventing copy-paste in forms and on pages, etc).
Lately, firefox seems to give more and more control into the hands of web
developers, who seem to be hell-bent on abusing that to make browsing UX a
living hell.
FF bug-reports about Flash grabbing all the focus date back to 2001 and are
unresolved still.
Sites override Up/Down, Space, PgUp/PgDown, F5, Ctrl+T/W - I've no idea why.
Guess some JS developers just don't use the keyboard at all, which is somewhat
understandable given the spread of tablet devices these days.
Overriding clicks in forms to prevent pasting an email/password seems to
completely ignore the valid (or so I think) use-case of using some storage app
for these.
And the native "click-to-play" switch seems to be hilariously unusable in FF,
giving cheerful "Hey, there's flash here! Let me pester you with this on every
page load!" popups.
All are known issues, and none of them seem to be going away anytime soon, so
onwards to the fixes.
Removing the "Do you want to activate plugins" thing seems to be a
straightforward one-liner js patch, as it's implemented in
"browser/base/content/browser-plugins.js" - the whole fix is adding
this._notificationDisplayedOnce = true; to break the check there.
The "notificationDisplayedOnce" flag is used to not show that popup again on
the same page within the same browsing session, afaict.
The patch for plugin focus is clever - all one has to do is switch focus to the
browser window (from the embedded flash widget) before the keypress gets
processed, and ff will handle it correctly.
Hackish plugin + ad-hoc perl script solution (to avoid patching/rebuilding ff)
can be found
here.
My hat goes off to Alexander Rødseth however, who hacked the patch attached to
ff-bug-78414 - this one is a real problem-solver, though a bit out-of-date
(not terribly - just context lines got shuffled around since).
The JS click/key-jacking issue seems to require some JS event firewalling, and
sometimes (e.g. JS games or some weird-design sites) such control-grabbing can
even be useful.
So my solution was simply to bind a JS-toggle key, which allows not only
disabling all that crap, but also speeding up some "load-shit-as-you-go" or
JS-BTC-mining (or so it feels) sites considerably.
// Flip the global "javascript.enabled" pref from browser chrome code
var prefs = Components.classes['@mozilla.org/preferences-service;1']
		.getService(Components.interfaces.nsIPrefBranch),
	state = prefs.getBoolPref('javascript.enabled');
prefs.setBoolPref('javascript.enabled', !state);
That's the whole thing - bound to something like Ctrl+\ (the key above Enter
here), it makes a nice "Turbo and Get Off My JS" key.
I'm fairly sure there are addons that allow toggling prefs ("javascript.enabled"
above) via keys without needing any code, but I have this one.
Damn glad there are open-source (and uglifyjs-like) browsers like that - hope
proprietary google-ware won't take over the world in the near future.
Mentioned patches are available in (and integrated with-) the firefox-nightly
exheres in my repo, forked off awesome sardemff7-pending
firefox-scm.exheres-0 / mozilla-app.exlib work.
Apr 29, 2013
I've tried both of these in the past, but didn't have the attention budget to
make them really work for me - which I finally found now, so I also wanted to
give crawlers a few more keywords on these nice things.
0bin - leak-proof pastebin
I pastebin a lot of stuff all the time - basically everything multiline -
because all my IM happens in ERC over IRC (with bitlbee linking xmpp and all
the proprietary crap like icq, skype and twitter), and IRC doesn't handle
multiline messages at all.
All sorts of important stuff ends up there - some internal credentials,
contacts, non-public code, bugs, private chat logs, etc - so I always winced a
bit when pasting something, in fear that google might index/data-mine it and
preserve it forever, and figured it'd bite me eventually.
An easy and acceptable solution is to use simple client-side crypto, with the
link having the decryption key after the hashmark, so it never gets sent to the
pastebin server and doesn't provide crawlers with any useful data. ZeroBin does
that.
But the original ZeroBin is php, which I don't really want to touch, and it has
its share of problems - from the lack of a command-line client (for e.g. grep
stuff log | zerobinpaste) to overly-long urls and a flaky overloaded interface.
Luckily, there's a more hackable python version of it - 0bin - for which I
hacked together a simple zerobinpaste tool, then simplified the interface to
the bare minimum, updated it to use shorter urls (#41, #42) and put it on my
host - the result is paste.fraggod.net - my own nice robot-proof pastebin.
URLs there aren't any longer than with regular pastebins:
http://paste.fraggod.net/paste/pLmEb0BI#Verfn+7o
Plus the links there expire reliably, and it's easy to force this expiration,
having control over app backend.
Local fork should have all the not-yet-merged stuff as well as the
non-upstreamable simpler white-bootstrap theme.
Convergence - better PKI for TLS keys
Can't really recommend this video highly enough to anyone with even the
slightest bit of interest in security, web or SSL/TLS protocols.
There are lots of issues beyond just key distribution and authentication, but
I'd dare anyone to watch that rundown of just-as-of-2011 issues and remain
convinced that the PKI there is fine or even good enough.
Even a fairly simple
Convergence tool implementation is a vast improvement,
giving a lot of control to make informed decisions about who to trust on the
net.
I've been using the plugin in the past, but eventually it broke and I just
disabled it until the better times when it'd be fixed, but Moxie seems to have
moved on to other tasks and the project never got the developers' attention it
deserved.
So I finally got around to fixing a fairly massive list of issues around it
myself.
Bugs around the newer firefox plugin were the priority - one was a compatibility
thing from PR #170, another the endless hanging on all requests to notaries (PR
#173), plus more minor issues with adding notaries, interfaces and just plain
bugs that were always there.
Then there was one shortcoming of the existing perspective-only verification
mechanism that bugged me - it didn't utilize the existing flawed CA lists at
all, which could at least tell whether a random site's cert is signed by some
(even crappy) CA or is completely homegrown (and thus doesn't belong on
e.g. "paypal.com").
Not the deciding factor by any means, but it allows making a much more informed
decision than just perspectives for e.g. a phishing site with a typo in the URL.
So I was able to utilize (and extend a bit) the best part of Convergence - the
agility of its trust decision-making - by hacking together a verifier (which
can easily be run on desktop localhost) that queries the existing CA lists.
Enabling Convergence with that doesn't even force you to give up the old model -
it just adds perspective checks on top, giving a clear picture of which of the
checks have failed on any inconsistencies.
Other server-side fixes include nice argparse interface, configuration file
support, loading of verifiers from setuptools/distribute entry points (can be
installed separately with any python package), hackish TLS SNI support (Moxie
actually filed twisted-5374 about more proper fix), sane logging, ...
I filed only a few PRs for the show-stopper client bugs, but it looks like the
upstream repo is simply dead, pity ;(
But all this stuff should be available in
my fork in the meantime.
Top-level README there should provide a more complete list of links and
changes.
Hopefully, upstream development will be picked up at some point, or maybe
shift to some next incarnation of the idea -
CrossBear seems to potentially
be one.
Until then, at least I was able to salvage this one, and hacking a ctypes-heavy
ff extension implementing a SOCKS MitM proxy was quite a rewarding experience
all by itself... it certainly broadens horizons on just how damn accessible and
simple it is to implement such seemingly-complex protocol wrappers.
Plan to also add a few other internet-observatory (like OONI, CrossBear crawls,
EFF Observatory, etc) plugins there in the near future, plus some other things
listed in the README here.
Apr 24, 2013
I was hacking on (or rather debugging) the Convergence FF plugin and it became
painfully obvious that I really needed something simple to push js changes from
a local git clone to ~/.mozilla so that I could test them.
Usually I tend to employ a simple ad-hoc for src in $(git st | awk ...); do cat
$src >... hack, and did the same thing in this case as well, but was forgetting
to run it after small "debug printf" changes waaay too often.
At this point, I sometimes hack some ad-hoc emacs post-save hook to run the
thing, but this time decided to find some simpler and more generic "run that on
any changes to path" tool.
Until the last few years, the only ways to do that were polling or inotify, and
for some project dir these are actually quite fine, but luckily there's fanotify
in the kernel now, and fatrace looks like the simplest cli tool based on it.
# fatrace
sadc(977): W /var/log/sa/sa24
sadc(977): W /var/log/sa/sa24
sadc(977): W /var/log/sa/sa24
sadc(977): W /var/log/sa/sa24
qmgr(1195): O /var/spool/postfix/deferred
qmgr(1195): CO /var/spool/postfix/deferred/0
qmgr(1195): CO /var/spool/postfix/deferred/3
qmgr(1195): CO /var/spool/postfix/deferred/7
...
That thing can just watch everything that's being done to all (or any
specific) local mount(s).
Even better - it reports the app that does the changes.
I never got over auditd's complexity for such simple use-cases, so I was damn
glad that there's a real and simpler alternative now.
Unfortunately, with the power of the thing comes the need for root, so one
simple bash wrapper later, my "sync changes" issue was finally resolved:
(root) ~# fatrace_pipe ~user/hatch/project
(user) project% xargs -in1 </tmp/fatrace.fifo make
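The wrapper itself is just filtering and plumbing - something to the effect of the following sketch (in python rather than the actual bash, with the fatrace line format and fifo path as assumptions): keep only write-ish events under the watched dir and push the paths into a fifo for xargs to react to.

```python
import re, sys

# fatrace output looks like "procname(pid): OPS /path" (see examples above).
event_re = re.compile(r'^[^(]+\(\d+\): (?P<ops>[A-Z+]+) (?P<path>/.*)$')

def match_write(line, watch_dir):
    # Return the touched path for write-ish events under watch_dir, else None.
    m = event_re.match(line)
    if not m or 'W' not in m.group('ops'):
        return None
    path = m.group('path')
    if not path.startswith(watch_dir.rstrip('/') + '/'):
        return None
    return path

if __name__ == '__main__':
    # Usage sketch: fatrace | this_script /path/to/project
    watch_dir = sys.argv[1]
    with open('/tmp/fatrace.fifo', 'w', buffering=1) as fifo:
        for line in sys.stdin:
            path = match_write(line.rstrip('\n'), watch_dir)
            if path:
                fifo.write(path + '\n')
```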
Looks like a real problem-solver for a lot of real-world "what the hell happens
on the fs there!?" cases as well - can't recommend the thing highly-enough for
all that.
Apr 08, 2013
As discordian folk celebrated Jake Day yesterday, I decided that I've had it
with random hanging userspace state-machines, stuck forever on tcp connections
that are not legitimately dead, just waiting on both sides.
And since pretty much every tool can handle transient connection failures and
reconnects, decided to come up with some simple and robust-enough solution to
break such links without (or rather before) patching all the apps to behave.
One last straw was davfs2 failing after a brief net-hiccup, with my options
limited to killing everything that uses (and is hanging dead on) its mount,
then going the kill/remount way.
As it uses stateless http connections, I bet it's not even an issue for it to
repeat whatever request it tried last and it sure as hell handles network
failures, just not well in all cases.
I've used such a technique to test some twisted-things in the past, so it was
easy to dig up scapy-automata code for doing that, though the real trick is not
to craft/send the FIN or RST packet, but rather to guess the TCP seq/ack
numbers to stamp it with.
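For reference, the crafting half is simple enough to do even without scapy - here's a from-scratch sketch of a bare 20-byte TCP RST header with the standard ones'-complement checksum over the IPv4 pseudo-header; the seq number is exactly the input one still has to guess or extract:

```python
import socket, struct

def checksum(data):
    # RFC 1071 ones'-complement sum over 16-bit words.
    if len(data) % 2:
        data += b'\0'
    csum = sum(struct.unpack('!%dH' % (len(data) // 2), data))
    csum = (csum >> 16) + (csum & 0xffff)
    csum += csum >> 16
    return ~csum & 0xffff

def tcp_rst(src_ip, dst_ip, sport, dport, seq):
    # 20-byte TCP header: data offset 5 words, RST flag (0x04), zero window.
    hdr = struct.pack('!HHIIBBHHH',
        sport, dport, seq, 0, 5 << 4, 0x04, 0, 0, 0)
    # Checksum covers the IPv4 pseudo-header (src, dst, proto, tcp length).
    pseudo = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) \
        + struct.pack('!BBH', 0, socket.IPPROTO_TCP, len(hdr))
    return hdr[:16] + struct.pack('!H', checksum(pseudo + hdr)) + hdr[18:]
```

Sending it still takes a raw socket (and root), but the header itself is the easy part.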
Alas, none of the existing tools (e.g. tcpkill) seem to do anything clever in
this regard.
cutter states that
There is a feature of the TCP/IP protocol that we could use to good effect
here - if a packet (other than an RST) is received on a connection that has
the wrong sequence number, then the host responds by sending a corrective
"ACK" packet back.
But neither the tool itself nor the technique described seem to work, and I
actually failed to find (or recall) any mentions (or other uses) of such
corrective behavior. Maybe it was so waaay back, dunno.
Naturally, as I can run such a tool on the host where the socket endpoint is,
the local kernel has these numbers stored, but apparently no one really cared
(or had a legitimate enough use-case) to expose them to userspace... until
very recently, that is.
Recent work of the Parallels folks on
CRIU landed
getsockopt(sk, SOL_TCP,
TCP_QUEUE_SEQ, ...) in one of the latest mainline kernel releases.
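From python, the call sequence looks about like this (constants are from linux/tcp.h, as there are no stdlib names for them; the socket has to be flipped into repair mode first, which needs CAP_NET_ADMIN and has to happen in the process owning the fd):

```python
import socket

# Constants from linux/tcp.h - not exposed in the python stdlib.
TCP_REPAIR, TCP_REPAIR_QUEUE, TCP_QUEUE_SEQ = 19, 20, 21
TCP_RECV_QUEUE, TCP_SEND_QUEUE = 1, 2

def get_seq(sk, queue=TCP_SEND_QUEUE):
    # Repair mode (CAP_NET_ADMIN) unlocks TCP_QUEUE_SEQ, which then reads
    # the sequence number of whichever queue was selected.
    sk.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 1)
    try:
        sk.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR_QUEUE, queue)
        return sk.getsockopt(socket.IPPROTO_TCP, TCP_QUEUE_SEQ)
    finally:
        sk.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 0)
```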
The trick is then just to run that syscall in the pid that holds the socket fd,
which looks like a trivial enough task, but looking over
crtools (which
unfortunately doesn't seem to work with a vanilla kernel yet) and
ptrace-parasite tricks of compiling and injecting shellcode, I decided that
it's just too much work for me - plus they share the same x86_64-only codebase,
and I'd like to have the thing working on ia32 machines as well.
Caching all the "seen" seq numbers in advance looks tempting, especially since
for most cases, relevant traffic is processed already by
nflog-zmq-pcap-pipe and
Snort, which can potentially dump
"(endpoint1-endpoint2, seq, len)" tuples to some fast key-value backend.
Invalidation of these might be a minor issue, but I'm not too thrilled about
having some dissection code to pre-cache stuff that's already cached in every
kernel anyway.
Patching the kernel to just expose the stuff via /proc looks like a bit of a
burden as well, though an isolated module would probably do the job well.
It's weird that there doesn't seem to be one of these around already, the
closest being the tcp_probe.c code, which hooks into the tcp_recv code-path and
doesn't really get seqs without some traffic either.
One interesting idea that got my attention and didn't require a single line of
extra code was proposed on the local xmpp channel - to use tcp keepalives.
Sure, they won't make kernel drop connection when it's userspace that hangs on
both ends, with connection itself being perfectly healthy, but every one of
these carries a seq number that can be spoofed and used to destroy that
"healthy" state.
Pity these are optional and can't just be turned on for all sockets system-wide
on linux (unlike on some BSD systems, apparently), and nothing much uses them
by choice (which can be seen in netstat --timer).
Luckily, there's a dead-simple LD_PRELOAD code of
libkeepalive which can be
used to enforce system-wide opt-out behavior for these (at least for
non-static binaries).
For suid stuff (like mount.davfs, mentioned above), it has to be in
/etc/ld.so.preload, not just env, but as I need it "just in case" for all the
connections, that seems fine in my case.
And tuning keepalives to be frequent enough seems to be a no-brainer that
shouldn't have any effect on 99% of legitimate connections at all, as they
probably pass some traffic every other second, not after minutes or hours.
net.ipv4.tcp_keepalive_time = 900
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 156
(default is to send empty keepalive packet after 2 hours of idleness)
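For apps you can patch or write yourself, the same effect doesn't need LD_PRELOAD or system-wide sysctls at all - keepalive and its timings can be set per-socket, which is essentially what libkeepalive does on every connect. A python sketch, reusing the values from the sysctls above:

```python
import socket

def keepalive_socket(idle=900, intvl=156, probes=5):
    # SO_KEEPALIVE turns probes on; the TCP_KEEP* options then override
    # the net.ipv4.tcp_keepalive_* defaults for just this socket (linux).
    sk = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sk.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sk.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sk.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, intvl)
    sk.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sk
```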
With that, the tool has to run ~7 min on average to kill any tcp connection in
the system, which is totally acceptable, and no fragile non-portable
ptrace-shellcode magic is involved (at least yet - I bet it'd be much easier to
do in the future).
Code and some docs for the tool/approach can be found on github.
More of the same (update 2013-08-11):
Actually, lacking some better way to send RST/FIN from a machine to itself than
swapping MACs (and hoping that router is misconfigured enough to bounce packet
"from itself" back) or "-j REJECT --reject-with tcp-reset" (plus a "recent"
match or transient-port matching, to avoid blocking reconnect as well),
countdown for a connection should be ~7 + 15 min, as only next keepalive will
reliably produce RST response.
With a bit of ipset/iptables/nflog magic, it was easy to make the one-time
REJECT rule, snatching seq from dropped packet via NFLOG and using that to
produce RST for the other side as well.
Whole magic there goes like this:
-A conn_cutter ! -p tcp -j RETURN
-A conn_cutter -m set ! --match-set conn_cutter src,src -j RETURN
-A conn_cutter -p tcp -m recent --set --name conn_cutter --rsource
-A conn_cutter -p tcp -m recent ! --rcheck --seconds 20 \
  --hitcount 2 --name conn_cutter --rsource -j NFLOG
-A conn_cutter -p tcp -m recent ! --rcheck --seconds 20 \
  --hitcount 2 --name conn_cutter --rsource -j REJECT --reject-with tcp-reset
-I OUTPUT -j conn_cutter
The "recent" matcher there is a bit redundant in most cases, as outgoing
connections usually use transient-range tcp ports, which shouldn't match
between different attempts, but some apps might bind these explicitly.
ipset turned out to be quite a neat thing to avoid iptables manipulations (to
add/remove match).
It's interesting that this set of rules handles RST to both ends all by itself
if the packet arrives from the remote first - the response (e.g. ACK) from the
local socket will get an RST but won't reach the remote, and a retransmit from
the remote will get an RST because the local port is legitimately closed by then.
Current code allows optionally specifying the ipset name and whether to use
nflog (via the spin-off scapy-nflog-capture driver) or raw sockets, and doesn't
do any mac-swapping - only sending an RST to the remote end (which, again,
should still be sufficient with frequent-enough keepalives).
Now, if only some decade-old undocumented code didn't explicitly disable these
nice keepalives...
Apr 06, 2013
Everyone is probably aware that bits do flip here and there in the supposedly
rock-solid, predictable and deterministic hardware, but somehow every single
data-management layer assumes that it's not its responsibility to fix or even
detect these flukes.
Bitrot in RAM is a known source of bugs, but short of ECC, I dunno what one
can do about it without a huge impact on performance.
Disks, on the other hand, seem to have a lot of software layers above them,
handling whatever data arrangement, compression, encryption, etc, and the fact
that bits do flip in magnetic media seems to be just as well-known (
study1,
study2,
study3, ...).
So it really bugged me for quite a while that any modern linux system seems to
be completely oblivious to the issue.
Consider a typical linux storage stack on commodity hardware:
You have a closed-box proprietary hdd brick at the bottom, with no way to tell
what it does to protect your data - aside from vendor marketing pitches, that
is.
Then you have a well-tested and robust linux driver for some ICH storage
controller.
I wouldn't bet that it will corrupt anything at this point, but it doesn't do
much else for the data than pass along whatever it gets from the flaky device
either.
Linux blkdev layer above, presenting /dev/sdX. No checks, just simple mapping.
device-mapper.
Here things get more interesting.
I tend to use lvm wherever possible, but it's just a convenience layer (or a
set of nice tools to setup mappings) on top of dm, no checks of any kind, but
at least it doesn't make things much worse either - lvm metadata is fairly
redundant and easy to backup/recover.
dm-crypt gives no noticeable performance overhead, sits either above or
under lvm in the stack, and is nice hygiene against accidental leaks
(selling or leasing hw, theft, bugs, etc), but lacking authenticated
encryption modes, it doesn't do anything to detect bit-flips.
Worse, it amplifies the issue.
In the most common
CBC mode, one flipped bit in the ciphertext will affect
a bunch of other data bits until the end of the dm block.
The current dm-crypt default (since the latest cryptsetup-1.6.X, iirc) is the
XTS block encryption mode, which somewhat limits the damage, but dm-crypt has
little support for changing modes on-the-fly, so tough luck.
But hey, there is
dm-verity, which sounds like exactly what I want,
except it's read-only, damn.
Read-only nature is heavily ingrained in its "hash tree" model of integrity
protection - it is hashes-of-hashes all the way up to the root hash, which
you specify on mount, immutable by design.
Block-layer integrity protection is a bit weird anyway - there's lots of
potential for unnecessary work with free space (can probably be somewhat
solved by TRIM), data that's already journaled/checksummed by the fs, and
just plain transient block changes which aren't exposed for long and one
might not care about at all.
The filesystem layer above does the right thing sometimes.
COW fs'es like
btrfs and
zfs have checksums and scrubbing, so they seem
to be good options.
btrfs was slow as hell on rotating plates last time I checked, but the zfs port
might be worth a try - though if a single cow fs works fine in all the kinds of
scenarios where I use ext4 (mid-sized files), xfs (glusterfs backend) and
reiserfs (hard-linked backups, caches, tiny-file subtrees), then I'd really
be amazed.
Other fs'es plain suck at this. No care for that sort of thing at all.
Above-fs syscall-hook kernel layers.
IMA/EVM sound great, but are also for immutable-security ("integrity")
purposes ;(
In fact, this layer is heavily populated by security stuff like LSM's, which I
can't imagine being sanely used for bitrot-detection purposes.
Security tools are generally oriented towards detecting any changes,
intentional tampering included, and are bound to produce a lot of
false-positives instead of legitimate and actionable alerts.
Plus, upon detecting some sort of failure, these tools generally don't care
about the data anymore acting as a Denial-of-Service attack on you, which is
survivable (everything can be circumvented), but fighting your own tools
doesn't sound too great.
Userspace.
There is tripwire, but it's also a security tool, unsuitable for the task.
Some rare discussions of the problem pop up here and there, but alas, I
failed to salvage anything usable from these, aside from ideas and links to
subject-relevant papers.
Scanning github, bitbucket and xmpp popped up a bitrot script and a
proof-of-concept md-checksums md layer, which apparently haven't even made it
to lkml.
So, naturally, following the long-standing "... then do it yourself" motto,
introducing the fs-bitrot-scrubber tool for all the scrubbing needs.
It should be fairly well-described in the readme, but the gist is that it's
just a simple userspace script to checksum file contents and check for changes
there over time, taking into account all the signs of legitimate file
modifications and the fact that it isn't the only thing that needs i/o in the
system.
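The detection core of such a scrubber is easy to sketch (an illustration of the idea, not the actual tool's code): keep (size, mtime, checksum) per file, and on re-scan treat a checksum change without a matching metadata change as likely rot, while a changed mtime is just a legitimate modification to record.

```python
import hashlib, os

def file_state(path):
    # (size, mtime, sha256) triplet for one file, read in 1M chunks.
    st = os.stat(path)
    csum = hashlib.sha256()
    with open(path, 'rb') as src:
        for chunk in iter(lambda: src.read(1 << 20), b''):
            csum.update(chunk)
    return st.st_size, st.st_mtime, csum.hexdigest()

def scrub(paths, manifest):
    # Returns paths that look rotted; updates the manifest dict in-place.
    rotten = list()
    for path in paths:
        size, mtime, csum = file_state(path)
        old = manifest.get(path)
        if old and old[:2] == (size, mtime) and old[2] != csum:
            rotten.append(path)  # same size/mtime, different content
        manifest[path] = size, mtime, csum
    return rotten
```

The real thing also has to rate-limit its i/o and persist the manifest, but the decision logic stays this simple.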
The main goal is not to provide any sort of redundancy or backups, but rather
to notify of the issue before all the old backups (or some cluster-fs mirrors
in my case) that can be used to fix it are rotated out of existence or
overwritten.
Don't suppose I'll see such decay phenomena often (if ever), but I don't like
having the odds, especially with an easy "most cases" fix within grasp.
If I kept a lot of important stuff compressed (think what will happen if a
single bit is flipped in the middle of a few-gigabytes .xz file) or naively
(without storage specifics and corruption in mind) encrypted in cbc mode (or
something else to the same effect), I'd be worried about the issue so much more.
Wish there were something common out-of-the-box in the linux world, but I guess
it's just not the time yet (hell, there's not even one clear term for it in the
techie slang!) - with still-increasing hdd storage sizes and much more
vulnerable ssd's, some more low-level solution should materialize eventually.
Here's me hoping to raise awareness, if only by a tiny bit.
github project link
Mar 25, 2013
There's
plenty of public cloud storage these days, but trusting any of them
with any kind of data seems reckless - the service is free to corrupt, monetize,
leak, hold hostage or just drop it at any point.
Given that these services are provided at no cost, and generally without many
ads, I guess reputation and ToS are the only things stopping them from acting
like that.
Not trusting any single one of these services looks like a sane safeguard
against them suddenly collapsing or blocking one's account.
And not trusting any of them with plaintext of the sensitive data seems to be a
good way to protect it from all the shady things that can be done to it.
Tahoe-LAFS is a great capability-based secure distributed storage system,
where you basically do "tahoe put somefile" and get a capability string like
"URI:CHK:iqfgzp3ouul7tqtvgn54u3ejee:...u2lgztmbkdiuwzuqcufq:1:1:680"
in return.
That string is sufficient to find, decrypt and check integrity of the file (or
directory tree) - basically to get it back in what is guaranteed to be the same
state.
Neither tahoe node state nor stored data can be used to recover that cap.
Retrieving the file afterwards is as simple as a GET with that cap in the url.
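The capability concept itself can be illustrated with a toy python sketch - to be clear, this is not tahoe's actual CHK format or crypto (tahoe uses real ciphers and merkle trees), just the general idea that the cap string alone is enough to locate, decrypt and verify the data, while the stored blob reveals nothing:

```python
# Toy capability store: the cap carries the (content-derived) key plus a
# blob locator/verifier; storage only ever sees opaque ciphertext.
import hashlib

def _stream(key, n):
    # hash-counter keystream - illustration only, NOT secure crypto
    out, ctr = b'', 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, 'big')).digest()
        ctr += 1
    return out[:n]

def put(data, storage):
    key = hashlib.sha256(data).digest()  # convergent-style content key
    blob = bytes(a ^ b for a, b in zip(data, _stream(key, len(data))))
    blob_id = hashlib.sha256(blob).hexdigest()  # locates + verifies blob
    storage[blob_id] = blob
    return 'URI:TOY:%s:%s' % (key.hex(), blob_id)  # the capability

def get(cap, storage):
    _, _, key_hex, blob_id = cap.split(':')
    blob = storage[blob_id]
    assert hashlib.sha256(blob).hexdigest() == blob_id  # integrity check
    key = bytes.fromhex(key_hex)
    data = bytes(a ^ b for a, b in zip(blob, _stream(key, len(blob))))
    assert hashlib.sha256(data).digest() == key  # decryption check
    return data
```

Note that neither the storage dict nor the blob id can recover the key half of the cap, which is exactly the property the post relies on.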
With remote storage providers, tahoe node works as a client, so with all crypto
being client-side, the actual cloud provider is clueless about the stuff you
store, which I find to be quite an important thing, especially if you stripe
data across many of these leaky and/or plain evil things.
Finally got around to connecting a third backend (box.net) to tahoe today, so
wanted to share a few links on the subject:
https://github.com/mk-fg/tahoe-lafs-public-clouds
Public cloud drivers for tahoe-lafs.
https://github.com/mk-fg/lafs-backup-tool
Tool to intelligently (compression, deduplication, rate-limiting, filtering,
metadata, etc) backup stuff to tahoe.
https://github.com/LeastAuthority/tahoe-lafs
Upstream repo with more enterprisey cloud backend drivers (s3, openstack,
googlestorage, msazure).
https://tahoe-lafs.org/trac/tahoe-lafs/browser/git/docs/specifications/backends/raic.rst
Redundant Array of Independent Clouds concept.
http://www.sickness.it/crazycloudexperiment.txt
A way to link all the clouds together without having any special drivers.
As I run tahoe nodes on headless linux machines, running proprietary GUI
clients there doesn't sound too appealing, even if they exist for certain
services.
Feb 08, 2013
As suspected before, ended up rewriting the skyped glue daemon.
There were just way too many bad practices (from my point of view) accumulated
there (an incomplete list can be found in the
github issue #7, as well as in some
PRs I've submitted), and I'm quite puzzled why the thing actually works, given
the rather weird socket handling going on there, but one thing should be said:
it's there and
it works.
As software goes, that's the most important metric by far.
But as I'm currently purely a remote worker (not sure if I qualify for
"freelancer", being just a drone), and skype is quite critical for comms in
this field, a just-working thing that silently drops errors and messages is not
good enough.
Rewritten version is a generic eventloop with non-blocking sockets and
standard handle_in/handle_out low-level recv/send/buffer handlers, with
handle_<event> and dispatch_<event> callbacks on higher level and explicit
conn_state var.
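The general shape of such a loop can be sketched like this (illustrative names, not the actual rewritten skyped code):

```python
# select() drives low-level handle_in/handle_out, which only move bytes
# between the socket and buffers; the higher-level dispatch callback
# only ever sees complete protocol units (lines here).
import select, socket

class Conn:

    def __init__(self, sock, dispatch_line):
        sock.setblocking(False)
        self.sock, self.dispatch_line = sock, dispatch_line
        self.buff_in, self.buff_out = b'', b''
        self.conn_state = 'connected'  # explicit connection state

    def handle_in(self):
        chunk = self.sock.recv(4096)
        if not chunk:
            self.conn_state = 'closed'
            return
        self.buff_in += chunk
        while b'\n' in self.buff_in:  # dispatch complete lines only
            line, self.buff_in = self.buff_in.split(b'\n', 1)
            self.dispatch_line(self, line)

    def handle_out(self):
        # flush as much of the buffer as the socket will take right now
        sent = self.sock.send(self.buff_out)
        self.buff_out = self.buff_out[sent:]

    def send_line(self, line):
        self.buff_out += line + b'\n'

def loop_once(conns, timeout=0.1):
    conns = [c for c in conns if c.conn_state != 'closed']
    r, w, _ = select.select(
        [c.sock for c in conns],
        [c.sock for c in conns if c.buff_out], [], timeout)
    for c in conns:
        if c.sock in r: c.handle_in()
        if c.sock in w and c.buff_out: c.handle_out()
```

The point of the split is that partial reads, short writes and disconnects are all handled in one place, instead of being scattered through protocol logic.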
It also features full-fledged and
configurable python logging, with debug
options, (at least) warnings emitted on every unexpected event and proper
non-broad exception handling.
Regardless of whether the thing will be useful upstream, it should finally
close the skype setup story for me, as the whole setup seems robust and
reliable enough for my purposes now.
Unless vmiklos will find it useful enough to merge, I'll probably maintain the
script in this bitlbee fork, rebasing it on top of stable upstream bitlbee.
Feb 04, 2013
Was hacking something irrelevant together again and, as often happens with
such things, realized that I'd implemented something like that before.
It can be something simple - a locking function in python, an awk pipe to get
some monitoring data, a chunk of argparse-based code to process multiple
subcommands, a TLS wrapper for requests, a dbapi wrapper, a multi-module
parser/generator for human-readable dates, a logging buffer, etc...
Point is - some short snippet of code is needed as a base for implementing
something new or maybe even to re-use as-is, yet it's not noteworthy enough on
its own to split into a module or generally do anything specific about it.
Happens a lot to me, as over the years, a lot of such ad-hoc yet reusable code
gets written, and I can usually remember enough implementation details
(e.g. which modules were used there, how the methods/classes were called and
such), but going "grep" over the source dir takes a shitload of time.
Some things make it faster - ack or pss tools can scan only relevant things
(like e.g. "grep ... **/*.py" will do in zsh), but these also run for
minutes, as even a simple "find" does - there're several django source trees in
the appengine sdk, php projects with 4k+ files inside, maybe even a whole linux
kernel source tree or two...
Traversing all these each time on a regular fs to find something that can be
rewritten in a few minutes will never be an option for me, but luckily there're
cool post-fs projects like tmsu, which allow transcending the
single-hierarchy-index limitation of a traditional unix fs in a much more
elegant and useful way than a gazillion symlinks and dentries.
tmsu allows attaching any tags to any files, then querying these files back
using a set of tags, which it does really fast using an sqlite db and clever
indexes there.
So, just tagging all the "*.py" files with "lang:py" makes this possible:
% time tmsu files lang:py | grep myclass
tmsu files lang:py 0.08s user 0.01s system 98% cpu 0.094 total
grep --color=auto myclass 0.01s user 0.00s system 10% cpu 0.093 total
That's 0.1s instead of several minutes for all the python code in the
development area on this machine.
tmsu can actually do even cooler tricks than that with fuse-tagfs mounts, but
that's all kinda wasted until all the files are tagged properly.
Which, of course, is a simple enough problem to solve.
So here's my first useful
Go project -
codetag.
I've added taggers for things that are immediately useful for me to tag files
by - implementation language, code hosting (github, bitbucket, local project, as
I sometimes remember that a snippet was in some public tool), scm type (git, hg,
bzr, svn), but adding a new one is just a matter of writing a "Tagger"
function, which, given the path and config, returns a list of string tags; plus
taggers are only used if explicitly enabled in config.
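The "Tagger" concept translates to something like this - a python sketch with made-up names (codetag itself is in Go, and this isn't its real API):

```python
# Each tagger gets a path plus config and returns a list of string tags;
# only taggers explicitly enabled in config are run at all.
import os

def lang_tagger(path, config):
    ext_map = {'.py': 'py', '.go': 'go', '.sh': 'sh'}
    ext = os.path.splitext(path)[1]
    return ['lang:' + ext_map[ext]] if ext in ext_map else []

def scm_tagger(path, config):
    # detect scm by its metadata dir next to the file
    tags = []
    for scm in 'git', 'hg', 'svn', 'bzr':
        if os.path.isdir(os.path.join(os.path.dirname(path), '.' + scm)):
            tags.append('scm:' + scm)
    return tags

TAGGERS = {'lang': lang_tagger, 'scm': scm_tagger}

def tags_for(path, config):
    tags = []
    for name, tagger in TAGGERS.items():
        if name in config.get('enabled', []):  # opt-in, as described above
            tags.extend(tagger(path, config))
    return sorted(tags)
```

Resulting tags then just get fed to "tmsu tag" for each scanned file.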
Other features include proper python-like logging and rsync-like filtering (but
using more powerful
re2 regexps instead of simple glob patterns).
Being a proper compiled language, Go allows building the thing into a single
static binary, which is quite neat, as I realized that I now have a tool to
tag all the things everywhere - media files on servers' remote-fs'es, like music
and movies, hundreds of configuration files by the app they belong to (think
tmsu files daemon:apache to find/grep all the horrible ".htaccess" things
and their "*.conf" includes), distfiles by the os package name, etc... can be
useful.
So, to paraphrase well-known meme, Tag All The Things! ;)
github link