May 18, 2014

The moment of epic fail hilarity with hashes

Just had an epic moment wrt how to fail at kinda-basic math, which seems to be quite representative of how people fail wrt homebrew crypto code (and what everyone and their mom warn against).

So, anyhow, on a d3 vis, I wanted to get pseudorandom colors for text blobs, but with roughly the same luminosity on HSL scale (Hue - Saturation - Luminosity/Lightness/Level), so that changing opacity on light/dark bg can be seen clearly as a change of L in the resulting color.

There are text items like (literally, in this example) "thing1", "thing2", "thing3" - these should have all distinct and constant colors, ideally.

So how do you pick H and S components in HSL from a text tag? Just use hash, right?

As JS doesn't have any hashes yet (WebCryptoAPI is in the works) and I don't really need "crypto" here, just some str-to-num shuffler for color, decided that I might as well just roll out a simple one-liner non-crypto hash func implementation.
There are plenty of those around, e.g. this list.

Didn't want much bias wrt which range of colors gets picked, so there are these test results - link1, link2 - wrt how these functions work, e.g. performance and distribution of output values over uint32 range.

Picked a random "ok" one - Ly hash, with fairly even output distribution, implemented as this:

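hashLy_max = 4294967296 # 2^32, size of the uint32 range
hashLy = (str, seed=0) ->
  # exclusive range, so that an empty string runs zero iterations
  for n in [0...str.length]
    c = str.charCodeAt(n)
    # feed the char code to the hash one byte at a time, low byte first
    while c > 0
      seed = ((seed * 1664525) + (c & 0xff) + 1013904223) % hashLy_max
      c >>= 8
  # explicit return value, otherwise the loop's result array gets returned
  seed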

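The "c >>= 8" line and the inner loop are there because JS has unicode strings (charCodeAt() can return values up to 0xffff), so it's a trivial (non-standard) encoding that feeds each char code to the hash byte-by-byte.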
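To make that byte-splitting concrete, here's a small sketch in plain JS (not the CoffeeScript used above) of just the inner loop. The "codeUnitBytes" name is made up for illustration and isn't part of the original code.

```javascript
// Split one UTF-16 code unit into bytes, low byte first,
// mirroring the `c & 0xff` / `c >>= 8` steps of the inner loop.
// `codeUnitBytes` is a hypothetical helper name, used only here.
function codeUnitBytes(c) {
  const bytes = [];
  while (c > 0) {
    bytes.push(c & 0xff);
    c >>= 8;
  }
  return bytes;
}

console.log(codeUnitBytes('a'.charCodeAt(0)));      // one byte: 97
console.log(codeUnitBytes('\u20ac'.charCodeAt(0))); // two bytes: 172 (0xac), 32 (0x20)
```

One quirk that falls out of the "while c > 0" condition: a NUL char (code 0) contributes no bytes at all, so it doesn't affect the hash.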

But given any "thing1" string, I need two 0-255 values: H and S, not one 0-(2^32-1). So let's map output to a 0-255 range and just call it twice:

hashLy_chain = (str, count=2, max=255) ->
  [hash, hashes] = [0, []]
  # linear map from the uint32 range to 0-max
  scale = d3.scale.linear()
    .range([0, max]).domain([0, hashLy_max])
  for n in [1..count]
    # each round is seeded with the previous round's output
    hash = hashLy(str, hash)
    hashes.push(scale(hash))
  hashes

Note how, to produce the second hash output, "hashLy" just gets called with "seed" value equal to the first hash - essentially hash(hash(value) || value).
People do that with md5, sha*, and their ilk all the time, right?
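For reference, here's a rough plain-JS sketch of the same chain without d3, plus one possible way to turn the two values into a CSS color. The "tagColor" helper, its hue/saturation mapping and the 50% lightness are assumptions for illustration, not something from the actual vis code.

```javascript
const HASH_MAX = 4294967296; // 2^32, same value as hashLy_max

// Plain JS port of hashLy: one linear step per byte of each char code.
function hashLy(str, seed = 0) {
  for (let n = 0; n < str.length; n++) {
    let c = str.charCodeAt(n);
    while (c > 0) {
      seed = (seed * 1664525 + (c & 0xff) + 1013904223) % HASH_MAX;
      c >>= 8;
    }
  }
  return seed;
}

// Chain the hash `count` times, scaling each output linearly to [0, max)
// (same mapping as a d3 linear scale with domain [0, 2^32], range [0, max]).
function hashLyChain(str, count = 2, max = 255) {
  const hashes = [];
  let hash = 0;
  for (let n = 0; n < count; n++) {
    hash = hashLy(str, hash);
    hashes.push(hash * max / HASH_MAX);
  }
  return hashes;
}

// Hypothetical helper: two 0-255 values -> CSS hsl() string,
// with fixed lightness (50% is an arbitrary pick).
function tagColor(str, lightness = 50) {
  const [h, s] = hashLyChain(str);
  const hue = Math.round(h / 255 * 360);
  const sat = Math.round(s / 255 * 100);
  return `hsl(${hue}, ${sat}%, ${lightness}%)`;
}

for (const tag of ['thing1', 'thing2', 'thing3']) {
  console.log(tag, hashLyChain(tag), tagColor(tag));
}
```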

Getting the values from this func, noticed that they don't look random at all, which is not what I came to expect from hash functions, being quite used to dealing with crypto hashes, which are really easy to get in any lang but JS.

So, sure, given that I'm playing around with d3 anyway, let's just plot the outputs:

[plot: Ly hash outputs]

"Wat?... Oh, right, makes sense."

Especially with sequential items, it's hilarious how non-random, and even constant, the output there is.
And it totally makes sense, of course - each step is just a linear "k1*seed + c + k2" function (mod 2^32), where "c" is the next input byte.
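A quick way to see that linearity in numbers, as a plain-JS sketch (again a port, not the CoffeeScript above). It assumes ASCII-only tags, so that each char is exactly one byte; the "powMod" and "chainedByFormula" names are made up for this example.

```javascript
const M = 4294967296; // 2^32
const A = 1664525;    // the "k1" multiplier

function hashLy(str, seed = 0) {
  for (let n = 0; n < str.length; n++) {
    let c = str.charCodeAt(n);
    while (c > 0) {
      seed = (seed * A + (c & 0xff) + 1013904223) % M;
      c >>= 8;
    }
  }
  return seed;
}

// Tags that differ only by +1 in the last char get outputs
// that differ by exactly 1 (mod 2^32).
const h1 = hashLy('thing1'), h2 = hashLy('thing2'), h3 = hashLy('thing3');
console.log((h2 - h1 + M) % M, (h3 - h2 + M) % M); // 1 1

// Modular exponentiation on BigInt, to avoid float precision loss.
function powMod(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Seeding is affine as well: for n input bytes,
//   hashLy(str, seed) = (A^n * seed + hashLy(str, 0)) mod 2^32
// so the "second" chained hash is just (A^n + 1) * first (mod 2^32).
function chainedByFormula(str) {
  const first = BigInt(hashLy(str));
  const k = powMod(BigInt(A), BigInt(str.length), BigInt(M)); // ASCII: n = length
  return Number((k * first + first) % BigInt(M));
}

console.log(hashLy('thing1', hashLy('thing1')) === chainedByFormula('thing1')); // true
```

So the second value in the chain carries no information beyond the first one, it's the same number pushed through a fixed multiply.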

It's meant for hash tables, where seq-in/seq-out is fine, and the results in "chain(3)[0]" and "chain(3)[1]" calls are so close on 0-255 that they map to the same int value.

Plus, of course, the results are anything but "random-looking", even for non-sequential strings of d3.scale.category20() range.

Lesson learned - know what you're dealing with, be super-careful rolling your own math from primitives you don't really understand, stop and think about wth you're doing for a second - don't just rely on "intuition" (associated with e.g. "hash" word).
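For contrast, here's a hedged sketch of one way to get less-correlated H and S values without crypto: hash with FNV-1a, then run the result through the murmur3 32-bit finalizer, and take two different bytes of that. This is a swapped-in technique, not what the post ended up using. It's still not a crypto hash, the "tagHS" name is made up, and masking char codes to the low byte is an ASCII-oriented simplification.

```javascript
// 32-bit FNV-1a over the low byte of each char code
// (simplification: non-ASCII chars lose their high byte).
function fnv1a(str) {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i) & 0xff;
    h = Math.imul(h, 0x01000193); // FNV prime, 32-bit wrapping multiply
  }
  return h >>> 0;
}

// murmur3 32-bit finalizer: xor-shifts and multiplies that spread
// a small input difference over all output bits.
function fmix32(h) {
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Hypothetical helper: two 0-255 values from different bytes of one mixed hash.
function tagHS(str) {
  const mixed = fmix32(fnv1a(str));
  return [mixed & 0xff, (mixed >>> 8) & 0xff];
}

for (const tag of ['thing1', 'thing2', 'thing3']) {
  console.log(tag, tagHS(tag));
}
```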

Now I totally get how people start with AES and SHA1 funcs, mix them into their own crypto protocol and somehow get something analogous to ROT13 (or even double-ROT13, for extra hilarity) as a result.

Mar 19, 2014

Shadowrun: Dragonfall

Dragonfall is probably the best CRPG (and game, given that it's my fav genre) I've played in a few years.
Right mix of setting, world, characters, tone, choices and mechanics, and it only gets more amazing towards the end.
I think all the praise it gets is well deserved - for me it's up there with early Bioware games (BGs, PST) and Fallouts.
Didn't quite expect it to, after the okay'ish Dead Man's Switch first SRR experience, and hopefully there's more to come from HBS, would even justify another kickstarter imo.

Nov 12, 2010

Moar free time!

As of today, I'm unemployed once again.

Guess now I'll have time to debug and report a btrfs-systemd crash, read all the feeds, fix some long-standing issues on my home servers, update an antique web setup, write a few watch-notify scripts there, deploy/update a configuration management system, update/finish/publish a few of my spare-time projects, start playing with a lot of new ideas, check out networking tools like connman, wicd, nm and a bunch of other cool-stuff oss projects, write a few hooks for plotting and graphing stuff in real-time, adapt emacs mail/rss tools, update other elisp stuff, emacs itself, a few symbian-to-pc event hooks, check out gnustep environment, ltu lang articles, pybrain and a few other simple machine-learning implementations, some lua-ai for spring, play a lot of games I've missed in the past few years, read a few dozen books I've already uploaded but never had the time to, study linear and geometric algebra... maybe find a new job, even, before I starve?

Nah, nobody in the world has that much time... ;)
