Interesting. I did the same investigation myself a few years back, but was frustrated by the lack of the -r flag for shuf(1). It seems that's been added at some point recently (though many of my systems do not have it--GNU coreutils percolates slowly through older Debian/Ubuntu versions. :))
Good to know things are still getting better in coreutils!
Oh, I'd imagine so. Good enough for illustrative purposes and for catching any gross errors in my arithmetic when analysing the games. Not good enough for anything 'real'.
I recently discovered shuf since I needed to shuffle a fairly large number of URLs in a file to allow multiple processes to work through them in parallel (yes, I could have done this with a queue and a producer/consumer, but this was a one-time deal, so it was faster to just throw a bit more hardware at it). What amazed me is that 'cat urls.txt | shuf > new-urls.txt' took only about a second to complete even though the original file was about 1GB. How does it work so incredibly fast?
but doesn't have something like -r to resample. Nor a nice way to simply shuffle the whole collection (the workaround is to pass a -count value as large as or larger than the collection).
Note that hnov's awk command is equivalent to the "sort (random order)" method at that link, and it shows good randomness properties in the plot. However, the link shows "sort (random comparator)" by default, which looks terrible at randomly sorting lists. hnov's awk script should be suitable for most needs, though I'd tweak it a bit:
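(The exact tweak isn't reproduced here; as a rough sketch, the decorate-sort-undecorate approach that "sort (random order)" refers to looks something like the following, with urls.txt standing in for whatever input file you have.)

    # Tag each line with a random key, sort numerically by the key, then drop the key.
    awk 'BEGIN { srand() } { printf "%.17f\t%s\n", rand(), $0 }' urls.txt \
      | sort -n -k1,1 \
      | cut -f2-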
If you have an unknown input size and a finite number of requested output lines, you can still do it in O(n) time with no memory beyond those output lines, by giving each successive input line a steadily decreasing chance of replacing one of the outputs.
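That's reservoir sampling (Algorithm R). A minimal awk sketch, where k is the number of requested output lines (k=10 and urls.txt are just placeholder values):

    # Keep k uniformly random lines from input of unknown length,
    # in one pass, holding only k lines in memory.
    awk -v k=10 '
        BEGIN { srand() }
        NR <= k { res[NR] = $0; next }       # fill the reservoir with the first k lines
        {
            i = int(rand() * NR) + 1         # line NR replaces a slot with probability k/NR
            if (i <= k) res[i] = $0
        }
        END { for (j = 1; j <= k; j++) print res[j] }
    ' urls.txt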