Posted by Tom Moertel
Fri, 19 Sep 2008 15:06:00 GMT
Via Dare Obasanjo’s blog, I learned
that the much-publicized cracking of Sarah Palin’s Yahoo! email accounts was
accomplished by exploiting the weakness of “security questions”. In short, all the attacker needed to do to convince Yahoo’s computers that he was Palin was answer three questions as if he were Palin:
- What’s your birthday?
- What’s your Zip code?
- Where did you meet your spouse?
The attacker says he obtained the answers to these questions in less
than an hour. Everything he needed was already public knowledge, and
Google and Wikipedia made that knowledge easy to find.
And that’s why when I sign up for web sites that ask me to provide baseline answers for those annoying security
questions, I claim that I met my spouse
in CWmKryWzuxCSAnMDuIg. What? You’ve never been there? Well, that’s not surprising. It’s not a real
place: it’s a password, randomly generated, and remembered for me by
password-management software on my computer.
That’s right. Every time I’m asked to establish my “secret” answer to a
security question, I generate a random string and use that. Here’s a
script I use:
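#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64;

# Read 16 random bytes from the kernel's random-number source.
open my $random, '<', '/dev/urandom'
    or die "can't open /dev/urandom: $!";
binmode $random;
my $bytes;
(read($random, $bytes, 16) // 0) == 16
    or die "can't read 16 bytes from /dev/urandom";
close $random;

# Base64-encode the bytes, then keep only letters and digits.
my $pw = encode_base64($bytes);
$pw =~ tr/A-Za-z0-9//cd;
print "$pw\n";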
Then I store the string in my password-management software, just in
case the web site asks me for it later. Which should only happen if I
forget my primary password for that site. Which should only happen if
I can’t get into my password-management software. Which should only happen if I’m totally screwed, anyway, so what are the security questions buying me again?
In sum, if you care about your security, you’re probably picking good passwords already. In that case, security questions can’t help you, but they can harm you by making it easier for an attacker to bypass your passwords. That’s how the Palin-email cracker did it. So treat your answers to security questions as if they were passwords – in effect, that’s what they are.
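For readers who would rather not use Perl, the same idea can be sketched in Python with the standard-library secrets module. This is only an illustration: the function name random_answer and the default length of 20 are assumptions made for this example, not part of the script above.

```python
import secrets
import string

# Letters and digits only, matching the character set the Perl script keeps.
ALPHABET = string.ascii_letters + string.digits


def random_answer(length=20):
    """Return a random alphanumeric string to use as a "secret" answer.

    The name and the default length are illustrative assumptions.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(random_answer())
```

As with the Perl version, the output is meant to be pasted into the security-question field and then saved in a password manager.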
Posted in security
Tags passwords, security

Posted by Tom Moertel
Mon, 25 Aug 2008 02:56:00 GMT
I finally got around to releasing PXSL Tools on Hackage. The package contains pxslcc, a preprocessor that converts Parsimonious XML Shorthand Language into XML, and supporting documentation.
If you want to hack on the Haskell sources, I’ve put the project on GitHub, too. See the pxsl-tools project page to browse the code, or just clone the repo and hack away:
$ git clone git://github.com/tmoertel/pxsl-tools.git
Tags git, haskell, pxsl

Posted by Tom Moertel
Thu, 21 Aug 2008 01:50:00 GMT
Although at work I code mostly in Python – a language from which
lambda and map were nearly removed – I still find that functional-programming experience
has its benefits. One of the
“functional-programming dividends” I notice most often is insight
gained from considering problems from an algebraic perspective.
Recently, for example, I had a small parsing problem. I had to
write code that, given a simple grammar specification as input, emits
a regular expression that matches the language generated by the
grammar.
Here’s a simplified version of the problem. A grammar specification
is limited to a series of one or more atoms. For example, “a b c”
represents the atom “a”, followed by the atom “b”, followed by the
atom “c”. To generate the grammar, the series of atoms is interpreted
such that each atom (except the last) generates a production rule of
the following form:
atom_rule ::=
    <the literal atom> (SPACE <the next rule> | NOTHING)
(SPACE represents literal white space and NOTHING represents an
empty string.) The grammar as a whole is rooted in the first atom’s
rule.
Thus the specification “a b c” represents the following grammar:
grammar ::= a_rule
a_rule ::= "a" (SPACE b_rule | NOTHING)
b_rule ::= "b" (SPACE c_rule | NOTHING)
c_rule ::= "c"
Note that the final atom’s production matches only the literal atom
itself: it has no following rule on which to chain.
The grammar, in turn, generates the following language:
a
a b
a b c
Thus, given the grammar specification “a b c”, my code had to produce
a regular expression that would match “a”, “a b”, or “a b c”.
At this point, please stop for a moment and think about this little
programming exercise. Do any solutions leap to mind? How would you
approach the problem? Form your opinions now, because I’m going to
ask you about them later. (If you’re feeling especially caffeinated, try
coding a solution before reading on.)
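For readers who want something to compare against after trying the exercise, here is one possible solution sketch in Python. It is not necessarily the solution the full post arrives at. It builds the regular expression with a right-to-left fold: the last atom's rule is the literal atom, and each earlier atom wraps the rule that follows it in an optional group. The function name spec_to_regex is made up for this example, and matching SPACE with \s+ is an assumption about what "literal white space" should mean.

```python
import re
from functools import reduce


def spec_to_regex(spec):
    """Turn a spec such as "a b c" into a regex matching "a", "a b", "a b c".

    Hypothetical helper; SPACE is assumed to mean one or more
    whitespace characters (\\s+).
    """
    atoms = spec.split()
    if not atoms:
        raise ValueError("a specification needs at least one atom")
    *init, last = atoms
    # Fold from the right: start with the final atom's rule (the bare
    # literal), then wrap it once for each earlier atom, innermost first.
    return reduce(
        lambda rest, atom: r"%s(?:\s+%s)?" % (re.escape(atom), rest),
        reversed(init),
        re.escape(last),
    )


if __name__ == "__main__":
    print(spec_to_regex("a b c"))
```

The nesting of the optional groups mirrors the nesting of the production rules, which is why a fold over the atoms is a natural fit.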
Read more...
Posted in functional programming
Tags folds, fp, haskell, python

Posted by Tom Moertel
Sun, 10 Aug 2008 18:12:00 GMT
As an update to my previous post on the 2008 convention of the Historical Construction Equipment Association, I have posted an action-packed video of a Type-B Erie steam-powered shovel! Wait until you see this old beast belch steam and smoke and you hear it chug, clank, and huff and puff – it’s like stepping back in time. And it’s definitely fun stuff!
I took this footage on 8 August 2008 in Brownsville, Pennsylvania. (I also have footage of other equipment – dozers, draglines, trucks, shovels, and more. Let me know if you’re interested, and I’ll upload those, too.)
Update: I have uploaded the rest of my footage to flickr:
- Thew “O” steam shovel
- Dragline
- CAT 955 bulldozer
- CAT D8 bulldozer
- Vintage Army dump truck
See them all in
my HCEA 2008 photostream.
Posted in fun stuff
Tags fun_stuff, hcea, hcea2008, steam_shovel, video

Posted by Tom Moertel
Sun, 10 Aug 2008 00:23:00 GMT
Today, my dad and I went to the 2008 annual convention of the Historical Construction Equipment Association. We were impressed with the quantity and quality of the machinery on active display: steam shovels, dozers, graders, crawlers, scrapers, cranes, steamrollers, and a bunch of other old but well-maintained construction equipment. I’m talking dozens of massive machines – not just sitting there, but working!
This year’s convention is in Brownsville, Pennsylvania and runs through August 10, 2008. If you are within driving distance and think smoke-belching, earth-shaking construction equipment is fun stuff, don’t miss it. There’s still time to go.
If you can’t make it, I took some photos for you. Not quite the real thing, but better than nothing.

Posted in fun stuff
Tags fun_stuff, hcea, hcea2008

Posted by Tom Moertel
Mon, 05 May 2008 13:58:00 GMT
There’s a great way to explain delimited continuations in the notes of Oleg’s Continuation Fest talk on using delimited continuations for CGI programming. Just so it doesn’t get overlooked, here it is:
I’m obsessed in pointing out that every programmer already knows
and understands the delimited continuations; they might not know that
word though. Everyone knows that when a process executes a system
call like read, it gets suspended. When the disk
delivers the data, the process is resumed. That suspension of a
process is its continuation. It is delimited: it is not the
check-point of the whole OS, it is the check-point of a process only,
from the invocation of main() up to the point
main() returns. Normally these suspensions are resumed
only once, but can be zero times (exit) or twice
(fork).
I especially like the final part about exit and
fork, which drives home the notion that something more
subtle than returning from a typical function call is going on. If
anybody is confused over what suspended means, that last part
ought to clear things up.
The next time I need to explain delimited continuations, I know how
I’m going to do it.
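A rough analogy in Python, offered as a sketch and not as anything taken from Oleg's notes: a generator paused at yield behaves somewhat like the suspended process. The suspension covers only the generator's own body, not the whole program, and it can be resumed once (with send) or never (with close). Python generators cannot be resumed twice from the same point, so the fork case has no direct counterpart here. The names process, resume_once, and resume_never are invented for this example.

```python
def process(log):
    """A toy "process" that suspends itself the way a blocking read would."""
    log.append("before read")
    data = yield "read"  # suspend here; control returns to the caller
    log.append("read returned " + data)


def resume_once():
    """The normal case: the suspension is resumed exactly once."""
    log = []
    proc = process(log)
    next(proc)  # run the process up to its suspension point
    try:
        proc.send("42 bytes")  # resume it, delivering the data
    except StopIteration:
        pass  # the process ran to completion
    return log


def resume_never():
    """The exit-like case: the suspension is discarded, never resumed."""
    log = []
    proc = process(log)
    next(proc)
    proc.close()  # throw the suspension away
    return log


if __name__ == "__main__":
    print(resume_once())
    print(resume_never())
```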
Posted in functional programming
Tags continuations, fp, oleg

Posted by Tom Moertel
Fri, 11 Apr 2008 15:58:00 GMT
Via Chris:
$ history | awk '{print $2}' | sort | uniq -c | sort -rn | head
196 git
110 l
102 cd
70 make
34 darcs
30 pushd
23 ssh
23 m
23 ls
20 rm
The l and m commands are aliases:
Posted in interesting stuff
Tags life, memes, programming

Posted by Tom Moertel
Thu, 20 Mar 2008 02:34:00 GMT
At work recently I was writing some tests with Python’s out-of-the-box
unit-testing framework
unittest. I’m new
to Python and accustomed to Perl and Haskell’s testing frameworks,
which are lightweight and let you write tests without much
hoop-jumping. In particular,
QuickCheck and
LectroTest make it easy
to test at the property level instead of the test-case level.
With unittest, I was having to write a lot of code
to get the same level of abstraction.
By “property level,” here’s what I mean. Say I’m testing this thing,
let’s call it a subscriber pool. It has two fundamental properties:
- Subscribe. For all initial states of the pool, if you call subscribe(user), then, assuming there have been no other operations on the pool, user must be in the pool.
- Unsubscribe. For all initial states of the pool, if you call unsubscribe(user), then, assuming there have been no other operations on the pool, user must not be in the pool.
That’s it. If my implementation satisfies both properties, it’s
correct. (This is a simplified version of my real testing problem,
which required additional property checks.)
To test whether my implementation satisfies each property, I must
write individual test cases that together “cover” the property. For
example, to test whether the Subscribe property holds, I might write
four test cases:
import unittest

# initialize_pool, destroy_pool, load_pool_with_members, subscribe, and
# pool_members are assumed to come from the code under test.

class SubscribeProperty(unittest.TestCase):

    def setUp(self):
        initialize_pool()

    def tearDown(self):
        destroy_pool()

    def testEmpty(self):
        load_pool_with_members([])
        subscribe("1")
        self.assertTrue("1" in pool_members())

    def testOtherGuyAlreadyInPool(self):
        load_pool_with_members(["2"])
        subscribe("1")
        self.assertTrue("1" in pool_members())

    def testSubscriberAlreadyInPool(self):
        load_pool_with_members(["1"])
        subscribe("1")
        self.assertTrue("1" in pool_members())

    def testSubscriberAndOtherGuyAlreadyInPool(self):
        load_pool_with_members(["1", "2"])
        subscribe("1")
        self.assertTrue("1" in pool_members())
Every one of the test cases has the same form. The repetition
makes me want to refactor the whole thing.
Okay, let’s do it:
Read more...
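One direction such a refactoring could take, sketched here under assumptions and not necessarily the one the full post follows: list the initial pool states once and check each property against every state. Because the real pool's code is not shown, the SubscriberPool class below is a hypothetical in-memory stand-in, and INITIAL_STATES and PoolProperties are likewise names invented for the example.

```python
import unittest


class SubscriberPool:
    """Hypothetical in-memory stand-in for the real subscriber pool."""

    def __init__(self, members=()):
        self._members = set(members)

    def subscribe(self, user):
        self._members.add(user)

    def unsubscribe(self, user):
        self._members.discard(user)

    def members(self):
        return set(self._members)


# The four initial states covered by the hand-written test cases.
INITIAL_STATES = [[], ["2"], ["1"], ["1", "2"]]


class PoolProperties(unittest.TestCase):

    def test_subscribe(self):
        # Property: after subscribe(user), user is in the pool.
        for state in INITIAL_STATES:
            with self.subTest(initial=state):
                pool = SubscriberPool(state)
                pool.subscribe("1")
                self.assertIn("1", pool.members())

    def test_unsubscribe(self):
        # Property: after unsubscribe(user), user is not in the pool.
        for state in INITIAL_STATES:
            with self.subTest(initial=state):
                pool = SubscriberPool(state)
                pool.unsubscribe("1")
                self.assertNotIn("1", pool.members())
```

Saved in a module, these tests can be run with python -m unittest. Each property is now stated once, and covering another initial state means adding one entry to the list.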
Posted in testing
Tags nose, properties, python, testing, unittest

Posted by Tom Moertel
Tue, 18 Dec 2007 03:33:00 GMT
XML is fine for representing document-like things, but when it’s
twisted to represent build recipes, configuration files, and little
programming languages, it opens the gates to XML Hell. Once the
gates are opened, the demons of cargo-cult thinking are loosed upon
the world, where they are free to trick innocent programmers into
working with grotesquely twisted XML documents – something no human
mind was designed to comprehend. Ensnared, these programmers are
slowly drawn into the depths of XML Hell, from which their
lamentations echo across the
universe.
When the demons of cargo-cult thinking come for you, don’t be
ensnared! Instead, be prepared – with PXSL – the Parsimonious XML
Shorthand Language
(pronounced “pixel”).
What’s PXSL? It’s a luxurious, thermonuclear smoking jacket that you
can slip on using a convenient preprocessor. Use it whenever you see
grotesque XML on the horizon. Within PXSL’s plush (and stylish)
protection, you can create all the nasty, twisted XML that may be
demanded of you, but you need not descend into XML Hell to do it.
Instead, you can work from the comfort of a well-stocked lounge, where
clarity and conciseness are always on tap.
For example, here’s a snippet from an XSLT stylesheet, in the
original XML:
<xsl:template match="/">
  <xsl:for-each select="//*/@src|//*/@href">
    <xsl:value-of select="."/>
    <xsl:text> </xsl:text>
  </xsl:for-each>
</xsl:template>
And here’s the same snippet, written in PXSL:
template /
  for-each //*/@src|//*/@href
    value-of .
    text << >>
Isn’t that refreshing?
Why PXSL?
There are lots of XML shorthands available. (The PXSL FAQ lists about ten of them.) So why choose
PXSL? Here’s why:
Also, PXSL is battle tested. It was first released in 2003 and has
been saving people from XML Hell since. People who try it seem to like it:
- I think PXSL could do wonders for soothing my irrational hatred for all things XML. —kowey
- Impressive… I converted some of my files from XML to PXSL and the readability was much improved. —chris
- Quite aside from the fact that XSLT is finally somewhat readable, the fact that you’ve added a serious macro system means that some serious scripting of XML can occur. I’m very impressed. —invisible
The next time you’re headed for XML Hell, why not give the venerable PXSL a try? You might just find that you like it, too.
This public service announcement was brought to you in celebration of
the 1.0 release of the pxsl-tools package. The PXSL-to-XML compiler
pxslcc is written in Haskell and uses the
cross-platform Haskell Cabal
build/package system to let you use PXSL just about anywhere.
Posted in programming
Tags haskell, pxsl, xml, xslt

Posted by Tom Moertel
Mon, 10 Dec 2007 21:52:00 GMT
About three years ago, I switched to
Darcs
as my primary source-code management system. It was simple,
intuitive, and powerful, and it made managing my projects more fun and
less frustrating than any centralized VCS ever had. That it was
written in Haskell, one of my favorite programming languages, made
it even better. I was hooked.
Since then, the distributed SCM landscape has changed. Darcs hasn’t
improved much, but its competitors have made long strides, especially
Git and
Mercurial. Both
are crazy fast, vigorously developed, and widely used on large, highly
active real-world projects, such as the Linux kernel and Mozilla 2.
In comparison, Darcs has
stagnated.
When I started working for a new company recently, I had to consider
whether to advocate Darcs or something else. In the end, I decided
that Darcs would be a hard sell. Nobody else at the company uses
Haskell, and having to explain how to avoid the occasional corner
case
seemed like a losing proposition.
After researching and playing around with Git and Mercurial, I settled
on Git. I like Git’s underlying hashed-blobs model better than
Mercurial’s revlogs, and Git seems to have slightly more development
momentum. Still, it was a close call. Either choice would have been
completely reasonable.
Missing Darcs
When I started using Git on real projects, the one thing I really
missed was the ability to easily amend earlier patches, something
Darcs made trivial. Let me
explain. The typical development workflow goes something like this:
- Check out a copy of the upstream code base.
- Implement feature X.
- Commit.
- Implement independent feature Y.
- Commit.
- Implement independent feature Z.
- Commit.
- Push new features back upstream.
Now, what really happens is that when I’m implementing Y or Z,
I’ll realize that I made a mistake in X. The trick is then
fixing X so that my fix is part of the changeset/patch for X that
ultimately gets pushed upstream in the last step. That way, the
upstream folks will see only a single, clean patch for feature X – not
a mishmash of patches that together represent X.
In Darcs, amending the original patch is easy because its patch theory
lets me tweak the patch for X independently of the other patches.
Darcs will simply ask me which patch I want to amend, and I’ll select
the original patch for X:
$ emacs # fix X
$ darcs amend-record # amend original patch for X
Mon Dec 10 14:43:13 EST 2007 Tom Moertel <tom@moertel.com>
* Implemented Z
Shall I amend this patch? [yNvpq], or ? for help: n
Mon Dec 10 14:42:12 EST 2007 Tom Moertel <tom@moertel.com>
* Implemented Y
Shall I amend this patch? [yNvpq], or ? for help: n
Mon Dec 10 14:41:46 EST 2007 Tom Moertel <tom@moertel.com>
* Implemented X
Shall I amend this patch? [yNvpq], or ? for help: y
hunk ./x 1
-X1
+X2
Shall I add this change? (1/?) [ynWsfqadjkc], or ? for help: y
Finished amending patch:
Mon Dec 10 14:43:25 EST 2007 Tom Moertel <tom@moertel.com>
* Implemented X
That’s it. The exact same process will work regardless of when I
realize I need to fix X: before I start Y, while I’m implementing Y,
after I’ve committed Y, while I’m working on Z, or after I’ve committed
Z.
Learning to love Git
With Git, however, I can amend a commit only if I haven’t committed anything else before making my fix. In Git’s mind, Y depends on X, and Z
depends on Y, even if they really are independent of one another.
So if I commit the original patch for X and then immediately realize I
need to make a fix, before I start working on Y or Z, it’s easy:
$ emacs # implement X
$ git commit -m 'Implemented X'
# discover problem in X
$ emacs # fix X
$ git commit --amend # amend original patch
More typically, it’s only while I’m working on Y that I’ll
realize I need to fix X. Then it’s more complicated
to amend the original commit:
$ emacs # implement X
$ git commit -m 'Implemented X'
$ emacs # start working on Y
# discover problem in X
$ git stash # stash away half-completed work on Y
$ emacs # fix X
$ git commit --amend # amend original patch for X
$ git stash apply # restore work on Y
$ emacs # continue working on Y
While not as convenient as Darcs’s workflow, it’s perfectly workable.
Now let’s consider another fairly typical case: I commit X and Y and
then start working on Z before I notice the problem in X. I used to
think that Git couldn’t handle this case, but it can, thanks to
git rebase --interactive:
$ emacs # implement X
$ git commit -m 'Implemented X'
$ emacs # implement Y
$ git commit -m 'Implemented Y'
$ emacs # start working on Z
# discover problem in X
$ git stash # stash away half-completed work on Z
$ emacs # fix X
$ git commit -m 'Fixed X'
$ git rebase --interactive HEAD~3 # see comments below
$ git stash apply # restore work on Z
$ emacs # continue working on Z
The
git rebase --interactive command is
powerful. What the
command does, as called in the snippet above, is invoke my editor of
choice on a text file describing the last 3 commits (that’s the
HEAD~3 part):
# Rebasing 3ad99a7..b9a8405 onto 3ad99a7
#
# Commands:
# pick = use commit
# edit = use commit, but stop for amending
# squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
pick 0885540 Implemented X
pick 320b115 Implemented Y
pick b9a8405 Fixed X
I can then edit the file to reorder, merge (squash), and/or remove
the commits. In this example, I want to merge the fix for X into
the original commit that implemented X. So I edit the file like so:
pick 0885540 Implemented X
squash b9a8405 Fixed X
pick 320b115 Implemented Y
Then I save the file, at which point Git takes over and makes the
requested changes, merging the fix for X into the
original commit for X. Now the log shows the original implementation
and fix as one commit:
$ git log
commit f387d650976246c0854d028b040cca40e542be56
Author: Tom Moertel <tom@moertel.com>
Date:   Mon Dec 10 15:11:26 2007 -0500

    Implemented Y

commit 82a1c849ffd1bd688d5bc9d99be0e63548a89c4c
Author: Tom Moertel <tom@moertel.com>
Date:   Mon Dec 10 15:13:03 2007 -0500

    Implemented X
    Fixed X

commit 3ad99a7ef537b7ae99e435e0d2b4b0d03de92c65
Author: Tom Moertel <tom@moertel.com>
Date:   Mon Dec 10 15:11:14 2007 -0500

    Initial checkin
Once I figured out how to use git rebase --interactive, I stopped
missing Darcs and started loving Git.
Posted in programming
Tags darcs, dvcs, git, haskell, scm
