GIGO: words unreadable aloud

Luhn Haskell code explained in excruciating detail

On Halloween, I posted a scary entry about the Luhn algorithm, and Joey deVilla, whose idea this was, mentioned my little programs.

Now, since there are a whole lot more people who don't know any Haskell than people who do, I thought it'd be nice to write a literate version, aimed at the Haskell novice. Here 'tis. A slightly earlier Haskell "literate source" version is at http://radio.weblogs.com/0100945/gems/Luhn_explain_lhs.txt, with the typical filename munging that Radio UserLand's policies require. (I.e., rename it to Luhn_explain.lhs and you should be able to feed it directly to hugs, ghci, or ghc.)

So, here goes:

To calculate the Luhn algorithm for a number (passed in as a string of digits), convert it to an integer ("read"), and call the sub-luhner with that number and a 1 as the multiplier. If the result of that call is zero mod 10, you have a valid number.

> luhn str = (luhn_sub (read str) 1) `mod` 10 == 0

The sub-luhner takes a big number n (or part of one), and a multiplier m (starts with 1). It will return the funky sum, for its caller to check for divisibility by 10.

If the big number gets to zero, the result is zero.

> luhn_sub 0 _ = 0

The main work of the whole algorithm is done in the next three lines.

We split the number into n/10 and n%10, the latter being "the current digit". Multiply the current digit by the current multiplier (m) and then (in "do_dig") add the digits (i.e., subtract 9) if necessary.

The recursive call takes the main part of the number (n/10), and gets the checksum for that part, but with the multiplier toggled (2 if it was 1, and vice versa; this is the "(3-m)").

Then we add the results of that do_dig and that recursive call, and we're done!

> luhn_sub n m = > (luhn_sub (n `div` 10) (3 - m)) + do_dig (m * (n `mod` 10)) > where do_dig n = if n <= 9 then n else n - 9

The next five lines run a small pair of tests. Not originally a great choice, since both of the test digit strings were an even number of digits long. I've added a couple of odd-length tests here (and have run a few thousand other numbers chosen by python's random number module) and now I have a bit more confidence.

t9 takes a digit string and returns a list of ten strings, where each resulting string has an appended digit, "0" through "9".

In more detail:

(map show [0..9]) takes the list of numbers 0-9 and turns it into a list of one-char strings "0" through "9".

(map (base ++) That_Previous_Result) is a tricky one:
"(base ++)" is a partially-applied operator function.
(++) is the list append operator, and Haskell treats character strings as lists of characters. E.g., ("abcd" ++ "efg") results in "abcdefg". So a partially-applied (++) as in ("abcd" ++) is a function that takes a string and appends it to "abcd".

So, mapping "(base ++)" onto That_Previous_Result yields the list of ten strings where the one-char string ("0" through "9") has been appended to our base string (t9's parameter).

> t9 base = (map (base ++) (map show [0..9]))

prTest prints the test rusults. Starting with the small list of test numbers I chose, ["123", "234234", "328388272", "5128960128"], we map t9 onto that list to give a list of four lists, where each sub-list has ten numbers in it.

"concat $ a_long_expression" is the same as "concat (a_long_expression)"; it's a style thing to decide which one to use. In our case, concat will take that list of four lists, each with ten numbers, and combine them into one list of forty numbers.

> prTest = > let test_strs = concat $ map t9 ["123", "234234", "328388272", "5128960128"] in

The raw results would be (map luhn test_strs) but that doesn't look particularly helpful when you display it: it's just a list of forty Trues or Falses. So instead of mapping luhn directly onto the test_strs, we map an anonymous function:

(\s -> (s ++ " -> " ++ show (luhn s)))

onto them. The anonymous function takes a string s, and returns a string consisting of the input (parameter) s, an arrow, and then the string version of the True or False result of the luhn call.

> let results = map (\s -> (s ++ " -> " ++ show (luhn s))) test_strs in

Finally, we'd like to see those results, so we want to output each one, and follow each output with a newline. In this case, we can't simply use a "map" because that would return a list of IO actions (the type of putStrLn is String -> IO () ). We wish to get prTest to perform the actions itself, and the Haskell interpreters won't do that for a list of actions. If we were to make a pr2 that's like prTest but uses map instead of mapM_, then we would have to call it by saying "sequence pr2" in order to get the interpreter to perform those actions.

> mapM_ putStrLn results

When I run it, I get a bunch of lines like these:

1230 -> True
1231 -> False
1232 -> False
  . . .

with the four good ones interspersed:

1230 -> True
2342343 -> True
3283882722 -> True
51289601281 -> True

Once one is familiar with Haskell, one tends to appreciate how well Haskell code can retain clarity despite being so concise.
1:26:00 PM comment/