Luhn Haskell code explained in excruciating detail
On Halloween, I posted
a scary entry about the Luhn algorithm, and Joey deVilla, whose
idea this was,
mentioned my little programs.
Now, since there are a whole lot more people who don't know any
Haskell than people who do, I thought it'd be nice to write a literate version,
aimed at the Haskell novice. Here 'tis. A slightly earlier Haskell "literate source" version is at
http://radio.weblogs.com/0100945/gems/Luhn_explain_lhs.txt,
with the typical filename munging that Radio UserLand's policies require.
(I.e., rename it to Luhn_explain.lhs and you should be able to feed it
directly to hugs, ghci, or ghc.)
So, here goes:
To calculate the Luhn algorithm for a number
(passed in as a string of digits), convert it to
an integer ("read"), and call the sub-luhner with
that number and a 1 as the multiplier. If the result
of that call is zero mod 10, you have a valid number.
> luhn str = (luhn_sub (read str) 1) `mod` 10 == 0
The sub-luhner takes a big number n (or part of one), and
a multiplier m (starts with 1). It will return the funky
sum, for its caller to check for divisibility by 10.
If the big number gets to zero, the result is zero.
> luhn_sub 0 _ = 0
The main work of the whole algorithm is done in the next
three lines.
We split the number into n/10 and n%10, the latter
being "the current digit". Multiply the current digit
by the current multiplier (m) and then (in "do_dig") add
the digits (i.e., subtract 9) if necessary.
The recursive call takes the main part of the number
(n/10), and gets the checksum for that part, but with the
multiplier toggled (2 if it was 1, and vice versa; this
is the "(3-m)").
Then we add the results of that do_dig and that recursive
call, and we're done!
> luhn_sub n m =
> (luhn_sub (n `div` 10) (3 - m)) + do_dig (m * (n `mod` 10))
> where do_dig n = if n <= 9 then n else n - 9
The next five lines run a small pair of tests. Not originally a
great choice, since both of the test digit strings were an even
number of digits long. I've added a couple of odd-length
tests here (and have run a few thousand other numbers chosen
by python's random number module) and now I have a bit more
confidence.
t9 takes a digit string and returns a list of ten strings,
where each resulting string has an appended digit, "0" through "9".
In more detail:
(map show [0..9]) takes the list of numbers 0-9 and turns it into a list of
one-char strings "0" through "9".
(map (base ++) That_Previous_Result) is a tricky one:
"(base ++)" is a partially-applied operator function.
(++) is the list append operator, and Haskell treats character
strings as lists of characters. E.g., ("abcd" ++ "efg") results in
"abcdefg". So a partially-applied (++) as in ("abcd" ++) is a
function that takes a string and appends it to "abcd".
So, mapping "(base ++)" onto That_Previous_Result yields the
list of ten strings where the one-char string ("0" through "9")
has been appended to our base string (t9's parameter).
> t9 base = (map (base ++) (map show [0..9]))
prTest prints the test rusults. Starting with the small list of
test numbers I chose, ["123", "234234", "328388272", "5128960128"],
we map t9 onto that list to give a list of four lists, where each
sub-list has ten numbers in it.
"concat $ a_long_expression" is the same as "concat (a_long_expression)";
it's a style thing to decide which one to use. In our case, concat
will take that list of four lists, each with ten numbers, and combine
them into one list of forty numbers.
> prTest =
> let test_strs = concat $ map t9 ["123", "234234", "328388272", "5128960128"] in
The raw results would be (map luhn test_strs) but that doesn't look
particularly helpful when you display it: it's just a list of forty
Trues or Falses. So instead of mapping luhn directly onto the test_strs,
we map an anonymous function:
(\s -> (s ++ " -> " ++ show (luhn s)))
onto them. The anonymous function takes a string s, and returns a
string consisting of the input (parameter) s, an arrow, and then the
string version of the True or False result of the luhn call.
> let results = map (\s -> (s ++ " -> " ++ show (luhn s))) test_strs in
Finally, we'd like to see those results, so we want to output each one,
and follow each output with a newline. In this case, we can't simply
use a "map" because that would return a list of IO actions (the type
of putStrLn is String -> IO () ). We wish to get prTest to
perform the actions itself, and the Haskell interpreters won't do
that for a list of actions. If we were to make a pr2 that's like prTest but
uses map instead of mapM_, then we would have to call it by saying
"sequence pr2" in order to get the interpreter to perform those actions.
> mapM_ putStrLn results
When I run it, I get a bunch of lines like these:
1230 -> True
1231 -> False
1232 -> False
. . .
with the four good ones interspersed:
1230 -> True
2342343 -> True
3283882722 -> True
51289601281 -> True
Once one is familiar with Haskell, one tends to appreciate how well
Haskell code can retain clarity despite being so concise.
1:26:00 PM