Planet Haskell

December 19, 2008

Conal Elliott

Smarter termination for thread racing
I realized in the shower this morning that there’s a serious flaw in my unamb implementation as described in Functional concurrency with unambiguous choice. Here’s the code for racing two computations:
race :: IO a -> IO a -> IO a
a `race` b = do v  <- newEmptyMVar
                ta <- forkPut a v
                tb <- forkPut b v
                x  <- takeMVar  v
                killThread ta
                killThread tb
                return x
 
forkPut :: IO a -> MVar a -> IO ThreadId
forkPut act v = forkIO ((act >>= putMVar v) `catch` uhandler `catch` bhandler)
 where
   uhandler (ErrorCall "Prelude.undefined") = return ()
   uhandler err                             = throw err
   bhandler BlockedOnDeadMVar               = return ()
The problem is that each of the threads ta and tb may have spawned other threads, directly or indirectly. When I kill them, they don’t get a chance to kill their sub-threads. If the parent thread does get killed, it will most likely happen during the takeMVar.

My first thought was to use some form of garbage collection of threads, perhaps akin to Henry Baker’s paper The Incremental Garbage Collection of Processes. As with memory GC, dropping one consumer would sometimes result is cascading de-allocations. That cascade is missing from my implementation above.

Or maybe there’s a simple and dependable manual solution, enhancing the method above.

I posted a note asking for ideas, and got the following suggestion from Peter Verswyvelen:

I thought that killing a thread was basically done by throwing a ThreadKilled exception using throwTo. Can’t these exception be caught?

In C#/F# I usually use a similar technique: catch the exception that kills the thread, and perform cleanup.

Playing with Peter’s suggestion works out very nicely, as described in this post.

There is a function that takes a clean-up action to be executed even if the main computation is killed:
finally :: IO a -> IO b -> IO a
Using this function, the race definition becomes a little shorter and more descriptive:
a `race` b = do v  <- newEmptyMVar
                ta <- forkPut a v
                tb <- forkPut b v
                takeMVar v `finally`
                  (killThread ta >> killThread tb)
This code is vulnerable to being killed after the first forkPut and before the second one, which would then leave the first thread running. The following variation is a bit safer:
a `race` b = do v  <- newEmptyMVar
                ta <- forkPut a v
                (do tb <- forkPut b v
                    takeMVar v `finally` killThread tb)
                 `finally` killThread ta
Though I guess it’s still possible for the thread to get killed after the first fork and before the next statement begins. Also, this code difficult to write and read. The general pattern here is to fork a thread, do something else, and kill the thread. Give that pattern a name:
forking :: IO () -> IO b -> IO b
forking act k = do tid <- forkIO act
                   k `finally` killThread tid
The post-fork action in both cases is to execute another action (a or b) and put the result into the mvar v. Removing the forkIO from forkPut, leaves putCatch:
putCatch :: IO a -> MVar a -> IO ()
putCatch act v = (act >>= putMVar v) `catch` uhandler `catch` bhandler
 where
   uhandler (ErrorCall "Prelude.undefined") = return ()
   uhandler err                             = throw err
   bhandler BlockedOnDeadMVar               = return ()
Combe forking and putCatch for convenience:
forkingPut :: IO a -> MVar a -> IO b -> IO b
forkingPut act v k = forking (putCatch act v) k
Or, in the style of Semantic editor combinators,
forkingPut = (result.result) forking putCatch
Now the code is tidy and safe:
a `race` b = do v <- newEmptyMVar
                forkingPut a v $
                  forkingPut b v $
                    takeMVar v
Recall that there’s a very slim chance of the parent thread getting killed after spinning a child and before getting ready to kill the sub-thread (i.e., the finally). If this case happens, we will not get an incorrect result. Instead, an unnecessary thread will continue to run and write its result into an mvar that no one is reading.
by conal at December 19, 2008 08:11 AM

Real-World Haskell

RWH on Twitter

RWH now has a Twitter account where updates about the book and other not-quite-blog worthy news will appear. The authors and the broader Haskell and FP community, are also largely on Twitter, if you want to keep up to date on events in Haskell land. We’ve also had a lot of micro-reviews and news about the book on Twitter, from readers all over:

Real World Haskell is awesome. Haskell is also awesome.

Just won “RealWorldHaskell” in the #kplug book raffle.

Going through the RWH exercises is fun

Thanks everyone! And see you online.

by Don Stewart (dons@cse.unsw.edu.au) at December 19, 2008 06:41 AM

Antti-Juhani Kaijanaho (ibid)

A thought

If a Debian General Resolution ballot asked me to choose between “eat shit” and “die”, I’d vote for Further Discussion.

by Antti-Juhani Kaijanaho at December 19, 2008 05:53 AM

December 18, 2008

Alson Kemp

Turbinado update

For those of you interested in Turbinado, here’s a quick status update:

I separated the code for the turbinado.org website from the code for the framework. The framework is here and the website code is here.

I’m going to finish up implementing HAML templating for Turbinado in the next few days.

After HAML templates are in, I’ll provide a tutorial on implementing a mini-CMS/wiki in Turbinado (the code is already in the website. The standard-Rails-ish “look, Mom! No code!” type of tutorial. Just enough to convince you to download it, but not enough to get you to be significantly productive.

Adam Stark is providing some greatly needed polish here as he attempts to get this beastie to build. Turbinado really needs to be easier to build…

Diego Echeverri is doing some work to get Turbinado to work with GHC 6.10 here. I had a difficult time getting my HSP-ish View templates working with 6.10, so I hope Diego can do it. I’d greatly prefer to be working with 6.10, but I couldn’t get there…

Writing a little web framework turns out to be a lot of work (it’s all the little stuff (documentation!!) that really gets ya). I’ve greatly appreciated the ability to build on the work of others (especially Niklas Broberg, Don Stewart, Bjorn Bringert and John Goerzen) and am grateful that others are providing help to this fledgling project.

by alson at December 18, 2008 06:13 PM

Philip Wadler

Type Safe Pattern Combinators, by Morten Rhiger

An ingenious set of combinators for expressing pattern matching in a type-safe way. No ADTs required. Opens up the possibility of programming languages where we can declare new binding forms just as we declare new functions. Revolutionary, but may have been hard to publish before Programming Pearls were created---it shows why they are a good idea.

by Philip Wadler (noreply@blogger.com) at December 18, 2008 03:20 PM

RAE 2008

Every few years, the UK Government conducts the Research Assessment Exercise, to determine the quality of research of every department in every university. The results for the most recent exercise are out, and by any measure Informatics at the University of Edinburgh did very well.

by Philip Wadler (noreply@blogger.com) at December 18, 2008 10:09 AM

The Price of Forgoing Basic Research

An article by Bill Buxton in Business Week, making the case that too much emphasis on applications is counterproductive. My favorite line:
To be blunt, I believe that when academic research starts demonstrating industry relevance is when funding should be cut off, not augmented.

by Philip Wadler (noreply@blogger.com) at December 18, 2008 10:04 AM

Alson Kemp

GitHub, ‘git’ and the forking fallacy

GitHub is a pretty sweet system. One of the best aspects of web 2.0 has been the focus on simple, straightforward web app usability/utility. GitHub is a great example of some of web 2.0’s best attributes: a really interesting model coupled with straightforward usability and just a touch of AJAX. I never would have tried Git without GitHub.

Git is pretty interesting, too. Turns out that git is effective at encouraging social involvement in coding. The Turbinado project has been forked a few times… ‘Forked’?! Wait! That’s what happens when projects are in deep trouble, right?!

Maybe so, but not necessarily in git. Since git is a decentralized revision control system (like our beloved darcs), ‘forking’ is roughly equivalent to ‘checking out’. A ‘fork’ is a good thing, so Rails’ fork count (398) is heroic.

by alson at December 18, 2008 05:04 AM

"Osfameron"

Crossword puzzles in Haskell
Every year or so I come back to the problem of writing a crossword puzzle compiler/player. I think Javascript would be the most promising for a web-based player, though I've given it a go in Java and Perl too. Modeling the problem is interesting in an Object Oriented language - I would find myself getting bogged down with "Lines" and the similarities between rows (Across) and columns (Down). I have a suspicion that OO Roles might be a more expressive way to model this. Anyway, given that I've not been writing much about Haskell ¹, this is a good time to redress the balance.
In the OO implementations, Cells would refer to the "Light" (group of adjacent cells running Across/Down) that they're in. And the Light would of course refer to the cells... this idea filled me with terror in Haskell, as it involves "Tying the Knot", which seems terribly clever and confusing. As it happens, you can often get away with having mutual references in Functional Programming, as they just spontaneously turn out not to be necessary. So far, this seems to be the case, though I think that if I take it to the point of navigating the grid from a UI, I may need an explicit structure to manage this, like a zipper.
This is a "literate Haskell" post. You should be able to save it as "crossword.lhs" and run it with runghc or ghci (calling the main function to test it). We start off with some imports: the List and Char modules (which I've naughtily not specified) and the handy join function to flatten a list.
> import Data.List
> import Data.Char
> import Control.Monad (join)
They say that much of the work in Haskell is defining the types. Direction is an Enum for the direction of a Light. Cell is either a Block (a black square) or a Cell (either blank, or already filled in). I guess I could model Block | Blank | Cell but don't yet see an advantage to that.
> data Direction = Across | Down
>     deriving (Show, Eq, Ord)
> 
> data Cell = Cell  (Maybe Char) Coord
>           | Block              Coord
>     deriving (Show, Eq)
>
> type Coord  = (Int, Int)
>
> coord (Cell _ c) = c
> coord (Block  c) = c
Directions can be sorted (which may come in useful for showing Across clues before Down ones), and this can be done automatically by deriving Eq and Ord. But how would we sort a Cell? We'll do it by the (x,y) coordinates, which means I can't use automatic derivation. Perhaps I could swap the order of arguments and still use that, but for now I defined a custom compare.
> instance Ord Cell
>     where compare l r = compare (coord l) (coord r)
Lights (distinct from Clues, which may be spread over 1 or more Lights) are a list of cells in a given direction. (It would be nice to specify that the cells really are contiguous, not sure if this is something that fundeps would be useful for?)
> data Light     = Light [Cell] Direction
>     deriving (Show, Eq, Ord)
>
> -- we'll sometimes want to know the first cell in a Light:
> headC (Light (c:_) _) = c
Of course Lights are numbered... but with the algorithm that I'm using, we don't know the number (like 5 Across) at the time we create it. I created a new type, LightN, but perhaps I should have modeled with a Maybe Int instead.
> data LightN    = LightN Int Light
>     deriving Show
Now we need some test data. I started with a grid from the wonderful Araucaria, Cryptic Crossword No. 24298.
> grid = textToCells [ 
>     "TRIPOD#        ",
>     "# # # # # # # #",
>     "PARSIFAL#RESCUE",
>     "# # # # # # # #",
>     "###            ",
>     "# # # # # ### #",
>     "ANNA###        ",
>     "# # # # # # # #",
>     "        ###PARE",
>     "# ### # #U# # #",
>     "         S  ###",
>     "# # # # #E# # #",
>     "      #INFERNAL",
>     "# # # # #U# # #",
>     "FRAGMENT#L     "
>     ]
We're going to want to output the grid, lights, and other objects, so let's define some functions to do that.
> showCell (Block _)         = '#'
> showCell (Cell (Just c) _) = c
> showCell _                 = ' '

> showLightN :: LightN -> String
> showLightN (LightN n l@(Light cs d)) =
>        show n ++ " " 
>     ++ show d 
>  -- ++ " " ++ show (coord $ headC l) 
>     ++ ": "
>     ++ show (map showCell cs)
>     ++ " (" ++ (show $ length cs) ++ ")"
>     
Similarly, we want to parse the list of strings above into a list of crossword cells. textToCells threads the row and column number with every character in the grid by zipping the list with the infinite list [1..], which I think is quite cute, though there are no doubt more elegant versions (list comprehensions?)
> charToCell :: Char -> Coord -> Cell
> charToCell '#' = Block
> charToCell ' ' = Cell Nothing
> charToCell  c
>     | isAlpha c = Cell (Just c)
>     | otherwise = error $ "Invalid character '" ++ [c] ++ "'"
> 
> textToCells :: [[Char]] -> [[Cell]]
> textToCells                     = zipWith  makeRow       [1..] 
>     where makeRow  row          = zipWith (makeCell row) [1..] 
>           makeCell row col char = charToCell char (row,col)
But working out what cells are is the easy part! We now want to know which cells form a light — i.e. groups of more than 1 non-block cell in either direction Across/Down. To get data for both directions, it's easiest to run in two passes, one in the normal direction, the other transposed. (I did consider trying to do both at the same time, but it hurt my brain: a one pass solution involving magical fumplicative arroids or somesuch is left as an exercise to the very clever reader).
> lights dir grid = concatMap 
>                     (flip lightsInLine dir) 
>                     $ rot dir grid
>     where rot Across = id
>           rot Down   = transpose
>
> lightsInLine :: [Cell] -> Direction -> [Light]
> lightsInLine cells dir = 
>     let l  = filter isMultiCell 
>                $ groupBy areCells cells
>     in  map (\c -> Light c dir) l
>
> areCells x y = isCell x && isCell y 
> isCell (Cell _ _) = True
> isCell  _         = False
> 
> isMultiCell (x:y:_) | areCells x y = True
> isMultiCell _ = False
So... where have we got with all of this modeling? Well, we can now find all the Across and Down lights. But then we'll want to number them. To do that, we'd have to sort them (by the coordinate). Across and Down lights can have the same number (like 5 Across and 5 Down in our example grid) so we want to group by lights that have the same head cell. Then we can thread the light number again, using the zipWith ... [1..] trick:
> allLights = join $ zipWith (map . LightN) [1..] gs
>         where gs = groupBy eqHead ls
>               ls = sort $ (lights Across grid)
>                        ++ (lights Down   grid)
>               eqHead l r = (headC l) == (headC r)
And finally, we can see the result of all the hard work, with a list of all the lights, and their current (partial) solutions:
> main = mapM_ putStrLn $ map showLightN allLights
Obviously this is only a start on the problem. For modeling, we now need a concept of a Clue (Clue String [Light]) and a solution - should the solution belong to the clue? or to the [Light]s that it's made up of. How do we link the answer grid (where the lights contain the correct characters) with the play grid, which contains the current letters that the player believes to be right? And how do we update the cells, lights, and grid while playing (or creating) a crossword?
Suggestions on these questions, and improvements or advice on the current code are greatly appreciated!

I had a complaint about this from a Planet Haskell reader: and though the FAQ does suggest that it's ok, or even encouraged to write about other things, perhaps I should also write a little about Haskell... ;-)
back
by osfameron at December 18, 2008 12:05 AM

Chris Done

lojbot
I don't think I have blogged at all about lojbot. Lojbot, or {lojbot}¹, is an IRC bot² which I have written, in Haskell, which runs on the Freenode IRC network, idles in #lojban and #lojbot. It has some neat commands which are useful to Lojbanists. I'm going to demonstrate some of its features briefly.
I've recently done over lojbot's command and tidied up the source. It's much better now.
I've marked this post as Lojban and Haskell, but it's only vaguely related to Haskell in that I wrote the bot and all the Lojban library stuff in it. I won't be discussing Haskell in this post.
Help
First of all you can invoke the help command to list all the other commands, and you can invoke help for each of those commands. Commands, in channels, are prefixed with things, e.g. in #lojbot, it is ?command. In private message, however, no prefix is needed.
<chrisdone> help
<lojbot> commands: search (query, q), valsi (v), rafsi (r), gloss
(g), definition (d), selma'o (s), translate (t), selrafsi (sr), cll,
lujvo (l), grammar (gr), correct (c), more (m), help (h, commands),
info (about)
The words in brackets are short-hands/aliases. Easier to type when you're rushing about trying to make a response to someone!
Lookup
Look up entries according to the word itself:
<chrisdone> h v
<lojbot> valsi (v): lookup a gismu/cmavo/lujvo/fu'ivla

<chrisdone> v valsi
<lojbot> gismu {valsi}, with rafsi {val, vla}, glossing to "word": x1
is a word meaning/causing x2 in language x3; (adjective: x1 is
lexical/verbal).
Words don't have spaces in, so we can reasonably assume when you enter more than one word, you want to see more than one result:
<chrisdone> v ganra garna grana
<lojbot> gismu {ganra}, with rafsi {gan}, glossing to "broad": x1 is
broad/wide in dimension x2 [2nd most significant dimension] by
standard x3.
<lojbot> gismu {garna}, with rafsi {gar}, glossing to "rail", "bar",
"railing": x1 is a rail/railing/bar [tool] supporting/restraining x2,
of material x3.
<lojbot> gismu {grana}, with rafsi {ga'a}, glossing to "stick",
"pole", "staff", "cane", "rod": x1 is a rod/pole/staff/stick/cane
[shape/form] of material x2.
If we know the start of a gismu, or something like it, we can use a wild card to search for it. Here's an example of words starting with {jd}, a cluster which many find difficult to pronounce:
<chrisdone> v jd*
<lojbot> gismu {jdari}, with rafsi {jar}, glossing to "firm": x1 is
firm/hard/resistant/unyielding to force x2 under conditions x3. .. 11
more results
<chrisdone> more
<lojbot> gismu {jdice}, with rafsi {jdi}, glossing to "decide": x1
(person) decides/makes decision x2 (du'u) about matter x3
(event/state). .. 10 more results
<chrisdone> more
<lojbot> gismu {jdika}, glossing to "decrease": x1 (experiencer)
decreases/contracts/is reduced/diminished in property/quantity x2 by
amount x3. .. 9 more results

<chrisdone> v xa*m*
<lojbot> cmavo cluster {xamoi}, of selma'o MOI*, glossing to "is
sixth among": quantified selbri: convert 6 to ordinal selbri; x1 is
sixth among x2 ordered by rule x3. .. 15 more results
<chrisdone> more
<lojbot> gismu {xajmi}, with rafsi {xam}, glossing to "funny": x1 is
funny/comical to x2 in property/aspect x3 (nu/ka); x3 is what is funny
about x1 to x2. .. 14 more results
Next we have the 'rafsi' command. Search gismu/cmavo by their rafsi:
<chrisdone> h r
<lojbot> rafsi (r): find gismu/cmavo with the given rafsi

<chrisdone> r loj ban
<lojbot> gismu {logji}, with rafsi {loj}, glossing to "logic": x1
[rules/methods] is a logic for deducing/concluding/inferring/reasoning
to/about x2 (du'u).
<lojbot> gismu {bangu}, with rafsi {ban, bau}, glossing to
"language": x1 is a/the language/dialect used by x2 to
express/communicate x3 (si'o/du'u, not quote).

<chrisdone> r sel
<lojbot> cmavo {se}, of selma'o SE, with rafsi {sel}, glossing to
"2nd conversion": 2nd conversion; switch 1st/2nd places.
Similar to the valsi command, it can display more than one result at a time because it is clear that you want this. Wild cards also work with rafsi, but I have yet to see why this is useful. I will leave it in, though.
Gloss are a very helpful way to search lojban vocabulary:
<chrisdone> h g
<lojbot> gloss (g): find valsi with the given gloss

<chrisdone> g cat
<lojbot> gismu {mlatu}, with rafsi {lat}, glossing to "cat": x1 is a
cat/[puss/pussy/kitten] [feline animal] of species/breed x2;
(adjective:) x1 is feline. .. 32 more results

<chrisdone> g speaking
<lojbot> cmavo {sa'e}, of selma'o UI3, glossing to "precisely
speaking": discursive: precisely speaking - loosely speaking. .. 2
more results

<chrisdone> g love
<lojbot> cmavo {iu}, of selma'o UI1, glossing to "love": attitudinal:
love - no love lost - hatred. .. 8 more results
<chrisdone> more
<lojbot> lujvo {pampe'o}, with rafsi {pam, pe'o}, selrafsi {prami,
pendo} (friend), glossing to "boyfriend", "lover", "girlfriend":
pr1=pe1 is a lover of pr2=pe2. .. 7 more results
<chrisdone> more
<lojbot> lujvo {pamsrasu}, with rafsi {pam, srasu}, selrafsi {prami,
srasu} (grass), glossing to "lovegrass": x1 is lovegrass of
species/variety x2. .. 6 more results
I have demonstrated the more command a few times. It simply displays the next result. It used to display three or four results at a time. But it's usually the next result you want. And if you're listing more than a few results, you should use a proper mass lookup tool. Also, lojbot waits a few seconds every three messages to stop it flooding the IRC server. Keep that in mind.
Indeed, wildcards work with the gloss, too:
<chrisdone> g *plant*
<lojbot> fu'ivla {akmela}, glossing to "toothache plant",
"paracress", "spotflower": x1 is a toothache
plant/spotflower/paracress of species/variety x2. .. 7 more results
<chrisdone> more
<lojbot> lujvo {derba'o}, with rafsi {der, ba'o}, selrafsi {dertu,
banro} (dirt, grow), glossing to "to sprout [of plant]", "sprouting
object", "sprout": b1 initially grows b2 beyond the soil; b1 sprouts
b2 from the ground. .. 6 more results
The definition search is nice. It will try to give you the best matching result first. A good match is an exact word:
<chrisdone> h d
<lojbot> definition (d): search for valsi(s) by definition

<chrisdone> d dances
<lojbot> gismu {dansu}, glossing to "dance": x1 (individual, mass)
dances to accompaniment/music/rhythm x2.
Above we see that we have matched a single word exactly. And now, we match a word inside the slashes:
<chrisdone> d music
<lojbot> gismu {dansu}, glossing to "dance": x1 (individual, mass)
dances to accompaniment/music/rhythm x2. .. 6 more results
No problem.
Indeed, this command will try the separate words, too, to see what comes up better, which is useful for when you want to search a few ideas:
<chrisdone> d teacher teaches
<lojbot> gismu {ctuca}, with rafsi {ctu}, glossing to "teach": x1
teaches audience x2 ideas/methods/lore x3 (du'u) about subject(s) x4
by method x5 (event).
But what about incomplete words?
Let's say I want all results relevant to teaching, I can use a wild card:
<chrisdone> d *teach*
<lojbot> gismu {ckule}, with rafsi {kul, cu'e}, glossing to "school":
x1 is school/institute/academy at x2 teaching subject(s) x3 to
audien./commun. x4 operated by x5. .. 3 more results
<chrisdone> more
<lojbot> gismu {ctuca}, with rafsi {ctu}, glossing to "teach": x1
teaches audience x2 ideas/methods/lore x3 (du'u) about subject(s) x4
by method x5 (event). .. 2 more results
<chrisdone> more
<lojbot> lujvo {balcu'e}, with rafsi {bal, cu'e}, selrafsi {banli,
ckule} (great, school), glossing to "university", "university site",
"course", "student", "regents": c1 is a university at c2 teaching
subject(s) c3 to audience/community c4 operated by c5. .. 1 more
result
<chrisdone> more
<lojbot> lujvo {ctununta'a}, with rafsi {ctu, nun, ta'a}, selrafsi
{ctuca, nu, tavla} (teach, event abstract, talk), glossing to
"lecture", "lecturer": n1=c5 is a lecture / an event of verbal
teaching by t1=c1 to audience $t_2 = c_2$ about subject $t_3 = c_4$ in
language t4 with facts taught c3.
A nice way to investigate cmavo of a particular selma'o, especially after just viewing a cmavo entry and wondering what else there is in that selma'o, is to use the s (for selma'o) command:
<chrisdone> h s
<lojbot> selma'o (s): list cmavo of a selma'o

<chrisdone> s UI5
<lojbot> {be'u}: lack, {dai}: empathy, {fu'i}: easy, {ga'i}: hauteur,
{ju'o}: certainty, {le'o}: aggressive, {ri'e}: release of emotion,
{se'a}: self-sufficiency, {se'i}: self-oriented, {vu'e}: virtue,
{zo'o}: humorously .. 11 more results
<chrisdone> more
<lojbot> cmavo {be'u}, of selma'o UI5, glossing to "lack":
attitudinal modifier: lack/need - presence/satisfaction -
satiation. .. 10 more results
<chrisdone> more
<lojbot> cmavo {dai}, of selma'o UI5, glossing to "empathy":
attitudinal modifier: marks empathetic use of preceding attitudinal;
shows another's feelings. .. 9 more results
The translation command uses jbofihe:
<chrisdone> h t
<lojbot> translate (t): translate some lojban with jbofihe -x

<chrisdone> t coi ro do
<lojbot> (^coi /greetings/ ro /every/ /(of)/ do /you/^)
<chrisdone> t mi ui nelci la djan
<lojbot> [([nelci1 (like-r(s)):] mi /I, me/ .ui /{happiness..}/)
/[is, does]/ <<nelci /lik-ing/>> ([nelci2 (liked thing(s)):] la / /
djan. /[NAME]/)]
<chrisdone> t mi nelci la djan iu ru'e
<lojbot> [([nelci1 (like-r(s)):] mi /I, me/) /[is, does]/ <<nelci
/lik-ing/>> ([nelci2 (liked thing(s)):] la / / djan. /[NAME]/ .iu ru'e
/{weak love..}/)]
Unfortunately it doesn't yet give informative error messages:
<chrisdone> t do mo do mo
<lojbot> parse error
But I will do that some time.
The selrafsi command gives you the lujvo which the given selrafsi is in:
<chrisdone> h sr
<lojbot> selrafsi (sr): find all lujvo with given selrafsi

<chrisdone> sr gleki
<lojbot> bebgei (giddy), gekpanpi (blissful), tcegei (overjoyed) .. 3
more results
<chrisdone> more
<lojbot> lujvo {bebgei}, with rafsi {beb, gei}, selrafsi {bebna,
gleki} (foolish, happy), glossing to "giddy": g1 is giddy about
g2. .. 2 more results
<chrisdone> more
<lojbot> lujvo {gekpanpi}, with rafsi {gek, panpi}, selrafsi {gleki,
panpi} (happy, peace), glossing to "blissful": g1=p1 is blissful about
g2=p2 (event/state). .. 1 more result
<chrisdone> more
<lojbot> lujvo {tcegei}, with rafsi {tce, gei}, selrafsi {mutce,
gleki} (much, happy), glossing to "overjoyed", "delighted",
"ecstatic", "delightful": g1 is delighted/ecstatic/overjoyed about g2
(event/state).

<chrisdone> sr gletu
<lojbot> cpanygle (ride), datpavycinglepre (straight), glebre (heat),
glefi'a (pornography), glepe'o (sexual partner), gleve'u (prostitute),
mitcinglepre (MOTSS), mitpavycinglepre (homo), nu'ogle (virgin),
pavycinglepre (non-bisexual), relcinglepre (bisexual) .. 11 more
results
The cll command looks up, using Google, results from The Lojban Reference Grammar:
<chrisdone> h cll
<lojbot> cll: lookup something in the lojban reference grammar

<chrisdone> cll ko'a
<lojbot> http://jbotcan.org/cllc/c7/s5.html : Brevity Is The Soul Of
Language: Pro-sumti And Pro-bridi - The ...There are ten cmavo in the
ko'a-series, and they may be assigned freely to any sumti
whatsoever. The English word ``he'' can refer only to males, ... .. 9
more results
valid simply tells us that some grammar is valid.
<chrisdone> h c
<lojbot> correct (c): camxes test for validity

<lojbot> valid: coi ro do
<chrisdone> correct coi ro d
<lojbot> not valid: coi ro d
And finally, lujvo makes a lujvo for you:
<chrisdone> h l
<lojbot> lujvo (l): construct lujvos from selrafsis and rate them

<chrisdone> lujvo xamgu dansu
<lojbot> xagdansu (7937), xaurdansu (9006)
Ah, I forgot the search command. It is a combination of the other look up commands, showing results of the highest priority first. It goes:
valsi > rafsi > gloss > definition
Thus, {bau} matches a rafsi, which means no valsi entries matched:
<chrisdone> q bau
<lojbot> cmavo {bau}, of selma'o BAI, glossing to "in language":
bangu modal, 1st place in language ... .. 1 more result
{bridi} matches a valsi, and 62 more results, meaning other things like gloss and definition results matched against "bridi", but they are less precise, and so are listed later:
<chrisdone> q bridi
<lojbot> gismu {bridi}, with rafsi {bri}, glossing to "predicate": x1
(text) is a predicate relationship with relation x2 among arguments
(sequence/set) x3. .. 62 more results
<chrisdone> more
<lojbot> lujvo {brirebla}, with rafsi {bri, rebla}, selrafsi {bridi,
rebla} (predicate, tail), glossing to "bridi-tail": x1 is a bridi-tail
of predicate relationship x2 with relation x3 among arguments
(sequence/set) x4 . .. 61 more results
Here we have a gloss match, meaning no valsi or rafsi entries matched.
<chrisdone> q precisely speaking
<lojbot> cmavo {sa'e}, of selma'o UI3, glossing to "precisely
speaking": discursive: precisely speaking - loosely speaking. .. 4
more results
And here a definition match, because "dances" doesn't match any of the valsi, rafsi or gloss entries.
<chrisdone> q dances
<lojbot> gismu {dansu}, glossing to "dance": x1 (individual, mass)
dances to accompaniment/music/rhythm x2.
I might recommend this to newbies, or people who prefer to stick to one mildly clever command than use different ones. But you get more power using the individual ones; so I'd only recommend this for valsi/rafsi/gloss lookup of singular words.
Mlismu
Lojbot has a mlismu³ feature, invoked by COI or DOI cmavo⁴, which makes grammatically and ontologically valid utterances, which may or may not have relevance to what was said to lojbot. Mlismu generates these utterances using fatci (facts), which are very simple statements such as that “all humans are persons”.
<chrisdone> coi lojbot pei
<lojbot> ckule lo va'i te nicte .o'e be le se murse be'o fo mi le
prenu
Some amusing instances of lojbot's bantering:
Discussing bananas and religion:
19:46 < chrisdone> ma grute doi lojbot
19:46 < lojbot> re'e lo se lebna be mi'o bei ko be'o .u'a fo'o cu grute
19:46 < chrisdone> zo re'e u'i
19:48 < timonator> lijda la lojbot lo nu se lebna fi lo badna
Roughly translated as:
19:46 < chrisdone> What is a fruit, O, Lojbot?
19:46 < lojbot> [Religious feeling] YOU, be the one whose fruit we
sieze! [Gain]
19:46 < chrisdone> ‘Religious’, haha.
19:48 < timonator> Lojbot believes in a religion with practices of
taking bananas from people.
(Go to the xamselsku page)
19:32 < timonator> doi lojbot gletu pei
19:32 < lojbot> gletu fa ko le danlu
19:32 < timonator> oi
19:33 <@Broca> doi lojbot do mutce tolclite
19:33 < lojbot> lo penmi va'i cu mrobi'o
Rough translation:
19:32 < timonator> O, Lojbot, fucking; how do you feel about it?
19:32 < lojbot> Go fuck an animal.
19:32 < timonator> Hey! >:-(
19:33 <@Broca> O, Lojbot, you are very rude.
19:33 < lojbot> The encounterer, in other words, dies.
(Go to the xamselsku page)
It's surprisingly coherent, sometimes.
Source
Lojbot is written in Haskell; you can view its complete source code or pull it with git:
git clone git://github.com/chrisdone/lojbot.git
Okay, so I said I wouldn't discuss Haskell, but I wrote a neat wild card library which Lojbot uses for the above described wild card stuff. It's pretty concise, I think. The pattern matching makes it easy to analyse the state of the function. I'm pretty sure it's essentially a state machine. I've never written a wild card library. A regular expression library would be cool to write some time. I believe there are exceedingly fast algorithms for doing wild cards which I have not used in this simple code. Maybe I'll upgrade it some time and blog about it.
Bugs / feedback
Please add any ideas/bugs/feedback you want to the lojbot lighthouse page. That way, I will actually get them done if they're recorded somewhere.
Please join the #lojbot channel to discuss and test the bot.
We typically use { and } to quote Lojban text, in English. ↩
“An IRC bot is a set of scripts or an independent program that connects to Internet Relay Chat as a client, and so appears to other IRC users as another user. It differs from a regular client in that instead of providing interactive access to IRC for a human user, it performs automated functions.” — IRC bot at Wikipedia ↩
Mlismu is a program written by Adam Lopresto which “[Generates] semantically near-valid grammatical lojban nonsense.” — mlismu's web page ↩
That is, words which are vocatives, such as {doi djan}, {ki'e djan}, {je'e djan}, etc. for "O, John", "Thanks, John", "Roger, John", etc. ↩
by chrisdone@gmail.com at December 18, 2008 12:00 AM

December 17, 2008

John Goerzen (CosmicRay)

More Wiki Annoyances

So today, I discovered that MoinMoin has an “all or nothing” attachment setting: either everyone with write permission to a page can upload attachments, or uploading attachments is disabled for the entire site. No exceptions. Period.

Not only that, but there is no maximum attachment size setting — unless the attachment was uploaded embedded in a ZIP file. How’s that for irony?

Not wanting to have my railroad site turn into a file trading site, I don’t really want to let everyone upload attachments.

Oh, also MoinMoin doesn’t maintain a history of attachments. So if somebody drives by and vandalizes an attachment, you get to…. restore the original version from tape. Yay?

So I decided I’d look back at MediaWiki, which has better attachment controls.

I still had my test installation, so I went to use it. I edited the main page. I wanted to read about the markup, so I clicked the “Editing help” link. Whoops, broken link. It links to Help:Editing on the local wiki. Which MediaWiki does not install for you. I asked about it on #mediawiki. The answer was: copy from mediawiki.org. OK, fine. How? “Cut and paste.” Yes, that’s right. Every time a new version of MediaWiki comes out, you get to cut and paste dozens of pages from mediawiki.org to your wiki, or else you’ll have outdated help. Yay?

The MediaWiki folks on IRC seem to like it this way. “Not everyone wants the same help.” Fine, but provide a sane default for those that don’t care.

Who thought running a wiki would be so hard?

Update: since yesterday, I went to moinmo.in and fixed the ThemeMarket page to reflect what versions a MoinMoin theme works with. I’m happy to help out with fixing Free Software — though I don’t really have time to add fundamental features like working syntax help links on this particular project.

by John Goerzen at December 17, 2008 11:53 PM

Brent Yorgey

QuickCheck rocks my socks

Over the past few days I’ve been hacking on a Data.List.Split module, to be used when you just want to quickly split up a list without going to the trouble of using a real parsing or regular expression library — for example, suppose you are writing a one-off script that needs to read in strings like “abc;def;gh;i” and you want to split it on the semicolons to yield a list of Strings. Of course, such a thing isn’t in the standard libraries since no one can agree on the right interface; the idea is to provide a whole module with lots of different ways to split instead of a single function, and to just put it on Hackage instead of going through the much more difficult process of getting it included in the standard libraries. Anyway, more on that when it’s released, hopefully in a few days.

Like any good Haskell programmer writing a nice pure library, today I started adding a suite of QuickCheck properties. I set up a framework and added a couple basic properties: 200 tests passed! I added another: 100 tests passed! Now I was on a roll, and added three more. This time… it hung after checking the fourth property only 8 times. OK, no problem, I’ve seen this sort of thing before when there’s some sort of combinatorial explosion in the size of the randomly generated test data… except the test data is so simple that’s definitely not what’s happening here. Hmm… maybe it’s infinite recursion? But I really can’t see where infinite recursion could crop up. Oh, unless… hmm, yes, if function A ever returns an empty list, it would cause an infinite loop. But surely function A can never return an empty list! Well, let’s try it. prop_A_nonempty x = (not . null) (funcA x). And… Falsified! Whoops.

If you’re curious, the case I forgot was when you specify the empty list as a delimiter — obvious in retrospect, perhaps, but without QuickCheck’s assistance, I probably would have ended up releasing the library with this latent infinite recursion bug!

by Brent at December 17, 2008 04:01 AM

December 16, 2008

John Goerzen (CosmicRay)

Wiki Software

I used to run a website for traveling by rail in the United States. I let it falter, and eventually took it down. But I still have the domain, and am working to bring it back as a wiki.

The first step in that process was selecting which wiki software to use. I have a few requirements for the site:

Availability of both WYSIWYG (friendly for beginners) and non-WYSIWYG editors

A number of nice-looking themes to choose from

Nice to have: a hierarchy or category system

The ability to search within only a particular section or category in the hierarchy

Easy to maintain software; not having tons of plugins to keep up-to-date for security

Stellar spam prevention

Nice to have: ability to redirect people to the new page after a rename

I’m frustrated that there is no wiki out there that does all of these. There are quite a few that do all but one, but which one they omit varies.

My two finalists were MoinMoin and MediaWiki.

MoinMoin

MoinMoin will let you easily define arbitrary categories (by creating a wiki page following a certain name). The search screen automatically presents checkboxes for restricting searches to a particular category. Some reviews have complained about its anti-spam features, but they are all talking about older versions and they seem to have done some work on this lately.

MoinMoin has tons of features and is easy to set up and maintain. But here’s where it falls down: themes.

Over at moinmo.in, there is a “theme market” for themes. Only most of the themes there haven’t been compatible with the current MoinMoin release since about 2005. Most of the rest have one download, then a long discussion page full of mixed bug reports, diffs, and non-diff “edit this to make it work” comments. Most of these don’t state what version of the theme they apply to. Most themes won’t work with current browsers and Moin releases without them. UGH. After discussing on #moin, I’ll probably go in there and at least organize the ThemeMarket page by release.

MediaWiki

Then there’s MediaWiki. It’s got a lot of features, and a lot of complexity. It has no current WYSIWYG feature, though apparently there is work on one.

MediaWiki has an amazing category system. It can generate sorted lists of pages in a category, supports subcategories, etc. Surprisingly, though, you can’t search in just one category. (Though it might be possible indirectly via some syntax; not sure.)

Searching in MediaWiki overall is less capable that in MoinMoin.

MediaWiki does offer namespaces, and namespaces are the sole way of searching just one part of a site. They’re used well over at, say, uesp.net. But namespaces are heavy-handed. You have to edit config files to define them, and they bring with them other associated namespaces for discussion and whatnot. It’s not as easy as creating a category in MoinMoin, and might not scale well to lots of future categories.

MediaWiki does appear to have good spam prevention, and support recaptcha.

Conclusions

I eventually selected MoinMoin and have set up most of my content in it. But now that I am to the point of selecting themes, I’m having some second thoughts.

I also looked at DokuWiki. Its design makes me nervous. The user list is stored in a single file. You can’t search by category. You can search by namespace, but there aren’t checkboxes for it in the search screen; you have to know the syntax. WYSIWYG is a plugin. Categories are a plugin. So — too many plugins to maintain, and no real features above MoinMoin.

by John Goerzen at December 16, 2008 11:52 PM

Bryan O'Sullivan

The 12 Days of EFF

Via the ever industrious Danny O’Brien, we bring to you the Electronic Frontier Foundation’s end-of-year animation.

Learn more about this video and support EFF!

by Bryan O'Sullivan (bos@serpentine.com) at December 16, 2008 09:58 PM

John Goerzen (CosmicRay)

Administering Dozens of Debian Servers

At work, we have quite a few Debian servers. We have a few physical machines, then a number of virtual machines running under Xen. These servers are split up mainly along task-oriented lines: DNS server, LDAP server, file server, print server, mail server, several web app servers, ERP system, and the like.

In the past, we had fewer virtual instances and combined more services into a single OS install. This led to some difficulties, especially with upgrades. If we wanted to upgrade the OS for, say, the file server, we’d have to upgrade the web apps and test them along with it at the same time. This was not a terribly sustainable approach, hence the heavier reliance on smaller virtual environments.

All these virtual environments have led to their own issues. One of them is getting security patches installed. At present, that’s a mainly manual task. In the past, I used cron-apt a bit, but it seemed to be rather fragile. I’m wondering what people are using to get security updates onto servers in an automated fashion these days.

The other issue is managing the configuration of these things. We have some bits of configuration that are pretty similar between servers — the mail system setup, for instance. Most of them are just simple SMTP clients that need to be able to send out cron reports and the like. We had tried using cfengine2 for this, but it didn’t work out well. I don’t know if it was our approach or not, but we found that hacking cfengine2 after making changes on systems was too time-consuming, and so that task slipped and eventually cfengine2 wasn’t doing what it should anymore. And that even with taking advantage of it being able to do things like put the local hostname in the right places.

I’ve thought a bit about perhaps providing some locally-built packages that establish these config files, or load them up with our defaults. That approach has worked out well for me before, though it also means that pushing out changes isn’t a simple hack of a config file somewhere anymore.

It seems like a lot of the cfengine2/bcfg tools are designed for environments where servers are more homogenous than ours. bcfg2, in particular, goes down that road; it makes it difficult to be able to log on to a web server, apt-get install a few PHP modules that we need for a random app, and just proceed.

Any suggestions?

by John Goerzen at December 16, 2008 02:34 PM

Alson Kemp

Thinking About Haskell*: You Know Lazy Evaluation; You Just Don’t Know It
[Post updated to reflect comments. ...too much late night typing...] Lazy evaluation is a very novel aspect of Haskell. Turns out that it’s not that difficult to think about.

A very common example of lazy-ish evaluation is ‘&&’ operators used in lots of languages (using C#):
if ( (obj != null) && (obj.someMethod() == somethingElse) ) {
  // do something
}
obj.someMethod isn’t evaluated unless obj != null. So &&, a C# function/operator everyone knows and loves, is strict in its first parameter and lazy in its second. The previous code might act a little like the following:
if (obj != null){
  if (obj.someMethod() == somethingElse) ) {
    // do something
  }
}
If && were strict in both parameters, then it’d throw errors when the second parameter was unconditionally evaluated (e.g. null.someMethod()). If _&& weren’t lazy, it might act something like:
condition1 = (obj != null);
condition2 = (obj.someMethod() == somethingElse);
if (condition1 && condition2) {
    // do something
  }
}
Clearly, condition2 would throw an Exception when obj was null, so the if statement must not be evaluating the second parameter. It’s as if it were lazy…

Here’s the (slightly revised for clarity) definition of && from Haskell:
(&&)                    :: Bool -> Bool -> Bool
(&&) True x              =  x
(&&) False _             =  False
Just as with C#, the first parameter (x) must be evaluated, but the second parameter is only evaluated if the first parameter is True. && is strict in the first parameter and lazy in the second.

And That’s All There Is to Laziness?

Nope. It’s “turtles all the way down”.

Laziness is a defining feature of Haskell and a feature that separates Haskell from the vast majority of languages. Lazy evaluation isn’t confined to a few operators in Haskell; lazy evaluation starts at the first function/expression in your program and continues all the way down.

For more reading, check out here, here, here and here.

But I Don’t Want To Be Lazy

[Example fixed.] Then force evaluation of your parameters. How? One way is to use the $! function to force the evaluation of parameters.
f a b c = ((someFunction $! a) $! b) $! c
See here for more.

* These little bits have helped me think about Haskell. Maybe they’ll be useful for you, too.
by alson at December 16, 2008 05:33 AM

Manuel M T Chakravarty

ICFP calls for functional programming Experience Reports.

ICFP calls for functional programming Experience Reports.: If you have used functional programming in a real world application, we want to hear about it!

December 16, 2008 01:22 AM

December 15, 2008

"FP Lunch"

Update on PiSigma

Last Friday I gave an update on $\Pi\Sigma$ a dependently typed core language which Nicolas Oury and I are working on. We wrote a paper (you can also play with it) earlier this year which didn’t make it into ICFP and have revised the language definition considerably. Basically we got rid of some features (namely constraints) which were - in our view - to clever for a core language.

$\Pi\Sigma$ is a core language, hence the emphasis is on small leaving out lots of syntactic sugar which makes languages like Coq’s CIC, Agda, Epigram or Cayenne actually usable. The idea is that these features can be recovered by a syntactic translation (or elaboration as we say) as part of the compiler. Having a small core langauge is nice for compiler writers but also for theoretical analysis. The aim here is similar as Fc for Haskell but directed at dependent functional programming.

Unlike most of the DTP languages mentioned above, with the exception of Cayenne, $\Pi\Sigma$ is partial, i.e. programs may fail to terminate. On the other hand it is full blown, which means that type-checking is partial too. We think this isn’t such a big issue for programming if the only reason for the type checker to diverge is that you have anon-terminating expression within a type. On the other hand it is an issue for using $\Pi\Sigma$ for verification and it also rules out important optimisations. Hence we plan to define a total fragment but we think it is a good idea to specify the partial language first. Indeed, even when using Agda (or Epigram) we find it often more convenient to start developing in a non-total langauge and address the totality issue later (or never, depending on demand).

Here is a summary of the essential features of $\Pi\Sigma$

Very few type formers : $\Pi$ types, $\Sigma$ -types , Type:Type, finite types (enumerations) as sets of labels, lifted types for recursion and coinduction.

Depedently typed pattern matching can be derived using the primitive eliminator and equality

Full recursion via let rec (mutual) , used to define recursive types (like Nat) and recursive values. Recursion is controlled via lifted types and boxes.

Inductive families can be encoded using an extensional equality

Coinductive types and families can be encoded using lifted types, overcoming some issues present in Agda and CIC

Currently we are working on a new implementation and on a revised language definition. I’ll put this up as soon as it is in a reasonable state.

by Thorsten at December 15, 2008 05:11 PM

Manuel M T Chakravarty

Advance program of "DAMP 2009: Workshop on Declarative Aspects of Multicore Programming".

Advance program of "DAMP 2009: Workshop on Declarative Aspects of Multicore Programming".

December 15, 2008 10:25 AM

Ivan Lazar Miljenovic

RWH Arrives Down Under!

As far as I know, no-one has as yet reported that Real World Haskell has hit the shores of Australia. Well, I’d like to rectify that situation now that I just received my copy (which whilst later than people in the Northern Hemisphere and the Americas have reported, is a day earlier than the [...]

by ivanmiljenovic at December 15, 2008 09:30 AM

André Pang (ozone)

LittleSnapper and Mac Development Talky Talk

Four little announcements, all of them Mac-related:

First, myself and my comrades at Realmac Software are very proud to announce the release of LittleSnapper 1.0, our swiss-army-knife picture, screenshot and website organisation utility thingamijiggo. We’ve all worked hard on this for the past few months and sweated over a ton of details to try to make it a polished user experience and be a joy to use; we hope you agree. (You would not believe how long we spent figuring out how the blur and highlighting tools should work before they became their final incarnations, or how much pain was involved when we decided to add FTP and SFTP¹ support late in the development cycle.) If you’re a Mac user, give it a whirl; it’s a hard program to describe because it has a lot of different workflows, but between the quick annotation tools, easy Web sharing with QuickSnapper/Flickr/SFTP¹, website DOM snapping, and the iPhoto-like forget-about-what-folder-you-need-to-put-your-picture-in snapshot management, I’m sure you’ll find something useful for you in there. Hopefully our hard work can make life just a little easier for you!

¹ FTP must die.

I blogged earlier that I was speaking at MacDev 2009 in April, but didn’t mention exactly what I was talking about. Well, the talk abstract’s up now:

One reason for Mac OS X’s success is Objective-C, combining the dynamism of a scripting language with the performance of a compiled language. However, how does Objective-C work its magic and what principles is it based upon? In this session, we explore the inner workings of the Objective-C runtime, and see how a little knowledge about programming language foundations—such as lambda calculus and type theory—can go a long way to tackling difficult topics in Cocoa such as error handling and concurrency. We’ll cover a broad range of areas such as garbage collection, blocks, and data structure design, with a focus on practical tips and techniques that can immediately improve your own code’s quality and maintainability.

So, two sections: first, low-level hackery of the Objective-C runtime. Second, a different kind of low-level hackery, and one that’s arguably far more important: understanding the essence of computation and programming languages, and why I fell in love with both Haskell & Objective-C, two languages at completely opposite ends of the planet.

I’d like to point out that while the MacDev registration fee seems fairly expensive at £399, keep in mind that covers your accommodation and also meals, which easily covers £100-£150. Scotty’s done a lot of organising so that you don’t have to. There’s also a Christmas special on at the moment where a few applications are included in the registration price; check the MacDev 2009 website for details.

If you’re an imsoniac and are having trouble sleeping, you’ll hopefully enjoy a recent Late Night Cocoa episode where I talk to Scotty about Garbage Collection. (Actually, you probably won’t enjoy it so much after you find out exactly how -retain & -release are implemented under-the-hood. The words CFBag and “lock” should hopefully scare you enough.) It’s a bit of a long episode at over an hour and a quarter long, but next time I’ll say “um” a bit less which should shorten it to about half an hour. Have fun. And use GC! (LittleSnapper and RapidWeaver both aren’t GC yet, but you bet your ass they will be for the next major versions.)

I’ve had a pretty long exodus away from the fp-syd user group since I was off getting drunk overseas for about four months. That, of course, meant that somehow my brain was rather misplaced when I arrived back in Sydney, so I decided to give a talk at fp-syd upon my return… on the same day that LittleSnapper 1.0 was due to be released, leaving pretty much no margin for error. Oops. I’ll glad to say that the gusto prevailed, and that both the talk seemed to go OK (well, I wasn’t booed off the stage anyway), and LittleSnapper was released on time. (Just; thanks Alan and Danny!) My talk there was similar to the one I gave at Galois in Portland earlier this year: a whirlwind tour of the Objective-C programming language and Mac OS X technologies for a functional programming audience. In particular:

basics of the runtime system,

higher-order messaging and its analogy to higher-order functions in functional languages,

some details on the engineering marvel that is the Objective-C garbage collector, and

(updated!) information on Blocks, LLVM and Clang, and a wee tiny bit of info on Grand Central Dispatch and OpenCL.

I’ve updated the talk with a few extra slides, since Apple have made a little more information to the public now. (In particular, brief information on Blocks, Grand Central Dispatch and OpenCL.) Enjoy all!

by ozone@algorithm.com.au at December 15, 2008 08:23 AM

Interview with Marshall Kirk McKusick

A website named Neat Little Mac Apps is not the kind of place you’d expect to find an interview with a operating systems and filesystems hacker. Nevertheless, one of their podcasts was just that: an interview with UNIX and BSD legend Marshall Kirk McKusick. (He has his own Wikipedia page; he must be famous!)

There’s some great stuff in there, including the origin of the BSD daemon (Pixar, would you believe? Or, well, Lucasarts at the time…), and a great story about how a bug was introduced into the 4.2 BSD version of the pervasive UNIX diff utility. Marshall’s full of energy, and it’s a great interview; it’s a little amusing to see the stark contrast between the interviewer and McKusick, both of whom have rather different definitions of what constitutes an operating system.

by ozone@algorithm.com.au at December 15, 2008 08:00 AM

December 14, 2008

Clifford Beshers

The Pattern of Walking

These images may look like Christmas lights, but they were created simply by holding a camera on long exposure while walking on the beach at dusk. If you just wave the camera around, the patterns are uninteresting, because they have little cohesion, but just walking straight created unexpectedly interesting results.

They remind me of high school physics, when we learned about cycloids by attaching a light to a bicycle tire. The walking images are more regular than I expected, but also far more convoluted than a cycloid. Just as the hidden motion of the tire was revealed in that experiment, these images must show something fundamental about walking. I'm just not sure what.

by Clifford Beshers (noreply@blogger.com) at December 14, 2008 07:59 PM

Jamie Brandon

Zombified GMap

This summer I participated in the google summer of code, working on a haskell library for fast, composable maps. This never materialised and is generally assumed to be dead.

What happened? Myself and Adrian Hey put a lot of work into this. The code as is stands is just about usable and is in places very fast (mostly in the places that Adrian was responsible for :-) ). When the summer ended I wanted to tidy up some ugliness in the api (which was causing *loads* of silly code duplification, and was my fault) before releasing the first version. [EDIT: duplification is not a real word]

Then I moved house, started at a new university and generally did lots of time consuming things that caused this release to be put off. I'm an incredibly proficient procastinator so this level of work completely stopped me from doing the 2-3 days worth of work that was needed to bring GMap to life.

It would be a crying shame to let it die after so much work, so I have a plan. The code is going up on hackage, tonight, in all of its ugly glory. Between the 6th and 16th January I have absolutely no work, no commitments and no excuse for leaving my desk. Hopefully the sheer embarassment of having my half-finished code on display will drive me to fix it.

So now I've made a public commitment to fixing this sorry state of affairs and have no way to back out. Fingers crossed.

[EDIT: Huzzah]

by jamiiecb (noreply@blogger.com) at December 14, 2008 05:38 PM

Jeremy Shaw

Data Migration with HApps-Data
HAppS applications, like any application with persistent data storage, are faced with the issue of migrating existing data when the format of the persistent data is changed. This tutorial will explore the binary serialization and migration facilities provided by HAppS-Data. If you think I am doing it all wrong, please let me know. Writing this tutorial is the extent of my experience using the HApps-Data migration facilities.

Requirements

This tutorial only uses the HAppS-Data (and dependencies) portion of HAppS. It has been tested with HAppS-Data 0.9.3. The first three lines of the module look like this:
> {-# LANGUAGE TemplateHaskell, UndecidableInstances, FlexibleInstances, GeneralizedNewtypeDeriving, MultiParamTypeClasses, DeriveDataTypeable, TypeFamilies #-}
> module Main where
> import HAppS.Data
Serialization

The most obvious way to serialize data in Haskell is to use the familiar Read and Show classes. Simply use show to turn a value into a String, and read to turn a String back into a value. This method has three serious flaws however:

The law read . show == id does not hold for all Show/Read instances.

The serialized representation is rather verbose

No migration path when types change, leaving your old data inaccessible

HAppS-Data provides a Serialize class which addresses these three issues. From an end user point of view the Serialize functionality provides three items of interest:
The Serialize class

the serialize and deserialize functions

The deriveSerialize function
> class (Typeable a, Version a) => Serialize a where 
>   ...
>
> serialize :: Serialize a => a -> Lazy.ByteString
> deserialize :: Serialize a => Lazy.ByteString -> (a, Lazy.ByteString)
>
> deriveSerialize :: Language.Haskell.TH.Syntax.Name
>                  -> Language.Haskell.TH.Syntax.Q [Language.Haskell.TH.Syntax.Dec]
The Version superclass is used during data migration. The serialize and deserialize functions are the counterparts to show and read. deriveSerialize is a Template Haskell function which provides functionality similar to deriving (Read, Show).

The Version class

The Version class is very straight-forward. It consists of a single function which returns the Mode (aka, the version) of a datatype.
>
> class Version a where
>     mode :: Mode a
>     mode = Versioned 0 Nothing
>
> data Mode a = Primitive -- ^ Data layout won't change. Used for types like Int and Char.
>             | Versioned (VersionId a) (Maybe (Previous a))
>
> newtype VersionId a = VersionId {unVersion :: Int} deriving (Num,Read,Show,Eq)
There are two categories of datatypes:

primitives whose layout will never change, and, hence, will never need to be migrated

everything else

The Versioned constructor takes two arguments. The first argument is a version number which you increment when you make an change to the data-type. The second argument is an indicator of the previous version of the data-type. The exact details are covered in the next section.
Putting it all together

Let's say we have the following types:
>
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data Foo 
>          = Bar String
>          | Baz Beep
>            
>      data Beep 
>          = Beep
>    |])
The deriveAll template haskell function is similar to the normal haskell deriving clause, except it also has the ability to derive Default instances. Additionally, it always derives Typeable and Data instances even though they are not explicitly listed.

To make the types serializeable we first need to create Version instances.
> instance Version Beep where
>     mode = Versioned 0 Nothing
>
> instance Version Foo where
>     mode = Versioned 0 Nothing
We want to indicate that Beep and Foo are non-primative types, so we use the Versioned constructor. Next we specify a version number for the type. It could be anything, but 0 is the most sensible choice. Since there are now previous versions of these types we mark the previous type as Nothing.
For all non-primitive types the initial version of Versioned 0 Nothing is sensible. So the Version class provides it as a default value for mode:
> class Version a where
>    mode :: Mode a
>    mode = Versioned 0 Nothing
Hence, we could shorten our Version instances from above to:
> instance Version Beep
> instance Version Foo
Next we derive Serialize instances for our types:
> $(deriveSerialize ''Beep)
> $(deriveSerialize ''Foo)
Now we can use serialize to serialize values. Let's look at the output of serialize Beep
*Main> Data.ByteString.Lazy.unpack $ serialize Beep
[0,0,0,0,0,0,0,0,0]
*Main>
We see that Beep serializes to 9 bytes. The first 8 bytes represent the VersionId. VersionId is basically an Int, and the serialization code always treats Ints as a 64-bit values to avoid cross-platform issues. The final byte indicates which constructor of Beep was used. In this case the zeroth constructor was used.

At first it may seem like we don't have enough information here to deserialize the data, after all there are no type names, constructors, etc. But deserializing these bytes is no different than doing read "1" :: Int. Because we know the type of the value we want to be reading at compile time, we do not need to record that information in the stored data. We just do:
*Main> deserialize (serialize Beep) :: (Beep,ByteString)
(Beep,Empty)
*Main> 
As a side note, Strings are serialized to a very compact representation. In fact, they are stored as compactly as ByteStrings because they are first converted to a ByteString.
*Main> Data.ByteString.Lazy.unpack $ serialize "hello"
[0,0,0,0,0,0,0,5,104,101,108,108,111]
*Main> Data.ByteString.Lazy.unpack $ serialize (Data.ByteString.Lazy.Char8.pack "hello")
[0,0,0,0,0,0,0,5,104,101,108,108,111]
*Main> 
The first 8 bytes are the length of the String, and the remaining bytes are the utf-8 encoded characters of the String.

So, if you application is best served by using Strings instead of ByteStrings, you do not have to take an extra steps to ensure that the serialized data is compactly represented.

Simple Migration

Let's say we want to add another constructor to the Beep type. As a first pass, we will actually create a whole new type named Beep', which is similar to the old type, but has an additional constructor BeepBeep.
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data Beep' = BeepBeep' | Beep'
>    |])
>
> $(deriveSerialize ''Beep')
Because we are extending a previous type, our Version instance will look a bit different:
> instance Version Beep' where
>     mode = extension 1 (Proxy :: Proxy Beep)
This indicates that we are extending the old type Beep. The new version number must be higher than the old version, but does not have to be strictly sequential.

Because we specified that this type is a newer version of an older type, we also need to tell HAppS how to migrate the old data to the new type. To do this, we simply create an instance of the Migrate class.
> class Migrate a b where
>    migrate :: a -> b
The Migrate class is quite simple, it contains a single function, migrate which migrates something of type a to type b. In our current example, all we need is:
> instance Migrate Beep Beep' where
>     migrate Beep = Beep'
We can demonstrate migration by serializing a value of type Beep and deserializing it as type Beep'. The migration happens automatically in the deserialize function.
*Main> fst $ deserialize (serialize Beep) :: Beep'
Beep'
*Main>
When deserialize tries to deserialize the data produced by serialize Beep, it will first check the version number. When it sees that the version number in the stored data is lower than the version number of the current type it will instead try to decode it as the type you specified as the previous version. If the version associated with the previous type is still higher than the value in the serialized data, the migration code will recurse until it finds a matching version number. Once it finds a matching version number, it will call the corresponding deserialization "instance" to decode the old data. Then as the recursion unwinds, it will apply the migrate function to migrate the data to newer and newer formats until it reaches the newest format.

Managing History

A big issue in the above example is that when we added the new constructor we also changed the name of the type and its existing constructors. That is not very convenient in a real application where you have a multitude of references to the old names.

Fortunately, we do not have to change the name of the type to add a new constructor. As we saw in the beginning, the name of the type and the names of the constructors are not actually stored in the serialized data. So, instead we can change the name of the old type from Beep to OldBeep and update its constructor as well.
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data OldBeep = OldBeep
>   |])
>
> $(deriveSerialize ''OldBeep)
> instance Version OldBeep
Because OldBeep and Beep have the same shape, they will serialize to the same byte sequence:
*Main> Data.ByteString.Lazy.unpack $ serialize OldBeep
[0,0,0,0,0,0,0,0,0]
*Main> Data.ByteString.Lazy.unpack $ serialize Beep
[0,0,0,0,0,0,0,0,0]
*Main> 
that means we can serialize an OldBeep value and then deserialize it as a Beep value, like this:
*Main> fst $ deserialize (serialize OldBeep) :: Beep
Beep
*Main> 
Note that this is not the same as migration. Here we are just exploiting the fact that because the type name and constructor names are not encoded in the serialized data we can change those names and still be able to deserialize the data.

Full Migration Example #1

Here is the full example which shows:

Beep renamed to OldBeep

the new Beep with the extra constructor

the migration code from OldBeep to Beep
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data OldBeep 
>          = OldBeep
>   |])
>
> instance Version OldBeep
> $(deriveSerialize ''OldBeep)
>
>
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data OldBeep 
>          = OldBeep
>      data Beep = BeepBeep | Beep
>    |])
>
> instance Version Beep where
>     mode = extension 1 (Proxy :: Proxy OldBeep)
>
> $(deriveSerialize ''Beep)
Using separate files to manage type history

Keeping all the revisions of your type in one file, and changing the name of the type and its constructors every revision is tedious and hard to manage. Instead, we can use a system where we rename the files that contain our types. To start, we will put the types we want to serialize in a separate file (or files), such as Types.lhs.
> {-# LANGUAGE TemplateHaskell, UndecidableInstances, FlexibleInstances, GeneralizedNewtypeDeriving, MultiParamTypeClasses, DeriveDataTypeable, TypeFamilies #-}
> module Types 
>     ( Bar(..)
>     , Foo(..)
>     ) where
> import HAppS.Data
>
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data Foo 
>          = Bar String
>          | Baz Beep
>            
>      data Beep 
>          = Beep
>    |])
>
> instance Version Beep
> $(deriveSerialize ''Beep)
> instance Version Foo
> $(deriveSerialize ''Foo)
Now let's say we want to add a constructor Bop to the type Foo. First we rename Types.lhs to Types_000.lhs and change the module name to reflect the changed file name:
> {-# LANGUAGE TemplateHaskell, UndecidableInstances, FlexibleInstances, GeneralizedNewtypeDeriving, MultiParamTypeClasses, DeriveDataTypeable, TypeFamilies #-}
> module Types_000 
>     ( Beep(..)
>     , Foo(..)
>     ) where
> import HAppS.Data
>
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data Foo 
>          = Bar String
>          | Baz Beep
>            
>      data Beep 
>          = Beep
>    |])
>
> instance Version Beep
> $(deriveSerialize ''Beep)
> instance Version Foo
> $(deriveSerialize ''Foo)
Next we create a new Types.lhs:
> {-# LANGUAGE TemplateHaskell, UndecidableInstances, FlexibleInstances, GeneralizedNewtypeDeriving, MultiParamTypeClasses, DeriveDataTypeable, TypeFamilies #-}
> module Types 
>     ( Beep(..) -- ^ re-exported from Types_000
>     , Foo(..)  -- ^ extended here
>     ) where
> import HAppS.Data
We import the old Types_000 qualified as T0 to avoid name clashes.
> import qualified Types_000 as T0
Since we are only modifying Foo, we can import and re-export the old version of Beep unmodified (also see the module export list above):
> import           Types_000 (Beep)
Then we extend Foo with the new constructor Bop Int:
>
> $(deriveAll [''Eq,''Ord,''Read,''Show, ''Default]
>  [d|
>      data Foo 
>          = Bar String
>          | Baz Beep
>          | Bop Int
>    |])
Next we create a Version instance which indicates that the previous version of Foo is T0.Foo.
>
> instance Version Foo where
>     mode = extension 1 (Proxy :: Proxy T0.Foo)
>
> $(deriveSerialize ''Foo)
>
And, finally, we specify how to migrate the old data:
> instance Migrate T0.Foo Foo where
>     migrate (T0.Bar str)  = Bar str
>     migrate (T0.Baz beep) = Baz beep
Note that Foo in Types.lhs and Foo in Types_000.lhs are different types, namely Types.Foo and Types_000.Foo. So this works for the same reason that renaming Beep to OldBeep works.

Serializing Datatypes from 3rd Party Libraries

It is also possible to serialize datatypes from 3rd party libraries, provided those types have Data and Typeable instances. There is a caveat with this however. If the third party library changes the type, then you will not be able to read your data. This is not a fatal flaw however. You can simply copy the old type definition into a local module, and then migrate the old data to the new format.

Suggested Policy

Put the types you will serialize in one or more files which only contain types

Deploy your web 2.718 killer app

Before you do any more development, copy the current type files to sequential versions and create new current type files which re-export all the types. You can skip this step if the current type file only contains re-exports. i.e., if no type changes were made to that type file during the previous iteration.

Make changes for current development cycle, and then go to step 2.
by Jeremy Shaw (noreply@blogger.com) at December 14, 2008 02:20 PM

Paul Johnson

Where did all the money go?

I've got a simple question about the Credit Crunch? Where did all the money go?

I've always thought of money as something that is "conserved" in the physics sense, like energy. Money in = money out + money stored. So when someone "loses" money it implies that someone else got it. If I run a company that loses money, what it means is that I'm getting less money in than I'm spending on staff and supplies. But the money hasn't disappeared, its just moved from my pockets to my suppliers (and their suppliers, and so on).

But the financial institutions seem to have had billions of dollars disappear into thin air. They got poorer, sometimes so poor that they went bust, but nobody seems to have become correspondingly richer.

I guess this has something to do with fractional reserve banking, which I know doesn't conserve money. Suppose I start with £1000 and put it in a bank. The bank shows a balance of £1000, but it doesn't just hang on to my money, it loans £500 of it to Joe Bloggs, who spends it on a new TV, and the person who sold the TV also puts the money in the bank. So now the bank has $1,500 on deposit even though our imaginary economy just started with £1,000. So what happens if Joe can't pay the money back?

Can someone enlighten me?

by Paul Johnson (noreply@blogger.com) at December 14, 2008 12:59 PM

Real-World Haskell

Haskell around the world

Haskell all over the world:

ikegami in #haskell reports that his copy of RWH arrived in Japan today (and looks to be #1 in language books on Amazon Japan at the moment).

doublec sent us the following photo of RWH on his laptop, phone and incarnated, from New Zealand

Jeff and Armand independently reported that RWH is making its way through Canada

Ganesh reported that Amazon.co.uk was now shipping copies again, after running out earlier this week.

And as the snow starts to fall here in Portland, Oregon, new copies have appeared at Powell’s Technical, after selling out earlier this week. Hooray!

by Don Stewart (dons@cse.unsw.edu.au) at December 14, 2008 03:49 AM

Planet Haskell

December 21, 2008

Build Turbinado

Create the Page table

Generate the ORM models

Create your Layout

Create your Page controller

Create Your Views

Run It

Examples!

December 20, 2008

December 19, 2008

December 18, 2008

Help

Lookup

Mlismu

Source

Bugs / feedback

December 17, 2008

December 16, 2008

And That’s All There Is to Laziness?

But I Don’t Want To Be Lazy

December 15, 2008

December 14, 2008

Requirements

Serialization

The Version class

Putting it all together

Simple Migration

Managing History

Full Migration Example #1

Using separate files to manage type history

Serializing Datatypes from 3rd Party Libraries

Suggested Policy

Authors

Planetarium:

The `Version` class