You are viewing jputnam

jputnam [userpic]

Haskell vs. Quicksort

September 9th, 2007 (03:20 am)

Sort time ago, a blogger rather cleverly embedded a C-like imperative language as a DSL in Haskell. He then implemented quicksort in this DSL and opined that no elegance had been gained.

Um...

...yeah. I guess there really isn't much to say about that.

This post is both an explanation of QuickSort in Haskell itself an a window into how I develop code.

swap :: (MArray a e m, Ix i) => a i e -> i -> i -> m ()
swap a x y = do
   vX <- readArray a x
   vY <- readArray a y
   writeArray a x vY
   writeArray a y vX
Swap is only used in one place, but it should be in the standard library, so I'm factoring it out anyway.

Now, the most important property of quicksort is the inner loop, in which elements less than the pivot are left at the front of the subarray and values greater than the pivot are moved to the end. I ignore the optimization of tracking the values equal to the pivot, which avoids pathological behavior on simple inputs.
partition front back = do
   v <- readArray a front
   if (v < pivot)
      then    partition (front+1) back
      else do swap a front back
              partition front (back-1)
and, of course, when it ends, we want to properly return a middle index to swap the pivot back into.
partition front back | front == back = do
   v <- readArray a front
   if (v < pivot)
      then return (front+1)
      else return front
Putting these together, however, results in a disappointingly long function which manages to be almost entirely duplication. Fortunately, we can factor the common parts together.
partition front back = do
   v <- readArray a front
   case (front /= back, pivot < v) of
      (False, False) -> return (front+1)
      (False, True)  -> return front
      (True,  False) -> partition (front+1) back
      (True,  True)  -> do swap a front back
                        partition front (back-1)
Wikipedia has a simpler algorithm, though, and I'll get back to it later. Anyway, from here, we can sort.
...
   middle <- partition front (back-1)
   swap a middle back
   qsort' front (middle-1)
   qsort' (middle+1) back
And then, the housekeeping.
qsort a = do (front, back) <- getBounds a
             qsort' front back
   where qsort' front back | front >= back = return ()
                           | otherwise = do
            pivot <- readArray a back
            let partition front back = do
                   v <- readArray a front
                   case (front /= back, pivot < v) of
                           (False, False) -> return (front+1)
                           (False, True)  -> return front
                           (True,  False) -> partition (front+1) back
                           (True,  True)  -> do swap a front back
                                                partition front (back-1)
            middle <- partition front (back-1)
            swap a middle back
            qsort' front (middle-1)
            qsort' (middle+1) back
And quickCheck verifies that it does the right thing. But it isn't pretty. Much of Haskell's nice properties come from giving up on mutation and mutable structures altogether.

None of what follows has been tested.
Wikipedia's version isn't any shorter:
partition front back = do
   middle <- rot foldM front [front..back-1] $ \store i -> do
      v <- readArray a i
      if (v <= pivot)
         then do swap a store i
                 return (store+1)
         else return store
   return middle
...
rot f b c a = f a b c
The real problem is with the entirely conventional array operations. As long as the only available operations are the same, the algorithm obviously stays the same.

Better would be something involving a findFirst and some kind of mutate.
partition store front back | (front > back) = return store
                           | otherwise = do
   i <- findFirst a front back (< pivot)
   maybeM i (return store) $ do \i ->
      swap a store i
      partition (store+1) (i+1) back
Nothing spectacular. The for loop already encodes performing a nondeterminate action to each element of an array. Surely we can do better. Sorts encode something specific, namely performing a swap on decreasing sections of an array.

Not done yet.