Scrap your boilerplate (SYB) is another early approach to datatype-generic programming, that is, it provides a way to define generic functions over a "universal" type representation.
SYB differs from the other two approaches we have explored in that the type representation is obfuscated from the user.
Earlier versions of the SYB approach had a strong focus on generic traversals over complex nested data structures, for example:
data Book = Book Title [Chapter] data Chapter = Chapter Title [Section] data Section = Section Title [Para] type Title = String; type Para = String haskellDP = Book "Haskell Design Patterns" chapters chapters = [Chapter "The building blocks" sections1, Chapter "IO Patterns" sections2] sections1 = [Section "1.1" ["Lorem lorem"], Section "1.2" ["Lorem lorem", "Lorem lorem"]] sections2 = [Section "2.1" ["Ipsum Ipsum"], Section "2.2" ["Ipsum Ipsum", "Ipsum Ipsum"]]
Suppose we have a function, fSection
, which we want to apply to all sections embedded in a Book
. This would be a great job for the Lens
library, which can precisely deliver a function to elements in a complex structure, but we will follow a different route here:
fSection (Section t lines') = Section "!!!" lines' main = print $ fSection (Section "1.1" ["S1"])
Our strategy will be to morph our function into one that can be applied to all parts of the data structure (Book
). The new function should simply ignore elements for which it was not intended.
To make type-safe cast possible, we will have to do a type comparison to check whether an element of a book is in fact a section.
The Data.Typeable
type-class has what we need in the form of the type-safe cast
function. We can autoderive Typeable
for our types (in fact, since GHC 7.8, the compiler does not allow us to implement Typeable
ourselves). For this, we need to add a language pragma for DeriveDataTypeable
:
{-# LANGUAGE DeriveDataTypeable #-} import Data.Typeable data Book = Book Title [Chapter] deriving(Show, Typeable) data Chapter = Chapter Title [Section] deriving(Show, Typeable) data Section = Section Title [Para] deriving(Show, Typeable) type Title = String; type Para = String
Let's get to know the Typeable.cast
function:
ghci> cast 'a' :: Maybe Char -> Just 'a' ghci> cast 'a' :: Maybe Int -> Nothing
If the value type matches with the type inside Maybe
, we get a Just
value, otherwise we get Nothing
:
ghci> (cast (Section "t" ["s1", "s2"])) :: Maybe Section -> (Just (Section "title" ["line1", "line2"])) ghci> (cast (Book "title" [])) :: Maybe Section -> Nothing
Since functions are also just types, we can cast them too!
ghci> cast (++ "a") :: Maybe String -> Nothing ghci> cast (++ "a") :: Maybe (String -> String) -> Just (++ "a")
Let's write a higher order function to enable type-safe function application:
typesafeF :: (Typeable a, Typeable b) => (b -> b) -> a -> a typesafeF f = case (cast f) of Just f' -> f' Nothing -> id
Consider the following example:
ghci> typesafeF (+1) 3 -> 4 ghci> typesafeF (+1) "3" -> "3"
By lifting fSection
into a type-safe function, we can apply it to any part of the Book
type (or any Typeable
type, for that matter):
main = do print $ (typesafeF fSection) aSection print $ (typesafeF fSection) aBook where aSection = (Section "1.1" ["s1", "s2]) aBook = (Book "title" [])
The function (typesafeF fSection
) leaves all values not targeted by the fSection
function. For example, in the first call in the preceding code, typesafeF
will apply fSection
to Section. In the second call, typesafeF
applies id to Book.
In both the
sum of products
and origami
styles of generic programming, we encountered shallow recursion in crucial parts of the formulation.
Let's explore how we might traverse a Book
data structure with a function, again favoring a shallow approach. As a first attempt, let’s write a traversal for Chapter
data structures:
gmap f (Chapter title sections) = Chapter (f title) (f sections) -- INVALID
We quickly run into trouble! The function cannot span all the different types it is being applied to. This is why we need to resort to Rank2Types
so that we can define a gmap
function that accepts a generic mapping function.
We also make gmap
a part of the Data'
type-class to be implemented by all the types we plan to traverse over:
{-# LANGUAGE Rank2Types #-} -- Data' inherits from Typeable class Typeable a => Data' a where gmap :: (forall b. Data' b => b -> b) -> a -> a –- note the shallow recursion in gmap implementations... instance Data' Book where gmap f (Book title chapters) = Book (f title) (f chapters) instance Data' Chapter where gmap f (Chapter title sections) = Chapter (f title) (f sections) instance Data' Section where gmap f (Section title paras) = Section (f title) (f paras) instance Data' a => Data' [a] where gmap f [] = [] gmap f (x:xs) = f x : f xs instance Data' Char where gmap f c = c main = do print $ gmap (typesafeF fSection) chapter print $ gmap (typesafeF fSection) sections1 where chapter = (Chapter "The building blocks" sections1)
If we traverse a chapter with the fSection
function, the sections are not reached by the traversal. When we traverse a list of sections, only the first is affected by fSection
, while the subsequent sections are ignored by the fSection
function:
gmap f (section1: sections) –- = f section1 : f sections –- f sections = f sections –- = f [Section] -- = id [Section]
This shallow traversal of the gmap
function can seem a bit pointless, but the big advantage it brings is that we can mold it into different kinds of recursion, for example:
-- bottom-up traversal: traverse x before applying f traverse :: Data' a =>(forall b . Data' b => b->b) -> a -> a traverse f x = f (gmap (traverse f) x) -- vs top-down traversal: apply f x then traverse traverse' :: Data' a =>(forall b . Data' b => b->b) -> a -> a traverse' f x = gmap (traverse' f) (f x) main = do -- this time our traversal is reaching all Sections... print $ traverse (typesafeF fSection) chapter print $ traverse (typesafeF fSection) sections1 print $ traverse' (typesafeF fBook) haskellDP where chapter = (Chapter "The building blocks" sections1) fBook (Book t chapters) = Book "!!!" chapters
Together, the gmap
and typesafeF
functions enable us to deliver type-specific functions deeply into nested structures.
"Recursive traversal in two steps — first define a one-layer map, and then tie the recursive knot separately — is well-known folk lore in the functional programming community. For lack of better-established terminology we call it 'the non-recursive map trick'" | ||
--Lammel and Peyton Jones, 2003, Scrap your Boilerplate: A Practical Design Pattern for Generic Programming |
Our Data'
type-class mimics the Haskell Data
type-class, which is based on the more general gfoldl
(as opposed to our gmap
).
The SYB approach relies on the compiler to auto-generate instances of Typeable
and Data
. This is a purposeful obfuscation of the underlying structure of datatypes and contrasts with the sum of products and origami styles, which involve the programmer directly in the translation to and from the relevant type representation.
In SYB, the programming interface consists of generic combinators based on Typeable
and Data
. This is why it is said that Typeable
provides the backend of SYB and Data provides the frontend.
Even though we are not directly engaged with the type representation, we are still dealing with the generic functions defined in the abstract data shape, where Data
provides the abstraction over shapes made of Typeable's
.
We have just scratched the surface of this technique. In the above referenced paper, Lammel and Peyton Jones cover the following points:
fold
In further work, SYB was extended to allow for extensibility, in the sense of being able to override generic behavior for a given type.
SYB was an early approach that strongly influenced the development of datatype-generic programming in Haskell.
For example, the Uniplate
library represents a simplified (and less powerful) phrasing of SYB. The Uniplate
library has since been embedded in the Lens
library.