Even though Haskell I/O code is purely functional, this does not prevent us from writing imperative-style I/O code. Let's start by printing the words of each line of a file:
import System.IO
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import Data.Char (chr)

main = do
  h <- openFile "jabberwocky.txt" ReadMode
  loop h
  hClose h
  where
    loop h' = do
      isEof <- hIsEOF h'
      if isEof
        then putStrLn "DONE..."
        else do
          line <- hGetLine h'
          print $ words line
          loop h'
Instead of the hGetLine function, let's use the Data.ByteString.hGet function to read from the file in chunks of 8 bytes:
chunk <- B.hGet h' 8
print . words $ show chunk
-- vs
-- line <- hGetLine h'
-- print $ words line
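The effect of fixed-size reads can be reproduced without a file. The following is a minimal sketch, not part of the original program: chunksOf is a hypothetical helper that cuts a ByteString into pieces of at most n bytes, roughly what repeated calls to B.hGet h n would return, and the sample text is made up for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8

-- Hypothetical helper: split a ByteString into consecutive pieces of at
-- most n bytes, mimicking repeated calls to B.hGet h n.
-- Assumes n > 0 (with n <= 0 the recursion would never terminate).
chunksOf :: Int -> B.ByteString -> [B.ByteString]
chunksOf n bs
  | B.null bs = []
  | otherwise =
      let (piece, rest) = B.splitAt n bs
      in piece : chunksOf n rest

main :: IO ()
main = mapM_ print (chunksOf 8 (B8.pack "Twas brillig, and the\nslithy toves"))
-- "Twas bri"
-- "llig, an"
-- "d the\nsl"
-- "ithy tov"
-- "es"
```

The chunk boundaries fall in the middle of words, and the newline ends up inside the third chunk rather than at its end.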
The splitting of a chunk into words is not meaningful anymore. We need to accumulate the chunks until we reach the end of a line and then capture the accumulated line and possible remainder:
data Chunk = Chunk {chunk :: String}
           | LineEnd {chunk :: String, remainder :: String}
  deriving (Show)
If a chunk contains a newline character, then we return a LineEnd; otherwise, we return just another Chunk:
parseChunk chunk =
  if rightS == B8.pack ""
    then Chunk (toS leftS)
    else LineEnd (toS leftS) ((toS . B8.tail) rightS)
  where
    -- split at the first newline; rightS is empty when there is none
    (leftS, rightS) = B8.break (== '\n') chunk
    -- convert the bytes to a String, one Char per byte
    toS = map (chr . fromEnum) . B.unpack
Let's use the parseChunk function:
main = do
  print $ (parseChunk (B8.pack "AAA\nBB"))
  -- LineEnd {chunk = "AAA", remainder = "BB"}
  print $ (parseChunk (B8.pack "CCC"))
  -- Chunk {chunk = "CCC"}
Now we can accumulate chunks into lines:
main = do
  fileH <- openFile "jabberwocky.txt" ReadMode
  loop "" fileH
  hClose fileH
  where
    loop acc h = do
      isEof <- hIsEOF h
      if isEof
        then do putStrLn acc; putStrLn "DONE..."
        else do
          chunk <- B.hGet h 8
          case (parseChunk chunk) of
            (Chunk chunk') -> do
              let accLine = acc ++ chunk'
              loop accLine h
            (LineEnd chunk' remainder) -> do
              let line = acc ++ chunk'
              putStrLn line -- do something with line
              loop remainder h
          return ()
For each loop iteration, we use the hGet function to fetch a chunk of the file. When we reach a LineEnd chunk, we continue looping, this time starting with the remainder as the first chunk in a new accumulation of a line. When processing regular chunks, we just add the chunk to the accumulation and continue looping.
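The accumulation step can also be tried without a file by writing it as a pure function over a list of chunks. This is only a sketch under stated assumptions: accumulateLines is a hypothetical name, the Chunk type and parseChunk are restated (using B8.unpack and B8.null for brevity) so that the block is self-contained, and the input chunks are made up.

```haskell
import qualified Data.ByteString.Char8 as B8

-- Same shape as the Chunk type used in the text.
data Chunk
  = Chunk   { chunk :: String }
  | LineEnd { chunk :: String, remainder :: String }
  deriving (Show)

-- Restated parseChunk: split a chunk at its first newline, if any.
parseChunk :: B8.ByteString -> Chunk
parseChunk bs
  | B8.null rightS = Chunk (B8.unpack leftS)
  | otherwise      = LineEnd (B8.unpack leftS) (B8.unpack (B8.tail rightS))
  where
    (leftS, rightS) = B8.break (== '\n') bs

-- Hypothetical pure counterpart of the loop: acc plays the same role as
-- the accumulator that the handle-based loop threads through its recursion.
accumulateLines :: [B8.ByteString] -> [String]
accumulateLines = go ""
  where
    -- End of input: emit whatever has accumulated (like `putStrLn acc`).
    go acc []       = [acc]
    go acc (c : cs) =
      case parseChunk c of
        -- No newline yet: keep accumulating.
        Chunk s        -> go (acc ++ s) cs
        -- Newline found: emit the line, restart from the remainder.
        LineEnd s rest -> (acc ++ s) : go rest cs

main :: IO ()
main =
  mapM_ putStrLn
    (accumulateLines (map B8.pack ["Twas bri", "llig, an", "d the\nsl", "ithy tov", "es"]))
-- Twas brillig, and the
-- slithy toves
```

Like the loop in the text, this sketch does not split a remainder again, so a remainder that itself contains a newline (possible when a line is shorter than the chunk size) is carried forward as a single string.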
What we call "imperative I/O" is more precisely called "handle-based I/O". For more information, visit http://okmij.org/ftp/Haskell/Iteratee/describe.pdf.
Handle-based I/O has some beneficial characteristics:
The downsides of handle-based I/O are:
We need to check for EOF at each iteration, and we need to explicitly clean up the resource.
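The explicit cleanup is often delegated to withFile from System.IO, which closes the handle when the supplied action finishes, including when it throws an exception. The following is a minimal sketch rather than a replacement for the code above: countLines and the file name sample.txt are hypothetical, and the sketch writes that file to the current directory so that it can run on its own. The EOF check at each iteration remains our responsibility.

```haskell
import System.IO

-- Hypothetical helper: count the lines readable from a handle,
-- checking for EOF before every read.
countLines :: Handle -> IO Int
countLines h = go 0
  where
    go n = do
      eof <- hIsEOF h
      if eof
        then return n
        else do
          _ <- hGetLine h
          go (n + 1)

main :: IO ()
main = do
  -- Create a small sample file (hypothetical name and contents).
  writeFile "sample.txt" "one\ntwo\nthree\n"
  -- withFile opens the file, runs countLines, and closes the handle.
  n <- withFile "sample.txt" ReadMode countLines
  print n
-- 3
```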