Chapter 4. Introducing Imperative Programming

In Chapter 3, you saw some of the simple but powerful data types and language constructs that make up F# functional programming. The functional programming paradigm is strongly associated with "programming without side effects," called pure functional programming. In this paradigm, programs compute the result of a mathematical expression and don't cause any side effects, except perhaps reporting the result of the computation. The formulas used in spreadsheets are often pure, as is the core of functional programming languages such as Haskell. F# isn't, however, a pure functional language. For example, you can write programs that mutate data, perform I/O communications, start threads, and raise exceptions. Furthermore, the F# type system doesn't enforce a strict distinction between expressions that perform these actions and expressions that don't.

Programming with side effects is called imperative programming. This chapter looks more closely at a number of constructs related to imperative programming. It describes how to use loops, mutable data, arrays, and some common input/output techniques.

If your primary programming experience has been with an imperative language such as C, C#, or Java, you may initially find yourself using imperative constructs fairly frequently in F#. However, over time, F# programmers generally learn how to perform many routine programming tasks within the side-effect-free subset of the language. F# programmers tend to use side effects in the following situations:

  • When scripting and prototyping using F# Interactive

  • When working with .NET library components that use side effects heavily, such as GUI libraries and I/O libraries

  • When initializing complex data structures

  • When using inherently imperative, efficient data structures such as hash tables and hash sets

  • When locally optimizing routines in a way that improves the performance of the functional version of the routine

When working with very large data structures or in scenarios where the allocation of data structures must be minimized for performance reasons

Some F# programmers don't use any imperative techniques except as part of the external wrapper for their programs. Adopting this form of pure functional programming for a time is an excellent way to hone your functional programming techniques.

Tip

Programming with fewer side effects is attractive for many reasons. For example, eliminating unnecessary side effects nearly always reduces the complexity of your code, so it leads to fewer bugs. Another thing experienced functional programmers appreciate is that the programmer or compiler can easily adjust the order in which expressions are computed. A lack of side effects also helps you reason about your code: it's easier to visually check when two programs are equivalent, and it's easier to make radical adjustments to your code without introducing new, subtle bugs. Programs that are free from side effects can often be computed on demand as necessary, often by making very small, local changes to your code to introduce the use of delayed data structures. Finally, side effects such as mutation are difficult to use when data is accessed concurrently from multiple threads, as you see in Chapter 13.

Imperative Looping and Iterating

Three looping constructs are available to help simplify writing iterative code with side effects:

  • Simple for loops: for var = start-expr to end-expr do expr

  • Simple while loops: while expr do expr

  • Sequence loops: for pattern in expr do expr

All three constructs are for writing imperative programs, indicated partly by the fact that in all cases the body of the loop must have a return type of unit. Note that unit is the F# type that corresponds to void in imperative languages such as C, and it has the single value (). The following sections cover these three constructs in more detail.

Simple for Loops

Simple for loops are the most efficient way to iterate over integer ranges. This is illustrated here by a replacement implementation of the repeatFetch function from Chapter 2:

let repeatFetch url n =
    for i = 1 to n do
        let html = http url
        printf "fetched <<< %s >>>
" html
    printf "Done!
"

This loop is executed for successive values of i over the given range, including both start and end indexes.

Simple while Loops

The second looping construct is a while loop, which repeats until a given guard is false. For example, here is a way to keep your computer busy until the weekend:

open System

let loopUntilSaturday() =
    while (DateTime.Now.DayOfWeek <> DayOfWeek.Saturday) do
        printf "Still working!
"

    printf "Saturday at last!
"

When executing this code in F# Interactive, you can interrupt its execution by using Ctrl+C.

More Iteration Loops over Sequences

As discussed in Chapter 3, any values compatible with the type seq<type> can be iterated using the for pattern in seq do ... construct. The input seq may be an F# list value, any seq<type>, or a value of any type supporting a GetEnumerator method. Here are some simple examples:

> for (b,pj) in [ ("Banana 1",true); ("Banana 2",false) ] do
      if pj then printfn "%s is in pyjamas today!" b;;
Banana 1 is in pyjamas today!

The following example iterates the results of a regular expression match. The type returned by the .NET method System.Text.RegularExpressions.Regex.Matches is a MatchCollection, which for reasons known best to the .NET designers doesn't directly support the seq<Match> interface. It does, however, support a GetEnumerator method that permits iteration over the individual results of the operation, each of which is of type Match; the F# compiler inserts the conversions necessary to view the collection as a seq<Match> and perform the iteration. You learn more about using the .NET Regular Expression library in Chapter 10:

> open System.Text.RegularExpressions;;
> for m in (Regex.Matches("All the Pretty Horses","[a-zA-Z]+")) do
      printf "res = %s
" m.Value;;
res = All
res = the
res = Pretty
res = Horses

Using Mutable Records

The simplest mutable data structures in F# are mutable records. In Chapter 3, you saw some simple examples of immutable records. A record is mutable if one or more of its fields is labeled mutable. This means record fields can be updated using the <- operator: that is, the same syntax used to set a property. Mutable fields are generally used for records that implement the internal state of objects, discussed in Chapters 6 and 7.

For example, the following code defines a record used to count the number of times an event occurs and the number of times the event satisfies a particular criterion:

type DiscreteEventCounter =
    { mutable Total: int;
      mutable Positive: int;
      Name : string }

let recordEvent (s: DiscreteEventCounter) isPositive =
    s.Total <- s.Total+1
    if isPositive then s.Positive <- s.Positive+1

let reportStatus (s: DiscreteEventCounter) =
    printfn "We have %d %s out of %d" s.Positive s.Name s.Total

let newCounter nm =
    { Total = 0;
      Positive = 0;
      Name = nm }

You can use this type as follows (this example uses the http function from Chapter 2):

let longPageCounter = newCounter "long page(s)"

let fetch url =
    let page = http url
    recordEvent longPageCounter (page.Length > 10000)
    page

Every call to the function fetch mutates the mutable record fields in the global variable longPageCounter. For example:

> fetch "http://www.smh.com.au" |> ignore;;
val it : unit = ()

> fetch "http://www.theage.com.au"  |> ignore;;
val it : unit = ()

> reportStatus longPageCounter;;
We have 1 long page(s) out of 2
val it : unit = ()

Record types can also support members (for example, properties and methods) and give implicit implementations of interfaces, discussed in Chapter 6. Practically speaking, this means you can use them as one way to implement object-oriented abstractions.

Mutable Reference Cells

One particularly useful mutable record is the general-purpose type of mutable reference cells, or ref cells for short. These often play much the same role as pointers in other imperative programming languages. You can see how to use mutable reference cells in the following example:

> let cell1 = ref 1;;
val cell1 : int ref = {contents = 1;}

> !cell1;;
val it : int = 1

> cell1 := 3;;
val it : unit = ()

> cell1;;
val it : int ref = {contents = 3;}

> !cell1;;
val it : int = 3

The key type is 'T ref, and its main operators are ref, !, and :=. The types of these operators are as follows:

val ref  : 'T -> 'T ref
val (:=) : 'T ref -> 'T -> unit
val (!)  : 'T ref -> 'T

These allocate a reference cell, mutate the cell, and read the cell, respectively. The operation cell1 := 3 is the key one; after this operation, the value returned by evaluating the expression !cell1 is changed. You can also use either the contents field or the Value property to access the value of a reference cell.

Both the 'T ref type and its operations are defined in the F# library as simple record data structures with a single mutable field:

type 'T ref = { mutable contents: 'T }
let (!) r = r.contents
let (:=) r v = r.contents <- v
let ref v = { contents = v }

The type 'T ref is a synonym for a type Microsoft.FSharp.Core.Ref<'T> defined in this way.

Avoiding Aliasing

Like all mutable data structures, two mutable record values or two values of type 'T ref may refer to the same reference cell—this is called aliasing. Aliasing of immutable data structures isn't a problem; no client consuming or inspecting the data values can detect that the values have been aliased. However, aliasing of mutable data can lead to problems in understanding code. In general, it's good practice to ensure that no two values currently in scope directly alias the same mutable data structures. The following example continues from earlier and shows how an update to cell1 can affect the value returned by !cell2:

> let cell2 = cell1;;
val cell2 : int ref = {contents = 3;}

> !cell2;;
val it : int = 3

> cell1 := 7;;
val it : unit = ()

> !cell2;;
val it : int 7

Hiding Mutable Data

Mutable data is often hidden behind an encapsulation boundary. Chapter 7 looks at encapsulation in more detail, but one easy way to do this is to make data private to a function. For example, the following shows how to hide a mutable reference within the inner closure of values referenced by a function value:

let generateStamp =
    let count = ref 0
    (fun () -> count := !count + 1; !count)
val generateStamp: unit -> int

The line let count = ref 0 is executed once, when the generateStamp function is defined. Here is an example of the use of this function:

> generateStamp();;
val it : int = 1

> generateStamp();;
val it : int = 2

This is a powerful technique for hiding and encapsulating mutable state without resorting to writing new type and class definitions. It's good programming practice in polished code to ensure that all related items of mutable state are collected under some named data structure or other entity such as a function.

Using Mutable Locals

You saw in the previous section that mutable references must be explicitly dereferenced. F# also supports mutable locals that are implicitly dereferenced. These must either be top-level definitions or be local variables in a function:

> let mutable cell1 = 1;;
val mutable cell1 : int = 1

> cell1;;
val it : int = 1

> cell1 <- 3;;
val it : unit = ()

> cell1;;
val it : int = 3

The following shows how to use a mutable local:

let sum n m =
    let mutable res = 0
    for i = n to m do
        res <- res + i
    res
> sum 3 6;;
val it : int = 18

F# places strong restrictions on the use of mutable locals. In particular, unlike mutable references, mutable locals are guaranteed to be stack-allocated values, which is important in some situations because the .NET garbage collector won't move stack values. As a result, mutable locals may not be used in any inner lambda expressions or other closure constructs, with the exception of top-level mutable values, which can be used anywhere, and mutable fields of records and objects, which are associated with the heap allocated objects themselves. You learn more about mutable object types in Chapter 6. Reference cells and types containing mutable fields can be used instead to make the existence of heap-allocated imperative state obvious.

Working with Arrays

Mutable arrays are a key data structure used as a building block in many high-performance computing scenarios. The following example illustrates how to use a one-dimensional array of double values:

> let arr = [| 1.0; 1.0; 1.0 |];;
val arr : float[]

> arr.[1];;
val it : float = 1.0

> arr.[1] <- 3.0;;
val it : unit = ()

> arr;;
val it : float[] = [| 1.0; 3.0; 1.0 |]

F# array values are usually manipulated using functions from the Array module; its full path is Microsoft.FSharp.Collections.Array, but you can access it with the short name Array. Arrays are created either by using the creation functions in that module (such as Array.init, Array.create, and Array.zeroCreate) or by using sequence expressions, as discussed in Chapter 3. Some useful methods are also contained in the System.Array class. Table 4-1 shows some common functions from the Array module.

Table 4.1. Some Important Functions and Aggregate Operators from the Array Module

Operator

Type

Explanation

Array.append

: 'T[] -> 'T[] -> 'T[]

Returns a new array containing elements of the first array followed by elements of the second array

Array.sub

: 'T[] -> int -> int -> 'T[]

Returns a new array containing a portion of elements of the input array

Array.copy

: 'T[] -> 'T[]

Returns a copy of the input array

Array.iter

: ('T -> unit) -> 'T[] -> unit

Applies a function to all elements of the input array

Array.filter

: ('T -> bool) -> 'T[] -> 'T[]

Returns a new array containing a selection of elements of the input array

Array.length

: 'T[] -> int

Returns the length of the input array

Array.map

: ('T -> 'U) -> 'T[] -> 'U[]

Returns a new array containing the results of applying the function to each element of the input array

Array.fold

: ('T -> 'U -> 'T) -> 'T -> 'U[] -> 'T

Accumulates left to right over the input array

Array.foldBack

: ('T -> 'U -> 'U) -> 'T[] -> 'U -> 'U

Accumulates right to left over the input array

F# arrays can be very large, up to the memory limitations of the machine (a 3GB limit applies on 32-bit systems). For example, the following creates an array of 100 million elements (of total size approximately 400MB for a 32-bit machine):

> let (r : int[]) = Array.zeroCreate 100000000;;
val r : int [] = ...

The following attempt to create an array more than 4GB in size causes an OutOfMemoryException on one of our machines:

> let (r : int[]) = Array.zeroCreate 1000000000;;
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException'
was thrown.

Note

Arrays of value types (such as int, single, double, int64) are stored flat, so only one object is allocated for the entire array. Arrays of other types are stored as an array of object references. Primitive types such as integers and floating-point numbers are all value types; many other .NET types are also value types. The .NET documentation indicates whether each type is a value type or not. Often, the word struct is used for value types. You can also define new struct types directly in F# code, as discussed in Chapter 6. All other types in F# are reference types, such as all record, tuple, discriminated union, and class and interface values.

Generating and Slicing Arrays

In Chapter 3, you saw in passing that you can use sequence expressions as a way to generate interesting array values. For example:

> let arr = [| for i in 0 .. 5 -> (i,i*i) |];;
val arr : (int * int) [] =
  [|(0, 0); (1, 1); (2, 4); (3, 9); (4, 16); (5, 25)|]

You can also use a convenient syntax for extracting subarrays from existing arrays; this is called slice notation. A slice expression for a single-dimensional array has the form arr.[start..finish], where one of start and finish may optionally be omitted, and index zero or the index of the last element of the array is assumed instead. For example:

> let arr = [| for i in 0 .. 5 -> (i,i*i) |];;
val arr : (int * int) [] =
  [|(0, 0); (1, 1); (2, 4); (3, 9); (4, 16); (5, 25)|]

> arr.[1..3];;
val it : (int * int) [] = [| (1, 1); (2, 4); (3, 9); |]

> arr.[..2];;
val it : (int * int) [] = [| (0, 0); (1, 1); (2, 4); |]

> arr.[3..];;
val it : (int * int) [] = [| (3, 9); (4, 16); (5, 25) |]

Slicing syntax is used extensively in the example "Verifying Circuits with Propositional Logic" in Chapter 12. You can also use slicing syntax with strings and several other F# types such as vectors and matrices, and the operator can be overloaded to work with your own type definitions. The F# library definitions of vectors and matrices can be used as a guide.

Note

Slices on arrays generate fresh arrays. Sometimes it's more efficient to use other techniques, such as accessing the array via an accessor function or object that performs one or more internal index adjustments before looking up the underlying array. If you add support for the slicing operators to your own types, you can choose whether they return copies of data structures or an accessor object.

Two-Dimensional Arrays

Like other .NET languages, F# directly supports two-dimensional array values that are stored flat: that is, where an array of dimensions (N, M) is stored using a contiguous array of N * M elements. The types for these values are written using [,], such as in int[,] and double[,], and these types also support slicing syntax. Values of these types are created and manipulated using the values in the Array2D module. Likewise, there is a module for manipulating three-dimensional array values whose types are written int[,,]. You can also use the code in those modules as a template for defining code to manipulate arrays of higher dimension.

Introducing the Imperative .NET Collections

The .NET Framework comes equipped with an excellent set of imperative collections under the namespace System.Collections.Generic. You've seen some of these already. The following sections look at some simple uses of these collections.

Using Resizeable Arrays

As mentioned in Chapter 3, the .NET Framework comes with a type System.Collections.Generic.List<'T>, which, although named List, is better described as a resizeable array. The F# library includes the following type abbreviation for this purpose:

type ResizeArray<'T> = System.Collections.Generic.List<'T>

Here is a simple example of using this data structure:

> let names = new ResizeArray<string>();;
val names : ResizeArray<string>

> for name in ["Claire"; "Sophie"; "Jane"] do
      names.Add(name);;
val it : unit = ()

> names.Count;;
val it : int = 3

> names.[0];;
val it : string = "Claire"

> names.[1];;
val it : string = "Sophie"

> names.[2];;
val it : string = "Jane"

Resizable arrays use an underlying array for storage and support constant-time random-access lookup. In many situations, this makes a resizable array more efficient than an F# list, which supports efficient access only from the head (left) of the list. You can find the full set of members supported by this type in the .NET documentation. Commonly used properties and members include Add, Count, ConvertAll, Insert, BinarySearch, and ToArray. A module ResizeArray is included in the F# library; it provides operations over this type in the style of the other F# collections.

Like other .NET collections, values of type ResizeArray<'T> support the seq<'T> interface. There is also an overload of the new constructor for this collection type that lets you specify initial values via a seq<'T>. This means you can create and consume instances of this collection type using sequence expressions:

> let squares = new ResizeArray<int>(seq { for i in 0 .. 100 -> i*i });;
val squares : ResizeArray<int>

> for x in squares do
      printfn "square: %d" x;;
square: 0
square: 1
square: 4
square: 9
...

Using Dictionaries

The type System.Collections.Generic.Dictionary<'Key,'Value> is an efficient hash-table structure that is excellent for storing associations between values. The use of this collection from F# code requires a little care, because it must be able to correctly hash the key type. For simple key types such as integers, strings, and tuples, the default hashing behavior is adequate. Here is a simple example:

> open System.Collections.Generic;;

> let capitals = new Dictionary<string, string>(HashIdentity.Structural);;
val capitals : Dictionary<string,string> = dict []

> capitals.["USA"] <- "Washington";;
val it : unit = ()

> capitals.["Bangladesh"] <- "Dhaka";;
val it : unit = ()

> capitals.ContainsKey("USA");;
val it : bool = true

> capitals.ContainsKey("Australia");;
val it : bool = false

> capitals.Keys;;
val it : KeyCollection<string,string> = seq["USA"; "Bangladesh"]

> capitals.["USA"];;
val it : string = "Washington"

Dictionaries are compatible with the type seq<KeyValuePair<'key,'value>>, where KeyValuePair is a type from the System.Collections.Generic namespace and simply supports the properties Key and Value. Armed with this knowledge, you can use iteration to perform an operation for each element of the collection:

> for kvp in capitals do
      printf "%s has capital %s
" kvp.Key kvp.Value;;
USA has capital Washington
Bangladesh has capital Dhaka
val it : unit = ()

Using Dictionary's TryGetValue

The Dictionary method TryGetValue is of special interest because its use from F# is a little nonstandard. This method takes an input value of type 'Key and looks it up in the table. It returns a bool indicating whether the lookup succeeded: true if the given key is in the dictionary and false otherwise. The value itself is returned via a .NET idiom called an out parameter. From F# code, three ways of using .NET methods rely on out parameters:

  • You may use a local mutable in combination with the address-of operator &.

  • You may use a reference cell.

  • You may simply not give a parameter, and the result is returned as part of a tuple.

Here's how you do it using a mutable local:

open System.Collections.Generic

let lookupName nm (dict : Dictionary<string,string>) =
    let mutable res = ""
    let foundIt = dict.TryGetValue(nm, &res)
    if foundIt then res
    else failwithf "Didn't find %s" nm

The use of a reference cell can be cleaner. For example:

> let res = ref "";;
val res: string ref = {contents = "";}

> capitals.TryGetValue("Australia", res);;
val it: bool = false

> capitals.TryGetValue("USA", res);;
val it: bool = true

> res;;
val it: string ref = {contents = "Washington"}

Finally, here is the technique where you don't pass the final parameter, and instead the result is returned as part of a tuple:

> capitals.TryGetValue("Australia");;
val it: bool * string = (false, null)

> capitals.TryGetValue("USA");;
val it: bool * string = (true, "Washington")

Note that the value returned in the second element of the tuple may be null if the lookup fails when this technique is used. null values are discussed in the section "Working with null Values" at the end of this chapter.

Using Dictionaries with Compound Keys

You can use dictionaries with compound keys such as tuple keys of type (int * int). If necessary, you can specify the hash function used for these values when creating the instance of the dictionary. The default is to use generic hashing, also called structural hashing, a topic covered in more detail in Chapter 8. If you want to indicate this explicitly, you do so by specifying Microsoft.FSharp.Collections. HashIdentity.Structural when creating the collection instance. In some cases, this can also lead to performance improvements, because the F# compiler often generates a hashing function appropriate for the compound type.

Here is an example that uses a dictionary with a compound key type to represent sparse maps:

> open System.Collections.Generic;;
> open Microsoft.FSharp.Collections;;

> let sparseMap = new Dictionary<(int * int), float>();;
val sparseMap : Dictionary <(int * int),float> = dict []

> sparseMap.[(0,2)] <- 4.0;;
val it : unit = ()

> sparseMap.[(1021,1847)] <- 9.0;;
val it : unit = ()

> sparseMap.Keys;;
val it : Dictionary.KeyCollection<(int * int),float> = seq [(0,2); (1021; 1847)]

Some Other Mutable Data Structures

Some of the other important mutable data structures in the F# and .NET libraries are as follows:

  • System.Collections.Generic.SortedList<'Key,'Value>: A collection of sorted values. Searches are done by a binary search. The underlying data structure is a single array.

  • System.Collections.Generic.SortedDictionary<'Key,'Value>: A collection of key/value pairs sorted by the key, rather than hashed. Searches are done by a binary chop. The underlying data structure is a single array.

  • System.Collections.Generic.Stack<'T>: A variable-sized last-in/first-out (LIFO) collection.

  • System.Collections.Generic.Queue<'T>: A variable-sized first-in/first-out (FIFO) collection.

  • System.Text.StringBuilder: A mutable structure for building string values.

  • Microsoft.FSharp.Collections.HashSet<'Key>: A hash table structure holding only keys and no values. From .NET 3.5, a HashSet<'T> type is available in the System.Collections.Generic namespace.

Exceptions and Controlling Them

When a routine encounters a problem, it may respond in several ways, such as by recovering internally, emitting a warning, returning a marker value or incomplete result, or throwing an exception. The following code indicates how an exception can be thrown by some of the code you've been using:

> let req = System.Net.WebRequest.Create("not a URL");;
System.UriFormatException: Invalid URI: The format of the URI could not be
determined.

Similarly, the GetResponse method also used in the http function may raise a System.Net.WebException exception. The exceptions that may be raised by routines are typically recorded in the documentation for those routines. Exception values may also be raised explicitly by F# code:

> (raise (System.InvalidOperationException("not today thank you")) : unit);;
System.InvalidOperationException: not today thank you

In F#, exceptions are commonly raised using the F# failwith function:

> if false then 3 else failwith "hit the wall";;
System.Exception: hit the wall

The types of some of the common functions used to raise exceptions are shown here:

val failwith : string -> 'T
val raise : System.Exception -> 'T
val failwithf : StringFormat<'T,'U> -> 'T
val invalidArg : string -> string -> 'T

Note that the return types of all these are generic type variables: the functions never return normally and instead return by raising an exception. This means they can be used to form an expression of any particular type and can be handy when you're drafting your code. For example, in the following example, we've left part of the program incomplete:

if (System.DateTime.Now > failwith "not yet decided") then
    printfn "you've run out of time!"

Table 4-2 shows some of the common exceptions that are raised by failwith and other operations.

Table 4.2. Common Categories of Exceptions and F# Functions That Raise Them

Exception Type

F# Abbreviation

Description

Example

Exception

Failure

General failure

failwith "fail"

ArgumentException

InvalidArgument

Bad input

invalidArg "x" "y"

DivideByZeroException

 

Integer divide by 0

1 / 0

NullReferenceException

 

Unexpected null

(null : string).Length

Catching Exceptions

You can catch exceptions using the try ... with ... language construct and :? type-test patterns, which filter any exception value caught by the with clause. For example:

> try
     raise (System.InvalidOperationException ("it's just not my day"))
  with
     | :? System.InvalidOperationException -> printfn "caught!";;
caught!

Chapter 5 covers these patterns more closely. The following code sample shows how to use try ... with ... to catch two kinds of exceptions that may arise from the operations that make up the http method, in both cases returning the empty string "" as the incomplete result. Note that try ... with ... is just an expression, and it may return a result in both branches:

open System.IO

let http(url: string) =
    try
        let req = System.Net.WebRequest.Create(url)
        let resp = req.GetResponse()
        let stream = resp.GetResponseStream()
        let reader = new StreamReader(stream)
        let html = reader.ReadToEnd()
        html
    with
        | :? System.UriFormatException -> ""
        | :? System.Net.WebException -> ""

When an exception is thrown, a value is created that records information about the exception. This value is matched against the earlier type-test patterns. It may also be bound directly and manipulated in the with clause of the try ... with constructs. For example, all exception values support the Message property:

> try
      raise (new System.InvalidOperationException ("invalid operation"))
  with
      | err -> printfn "oops, msg = '%s'" err.Message;;
oops, msg = 'invalid operation'

Using try . . . finally

Exceptions may also be processed using the try ... finally ... construct. This guarantees to run the finally clause both when an exception is thrown and when the expression evaluates normally. This allows you to ensure that resources are disposed after the completion of an operation. For example, you can ensure that the web response from the previous example is closed as follows:

let httpViaTryFinally(url: string) =
    let req = System.Net.WebRequest.Create(url)
    let resp = req.GetResponse()
    try
        let stream = resp.GetResponseStream()
        let reader = new StreamReader(stream)
        let html = reader.ReadToEnd()
        html
    finally
        resp.Close()

In practice, you can use a shorter form to close and dispose of resources, simply by using a use binding instead of a let binding. This closes the response at the end of the scope of the resp variable, a technique that is discussed in full in Chapter 8. Here is how the previous function looks using this form:

let httpViaUseBinding(url: string) =
    let req = System.Net.WebRequest.Create(url)
    use resp = req.GetResponse()
    let stream = resp.GetResponseStream()
    let reader = new StreamReader(stream)
    let html = reader.ReadToEnd()
    html

Defining New Exception Types

F# lets you define new kinds of exception objects that carry data in a conveniently accessible form. For example, here is a declaration of a new class of exceptions and a function that wraps http with a filter that catches particular cases:

exception BlockedURL of string

let http2 url =
    if url = "http://www.kaos.org"
    then raise(BlockedURL(url))
    else http url

You can extract the information from F# exception values, again using pattern matching:

> try
     raise(BlockedURL("http://www.kaos.org"))
  with
     | BlockedURL(url) -> printf "blocked! url = '%s'
" url;;
blocked! url = 'http://www.kaos.org'

Exception values are always subtypes of the F# type exn, an abbreviation for the .NET type System.Exception. The declaration exception BlockedURL of string is shorthand for defining a new F# class type BlockedURLException, which is a subtype of System.Exception. Exception types can also be defined explicitly by defining new object types. Chapters 5 and 6 look more closely at object types and subtyping.

Table 4-3 summarizes the exception-related language and library constructs.

Table 4.3. Exception-Related Language and Library Constructs

Example Code

Kind

Notes

raise expr

F# library function

Raises the given exception

failwith expr

F# library function

Raises an Exception exception

try expr with rules

F# expression

Catches expressions matching the pattern rules

try expr finally expr

F# expression

Executes the finally expression both when the computation is successful and when an exception is raised

| :? ArgumentException ->

F# pattern rule

A rule matching the given .NET exception type

| :? ArgumentException as e ->

F# pattern rule

A rule matching the given .NET exception type and naming it as its stronger type

| Failure(msg) -> expr

F# pattern rule

A rule matching the given data-carrying F# exception

| exn -> expr

F# pattern rule

A rule matching any exception, binding the name exn to the exception object value

| exn when expr -> expr

F# pattern rule

A rule matching the exception under the given condition, binding the name exn to the exception object value

Having an Effect: Basic I/O

Imperative programming and input/output are closely related topics. The following sections show some very simple I/O techniques using F# and .NET libraries.

Very Simple I/O: Reading and Writing Files

The .NET types System.IO.File and System.IO.Directory contain a number of simple functions to make working with files easy. For example, here's a way to output lines of text to a file:

> open System.IO;;

> File.WriteAllLines("test.txt", [| "This is a test file.";
                                    "It is easy to read." |]);;
val it : unit = ()

Many simple file-processing tasks require reading all the lines of a file. You can do this by reading all the lines in one action as an array using System.IO.File.ReadAllLines:

> open System.IO;;

> File.ReadAllLines("test.txt");;
val it : string [] = [| "This is a test file.";  "It is easy to read." |]

If necessary, the entire file can be read as a single string using System.IO.File.ReadAllText:

> File.ReadAllText("test.txt");;
val it : string = "This is a test file.
It is easy to read
"

You can also use the results of System.IO.File.ReadAllLines as part of a list or sequence defined using a sequence expression:

> [ for line in File.ReadAllLines("test.txt") do
       let words = line.Split [| ' ' |]
       if words.Length > 3 && words.[2] = "easy" then
           yield line ];;
val it : string list = [| "It is easy to read." |]

.NET I/O via Streams

The .NET namespace System.IO contains the primary .NET types for reading/writing bytes and text to/from data sources. The primary output constructs in this namespace are as follows:

  • System.IO.BinaryWriter: Writes primitive data types as binary values. Create using new BinaryWriter(stream). You can create output streams using File.Create(filename).

  • System.IO.StreamWriter: Writes textual strings and characters to a stream. The text is encoded according to a particular Unicode encoding. Create by using new StreamWriter(stream) and its variants or by using File.CreateText(filename).

  • System.IO.StringWriter: Writes textual strings to a StringBuilder, which eventually can be used to generate a string.

Here is a simple example of using System.IO.File.CreateText to create a StreamWriter and write two strings:

> let outp = File.CreateText("playlist.txt");;
val outp : StreamWriter

> outp.WriteLine("Enchanted");;
val it : unit = ()
> outp.WriteLine("Put your records on");;
val it : unit = ()

> outp.Close();

These are the primary input constructs in the System.IO namespace:

  • System.IO.BinaryReader: Reads primitive data types as binary values. When reading the binary data as a string, it interprets the bytes according to a particular Unicode encoding. Create using new BinaryReader(stream).

  • System.IO.StreamReader: Reads a stream as textual strings and characters. The bytes are decoded to strings according to a particular Unicode encoding. Create by using new StreamReader(stream) and its variants or by using File.OpenText(filename).

  • System.IO.StringReader: Reads a string as textual strings and characters.

Here is a simple example of using System.IO.File.OpenText to create a StreamReader and read two strings:

> let inp = File.OpenText("playlist.txt");;
val inp : StreamReader

> inp.ReadLine();;
val it : string = "Enchanted"

> inp.ReadLine();;
val it : string = "Put your records on"

> inp.Close();;
val it : unit = ()

Tip

Whenever you create objects such as a StreamReader that have a Close or Dispose operation or that implement the IDisposable interface, you should consider how to eventually close or otherwise dispose of the resource. We discuss this later in this chapter and in Chapter 8.

Some Other I/O-Related Types

The System.IO namespace contains a number of other types, all of which are useful for corner cases of advanced I/O but that you won't need to use from day to day. For example, the following abstractions appear in the .NET documentation:

  • System.IO.TextReader: Reads textual strings and characters from an unspecified source. This is the common functionality implemented by the StreamReader and StringReader types and the System.Console.In object. The latter is used to access the stdin input.

  • System.IO.TextWriter: Writes textual strings and characters to an unspecified output. This is the common functionality implemented by the StreamWriter and StringWriter types and the System.Console.Out and System.Console.Error objects. The latter are used to access the stdout and stderr output streams.

  • System.IO.Stream: Provides a generic view of a sequence of bytes.

Some functions that are generic over different kinds of output streams make use of these; for example, the formatting function twprintf discussed in the section "Using printf and Friends" writes to any System.IO.TextWriter.

Using System.Console

Some simple input/output routines are provided in the System.Console class. For example:

> System.Console.WriteLine("Hello World");;
Hello World

> System.Console.ReadLine();;
<enter "I'm still here" here>
val it : string = "I'm still here"

The System.Console.Out object can also be used as a TextWriter.

Using printf and Friends

Throughout this book, you've used the printfn function, which is one way to print strings from F# values. This is a powerful, extensible technique for type-safe formatting. A related function called sprintf builds strings:

> sprintf "Name: %s, Age: %d" "Anna" 3;;
val it : string = "Name: Anna, Age: 3"

The format strings accepted by printf and sprintf are recognized and parsed by the F# compiler, and their use is statically type checked to ensure the arguments given for the formatting holes are consistent with the formatting directives. For example, if you use an integer where a string is expected, you see a type error:

> sprintf "Name: %s, Age: %d" 3 10;;
------------------------------^
error: FS0001: This expression was expected to have type string but here
has type int

Several printf-style formatting functions are provided in the Microsoft.FSharp.Text.Printf module. Table 4-4 shows the most important of these.

Table 4.4. Formatting Functions in the Printf Module

Function(s)

Outputs via Type

Outputs via Object

Example

[a]

  

printf(n)a

TextWriter

Console.Out

printf "Result: %d" res

eprintf(n)

TextWriter

Console.Error

eprintf "Result: %d" res

fprintf(n)

TextWriter

Any TextWriter

twprintf stderr "Error: %d" res

sprintf

string

Generates strings

sprintf "Error: %d" res

bprintf(n)

StringBuilder

Any StringBuilder

bprintf buf "Error: %d" res

[a] The functions with a suffix n add a new line to the generated text.

Table 4-5 shows the basic formatting codes for printf-style formatting.

Table 4.5. Formatting Codes for printf-style String and Output Formatting

Code

Type Accepted

Notes

%b

bool

Prints true or false

%s

string

Prints the string

%d, %x, %X, %o, %u

int/int32

Decimal/hex/octal format for any integer types

%e, %E, %f, %g

float

Floating-point formats

%M

decimal

See the .NET documentation

%A

Any type

Uses structured formatting, discussed in the section "Generic Structural Formatting" and in Chapter 5

%O

Any type

Uses Object.ToString()

%a

Any type

Takes two arguments: one is a formatting function, and one is the value to format

%t

Function

Runs the function given as an argument

Any value can be formatted using a %O or %A pattern; these patterns are extremely useful when you're prototyping or examining data. %O converts the object to a string using the Object.ToString() function supported by all values. For example:

> System.DateTime.Now.ToString();;
val it : string = "28/06/20.. 17:14:07 PM"

> sprintf "It is now %O" System.DateTime.Now;;
val it : string = "It is now 28/06/20... 17:14:09"

Note

The format strings used with printf are scanned by the F# compiler during type checking, which means the use of the formats are type-safe; if you forget arguments, a warning is given, and if your arguments are of the wrong type, an error is given. The format strings may also include the usual range of specifiers for padding and alignment used by languages such as C, as well as some other interesting specifiers for computed widths and precisions. You can find the full details in the F# library documentation for the Printf module.

Generic Structural Formatting

Object.ToString() is a somewhat undirected way of formatting data. Structural types such as tuples, lists, records, discriminated unions, collections, arrays, and matrices are often poorly formatted by this technique. The %A pattern uses .NET reflection to format any F# value as a string based on the structure of the value. For example:

> printf "The result is %A
" [1;2;3];;
"The result is [1; 2; 3]"

Generic structural formatting can be extended to work with any user-defined data types, a topic covered on the F# web site. This is covered in detail in the F# library documentation for the printf function.

Cleaning Up with IDisposable, use, and using

Many constructs in the System.IO namespace need to be closed after use, partly because they hold on to operating system resources such as file handles. You can ignore this issue when prototyping code in F# Interactive. However, as we touched on earlier in this chapter, in more polished code, you should use language constructs such as use var = expr to ensure that the resource is closed at the end of the lexical scope where a stream object is active. For example:

let myWriteStringToFile () =
    use outp = File.CreateText(@"playlist.txt")
    outp.WriteLine("Enchanted")
    outp.WriteLine("Put your records on")

This is equivalent to the following:

let myWriteStringToFile () =
    using (File.CreateText(@"playlist.txt")) (fun outp ->
        outp.WriteLine("Enchanted")
        outp.WriteLine("Put your records on"))

where the function using has the following definition in the F# library:

let using (ie : #System.IDisposable) f =
    try f(ie)
    finally ie.Dispose()

use and using ensure that the underlying stream is closed deterministically and the operating system resources are reclaimed when the lexical scope is exited. This happens regardless of whether the scope is exited because of normal termination or because of an exception. Chapter 8 covers the language construct use, the operator using, and related issues in more detail.

Note

If you don't use using or otherwise explicitly close the stream, the stream is closed when the stream object is finalized by the .NET garbage collector. However, it's generally bad practice to rely on finalization to clean up resources this way, because finalization isn't guaranteed to happen in a deterministic, timely fashion.

Working with null Values

The keyword null is used in imperative programming languages as a special, distinguished value of a type that represents an uninitialized value or some other kind of special condition. In general, null isn't used in conjunction with types defined in F# code, although it's common to simulate null with a value of the option type. For example:

> let parents = [("Adam",None); ("Cain",Some("Adam","Eve"))];;
val parents : (string * (string * string) option) list = ...

However, reference types defined in other .NET languages do support null; when using .NET APIs, you may have to explicitly pass null values to the API and also, where appropriate, test return values for null. The .NET Framework documentation specifies when null may be returned from an API. It's recommended that you test for this condition using null pattern tests. For example:

match System.Environment.GetEnvironmentVariable("PATH") with
| null -> printf "the environment variable PATH is not defined
"
| res -> printf "the environment variable PATH is set to %s
" res

The following is a function that incorporates a pattern type test and a null-value test:

let switchOnType (a:obj) =
    match a with
    | null                     -> printf "null!"
    | :? System.Exception as e -> printf "An exception: %s!" e.Message
    | :? System.Int32 as i     -> printf "An integer: %d!" i
    | :? System.DateTime as d  -> printf "A date/time: %O!" d
    | _                        -> printf "Some other kind of object
"

There are other important sources of null values. For example, the semisafe function Array.zeroCreate creates an array whose values are initially null or, in the case of value types, an array each of whose entries is the zero bit pattern. This function is included with F# primarily because there is no other alternative technique to initialize and create the array values used as building blocks of larger, more sophisticated data structures such as queues and hash tables. Of course, you must use this function with care, and in general you should hide the array behind an encapsulation boundary and be sure the values of the array aren't referenced before they're initialized.

Note

Although F# generally enables you to code in a null-free style, F# isn't totally immune to the potential existence of null values: they can come from the .NET APIs, and it's also possible to use Array.zeroCreate and other back-door techniques to generate null values for F# types. If necessary, APIs can check for this condition by first converting F# values to the obj type by calling box and then testing for null (see the F# Informal Language Specification for full details). But in practice, this isn't required by the vast majority of F# programs; for most purposes, the existence of null values can be ignored.

Some Advice: Functional Programming with Side Effects

F# stems from a tradition in programming languages where the emphasis has been on declarative and functional approaches to programming in which state is made explicit, largely by passing extra parameters. Many F# programmers use functional programming techniques first before turning to their imperative alternatives, and we encourage you to do the same, for all the reasons listed at the start of this chapter.

However, F# also integrates imperative and functional programming together in a powerful way. F# is actually an extremely succinct imperative programming language! Furthermore, in some cases, no good functional techniques exist to solve a problem, or those that do are too experimental for production use. This means that in practice, using imperative constructs and libraries is common in F#: for example, many of the examples you saw in Chapters 2 and 3 used side effects to report their results or to create GUI components.

Regardless, we still encourage you to think functionally, even about your imperative programming. In particular, it's always helpful to be aware of the potential side effects of your overall program and the characteristics of those side effects. The following sections describe five ways to help tame and reduce the use of side effects in your programs.

Consider Replacing Mutable Locals and Loops with Recursion

When imperative programmers begin to use F#, they frequently use mutable local variables or reference cells heavily as they translate code fragments from their favorite imperative language into F#. The resulting code often looks very bad. Over time, they learn to avoid many uses of mutable locals. For example, consider the following (naive) implementation of factorization, transliterated from C code:

let factorizeImperative n =
    let mutable primefactor1 = 1
    let mutable primefactor2 = n
    let mutable i = 2
    let mutable fin = false
    while (i < n && not fin) do
        if (n % i = 0) then
            primefactor1 <- i
            primefactor2 <- n / i
            fin <- true
        i <- i + 1
    if (primefactor1 = 1) then None
    else Some (primefactor1, primefactor2)

This code can be replaced by the following use of an inner recursive function:

let factorizeRecursive n =
    let rec find i =
        if i >= n then None
        elif (n % i = 0) then Some(i,n / i)
        else find (i+1)
    find 2

The second code is not only shorter but also uses no mutation, which makes it easier to reuse and maintain. You can also see that the loop terminates (i is increasing toward n) and see the two exit conditions for the function (i >= n and n % i = 0). Note that the state i has become an explicit parameter.

Separate Pure Computation from Side-Effecting Computations

Where possible, separate out as much of your computation as possible using side-effect-free functional programming. For example, sprinkling printf expressions throughout your code may make for a good debugging technique but, if not used wisely, can lead to code that is difficult to understand and inherently imperative.

Separate Mutable Data Structures

A common technique of object-oriented programming is to ensure that mutable data structures are private, nonescaping, and, where possible, fully separated, which means there is no chance that distinct pieces of code can access each other's internal state in undesirable ways. Fully separated state can even be used inside the implementation of what, to the outside world, appears to be a purely functional piece of code.

For example, where necessary, you can use side effects on private data structures allocated at the start of an algorithm and then discard these data structures before returning a result; the overall result is then effectively a side-effect-free function. One example of separation from the F# library is the library's implementation of List.map, which uses mutation internally; the writes occur on an internal, separated data structure that no other code can access. Thus, as far as callers are concerned, List.map is pure and functional. The following is a second example that divides a sequence of inputs into equivalence classes (the F# library function Seq.groupBy does a similar thing):

open System.Collections.Generic

let divideIntoEquivalenceClasses keyf seq =

    // The dictionary to hold the equivalence classes
    let dict = new Dictionary<'key,ResizeArray<'T>>()

    // Build the groupings
    seq |> Seq.iter (fun v ->
        let key = keyf v
        let ok,prev = dict.TryGetValue(key)
        if ok then prev.Add(v)
        else let prev = new ResizeArray<'T>()
             dict.[key] <- prev
             prev.Add(v))

    // Return the sequence-of-sequences. Don't reveal the
    // internal collections: just reveal them as sequences
    dict |> Seq.map (fun group -> group.Key, Seq.readonly group.Value)

This uses the Dictionary and ResizeArray mutable data structures internally, but these mutable data structures aren't revealed externally. The inferred type of the overall function is as follows:

val divideIntoEquivalenceClasses : ('T -> 'key) -> seq<'T> -> seq<'key * seq<'T>>

Here is an example use:

> divideIntoEquivalenceClasses (fun n -> n % 3) [ 0 .. 10 ];;

val it : seq<int * seq<int>>
= seq [(0, seq [0; 3; 6; 9]); (1, seq [1; 4; 7; 10]); (2, seq [2; 5; 8])]

Not All Side Effects Are Equal

It's often helpful to use the weakest set of side effects necessary to achieve your programming task and at least be aware when you're using strong side effects:

  • Weak side effects are effectively benign given the assumptions you're making about your application. For example, writing to a log file is very useful and is essentially benign (if the log file can't grow arbitrarily large and crash your machine!). Similarly, reading data from a stable, unchanging file store on a local disk is effectively treating the disk as an extension of read-only memory, so reading these files is a weak form of side effect that isn't difficult to incorporate into your programs.

  • Strong side effects have a much more corrosive effect on the correctness and operational properties of your program. For example, blocking network I/O is a relatively strong side effect by any measure. Performing blocking network I/O in the middle of a library routine can have the effect of destroying the responsiveness of a GUI application, at least if the routine is invoked by the GUI thread of an application. Any constructs that perform synchronization between threads are also a major source of strong side effects.

Whether a particular side effect is stronger or weaker depends very much on your application and whether the consequences of the side effect are sufficiently isolated and separated from other entities. Strong side effects can and should be used freely in the outer shell of an application or when you're scripting with F# Interactive; otherwise, not much can be achieved.

When you're writing larger pieces of code, you should write your application and libraries in such a way that most of your code either doesn't use strong side effects or at least makes it obvious when these side effects are being used. Threads and concurrency are commonly used to mediate problems associated with strong side effects; Chapter 14 covers these issues in more depth.

Avoid Combining Imperative Programming and Laziness

It's generally thought to be bad style to combine delayed computations (that is, laziness) and side effects. This isn't entirely true; for example, it's reasonable to set up a read from a file system as a lazy computation using sequences. However, it's relatively easy to make mistakes in this sort of programming. For example, consider the following code:

open System.IO
let reader1, reader2 =
    let reader = new StreamReader(File.OpenRead("test.txt"))
    let firstReader() = reader.ReadLine()
    let secondReader() = reader.ReadLine()

    // Note: we close the stream reader here!
    // But we are returning function values which use the reader
    // This is very bad!
    reader.Close()
    firstReader, secondReader

// Note: stream reader is now closed! The next line will fail!
let firstLine = reader1()
let secondLine = reader2()
firstLine, secondLine

This code is wrong because the StreamReader object reader is used after the point indicated by the comment. The returned function values are then called, and they try to read from the captured variable reader. Function values are just one example of delayed computations: other examples are lazy values, sequences, and any objects that perform computations on demand. Be careful not to build delayed objects such as reader that represent handles to transient, disposable resources, unless those objects are used in a way that respects the lifetime of that resource.

The previous code can be corrected to avoid using laziness in combination with a transient resource:

open System.IO

let line1, line2 =
    let reader = new StreamReader(File.OpenRead("test.txt"))
    let firstLine = reader.ReadLine()
    let secondLine = reader.ReadLine()
    reader.Close()
    firstLine, secondLine

Another technique uses language and/or library constructs that tie the lifetime of an object to some larger object. For example, you can use a use binding within a sequence expression, which augments the sequence object with the code needed to clean up the resource when iteration is finished or terminates. This technique is discussed further in Chapter 8 and shown by example here:

let reader =
    seq { use reader = new StreamReader(File.OpenRead("test.txt"))
          while not reader.EndOfStream do
              yield reader.ReadLine() }

The general lesson is to try to keep your core application pure. Use both delayed computations (laziness) and imperative programming (side effects) where appropriate, but be careful about using them together.

Summary

In this chapter, you learned how to do imperative programming in F#, from some of the basic mutable data structures such as reference cells to working with side effects such as exceptions and I/O. You also looked at some general principles for avoiding the need for imperative programming and isolating your uses of side effects. The next chapter returns to some of the building blocks of both functional and imperative programming in F#, with a deeper look at types, type inference, and generics.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset