Chapter 3. Creating Your First F# Program—Introducing Functional Programming

F# is effective and productive primarily because it's built on the tried and tested constructs of functional programming. This chapter covers the core building blocks of functional programming with F#, including simple types and function values, pattern matching, lists, options, and sequences; as well as how to declare some of your own simple types. Chapters 4 through 6 cover imperative programming, generics, and object-oriented programming.

Getting Started with F# Arithmetic

We first cover the most common base types of data manipulated in F# code, beginning with the basics of F# arithmetic.

Basic Literals

Table 3-1 lists the basic numeric types used in F# code and their corresponding literal forms. The table also lists the non-numeric types bool and unit.

Table 3.1. Basic Types and Literals

Type

Description

Sample Literals

.NET Name

bool

True/false values

true,false

System.Boolean

byte

8-bit unsigned integers

0uy,19uy,0xFFuy

System.Byte

sbyte

8-bit signed integers

0y,19y,0xFFy

System.SByte

int16

16-bit signed integers

0s,19s, 0x0800s

System.Int16

uint16

16-bit unsigned integers

0us,19us, 0x0800us

System.UInt16

int, int32

32-bit signed integers

0, 19, 0x0800, 0b0001

System.Int32

uint32

32-bit unsigned integers

0u, 19u, 0x0800u

System.UInt32

int64

64-bit signed integers

0L, 19L, 0x0800L

System.Int64

uint64

64-bit unsigned integers

0UL, 19UL, 0x0800UL

System.UInt64

nativeint

Machine-sized signed integers

0n, 19n, 0x0800n

System.IntPtr

unativeint

Machine-sized unsigned integers

0un, 19un, 0x0800un

System.UIntPtr

single, float32

32-bit IEEE floating-point

0.0f, 19.7f, 1.3e4f

System.Single

double, float

64-bit IEEE floating-point

0.0, 19.7, 1.3e4

System.Double

decimal

High-precision decimal values

0M, 19M, 19.03M

System.Decimal

bigint

Arbitrarily large integers

0I, 19I

Math.BigInt

bignum

Arbitrary-precision rationals

0N, 19N

Math.BigNum

unit

The type with only one value

()

Core.Unit

Arithmetic Operators

Table 3-2 lists the most commonly used arithmetic operators. These are overloaded to work with all the numeric types listed in Table 3-1.

Table 3.2. Arithmetic Operators and Examples

Operator

Description

Sample Use on int

Sample Use on float

+

Unchecked addition

1 + 2

1.0 + 2.0

Unchecked subtraction

12 − 5

12.3 − 5.4

*

Unchecked multiplication

2 * 3

2.4 * 3.9

/

Division

5 / 2

5.0 / 2.0

%

Modulus

5 % 2

5.4 % 2.0

-

Unary negation

-(5+2)

-(5.4+2.4)

The behavior of these and other operators can be extended for user-defined types, a topic covered in Chapter 6. In F#, addition, subtraction, and multiplication over integers are unchecked; that is, if overflow or underflow occurs beyond the representable range, then wraparound occurs. For example, 2147483647 is the largest representable 32-bit integer of the int type:

> 2147483647+1;;
val it : int = −2147483648

You can access checked versions of arithmetic operators that raise System.OverflowException exceptions by opening the Microsoft.FSharp.Core.Operators.Checked module. If avoiding overflow is a priority, then using the decimal, bigint, and bignum types is recommended. Division by zero raises a System.DivideByZeroException exception, except in the case of floating-point numbers, where it returns one of the special floating-point numbers Infinity and -Infinity. Operator overloading interacts with type inference—if a use of an overloaded operator isn't otherwise constrained to work on a particular type, then F# assumes it works on 32-bit integers.

To constrain a use of an operator to a particular type, you must give a type annotation that has the effect of telling the compiler the type on the left of the two arguments to the binary operator. For example, in the absence of additional type information, the following function is assumed to work with integers:

> let squareAndAdd a b = a * a + b;;
val squareAndAdd : int -> int -> int

A single type annotation on a is sufficient to indicate that a * a is an operation on float values and thus returns a float value, and that a * a + b is also an operation on float:

> let squareAndAdd (a:float) b = a * a + b;;
val squareAndAdd : float -> float -> float

If you want, you can also give full type annotations for the arguments and return type of a function:

> let squareAndAdd (a:float) (b:float) : float = a * a + b;;
val squareAndAdd : float -> float -> float

Bitwise Operations

All the integer types listed in Table 3-1 support bitwise manipulations on their underlying representations. Table 3-3 shows the bitwise manipulation operators.

Table 3.3. Bitwise Arithmetic Operators and Examples

Operator

Description

Sample Use

Result

&&&

Bitwise "and"

0x65 &&& 0x0F

0x05

|||

Bitwise "or"

0x65 ||| 0x18

0x7D

^^^

Bitwise "exclusive or"

0x65 ^^^ 0x0F

0x6A

~~~

Bitwise negation

~~~0x65

0xFFFFFF9a

<<<

Left shift

0x01 <<< 3

0x08

>>>

Right shift (arithmetic if signed)

0x65 >>> 3

0x0C

The following sample shows how to use these operators to encode 32-bit integers into 1, 2, or 5 bytes, represented by returning a list of integers. Integers in the range 0 to 127 return a list of length 1:

let encode (n: int32) =
    if   (n >= 0    && n <= 0x7F)   then [ n ]
    elif (n >= 0x80 && n <= 0x3FFF) then [ (0x80 ||| (n >>> 8)) &&& 0xFF;
                                           (n &&& 0xFF) ]
    else  [ 0xC0; ((n >>> 24) &&& 0xFF);
                  ((n >>> 16) &&& 0xFF);
                  ((n >>> 8)  &&& 0xFF);
                   (n         &&& 0xFF) ]

Here's an example of the function in action:

> encode 32;;
val it : int32 list = [32]

> encode 320;;
val it : int32 list = [129; 64]

> encode 32000;;
val it : int32 list = [192; 0; 0; 125; 0]

Arithmetic Conversions

Numeric types aren't implicitly converted—conversions between different numeric types must be made explicitly. You do this by using overloaded conversion operators. These work the same way as overloaded infix operators such as + and *. Table 3-4 shows the primary conversion operators.

Table 3.4. Overloaded Arithmetic Conversions and Examples

Operator

Description

Sample Use

Result

sbyte

Convert/truncate to sbyte

sbyte (−17)

-17y

byte

Convert/truncate to byte

byte 255

255uy

int16

Convert/truncate to int16

int16 0

0s

uint16

Convert/truncate to uint16

uint16 65535

65535us

int/int32

Convert/truncate to int32

int 17.8

17

uint32

Convert/truncate to uint32

uint32 12

12u

int64

Convert/truncate to int64

int64 (−100.4)

−100L

uint64

Convert/truncate to uint64

uint64 1

1UL

float32

Convert to float32/single

float32 65

65.0f

float

Convert to float/double

float 65

65.0

These conversions are all unchecked in the sense that they won't raise exceptions. Again, the Microsoft.FSharp.Core.Operators.Checked module has corresponding definitions of these operators. An alternative is to use the .NET static methods contained in the type System.Convert, such as System.Convert.ToDouble( ). These do perform checking, which means they raise an exception if the source number can't be represented within the numeric range of the target type. As with many .NET constructs, uses of System.Convert methods may require type annotations to resolve overloading, discussed further in Chapters 5 and 6.

Arithmetic Comparisons

When used with numeric values, the binary comparison operators =, <>, <, <=, >, >=, min, and max perform comparisons according to the natural ordering for each particular numeric type. You can also use these operators on other data types, such as to compare lists of integers, and you can customize their behavior for new types you define. Chapter 5 discusses generic comparison in detail, and Chapter 8 discusses customizing generic comparison.

When used with floating-point values, these operators implement the IEEE semantics for NaN (Not a Number) values. For example, (NaN = NaN) is false, as are (NaN <= NaN) and (NaN < NaN).

Overloaded Math Functions

The module Microsoft.FSharp.Core.Operators includes the definition of a number of useful overloaded math operators. These are shown in Table 3-5 and are overloaded either on a suitable range of integer types or on the basic floating-point types.

Table 3.5. Overloaded Math Functions and Examples

Function

Description

Sample Use

Result

abs

Absolute value of signed numeric types

abs (−10.0f)

10.0f

cos, sin, tan

Trigonometric functions

cos 0.0

1.0

cosh, sinh, tanh

Hyperbolic trigonometric functions

cosh 1.0

1.543080635

acos, asin, atan, atan2

Inverse trigonometric functions

acos 1.0

0.0

ceil, floor

Round up, round down

ceil 1.001

2.0

truncate

Round toward zero

truncate 8.9

8.0

exp, log, log10

Exponent, logarithm, base-10 logarithm

exp 1.0

2.718281828

( ** )

Power

2.0 ** 4.0

16.0

Introducing Simple Strings

The F# type string is an abbreviation for the .NET type System.String and represents a sequence of Unicode UTF-16 characters. The following sections briefly introduce strings and the most useful functions for formatting them.

Working with String Literals and Primitives

Table 3-6 shows the different forms for writing string literals.

Table 3.6. String and Character Literals

Example

Kind

Type

"Humpty Dumpty"

String

string

"c:\Program Files"

String

string

@"c:Program Files"

Verbatim string

string

"xyZy3d2"B

Literal byte array

byte []

'c'

Character

char

Table 3-7 shows the escape characters you can use in strings and characters.

Table 3.7. Escape Characters in Nonverbatim Strings

Escape

Character

ASCII/Unicode Value

Examples

New line

10

" "

Carriage return

13

" "

Tab

9

" "



Backspace

8

 

NNN

Trigraph

NNN

"32" (space)

uNNNN

Unicode character

NNNN

"u00a9" (©)

UNNNNNNNN

Long Unicode character

NNNN NNNN

"U00002260"(_)

As shown in Table 3-6, a literal form is also available for arrays of bytes: the characters are interpreted as ASCII characters, and non-ASCII characters can be embedded using escape codes. This can be useful when you're working with binary protocols:

> "MAGIC"B;;
val it : byte [] = [|77uy; 65uy; 71uy; 73uy; 67uy|]

Verbatim string literals are particularly useful for file and path names that contain the backslash character ():

> let dir  = @"c:Program Files";;
val dir : string

You can also use multiline string literals:

> let s = "All the kings horses
- and all the kings men";;
val s : string

The operator .[] is used to access the elements of a string, and the property Length retrieves its length:

> let s = "Couldn't put Humpty";;
val s : string

> s.Length;;
val it : int = 19

> s.[13];;
val it : char = 'H'

The operator .[index..index] can be used to take substrings:

> let s = "Couldn't put Humpty";;
val s : string

> s.[13..16];;
val it : string = "Hump"

Strings are immutable; that is, a string value can't be modified after it's built. For example, the Substring method on the string type doesn't modify the original string but returns a new string representing the result. As mentioned in Chapter 2, immutability is a key concept for many F# values, and you encounter it many places in this book. If you attempt to mutate a string, you get an error like the one shown here:

> let s = "Couldn't put Humpty";;
val s : string = "Couldn't put Humpty"

> s.[13] <- 'h';;

  s.[13] <- 'h';;
  ^^
stdin(75,0): error: FS0001: Type error in the use of the overloaded operator
'set_Item'. The type 'string' does not support any operators named 'set_Item'

Building Strings

The simplest way to build strings is via concatenation using the + operator:

> "Couldn't put Humpty" + " " + "together again";;
val it : string = "Couldn't put Humpty together again"

You can also build strings using objects of the .NET type System.Text.StringBuilder. These objects are mutable buffers that you can use to accumulate and modify text, and they're more efficient than repeated uses of the + operator. Here's an example:

> let buf = new System.Text.StringBuilder();;
val buf : System.Text.StringBuilder

> buf.Append("Humpty Dumpty");;

> buf.Append(" sat on the wall");;

> buf.ToString();;
val it : string = "Humpty Dumpty sat on the wall"

Note

For compatibility with OCaml, the ^ operator can also be used for string concatenation, although it's generally used only when cross-compiling code with OCaml.

Working with Lists and Options

Some of the foundational data structures of F# coding are tuples, lists, and options. The following sections discuss these and some related topics by example.

Using F# Lists

F# lists are a common data structure used in functional programming. You saw some examples of concrete lists in Chapter 2. Table 3-8 shows the primitive constructs for building lists.

Table 3.8. Some List-Related Language Constructs and Operators

Operator/Expression

Description

Examples

[]

The empty list

[]

expr :: expr

"Cons" an element with a list

1 :: [2; 3]

[expr; ...; expr]

A list value

[1; 2; 3]

[expr .. expr]

A range of integers

[1 .. 99]

[ for x in list ... ]

A generated list (see end of chapter)

[ for x in 1..99 -> x * x ]

expr @ expr

Concatenates two lists

[1; 2] @ [3]

Here are some basic list values:

let oddPrimes = [3; 5; 7; 11]
let morePrimes = [13; 17]
let primes = 2 :: (oddPrimes @ morePrimes)

The value and type of primes are as follows:

val primes : int list = [2; 3; 5; 7; 11; 13; 17]

Lists are immutable: the cons :: and append @ operations don't modify the original lists; instead, they create new lists. You can see this in the following interactive session:

> let people = [ "Adam"; "Dominic"; "James" ];;
val people : string list

> people;;
val it : string list = [ "Adam"; "Dominic"; "James" ]

> "Chris" :: people;;
val it : string list = [ "Chris"; "Adam"; "Dominic"; "James" ]

> people;;
val it : string list = [ "Adam"; "Dominic"; "James" ]

Note that people has not been changed by the construction of a new list using the cons operator. F# lists are represented in memory as linked lists; each F# list value is a cons cell containing a value plus a pointer to the next chain in the list, or else it's a special nil object. When you create a new list using the :: operator, then the tail of the new list points to the old list, which ensures that the inner memory associated with lists is often reused as part of multiple list values.

You can decompose lists from the head downward by using pattern matching. You saw some simple examples of pattern matching on tuples in Chapter 2, and we look at pattern matching in more detail in "Getting Started with Pattern Matching" later in this chapter. Here is an example of using pattern matching with lists:

let printFirst primes =
    match primes with
    | h :: t -> printfn "The first prime in the list is %d" h
    | [] -> printfn "No primes found in the list"
> printFirst oddPrimes;;
The first prime in the list is 3
val it : unit = ()

The first line after the match is a pattern-matching rule that matches the input primes against the pattern h :: t. If primes is a nonempty list, then the match is successful, and the first printfn is executed with h bound to the head of the list and t to its tail. The second line considers the case where primes is an empty list. Note that the :: and [] symbols can be used both to build up lists in expressions and to decompose them in pattern matching. The F# library also includes a module List that contains some useful functions related to programming with lists. You see many of these functions in the next section and throughout this book. Table 3-9 shows some of them.

F# lists aren't appropriate for all circumstances; for example, very large data structures should probably be represented using arrays or other data structures or even managed by an external tool such as a relational database. We discuss a number of immutable data structures in the "Some Common Immutable Data Structures" sidebar.

Table 3.9. Some Sample Functions in the List Module

Function

Type

Description

List.length

'T list -> int

Returns the length of the list.

List.head

'T list -> 'T

Returns the first element of a nonempty list.

List.tail

'T list -> 'T list

Returns all the elements of a nonempty list except the first.

List.init

int -> (int -> 'T) -> 'T list

Returns a new list. The length of the new list is specified by the first parameter. The second parameter must be a generating function that maps list indexes to values.

List.append

'T list -> 'T list -> 'T list

Returns a new list containing the elements of the first list followed by the elements of the second list.

List.filter

('T -> bool) -> 'T list -> 'T list

Returns a new list containing only those elements of the original list where the function returns true.

List.map

('T -> 'U ) -> 'T list -> 'U list

Creates a new list by applying a function to each element of the original list.

List.iter

('T -> unit) -> 'T list -> unit

Executes the given function for each element of the list.

List.unzip

('T * 'U) list -> 'T list * 'U list

Returns two new lists containing the first and second elements of the pairs in the input list.

List.zip

'T list -> 'U list -> ('T * 'U) list

Returns a new list containing the elements of the two input lists combined pairwise. The input lists must be the same length; otherwise, an exception is raised.

List.toArray

'T list -> 'T[]

Creates an array from the given list.

List.ofArray

'T[] -> 'T list

Creates a list from the given array.

Here are examples of how to use some of the functions from Table 3-9. The last two examples use function values, which we cover in more detail in "Introducing Function Values" later in this chapter.

> List.head [5; 4; 3];;
val it : int = 5

> List.tail [5; 4; 3];;
val it : int list = [ 4; 3 ]

> List.map (fun x -> x*x) [1; 2; 3];;
val it : int list = [ 1; 4; 9 ]

> List.filter (fun x -> x % 3 = 0) [2; 3; 5; 7; 9];;
val it : int list = [ 3; 9 ]

Using F# Option Values

Like lists and tuples, option values are simple constructs frequently used as workhorses in F# coding. An option is simply either a value Some(v) or the absence of a value None. For example, options are useful for returning the value of a search where you may or may not have a result. You see in the section "Defining Discriminated Unions" that the option type is defined in the F# library as follows:

type 'T option =
    | None
    | Some of 'T

The following is a data structure that uses options to represent the (optional) parents of some well-known characters:

> let people = [ ("Adam", None);
                 ("Eve" , None);
                 ("Cain", Some("Adam","Eve"));
                 ("Abel", Some("Adam","Eve")) ];;
val people : (string * (string *string) option) list

Pattern matching is frequently used to examine option values:

> let showParents (name,parents) =
      match parents with
      | Some(dad,mum) -> printfn "%s has father %s, mother %s" name dad mum
      | None          -> printfn "%s has no parents!" name;;
val showParents : (string * (string * string) option) -> unit

> showParents ("Adam",None);;
Adam has no parents
val it : unit = ()

The F# library also includes a module Option that contains useful functions for programming with options. Table 3-10 shows some of these. Although it's easy to code them by hand using pattern matching, it can also be useful to learn and rely on the standard definitions.

Table 3.10. Some Sample Functions in the Option Module

Function

Type

Description

Option.get

'T option -> 'T

Returns the value of a Some option; otherwise raises an exception.

Option.isNone

'T option -> bool

Returns true for a None option.

Option.map

('T -> 'U) -> 'T option -> 'U option

Given None, returns None. Given Some(x), returns Some(f x), where f is the given function.

Option.iter

('T -> unit) -> 'T option -> unit

Applies the given function to the value of a Some option; otherwise, does nothing.

Using Option Values for Control

You can use option values for both data and control; they're often used to represent the success or failure of a computation. This can be useful when you're catching an exception, as shown in the following example (which uses the function http from Chapter 2):

let fetch url =
    try Some (http url)
    with :? System.Net.WebException -> None

Chapter 4 describes exceptions in more detail—what matters here is that if a network error occurs during the HTTP request, the exception is caught, and the result of the fetch function is the value None. Successful web page requests return a Some value. Option values can then be discriminated and decomposed using pattern matching, as shown here:

> match (fetch "http://www.nature.com") with
  | Some text -> printfn "text = %s" text
  | None -> printfn "**** no web page found";;
text = <HTML> ... </HTML>  (note: the HTML is shown here if connected to the web)
val it : unit = ()

Working with Conditionals: && and ||

A basic control construct in F# programming is if/then/elif/else. Here's an example:

let round x =
    if x >= 100 then 100
    elif x < 0 then 0
    else x

Conditionals are really shorthand for pattern matching; for example, the previous code could be written like this:

let round x =
    match x with
    | _ when x >= 100 -> 100
    | _ when x < 0    -> 0
    | _               -> x

Conditionals are always guarded by a Boolean-valued expression. You can build them using && and || (the "and" and "or" operators) as well as any library functions that return Boolean values:

let round2 (x, y) =
    if x >= 100 || y >= 100 then 100,100
    elif x < 0 || y < 0 then 0,0
    else x,y

The operators && and || have the usual short-circuit behavior in that the second argument of && is evaluated only if the first evaluates to true, and likewise, the second argument of || is evaluated only if the first evaluates to false.

Defining Recursive Functions

One of the fundamental building blocks of computation in F# is recursion. The following code shows a simple, well-known recursive function:

> let rec factorial n = if n <= 1 then 1 else n * factorial (n-1);;
val factorial : int -> int

> factorial 5;;
val it : int = 120

This example shows that a recursive function is simply one that can call itself as part of its own definition. Recursive functions are introduced by let rec. Functions aren't recursive by default, because it's wise to isolate recursive functions to help you control the complexity of your algorithms and keep your code maintainable. It may help to visualize the execution of factorial 5 in the following way (note that in reality, F# executes the function using efficient native code):

factorial 5
= 5 * factorial 4
= 5 * (4 * factorial 3)
= 5 * (4 * (3 * factorial 2))
= 5 * (4 * (3 * (2 * factorial 1 )))
= 5 * (4 * (3 * (2 * 1)))
= 5 * (4 * (3 * 2))
= 5 * (4 * 6)
= 5 * 24
= 120

As with all calls, the execution of the currently executing instance of the function is suspended while a recursive call is made.

Many of the operators you've encountered so far can be coded as recursive functions. For example, the following is one possible implementation of List.length:

let rec length l =
    match l with
    | [] -> 0
    | h :: t -> 1 + length t

Likewise, many functions such as List.iter are often implemented using recursive functions.

Recursion is sometimes used as a means of programming particular patterns of control. For example, the following code repeatedly fetches the HTML for a particular web page, printing each time it's fetched:

let rec repeatFetch url n =
    if n > 0 then
        let html = http url
        printfn "fetched <<< %s >>> on iteration %d" html n
        repeatFetch url (n-1)

Recursion is powerful but not always the ideal way to encode either data manipulations or control constructs, at least if other techniques are readily available. For example, the previous program could be implemented using a for loop, as explained in Chapter 4, which would be clearer. Likewise, you should typically avoid explicit recursion if an operator is available that captures the pattern of recursion being used. For example, many explicit loops and recursive functions can be replaced by uses of functions such as List.map and Array.map.

A typical error with recursion is to forget to decrement a variable at the recursive call. For example, we incorrectly entered the following nonterminating function when writing this chapter:

let rec badFactorial n = if n <= 0 then 1 else n * badFactorial n

You should always check your recursive calls to ensure that the function is tending toward termination—that is, that the arguments are approaching the base case. This is called well-founded recursion.

You can define multiple recursive functions simultaneously by separating the definitions with and. These are called mutually recursive functions. For example:

let rec even n = (n = 0u) || odd(n-1u)
and     odd n = (n <> 0u) && even(n-1u)

This gives the following types:

val even : uint32 -> bool
val odd : uint32 -> bool

Of course, a more efficient, nonrecursive implementation of these is available!

let even (n:uint32) = (n % 2u) = 0u
let odd  (n:uint32) = (n % 2u) = 1u

You must sometimes take care with recursive functions to ensure that they're tail recursive, or else the computation stack of your program may be exhausted by large inputs. This is particularly important for library functions or functions that operate over large data structures with very large numbers of recursive calls. Indeed, the implementation of length shown previously isn't tail recursive. Chapter 8 discusses tail recursion in depth.

Introducing Function Values

This section covers the foundational building block of F# functional programming: function values. We begin with a simple and well-known example: using function values to transform one list into another.

One of the primary uses of F# lists is as a general-purpose, concrete data structure for storing ordered input lists and ordered results. Input lists are often transformed to output lists using aggregate operations that transform, select, filter, and categorize elements of the list according to a range of criteria. These aggregate operations provide an excellent introduction to how to use function values. Let's take a closer look at this in the following code sample, which continues from the definition of http from Listing 2-2 in Chapter 2:

> let sites = [ "http://www.live.com";
                "http://www.google.com" ];;
val sites : string list

> let fetch url = (url, http url);;
val fetch : string -> string * string

> List.map fetch sites;;
val it : (string * string) list
= [ ("http://www.live.com", "<html>...</html>");
    ("http://www.google.com", "<html>...</html>"); ]

The first interaction defines sites as a literal list of URLs, and the second defines the function fetch. The third calls the aggregate operation List.map. This accepts the function value fetch as the first argument and the list sites as the second argument. The function applies fetch to each element of the list and collects the results in a new list.

Types are one useful way to help learn what a function does. Here's the type of List.map:

> List.map;;
val it: ('T -> 'U) -> 'T list -> 'U list

This says List.map accepts a function value as the first argument and a list as the second argument, and it returns a list as the result. The function argument can have any type 'T -> 'U, and the elements of the input list must have a corresponding type 'T. The symbols 'T and 'U are called type parameters, and functions that accept type parameters are called generic. Chapter 5 discusses type parameters in detail.

Tip

You can often deduce the behavior of a function from its type, especially if its type involves type parameters. For example, look at the type of List.map. Using type parameters, you can observe that the type 'T list of the input list is related to the type 'T accepted by the function passed as the first parameter. Similarly, the type 'U returned by this function is related to the type 'U list of the value returned by List.map. From this, it's reasonable to conclude that List.map calls the function parameter for items in the list and constructs its result using the values returned.

Using Anonymous Function Values

Function values are so common in F# programming that it's convenient to define them without giving them names. Here is a simple example:

> let primes = [2; 3; 5; 7];;
val primes : int list

> let primeCubes = List.map (fun n -> n * n * n) primes;;
val primeCubes: int list

> primeCubes;;
val it : int list = [8; 27; 125; 343]

The definition of primeCubes uses the anonymous function value (fun n -> n * n * n). Such values are similar to function definitions but are unnamed and appear as an expression rather than as a let declaration. fun is a keyword meaning function, n represents the argument to the function, and n * n * n is the result of the function. The overall type of the anonymous function expression is int -> int. You could use this technique to avoid defining the intermediary function fetch in the earlier sample:

let resultsOfFetch = List.map (fun url -> (url, http url)) sites

You see anonymous functions throughout this book. Here is another example:

> List.map (fun (_,p) -> String.length p) resultsOfFetch;;
val it : int list = [3932; 2827 ]

Here you see two things:

  • The argument of the anonymous function is a tuple pattern. Using a tuple pattern automatically extracts the second element from each tuple and gives it the name p within the body of the anonymous function.

  • Part of the tuple pattern is a wildcard pattern, indicated by an underscore. This indicates that you don't care what the first part of the tuple is; you're interested only in extracting the length from the second part of the pair.

Computing with Aggregate Operators

Functions such as List.map are called aggregate operators, and they're powerful constructs, especially when combined with the other features of F#. Here is a longer example that uses the aggregate operators Array.filter and List.map to count the number of URL links in an HTML page and then collects stats on a group of pages (this sample uses the function http defined in Chapter 2):

let delimiters = [| ' '; '
'; '	'; '<'; '>'; '=' |]
let getWords (s: string) = s.Split delimiters
let getStats site =
    let url = "http://" + site
    let html = http url
    let hwords = html |> getWords
    let hrefs = html |> getWords |> Array.filter (fun s -> s = "href")
    (site,html.Length, hwords.Length, hrefs.Length)

Here, you use the function getStats with three web pages:

> let sites = [ "www.live.com";"www.google.com";"search.yahoo.com" ];;
val sites : string list

> sites |> List.map getStats;;
val it : (string * int * int * int) list
  = [("www.live.com", 7728, 1156, 10);
     ("www.google.com", 2685, 496, 14);
     ("search.yahoo.com", 10715, 1771, 38)]

The function getStats computes the length of the HTML for the given web site, the number of words in the text of that HTML, and the approximate number of links on that page.

The previous code sample extensively uses the |> operator to pipeline operations, discussed in the "Pipelining with |>" sidebar. The F# library design ensures that a common, consistent set of aggregate operations is defined for each structured type. Table 3-11 shows how the same design patterns occur for the map abstraction.

Table 3.11. A Recurring Aggregate Operator Design Pattern from the F# Library

Operator

Type

List.map

('T -> 'U) -> 'T list -> 'U list

Array.map

('T -> 'U) -> 'T [] -> 'U []

Option.map

('T -> 'U) -> 'T option -> 'U option

Seq.map

('T -> 'U) -> seq<'T> -> seq<'U>

WYou may see types that define methods such as Map that provide a slightly more succinct way of transforming data. For example, this might allow you to write a transformation like so:

site.Map getStats

This may require the use of a type annotation. Both styles are used in F# programming, depending on the methods and properties available for a particular data type.

Composing Functions with >>

You saw earlier how to use the |> forward pipe operator to pipe values through a number of functions. This was a small example of the process of computing with functions, an essential and powerful programming technique in F#. This section covers ways to compute new function values from existing ones using compositional techniques. First, let's look at function composition. For example, consider the following code:

let google = http "http://www.google.com"
google |> getWords |> List.filter (fun s -> s = "href") |> List.length

You can rewrite this code using function composition as follows:

let countLinks = getWords >> List.filter (fun s -> s = "href") >> List.length
google |> countLinks

You define countLinks as the composition of three function values using the >> forward composition operator. This operator is defined in the F# library as follows:

let (>>) f g x = g(f(x))

You can see from the definition that f >> g gives a function value that first applies f to the x and then applies g. Here is the type of >>:

val (>>) : ('T -> 'U) -> ('U -> 'c) -> ('T -> 'c)

Note that >> is typically applied to only two arguments: those on either side of the binary operator, here named f and g. The final argument x is typically supplied at a later point.

F# is good at optimizing basic constructions of pipelines and composition sequences from functions—for example, the function countLinks shown earlier becomes a single function calling the three functions in the pipeline in sequence. This means sequences of compositions can be used with relatively low overhead.

Building Functions with Partial Application

Composing functions is just one way to compute interesting new functions. Another useful way is to use partial application. Here's an example:

let shift (dx,dy) (px,py) = (px + dx, py + dy)
let shiftRight = shift (1,0)
let shiftUp    = shift (0,1)
let shiftLeft  = shift (−1,0)
let shiftDown  = shift (0,−1)

The last four functions are defined by calling shift with only one argument, in each case leaving a residue function that expects an additional argument. F# Interactive reports the types as follows:

val shift : int * int -> int * int -> int * int
val shiftRight : int * int -> int * int
val shiftLeft  : int * int -> int * int
val shiftUp    : int * int -> int * int
val shiftDown  : int * int -> int * int

Here is an example of how to use shiftRight and how to apply shift to new arguments (2,2):

> shiftRight (10,10);;
val it : int * int = (11,10)

> List.map (shift (2,2)) [ (0,0); (1,0); (1,1); (0,1) ];;
val it : int * int list = [ (2,2); (3,2); (3,3); (2,3) ]

In the second example, the function shift takes two pairs as arguments. You bind the first parameter to (2, 2). The result of this partial application is a function that takes one remaining tuple parameter and returns the value shifted by two units in each direction. This resulting function can now be used in conjunction with List.map.

Using Local Functions

Partial application is one way in which functions can be computed rather than simply defined. This technique becomes very powerful when combined with additional local definitions. Here's a simple and practical example, representing an idiom common in graphics programming:

open System.Drawing;;
let remap (r1: Rectangle) (r2: Rectangle) =
    let scalex = float r2.Width / float r1.Width
    let scaley = float r2.Height / float r1.Height
    let mapx x = int (float r2.Left + truncate (float (x - r1.Left) * scalex))
    let mapy y = int (float r2.Top + truncate (float (y - r1.Top) * scaley))
    let mapp (p: Point) = Point(mapx p.X, mapy p.Y)
    mapp

The function remap computes a new function value mapp that maps points in one rectangle to points in another. F# Interactive reports the type as follows:

val remap : Rectangle -> Rectangle -> (Point -> Point)

The type Rectangle is defined in the .NET library System.Drawing.dll and represents rectangles specified by integer coordinates. The computations on the interior of the transformation are performed in floating point to improve precision. You can use this as follows:

> let mapp = remap (Rectangle(100,100,100,100)) (Rectangle(50,50,200,200));;
val mapp : Point -> Point

> mapp (Point(100,100));;
val it : Point = X=50,Y=50

> mapp (Point(150,150));;
val it : Point = X=150,Y=150

> mapp (Point(200,200));;
val it : Point = X=250,Y=250

The intermediate values scalex and scaley are computed only once, despite the fact that you've called the resulting function mapp three times. It may be helpful to think of mapp as a function object being generated by the remap function.

In the previous example, mapx, mapy, and mapp are local functions—functions defined locally as part of the implementation of remap. Local functions can be context dependent; in other words, they can be defined in terms of any values and parameters that happen to be in scope. Here, mapx is defined in terms of scalex, scaley, r1, and r2.

Note

Local and partially applied functions are, if necessary, implemented by taking the closure of the variables they depend on and storing them away until needed. In optimized F# code, the F# compiler often avoids this and instead passes extra arguments to the function implementations. Closure is a powerful technique that is used frequently in this book. It's often used in conjunction with functions, as in this chapter, but is also used with object expressions, sequence expressions, and class definitions.

Using Functions as Abstract Values

The function remap from the previous section generates values of type Point -> Point. You can think of this type as a way of representing "the family of transformations for the type Point." Many useful concepts can be modeled using function types and values. For example:

  • The type unit -> unit is frequently used to model actions: operations that run and perform some unspecified side effect. For example, consider the expression (fun () -> printfn "Hello World").

  • Types of the form type -> type -> int are frequently used to model comparison functions over the type type. You also see type * type -> int used for this purpose, where a tuple is accepted as the first argument. In both cases, a negative result indicates less than, zero indicates equals, and a positive result indicates greater than. Many collection types accept such a value as a configuration parameter. For example, the F# compare function performs generic comparison on values (see Chapter 5).

  • Types of the form type -> unit are frequently used to model callbacks. Callbacks are often run in response to a system event, such as when a user clicks a user interface element. In this case, the parameter sent to the handler has some specific type, such as System.Windows.Forms.MouseEventArgs.

  • Types of the form unit -> type are frequently used to model delayed computations, which are values that, when required, produce a value of type type. For example, a threading library can accept such a function value and execute it on a different thread, eventually producing a value of type type. Delayed computations are related to lazy computations and sequence expressions, discussed in "Using Sequence Expressions."

  • Types of the form type -> unit are frequently used to model sinks, which are values that, when required, consume a value of type type. For example, a logging API may use a sink to consume values and write them to a log file.

  • Types of the form type1 -> type2 are frequently used to model transformers, which are functions that transform each element of a collection. You see this pattern in the map operator for many common collection types.

  • Types of the form type1 -> type2 -> type2 are frequently used to model visitor accumulating functions, which are functions that visit each element of a collection (type type1) and accumulate a result (type type2). For example, a visitor function that accumulates integers into a set of integers has type int -> Set<int> -> Set<int>.

Note

The power of function types to model many different concepts is part of the enduring appeal of functional programming. This is one of the refreshing features F# brings to object-oriented programming: many simple abstractions are modeled in very simple ways and often have simple implementations through orthogonal, unified constructs such as anonymous function values and function compositions.

Iterating with Aggregate Operators

It's common to use data to drive control. In functional programming, the distinction between data and control is often blurred: function values can be used as data, and data can influence control flow. One example is using a function such as List.iter to iterate over a list:

let sites = [ "http://www.live.com";
              "http://www.google.com";
              "http://search.yahoo.com" ]

sites |> List.iter (fun site -> printfn "%s, length = %d" site (http site).Length)

List.iter simply calls the given function (here an anonymous function) for each element in the input list. Here is its type:

val iter : ('T -> unit) -> 'T list -> unit

Many additional aggregate iteration techniques are defined in the F# and .NET libraries, particularly by using values of type seq<type>, discussed in "Getting Started with Sequences" later in this chapter.

Abstracting Control with Functions

As a second example of how you can abstract control using functions, let's consider the common pattern of timing the execution of an operation (measured in wall-clock time). First, let's explore how to use System.DateTime.Now to get the wall-clock time:

> open System;;
> let start = DateTime.Now;;
val start : DateTime

> http "http://www.newscientist.com";;
val it : string = "<html>...</html>"

> let finish = DateTime.Now;;
val finish : DateTime

> let elapsed = finish - start;;
val elapsed : TimeSpan

> elapsed;;
val it : TimeSpan = 00:00:01.9799671

Note that the type TimeSpan has been inferred from the use of the overloaded operator in the expression finish - start. Chapter 6 discusses overloaded operators in depth. You can now wrap up this technique as a function time that acts as a new control operator:

open System
let time f =
    let start = DateTime.Now
    let res = f()
    let finish = DateTime.Now
    (res, finish - start)

This function runs the input function f but takes the time on either side of the call. It then returns both the result of the function and the elapsed time. The inferred type is as follows:

val time : (unit -> 'T) -> 'T * TimeSpan

Here 'T is a type variable that stands for any type, and thus the function can be used to time functions that return any kind of result. Note that F# automatically infers a generic type for the function, a technique called automatic generalization that lies at the heart of F# type inference. Chapter 5 discusses automatic generalization in detail. Here is an example of using the time function, which again reuses the http function defined in Chapter 2:

> time (fun () -> http "http://www.newscientist.com");;
val it : string * TimeSpan = ...   (The HTML text and time will be shown here)

Using .NET Methods as First-Class Functions

You can use existing .NET methods as first-class functions. For example:

> open System.IO;;
> [@"C:Program Files"; @"C:Windows"] |> List.map Directory.GetDirectories;;

val it : string [] list =
  [ [|"C:\Program Files\Adobe"; "C:\Program Files\Apoint";
      "C:\Program Files\CA"; "C:\Program Files\CE Remote Tools"; ...]; ... ]

Sometimes you need to add extra type information to indicate which overload of the method is required. Chapter 6 discusses method overloading in more detail. For example, the following causes an error:

> open System;;
> let f = Console.WriteLine;;

C:misc	est.ml(11,8): error: FS0041: The method WriteLine is overloaded.
Possible matches are shown below. Resolve the overloading by adding further
type annotations to the arguments.

Possible overload: 'Console.WriteLine(bool value) : unit'.

Possible overload: 'Console.WriteLine(char value) : unit'.
...

However, the following succeeds:

> let f = (Console.WriteLine : string -> unit);;
val f : string -> unit

Getting Started with Pattern Matching

One important tool in F# programming is pattern matching, a general construct that combines decomposition and control. In the previous sections, you got a taste of how you can use pattern matching with tuple, list, and option values. However, you can also use pattern matching in many other situations. You see many other examples of pattern matching in this book, but let's start with some simple pattern matching over strings and integers. As you've already seen, pattern matches on explicit values are introduced using the match ... with ... construct:

let urlFilter url agent =
    match (url,agent) with
    | "http://www.control.org", 99 -> true
    | "http://www.kaos.org"   , _  -> false
    | _, 86 -> true
    | _ -> false

The inferred type of the function is as follows:

val urlFilter : string -> int -> bool

The expression (url,agent) after the keyword match is a tuple of type (string*int). Each rule of the match is introduced with a | followed by a pattern, then ->, and then a result expression. When executed, the patterns of the rules are used one by one, and the first successful pattern match determines which result expression is used. In the previous example, the first pattern matches if url and agent are "http://www.control.org" and 99, respectively. The next matches if url is "http://www.kaos.org". The third pattern matches if agent is 86. The last three rules all use wildcard patterns represented by the underscore character; these match all inputs.

The overall conditions under which urlFilter returns true can be determined by reading through the pattern match: agent 99 can access "http://www.control.org", no one can access "http://www.kaos.org", and, excluding the latter, agent 86 can access anything.

Patterns are a rich and powerful technique for simultaneous data analysis and decomposition. Table 3-12 summarizes all the ways to form patterns in F#; many of these involve building up patterns from other patterns. The sections that follow look at some of these constructs, and Chapter 9 covers active patterns.

Table 3.12. Different Ways to Form Patterns

General Form

Kind

Example

(pat, ... ,pat)

Tuple pattern

(1,2,("3",x))

[pat; ... ;pat]

List pattern

[x;y;z]

[| pat; ...; pat |]

Array pattern

[| "cmd"; arg1; arg2 |]

{ id=pat; ...; id=pat }

Record pattern

{ X=1; Y=2 }

Tag(pat, ... ,pat)

Tagged union or active pattern

Point(x,y)

pat | pat

"Or" pattern

[x] | ["X";x]

pat & pat

"And" pattern

[p] & [Point(x,y)]

pat as id

Named pattern

[x] as inp

id

Variable pattern

x

_

Wildcard pattern

_

Any literal

Constant pattern

36, "36", 27L,System.DayOfWeek.Monday

:? type

Type test pattern

:? string

null

Null test pattern

null

Matching on Structured Values

Pattern matching can be used to decompose structured values. Here is an example where you match nested tuple values:

> let highLow a b =
    match (a,b) with
    | ("lo", lo), ("hi", hi) -> (lo,hi)
    | ("hi", hi), ("lo", lo) -> (lo,hi)
    | _ -> failwith "expected a both a high and low value";;

The match examines two pairs and looks at the strings in the first element of each, returning the associated values:

> highLow ("hi",300) ("lo",100);;
val it : int * int = (100,300)

The first rule matches if the first parts of the input pairs are the strings "lo" and "hi", respectively. It then returns a pair made from the respective second parts of the tuples. The second rule is the mirror of this in case the values appeared in reverse order.

The final cases of both of the previous examples use wildcard patterns to cover remaining cases. This makes the patterns exhaustive. Frequently, no wildcard is needed to ensure this, because for many input types F# is able to determine whether the given set of rules is sufficient to cover all possibilities for the given shape of data. However, if a match isn't exhaustive, a warning is given:

> let urlFilter3 url agent =
      match url,agent with
      | "http://www.control.org", 86 -> true
      | "http://www.kaos.org", _ -> false;;

  match url,agent with
  ^^^^^^^^^^^^^^^^^^^^
warning: Incomplete pattern matches on this expression. ...

In these cases, it may be necessary to add an extra exception-throwing clause to indicate to the F# compiler that the given inputs aren't expected:

let urlFilter4 url agent =
    match url,agent with
    | "http://www.control.org", 86 -> true
    | "http://www.kaos.org", _ -> false
    | _ -> failwith "unexpected input"

Nonexhaustive matches are automatically augmented by a default case where a MatchFailureException is thrown. Chapter 4 discusses exceptions.

F# is frequently able to determine whether pattern-matching rules are redundant, such as if a rule can never be selected because previous rules subsume all such cases. In this case, a warning is given. For example:

> let urlFilter2 url agent =
      match url,agent with
      | "http://www.control.org", _ -> true
      | "http://www.control.org", 86 -> true
      | _ -> false;;

     | "http://www.control.org", 86 -> true
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
warning: This rule will never be matched.

Tip

Use wildcard patterns with care. F# can often determine whether a match is exhaustive, and the use of wildcard patterns effectively disables this analysis for any particular pattern match. Sometimes it's better to write out the extra cases of a match as explicit patterns, because you can then adjust your code when new kinds of input data are introduced.

Guarding Rules and Combining Patterns

Individual rules of a match can be guarded by a condition that is executed if the pattern itself succeeds. Here is a simple use of this mechanism to record the three clauses of computing the sign of an integer:

let sign x =
    match x with
    | _ when x < 0 -> −1
    | _ when x > 0 ->  1
    | _ -> 0

You can combine two patterns to represent two possible paths for matching:

let getValue a =
    match a with
    | (("lo" | "low") ,v) -> v
    | ("hi",v) | ("high",v) -> v
    | _ -> failwith "expected a both a high and low value"

Here, the pattern ("lo" | "low") matches either string. The pattern ("hi",v) | ("high",v) plays essentially the same role by matching pairs values where the left of the pair is "hi" or "high" and by binding the value v on either side of the pattern.

Note

Individual patterns can't bind the same variables twice. For example, a pattern (x,x) isn't permitted, although (x,y) when x = y is permitted. Furthermore, each side of an "or" pattern must bind the same set of variables, and these variables must have the same types.

Getting Started with Sequences

Many programming tasks require the iteration, aggregation, and transformation of data streamed from various sources. One important and general way to code these tasks is in terms of values of the .NET type System.Collections.Generic.IEnumerable<type>, which is typically abbreviated to seq<type> in F# code. A seq<type> is a value that can be iterated, producing results of type type on demand. Sequences are used to wrap collections, computations, and data streams and are frequently used to represent the results of database queries. The following sections present some simple examples of working with seq<type> values.

Using Range Expressions

You can generate simple sequences using range expressions. For integer ranges, these take the form of seq {n .. m} for the integer expressions n and m:

> seq {0 .. 2};;
val it : seq<int> = seq [ 0; 1; 2; ]

You can also specify range expressions using other numeric types, such as double and single:

> seq {-100.0 .. 100.0};;
val it : seq<double> = seq [ −100.0; −99.0; −98.0; ... ]

By default, F# Interactive shows the value of a sequence only to a limited depth. seq<'T> values are lazy in the sense that they compute and return the successive elements on demand. This means you can create sequences representing very large ranges, and the elements of the sequence are computed only if they're required by a subsequent computation. In the next example, you don't actually create a concrete data structure containing one trillion elements but rather a sequence value that has the potential to yield this number of elements on demand. The default printing performed by F# Interactive forces this computation up to depth 4:

> seq {1I .. 1000000000000I};;
val it : seq<bigint> = seq [1I; 2I; 3I; 4I; ... ]

The default increment for range expressions is always 1. A different increment can be used via range expressions of the form seq { n .. skip .. m }:

> seq { 1 .. 2 .. 5 };;
val it : seq<int> = seq [ 1; 3; 5 ]

> seq { 1 .. −2 .. −5 };;
val it : seq<int> = seq [ 1; −1; −3; −5 ]

If the skip causes the final element to be overshot, then the final element isn't included in the result:

> seq { 0 .. 2 .. 5 };;
val it : seq<int> = seq [ 0; 2; 4 ]

The (..) and (.. ..) operators are overloaded operators in the same sense as (+) and (-), which means their behavior can be altered for user-defined types. Chapter 6 discusses this in more detail.

Iterating a Sequence

You can iterate sequences using the for ... in ... do construct, as well as the Seq.iter aggregate operator discussed in the next section. Here is a simple example of the first:

> let range = seq {0 .. 2 .. 6};;
val range : seq<int>

> for i in range do
      printfn "i = %d" i;;
i = 0
i = 2
i = 4
i = 6

This construct forces the iteration of the entire seq. Use it with care when you're working with sequences that may yield a large number of elements.

Transforming Sequences with Aggregate Operators

Any value of type seq<type> can be iterated and transformed using functions in the Microsoft.FSharp.Collections.Seq module. For example:

> let range = seq {0 .. 10};;
val range : seq<int>

> range |> Seq.map (fun i -> (i,i*i));;
val it : seq<int * int> = seq [ (0, 0); (1, 1); (2, 4); (3, 9) ... ]

Table 3-13 shows some important functions in this library module. The following operators necessarily evaluate all the elements of the input seq immediately:

  • Seq.iter: This iterates all elements, applying a function to each one.

  • Seq.toList: This iterates all elements, building a new list.

  • Seq.toArray: This iterates all elements, building a new array.

Most of the other operators in the Seq module return one or more seq<type> values and force the computation of elements in any input seq<type> values only on demand.

Table 3.13. Some Important Functions and Aggregate Operators from the Seq Module

Operator

Type

Seq.append

: seq<'T> -> seq<'T> -> seq<'T>

Seq.concat

: seq< seq<'T> > -> seq<'T>

Seq.choose

: ('T -> 'U option) -> seq<'T> -> seq<'U>

Seq.delay

: (unit -> seq<'T>) -> seq<'T>

Seq.empty

seq<'T>

Seq.iter

: ('T -> unit) -> seq<'T> -> unit

Seq.filter

: ('T -> bool) -> seq<'T> -> seq<'T>

Seq.map

: ('T -> 'U) -> seq<'T> -> seq<'U>

Seq.singleton

: 'T -> seq<'T>

Seq.truncate

: int -> seq<'T> -> seq<'T>

Seq.toList

: seq<'T> -> 'T list

Seq.ofList

: 'T list -> seq<'T>

Seq.toArray

: seq<'T> -> 'T[]

Seq.ofArray

: 'T[] -> seq<'T>

Which Types Can Be Used as Sequences?

Table 3-13 includes many uses of types such as seq<'T>. When a type appears as the type of an argument, the function accepts any value that's compatible with this type. Chapter 5 explains the notions of subtyping and compatibility in more detail; the concept should be familiar to OO programmers because it's the same as that used by other .NET languages such as C#, which itself is close to that used by Java. In practice, you can easily discover which types are compatible with which by using F# Interactive and tools such as Visual Studio: when you hover over a type name, the compatible types are shown. You can also refer to the online documentation for the F# libraries and the .NET Framework, which you can easily search using the major search engines.

Here are some of the types compatible with seq<'T>:

  • Array types: For example, int[] is compatible with seq<int>.

  • F# list types: For example, int list is compatible with seq<int>.

  • All other F# and .NET collection types: For example, System.Collections.Generic.SortedList<string> is compatible with seq<string>.

The following types aren't directly type compatible with seq<'T> but can be converted readily into sequences when needed:

  • Some .NET types are compatible with a somewhat deprecated nongeneric .NET 1.0 construct called System.Collections.IEnumerable (note the absence of any generic parameter) but aren't compatible with the newer .NET construct System.Collections.Generic.IEnumerable<type>, called seq<type> in F# code.

  • Some .NET types such as System.Text.RegularExpressions.MatchCollection support only a GetEnumerator method and can't be used directly as values of type seq<type>. However, you can convert them into sequences by using them in conjunction with the sequence expression syntax mentioned earlier, such as seq { for x in matchCollection -> x } or for x in matchCollection do ....

Expressions of the form for pat in seq are described in the section "Using Sequence Expressions" and in Chapter 4.

Using Lazy Sequences from External Sources

Sequences are frequently used to represent the process of streaming data from an external source, such as from a database query or from a computer's file system. For example, the following recursive function constructs a seq<string> that represents the process of recursively reading the names of all the files under a given path. The return types of Directory.GetFiles and Directory.GetDirectories are string[]; and as noted earlier, this type is always compatible with seq<string>:

open System.IO
let rec allFiles dir =
    Seq.append
        (dir |> Directory.GetFiles)
        (dir |> Directory.GetDirectories |> Seq.map allFiles |> Seq.concat)

This gives the following type:

val allFiles : string -> seq<string>

Here is an example of the function being used on one of our machines:

> allFiles @"c:WINDOWSsystem32";;
val it : seq<string> =
       = seq ["c:\WINDOWS\system32\$winnt$.inf";
              "c:\WINDOWS\system32\12520437.cpx";
              "c:\WINDOWS\system32\12520850.cpx";
              "c:\WINDOWS\system32\6to4svc.dll"; ...]

The allFiles function is interesting partly because it shows many aspects of F# working seamlessly together:

  • Functions as values: The function allFiles is recursive and is used as a first-class function value within its own definition.

  • Pipelining: The pipelining operator |> provides a natural way to present the transformations applied to each subdirectory name.

  • Type inference: Type inference computes all types in the obvious way, without any type annotations.

  • NET interoperability: The System.IO.Directory operations provide intuitive primitives, which can then be incorporated in powerful ways using succinct F# programs.

  • Laziness where needed: The function Seq.map applies the argument function lazily (on demand), which means subdirectories aren't read until required.

One subtlety with programming with on-demand or lazy values like sequences is that side effects such as reading and writing from an external store shouldn't in general happen until the lazy sequence value is consumed. For example, the previousallFiles function reads the top-level directory C: as soon as allFiles is applied to its argument. This may not be appropriate if the contents of C: are changing. You can delay the computation of the sequence by using the library function Seq.delay or by using a sequence expression, covered in the next section, where delays are inserted automatically by the F# compiler.

Using Sequence Expressions

Aggregate operators are a powerful way of working with seq<type> values. However, F# also provides a convenient and compact syntax called sequence expressions for specifying sequence values that can be built using operations such as choose, map, filter, and concat. You can also use sequence expressions to specify the shapes of lists and arrays. It's valuable to learn how to use sequence expressions:

  • They're a compact way of specifying interesting data and generative processes.

  • They're used to specify database queries when using data-access layers such as Microsoft's Language Integrated Queries (LINQ). See Chapter 15 for examples of using sequence expressions this way.

  • They're one particular use of computation expressions, a more general concept that has several uses in F# programming. Chapter 9 discusses computation expressions, and we show how to use them for asynchronous and parallel programming in Chapter 13.

Creating Sequence Expressions Using for

The simplest form of a sequence expression is seq { for value in expr .. expr -> expr }. Here, -> should be read "yield." This is a shorthand way of writing Seq.map over a range expression. For example, you can generate an enumeration of numbers and their squares as follows:

> let squares = seq { for i in 0 .. 10 -> (i,i*i) };;
val squares : seq<int * int>

> squares;;
val it : seq<int * int> = [ (0,0); (1,1); (2,4); (3,9); ... ]

The more complete form of this construct is seq { for pattern in seq -> expression }. The pattern allows you to decompose the values yielded by the input enumerable. For example, you can consume the elements of squares using the pattern (i,iSquared):

> seq { for (i,iSquared) in squares -> (i,iSquared,i*iSquared) };;
val it : seq<int * int * int> = [ (0,0,0); (1,1,1); (2,4,8); (3,9,27); ... ]

The input seq can be a seq<type> or any type supporting a GetEnumerator method. (The latter is supported because some important types from the .NET libraries support this method without directly supporting the seq interface.)

Enriching Sequence Expressions with Additional Logic

A sequence expression often begins with for ... in ..., but you can use additional constructs. For example:

  • A secondary iteration: for pattern in seq do seq-expr

  • A filter: if expression then seq-expr

  • A conditional: if expression then seq-expr else seq-expr

  • A let binding: let pattern = expression in seq-expr

Secondary iterations generate additional sequences, all of which are collected and concatenated together. Filters let you skip elements that don't satisfy a given predicate. To see both of these in action, the following computes a checkerboard set of coordinates for a rectangular grid:

let checkerboardCoordinates n =
   seq { for row in 1 .. n do
             for col in 1 .. n do
                 if (row+col) % 2 = 0 then
                     yield (row,col) }
> checkerboardCoordinates 3;;
val it : seq<(int * int)> = seq [(1, 1); (1, 3); (2, 2); (3, 1); ...]

Using let clauses in sequence expressions allows you to compute intermediary results. For example, the following code gets the creation time and last-access time for each file in a directory:

let fileInfo dir =
    seq { for file in Directory.GetFiles dir do
              let creationTime = File.GetCreationTime file
              let lastAccessTime = File.GetLastAccessTime file
              yield (file,creationTime,lastAccessTime) }

In the previous examples, each step of the iteration produces zero or one result. The final yield of a sequence expression can also be another sequence, signified through the use of the yield! keyword. The following sample shows how to redefine the allFiles function from the previous section using a sequence expression. Note that multiple generators can be included in one sequence expression; the results are implicitly collated together using Seq.append:

let rec allFiles dir =
    seq { for file in Directory.GetFiles dir do
              yield file
          for subdir in Directory.GetDirectories dir do
              yield! allFiles subdir }

Generating Lists and Arrays Using Sequence Expressions

You can also use range and sequence expressions to build list and array values. The syntax is identical, except the surrounding braces are replaced by the usual [ ] for lists and [| |] for arrays (Chapter 4 discusses arrays in more detail):

> [ 1 .. 4 ];;
val it: int list = [ 1; 2; 3; 4 ]

> [ for i in 0 .. 3 -> (i,i*i) ];;
val it : (int * int) list = [ (0,0); (1,1); (2,4); (3,9) ]

> [| for i in 0 .. 3 -> (i,i*i) |];;
val it : (int * int) [] = [ (0,0); (1,1); (2,4); (3,9) ]

Warning

F# lists and arrays are finite data structures built immediately rather than on demand, so you must take care that the length of the sequence is suitable. For example, [ 1I .. 1000000000I ] attempts to build a list that is one billion elements long.

Exploring Some Simple Type Definitions

F# is a typed language, and you often need to declare new shapes of types via type definitions and type abbreviations. This chapter covers only some of the simpler type definitions that are useful and succinct workhorses for functional programming. F# also lets you define a range of sophisticated type definitions related to object-oriented programming, discussed in Chapter 6. However, these aren't often required in basic functional programming.

Defining Type Abbreviations

Type abbreviations are the simplest type definitions:

type index = int
type flags = int64
type results = string * TimeSpan * int * int

It's common to use lowercase names for type abbreviations, but it's certainly not compulsory. Type abbreviations can be generic:

type StringMap<'T> = Microsoft.FSharp.Collections.Map<string,'T>
type Projections<'T,'U> = ('T -> 'U) * ('U -> 'T)

Type abbreviations aren't concrete, because they simply alias an existing type. They're expanded during the process of compiling F# code to the format shared between multiple .NET languages. The difference can, for example, be detected by other .NET languages; because of this, a number of restrictions apply to type abbreviations. For example, you can't augment them with additional members, as can be done for concrete types such as records, discriminated unions, and classes. In addition, you can't truly hide a type abbreviation using a signature (see Chapter 7).

Defining Records

The simplest concrete type definitions are records. Here's an example:

type Person =
    { Name: string;
      DateOfBirth: System.DateTime; }

You can construct record values by using the record labels:

> { Name = "Bill"; DateOfBirth = new System.DateTime(1962,09,02) };;
val it : Person = { Name="Bill"; DateOfBirth = 09/02/1962 }

Should there be a conflict between labels among multiple records, you can also construct record values by using an explicit type annotation:

> ({ Name = "Anna"; DateOfBirth = new System.DateTime(1968,07,23) } : Person);;
val it : Person = { Name="Anna"; DateOfBirth = 23/07/1968 }

Record values are often used to return results from functions:

type PageStats =
    { Site: string;
      Time: System.TimeSpan;
      Length: int;
      NumWords: int;
      NumHRefs: int }

This technique works well when returning a large number of heterogeneous results:

let stats site =
    let url = "http://" + site
    let html,t = time (fun () -> http url)
    let hwords = html |> getWords
    let hrefs = hWords |> List.filter (fun s -> s = "href")
    { Site=site; Time=t; Length=html.Length;
      NumWords=hwords.Length; NumHRefs=hrefs.Length }

Here is the type of stats:

val stats  : string -> PageStats

Here is how F# Interactive shows the results of applying the function:

> stats "www.live.com";;
val it : PageStats = { Site="www.live.com"; Time=0.872623628;
                       Length=7728, NumWords=1156; NumHRefs=10; }

Handling Non-Unique Record Field Names

Record labels need not be unique among multiple record types. Here is an example:

type Person =
    { Name: string;
      DateOfBirth: System.DateTime; }

type Company =
    { Name: string;
      Address: string; }

When record names are non-unique, constructions of record values may need to use object expressions in order to indicate the name of the record type, thus disambiguating the construction. For example, consider the following type definitions:

type Dot = { X: int; Y: int }
type Point = { X: float; Y: float }

On lookup, record labels are accessed using the dot (.) notation in the same way as properties. One slight difference is that in the absence of further qualifying information, the type of the object being accessed is inferred from the record label. This is based on the latest set of record labels in scope from record definitions and uses of open. For example, given the previous definitions, you have the following:

> let coords1 (p:Point) = (p.X,p.Y);;
val coords1 : Point -> float * float

> let coords2 (d:Dot) = (d.X,d.Y);;
val coords2 : Dot -> int * int

> let dist p = sqrt (p.X * p.X + p.Y * p.Y);; // use of X and Y implies type "Point"
val dist : Point -> float

The accesses to the labels X and Y in the first two definitions have been resolved using the type information provided by the type annotations. The accesses in the third definition have been resolved using the default interpretation of record field labels in the absence of any other qualifying information.

Cloning Records

Records support a convenient syntax to clone all the values in the record, creating a new value with some values replaced. Here is a simple example:

type Point3D = { X: float; Y: float; Z: float }
let p1 = { X=3.0; Y=4.0; Z=5.0 }
> let p2 = { p1 with Y=0.0; Z=0.0 };;
val p2 : Point3D

The definition of p2 is identical to this:

let p2 = { X=p1.X; Y=0.0; Z=0.0 }

This expression form doesn't mutate the values of a record, even if the fields of the original record are mutable.

Defining Discriminated Unions

The second kind of concrete type definition discussed in this section is a discriminated union. Here is a very simple example:

type Route = int
type Make = string
type Model = string
type Transport =
    | Car of Make * Model
    | Bicycle
    | Bus of Route

Each alternative of a discriminated union is called a discriminator. You can build values by using the discriminator much as if it were a function:

> let nick = Car("BMW","360");;
val nick : Transport

> let don = [ Bicycle; Bus 8 ];;
val don  : Transport list

> let james = [ Car ("Ford","Fiesta"); Bicycle ];;
val james  : Transport list

You can also use discriminators in pattern matching:

let averageSpeed (tr: Transport) =
    match tr with
    | Car _ -> 35
    | Bicycle -> 16
    | Bus _ -> 24

Several of the types you've already met are defined as discriminated unions. For example, the 'T option type is defined as follows:

type 'T option =
    | None
    | Some of 'T

Discriminated unions can include recursive references (the same is true of records and other type definitions). This is frequently used when representing structured languages via discriminated unions, a topic covered in depth in Chapter 9:

type Proposition =
    | True
    | And of Proposition * Proposition
    | Or of Proposition * Proposition
    | Not of Proposition

Recursive functions can be used to traverse such a type. For example:

let rec eval (p: Proposition) =
    match p with
    | True -> true
    | And(p1,p2) -> eval p1 && eval p2
    | Or (p1,p2) -> eval p1 || eval p2
    | Not(p1) -> not (eval p1)

The F# type of immutable lists is defined in this way:

type 'T list =
    | ([])
    | (::) of 'T * 'T list

A broad range of tree-like data structures are conveniently represented as discriminated unions. For example:

type Tree<'T> =
    | Tree of 'T * Tree<'T> * Tree<'T>
    | Tip of 'T

You can use recursive functions to compute properties of trees:

let rec size tree =
    match tree with
    | Tree(_,l,r) -> 1 + size l + size r
    | Tip _ -> 1

Here is the inferred type of size:

val size : Tree<'T> -> int

Here is an example of a constructed tree term and the use of the size function:

> let small = Tree("1",Tree("2",Tip("a"),Tip("b")),Tip("c"));;
val small : Tree<string>

> small;;
val it : Tree<string> = Tree ("1",Tree ("2",Tip("a"),Tip("b")),Tip("c"))

> size small;;
val it : int = 5

Chapters 8, 9, and 12 discuss symbolic manipulation based on trees.

Note

Discriminated unions are a powerful and important construct and are useful when modeling a finite, sealed set of choices. This makes them a perfect fit for many constructs that arise in applications and symbolic analysis libraries. They are, by design, nonextensible: subsequent modules can't add new cases to a discriminated union. This is deliberate: you get strong and useful guarantees by placing a limit on the range of possible values for a type. Extensible types can be defined through the use of records of functions and object interface types, discussed in Chapters 5 and 6.

Using Discriminated Unions as Records

Discriminated union types with only one data tag are an effective way to implement record-like types:

type Point3D = Vector3D of float * float * float
let origin = Vector3D(0.,0.,0.)
let unitX  = Vector3D(1.,0.,0.)
let unitY  = Vector3D(0.,1.,0.)
let unitZ  = Vector3D(0.,0.,1.)

These are particularly effective because they can be decomposed using succinct patterns in the same way as tuple arguments:

let length (Vector3D(dx,dy,dz)) = sqrt(dx*dx+dy*dy+dz*dz)

This technique is most useful for record-like values where there is some natural order on the constituent elements of the value (as shown earlier) or where the elements have different types.

Defining Multiple Types Simultaneously

Multiple types can be declared together to give a mutually recursive collection of types, including record types, discriminated unions, and abbreviations. The type definitions must be separated by the keyword and:

type Node =
    { Name : string;
      Links : Link list }
and Link =
    | Dangling
    | Link of Node

Summary

F# is a multiparadigm language, and this chapter looked at the core language constructs used for functional programming. The next chapter explores how to use F# for imperative programming: how to use mutable state, raise and handle exceptions, and perform I/O.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset