Chapter 9. LIBRARIES

A great deal of time and effort has been put into existing software. Many libraries have resulted from years of development and the ability to reuse this work is essential. There are two different forms of library of interest to the F# programmer:

  • .NET libraries

  • Native libraries

The .NET platform is famous for providing a wealth of library functions, particularly in the contexts of web and GUI programming. These libraries will be of interest to F# programmers primarily for the useful functionality that they provide, such as graphing tools. Many third party libraries are available for the .NET platform and we shall endeavour to cite those that we have found to be most useful and of the highest quality.

The term native library refers to platform-specific libraries that typically predate the .NET platform. The few remaining native libraries that have not been succeeded by .NET libraries are desirable primarily because they offer exceptionally high per-formance.

This chapter describes built-in functionality provided with the .NET platform as well as the use of both .NET and native libraries.

LOADING .NET LIBRARIES

On the .NET platform, extra functionality is typically provided in the form of a Dynamically Linked Library (DLL). These files have the suffix ".dll" and can be loaded into an F# interactive session or referenced from a compiled F# program using the #r directive. For example, the following loads the "XYGraph.dll" library:

> #r "XYGraph.dll";;

A set of directories is searched for this file and F# complains if it fails to find the DLL. The search can be expanded to include other directories using the # I directive. For example, the following adds the directory "C:Program Files" to the search path:

> #I @"C:Program Files";;

Compiled programs can also specify search paths and DLLs using - I and -r compile-line arguments.

CHARTING AND GRAPHING

Although the .NET platform does not bundle any charting or graphing tools there are a wide variety of libraries available, both open source and commercial products.

We have found the ComponentXtra libraries[19] to be simple to use and powerful enough to be useful in many situations. Moreover, their tools are free except for the saving and printing capabilities. The XYGraph class draws a 2D graph into a Windows form.

An object of the class XYGraph is a Windows Forms control and can be added into a form:

> let form = new Form(Text="Sine wave", Visible=true);;
val form : Form

> let graph =
    new componentXtra.XYGraph(Dock=DockStyle.Fill);;
val graph : componentXtra.XYGraph

> form.Controls.Add(graph);;

The labels on the graph are properties that can be set:

> graph.XtraTitle <- "Sine wave";;

> graph.XtraLabelX <- "x";;

> graph.XtraLabelY <- "sin x";;

A series can be added to the graph using the AddGraph method:

> let g =
The XYGraph tool from ComponentXtra.

Figure 9.1. The XYGraph tool from ComponentXtra.

let dash = Drawing2D.DashStyle.Solid
    graph.AddGraph("", dash, Color.Red, 1, false);;
val g : int

The int result is a handle to this series. Points can be added to the series by passing this handle and the new coordinates to the AddValue method:

> for x in O.Of .. O.Olf .. 6.29f do
    graph.AddValue(g, x, sin y);;

Finally, the graph must be redrawn with the new points:

> graph.DrawAll();;

The resulting 2D graph is illustrated in figure 9.1.

THREADS

Many applications can benefit from the ability to execute threads of program concur-rently. On the .NET platform, this functionality is provided by threading.

The design of concurrent programs and the use of threading on the .NET platform is a huge topic fraught with perils. Consequently, this chapter aims only to provide the reader with a basic understanding of threading on the .NET platform and to provide some useful higher-order functions that can be used to improve the performance of F# programs whilst avoiding as many pitfalls as possible.

Concurrency serves three different uses that have different properties and, conse-quently, require different support:

  1. CPU bound computation, e.g. dividing a long running computation between two CPUs to double performance.

  2. Non-CPU bound computation, e.g. downloading many protein sequences over the internet without wasting time waiting for one download to finish before beginning another.

  3. Interactivity, e.g. using a multithreaded GUI to ensure that the user interface of an application continues to function while the application is processing data.

CPU-bound computations are best suited to one thread for each CPU whereas non-CPU bound computations benefit from having a number of threads that is unrelated to the number of CPUs, chosen to be a trade-off between sufficient parallelism and the overhead of context switching between threads. Multithreaded GUIs require more sophisticated use of low-level threading constructs.

The following sections begin by detailing the basic use of .NET threads before presenting functions to handle both CPU-bound and non-CPU-bound computations safely and easily.

Thread safety

Determinism is a vitally important property of most programs. In the context of threading, a function that is careful to ensure determinism in the context of parallel execution is referred to as being thread safe. Designing functions to be thread safe is arbitrarily difficult and, consequently, containing this complexity is essential if large programs are to run correctly.

Consider a program that spawns two threads that execute in parallel, adjusting a counter. The first thread decrements the counter and the second thread increments the counter. All things being equal, we might expect the counter to be either −1, 0 or 1 at any given time. However, this is not the case for two reasons:

  • CPU time may not be divided equally between the two threads by the OS.

  • Increment and decrement are not atomic operations, so the value of the counter may be adjusted by one thread while the other thread is in the middle of an increment or decrement.

The unequal division of CPU time can be accounted for by queuing operations for execution rather than dividing them equally between threads. Each thread of computation executes tasks from the queue. If one thread is given more CPU time then it will consume more tasks from the queue, keeping all CPUs busy and making more efficient use of CPU time.

The latter problem is more subtle and can be solved using some low-level constructs. Specifically, threads can be synchronized to ensure that certain operations (increment and decrement in this case) are executed atomically by only one thread at a time. This is achieved by wrapping the non-atomic operation in a lock, a low-level construct that excludes other threads from executing the code at the same time. In this case, the increment and decrement would be wrapped in locks that depended upon a single value, thus precluding one thread from incrementing while the other was decrementing and vice-versa.

The remainder of this section introduces the fundamental threading constructs provided by the .NET platform before describing how useful higher-order functions can be composed from these constructs and reused in F# programs to simplify the use of concurrency in many scientific programs.

Basic use

Threading is provided via classes from the System. Threading namespace:

> open System.Threading;;

Concurrent threads of computation can be spawned by constructing a Thread object with a function f to execute and calling its Start method to begin execution. The following higher-order spawn function begins executing the given function f concurrently on a separate thread:

> let spawn (f : unit -> unit) =
    let thread = new Thread(f)
    thread.Start()
    thread;;
val spawn : (unit -> unit) -> Thread

After the spawn function is invoked, another computation is running concurrently. In most cases, the main thread will want to pause until the spawned thread completes, e.g. to read the result of the completed thread. This is achieved by calling the Join method of the Thread object that was returned by the spawn function.

The following execute function spawns several threads and waits for them all to complete before returning:

> let execute n f =
    [| for i in 1 .. n ->
         spawn f |]
    |> Array. iter (fun t -> t.Join());;
val execute : int -> (unit -> unit) -> unit

In the context of functional programming, the single most important improvement that can be made to the spawn and execute functions is to create variants that allow values fed through functions that execute concurrently. This requires the type of the function f to be generalized from unit -> unit to ' a -> 'b.

This improvement hints at what is perhaps the single most useful parallel program-ming construct in scientific computing: the parallel higher-order map function. This function applies a given function to each element of an array, executing applications concurrently.

A naive implementation of a parallel map can be written using the minimal threading functionality already discussed:

> let map f a =
    let b = Array.map (fun _ -> None) a
    let f i x = spawn (fun () -> b.[i] <- Some(f x))
    for thread in Array.mapi f a do
      thread.Join()
    Array.map Option.get b;;
val map : ('a -> 'b) -> 'a array -> 'b array

This implementation spawns one thread for each element of the input array a. Each thread fills the corresponding element of the array b with its result. As the type of the result is not known, the intermediate array b contains option types that are initially set to None. All of the threads are started and then all of the threads are joined, pausing the main thread until every spawned thread has completed. When all of the threads have completed the array b contains only Some values, which are extracted using the Option. get function to return the final result.

There is a trade-off between the performance improvement due to concurrent execution distributed across more than one CPU and the overheads involved in this parallel map function:

  • Boxing and unboxing of the option type.

  • Thread creation and handling.

  • Context switching if there are more threads than free CPUs.

  • Repeated iteration over the input array.

However, even this naive implementation can give a significant performance improve-ment when the input array contains one element for each CPU, the computations take roughly-equal time and the time taken is much longer than the overheads.

For example, computing the 40th Fibonacci number using the fib function (given at the beginning of section 6.1.4) takes 2.6s:

> time fib 4 0;;
Took 2 6 06ms
val it : int = 102334155

Performing the same computation twice using the sequential Array. map func tion takes just over twice as long, as expected:

> time (Array.map fib) [[40; 40 |] ,-;
Took 533 6ms
val it : int array = [|102334155; 102334155|]

Exploiting two CPUs using the parallel map function improves performance over the sequential map by 70% in this case:

> time (map fib) [| 4 0; 40 |];;
Took 3138ms
val it : int array = [|102334155; 102334155|]

However, applying this naive parallel map to a long array of quick computations increases the relative cost of this function's overheads. Consequently, the parallel map function can also be much slower than a sequential map, at least four orders of magnitude slower when mapping single increments over each element of a 104-element array:

> time (Array.map ((+) 1)) [|l .. 100 0 0|];;
Took 0ms
val it : unit = ()

> time (map ((+) 1)) [|l .. 10000 |];;
Took 9797ms
val it : unit = ()

Using a smaller number of threads and synchronizing their access to the input array greatly improves the worst-case performance of this parallel map. Implementing this improved parallel map requires the use of .NET thread synchronization constructs.

Locks

Two of the deficiencies of the naive parallel map can be addressed by distributing the element-wise computations over a small number of threads. Using fewer threads reduces the overheads of thread creation and handling and greatly improves worst-case performance, broadening the utility of the parallel map function.

A lock is as a way to mutually exclude threads from performing certain tasks concurrently, such as incrementing a counter. Exactly this functionality can be used to synchronise a small number of threads to process array elements concurrently, each thread consuming the next available element until no elements remain.

A section of code is most elegantly locked using the higher-order function lock:

> lock;;
val it : obj -> (unit -> 'a) -> 'a

This function waits until the given object is unlocked, then locks it, executes the given function and unlocks the object before returning. Thus, a thread safe increment of an int ref called n may be written:

lock n (fun () -> incr n)

This leads to a more efficient implementation of the parallel map that uses a small number of threads, each of which use a shared counter to keep track of the next unmapped element in the array. The task of incrementing the counter and returning the next unmapped element (if any) may be written:

> let next i n () =
    if !i = n then None else
      incr i
    Some(!i - 1);;
val next : int ref -> int -> unit -> int option

The parallel map itself spawns threads executing a loop function which repeatedly maps a single element and looks for the next unmapped element:

> let map max_threads f a =
    let n = Array.length a
    let b = Array.create n None
    let i = ref 0
    let rec apply i =
      b.[i] <- Some(f a.[i])
      loop()
    and loop () =
      Option.iter apply (lock i (next i n))
    execute max_threads loop
    Array.map Option.get b;;
val map : int -> ('a -> 'b) -> 'a array -> 'b array

For CPU-bound operations, this map should be invoked with one thread for each available CPU or core:

> let cpu_map f a =
    map System.Environment.ProcessorCount f a;;
val cpu_map : ('a -> 'b) -> 'a array -> 'b array

In the worse case described above, this implementation is three orders of magnitude faster than the previous implementation:

> time (cpu_map ((+) 1)) [|l .. 1000|]|> ignore;;
Took 5ms
val it : unit = ()

This cpu_map function is quite efficient and distributes computation as evenly as possible whilst remaining safe. Consequently, this is the parallel map of choice for scientific applications that value simplicity over performance. The overheads of this parallel map implementation are primarily the creation of local threads and the synchronization between those threads.

The thread pool

The previous implementation of parallel map distributed computations over its own set of threads. A set of threads that serves this purpose is known as a thread pool and the .NET platform actually provides a global thread pool that typically contains 25 worker threads and is accessed via the ThreadPool namespace.

The advantage of using a global thread pool is that the worker threads have already been created, removing this overhead from the parallel map function. The disadvantage is lack of safety: functions that use the threadpool recursively can deadlock threads in the threadpool and the global threadpool has substantial overheads for queueing so it is typically much slower to use.

The QueueUserWorkltem method can be used to queue computations on the global thread pool directly. Asynchronous delegates provide an alternative, and often easier, way to execute computations concurrently using the threadpool.

Asynchronous delegates

A delegate is the .NET representation of a type-safe function pointer. In F#, a closure can be converted into a delegate using the System. Converter class.

Delegates provide asynchronous invocation via BeginInvoke and EndInvoke methods. The former queues the delegate in the thread pool, applying any arguments, and the latter waits for it to complete and recovers its return value.

A parallel map that uses the global thread pool by invoking asynchronous delegates may be written:

> let global_map f a =
    let d = new System.Converter<'a, 'b>(f)
    Array.map (fun x -> d.Beginlnvoke(x, null, null)) a
    |> Array.map (fun a -> d.Endlnvoke(a));;
val global_map :
  int -> ('a -> 'b) -> 'a array -> 'b array

This implementation of a parallel map is faster than the previous implementation when the total time taken to perform the whole map is small (< 1ms) because the time taken to execute the previous map implementation is dominated by the thread creation. However, this implementation is significantly slower in almost all other cases because it queueing jobs for the threadpool is slower and there are many more threads in the global threadpool than CPUs so the computations are constantly context switched between.

Perhaps more importantly, this global_map function handles exceptions by reraising them on the main thread whereas the previous cpu_map function will give undefined behaviour because it makes no attempt to handle exceptions.

Background threads

On the .NET platform, applications have background threads and foreground threads. The difference between background and foreground threads relates to the termination of the application. An application is terminated when all of its foreground threads complete, i.e. any outstanding background threads are terminated automatically.

The IsBackground property can be used to mark a thread as a background thread:

> thread.IsBackground <- true;;

This allows branch computations to be quit automatically and is of particular importance in the context of visualization, where computations should be performed in background threads to ensure that the application ends when the user interface is closed.

RANDOM NUMBERS

The System. Random class provided by .NET can be used to generate uniformly-distributed int and float random numbers. A friendlier interface can be obtained by creating a global random number generator and providing some useful functions to use it:

> module Random =
    let rand = new System.Random()
    let int n = rand.Next(n)
    let float x = x * rand.NextDouble();;

The int and float functions in this Random module generate uniformly-distributed random int and float values, respectively. The results lie in the range [0, x) where x is the function argument.

For example, random floats uniformly distributed between 0 and 3 may be gener-ated using:

> Random.float 3.0;;
val it : float = 1.639560843

This Random module is an OCaml-compatible replacement of the System. Random class and will be used in the remainder of this book.

REGULAR EXPRESSIONS

Particularly with the advent of bioinformatics, a growing number of scientific appli-cations are manipulating strings as well as numbers. The F# string type provides simple access to Unicode strings and can be used to perform a variety of simple ac-tions. However, many applications benefit from sophisticated and specialized forms of pattern matching that are designed to act upon strings.

We have already discussed the concept of regular expressions in the context of the fslex lexer generator, in section 5.5.1.

Definitions relating to .NET regular expressions are held in the namespace:

> open System.Text.RegularExpressions;;

We shall now examine some of the functionality provided by a simple regular expression before presenting a complete program to compute the frequencies of different words in a text document.

The Regex class can be used to create an object representing a given regular expression. For example, the following regular expression matches any sequence of one or more whitespace characters:

> let whitespace = new Regex(@"s+");;
val whitespace : Regex

Note that we have used the notation @"..." to automate the escaping of back-slashes in a string. The alternative would be to omit the @ and escape the strings manually, which is more tedious and error-prone: "\s + ". This will be more beneficial for longer, more complicated regular expressions.

This .NET object of type Regex has several member functions that can be used to manipulate strings by finding substrings that match this regular expression.

For example, the Replace member function finds all substrings that match the regular expression and replaces them with the given string. The Replace member function of the whitespace object can therefore be used to collapse whitespace in a string by replacing all sequences of one or more whitespace characters with a single space:

> let collapse_whitespace string =
    whitespace.Replace(string, " ");;
val collapse_whitespace : string -> string

Applying this function to a string containing superfluous whitespace produces a new string with sequences of whitespace replaced by single spaces:

> collapse_whitespace "Too many
 spaces.";;
val it : string = "Too many spaces."

The same is_whitespace object can be used to split a string into whitespace-separated substrings using the Split member:

> whitespace.Split("Too many spaces.");;
val it : string = [|"Too"; "many"; "spaces."|]

This simple example has illustrated the basic functionality of .NET regular ex-pressions. This functionality can be used to dissect simple file formats to interpret textual data.

VECTORS AND MATRICES

The F# standard library provides efficient functions for handling arbitrary-dimensionality vectors and matrices, including specialized versions for vectors and matrices with float elements. Vectors are considered to be column vectors by default. Row vectors are also supported and are distinguished from column vectors by the type system, to improve correctness.

Vectors and matrices of floats can be constructed from sequences and sequences of sequences using the vector and matrix functions, respectively:

> let r = vector [2.0; 3.0];;
val e : Math.vector

> let m = matrix [[0.0; 1.0]; [-1.0; 0.0]];;
val m : Math.matrix

Arithmetic operators are provided for vectors and matrices. Vector addition:

> r + r;;
val it : vector = vector [4.0; 6.0]

Scaling:

> 3.0 $* r;;
val it : vector = vector [6.0; 9.0]

Matrix multiplication:

> m * m;;
val it : Matrix<float> =
  matrix [[-1.0; 0.0]; [0.0; −1.0]]

Transformation of a column vector by premultiplying a matrix:

> m * r;;
val it : vector = vector [3.0; −2.0]

Transformation of a row vector by postmultiplying a matrix:

> Math.Vector.Transpose r * m;;
val it : vector = vector [-3.0; 2.0]

Many scientific problems can be phrased in terms of vector-matrix algebra. Thus, the ability to handle vectors and matrices can be instrumental in writing scientific programs. In particular, the ability to perform some complicated computations on them (e.g. finding the eigenvalues of a matrix) can be pivotal in scientific programs. Such computations are often prone to numerical error and, therefore, can be tedious to program robustly. Interfacing to existing libraries that implement this functionality is covered later in this chapter.

DOWNLOADING FROM THE WEB

The Worldwide Web (WWW) contains a wealth of information for scientists, much of it in a form that can be downloaded and analyzed by machine. Indeed, even the structure of the Web has been the subject of scientific research. Naturally, the .NET platform provides all of the tools required to tap this resource quickly and easily.

Definitions relating to networking and the internet are held in the System. Net class and definitions relating to IO are held in the System. IO namespace. The .NET libraries provide access to the Web such that programs can download files over an internet connection. A file to download a given URL by passing a stream to a function k may be written in terms of these classes:

> let download (url : string) k =
    let request = System.Net.WebRequest.Create(url)
    let response = request.GetResponse()
    use stream = response.GetResponseStream()
    k stream;;
val download : string -> (IO.Stream -> 'a) -> 'a

The function k is given the open stream and is expected to read everything it needs from the stream before returning, as the stream will be closed when the continuation returns.

The simplest continuation simply reads the stream into a string:

> let string_of_stream (stream : #System.IO.Stream) =
    (new System.IO.StreamReader(stream)).ReadToEnd();;
val string_of_stream : #IO.Stream -> string

For example, we can download the Google home page with:

> download "http://www.google.com" string_of_stream;;
val it : string
  = "<htmlhead><meta http-equiv= "content-type ...

This makes F# and, in particular, the F# interactive mode a powerful tool for scientists analyzing data available on the web.

COMPRESSION

The .NET standard library includes functions for compressing and uncompressing streams. Many on-line resources store information in compressed form and these functions can be used to uncompress the data ready for analysis.

The System. IO.Compression namespace contains definitions relating to data compression. The following gun zip function can be used to decompress a stream representing compressed data stored in the GZip format (commonly found on Unix systems):

> open System;;
> let gunzip (stream : #IO.Stream) =
    let mode = IO.Compression.CompressionMode.Decompress
    new IO.Compression.GZipStream(stream, mode);;
val gunzip : #IO.Stream -> IO.Compression.GZipStream

This function is used to decompress downloaded data in future example programs.

HANDLING XML

The .NET standard library includes many data structures and algorithms for manip-ulating data stored in the XML format. On-line scientific databases are increasingly using the XML format.

Definitions relating to XML are found in the System. Xml class.

Reading

Data in the XML format can be read from a stream, such as the streams generated by the download and gunzip functions defined above, into an object of the class XmlDocument by constructing such an object and invoking its Load method on the stream.

For example, a function to load an XML document from a stream (such as a GZip stream) may be written:

> open System;;

> let xml_of_stream stream =
    let doc = new Xml.XmlDocument()
    doc.Load(stream :> IO.Stream)
    doc;;
val xml_of_stream : #IO.Stream -> Xml.XmlDocument

We shall use this function in future examples dealing with XML.

Writing

Once created, an XML document doc can be written to a stream in XML format simply by invoking the Save method of the doc object. For example, the following prints doc to the console in XML format:

doc.Save(stdout);;

The ability to load and save XML documents using the .NET libraries is particu-larly useful given the increasing amount of scientific data found on the web in XML format. However, the F# programming language has many features that make it ide-ally suited to tree manipulation, including the manipulation of XML data. In order to leverage these language features, it is useful to translate the .NET representation of XML into a native F# variant type.

Declarative representation

The object-oriented approach used by the System.Xml library is not ideal for visualizing in the F# interactive mode or dissecting using pattern matching. Consequently, it is often beneficial to rewrite XML into a declarative form. A variant type can be used to represent XML data in a more F#-friendly way:

> type xml =
    | Element of string *
                 (string * string) list *
                 xml list
    | Text of string;;

For example, the XML snippet:

<doc name="My document">
  <title>This is a document</title>
</doc>

will be represented by the F# value:

Element("doc",
["name", "My document"],
      [Element("title",
               [] ,
               [Text "This is a document"])])

Values of the XmlNode and XmlElement types can be converted to the F# variant type xml using the following pair of mutually-recursive functions:

> let rec node (n : Xml.XmlNode) = match n.NodeType with
    | Xml.XmlNodeType.Element ->
        element (n :?> Xml.XmlElement)
    | Xml.XmlNodeType.Text -> Text n.InnerText
    | _ -> invalid_arg "node"
  and element elt =
    Element(elt.LocalName,
            [for attrib in elt.Attributes ->
               attrib.Name, attrib.Value],
            [for child in elt.ChildNodes ->
               node child]);;
val node : Xml.XmlNode -> xml
val element : Xml.XmlElement -> xml

Note the use of the downcast operator : ? > to convert a node from the parent XmlNode class to the derived XmlElement class.

An XML document doc of the type XmlDocument can be converted into a value of the type xml by applying this element function to its DocumentElement property:

element doc.DocumentElement

The declarative representation of an XML document as a value of the F# type xml, rather than an inheritance hierarchy of objects, allows functions to dissect XML data using pattern matching. This concept is taken a step further in chapter 10 by using active patterns to dissect an object-oriented representation directly without having to copy it into an F# variant type.

CALLING NATIVE LIBRARIES

The main advantage of native-code libraries is that they can provide much better performance than .NET libraries. However, this performance comes at a grave cost in terms of safety. Native-code libraries can cause all-manner of problems when used incorrectly, including random crashing. Consequently, native code libraries should be used only when their functionality is not available via a safer route (i.e. a .NET library) or when performance is critical. In the interests of correctness, the amount of unsafe code should be minimized and thoroughly checked before use.

A binding is a piece of code that describes how an external function or program can be invoked from within the host language (F#). In this case, bindings are F# libraries that include references to external native code libraries. The simplest form of binding details the name library of the DLL, the function func 1 inside the DLL and the C signature of the function including its name func2 in F#:

[<D11Import(@"library.dll", EntryPoint="func1 ") >] extern double *func2(args);

Note that the System.Runtime. InteropServices namespace contains the definition of Dl1Import.

For example, the following F# snippet links to the library mylib.dll, binding the function fib to an F# function called ext_fib which has a signature int - > int:

[<DllImport(©"mylib.dll", EntryPoint="fib")>]
extern int ext_fib(int n);

For detailed information about writing bindings to native-code libraries, refer to the F# manual.

FOURIER TRANSFORM

The ability to compute a numerical approximation to the Fourier transform of a signal is of fundamental importance in scientific computing. A great deal of computer science research has been directed at this area and the latest Fast Fourier Transform (FFT) algorithms are capable of transforming n uniform samplings of a signal into Fourier space in 0(n logn) time and with (

FOURIER TRANSFORM

The FFT algorithm is based upon non-trivial results from number theory. Specifi-cally, the logarithm in the complexity of the FFT stems from the recursive subdivision of the n -element input vector into n /p- element subvectors where p is a small prime fac-tor of n. When n does not have any small prime factors (e.g. when n itself is prime), the algorithm expands the input into a larger m -element vector where m > 2n such that m has many small prime factors and can then be subdivided efficiently. Although the performance follows an n log n trend, the results for individual n are strongly dependent upon the factorization of n and, consequently, appear "noisy". The vari-ation in performance between consecutive n can be as much as a factor of four and, consequently, larger values of n can actually be significantly faster.

Native-code bindings

The FFTW library, developed at MIT, is the best freely-available FFT implementation. This section details minimal bindings to the FFTW library.

The Fourier transform and inverse Fourier transform are referred to as forward and backward transforms:

> type direction = Forward | Backward;;

Searching possible factorizations can take as long as the FFT itself but factorization plans can be reused when computing many FFTs of the same length. Two of search algorithms implemented by FFTW allow a factorization plan to be estimated from general performance information or calculated more accurately by making real performance measurements:

> type precalc = Estimate | Measure;;

In the interests of performance, FFTW aligns data to CPU-friendly boundaries and provides functions to allocate and free aligned memory:

> open System.Runtime. InteropServices;;

> [<DllImport(@"libfftw3-3.dll",
              EntryPoint="fftw_malloc")>]
  extern double *fftw_malloc(int size);;
val fftw_malloc : int -> double nativeptr

> [<DllImport(@"libfftw3-3.dll" ,
              EntryPoint="fftw_free")>]
  extern void fftw_free(double *data);;
val fftw_free : double nativeptr -> unit

Factorization plans are created by the fftw_p1an_dft_ld function:

> [<DllImport(@"libfftw3-3.dll",
              EntryPoint="fftw_plan_dft_ld")>]
  extern void *plan_dft_ld_(int n, double *i, double *o,
                            int sign, int flags);;
val plan_dft_ld_ :
  int -> double nativeptr -> double nativeptr ->
    int -> int -> nativeint

Note that this function has been bound to the F# function p lan_dft_ld_ where the final underscore is taken to indicate an unsafe binding. The i and o arguments are the input and output vectors and the return value (declared to be of the C type void *) is an FFTW "plan".

The sign and flag arguments to the plan_dft_ld_ function can be derived from direction and precalc values:

> let plan_dft_ld n i o d e =
    let d =
      match d with
      | Forward -> −1
      | Backward -> 1
    let flags =
      match e with
      | Estimate -> 64
      | Measure -> 0
    plan_dft_ld_(n, i, o, d, flags + 1);;
val plan_dft_ld :
  int -> double nativeptr -> double nativeptr ->
direction -> precalc -> nativeint

Note that the flags variable always has bit 1 set, allowing FFTW to alter the input array.

The execute function is the only thread-safe function provided by FFTW and it computes the FFT for a given plan.

> [<DllImport(@"libfftw3-3.dll",
              EntryPoint="fftw_execute")>]
  extern void execute(void *plan);;
val execute : nativeint -> unit

Interface in F#

The following function creates an FFTW-compatible array to store n complex num-bers, returning both the native pointer to the array and the NativeArray itself:

> let make n =
    let ptr = fftw_malloc(16 * n)
    let na = NativeArray.FromPtr(ptr, n)
    ptr, na;;
val make : int -> double nativeptr * NativeArray<double>

Note that the argument to fftw_malloc is the size of the array in bytes. In this case there are 8 bytes per float and two floats per complex number.

The following function sets the ith element of the native array a to the complex number z:

> let set (a : NativeArray<double>) i (z : Complex) =
    a. [2*i] <- z.r
    a. [2*i + 1] <- z. i;;
val set : NativeArray<double> -> int -> Complex -> unit

The following function gets the ith element of the native array a as a complex number:

> let get (a : NativeArray<double>) i =
    Math.Complex.Create(a. [2*i] , a.[2*i + 1]);;
val get : NativeArray<double> -> int -> Complex

Rather than deleting factorization plans, we shall memoize them indefinitely in a hash table for later reuse:

> let plan =
    memoize (fun (n, d, e) ->
      let ptr, array = make n
      a, plan_dft_ld n ptr ptr d e);;
val plan :
  (int -> direction -> precalc ->
NativeArray<double> * nativeint)

A Fourier transform can be computed in the given direction using a controllable amount of precomputation for the factorization by creating a plan, copying the vector into the plan, applying the execute function and copying the data back out:

> let fft direction effort a =
    let n = Array.length a
    if n=0 then [||] else
    let array, plan = plan n direction effort
    Array.iteri (set array) a
    execute plan
    let s = complex (1.0 / sqrt(float n)) 0.0
    Array.init n (fun i -> s * get array i);;
val fft :
  direction -> precalc -> Complex array -> Complex array

Finally, we can define some easy-to-use functions for computing the Fourier transforms of complex arrays:

> let fourier = fft Forward Estimate;;
val fourier : Complex array -> Complex array

> let ifourier = fft Backward Estimate;;
val fourier : Complex array -> Complex array

The library functions can then be used.

Pretty printing complex numbers

In the interests of clarity, let us define and register a pretty printer for the Complex type:

> let chop x =
    if abs x < sqrt epsilon_float then 0.0 else x;;
val chop : float -> float

> let string_of_complex (z : Complex) =
    match chop z.Real, chop z.Imag with
    | 0.0, 0.0 -> "0"
    | r, 0.0 -> sprintf "%g" r
    | 0.0, i -> sprintf "%gi" i
    | r, i when i < 0.0 -> sprintf "%g - %gi" r (-i)
    | r, i -> sprintf "%g + %gi" r i;;
val string_of_complex : Math.Complex -> string

> fsi.AddPrinter(string_of_complex);;
val it : unit = ()

Complex numbers are now chopped and pretty printed in a much more concise way.

Fourier series of a discretely sampled sine wave, showing: a) the samples ur rϵ [0,16) and Fourier series , and b) the corresponding Fourier coefficients vs computed numerically using FFTW.

Figure 9.2. Fourier series of a discretely sampled sine wave, showing: a) the samples ur rϵ [0,16) and Fourier series

Fourier series of a discretely sampled sine wave, showing: a) the samples ur rϵ [0,16) and Fourier series , and b) the corresponding Fourier coefficients vs computed numerically using FFTW.
, and b) the corresponding Fourier coefficients vs computed numerically using FFTW.

Example use

As an example, the following creates a 16-element array a containing four repeats of (0,1,0,-1):

> let a =
    [| for i in 0 .. 15 ->
         complex (float i / 2.0 * Math.PI |> sin) 0.0 |]
val a : Complex array

> a;;
val it : Complex array =
  [|0; 1; 0; −1; 0; 1; 0; −1; 0; 1; 0; −1; 0; 1; 0; -l|]

This discrete sampling is illustrated in figure 9.2a. We shall now calculate the functional form of the Fourier series via the numerically computed series found using FFTW.

Taking the samples to be unit-separated samples, the Nyquist frequency is

Example use

The FFT of a is:

> fourier a;;
val it : Complex [] =
  [|0; 0; 0; 0; −2i; 0; 0; 0; 0; 0; 0; 0; 2i; 0; 0; 0|]

Each element vs of this array may be taken to represent the frequency v (s) = sv Ny/n and amplitude

Example use

In this case, the only non-zero elements are v4= —8i and v 12 = 8i. This shows that the signal can be represented as the sum of two plane waves which, in fact, partly cancel to give a sine wave:

Example use

as expected.

Also, the inverse FFT of the FFT of a recovers the original a to within a numerical error of 10−16:

> ifourier (fourier a);;
val it : Complex [] =
  [|0; 1; 0; −1; 0; 1; 0; −1; 0; 1; 0; −1; 0; 1; 0; -l|]

This FFT library allow the Fourier series of large data sets with up to three dimensions to be computed efficiently.

METAPROGRAMMING

The generation of explicit subprograms is a form of metapwgramming. The F# programming language makes metaprogramming easy for two reasons:

  1. The F# language is derived from languages that were designed to manipulate programs.

  2. The .NET platform provides light-weight code generation, allowing its inter-mediate language (IL) to be generated and invoked at run-time.

One of the most important aspects of metaprogramming is partial specialization. This refers to the generation of specialized subprograms that can perform a task more efficiently than a generic program.

Metaprogramming is currently available via the direct generation of IL code and via a high-level interface provided by the LINQ library in .NET 3.

Emitting IL code

Definitions relating to IL code generation are contained in the following namespaces:

> open System;;
> open System.Reflection;;
> open System.Reflection.Emit;;

A dynamically-generated function called f that accepts one int parameter and returns an int can be created with:

> let dm =
    new DynamicMethod("f",
                      (type int),
                      [|(type int)|],
                      (type obj));;
val dm : DynamicMethod

An object used to generate the IL code for this new function can be obtained with:

> let il = dm.GetlLGenerator();;
val il : ILGenerator

IL code is a stack-based bytecode, meaning that the expression (x + 2) × 3 is represented by commands to push argument zero (x) and the constant 2, followed by an add to consume the two stack entries and push the value of x +2:

> il.Emit(OpCodes.Ldarg, 0);;
val it : unit = ()

> il.Emit(Opcodes.Ldc_I4, 2);;
val it : unit = ()

> il.Emit(Opcodes.Add);;
val it : unit = ()

Then push 3 and consume two stack entries with a multiply:

> il.Emit(OpCodes.Ldc_I4, 3);;
val it : unit = ()

> il.Emit(Opcodes.Mul);;
val it : unit = ()

Finally, return with the value on the top of the stack:

> il.Emit(OpCodes.Ret);;
val it : unit = ()

This completes the definition of a new function. The function can be invoked using the Invoke method and supplying an obj array of the arguments. For example, 3(x + 2) = 21 for x = 5:

> (dm.Invoke(null, [|box 5|]) :?> int);;
val it : int = 21

Direct generation of IL code is harder and more error prone than using a high level library. However, the high-level libraries available at the time of writing lack some important features that may impede the use of metaprogramming in scientific computing.

Compiling with LINQ

The LINQ project is a new part of .NET that provides metaprogramming as a way to improve cross-language interoperability and, in particular, database querying. Amongst other things, this library provides a simple abstract syntax tree representa-tion of expressions and a compiler that can convert these high-level abstract syntax trees into IL code.

At the time of writing, the LINQ project installs into its own directory:

> #I @"C:Program FilesLINQ PreviewBin";;

The following DLL contains the required functionality:

> #r "System.Query.dll";;

The following namespaces contain relevant definitions:

> open System;;
> open System.Expressions;;

The following .NET delegate type can be used to represent a function type for single-argument functions:

> type ('a, 'b) fn = delegate of 'a -> 'b;;

In the interests of clarity, we use the following abbreviation:

> type E = Expression;;

The following int function creates a constant integer expression:

> let int (n : int) = E.Constant(n);;
val int : int -> ConstantExpression

The two following binary operators + : and * : are used to compose expressions representing arithmetic operations:

> let ( +: ) f g = E.Add(f, g);;
val ( +: ) :
  #Expression -> #Expression -> BinaryExpression
> let ( *: ) f g = E.Multiply(f, g);;
val ( *: ) :
  #Expression -> #Expression -> BinaryExpression

A compiled delegate representing the &lumbda:-expression x→ (x + 2) ×3 may be generated with:

> let f : fn<int, int> =
    let x = E.Parameter((type Int32), "x")
    (E.Lambda((x +: int 2) *: int 3, x)).Compile();;
val f : (int, int) fn

Note the type annotation for f is required for this example to work.

The generated delegate function f can be invoked with its argument using the Invoke method with

> f.Invoke(5);;
val it : int = 21

Metaprogramming can be used in scientific computing to improve the performance of many algorithms such as low-dimensional vector and matrix manipulation and wavelet/Fourier transforms.



[19] http://www.componentxtra.com

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset