Chapter 17. Interoperating with C and COM

Software integration and reuse are among the most relevant activities in software development today. This chapter discusses how F# programs can interoperate with the outside world, accessing code available in the form of DLLs and COM components.

Common Language Runtime

Libraries and binary components provide a common way to reuse software; even the simplest C program is linked to the standard C runtime to benefit from core functions such as memory management and I/O. Modern programs depend on a large number of libraries that are shipped in binary form, and only some of them are written in the same language as the program. Libraries can be linked statically into the executable during compilation or loaded dynamically during program execution. Dynamic linking has become increasingly common, both to share code (a dynamic library can be linked by different programs and shared among them) and to adapt program behavior during execution.

Interoperability among binaries compiled by different compilers, even of the same language, can be a nightmare. One of the goals of the .NET initiative was to ease this issue by introducing the Common Language Runtime (CLR), which is targeted by different compilers and different languages to help interoperability among software developed in those languages.

The CLR is a runtime designed to run programs compiled for the .NET platform. The binary format of these programs differs from the traditional one adopted by executables; Microsoft terminology uses managed for the first class of programs and unmanaged otherwise (see Figure 17-1).

Figure 17.1. Compilation scheme for managed and unmanaged code

A Deeper Look Inside .NET Executables

Programs for the .NET platform are distributed in a form that is executed by the CLR. Binaries are expressed in an intermediate language that is compiled incrementally by the Just-In-Time (JIT) compiler during program execution. A .NET assembly, in the form of a .dll or an .exe file, contains the definitions of a set of types and of their method bodies; the additional data describing the structure of the code in intermediate language form is known as metadata. The intermediate language defines method bodies in terms of a stack-based machine, where operations are performed by loading values onto a stack of operands and then invoking methods or operators.

Consider the following simple F# program in the Program.fs source file:

open System
let i = 2
Console.WriteLine("Input a number:")
let v = Int32.Parse(Console.ReadLine())
Console.WriteLine(i * v)

The F# compiler generates an executable that can be disassembled using the ildasm.exe tool distributed with the .NET Framework. The following screenshot shows the structure of the generated assembly. Because everything in the CLR is defined in terms of types, the F# compiler must introduce the class $Program$Main in the <StartupCode$applicationname> namespace. In this class, the definition of the main@ static method is the entry point for the execution of the program. This method contains the intermediate language corresponding to the example F# program. The F# compiler generates several elements that aren't defined in the program, whose goal is to preserve the semantics of the F# program in the intermediate language.

If you open the main@ method, you find the following code, which is annotated here with the corresponding F# statements:

.method public static void  main@() cil managed
{
  .entrypoint
  // Code size       38 (0x26)
  .maxstack  4

   // Console.WriteLine("Input a number:")
  IL_0000:  ldstr      "Input a number:"
  IL_0005:  call       void [mscorlib]System.Console::WriteLine(string)

  // let v = Int32.Parse(Console.ReadLine())
  IL_000a:  call       string [mscorlib]System.Console::ReadLine()
  IL_000f:  call       int32 [mscorlib]System.Int32::Parse(string)
  IL_0014:  stsfld     int32 '<StartupCode$ConsoleApplication1>'.$Program::v@4

  // Console.WriteLine(i * v) // Note that i is constant and its value has been inlined
  IL_0019:  ldc.i4.2
  IL_001a:  call       int32 Program::get_v()
  IL_001f:  mul
  IL_0020:  call       void [mscorlib]System.Console::WriteLine(int32)

// Exits
  IL_0025:  ret
} // end of method $Program$Main::main@

The ldxxx instructions are used to load values onto the operand stack of the abstract machine, and the stxxx instructions store values from that stack in locations (locals, arguments, or class fields). In this example, a static field is used for v, and the value of i is inlined using the ldc instruction. For method invocations, arguments are loaded on the stack, and a call operation is used to invoke the method.
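To make the stack discipline concrete, the following is a minimal sketch of a toy evaluator written in F#. It is not the CLR and covers only a hypothetical subset of instructions (the Instr type and the run function are names invented for this illustration), but it follows the same load, store, and operate pattern described above.

```fsharp
// Hypothetical miniature of a stack-based evaluator; this is not the real CLR.
type Instr =
    | Ldc of int          // push a constant (in the spirit of ldc.i4)
    | Ldsfld of string    // push the value of a named static field
    | Stsfld of string    // pop a value and store it in a named static field
    | Mul                 // pop two values and push their product

// Runs a program against an initial set of static fields and
// returns the final operand stack together with the updated fields.
let run (fields: Map<string, int>) (program: Instr list) =
    let step (stack, fields) instr =
        match instr, stack with
        | Ldc n, _ -> n :: stack, fields
        | Ldsfld name, _ -> Map.find name fields :: stack, fields
        | Stsfld name, v :: rest -> rest, Map.add name v fields
        | Mul, b :: a :: rest -> (a * b) :: rest, fields
        | _ -> failwithf "stack underflow at %A" instr
    List.fold step ([], fields) program

// Models "i * v" from the listing: i is inlined as the constant 2,
// and v is assumed to hold 21 for the sake of the example.
let stack, _ = run (Map.ofList [ "v", 21 ]) [ Ldc 2; Ldsfld "v"; Mul ]
printfn "%A" stack   // prints [42]
```

The real listing reads v through a call to the get_v accessor rather than a direct field load; the sketch uses a field load only to keep the model small.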

The JIT compiler is responsible for generating the binary code that runs on the actual processor. The code generated by the JIT interacts with all the elements of the runtime, including external code loaded dynamically in the form of DLLs or COM components.

Because the F# compiler targets the CLR, its output is managed code, allowing compiled programs to interact directly with other programming languages targeting the .NET platform. Chapter 10 showed how to exploit this form of interoperability, when you saw how to develop a graphic control in F# and use it in a C# application.

Memory Management at Runtime

Interoperability of F# programs with unmanaged code requires an understanding of the structure of the most important elements of a programming language's runtime. In particular, you must consider how program memory is organized at runtime. Memory used by a program is generally classified in three classes depending on the way it's handled:

Static memory: Allocated for the entire lifetime of the program
Automatic memory: Allocated and freed automatically when functions or methods are executed
Dynamic memory: Explicitly allocated by the program, and freed explicitly or by an automatic program called the garbage collector

As a rule of thumb, top-level variables and static fields belong to the first class, function arguments and local variables belong to the second class, and memory explicitly allocated using the new operator belongs to the last class. The code generated by the JIT compiler uses different data structures to manage memory and automatically interacts with the operating system to request and release memory during program execution.

Each execution thread has a stack where local variables and arguments are allocated when a function or method is invoked (see Figure 17-2). A stack is used because it naturally follows the execution flow of method and function calls. The topmost record contains data about the currently executing function; below that is the record of the caller of the function, which sits on top of another record of its caller, and so on. These activation records are memory blocks used to hold the memory required during the execution of the function and are naturally freed at the end of its execution by popping the record off the stack. The stack data structure is used to implement the automatic memory of the program; and because different threads execute different functions at the same time, a separate stack is assigned to each of them.

Figure 17.2. Memory organization of a running CLR program

Dynamic memory is allocated in the heap, which is a data structure where data resides for an amount of time not directly related to the events of program execution. The memory is explicitly allocated by the program, and it's deallocated either explicitly or automatically depending on the strategy adopted by the runtime to manage the heap. In the CLR, the heap is managed by a garbage collector, which is a program that tracks memory usage and reclaims memory that is no longer used by the program. Data in the heap is always referenced from the stack—or other known areas such as static memory—either directly or indirectly. The garbage collector can deduce the memory potentially reachable by program execution in the future, and the remaining memory can be collected. During garbage collection, the running program may be suspended because the collector may need to manipulate objects needed by its execution. In particular, a garbage collector may adopt a strategy called copy collection that can move objects in memory; during this process, the references may be inconsistent. To avoid dangling references, the memory model adopted by the CLR requires that methods access the heap through object references stored on the stack; objects in the heap are forbidden to reference data on the stack.

Data structures are the essential tool provided by programming languages to group values. A data structure is rendered as a contiguous area of memory in which the constituents are available at a given offset from the beginning of the memory. The actual layout of an object is determined by the compiler (or by the interpreter for interpreted languages) and is usually irrelevant to the program because the programming language provides operators to access fields without having to explicitly access the memory. System programming, however, often requires explicit manipulation of memory, and programming languages such as C let you control the in-memory layout of data structures. The C specification, for instance, defines that fields of a structure are laid out sequentially, although the compiler is allowed to insert extra space between them. Padding is used to align fields at word boundaries of the particular architecture in order to optimize the access to the fields of the structure. Thus, the same data structure in a program may lead to different memory layouts depending on the strategy of the compiler or the runtime, even in a language such as C where field ordering is well defined. By default, the CLR lays out structures in memory without any constraint, which gives you the freedom of optimizing memory usage on a particular architecture, although it may introduce interoperability issues if a portion of memory must be shared among the runtimes of different languages.^[4]
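The effect of padding can be observed from F# itself. The sketch below declares two hypothetical structures (Padded and Packed are names invented here) that differ only in the packing requested through the StructLayout attribute, and asks the marshaller for the size each would occupy when passed to unmanaged code.

```fsharp
open System.Runtime.InteropServices

// Hypothetical structs, used only to show how layout attributes affect marshalled size.
// Sequential layout with default packing: the int field is aligned, so padding follows the byte.
[<Struct; StructLayout(LayoutKind.Sequential)>]
type Padded =
    val mutable Tag : byte
    val mutable Value : int

// Same fields, but Pack = 1 asks for no padding between them.
[<Struct; StructLayout(LayoutKind.Sequential, Pack = 1)>]
type Packed =
    val mutable Tag : byte
    val mutable Value : int

// Marshal.SizeOf reports the unmanaged size, which is what matters for interoperability.
let paddedSize = Marshal.SizeOf(typeof<Padded>)
let packedSize = Marshal.SizeOf(typeof<Packed>)
printfn "padded = %d bytes, packed = %d bytes" paddedSize packedSize
```

On common platforms the first structure should occupy 8 bytes and the second 5, which is the kind of difference that breaks interoperability when the two sides of a call disagree about the layout.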

Interoperability among different programming languages revolves mostly around memory layouts, because the execution of a compiled routine is a jump to a memory address. But routines access memory explicitly, expecting that data is organized in a certain way. The rest of this chapter discusses the mechanisms used by the CLR to interoperate with external code in the form of DLLs or COM components.

COM Interoperability

Component Object Model (COM) is a technology that Microsoft introduced in the 1990s to support interoperability among different programs possibly developed by different vendors. The Object Linking and Embedding (OLE) technology that lets you embed arbitrary content in a Microsoft Word document, for instance, relies on this infrastructure. COM is a binary standard that allows code written in different languages to interoperate, assuming that the programming language supports this infrastructure. Most of the Windows operating system and its applications are based on COM components.

The CLR was initially conceived as an essential tool to develop COM components, because COM was the key technology at the end of the 1990s. It's no surprise that the Microsoft implementation of the CLR interoperates easily and efficiently with the COM infrastructure.

This section briefly reviews the main concepts of the COM infrastructure and its goals in order to show you how COM components can be consumed from F# and how F# components can be exposed as COM components.

A COM component is a binary module with a well-defined interface that can be dynamically loaded at runtime by a running program. The COM design was influenced by CORBA and the Interface Definition Language (IDL) to describe a component as a set of interfaces. In the case of COM, however, components are always loaded inside the process using the dynamic loading of DLLs. Even when a component runs in a different process, a stub is loaded as a DLL, and it's responsible for interprocess communication.

When you create an instance of a COM component, you obtain a pointer to an IUnknown interface that acts as the entry point to all interfaces implemented by the component. The QueryInterface method of this interface allows you to get pointers to additional interfaces.

Interface pointers in COM are pointers to tables of pointers that define the locations of the methods. The program must know the layout of the table in order to read the desired pointer and invoke the corresponding method. This knowledge can be compiled into the program (interfaces must be known at compile time) or acquired at runtime by accessing component metadata, either through an interface named IDispatch or through a database called a type library.

Because COM components can be compiled by any compiler supporting the generation of memory layouts compatible with the standard, it's necessary for the client to share the same layout for data structures that must be passed or returned by the component methods. The standard type system for COM, defined in ole.dll, defines a simple and restricted set of types. COM types correspond to the Variant type of Visual Basic and provide only basic types and arrays. For structured types, COM requires a custom marshaller to be developed, but this has been rarely used in components that are widely available.

The COM infrastructure provides a memory manager that uses reference counting to automatically free components when they aren't used anymore. Whenever a pointer to an interface is copied, you must invoke the AddRef method of the IUnknown interface (every interface inherits from IUnknown); and when the pointer is no longer required, you should call the Release method to decrement the counter inside the component. When the counter reaches zero, the component is automatically freed. This strategy of memory management, although more automatic than the traditional malloc/free handling of the heap, has proven to be error prone, because programmers often forget to increment the counter when pointers are copied (risking dangling pointers) or to decrement it when a pointer is no longer needed (risking memory leaks).
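The counting protocol can be sketched in a few lines of F#. The CountedComponent class below is a hypothetical model written for this illustration; real COM components implement AddRef and Release in native code behind IUnknown, and the sketch only mirrors the bookkeeping.

```fsharp
// Hypothetical model of COM-style reference counting (not an actual COM component).
type CountedComponent() =
    let mutable count = 1          // the creator holds the first reference
    let mutable freed = false
    // Called whenever a pointer to the component is copied.
    member this.AddRef() =
        count <- count + 1
        count
    // Called whenever a holder no longer needs its pointer.
    member this.Release() =
        count <- count - 1
        if count = 0 then freed <- true   // a real component would free itself here
        count
    member this.IsFreed = freed

let c = CountedComponent()
c.AddRef() |> ignore      // a second pointer to the component is handed out
c.Release() |> ignore     // the first holder is done
printfn "freed after one Release: %b" c.IsFreed     // prints false
c.Release() |> ignore     // the second holder is done; the count reaches zero
printfn "freed after two Releases: %b" c.IsFreed    // prints true
```

Omitting the AddRef call in this sequence would free the component while the second holder still uses it, and omitting a Release would keep it alive forever; these are the two mistakes described above.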

When Microsoft started developing the runtime that has become the CLR, which was destined to replace the COM infrastructure, several design goals addressed common issues of COM development:

Memory management: Reference counting has proven to be error prone. A fully automated memory manager was needed to address this issue.
Pervasive metadata: The COM type system was incomplete, and the custom marshaller was too restrictive. A more complete and general type system whose description was available at runtime would have eased interoperability.
Data and metadata separation: The separation between data and metadata has proven to be fragile because components without their description are useless, and vice versa. A binary format containing both components and their descriptions avoids these issues.
Distributed components: DCOM, the distributed COM infrastructure, has proven to be inefficient. The CLR was designed with a distributed memory management approach to reduce the network overhead required to keep remote components alive.

The need for a better component infrastructure led Microsoft to create the CLR, but the following concepts from COM proved so successful that they motivated several aspects of the CLR:

Binary interoperability: The ability to interoperate at the binary level gives you the freedom to develop components from any language supporting the component infrastructure. For instance, Visual Basic developers can benefit from C++ components, and vice versa.
Dynamic loading: The interactive dynamic loading of components is an essential element to allow scripting engines such as Visual Basic for Applications to access the component model.
Reflection: The component description available at runtime allows programs to deal with unknown components. This is especially important for programs featuring scripting environments, as witnessed by the widespread use of IDispatch and type libraries in COM.

COM Metadata and the Windows Registry

COM components are compiled modules that conform to well-defined protocols designed to allow binary interoperability among different languages. An essential trait of component architectures is the ability to dynamically create components at runtime. It's necessary for an infrastructure to locate and inspect components in order to find and load them. The Windows registry holds this information, which is why it's such an important structure in the operating system.

The HKEY_CLASSES_ROOT registry key holds the definition of all the components installed on the local computer. It's helpful to understand the basic layout of the registry in this respect when you're dealing with COM components. The following is a simple script in JScript, executed^[5] by the Windows Scripting Host, which is an interpreter used to execute VBScript and JScript scripts on Windows:

w = WScript.CreateObject("Word.Application");
w.Visible = true;
WScript.Echo("Press OK to close Word");
w.Quit();

This simple script creates an instance of a Microsoft Word application and shows its window programmatically by setting the Visible property to true. Assuming that the script is executed using the wscript command (the default), its execution is stopped until the message box displayed by the Echo method is closed, and then Word is closed.

How can the COM infrastructure dynamically locate the Word component and load it without prior knowledge about it? The COM infrastructure is accessed through the ubiquitous CreateObject method, which accepts as input a string containing the program ID of the COM component to be loaded. This is the human-readable name of the component; but the COM infrastructure was conceived as a foundation of a potentially large number of components and therefore adopted globally unique identifier (GUID) strings to define components. GUIDs are often displayed during software installation and are familiar for their mysterious syntax of a sequence of hexadecimal digits enclosed in curly braces. GUIDs are also used to identify COM classes; these IDs are known as CLSIDs and are stored in the Windows registry as subkeys containing further metadata about the COM object. When CreateObject is invoked, the COM infrastructure looks for the key:

HKEY_CLASSES_ROOT\Word.Application\CLSID

The default value for the key in this example (on one of our computers) is as follows:

{000209FF-0000-0000-C000-000000000046}
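The same lookup can be reproduced from F#. The following is a minimal sketch, assuming a Windows machine and that the CLSID is stored as the default value of the key, as described above; tryGetClsid is a hypothetical helper name, not part of any library.

```fsharp
open Microsoft.Win32

// Hypothetical helper: returns the CLSID registered for a ProgID,
// or None when the key or its default value is missing.
// Windows-only, because it reads HKEY_CLASSES_ROOT\<progId>\CLSID.
let tryGetClsid (progId: string) : string option =
    use key = Registry.ClassesRoot.OpenSubKey(progId + @"\CLSID")
    if isNull key then None
    else
        // An empty value name selects the default value of the key.
        match key.GetValue("") with
        | :? string as clsid -> Some clsid
        | _ -> None
```

Calling tryGetClsid "Word.Application" on a computer with Word installed should return the GUID string stored under that key; the actual value depends on the installation.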

Now, you can access the registry key defining the COM component and find all the information relative to the component. The following screenshot shows the content of the LocalServer32 subkey, which says that winword.exe is the container of the Word.Application component. If a COM component should be executed in a process different from that of the creator, LocalServer32 contains the location of the executable. Components are often loaded in-process in the form of a DLL; in this case, the InprocServer32 key indicates the location of the library.

To get a feel for the number of COM components installed on a Windows system, you can use a few lines of F# using fsi.exe as a shell:

> open Microsoft.Win32;;
> let k = Registry.ClassesRoot.OpenSubKey("CLSID");;
val k : RegistryKey
> k.GetSubKeyNames().Length;;
val it : int = 9237
> k.Close();;
val it : unit = ()

The registry is also responsible for storing information (that is, the metadata associated with COM components that describes their interfaces) about type libraries. Type libraries are also described using registry entries and identified by GUIDs available under the registry key:

HKEY_CLASSES_ROOT\TypeLib

COM components can be easily consumed from F# programs, and the opposite is also possible by exposing .NET objects as COM components. The following example is similar to the one discussed in the "COM Metadata and the Windows Registry" sidebar, which was based on the Windows Scripting Host; this version uses F# and fsi.exe:

> open System;;
> let o = Activator.CreateInstance(Type.GetTypeFromProgID("Word.Application"));;
val o : obj
> let t = o.GetType();;
val t : Type = System.__ComObject
> t.GetProperty("Visible").SetValue(o, (true :> Object), null);;
val it : unit  = ()
> let m = t.GetMethod("Quit");;
val m : Reflection.MethodInfo
> m.GetParameters().Length;;
val it : int = 3
> m.GetParameters();;
val it : ParameterInfo []
       = [|System.Object& SaveChanges
             {Attributes = In, Optional, HasFieldMarshal;
              DefaultValue = System.Reflection.Missing;
              IsIn = true;
              IsLcid = false;
              IsOptional = true;
              IsOut = false;
              IsRetval = false;
              Member =
                Void Quit(System.Object ByRef,
                            System.Object ByRef, System.Object ByRef);
              MetadataToken = 134223449;
              Name = "SaveChanges";
              ParameterType = System.Object&;
              Position = 0;
              RawDefaultValue = System.Reflection.Missing;};
           ... more ... |]
> m.Invoke(o, [| null; null; null |]);;
val it : obj = null

Because F# is statically typed, you can't use the simple syntax provided by an interpreter. The compiler must know in advance the methods exposed by an object and the number and types of their arguments. Remember that even though fsi.exe allows you to interactively execute F# statements, it's still subject to the constraints of a compiled language. Because you're creating an instance of a COM component dynamically in this example, the compiler doesn't know anything about this component, so it can only be typed as System.Object. To obtain the same behavior as an interpreted language, you must resort to the .NET runtime's reflection support. Using the GetType method, you can obtain an object describing the type of the object o. Then, you can obtain a PropertyInfo object describing the Visible property, and you can invoke the SetValue method on it to show the Word main window. The SetValue method accepts the new value as a System.Object; therefore, you have to cast the Boolean value to System.Object to comply with the method signature.

In a similar way, you can obtain an instance of the MethodInfo class describing the Quit method. Because a method has a signature, you ask for the parameters; there are three of them, and they're optional. You can invoke the Quit method by calling the Invoke method and passing the object target of the invocation and an array of arguments that you set to null because arguments are optional.
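The reflection calls used in this session can be wrapped in small helpers to reduce the noise. The sketch below uses Type.InvokeMember, a different reflection entry point from the PropertyInfo and MethodInfo objects used above; setProp and invoke are hypothetical names introduced here. It is demonstrated on a managed StringBuilder so that it runs without Word installed.

```fsharp
open System
open System.Reflection

// Hypothetical helpers (not part of the original example) built on Type.InvokeMember.
// Sets a public instance property by name.
let setProp (target: obj) (name: string) (value: obj) =
    let flags = BindingFlags.SetProperty ||| BindingFlags.Public ||| BindingFlags.Instance
    target.GetType().InvokeMember(name, flags, null, target, [| value |]) |> ignore

// Invokes a public instance method by name and returns its result.
let invoke (target: obj) (name: string) (args: obj[]) : obj =
    let flags = BindingFlags.InvokeMethod ||| BindingFlags.Public ||| BindingFlags.Instance
    target.GetType().InvokeMember(name, flags, null, target, args)

// Demonstration on a managed object rather than a COM component.
let sb = System.Text.StringBuilder("hello")
setProp sb "Length" (box 4)                      // truncates the buffer to four characters
let text = invoke sb "ToString" [||] :?> string
printfn "%s" text                                // prints hell
```

With the COM object o from the session above, the analogous call would presumably be setProp o "Visible" (box true); that usage is an assumption and has not been checked against a particular version of Word.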

Note

Although COM technology is still widely used for obtaining so-called automation, .NET is quietly entering the picture, and several COM components are implemented using the CLR. Whenever a reference to mscoree.dll appears in the InprocServer32 registry key, the .NET runtime is used to deliver the COM services using the specified assembly. Through COM interfaces, native and .NET components can be composed seamlessly, leading to very complex interactions between managed and unmanaged worlds. Microsoft Word 2010, for instance, returns a .NET object instead of a COM wrapper, which provides access to Word services without the need for explicit use of reflection.

How can the runtime interact with COM components? The basic approach is based on the COM callable wrapper (CCW) and the runtime callable wrapper (RCW), as shown in Figure 17-3. The former is a chunk of memory dynamically generated with a layout compatible with the one expected from COM components, so that external programs—even legacy Visual Basic 6 applications—can access services implemented as managed components. The latter is more common and creates a .NET type dealing with the COM component, taking care of all the interoperability issues. It's worth noting that although the CCW can always be generated because the .NET runtime has full knowledge about assemblies, the opposite isn't always possible. Without IDispatch or type libraries, there is no description of a COM component at runtime. Moreover, if a component uses custom marshalling, it can't be wrapped by an RCW. Fortunately, for the majority of COM components, it's possible to generate an RCW.

Figure 17.3. The wrappers generated by the CLR to interact with COM components

Programming patterns based on event-driven programming are widely adopted, and COM components have a programming pattern to implement callbacks based on the notion of a sink. The programming pattern is based on the delegate event model, and the sink is where a listener can register a COM interface that should be invoked by a component to notify an event. The Internet Explorer Web Browser COM component (implemented by shdocvw.dll), for instance, provides a number of events to notify its host about various events such as loading a page or clicking a hyperlink. The RCW generated by the runtime exposes these events in the form of delegates and takes care of handling all the details required to perform the communication between managed and unmanaged code.

Although COM components can be accessed dynamically using .NET reflection, explicitly relying on the ability of the CLR to generate CCW and RCW, it's desirable to use a less verbose approach to COM interoperability. The .NET runtime ships with tools that allow you to generate RCW and CCW wrappers offline, which lets you use COM components as .NET classes and vice versa. These tools are as follows:

tlbimp.exe: This is a tool for generating an RCW of a COM component given its type library.
aximp.exe: This is similar to tlbimp.exe and supports the generation of wrappers for ActiveX components^[6] that have graphical interfaces (and that need to be integrated with Windows Forms).
tlbexp.exe: This generates a COM type library describing a .NET assembly. The CLR is loaded as a COM component and generates the appropriate CCW to make .NET types accessible as COM components.
regasm.exe: This is similar to tlbexp.exe. It also performs the registration of the assembly as a COM component.

To better understand how COM components can be accessed from your F# programs and vice versa, let's consider two examples. In the first, you wrap the widely used Flash Player into a form interactively; and in the second, you see how an F# object type can be consumed as if it were a COM component.

The Flash Player you're accustomed to using in everyday browsing is an ActiveX control that is loaded by Internet Explorer using an OBJECT element in the HTML page (it's also a plug-in for other browsers, but here you're interested in the COM component). By using a search engine, you can easily find that an HTML element similar to the following is used to embed the player in Internet Explorer:

<OBJECT
      classid ="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
      codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab"
      width   ="640" height="480"
      title   ="My movie">
   <param name="movie"   value="MyMovie.swf" />
   <param name="quality" value="high" />
</OBJECT>

From this tag, you know that the CLSID of the Flash Player ActiveX component is the one specified with the classid parameter of the OBJECT element. You can now look in the Windows registry under HKEY_CLASSES_ROOT\CLSID for the subkey corresponding to the CLSID of the Flash ActiveX control. If you look at the subkeys, you notice that the ProgID of the component is ShockwaveFlash.ShockwaveFlash, and InprocServer32 indicates that its location is C:\Windows\system32\Macromed\Flash\Flash10d.ocx. You can also find the GUID relative to the component type library, which, when investigated, shows that the type library is contained in the same OCX file.

Note

With a 64-bit version of Windows and the 32-bit version of the Flash Player, you should look for the key CLSID under HKEY_CLASSES_ROOT\Wow6432Node, which is where the 32-bit component's information is stored. In general, all references to the 32-bit code are stored separately from the 64-bit information. The loader tricks old 32-bit code into seeing different portions of the registry. In addition, executable files are stored under %WinDir%\SysWow64 instead of %WinDir%\system32. Moreover, to wrap 32- or 64-bit components, you need the corresponding version of the .NET tool.

Because Flash Player is an ActiveX control with a GUI, you can rely on aximp.exe rather than just tlbimp.exe to generate the RCW for the COM component:

C:\> aximp c:\Windows\System32\Macromed\Flash\Flash10d.ocx
Generated Assembly: C:\ShockwaveFlashObjects.dll
Generated Assembly: C:\AxShockwaveFlashObjects.dll

If you use ildasm.exe to analyze the structure of the generated assemblies, notice that the wrapper of the COM component is contained in ShockwaveFlashObjects.dll and is generated by the tlbimp.exe tool. The second assembly contains a Windows Forms host for ActiveX components and is configured to host the COM component, exposing the GUI features in terms of the elements of the Windows Forms framework.

You can test the Flash Player embedded in an interactive F# session as follows:

> #I @"c:\";;
--> Added 'c:\' to library include path
> #r "AxShockwaveFlashObjects.dll";;
--> Referenced 'c:\AxShockwaveFlashObjects.dll'
> open AxShockwaveFlashObjects;;
> open System.Windows.Forms;;
> let f = new Form();;
val f : Form
> let flash = new AxShockwaveFlash();;
val flash : AxShockwaveFlash
Binding session to 'c:\AxShockwaveFlashObjects.dll'...
> f.Show();;
val it : unit = ()
> flash.Dock <- DockStyle.Fill;;
val it : unit = ()
> f.Controls.Add(flash);;
val it : unit = ()
> flash.LoadMovie(0, "http://laptop.org/img/meshDemo18.swf");;
val it : unit = ()

You first use the #I directive to add the directory containing the assemblies generated by aximp.exe to the include path of fsi.exe, and then you reference the AxShockwaveFlashObjects.dll assembly using the #r directive. The namespace AxShockwaveFlashObjects containing the AxShockwaveFlash class is opened; this is the managed class wrapping the ActiveX control. You create an instance of the Flash Player that is now exposed as a Windows Forms control; then, you set the Dock property to DockStyle.Fill to let the control occupy the entire area of the form. Finally, you add the control to the form.

When you're typing the commands into F# Interactive, it's possible to test the content of the form. When it first appears, a right-click on the client area is ignored. After the ActiveX control is added to the form, the right-click displays the context menu of the Flash Player. You can now programmatically control the player by setting the properties and invoking its methods; the generated wrapper takes care of all the communications with the ActiveX component.

The Running Object Table

Sometimes you need to obtain a reference to an out-of-process COM object that is already running. This is useful when you want to automate some task of an already-started application or reuse an object model without needing to start a process more than once. The easiest way to achieve this is through the GetActiveObject method featured by the Marshal class:

#r "EnvDTE"
open System.Runtime.InteropServices
let appObj = Marshal.GetActiveObject("VisualStudio.DTE") :?> EnvDTE80.DTE2
printfn "%s" appObj.ActiveDocument.FullName

In this example, you obtain a reference to one of the most important interfaces of Visual Studio's COM automation model. An interesting experiment is to print the name of the active document open in the editor and try to run different instances of Visual Studio, opening different documents. The COM infrastructure connects to one instance of the COM server, and you can't specify which one.

You can find a specific instance by accessing a system-wide data structure called the Running Object Table (ROT), which provides a list of running COM servers. Because the name of a running server must be unique within the ROT, many servers mangle the PID with the COM ProgID so it's possible to connect to a given instance; this is the case for Visual Studio. The following F# function connects to a specific Visual Studio instance:

#r "EnvDTE"
open System.Runtime.InteropServices
open System.Runtime.InteropServices.ComTypes

[<DllImport("ole32.dll")>]
extern int internal GetRunningObjectTable(uint32 reserved, IRunningObjectTable& pprot)

[<DllImport("ole32.dll")>]
extern int internal CreateBindCtx(uint32 reserved, IBindCtx& pctx)

let FetchVSDTE (pid:int) =
  let mutable (prot:IRunningObjectTable) = null
  let mutable (pmonkenum:IEnumMoniker) = null
  let (monikers:IMoniker[]) =  Array.create 1 null
  let pfetched = System.IntPtr.Zero
  let mutable (ret:obj) = null
  let endpid = sprintf ":%d" pid

  try
    if (GetRunningObjectTable(0u, &prot) <> 0) || (prot = null) then
      failwith "Error opening the ROT"
    prot.EnumRunning(&pmonkenum)
    pmonkenum.Reset()
    while pmonkenum.Next(1, monikers, pfetched) = 0 do
      let mutable (insname:string) = null
      let mutable (pctx:IBindCtx) = null
      CreateBindCtx(0u, &pctx) |> ignore
      (monikers.[0]).GetDisplayName(pctx, null, &insname)
      Marshal.ReleaseComObject(pctx) |> ignore
      if insname.StartsWith("!VisualStudio.DTE") && insname.EndsWith(endpid) then
        prot.GetObject(monikers.[0], &ret) |> ignore
  finally
    if prot <> null then Marshal.ReleaseComObject(prot) |> ignore
    if pmonkenum <> null then Marshal.ReleaseComObject(pmonkenum) |> ignore
  (ret :?> EnvDTE.DTE)

You use two PInvoke declarations to import functions from the ole32.dll COM library where the ROT is defined. After you get a reference to the table, you enumerate its elements, retrieving the display name of each and looking for any Visual Studio DTE with a specific PID at its end. The GetObject method is used to finally connect to the desired interface.

This example shows the flexible control .NET and F# can provide over COM infrastructure. The ability to access specific COM object instances is widely used on the server side to implement services made available through web pages.

Next, let's see an example of exposing an F# object type as a COM component. There are several reasons it can be useful to expose a managed class as a COM component, but perhaps the most important is interoperability with legacy systems. COM has been around for a decade and has permeated every aspect of Windows development. Systems have largely used COM infrastructures, and they can be extended using this technology. COM components are heavily used by applications based on the Active Scripting architecture, such as ASP and VBA in Microsoft Office. The ability to expose F# code to existing applications is useful because it allows you to immediately start using this new language and integrate it seamlessly into existing systems.

Note

Version 4 of the CLR introduced a new set of tools for .NET. You must pay attention to which version you're using. If you compile an assembly targeting a framework older than version 4, use the tools located in the directory C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Bin. Otherwise, use the tools in C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Bin\NETFX 4.0 Tools.

Suppose you're exposing a simple F# object type as a COM component, and you invoke a method of this object type from a JScript script. You define the following type in the hwfs.fs file:

namespace HelloWorld

open System

type FSCOMComponent() =
    member x.HelloWorld() = "Hello world from F#!"

The assembly that must be exposed as a COM component should be added to the global assembly cache (GAC), which is where shared .NET assemblies are stored. Assemblies present in the GAC must be strongly named, which means a public key cryptographic signature must be used to certify the assembly. To perform the test, you generate a key pair to be used to sign the assembly using the sn.exe command available with the .NET SDK:

C:\> sn -k testkey.snk

Microsoft (R) .NET Framework Strong Name Utility  Version 4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Key pair written to testkey.snk

You can compile the program in a DLL called hwfs.dll:

C:\> fsc -a --keyfile:testkey.snk hwfs.fs

Note that you use the --keyfile switch to indicate to the compiler that the output should be signed using the specified key pair. Now, you can add the assembly to the GAC (note that under Windows Vista and 7, the shell used should run with administrator privileges):

C:\> gacutil /i hwfs.dll
Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Assembly successfully added to the cache

Use the regasm.exe tool to register the hwfs.dll assembly as a COM component:

C:\> regasm hwfs.dll
Microsoft (R) .NET Framework Assembly Registration Utility 4.0.30319.1
Copyright (C) Microsoft Corporation 1998-2004.  All rights reserved.

Types registered successfully

Next, you must write the script using the JScript language to test the component. The script uses the CreateObject method to create an instance of the F# object type, with the CCW generated by the CLR taking care of all the interoperability issues. But what is the ProgID of the COM component? You use regasm.exe with the /regfile switch to generate a registry file containing the keys corresponding to the registration of the COM component instead of performing it. The generated registry file contains the following component registration (we've included only the most relevant entries):

[HKEY_CLASSES_ROOT\HelloWorld.FSCOMComponent]
@="HelloWorld.FSCOMComponent"
[HKEY_CLASSES_ROOT\HelloWorld.FSCOMComponent\CLSID]
@="{30094EBA-CDCB-3C57-9297-49C724421ACB}"
[HKEY_CLASSES_ROOT\CLSID\{30094EBA-CDCB-3C57-9297-49C724421ACB}]
@="HelloWorld.FSCOMComponent"
[HKEY_CLASSES_ROOT\CLSID\{30094EBA-CDCB-3C57-9297-49C724421ACB}\InprocServer32]
@="mscoree.dll"
"ThreadingModel"="Both"
"Class"="HelloWorld.FSCOMComponent"
"Assembly"="hwfs, Version=0.0.0.0, Culture=neutral, PublicKeyToken=d25287aa2b2b0de6"
"RuntimeVersion"="v4.0.30319"

The InprocServer32 subkey indicates that the COM component is implemented by mscoree.dll, which is the CLR. The additional attributes indicate the assembly that should be run by the runtime.

Note that the ProgID and the class name of the component are HelloWorld.FSCOMComponent. You can now write the following script in the hwfs.js file:

o = WScript.CreateObject("HelloWorld.FSCOMComponent");
WScript.Echo(o.HelloWorld());

If you execute the script (here using the command-based host cscript), you obtain the expected output:

C:\> cscript hwfs.js
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.

Hello world from F#!

Using the fully qualified .NET name to name a COM component isn't always a good idea. Names are important, especially in a component infrastructure, because they're used for a long time. So far, you've used only the basic features of COM interoperability, but a number of custom attributes can give you finer control over the CCW generation. These attributes are defined in the System.Runtime.InteropServices namespace; among them you find the ProgIdAttribute class, whose name hints that it's related to the ProgID. You can annotate your F# object type using this attribute:

namespace HelloWorld

open System
open System.Runtime.InteropServices

[<ProgId("Hwfs.FSComponent")>]
type FSCOMComponent() =
    member x.HelloWorld() = "Hello world from F#!"

First, unregister the previous component:

C:\> regasm hwfs.dll /unregister
Microsoft (R) .NET Framework Assembly Registration Utility 4.0.30319.1
Copyright (C) Microsoft Corporation 1998-2004.  All rights reserved.

Types un-registered successfully
C:\> gacutil /u hwfs
Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.1

Copyright (c) Microsoft Corporation.  All rights reserved.

Assembly: hwfs, Version=0.0.0.0, Culture=neutral,
    PublicKeyToken=8c1f06f522fc70f8, processorArchitecture=MSIL
Uninstalled: hwfs, Version=0.0.0.0, Culture=neutral,
    PublicKeyToken=8c1f06f522fc70f8, processorArchitecture=MSIL
Number of assemblies uninstalled = 1
Number of failures = 0

Now you can update the script as follows and register everything again after recompiling the F# file:

o = WScript.CreateObject("Hwfs.FSComponent");
WScript.Echo(o.HelloWorld());

Using other attributes, it's possible to specify the GUIDs to be used and several other aspects that are important in some situations. When a system expects a component implementing a given COM interface, the COM component is expected to return the pointer to the interface with the appropriate GUID. In this case, the ability to explicitly indicate a GUID is essential to defining a .NET interface that should be marshalled as the intended COM interface.

Complex COM components can be tricky to wrap using these tools, and official wrappers are maintained by component developers. Microsoft provides the managed version of the Office components, the managed DirectX library, and the Web Browser control to spare programmers from having to build their own wrappers.

In conclusion, it's possible to use COM components from F# and vice versa. You can expose F# libraries as COM components, which allows you to extend existing systems using F# or use legacy code in F# programs.

Platform Invoke

COM interoperability is an important area in F#, but it's limited to Windows and to the Microsoft implementation of the ECMA and ISO standards of the Common Language Infrastructure (CLI). The CLI standard, however, devises a standard mechanism for interoperability that is called Platform Invoke (PInvoke to friends); it's a core feature of the standard available on all CLI implementations, including Mono.

The basic model underlying PInvoke is based on loading DLLs into the program, which allows managed code to invoke exported functions. DLLs don't provide information other than the entry point location of a function; this isn't enough to perform the invocation unless additional information is made available to the runtime.

The invocation of a function requires the following:

The address of the code in memory
The calling convention, which is how parameters, return values, and other information are passed through the stack to the function
Marshalling of values and pointers so that the different runtime support can operate consistently on the same values

You obtain the address of the entry point using a system call that returns the pointer to the function given a string. You must provide the remaining information to instruct the CLR about how the function pointer should be used.

Calling Conventions

Function and method calls (a method call is similar to a function call but with an additional pointer referring to the object passed to the method) are performed by using a shared stack between the caller and the callee. An activation record is pushed onto the stack when the function is called, and memory is allocated for arguments, the return value, and local variables. Additional information is also stored in the activation record, such as information about exception handling and the return address when the execution of the function terminates.

The physical structure of the activation record is established by the compiler (or by the JIT in the case of the CLR), and this knowledge must be shared between the caller and the called function. When the binary code is generated by a compiler, this isn't an issue; but when code generated by different compilers must interact, it may become a significant issue. Although each compiler may adopt a different convention, the need to perform system calls requires that the calling convention adopted by the operating system is implemented, and it's often used to interact with different runtimes. Another popular approach is to support the calling convention adopted by C compilers, because it's widely used and has become a fairly universal language for interoperability. Note that although many operating systems are implemented in C, the libraries providing system calls may adopt different calling conventions. This is the case with Microsoft Windows: the operating system adopts the stdcall calling convention rather than cdecl, which is the C calling convention.

A significant dimension in the arena of possible calling conventions is the responsibility for removing the activation record from the thread stack. At first glance, it may seem obvious that before returning, the called function resets the stack pointer to the previous state. This isn't the case for programming languages such as C that allow functions with a variable number of arguments, such as printf. When variable arguments are allowed, only the caller knows the exact size of the activation record; therefore, it's the caller's responsibility to free the stack at the end of the function call. Apart from being consistent with the chosen convention, there may seem to be little difference between the two choices; but if the caller is responsible for cleaning the stack, each function invocation requires more instructions, which leads to larger executables. For this reason, Windows uses the stdcall calling convention instead of the C calling convention. Note that the CLR uses an array of objects to pass a variable number of arguments, which is very different from the variable arguments of C: the method receives a single pointer to the array that resides in the heap.

It's important to note that if the memory layout of the activation record is compatible, as it is in Windows, using the cdecl convention instead of the stdcall convention leads to a subtle memory leak. If the runtime assumes the stdcall convention (which is the default), and the callee assumes the cdecl convention, the arguments pushed on the stack aren't freed, and at each invocation the height of the stack grows until a stack overflow happens.

The CLR supports a number of calling conventions. The two most important are stdcall and cdecl. Other implementations of the runtime may provide additional conventions to the user. In the PInvoke design, nothing restricts the supported conventions to these two (and in fact the runtime uses the fcall convention to invoke services provided by the runtime from managed code).

The additional information required to perform the function call is provided by custom attributes that are used to decorate a function prototype and inform the runtime about the signature of the exported function.

Getting Started with PInvoke

This section starts with a simple example of a DLL developed using C++, to which you add code during your experiments using PInvoke. The CInteropDLL.h header file declares a macro defining the decorations associated with each exported function:

#define CINTEROPDLL_API __declspec(dllexport)
extern "C" {
void CINTEROPDLL_API HelloWorld();
}

The __declspec directive is specific to the Microsoft Visual C++ compiler. Other compilers may provide different ways to indicate the functions that must be exported when compiling a DLL.

The first function is HelloWorld; its definition is as expected:

void CINTEROPDLL_API HelloWorld()
{
    printf("Hello C world invoked by F#!\n");
}

Say you now want to invoke the HelloWorld function from an F# program. You have to define the prototype of the function and inform the runtime how to access the DLL and the other information needed to perform the invocation. The program performing the invocation is the following:

open System.Runtime.InteropServices

module CInterop =
    [<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
    extern void HelloWorld()

CInterop.HelloWorld()

The extern keyword informs the compiler that the function definition is external to the program and must be accessed through the PInvoke interface. A C-style prototype definition follows the keyword, and the whole declaration is annotated with a custom attribute defined in the System.Runtime.InteropServices namespace. The F# compiler adopts C-style syntax for extern prototypes, including argument types (as you see later), because C headers and prototypes are widely used; this choice helps in the PInvoke definition. The DllImport custom attribute provides the information needed to perform the invocation. The first argument is the name of the DLL containing the function; the remaining option specifies the calling convention chosen to make the call. Because you don't specify otherwise, the runtime assumes that the name of the F# function is the same as the name of the entry point in the DLL. It's possible to override this behavior using the EntryPoint parameter in the DllImport attribute.

It's important to note the declarative approach of the PInvoke interface. No code is involved in accessing external functions. The runtime interprets metadata in order to automatically interoperate with native code contained in a DLL. This is a different approach from the one adopted by different virtual machines, such as the Java virtual machine. The Java Native Interface (JNI) requires that you define a layer of code using types of the virtual machine and invoke the native code.

PInvoke requires high privileges in order to execute native code, because the activation record of the native function is allocated on the same stack containing the activation records of managed functions and methods. Moreover, as discussed shortly, it's also possible to have the native code invoking a delegate marshalled as a function pointer, allowing stacks with native and managed activation records to be interleaved.

The HelloWorld function is a simple case because the function doesn't have input arguments and doesn't return any value. Consider this function with input arguments and a return value:

int CINTEROPDLL_API Sum(int i, int j)
{
    return i + j;
}

Invoking the Sum function requires integer values to be marshalled to the native code and the value returned to managed code. Simple types such as integers are easy to marshal because they're usually passed by value and use types of the underlying architecture. The F# program using the Sum function is as follows:

module CInterop =
    [<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
    extern int Sum(int i, int j)

printf "Sum(1, 1) = %d\n" (CInterop.Sum(1, 1));

Parameter passing follows the same semantics as the CLR: parameters are passed by value for value types and by the value of the reference for reference types. Again, you use the custom attribute to specify the calling convention for the invocation.

Data Structures

Let's first cover what happens when structured data is marshalled by the CLR in the case of nontrivial argument types. Here you see the SumC function responsible for adding two complex numbers defined by the Complex C data structure:

typedef struct _Complex {
    double re;
    double im;
} Complex;

Complex CINTEROPDLL_API SumC(Complex c1, Complex c2)
{
    Complex ret;
    ret.re = c1.re + c2.re;
    ret.im = c1.im + c2.im;
    return ret;
}

To invoke this function from F#, you must define a data structure in F# corresponding to the Complex C structure. If the memory layout of an instance of the F# structure is the same as that of the corresponding C structure, then values can be shared between the two languages. But how can you control the memory layout of a managed data structure? Fortunately, the PInvoke specification helps with custom attributes that let you specify the memory layout of data structures. The StructLayout custom attribute indicates the strategy adopted by the runtime to lay out fields of the data structure. By default, the runtime adopts its own strategy in an attempt to optimize the size of the structure, keeping fields aligned to the machine word in order to ensure fast access to the structure's fields. The C standard ensures that fields are laid out in memory sequentially in the order they appear in the structure definition; other languages may use different strategies. Using an appropriate argument, you can indicate that a C-like sequential layout strategy should be adopted. It's also possible to provide an explicit layout for the structure, indicating the offset in memory for each field of the structure. This example uses the sequential layout for the Complex value type:

module CInterop =
    [<Struct; StructLayout(LayoutKind.Sequential)>]
    type Complex =
        val mutable re:double
        val mutable im:double

        new(r,i) = { re = r; im = i; }

    [<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
    extern Complex SumC(Complex c1, Complex c2)

let c1 = CInterop.Complex(1.0, 0.0)
let c2 = CInterop.Complex(0.0, 1.0)

let mutable c3 = CInterop.SumC(c1, c2)
printf "c3 = SumC(c1, c2) = %f + %fi\n" c3.re c3.im;

The SumC prototype refers to the F# Complex value type. But because the layout of the structure in memory is the same as the corresponding C structure, the runtime passes the bits that are consistent with those expected by the C code.

Marshalling Parameters

A critical aspect of dealing with PInvoke is ensuring that values are marshalled correctly between managed and native code and vice versa. A structure's memory layout doesn't depend only on the order of the fields. Compilers often introduce padding to align fields to memory addresses so that access to fields requires fewer memory operations, because CPUs load data into registers with the same strategy. Padding may speed up access to the data structure, but it introduces inefficiencies in memory usage: there may be gaps in the structures, leading to allocated but unused memory.

Consider, for instance, the following C structure:

struct Foo {
    int i;
    char c;
    short s;
};

Depending on the compiler's alignment settings, it may occupy from 7 to 12 bytes on a 32-bit architecture. The most compact version of the structure uses the first 4 bytes for i, a single byte for c, and 2 more bytes for s, for 7 bytes in total. With natural alignment, one padding byte follows c so that s starts at an even address, which gives 8 bytes. If the compiler aligns fields to addresses that are multiples of 4, then the integer i occupies the first slot, 4 more bytes are allocated for c (although only one is used), and the same happens for s, for a total of 12 bytes.

Padding is a common practice in C programs; because it may affect performance and memory usage, directives instruct the compiler about padding. It's possible to have data structures with different padding strategies running within the same program.

The first step you face when using PInvoke to access native code is finding the definition of data structures, including information about padding. Then, you can annotate F# structures to have the same layout as the native ones, and the CLR can automate the marshalling of data. Note that you can pass parameters by reference; thus, the C code may access the memory managed by the runtime, and errors in memory layout may result in corrupted memory. For this reason, you should keep PInvoke code to a minimum and verify it accurately to ensure that the execution state of the virtual machine is preserved. The declarative nature of the interface is a great help in this respect because you must check declarations and not interop code.

Not all the values are marshalled as is to native code; some values may require additional work from the runtime. Strings, for instance, have different memory representations in native and managed code. C strings are arrays of bytes that are null terminated, whereas runtime strings are .NET objects with a different layout. Also, function pointers are mediated by the runtime: the calling convention adopted by the CLR isn't compatible with external conventions, so the runtime generates code stubs that let native code call managed code and vice versa.

In the SumC example, arguments are passed by value, but native code often requires data structures to be passed by reference, which avoids the cost of copying the entire structure by passing only a pointer to the data. The ZeroC function resets a complex number whose pointer is passed as an argument:

void CINTEROPDLL_API ZeroC(Complex* c)
{
    c->re = 0;
    c->im = 0;
}

The F# declaration for the function is the same as the C prototype:

[<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
extern void ZeroC(Complex* c)

Now you need a way to obtain a pointer given a value of type Complex in F#. You can use the && operator to indicate a pass by reference; this results in passing the pointer to the structure expected by the C function:

let mutable c4 = CInterop.SumC(c1, c2)
printf "c4 = SumC(c1, c2) = %f + %fi\n" c4.re c4.im

CInterop.ZeroC(&&c4)
printf "c4 = %f + %fi\n" c4.re c4.im

In C and C++, the notion of objects (or struct instances) and the classes of memory are orthogonal: an object can be allocated on the stack or on the heap and share the same declaration. In .NET, this isn't the case; objects are instances of classes and are allocated on the heap, and value types are stored in the stack or wrapped into objects in the heap.

Is it possible to pass objects to native functions through PInvoke? The main issue with objects is that the heap is managed by the garbage collector, and one possible strategy for garbage collection is copy collection (a technique that moves objects in the heap when a collection occurs). Thus, the base address in memory of an object may change over time, which can be a serious problem if the pointer to the object has been marshalled to a native function through a PInvoke invocation. The CLR provides an operation called pinning that prevents an object from moving during garbage collection. Pinned pointers are assigned to local variables, and pinning is released when the function performing the pinning exits. It's important to understand the scope of pinning: if the native code stores the pointer somewhere before returning, the pointer may become invalid but still usable from native code.

Now, let's define an object type for Complex and marshal F# objects to a C function. The goal is to marshal the F# object to the ZeroC function. In this case, you can't use the pass-by-reference operator, and you must define everything so that the type checker is happy. You can define another function that refers to ZeroC but with a different signature involving ObjComplex, which is an object type similar to the Complex value type. The EntryPoint parameter maps the F# function onto the same ZeroC C function, although in this case the argument is of type ObjComplex rather than Complex:

module CInterop =
    [<StructLayout(LayoutKind.Sequential)>]
    type ObjComplex =
        val mutable re:double
        val mutable im:double

        new() as x = { re = 0.0; im = 0.0 }
        new(r:double, i:double) as x = { re = r; im = i }

    [<DllImport("CInteropDLL", EntryPoint="ZeroC", CallingConvention=CallingConvention.Cdecl)>]
    extern void ObjZeroC(ObjComplex c)

let oc = CInterop.ObjComplex(2.0, 1.0)
printf "oc = %f + %fi\n" oc.re oc.im
CInterop.ObjZeroC(oc)
printf "oc = %f + %fi\n" oc.re oc.im

In this case, the object reference is marshalled as a pointer to the C code, and you don't need the && operator in order to call the function. The object is pinned to ensure that it doesn't move during the function call.

Marshalling Strings

PInvoke defines the default behavior for mapping common types used by the Win32 API. Table 17-1 shows the default conversions. Most of the mappings are natural, but note that there are several entries for strings. This is because strings are represented in different ways in programming language runtimes.

Table 17.1. Default Mapping for Types of the Win32 API Listed in Wtypes.h

Unmanaged Type in `Wtypes.h`	Unmanaged C Type	Managed Class	Description
`HANDLE`	`void*`	`System.IntPtr`	32 bits on 32-bit Windows operating systems, 64 bits on 64-bit Windows operating systems
`BYTE`	`unsigned char`	`System.Byte`	8 bits
`SHORT`	`short`	`System.Int16`	16 bits
`WORD`	`unsigned short`	`System.UInt16`	16 bits
`INT`	`int`	`System.Int32`	32 bits
`UINT`	`unsigned int`	`System.UInt32`	32 bits
`LONG`	`long`	`System.Int32`	32 bits
`BOOL`	`long`	`System.Int32`	32 bits
`DWORD`	`unsigned long`	`System.UInt32`	32 bits
`ULONG`	`unsigned long`	`System.UInt32`	32 bits
`CHAR`	`char`	`System.Char`	Decorate with ANSI
`LPSTR`	`char*`	`System.String` or `System.Text.StringBuilder`	Decorate with ANSI
`LPCSTR`	`const char*`	`System.String` or `System.Text.StringBuilder`	Decorate with ANSI
`LPWSTR`	`wchar_t*`	`System.String` or `System.Text.StringBuilder`	Decorate with Unicode
`LPCWSTR`	`const wchar_t*`	`System.String` or `System.Text.StringBuilder`	Decorate with Unicode
`FLOAT`	`float`	`System.Single`	32 bits
`DOUBLE`	`double`	`System.Double`	64 bits

To see how strings are marshalled, start with a simple C function that echoes a string on the console:

void CINTEROPDLL_API echo(char* str)
{
    puts(str);
}

The corresponding F# PInvoke prototype is as follows:

[<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
extern void echo(string s);

What happens when the F# function echo is invoked? The managed string is represented by an array of Unicode characters described by an object in the heap; the C function expects a pointer to an array of single-byte ANSI characters that are null terminated. The runtime is responsible for performing the conversion between the two formats, and it's performed by default when mapping a .NET string to an ANSI C string.

It's common to pass strings that are modified by C functions, yet .NET strings are immutable. For this reason, it's also possible to use a System.Text.StringBuilder object instead of a string. Instances of this class represent mutable strings and have an associated buffer containing the characters of the string. You can write a C function in the DLL that fills a string buffer given the size of the buffer:

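void CINTEROPDLL_API sayhello(char* str, int sz)
{
    static const char* data = "Hello from C code!";
    int len;
    if (sz <= 0) return;                     /* no room even for the terminator */
    len = min(sz - 1, (int)strlen(data));    /* reserve one byte for the null */
    strncpy(str, data, len);
    str[len] = 0;
}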

Because the function writes into the string buffer passed as an argument, you must take care and use a StringBuilder rather than a string to ensure that the buffer has the appropriate room for the function to write. You can use the following F# PInvoke prototype:

[<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
extern void sayhello(StringBuilder sb, int sz);

Because you have to indicate the size of the buffer, you can use a constructor of the StringBuilder class that allows you to do so:

let sb = new StringBuilder(50)

CInterop.sayhello(sb, 50)
printf "%s\n" (sb.ToString())

You've used ANSI C strings so far, but this isn't the only type of string. Wide-character strings are becoming widely adopted and use 2 bytes to represent a single character; following the C tradition, the string is terminated by a null character. Consider a wide-character version of the sayhello function:

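void CINTEROPDLL_API sayhellow(wchar_t* str, int sz)
{
    static const wchar_t* data = L"Hello from C code Wide!";
    int len;
    if (sz <= 0) return;                     /* no room even for the terminator */
    len = min(sz - 1, (int)wcslen(data));    /* reserve one element for the null */
    wcsncpy(str, data, len);
    str[len] = 0;
}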

How can you instruct the runtime that the StringBuilder should be marshalled as a wide-character string rather than an ANSI string? The declarative nature of PInvoke helps by providing a custom attribute to annotate function parameters of the prototype and to inform the CLR about the marshalling strategy to be adopted. The sayhellow function is declared in F# as follows:

[<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
extern void sayhellow([<MarshalAs(UnmanagedType.LPWStr)>]StringBuilder sb, int sz);

In this case, the MarshalAs attribute indicates that the string should be marshalled as LPWSTR rather than LPSTR.

Function Pointers

Another important data type that often must be passed to native code is the function pointer. Function pointers are widely used to implement callbacks and provide a simple form of functional programming; think, for instance, of a sort function that receives a pointer to the comparison function as input. Graphical toolkits have widely used this data type to implement event-driven programming, where the application passes in a function that the toolkit invokes later.

PInvoke can marshal delegates as function pointers; again, the runtime is responsible for generating a suitable function pointer callable from native code. When the marshalled function pointer is invoked, a stub is called, and the activation record on the stack is rearranged to be compatible with the calling convention of the runtime. Then, the delegate function is invoked.

Although in principle the generated stub is responsible for implementing the calling convention adopted by the native code receiving the function pointer, the CLR marshals delegates as stdcall function pointers by default. Thus, the native code should adopt this calling convention when invoking the pointer. (Since .NET 2.0, a delegate type can be annotated with the UnmanagedFunctionPointer attribute to request a different convention, such as cdecl.) On the Windows platform the stdcall calling convention is widely used, so the default is rarely a problem.

The following C function uses a function pointer to apply a function to an array of integers:

typedef int (CALLBACK *TRANSFORM_CALLBACK)(int);

void CINTEROPDLL_API transformArray(int* data, int count, TRANSFORM_CALLBACK fn)
{
    int i;
    for (i = 0; i < count; i++)
        data[i] = fn(data[i]);
}

The TRANSFORM_CALLBACK type definition defines the prototype of the function pointer you're interested in here: a function taking an integer as the input argument and returning an integer as a result. The CALLBACK macro is defined by the Windows SDK headers and expands to __stdcall in order to indicate that the function pointer, when invoked, should adopt the stdcall calling convention instead of the cdecl calling convention.
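To see the callback mechanism in isolation, here is a minimal sketch that stays entirely in C. The names transform_fn, transform_array, and increment are hypothetical, and the Windows-specific calling-convention macro is left out, so the compiler's default convention applies:

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for TRANSFORM_CALLBACK: no __stdcall annotation,
   so the compiler's default calling convention is used. */
typedef int (*transform_fn)(int);

/* Applies fn to every element of data in place. */
void transform_array(int* data, int count, transform_fn fn)
{
    int i;

    assert(data != NULL && fn != NULL);

    for (i = 0; i < count; i++)
        data[i] = fn(data[i]);
}

/* Hypothetical callback: adds one to its argument. */
int increment(int x)
{
    return x + 1;
}
```

Passing increment to transform_array over the array 1, 2, 3 turns it into 2, 3, 4. This is the role the F# delegate plays once PInvoke has marshalled it into a native function pointer.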

The transformArray function takes as input an array of integers with its length and the function to apply to its elements. You now have to define the F# prototype for this function by introducing a delegate type with the same signature as TRANSFORM_CALLBACK:

type Callback = delegate of int -> int

[<DllImport("CInteropDLL", CallingConvention=CallingConvention.Cdecl)>]
extern void transformArray(int[] data, int count, Callback transform);

Now, you can increment all the elements of an array by using the C function:

let data = [| 1; 2; 3 |]
printf "%s\n" (System.String.Join("; ", Array.map string data))

CInterop.transformArray(data, data.Length, new CInterop.Callback(fun x -> x + 1))
printf "%s\n" (System.String.Join("; ", Array.map string data))

PInvoke declarations are concise, but keep in mind that for data types such as function pointers, parameter passing can be expensive. In general, libraries assume that crossing the language boundary costs efficiency and that callbacks are more expensive to invoke than ordinary functions. In this respect, the example represents a situation where the overhead of PInvoke is significant: a single call to transformArray triggers one callback per array element without performing any real computation in the native code.

PInvoke Memory Mapping

As a more complicated example of PInvoke usage, this section shows you how to use memory mapping from F# programs. Memory mapping is a popular technique that allows a program to see a file (or a portion of a file) as if it were in memory. This provides an efficient way to access files, because the operating system uses its virtual memory machinery to bring file contents into memory on demand. After proper initialization, which is covered in a moment, the program obtains a pointer into memory, and accessing that portion of memory is equivalent to accessing the data stored in the file.

You can use memory mapping to both read and write files. Every access performed to memory is reflected in the corresponding position in the file.

This is a typical sequence of system calls in order to map a file into memory:

A call to the CreateFile system call to open the file and obtain a handle to the file.
A call to the CreateFileMapping system call to create a mapped file object.
One or more calls to MapViewOfFile and UnmapViewOfFile to map and release portions of a file into memory. In a typical usage, the whole file is mapped at once in memory.
A call to CloseHandle to release the file.
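The same open, map, unmap, close shape can be sketched in C. The sketch below deliberately swaps in the POSIX calls (open, mmap, munmap, close) rather than the Win32 functions listed above, and POSIX has no separate mapping object, so steps 2 and 3 collapse into a single mmap call. The function name first_byte_mapped is hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the first byte of the file at path, or -1 on any failure
   (including an empty file, which cannot be mapped). */
int first_byte_mapped(const char* path)
{
    struct stat st;
    void* view;
    int fd, result;

    assert(path != NULL);

    fd = open(path, O_RDONLY);               /* open the file */
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    /* map the whole file read-only into memory */
    view = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                               /* the mapping outlives the descriptor */
    if (view == MAP_FAILED)
        return -1;

    result = ((const unsigned char*)view)[0]; /* plain memory access reads the file */

    munmap(view, (size_t)st.st_size);        /* release the view */
    return result;
}
```

As in the Win32 sequence, the file handle can be released as soon as the mapping exists; only the view has to stay alive while the memory is being read.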

The PInvoke interface to the required functions involves simple type mappings as is usual for Win32 API functions. All the functions are in kernel32.dll, and the signature can be found in the Windows SDK. Listing 17-1 contains the definition of the F# wrapper for memory mapping.

The SetLastError parameter informs the runtime that the called function uses the Windows mechanism for error reporting, so the runtime preserves the value reported by GetLastError and you can read it in case of error; otherwise, the CLR ignores such a value. The CharSet parameter indicates the character set assumed, and it's used to distinguish between ANSI and Unicode characters; with Auto, you let the runtime choose the appropriate version.

You can define the generic class MemMap that uses the functions to map a given file into memory. The goal of the class is to provide access to memory mapping in a system where memory isn't directly accessible because the runtime is responsible for its management. A natural programming abstraction to expose the memory to F# code is to provide an array-like interface where the memory is seen as a homogeneous array of values.

Example 17.1. Exposing Memory Mapping in F#

module MMap =

    open System
    open System.IO
    open System.Runtime.InteropServices
    open Microsoft.FSharp.NativeInterop
    open Printf

    type HANDLE = nativeint
    type ADDR   = nativeint

    [<DllImport("kernel32", SetLastError=true)>]
    extern bool CloseHandle(HANDLE handler)

    [<DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto)>]
    extern HANDLE CreateFile(string lpFileName,
                             int dwDesiredAccess,
                             int dwShareMode,
                             HANDLE lpSecurityAttributes,
                             int dwCreationDisposition,
                             int dwFlagsAndAttributes,
                             HANDLE hTemplateFile)

    [<DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto)>]
    extern HANDLE CreateFileMapping(HANDLE hFile,
                                    HANDLE lpAttributes,
                                    int flProtect,
                                    int dwMaximumSizeLow,
                                    int dwMaximumSizeHigh,
                                    string lpName)

    [<DllImport("kernel32", SetLastError=true)>]
    extern ADDR MapViewOfFile(HANDLE hFileMappingObject,
                              int dwDesiredAccess,
                              int dwFileOffsetHigh,
                              int dwFileOffsetLow,
                              int dwNumBytesToMap)

    [<DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto)>]
    extern HANDLE OpenFileMapping(int dwDesiredAccess,
                                  bool bInheritHandle,
                                  string lpName)

    [<DllImport("kernel32", SetLastError=true)>]
    extern bool UnmapViewOfFile(ADDR lpBaseAddress)

    let INVALID_HANDLE = new IntPtr(-1)
    let MAP_READ    = 0x0004
    let GENERIC_READ = 0x80000000
    let NULL_HANDLE = IntPtr.Zero
    let FILE_SHARE_NONE = 0x0000
    let FILE_SHARE_READ = 0x0001
    let FILE_SHARE_WRITE = 0x0002
    let FILE_SHARE_READ_WRITE = 0x0003
    let CREATE_ALWAYS  = 0x0002
    let OPEN_EXISTING   = 0x0003
    let OPEN_ALWAYS  = 0x0004
    let READONLY = 0x00000002

    // The unmanaged constraint is required by the NativePtr functions used below
    type MemMap<'a when 'a : unmanaged> (fileName) =

        // Accept only basic integer types as the element type
        let ok =
            match typeof<'a> with
            | ty when ty = typeof<int>     -> true
            | ty when ty = typeof<int32>   -> true
            | ty when ty = typeof<byte>    -> true
            | ty when ty = typeof<sbyte>   -> true
            | ty when ty = typeof<int16>   -> true
            | ty when ty = typeof<uint16>  -> true
            | ty when ty = typeof<int64>   -> true
            | ty when ty = typeof<uint64>  -> true
            | _ -> false

        do if not ok then failwithf
           "the type %s is not a basic blittable type" ((typeof<'a>).ToString())
        let hFile =
           CreateFile (fileName,
                         GENERIC_READ,
                         FILE_SHARE_READ_WRITE,
                         IntPtr.Zero, OPEN_EXISTING, 0, IntPtr.Zero  )
        do if ( hFile.Equals(INVALID_HANDLE) ) then
            Marshal.ThrowExceptionForHR(Marshal.GetHRForLastWin32Error());
        let hMap = CreateFileMapping (hFile, IntPtr.Zero, READONLY, 0,0, null )
        do CloseHandle(hFile) |> ignore
        do if hMap.Equals(NULL_HANDLE) then
            Marshal.ThrowExceptionForHR(Marshal.GetHRForLastWin32Error());

        let start = MapViewOfFile (hMap, MAP_READ,0,0,0)

        do  if ( start.Equals(IntPtr.Zero) ) then
             Marshal.ThrowExceptionForHR(
                  Marshal.GetHRForLastWin32Error())


        // i is a byte offset from the start of the mapped view
        member m.AddressOf(i: int) : 'a nativeptr  =
             NativePtr.ofNativeInt(start + (nativeint i))

        member m.GetBaseAddress (i:int) : int -> 'a =
            NativePtr.get (m.AddressOf(i))

        member m.Item
            with get(i : int) : 'a = m.GetBaseAddress 0 i

        member m.Close() =
           UnmapViewOfFile(start) |> ignore;
           CloseHandle(hMap) |> ignore

        interface IDisposable with
          member m.Dispose() =
             m.Close()

The class exposes two members for reading data: GetBaseAddress and Item. The former takes an offset into the mapped file and returns a function that reads the element at a given index starting from that offset; the latter is an indexer that reads the element at a given index from the origin of the mapping.

The following example uses the MemMap class to read the first byte of a file:

let mm = new MMap.MemMap<byte>("somefile.txt")

printf "%A\n" (mm.[0])

mm.Close()

Memory mapping provides a good example of how easy it can be to expose native functionality to the .NET runtime and how effective F# can be in this task. It's also a good example of the right way to use PInvoke: avoid calling PInvoked functions directly, and build wrappers that encapsulate them. Verifiable code is one of the greatest benefits provided by virtual machines, and PInvoke signatures often lead to nonverifiable code that requires high execution privileges and risks corrupting the runtime's memory.

A good approach to reduce the amount of potentially unsafe code is to define assemblies that are responsible for accessing native code with PInvoke and that expose functionalities in a .NET verifiable approach. This way, the code that should be trusted by the user is smaller, and programs can have all the benefits provided by verified code.

Wrapper Generation and Limits of PInvoke

PInvoke is a flexible and customizable interface, and it's expressive enough to define prototypes for most libraries available. However, in some situations it can be difficult to map the native interface directly into the corresponding signature. A significant example is function pointers embedded into structures, a typical C programming pattern that approximates object-oriented programming. Here, the structure contains a number of pointers to functions that can be used as methods; but you must take care to pass the pointer to the structure as the first argument to simulate the this parameter. Oracle's Berkeley Database (BDB) is a popular database library that adopts this programming pattern. The core structure describing an open database is as follows:

struct __db {
        /* ... */
        DB_ENV *dbenv;              /* Backing environment. */
        DBTYPE type;               /* DB access method type. */
        /* ... */
        int  (*close) __P((DB *, u_int32_t));
        int  (*cursor) __P((DB *, DB_TXN *, DBC **, u_int32_t));
        int  (*del) __P((DB *, DB_TXN *, DBT *, u_int32_t));
        // ...
}

This was impossible to access directly from the PInvoke interface until .NET 2.0, because function pointers in managed structures were impossible to describe. With version 2 of the runtime, the System.Runtime.InteropServices.Marshal class features the GetFunctionPointerForDelegate method for obtaining a pointer to a function that invokes a given delegate. The caller must guarantee that the delegate object remains alive for the lifetime of the structure, because the stubs generated by the runtime aren't moved by the garbage collector but can still be collected. Furthermore, callbacks adopt the stdcall calling convention by default; if the library invokes them with a different convention, the PInvoke interface can't interact with it correctly.

When PInvoke's expressivity isn't enough for wrapping a function call, you can still write an adapter library in a native language such as C. This is the approach followed by the BDB# library, where an intermediate layer of code has been developed to make the interface to the library compatible with PInvoke. The trick has been, in this case, to define a function for each database function, taking as input the pointer to the structure and performing the appropriate call:

DB *db;
// BDB call
db->close(db, 0);
// Wrapper call
db_close(db, 0);
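The adapter idea can be sketched with a miniature of the same pattern. Everything here is hypothetical (the store type, its close slot, and the store_close adapter) and isn't taken from BDB or BDB#; the point is only that the adapter has a flat C signature that a PInvoke prototype can describe, while the call through the embedded pointer stays on the native side:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the pattern: a struct whose "methods"
   are function pointers that take the struct itself as first argument. */
typedef struct store store;
struct store {
    int is_open;
    int (*close)(store* self, unsigned int flags);
};

/* Implementation installed in the function-pointer slot. */
int store_close_impl(store* self, unsigned int flags)
{
    (void)flags;                  /* unused in this sketch */
    self->is_open = 0;
    return 0;
}

/* Flat adapter: an ordinary exported-style function that forwards
   to the pointer stored inside the struct. */
int store_close(store* s, unsigned int flags)
{
    assert(s != NULL && s->close != NULL);
    return s->close(s, flags);
}
```

One such adapter is needed for every function pointer in the structure that managed code wants to call.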

The problem with wrappers is that they must be maintained manually when the signatures of the original library change. The intermediate adapter makes it more difficult to maintain the code's overall interoperability.

Many libraries have a linear interface that can be easily wrapped using PInvoke, and of course wrapper generators have been developed. At the moment, there are no wrapper generators for F#, but the C-like syntax for PInvoke declarations makes it easy to translate C# wrappers into F# code. An example of such a tool is SWIG, which is a multilanguage wrapper generator that reads C header files and generates interop code for a large number of programming languages such as C#.

Summary

In this chapter, you saw how F# can interoperate with native code in the form of COM components and the standard Platform Invoke interface defined by the ECMA and ISO standards. Neither mechanism is dependent on F#, but the language exposes the appropriate abstractions built into the runtime. You studied how to consume COM components from F# programs and vice versa, and how to access DLLs through PInvoke.

^[4]Languages targeting .NET aren't affected by these interoperability issues because they share the same CLR runtime.

^[5]To execute the script, save the text in a file with a .js extension (for instance, StartWord.js) and then double-click its icon in Windows Explorer.

^[6]ActiveX components are COM components implementing a well-defined set of interfaces. They have a graphical interface. Internet Explorer is well known for loading these components, but ActiveX can be loaded by any application using the COM infrastructure.
