Chapter 5. Higher-Order Functions

Welcome back, my friends, to the show that never ends. In this chapter, we’ll look at uses for higher-order functions. I’m going to show you novel ways to use them in C# to save yourself effort and to make code that is less likely to fail.

But, what are higher-order functions? This slightly odd name represents something very simple. In fact, you’ve likely been using higher-order functions for some time if you’ve spent much time working with LINQ. They come in two flavors; here’s the first:

var liberatorCrew = new []
{
    "Roj Blake",
    "Kerr Avon",
    "Vila Restal",
    "Jenna Stannis",
    "Cally",
    "Olag Gan",
    "Zen"
};
var filteredList = liberatorCrew.Where(x => x.First() > 'M');

Passed into the Where() function is an arrow expression, which is just shorthand for writing out an unnamed function. The longhand version would look like this:

static bool FirstLetterIsGreaterThanM(string x)
{
    return x.First() > 'M';
}

So here, the function has been passed around as the parameter to another function, to be executed elsewhere inside it.
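A named method can be handed to Where() by its method group, exactly as the lambda was. Here's a small self-contained sketch (with a shortened crew list for brevity):

```csharp
using System;
using System.Linq;

var liberatorCrew = new[] { "Roj Blake", "Kerr Avon", "Vila Restal", "Zen" };

// The method group is passed as the parameter, just like the lambda was
var filteredList = liberatorCrew.Where(FirstLetterIsGreaterThanM);
// filteredList: "Roj Blake", "Vila Restal", "Zen"

static bool FirstLetterIsGreaterThanM(string x) => x.First() > 'M';
```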

This is another example of the use of higher-order functions:

public Func<int, int> MakeAddFunc(int x) => y => x + y;

Notice here that there are two arrows, not one. We’re taking an integer x and from that returning a new function. In that new function, references to x will be filled in with whatever was provided when MakeAddFunc() was called originally.

For example:

var addTenFunction = MakeAddFunc(10);
var answer = addTenFunction(5);
// answer is 15

By passing 10 into MakeAddFunc() in this example, we create a new function whose purpose is simply to add 10 to whatever additional integer we pass into it.

In short, a higher-order function has one or more of the following properties:

  • Accepts a function as a parameter

  • Returns a function as its return type

In C#, this is all typically done with either a Func (for functions with a return type) or Action (for functions that return void) delegate type. Higher-order functions are a fairly simple idea that’s even easier to implement, but the effect they can have on your codebase is incredible.
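As a quick illustration (the names here are my own, not from any library), here's each delegate flavor in action, plus a higher-order function that accepts both as parameters:

```csharp
using System;

// Func<int, int>: takes an int, returns an int
Func<int, int> square = x => x * x;

// Action<string>: takes a string, returns nothing
Action<string> log = msg => Console.WriteLine("LOG: " + msg);

// A higher-order function: it accepts both delegates as parameters
var answer = ApplyAndLog(5, square, log);
// answer = 25

static int ApplyAndLog(int input, Func<int, int> f, Action<string> logger)
{
    var result = f(input);
    logger($"f({input}) = {result}");
    return result;
}
```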

In this chapter, I’m going to walk you through ways of using higher-order functions to improve your daily coding. I’ll also introduce a next-level usage of higher-order functions called combinators. These enable passing around functions to create a more complex and useful behavior.

Note

Combinators are called that, incidentally, because they originate from a mathematical technique called combinatory logic. You won’t need to worry about ever hearing that term again or about any references to advanced math—I’m not going there. It’s just in case you were curious…​

A Problem Report

To get started, let’s look at a bit of problem code. Imagine that your company asks you for a function to take a data store (such as an XML file or a JSON file), summarize the number of each possible value, and then transmit that data on to somewhere else. On top of that, the company wants a separate message to be sent in the event that no data is found at all. I run a really loose ship, so let’s keep things fun and imagine you work for the Evil Galactic Empire and are cataloguing Rebel Alliance ships on your radar.

The code might look something like this:

public void SendEnemyShipWeaponrySummary()
{
    try
    {
        var enemyShips = this.DataStore.GetEnemyShips();
        var summaryNumbers = enemyShips.GroupBy(x => x.Type)
            .Select(x => (Type: x.Key, Count: x.Count()));
        var report = new Report
        {
            Title = "Enemy Ship Type",
            Rows = summaryNumbers.Select(x => new ReportItem
            {
                ColumnOne = x.Type,
                ColumnTwo = x.Count.ToString()
            })
        };

        if (!report.Rows.Any())
            this.CommunicationSystem.SendNoDataWarning();
        else
            this.CommunicationSystem.SendReport(report);
    }
    catch (Exception e)
    {
        this.Logger.LogError(e,
        "An error occurred in " +
            nameof(SendEnemyShipWeaponrySummary) +
            ": " + e.Message);
    }
}

This is fine, isn’t it? Isn’t it? Well, think about this scenario. You’re sitting at your desk, eating your daily pot noodle,1 when you notice that—Jurassic Park style—a rhythmic ripple appears in your coffee. This signals the arrival of your worst nightmare: your boss! Let’s imagine that your boss is—thinking totally at random here—a tall, deep-voiced gentleman in a black cape and with appalling asthma. He also hates it when people displease him. Really hates it.

He’s happy with the first function you create. For this, you can breathe a sigh of relief. But now he wants a second function. This one is going to create another summary, but this time of the level of weaponry in each ship—whether they are unarmed, lightly armed, heavily armed, or capable of destroying planets. That sort of thing.

Easy, you think. The boss will be so impressed with how quickly you do this. So you do what seems easiest: Ctrl-C, then Ctrl-V to copy and paste the original, change the name, change the property you’re summarizing, and you end up with this:

public void GenerateEnemyShipWeaponrySummary()
{
    try
    {
        var enemyShips = this.DataStore.GetEnemyShips();
        var summaryNumbers = enemyShips.GroupBy(x => x.WeaponryLevel)
            .Select(x => (Type: x.Key, Count: x.Count()));
        var report = new Report
        {
            Title = "Enemy Ship Weaponry Level",
            Rows = summaryNumbers.Select(x => new ReportItem
            {
                ColumnOne = x.Type,
                ColumnTwo = x.Count.ToString()
            })
        };

        if (!report.Rows.Any())
            this.CommunicationSystem.SendNoDataWarning();
        else
            this.CommunicationSystem.SendReport(report);
    }
    catch (Exception e)
    {
        this.Logger.LogError(e,
        "An error occurred in " +
            nameof(SendEnemyShipWeaponrySummary) +
            ": " + e.Message);
    }
}

Five seconds of work, and a day or two of leaning on your figurative shovel with the odd complaint out loud of how hard the work is here, all while you secretly work on today’s Wordle. Job done, and slaps on the back all round, right? Right?

Well…​there are a couple of problems with this approach.

First, let’s think about unit testing. As good, upstanding code citizens, we unit-test all our code. Imagine that you unit-tested the snot out of that first function. When you copied and pasted the second, what was the level of unit-test coverage at that point?

I’ll give you a clue: it was between zero and zero. You could copy and paste the tests too, and that would be fine, but that’s now an awful lot more code that you’re copying and pasting every time.

This isn’t an approach that scales up well. What if your boss wanted another function after this one, and another, and another? What if you ended up being asked for 50 functions? Or 100? That’s a lot of code. You’d end up with code thousands of lines long, not something I’d be keen to support.

It gets worse when you consider something that happened to me near the beginning of my career. I was working for an organization that had a desktop application that carried out a series of complex calculations for each customer, based on a few input parameters. Each year the rules changed, but the old rule bases had to be replicated because it might be necessary to see what would have been calculated in a previous year.

So, the folks who had been developing the app before I joined the team had copied a whole chunk of code every year. They made a few little changes, added a link somewhere to the new version, and voilà. Job done.

I was tasked with making these annual changes one year, so off I went, young, innocent, and raring to make a difference in the world. When I was making my changes, I noticed something odd. There was a weird error with a field that had nothing to do with my changes. I fixed the bug, but then a thought occurred to me that made my heart sink.

I checked every previous version of the codebase for each previous year and found that nearly all had the same bug. It had been introduced about 10 years before, and every developer since then had replicated the bug precisely. I had to fix it 10 times over, increasing the testing effort by an order of magnitude.

With this in mind, ask yourself: did copying and pasting really save you any time? I routinely work on apps that stay in existence for decades and that show no sign of being put out to pasture anytime soon.

When I decide where to make time-saving measures for coding work, I try to look over the whole life of the application, and try to keep in mind any potential consequences for a decision a decade on.

To return to the subject at hand, how would I have used higher-order functions to solve this problem? Well, are you sitting comfortably? Then I’ll begin…​

Thunks

A bundle of code that carries a stored calculation, which can be executed later on request, is properly known as a thunk. The name is also the sound a plank of wood makes when it smacks you in the side of the head. There’s an argument to be had as to whether that hurts your head more or less than reading this book!

Here in C#, Func delegates are again the way to implement this. We can write functions that take Func delegates as parameters, allowing certain calculations in our function to be left effectively blank, to be filled in from the outside world via an arrow function.

Although this technique has a serious, proper, mathematical term, I like calling them doughnut functions, because it’s more descriptive. They’re like normal functions but with a hole in the middle! A hole I’d ask someone else to fill with the necessary functionality.

This is one potential way to refactor the problem report function:

public void SendEnemyShipTypeSummary() =>
  GenerateSummary(x => x.Type, "Enemy Ship Type");

public void SendEnemyShipWeaponryLevelSummary() =>
  GenerateSummary(x => x.WeaponryLevel, "Enemy Ship Weaponry Level");

private void GenerateSummary(
    Func<EnemyShip, string> summarySelector,
    string reportName)
{
    try
    {
        var enemyShips = this.DataStore.GetEnemyShips();
        var summaryNumbers = enemyShips.GroupBy(summarySelector)
            .Select(x => (Type: x.Key, Count: x.Count()));
        var report = new Report
        {
            Title = reportName,
            Rows = summaryNumbers.Select(x => new ReportItem
            {
                ColumnOne = x.Type,
                ColumnTwo = x.Count.ToString()
            })
        };

        if (!report.Rows.Any())
            this.CommunicationSystem.SendNoDataWarning();
        else
            this.CommunicationSystem.SendReport(report);
    }
    catch (Exception e)
    {
        this.Logger.LogError(e,
            "An error occurred in " + nameof(GenerateSummary) +
            ", report: " + reportName +
            ", message: " + e.Message);
    }
}

In this revised version, we’ve gained a few advantages.

First, the number of additional lines per new report is just one! That’s a much tidier codebase and easier to read. The code is kept close to the intent of the new function—i.e., be the same as the first but with a few changes.

Second, after unit-testing function 1, when we create function 2, the unit-test level is still close to 100%. The only difference functionally is the report name and the field to be summarized.

Lastly, any enhancements or bug fixes to the base function will be shared among all report functions simultaneously. That’s a lot of benefit for relatively little effort. There’s also a very high degree of confidence that if one report function tests well, all the others will do the same.

We could walk away from this version happy. But if it were me, I’d consider going a step further and exposing the private version with its Func parameters on the interface for whatever wants to use it:

public interface IGenerateReports
{
    void GenerateSummary(Func<EnemyShip, string> summarySelector,
        string reportName);
}

The implementation would be the private function from the previous code sample made public instead. This way, there’s no need to ever modify the interface or implementing class again, at least not if all that’s wanted is an additional report for a different field.

This makes the business of creating reports something that can be done entirely arbitrarily by whatever code module consumes this class. It takes a lot of the burden of maintaining the report set from developers like ourselves and puts it more in the hands of the teams that care about the reports. Imagine the sheer number of requests for change that will now never need to come to a development team.

If we wanted to be really wild, we could expose further Func parameters, such as a Func<ReportItem, string>, to allow users of the report class to define custom formatting. We could also use Action parameters to allow for bespoke logging or event handling. And this is just in my silly, made-up reporting class. The possibilities for the use of higher-order functions in this way are endless.
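To sketch what that might look like: here's a simplified, self-contained doughnut function with two extra holes, a formatter and a logger. The EnemyShip record and the parameter shapes here are my own illustrative stand-ins, not the chapter's final design:

```csharp
using System;
using System.Linq;

var ships = new[]
{
    new EnemyShip("X-Wing", "Light"),
    new EnemyShip("X-Wing", "Light"),
    new EnemyShip("Y-Wing", "Heavy")
};

var report = GenerateSummary(
    ships,
    x => x.Type,                       // what to summarize
    (key, count) => $"{key}: {count}", // custom row formatting
    Console.WriteLine);                // bespoke "logging"
// report = "X-Wing: 2" then "Y-Wing: 1", one per line

static string GenerateSummary(
    EnemyShip[] ships,
    Func<EnemyShip, string> summarySelector,
    Func<string, int, string> rowFormatter,
    Action<string> logger)
{
    logger("Generating summary...");
    var rows = ships.GroupBy(summarySelector)
        .Select(g => rowFormatter(g.Key, g.Count()));
    return string.Join(Environment.NewLine, rows);
}

// Hypothetical stand-in for the chapter's EnemyShip type
public record EnemyShip(string Type, string WeaponryLevel);
```

Every hole is filled by the caller, so new formatting or logging behavior never requires a change to GenerateSummary() itself.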

Despite being an FP feature, this is keeping us squarely in line with the O of the SOLID principles of object-oriented design, the open/closed principle, which states a module should be open to extension but closed to modification.2

It’s surprising how well OOP and FP can complement each other in C#. I often think it’s important for developers to make sure they are adept at both paradigms so they know how to use them together effectively.

Chaining Functions

Allow me to introduce you to the best friend you never knew you needed: the Map() function. This function is also commonly referred to as chain or pipe, but for the sake of consistency, we’ll call it Map() throughout this book. I’m afraid that a lot of functional structures tend to have many names in use, depending on the programming language and implementation. I’ll try to point out whenever this is the case.

Now, I’m British, and one cliché about British people is that we like talking about the weather. It’s entirely true. Our country has been known to go through four seasons in a single day, so the weather is a constant source of fascination to us.

When I used to work for an American company once upon a time, the topic of conversation with my colleagues over video calls would often turn inevitably to the subject of the weather. They’d tell me that the temperature outside was around 100 degrees. I work in Celsius, so to me this sounds rather suspiciously like the boiling point of water. Given that my colleagues were not screaming as their blood boiled away into steam, I suspected something else was at work. It was, of course, that they were working in Fahrenheit, so I had to convert this to something I understood with the following formula:

  1. Subtract 32.

  2. Multiply by 5.

  3. Divide by 9.

This gives a temperature in Celsius of around 38 degrees, which is warm and toasty, but for the most part safe for human life.

How could we code this process in exactly that multistep operation and then finish by returning a formatted string? We could stick it all together into a single line like this:

public string FahrenheitToCelsius(decimal tempInF) =>
    Math.Round((tempInF - 32) * 5 / 9, 2) + "°C";

That’s not very readable though, is it? Honestly, I probably wouldn’t make too much fuss about that in production code, but I’m demonstrating a technique, so bear with me.

The multistep way to write this out is like this:

string FahrenheitToCelsius(decimal tempInF)
{
    var a = tempInF - 32;
    var b = a * 5;
    var c = b / 9;
    var d = Math.Round(c, 2);
    var returnValue = d + "°C";
    return returnValue;
}

This is much more readable and easier to maintain, but it still has an issue. We’re creating variables that are intended to be used a single time and then thrown away. In this little function, that hardly matters, but what if this were a gigantic thousand-line function? What if, instead of a little decimal variable like these, we had a large, complex object? All the way down at line 1,000, that variable, which is never intended to be used again, would still be in scope, holding on to memory. It’s also a little messy to create a variable you aren’t planning to use beyond the next line. This is where Map() comes in.

Map() is somewhat like the LINQ Select() function, except instead of operating on each element of an enumerable, it operates on an object—any object. You pass it a lambda arrow function just the same as with Select() except that your x parameter refers to the base object. If you applied it to an enumerable, the x parameter would refer to the entire enumerable, not individual elements thereof.

Here’s how our modified Fahrenheit conversion would look:

public string FahrenheitToCelsius(decimal tempInF)  =>
    tempInF.Map(x => x - 32)
        .Map(x => x * 5)
        .Map(x => x / 9)
        .Map(x => Math.Round(x, 2))
        .Map(x => x + "°C");

This code has the exact same functionality and the same friendly, multistage operation, but no throwaway variables. Each arrow function is executed, and once it has completed, its contents become eligible for garbage collection. The decimal x that is multiplied by 5 can be collected as soon as the next arrow function has taken a copy of its result and divided that by 9.

Here’s how we implement Map():

public static class MapExtensionMethods
{
    public static TOut Map<TIn, TOut>(this TIn @this, Func<TIn, TOut> f) =>
        f(@this);
}

It’s tiny, isn’t it? Despite that, I use this particular method quite a lot—whenever I want to do a multistep transformation of data. It makes it easier to convert whole function bodies into simple arrow functions, like the Map()-based FahrenheitToCelsius() function.

This method has far more advanced versions, which include things like error handling, and which I’ll be getting into in Chapter 7. For now, though, this is a fantastic little toy that you can start playing with right away. Uncle Simon’s early Christmas gift to you. Ho, ho, ho.

A simpler implementation of Map() is possible, if you don’t want to change types with each transformation. This is cleaner and more concise, if it suits your needs. It could be implemented like this:

public static T Map<T>(this T @this, params Func<T,T>[] transformations) =>
    transformations.Aggregate(@this, (agg, x) => x(agg));

Using that, the basic Fahrenheit-to-Celsius transformation would look like this:

public decimal FahrenheitToCelsius(decimal tempInF) =>
    tempInF.Map(
        x => x - 32,
        x => x * 5,
        x => x / 9,
        x => Math.Round(x, 2));

This might be worth using to save a little bit of boilerplate in simpler cases, like the temperature conversion. See Chapter 8 for some ideas on how to make this look even better.

Fork Combinator

A fork combinator is used to take a single value, process it in multiple ways simultaneously, and then join up all those separate strands into a single, final value. This process can be used to simplify some fairly complex multistep calculations into a single line of code. I’ve also heard this process called a converge, but I like fork because it’s more descriptive of exactly how it works.

The process runs roughly like this:

  1. Start with a single value.

  2. Feed that value into a set of prong functions, each of which acts on the original input in isolation to produce some sort of output.

  3. Feed the results of the prongs into a join function, which merges them into a single, final result.

Here are a few examples of how we might use it.

First, we could use Fork() to calculate an average value:

var numbers = new [] { 4, 8, 15, 16, 23, 42 };
var average = numbers.Fork(
    x => x.Sum(),
    x => x.Count(),
    (s, c) => s / c
);
// average = 18

Or here’s a blast from the past—we can use Fork to calculate the hypotenuse of a triangle:

var triangle = new Triangle(100, 200);
var hypotenuse = triangle.Fork(
    x => Math.Pow(x.A, 2),
    x => Math.Pow(x.B, 2),
    (a2, b2) => Math.Sqrt(a2 + b2)
);

The implementation looks like this:

public static class ForkExtensionMethods
{
    public static TOut Fork<TIn, T1, T2, TOut>(
      this TIn @this,
      Func<TIn, T1> f1,
      Func<TIn, T2> f2,
      Func<T1,T2,TOut> fout)
    {
        var p1 = f1(@this);
        var p2 = f2(@this);
        var result = fout(p1, p2);
        return result;
    }
}

Note that having two generic types, one for each prong, means that any combination of types can be returned by those functions.

We could easily go out and write versions for any number of parameters beyond two as well, but each additional parameter we want to consider would require an additional extension method.
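For instance, a three-prong version might look like this. This is a sketch following the same pattern as the two-prong implementation above, shown here with a usage example:

```csharp
using System;
using System.Linq;

var numbers = new[] { 4, 8, 15, 16, 23, 42 };
var summary = numbers.Fork(
    x => x.Min(),
    x => x.Max(),
    x => x.Sum(),
    (min, max, sum) => $"min={min}, max={max}, sum={sum}");
// summary = "min=4, max=42, sum=108"

public static class ForkExtensionMethods
{
    // Same pattern as the two-prong Fork(), with one extra generic type
    public static TOut Fork<TIn, T1, T2, T3, TOut>(
        this TIn @this,
        Func<TIn, T1> f1,
        Func<TIn, T2> f2,
        Func<TIn, T3> f3,
        Func<T1, T2, T3, TOut> fout) =>
            fout(f1(@this), f2(@this), f3(@this));
}
```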

If we want to go further and have an unlimited number of prongs, that’s easily done, provided we are OK with having the same intermediate type generated by each:

public static class ForkExtensionMethods
{
    public static TEnd Fork<TStart, TMiddle, TEnd>(
        this TStart @this,
        Func<IEnumerable<TMiddle>, TEnd> joinFunction,
        params Func<TStart, TMiddle>[] prongs)
    {
        var intermediateValues = prongs.Select(x => x(@this));
        var returnValue = joinFunction(intermediateValues);
        return returnValue;
    }
}

We could use this, for instance, to create a text description based on an object:

var personData = this.personRepository.GetPerson(24601);
var description = personData.Fork(
    prongs => string.Join(Environment.NewLine, prongs),
    x => "My name is " + x.FirstName + " " + x.LastName,
    x => "I am " + x.Age + " years old.",
    x => "I live in " + x.Address.Town
);

// This might, for example, produce:
//
// My name is Jean Valjean
// I am 30 years old.
// I live in Montreuil-sur-mer

In this example, we’re acting multiple times on a complex object (Person); there is no enumerable of properties that we can operate on with a Select() statement to get the list of descriptive strings we want. Using a fork combinator, we can effectively convert a single item into an array of items that we can then apply list operations to, in order to convert it into a more usable final result.

Also when using Fork like this, it’s easy enough to add as many more lines of description as we want but still maintain the same level of complexity and readability.

Alt Combinator

An alt combinator binds together a set of functions that all attempt to achieve the same end; each is tried, one after the other, until one of them returns a value. I’ve also seen this referred to as or, alternate, or alternation.

Think of it as working like this: “Try method A; if that doesn’t work, try method B; if that doesn’t work, try method C; if that doesn’t work, I suppose we’re out of luck.”

Let’s imagine a scenario where we might want to find something by trying multiple methods:

var jamesBond = "007"
    .Alt(x => this.hotelService.ScanGuestsForSpies(x),
        x => this.airportService.CheckPassengersForSpies(x),
        x => this.barService.CheckGutterForDrunkSpies(x));

if(jamesBond != null)
    this.deathTrapService.CauseHorribleDeath(jamesBond);

So long as one of those three methods returns a value corresponding to a hard-drinking, borderline-misogynist, thuggish employee of the British government, the jamesBond variable won’t be null. Whichever function is the first to return a non-null value is also the last function to be run.

So how do we implement this function before we find our enemy has fled? Like this:

public static TOut Alt<TIn, TOut>(
    this TIn @this,
    params Func<TIn, TOut>[] args) =>
    args.Select(x => x(@this))
    .FirstOrDefault(x => x != null);

Remember here that the LINQ Select() function operates on a lazy-loading principle, so even though we appear to be converting the whole of the Func array into concrete values, we’re not: the FirstOrDefault() function will prevent any elements from being executed after one of them has returned a non-null value (and will hand back null if none of them does). Isn’t LINQ great?
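If you want to convince yourself of that laziness, here's a small self-contained experiment; the counter exists purely for demonstration:

```csharp
using System;
using System.Linq;

var callCount = 0;
Func<string, string>[] finders =
{
    x => { callCount++; return null; },    // tried: finds nothing
    x => { callCount++; return "found"; }, // tried: succeeds, search stops
    x => { callCount++; return "never"; }  // never executed at all
};

var result = finders.Select(f => f("007")).FirstOrDefault(x => x != null);
// result = "found"; callCount = 2, proving the third Func never ran
```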

A slightly more real-world usage for this concept might occur with multiple stores for the same data, where each in turn needs to be checked. Maybe we have several sources for employee data, which either don’t all contain the same lists of people or might be unavailable periodically, requiring a fallback to an alternative means:

public Person GetEmployee(int empId) =>
    empId.Alt(
        x => this.employeeDbRepo.GetById(x),
        x => this.activeDirectoryClient.GetById(x),
        x => this.emergencyBackupCsvClient.GetById(x)
    );

In this scenario, three levels of data are checked to find our employee’s information. First preference is given to our system’s database, which assumes the employee has been seen by the system before and something about them has been stored previously. Perhaps our employee is a new starter and there are no database records yet? In that case, the next source of data available is Active Directory, from which we can get a few bits of essential information that can be padded out later. Finally, in the event that some sort of network outage occurs, the system can fall back to checking a local CSV file for cached information. I’m not saying that I’d necessarily do that last step, as security implications probably exist, but it demonstrates how flexible this approach allows you to be.

Compose

A common feature of functional languages is the ability to build up a complex function from a collection of smaller, simpler functions. Any process that involves combining functions is called composing.

JavaScript libraries like Ramda have terrific composing features available, but C#’s strong typing works against them in this instance.

C# has a few methods for composing functions. The first is the simplest, just using basic Map() functions, as described earlier in this chapter:

var input = 100M;
var f = (decimal t) => t.Map(x => x - 32)
    .Map(x => x * 5)
    .Map(x => x / 9)
    .Map(x => Math.Round(x, 2))
    .Map(x => $"{x} degrees");
var output = f(input);
// output = "37.78 degrees"

The f here is a composed higher-order function. The five small functions used to create it (x => x - 32 and so on, the individual steps of the calculation) are anonymous lambda expressions. They combine like LEGO bricks to form a larger, more complex behavior.

A valid question here is what’s the point of composing functions? The answer is that we don’t necessarily have to do the whole thing all at once. We could build the logic we want in pieces and then ultimately create many functions by using those same base pieces.

Imagine now that we want to also hold a Func delegate that represents the opposite conversion—we’d end up with two functions like this:

var input = 100M;
var fahrenheitToCelsius = (decimal t) =>
    t.Map(x => x - 32)
        .Map(x => x * 5)
        .Map(x => x / 9)
        .Map(x => Math.Round(x, 2))
        .Map(x => $"{x} degrees");
var output = fahrenheitToCelsius(input);
Console.WriteLine(output);
// 37.78 degrees

var input2 = 37.78M;
var celsiusToFahrenheit = (decimal t) =>
    t.Map(x => x * 9)
        .Map(x => x / 5)
        .Map(x => x + 32)
        .Map(x => Math.Round(x, 2))
        .Map(x => $"{x} degrees");
var output2 = celsiusToFahrenheit(input2);
Console.WriteLine(output2);
// 100.00 degrees

The last two lines of each function are identical. Isn’t it a bit wasteful to repeat them each time? We can eliminate the repetition by using a Compose() function:

var formatDecimal = (decimal d) => d
    .Map(x => Math.Round(x, 2))
    .Map(x => $"{x} degrees");

var input = 100M;
var fahrenheitToCelsius = (decimal t) => t.Map(x => x - 32)
    .Map(x => x * 5)
    .Map(x => x / 9);
var fToCFormatted = fahrenheitToCelsius.Compose(formatDecimal);
var output = fToCFormatted(input);
Console.WriteLine(output);
// 37.78 degrees

var input2 = 37.78M;
var celsiusToFahrenheit = (decimal t) => t.Map(x => x * 9)
    .Map(x => x / 5)
    .Map(x => x + 32);
var cToFFormatted = celsiusToFahrenheit.Compose(formatDecimal);
var output2 = cToFFormatted(input2);
Console.WriteLine(output2);
// 100.00 degrees

Functionally, these new versions using Compose() are identical to the previous versions exclusively using Map().

The Compose() function performs nearly the same task as Map(), with the subtle difference that we’re ultimately producing a Func delegate at the end, not a final value. This is the code that performs the Compose() process:

public static class ComposeExtensionMethods
{
    public static Func<TIn, NewTOut> Compose<TIn, OldTOut, NewTOut>(
        this Func<TIn, OldTOut> @this,
        Func<OldTOut, NewTOut> f) =>
            x => f(@this(x));
}

By using Compose(), we’ve eliminated some unnecessary replication. Any improvements to the format process will be shared by both Func delegate objects simultaneously.

A limitation exists, however. In C#, extension methods can’t be attached to lambda expressions or to functions directly. We can attach an extension to a lambda expression only once it’s referenced as a Func or Action delegate, and for that to happen, it first needs to be assigned to a variable, where it will be given a delegate type for us. This is why, in the preceding examples, the chains of Map() calls are assigned to variables before Compose() is called; if C# allowed it, we could simply call Compose() at the end of the Map() chain and save ourselves a variable assignment.
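To make that concrete, here's a minimal sketch, reusing the Compose() definition from this chapter:

```csharp
using System;

// This would NOT compile - a raw lambda has no delegate type yet,
// so there's nothing for the Compose() extension method to attach to:
// var broken = ((decimal x) => x + 1).Compose(x => $"{x} units");

// Assigning to a variable first gives the lambda a Func type:
Func<decimal, decimal> addOne = x => x + 1;
var formatted = addOne.Compose(x => $"{x} units");
var output = formatted(41M);
// output = "42 units"

public static class ComposeExtensionMethods
{
    // Compose() as defined earlier in this chapter
    public static Func<TIn, NewTOut> Compose<TIn, OldTOut, NewTOut>(
        this Func<TIn, OldTOut> @this,
        Func<OldTOut, NewTOut> f) =>
            x => f(@this(x));
}
```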

This process is not unlike reusing code via inheritance in OOP, except it’s done at the individual line level and requires significantly less boilerplate. It also keeps these similar, related pieces of code together, rather than having them be spread out over separate classes and files.

Transduce

A transducer is a way of combining list-based operations, like Select() and Where(), with some form of aggregation to perform multiple transformations to a list of values, before finally collapsing it down into a single, final value.

While Compose() is a useful feature, it has limitations. It effectively always takes the place of a Map() function—i.e., it acts on the object as a whole and can’t perform LINQ operations on the individual elements of an enumerable. We could Compose() a chain of operations on an array and put Select() and Where() calls inside each step, but honestly that looks pretty messy:

var numbers = new [] { 4, 8, 15, 16, 23, 42 };
var add5 = (IEnumerable<int> x) => x.Select(y => y + 5);
var add5MultiplyBy10 = add5.Compose(x => x.Select(y => y * 10));

var numbersGreaterThan100 = add5MultiplyBy10.Compose(x => x.Where(y => y > 100));

var composeMessage = numbersGreaterThan100.Compose(x => string.Join(",", x));
Console.WriteLine("Output = " + composeMessage(numbers));
// Output = 130,200,210,280,470

If you’re happy with that, by all means use it. Nothing is wrong with it per se, aside from being rather inelegant.

We can use another structure, though: Transduce. A Transduce operation acts on an array and represents all the stages of a functional flow:

  • Filter()—i.e., Where(): reduce the number of elements.

  • Transform()—i.e., Select(): convert them to a new form.

  • Aggregate()—i.e., erm…​actually, it is Aggregate(): whittle down the collection of many items to a single value.

This could be implemented in C# in many ways, but here’s one possibility:

public static TFinalOut Transduce<TIn, TFilterOut, TFinalOut>(
    this IEnumerable<TIn> @this,
    Func<IEnumerable<TIn>, IEnumerable<TFilterOut>> transformer,
    Func<IEnumerable<TFilterOut>, TFinalOut> aggregator) =>
        aggregator(transformer(@this));

This extension method takes a transformer method, which can be any combination of Select() and Where() the user defines to transform the enumerable ultimately from one form and size to another. The method also takes an aggregator, which converts the output of the transformer into a single value.

This is how the Compose() chain we defined previously could be rewritten with this version of the Transduce() method:

var numbers = new [] { 4, 8, 15, 16, 23, 42 };

// N.B - I could make this a single line with brackets, but
// I find this more readable, and it's functionally identical due
// to lazy evaluation of enumerables
var transformer = (IEnumerable<int> x) => x
    .Select(y => y + 5)
    .Select(y => y * 10)
    .Where(y => y > 100);

var aggregator = (IEnumerable<int> x) => string.Join(", ", x);

var output = numbers.Transduce(transformer, aggregator);
Console.WriteLine("Output = " + output);
// Output = 130, 200, 210, 280, 470

Alternatively, if you’d prefer to handle everything as Func delegates, so that the composed transducer can be reused, it could be written in this way:

var numbers = new [] { 4, 8, 15, 16, 23, 42 };
var transformer = (IEnumerable<int> x) => x
    .Select(y => y + 5)
    .Select(y => y * 10)
    .Where(y => y > 100);

var aggregator = (IEnumerable<int> x) => string.Join(", ", x);

var transducer = transformer.ToTransducer(aggregator);
var output2 = transducer(numbers);
Console.WriteLine("Output = " + output2);

This is the updated extension method:

public static class TransducerExtensionMethod
{
    public static Func<IEnumerable<TIn>, TO2> ToTransducer<TIn, TO1, TO2>(
        this Func<IEnumerable<TIn>, IEnumerable<TO1>> @this,
        Func<IEnumerable<TO1>, TO2> aggregator) =>
            x => aggregator(@this(x));
}

We’ve now generated a Func delegate variable that can be used as a function on as many arrays of integers as we want. That single Func will perform whatever transformations and filters are required, then aggregate each array down to a single, final value.
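To make that reuse concrete, here’s a sketch that runs the same transducer over two different arrays. The ToTransducer() extension is repeated at the bottom so the snippet compiles on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var transformer = (IEnumerable<int> x) => x
    .Select(y => y + 5)
    .Select(y => y * 10)
    .Where(y => y > 100);
var aggregator = (IEnumerable<int> x) => string.Join(", ", x);
var transducer = transformer.ToTransducer(aggregator);

// The same Func works on any number of source arrays.
Console.WriteLine(transducer(new[] { 4, 8, 15, 16, 23, 42 }));
// 130, 200, 210, 280, 470
Console.WriteLine(transducer(new[] { 1, 10, 100 }));
// 150, 1050

// Repeated from above so this snippet stands alone.
public static class TransducerExtensionMethod
{
    public static Func<IEnumerable<TIn>, TO2> ToTransducer<TIn, TO1, TO2>(
        this Func<IEnumerable<TIn>, IEnumerable<TO1>> @this,
        Func<IEnumerable<TO1>, TO2> aggregator) =>
            x => aggregator(@this(x));
}
```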

Tap

A common concern I hear raised about chains of functions is that it’s impossible to perform logging within them—unless we make one of the links in the chain a reference to a separate function that does have logging calls within it.

An FP technique can be used to inspect the contents of a function chain at any point: a Tap() function. A Tap() function is a bit like a wiretap in old detective films.3 It allows a stream of information to be monitored and acted on, but without disrupting or altering it.

The way to implement Tap() is like this:

public static class Extensions
{
    public static T Tap<T>(this T @this, Action<T> action)
    {
        action(@this);
        return @this;
    }
}

An Action delegate is effectively a void-returning function. In this instance, it accepts a single parameter: a generic type, T. The Tap() function passes the current value of the object in the chain into the Action, where logging can take place, then returns that same object, unmodified, so the chain can continue.

We could use it like this:

var input = 100M;
var fahrenheitToCelsius = (decimal x) => x
    .Map(v => v - 32)
    .Map(v => v * 5)
    .Map(v => v / 9)
    .Tap(v => this.logger.LogInformation("the un-rounded value is " + v))
    .Map(v => Math.Round(v, 2))
    .Map(v => $"{v} degrees");
var output = fahrenheitToCelsius(input);
Console.WriteLine(output);
// 37.78 degrees

In this new version of the Fahrenheit-to-Celsius functional chain, we’re now tapping into it after the basic calculation is completed, but before we start rounding and formatting it to a string. Here we’ve added a call to a logger in Tap(), but we could switch that for a Console.WriteLine or whatever else we’d like.
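If no logger is to hand, the same chain works with plain Console output. Here’s a self-contained sketch, with a minimal stand-in for the Map() extension defined earlier in the chapter:

```csharp
using System;

var fahrenheitToCelsius = (decimal x) => x
    .Map(v => v - 32)
    .Map(v => v * 5)
    .Map(v => v / 9)
    .Tap(v => Console.WriteLine("the un-rounded value is " + v))
    .Map(v => Math.Round(v, 2))
    .Map(v => $"{v} degrees");

// The Tap prints the un-rounded value first, then we print the result.
Console.WriteLine(fahrenheitToCelsius(100M));
// 37.78 degrees

// Minimal Map() and Tap() so this compiles on its own.
public static class Extensions
{
    public static TOut Map<TIn, TOut>(this TIn @this, Func<TIn, TOut> f) => f(@this);

    public static T Tap<T>(this T @this, Action<T> action)
    {
        action(@this);
        return @this;
    }
}
```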

Try/Catch

We can use several more advanced structures in FP for handling errors. If you just want something quick and easy to implement in a few lines of code, but that has its limitations, keep reading. Otherwise, try having a look ahead at Chapters 6 and 7. You’ll find plenty there on handling errors without side effects.

For now, though, let’s see what we can do with a few simple lines of code…​

In theory, in the middle of functional-style code, no errors should be possible. If everything is done in line with the functional principles of side-effect-free code, immutable variables, and so on, then we should be safe. On the fringes, though, some interactions might be considered unsafe.

Let’s imagine we want to run a lookup in an external system with an integer ID. This external system could be a database, a web API, a flat file on a network share, anything at all. The thing all these possibilities have in common is that any can fail for many reasons, few, if any, of which are the fault of the developer.

There could be network issues, hardware issues on the local or remote computers, inadvertent human intervention. The list goes on.

This is one method by which we deal with that situation in object-oriented code:

public IEnumerable<Snack> GetSnackByType(int typeId)
{
    try
    {
        var returnValue = this.DataStore.GetSnackByType(typeId);
        return returnValue;
    }
    catch(Exception e)
    {
        this.logger.LogError(e, "There aren't any pork scratchings left!");
        return Enumerable.Empty<Snack>();
    }
}

I dislike two aspects of this code block. The first is the amount of boilerplate bulking out the code. We have to add a lot of industrial-strength coding to protect ourselves from problems that we didn’t cause.

The other method often deployed to handle these scenarios is to have something that rethrows an exception up to a higher level to catch again. I really dislike this approach. It depends on defensive code having been written higher up and can cause unexpected behavior to occur, because we’ve disrupted the standard order of operations. We can’t say when and by what the error will be caught, or even if it’ll be caught. It could result in an unhandled Exception, which could terminate the application altogether.

The other issue is with try/catch blocks themselves. They break the order of operations, moving execution of the program from where we were to a potentially hard-to-find location. In this case, we have a nice, simple, compact little function, and the location of the catch is easy to determine. I’ve worked in codebases where the catch was several layers of functions higher than the place the fault occurred. Bugs were common in that codebase, because assumptions were made about certain lines of code being reached when, thanks to the strange positioning of the try/catch blocks, they never were.

I probably wouldn’t have too many issues with this code block in production, but left unchecked, bad coding practices can leak in. Nothing in the code prevents future coders from introducing multilevel nested functions in here.

I think the best solution is to use an approach that removes all the boilerplate and makes it hard, or even impossible, to introduce bad code structure later. Consider something like this:

public IEnumerable<Snack> GetSnackByType(int typeId)
{
    var result = typeId.MapWithTryCatch(this.DataStore.GetSnackByType)
        ?? Enumerable.Empty<Snack>();
    return result;
}

We’re running a Map() function with an embedded try/catch. The new Map() function returns either a value, if everything works, or default—which is null for reference types—if a failure occurs.

The extension method looks like this:

public static class Extensions
{
    public static TOut MapWithTryCatch<TIn, TOut>(this TIn @this, Func<TIn, TOut> f)
    {
        try
        {
            return f(@this);
        }
        catch
        {
            return default;
        }
    }
}

This isn’t quite a perfect solution, though. What about error logging? This is committing the cardinal sin of swallowing error messages unlogged.

We could think about solving this in a few ways. Any of these are equally fine, so proceed as your fancy takes you.

One option is to instead have an extension method that takes an ILogger instance and returns a Func delegate containing the try/catch functionality:

public static class TryCatchExtensionMethods
{
    public static Func<Func<TIn, TOut>, TOut> CreateTryCatch<TIn, TOut>(
        this TIn @this,
        ILogger logger) =>
            f =>
            {
                try
                {
                    return f(@this);
                }
                catch(Exception e)
                {
                    logger.LogError(e, "An error occurred");
                    return default;
                }
            };
}

The usage is pretty similar:

public IEnumerable<Snack> GetSnackByType(int typeId)
{
    var tryCatch = typeId.CreateTryCatch<int, IEnumerable<Snack>>(this.logger);
    var result = tryCatch(this.DataStore.GetSnackByType)
        ?? Enumerable.Empty<Snack>();
    return result;
}

Only a single additional line of boilerplate is added, and now logging is being done. Sadly, we can’t add anything specific to the message besides the error itself, since the extension method doesn’t know where it’s called from or the context of the error. That very genericness, though, is what makes the method reusable all over the codebase.

If we don’t want the try/catch being aware of the ILogger interface, or we want to provide a custom error message every time, we need to look at something a little more complicated to handle error messaging.

One option is to return a metadata object that contains the return value of the function that’s being executed, and a bit of data about whether the code worked, whether errors occurred, and what they were. We could use something like this:

public class ExecutionResult<T>
{
    public T Result { get; init; }
    public Exception Error { get; init; }
}

public static class Extensions
{
    public static ExecutionResult<TOut> MapWithTryCatch<TIn, TOut>(
        this TIn @this,
        Func<TIn, TOut> f)
    {
        try
        {
            var result = f(@this);
            return new ExecutionResult<TOut>
            {
                Result = result
            };
        }
        catch(Exception e)
        {
            return new ExecutionResult<TOut>
            {
                Error = e
            };
        }
    }
}

I don’t really like this approach. It’s breaking one of the SOLID principles of object-oriented design, the interface segregation principle. Well, sort of. Technically, that applies to interfaces, but I try to apply it everywhere, even if I do write functional code. The idea is that we shouldn’t be forced to include something in a class or interface that we don’t need. Here, we’re forcing a successful run to include an Exception property that it’ll never need, and likewise, a failure run will have to include the Result property it’ll never need.

We could do this in other ways, but I’m keeping it simple and returning either an ExecutionResult holding the result, or one holding the Exception with Result left at its default value.

This means we can use the extension method like this:

public IEnumerable<Snack> GetSnackByType(int typeId)
{
    var result = typeId.MapWithTryCatch(this.DataStore.GetSnackByType);
    if(result.Result == null)
    {
        this.Logger.LogError(result.Error, "We ran out of jammy dodgers!");
        return Enumerable.Empty<Snack>();
    }

    return result.Result;
}

The unnecessary fields aside, this approach has another issue: the onus is now on the developer using the try/catch function to add additional boilerplate to check for errors.

Skip ahead to Chapter 6 for an alternative way of handling this sort of return value in a more purely functional manner. For now, here’s a slightly cleaner way of handling it.

First, we add in another extension method, one that attaches to the ExecutionResult object this time:

public static class ExecutionResultExtensions
{
    public static T OnError<T>(
        this ExecutionResult<T> @this,
        Action<Exception> errorHandler)
    {
        if (@this.Error != null)
        {
            errorHandler(@this.Error);
        }
        return @this.Result;
    }
}

Here, we’re first checking for an error. If one exists, we execute the user-defined Action, which will presumably be a logging operation. Either way, the method finishes by unwrapping the ExecutionResult into just its actual returned data object.

All of that means we can now handle the try/catch like this:

public IEnumerable<Snack> GetSnackByType(int typeId) =>
    typeId.MapWithTryCatch(DataStore.GetSnackByType)
        .OnError(e => this.Logger.LogError(e, "We ran out of custard creams!"));

This solution is far from perfect, but without moving on to another level in functional theory, it’s workable and elegant enough that it’s not setting off my internal perfectionist. It also forces the user to consider error handling when using this, which can only be a good thing!

Handling Nulls

Aren’t null reference exceptions annoying? If you want someone to blame, it’s a guy called Tony Hoare who invented the concept of null back in the ’60s. Actually, let’s not blame anyone. I’m sure he’s a lovely person, beloved by everyone who knows him. In any case, we can hopefully all agree that null reference exceptions are an absolute pain.

So, is there a functional way to deal with them? If you’ve read this far, you probably know that the answer is a resounding yes!4

The Unless() function takes in a Boolean condition and an Action delegate, and executes the Action only if the Boolean is false—i.e., the Action is always executed unless the condition is true.

The most common usage for something like this is—you guessed it—checking for null. Here’s an example of exactly the sort of code we’d want to replace. This is a rarely seen bit of source code for a Dalek:5

public void BusinessAsUsual()
{
    var enemies = this.scanner.FindLifeforms("all");
    foreach(var e in enemies)
    {
        this.Gun.Blast(e.Coordinates.Longitude, e.Coordinates.Latitude);
        this.Speech.ScreamAt(e, "EXTERMINATE");
    }
}

This is all well and good, and probably leaves a lot of people killed by a psychotic mutant in a mobile pepper-pot-shaped tank. But what if the Coordinates object was null for some reason? That’s right—null reference exception.

This is where we make this functional and introduce an Unless() function to prevent the exception from occurring. This is what Unless() looks like:

public static class UnlessExtensionMethods
{
    public static void Unless<T>(this T @this, Func<T, bool> condition, Action<T> f)
    {
        if(!condition(@this))
        {
            f(@this);
        }
    }
}

The Unless() function has to return void, unfortunately. If we swapped the Action for a Func, it would be fine to return the result of the Func from the extension method. But what about when the condition is true and the Func is never executed? What would we return then? There isn’t really an answer to that question.
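One way out, sketched below as a hypothetical UnlessOr() variant (my own invention, not something defined elsewhere in this chapter), is to demand a fallback value from the caller for exactly that case:

```csharp
using System;

// When the condition is false, the Func runs as normal.
var safe = 5.UnlessOr(x => x < 0, x => 100 / x, fallback: 0);
Console.WriteLine(safe); // 20

// When the condition is true, the fallback is returned instead,
// and the division never runs.
var guarded = 0.UnlessOr(x => x == 0, x => 100 / x, fallback: 0);
Console.WriteLine(guarded); // 0

// Hypothetical Func-based variant of Unless(): the caller supplies
// a fallback for the case where the condition holds.
public static class UnlessOrExtensionMethods
{
    public static TOut UnlessOr<TIn, TOut>(
        this TIn @this,
        Func<TIn, bool> condition,
        Func<TIn, TOut> f,
        TOut fallback) =>
            condition(@this) ? fallback : f(@this);
}
```

Of course, at that point it’s arguably just a conditional Map(), which is why the chapter sticks to the Action-based version.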

This is how we could use the Unless() function to make a new, super-duper, even more deadly functional Dalek:

public void BusinessAsUsual()
{
    var enemies = this.scanner.FindLifeforms("all");

    foreach(var e in enemies)
    {
        e.Unless(
            x => x.Coordinates == null,
            x => this.Gun.Blast(x.Coordinates.Longitude, x.Coordinates.Latitude)
        );

        // May as well do this anyway, since we're here.
        this.Speech.ScreamAt(e, "EXTERMINATE");
    }
}

Using this, a null Coordinates object won’t result in an exception; the gun simply won’t be fired.

The next few chapters provide more ways to prevent null exceptions—ways that require more advanced coding and a little theory, but that are much more thorough in the way they work. Stay tuned.

Update an Enumerable

I’m going to finish off this section with a useful example. It involves updating an element in an enumerable without changing any data at all!

The thing to remember about enumerables is that they are designed to use lazy evaluation—i.e., they don’t convert from a set of functions pointing at a data source to actual data until the last possible moment. Quite often, the use of Select() functions doesn’t trigger an evaluation, so we can use them to effectively create filters sitting between the data source and the place in the code where enumeration of the data will take place.
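That deferred behavior is easy to verify with a side-effecting lambda (used here purely for demonstration; side effects are exactly what we normally avoid):

```csharp
using System;
using System.Linq;

var evaluated = false;
var source = new[] { 1, 2, 3 };

// Select() only registers the projection; nothing runs yet.
var projected = source.Select(x =>
{
    evaluated = true;
    return x * 2;
});

Console.WriteLine(evaluated); // False - the lambda hasn't executed

// Enumeration forces evaluation at the last possible moment.
var total = projected.Sum();
Console.WriteLine(evaluated); // True
Console.WriteLine(total);     // 12
```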

Here’s an example of altering an enumerable so that the item at position x is replaced:

var sourceData = new []
{
    "Hello", "Doctor", "Yesterday", "Today", "Tomorrow", "Continue"
};

var updatedData = sourceData.ReplaceAt(1, "Darkness, my old friend");
var finalString = string.Join(" ", updatedData);
// Hello Darkness, my old friend Yesterday Today Tomorrow Continue

This calls a function to replace the element at position 1 ("Doctor") with a new value. Despite the presence of two variables, nothing is done to the source data at all; sourceData remains unchanged after this code snippet has run. Furthermore, no replacement actually takes place until string.Join is called, because that’s the very moment at which concrete values are required.

This is how it’s done:

public static class Extensions
{
    public static IEnumerable<T> ReplaceAt<T>(this IEnumerable<T> @this,
        int loc,
        T replacement) =>
        @this.Select((x, i) => i == loc ? replacement : x);
}

The enumerable returned here points at the original enumerable and gets its values from there, but with one crucial difference: if the index of an element equals the user-defined value (1, the second element, in this example), the replacement value is returned instead. All other values are passed through, unaltered.

If we were so inclined, we could provide a function to perform the update—giving the user the ability to base the new version of the data item on the old version that is being replaced. This is how we’d achieve that:

public static class Extensions
{
    public static IEnumerable<T> ReplaceAt<T>(this IEnumerable<T> @this,
        int loc,
        Func<T, T> replacement) =>
        @this.Select((x, i) => i == loc ? replacement(x) : x);
}

The code is easy enough to use too:

var sourceData = new []
{
    "Hello", "Doctor", "Yesterday", "Today", "Tomorrow", "Continue"
};

var updatedData = sourceData.ReplaceAt(1, x => x + " Who");
var finalString = string.Join(" ", updatedData);
// Hello Doctor Who Yesterday Today Tomorrow Continue

It’s also possible that we don’t know the index of the element we want to update—in fact, there could be multiple items to update. The next example is an alternative enumerable update function based on providing a Func that takes T as a parameter and returns a bool (i.e., Func<T,bool>) to identify the records that should be updated.

This example is based on board games—one of my favorite hobbies, much to the annoyance of my ever-patient wife! In this scenario, there is a Tags property on the BoardGame object, which contains metadata tags that describe the game ("family", "co-op", "complex", stuff like that) and that will be used by a search engine app. It’s been decided that another tag should be added to games suitable for one player—"solo":

var sourceData = this.DataStore.GetBoardGames();

var updatedData = sourceData.ReplaceWhen(
    x => x.NumberOfPlayersAllowed.Contains(1),
    x => x with { Tags = x.Tags.Append("solo") });
this.DataStore.Save(updatedData);

The implementation is a variation on code we’ve already covered:

public static class ReplaceWhenExtensions
{
    public static IEnumerable<T> ReplaceWhen<T>(this IEnumerable<T> @this,
        Func<T, bool> shouldReplace,
        Func<T, T> replacement) =>
        @this.Select(x => shouldReplace(x) ? replacement(x) : x);
}

This function can replace many instances of if statements, reducing them to simpler, more predictable operations.
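As a quick illustration of that claim, here’s a before-and-after sketch using invented price data (the ReplaceWhen() extension is repeated so the snippet stands alone):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var prices = new[] { 5, 25, 80, 120 };

// Imperative version: a loop, an if, and mutation of a new list.
var discountedImperative = new List<int>();
foreach (var p in prices)
{
    if (p > 100)
        discountedImperative.Add(p - 10);
    else
        discountedImperative.Add(p);
}

// Declarative version: one predictable expression.
var discounted = prices.ReplaceWhen(p => p > 100, p => p - 10);

Console.WriteLine(string.Join(", ", discounted)); // 5, 25, 80, 110

// Repeated from above so this compiles on its own.
public static class ReplaceWhenExtensions
{
    public static IEnumerable<T> ReplaceWhen<T>(this IEnumerable<T> @this,
        Func<T, bool> shouldReplace,
        Func<T, T> replacement) =>
        @this.Select(x => shouldReplace(x) ? replacement(x) : x);
}
```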

Summary

In this chapter, we looked at various ways to use higher-order functions to provide rich functionality to our codebase, avoiding the need for OOP-style statements.

Do get in touch if you have any ideas of your own for higher-order function uses. You never know, it might end up in a future edition of this book!

The next chapter delves into discriminated unions and how this functional technique can help better model business logic concepts in your codebase, and remove the need for a lot of defensive code typically needed with nonfunctional projects. Enjoy!

1 Ideally the hottest, spiciest flavor you can find. Flames should be shooting from your mouth as you eat!

2 Read more about SOLID on Wikipedia or see my YouTube video “SOLID Principles in 5 Nightmares”.

3 I’d guess that’s where they get their name.

4 Also, congratulations for making it this far—although it probably didn’t take you anywhere near as much time as it took me!

5 For the uninitiated, these are the main baddies in the British SF TV series Doctor Who. You can see them in action on YouTube.
