Chapter 7. Records and Macros

As soon as your first Erlang product reaches the market and is deployed around the world, you start working on feature enhancements for the second release. Imagine 15,000 lines of code, which incidentally happens to be the size of the code base of the first Erlang product Ericsson shipped, the Mobility Server. In your code base, you have tuples that contain data relating to the existing features and constants that have been hardcoded. When you add new features, you need to add fields to these tuples. The problem is that the fields need to be updated not only in the code base where you are adding these features, but also in the remaining 15,000 lines of code where you aren’t adding them. Missing one tuple will cause a runtime error. Assuming your constants also need to be updated, you need to change the hardcoded values everywhere they are used. And even more costly than implementing these software changes is the fact that the entire code base needs to be retested to ensure that no new bugs have been introduced and that no field or constant updates have been omitted.

One of the most common constructions in computing is to bring together a number of pieces of data as a single item. Erlang tuples provide the basic mechanism for collecting data, but they do have some disadvantages, particularly when a larger number of data items are collected as a single object. In the first part of this chapter, you will learn about records, which overcome most of these disadvantages and which also make code evolution easier to achieve. The key to this is the fact that records provide data abstraction by which the actual representation of the data is hidden from the programs that access it.

Macros allow you to write abbreviations that are expanded by the Erlang preprocessor. Macros can be used to make programs more readable, to extend the language, and to write debugging code. We conclude the chapter by describing the include directive, by which header files containing macro and record definitions are used in Erlang projects.

Although neither is essential for writing Erlang programs, both are useful in making programs easier to read, modify, and debug, facilitating code enhancements and support of deployed products. It is no coincidence that records and macros, the two constructs described in this chapter, were added to the language soon after Ericsson’s Mobility Server went into production and developers started to support it while working on enhancing its feature set.

Records

To understand the advantages of records, we will first introduce a small example dealing with information about people. Suppose you want to store basic information about a person, including his name, age, and telephone number. You could do this using three-element tuples of the form {Name,Age,Phone}:

-module(tuples1).
-export([test1/0, test2/0]).

birthday({Name,Age,Phone}) ->
  {Name,Age+1,Phone}.

joe() ->
  {"Joe", 21, "999-999"}.

showPerson({Name,Age,Phone}) ->
  io:format("name: ~p  age: ~p  phone: ~p~n", [Name,Age,Phone]).

test1() ->
  showPerson(joe()).

test2() ->
  showPerson(birthday(joe())).

At every point in the program where the person representation is used, it must be presented as a complete tuple: {Name,Age,Phone}. Although not apparently a problem for a three-element tuple, adding new fields means you would have to update the tuple everywhere, even in the code base where the new fields are not used. Missing an update will result in a badmatch runtime error when pattern-matching the tuple. Furthermore, tuples do not scale well when dealing with sizes of 30 or even 10 elements, as the potential for misunderstanding or error is much greater.

Introducing Records

A record is a data structure with a fixed number of fields that are accessed by name, similar to a C structure or a Pascal record. This differs from tuples, where fields are accessed by position. In the case of the person example, you would define a record type as follows:

-record(person, {name,age,phone}).

This introduces the record type person, where each record instance contains three fields named name, age, and phone. Field names are defined as atoms. Here is an example of a record instance of this type:

#person{name="Joe",
        age=21,
        phone="999-999"}

In the preceding code, #person is the constructor for person records. It just so happens in this example that we listed the fields in the same order as in the definition, but this is not necessary. The following expression gives the same value:

#person{phone="999-999",
        name="Joe",
        age=21}

In both examples, we defined all the fields, but it is possible to give default values for the fields in the record definition, as in the following:

-record(person, {name,age=0,phone=""}).

Now a person record like this one:

#person{name="Fred"}

will have age zero and an empty phone number; in the absence of a default value being specified, the “default default” is the atom undefined.
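The effect of default values can be checked with a small module. The following is a minimal sketch: the module name person_defaults, the helper functions, and the pet record are hypothetical and introduced only for illustration; the person definition is the one given above.

```erlang
-module(person_defaults).
-export([fred/0, describe/1, unset_field/0]).

%% The definition from the text: age and phone carry explicit defaults.
-record(person, {name, age = 0, phone = ""}).

%% Hypothetical record with no defaults declared at all.
-record(pet, {name, owner}).

%% Only name is given; age and phone fall back to their declared defaults.
fred() ->
    #person{name = "Fred"}.

%% Return the three fields as a plain tuple so the defaults are easy to inspect.
describe(#person{name = Name, age = Age, phone = Phone}) ->
    {Name, Age, Phone}.

%% owner has no declared default, so it takes the atom undefined.
unset_field() ->
    Pet = #pet{name = "Rex"},
    Pet#pet.owner.
```

Calling person_defaults:describe(person_defaults:fred()) should give {"Fred",0,[]}, and person_defaults:unset_field() should give undefined.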

The general definition of a record name with fields named field1 to fieldn will take the following form:

-record(name, {field1 [ = default1 ],
               field2 [ = default2 ],
               ...
               fieldn [ = defaultn ] }).

where the parts enclosed in square brackets are optional declarations of default field values. The same field name can be used in more than one record type; indeed, two records might share the same list of names. The name of the record can be used in only one definition, however, as this is used to identify the record.

Working with Records

Suppose you are given a record value. How can you access the fields, and how can you describe a modified record? Given the following example:

Person = #person{name="Fred"}

you access the fields of the record like this: Person#person.name, Person#person.age, and so on. What will be the values of these? The general form for this field access will be:

RecordExp#name.fieldName

where the name and fieldName cannot be variables and RecordExp is an expression denoting a record. Typically, this will be a variable, but it might also be the result of a function application or a field access for another record type.
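As a rough illustration of the different shapes RecordExp can take, the sketch below accesses a field through a variable and through a function application. The module and function names are hypothetical. Note that a function application has to be wrapped in parentheses before the # operator is applied to it.

```erlang
-module(field_access).
-export([joe/0, name_of_joe/0, age_next_year/1]).

-record(person, {name, age = 0, phone = ""}).

joe() ->
    #person{name = "Joe", age = 21, phone = "999-999"}.

%% RecordExp is a function application; the parentheses are required.
name_of_joe() ->
    (joe())#person.name.

%% RecordExp is a variable, the most common case.
age_next_year(P) ->
    P#person.age + 1.
```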

Suppose you want to modify a single field of a record. You can write this directly, as in the following:

NewPerson = Person#person{age=37}

In such a case, the record syntax is a real advantage. You have mentioned only the field whose value is modified; those that are unchanged from Person to NewPerson need not figure in the definition. In fact, the record mechanism allows for any selection of the fields to be updated, as in:

NewPerson = Person#person{phone="999-999",age=37}

The general case will be:

RecordExp#name{..., fieldNamei=valuei, ... }

where the field updates can occur in any order, but each field name can occur, at most, only once.
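Because Erlang data is immutable, a record update builds a new record and leaves the original untouched. The following sketch, with a hypothetical module name and demo function, updates two fields in a different order from the record definition and then reads values from both records.

```erlang
-module(record_update).
-export([demo/0]).

-record(person, {name, age = 0, phone = ""}).

demo() ->
    Person = #person{name = "Fred"},
    %% Fields may be listed in any order; name is simply carried over.
    NewPerson = Person#person{phone = "999-999", age = 37},
    %% Person still has its default age; only NewPerson has the new values.
    {Person#person.age,
     NewPerson#person.age,
     NewPerson#person.name,
     NewPerson#person.phone}.
```

Evaluating record_update:demo() should return {0,37,"Fred","999-999"}.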

Functions and Pattern Matching over Records

Using pattern matching over records, it is possible to extract field values and to affect the control flow of computation. Suppose you want to define the birthday function, which increases the age of the person by one. You could define the function using field selection and update like this:

birthday(P) ->
    P#person{age = P#person.age + 1}.

But it is clearer to use pattern matching:

birthday(#person{age=Age} = P) ->
    P#person{age=Age+1}.

The preceding code makes it clear that the function is applied to a person record, as well as extracting the age field into the variable Age. It is also possible to match against field values so that you increase only Joe’s age, keeping everyone else the same age:

joesBirthday(#person{age=Age,name="Joe"} = P) ->
    P#person{age=Age+1};
joesBirthday(P) -> P.

Revisiting the example from the beginning of the section, you can give the definitions using records:

-module(records1).
-export([birthday/1, joe/0, showPerson/1]).

-record(person, {name,age=0,phone}).

birthday(#person{age=Age} = P) ->
  P#person{age=Age+1}.

joe() ->
  #person{name="Joe",
          age=21,
          phone="999-999"}.

showPerson(#person{age=Age,phone=Phone,name=Name}) ->
  io:format("name: ~p  age: ~p  phone: ~p~n", [Name,Age,Phone]).

Although the notation used here is a little more verbose, this is more than compensated for by the clarity of the code, which makes clear our intention to work with records of people, as well as concentrating on the relevant details: it is clear from the definition of birthday that it operates on the age field and leaves the others unchanged. Finally, the code is more easily modified if the composition of the record is changed or extended; the first exercise at the end of this chapter gives you a chance to verify this for yourself.

Record fields can contain any valid Erlang data types. As records are valid data types, fields can contain other records, resulting in nested records. For example, the content of the name field in a person record could itself be a record:

-record(name, {first, surname}).

P = #person{name = #name{first = "Robert",
                         surname = "Virding"}},
First = (P#person.name)#name.first.

Furthermore, field selection of a nested field can be given by a single expression, as in the preceding definition of First.
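Nested records can also be taken apart by nested patterns, and an inner field can be updated by rebuilding the outer record around the updated inner one. The sketch below assumes the name and person definitions from the text; the module and function names are hypothetical.

```erlang
-module(nested_records).
-export([robert/0, first_name/1, first_name_by_match/1,
         surname/1, set_surname/2]).

-record(name, {first, surname}).
-record(person, {name, age = 0, phone}).

robert() ->
    #person{name = #name{first = "Robert", surname = "Virding"}}.

%% Nested field selection in a single expression, as in the text.
first_name(P) ->
    (P#person.name)#name.first.

%% The same selection written as a nested pattern in the function head.
first_name_by_match(#person{name = #name{first = First}}) ->
    First.

surname(#person{name = #name{surname = Surname}}) ->
    Surname.

%% Update the inner record first, then place it back in the outer record.
set_surname(#person{name = Name} = P, Surname) ->
    P#person{name = Name#name{surname = Surname}}.
```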

Records in the Shell

Records in Erlang are a compile-time feature, and they don’t have their own types in the virtual machine (VM). Because of this, the shell deals with them differently than it does other constructions.

Using the command rr(moduleName) in the shell, all record definitions in the module moduleName are loaded. You can otherwise define records directly in the shell itself using the command rd(name, {field1, field2, ... }), which defines the record name with fields field1, field2, and so on. This can be useful in testing and debugging, or if you do not have access to the module in which you’ve defined the record. Finally, the command rl() lists all the record definitions currently visible in the shell. Try them out in the shell:

1> c("/Users/Francesco/records1", [{outdir, "/Users/Francesco/"}]).
{ok,records1}
2> rr(records1).
[person]
3> Person = #person{name="Mike",age=30}.
#person{name = "Mike",age = 30,phone = undefined}
4> Person#person.age + 1.
31
5> Person#person{phone=5697}.
#person{name = "Mike",age = 30,phone = 5697}
6> rd(name, {first, surname}).
name
7> NewPerson = Person#person{name=#name{first="Mike",surname="Williams"}}.
#person{name = #name{first = "Mike",surname = "Williams"},
        age = 30,phone = undefined}
8> FirstName = (NewPerson#person.name)#name.first.
"Mike"
9> rl().
-record(name,{first,surname}).
-record(person,{name,age = 0,phone}).
ok
10> Person = Person#person{name=#name{first="Chris",surname="Williams"}}.
** exception error: no match of right hand side value
                    #person{name = #name{first = "Chris",surname = "Williams"},
                            age = 30,phone = undefined}

In the preceding example, we load the person record definition from the records1 module, create an instance of it, and extract the age field. In command 6, we define a new record type called name, with the fields first and surname. In command 7, we create an instance of this record and bind it to the name field of the record stored in the variable Person, all in one operation. Finally, in command 8, we extract the first name by looking up the name field in the record of type person stored in the variable NewPerson, again in one operation.

Look at what happens in command 10. This is a very common error, made by beginners and seasoned programmers alike: forgetting that Erlang variables are single assignment and that the = operator is nondestructive. In command 10, you might think you are changing the value of the name field to a new name, but you are in fact pattern-matching the record you’ve just created on the right side against the contents of the bound variable Person on the left. The pattern matching fails, as the name field of Person holds the string "Mike", whereas the name field of the record on the right holds a name record with the fields "Chris" and "Williams".

Finally, the shell commands rf(RecordName) and rf() forget one or all of the record definitions currently visible in the shell.

Record Implementation

We are now about to let you in on a poorly kept secret. We would rather not tell you, but when testing with records from the shell, using debugging tools to troubleshoot your code, or printing out internal data structures, you are bound to come across this. The Erlang compiler implements records before programs are run. Records are translated into tuples, and functions over records translate to functions and BIFs over the corresponding tuples. You can see this from this shell interaction:

11> records1:joe().
#person{name = "Joe",age = 21,phone = "999-999"}
12> records1:joe() == {person,"Joe",21,"999-999"}.
true
13> Tuple = {name,"Francesco","Cesarini"}.
#name{first = "Francesco",surname = "Cesarini"}
14> Tuple#name.first.
"Francesco"

From the preceding code, you can deduce that person is a 4-tuple, the first element being the atom person “tagging” the tuple and the remaining elements being the tuple fields in the order in which they are listed in the declaration of the record. The name record is a 3-tuple, where the first element is the atom name, the second is the first name field, and the third is the surname field.

Note how the shell by default assumes that a tuple is a record. This will unfortunately be the same in your programs, so whatever you do, never, ever use the tuple representations of records in your programs. If you do, the authors of this book will disown you and deny any involvement in helping you learn Erlang. We mean it!

Warning

Why should you never use the tuple representation of records? Using the representation breaks data abstraction, so any modification to your record type will not be reflected in the code using the tuples. If you add a field to the record, the size of the tuple created by the compiler will change, resulting in a badmatch error when trying to pattern-match the record to your tuple (where you obviously forgot to add the new element). Swapping the field order in the record will not affect your code if you are using records, as you access the fields by name. If in some places, however, you use a tuple and forget to swap all occurrences, your program may fail, or worse, may behave in an unexpected and unintended way. Finally, even though this should be the least of your worries, the internal record representation might change in future releases of Erlang, making your code nonbackward-compatible.
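To make the warning concrete, here is a minimal sketch of what goes wrong. It assumes a hypothetical second version of the person record in which an address field has been added, while one function still matches on the old four-element tuple layout. The module and function names are invented for the example. Because the stale pattern sits in a function head, the failure shows up as a function_clause error rather than a badmatch.

```erlang
-module(tuple_leak).
-export([age_by_record/1, age_by_tuple/1, demo/0]).

%% Hypothetical version 2 of the record: an address field has been added.
-record(person, {name, address, age = 0, phone}).

%% Uses the record syntax, so it keeps working after the change.
age_by_record(#person{age = Age}) ->
    Age.

%% Written against the old layout {person, Name, Age, Phone} and never updated.
age_by_tuple({person, _Name, Age, _Phone}) ->
    Age.

demo() ->
    P = #person{name = "Joe", age = 21},
    ByRecord = age_by_record(P),
    %% P is now a 5-tuple, so the stale 4-tuple pattern no longer matches.
    ByTuple = try age_by_tuple(P)
              catch error:function_clause -> stale_tuple_pattern
              end,
    {ByRecord, ByTuple}.
```

Here tuple_leak:demo() should return {21,stale_tuple_pattern}: the record-based accessor is unaffected, while the tuple-based one fails at runtime.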

To view the code produced as a source code transformation on records, compile your module and include the 'E' option. This results in a file with the E suffix. As an example, let’s compile the records1 module using compile:file(records1, ['E']) or the shell command c(records1, ['E']), producing a file called records1.E. No beam file containing the object code is produced. Note the slightly different syntax to what you have read so far, and pay particular attention to the record operations and tests which have been mapped to tuples, as well as the module_info functions which have been added. We will not go into the details of the various commands, as they are implementation-dependent and outside the scope of this book. They are, however, still interesting to see:

-file("/Users/Francesco/records1.erl", 1).

birthday({person,_,Age,_} = P) ->
    begin
        Rec0 = Age + 1,
        Rec1 = P,
        case Rec1 of
            {person,_,_,_} ->
                setelement(3, Rec1, Rec0);
            _ ->
                erlang:error({badrecord,person})
        end
    end.

joe() ->
    {person,"Joe",21,"999-999"}.

showPerson({person,Name,Age,Phone}) ->
    io:format("name: ~p  age: ~p  phone: ~p~n", [Name,Age,Phone]).

module_info() ->
    erlang:get_module_info(records1).

module_info(X) ->
    erlang:get_module_info(records1, X).

Record BIFs

The BIF record_info will give information about a record type and its representation. The function call record_info(fields, recType) will return the list of field names in the recType, and the function call record_info(size, recType) will return the size of the representing tuple, namely the number of fields plus one. The position of a field in the representing tuple is given by #recType.fieldName, where both recType and fieldName are atoms:

15> #person.name.
2
16> record_info(size, person).
4
17> record_info(fields, person).
[name,age,phone]
18> RecType = person.
person
19> record_info(fields, RecType).
* 1: illegal record info
20> RecType#name.
* 1: syntax error before: '.'

Note how commands 19 and 20 failed. If you type the same code in a module as part of a function and compile it, the compilation will also fail. The reason is simple: the record_info/2 BIF and the #RecordType.Field operations must contain literal atoms; they may not contain variables. This is because they are handled by the compiler and converted to their respective values before the code is run and the variables are bound.
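One place where the field position #recType.fieldName is handy is with the key-based functions of the standard lists module, which take a tuple position as an argument. The sketch below passes #person.age and #person.name instead of hardcoded integers, so it should keep working if fields are added or reordered. The module and function names are hypothetical.

```erlang
-module(record_index).
-export([people/0, sort_by_age/1, find_by_name/2, names/1]).

-record(person, {name, age = 0, phone}).

people() ->
    [#person{name = "Mike", age = 30},
     #person{name = "Joe", age = 21},
     #person{name = "Fred"}].

%% #person.age is replaced by the field position at compile time.
sort_by_age(People) ->
    lists:keysort(#person.age, People).

%% Returns the first matching record, or false if there is none.
find_by_name(Name, People) ->
    lists:keyfind(Name, #person.name, People).

names(People) ->
    [P#person.name || P <- People].
```

With the list above, sorting by age and extracting the names should give ["Fred","Joe","Mike"].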

A BIF that you can use in guards is is_record(Term, RecordTag). The BIF will verify that Term is a tuple, that its first element is RecordTag, and that the size of the tuple is correct. This BIF returns the atom true or false.
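A small sketch of is_record/2 in a guard follows. The module name and the describe function are hypothetical; the point is only that terms of the wrong shape fall through to the second clause instead of raising an error.

```erlang
-module(record_guard).
-export([joe/0, describe/1]).

-record(person, {name, age = 0, phone}).

joe() ->
    #person{name = "Joe", age = 21, phone = "999-999"}.

%% The guard checks both the tag and the size of the tuple.
describe(P) when is_record(P, person) ->
    {person_named, P#person.name};
describe(_Other) ->
    not_a_person.
```

Calling record_guard:describe(record_guard:joe()) should return {person_named,"Joe"}, while record_guard:describe(42) and record_guard:describe({person,"Joe"}) should both return not_a_person, the latter because the tuple has the right tag but the wrong size.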

Macros

Macros allow you to write abbreviations of Erlang constructs that the Erlang Preprocessor (EPP) expands at compile time. You can use macros to make programs more readable and to implement features outside the language itself. With conditional macros, it becomes possible to write programs that can be customized in different ways, switching between debugging and production modes or among different architectures.

Simple Macros

The simplest macro can be used to define a constant, as in:

-define(TIMEOUT, 1000).

The macro is used by putting a ? in front of the macro name, as in:

receive
    after ?TIMEOUT -> ok
end

After macro expansion in epp, the preceding code will give the following Erlang program:

receive
    after 1000 -> ok
end

The general form of a simple macro definition is:

-define(Name,Replacement).

where it is customary—but not required—to CAPITALIZE the Name. In the earlier example, the Replacement was the literal 1000; it can, in fact, be any sequence of Erlang tokens—that is, a sequence of “words” such as variables, atoms, symbols, or punctuation. The result need not be a complete Erlang expression or a top-level form (i.e., a function definition or compiler directive). It is not possible to build new tokens through macro expansion. As an example, consider the following:

-define(FUNC,X).
-define(TION,+X).

double(X) -> ?FUNC?TION.

Here, you can see that the replacement for TION is not an expression, but on expansion a legitimate function (or top-level form) definition is produced. Note that when appending macros, a space delimiting their results is added to the result by default:

double(X) -> X + X.

Parameterized Macros

Macros can take parameters which are indicated by variable names. The general form for parameterized macros is:

-define(Name(Var1,Var2,...,VarN), Replacement).

where, as for normal Erlang variables, the variables Var1, Var2, ..., VarN need to begin with a capital letter. Here is an example:

-define(Multiple(X,Y),X rem Y == 0).

tstFun(Z,W) when ?Multiple(Z,W) -> true;
tstFun(Z,W)                     -> false.

The macro definition is used here to make a guard expression more readable; a macro rather than a function needs to be used, as the syntax for guards precludes calls to user-defined functions. After macro expansion, the call is “inlined” thus:

tstFun(Z,W) when Z rem W == 0 -> true;
tstFun(Z,W)                   -> false.

Another example of parameterized macros could be for diagnostic printouts. It is not uncommon to come across code where two macros have been defined, but one is commented out:

%-define(DBG(Str, Args), ok).
-define(DBG(Str, Args), io:format(Str, Args)).

birthday(#person{age=Age} = P) ->
    ?DBG("in records1:birthday(~p)~n", [P]),
    P#person{age=Age+1}.

When developing the system, you have all of the debug printouts turned on in the code. When you want to turn them off, all you need to do is comment out the second definition of DBG and uncomment the first one before recompiling the code.

Debugging and Macros

One of the major uses of macros in Erlang is to allow code to be instrumented in various ways. The advantage of the macro approach is that in using conditional macros (which we will describe in this section), it is possible to generate different versions of code, such as a debugging version and a production version.

The first aspect of this is the ability to get hold of the argument to a macro as a string, made up of the tokens comprising the argument. You do this by prefixing the variable with ??, as in ??Call:

-define(VALUE(Call),io:format("~p = ~p~n",[??Call,Call])).
test1() -> ?VALUE(length([1,2,3])).

The first use of the Call parameter is as ??Call, which will be expanded to the text of the parameter as a string; the second use, Call, will be expanded to a call to length, so that in the shell you would see the following:

36> macros1:test1().
"length ( [ 1 , 2 , 3 ] )" = 3

Second, there is a set of predefined macros that are commonly used in debugging code:

?MODULE

This expands to the name of the module in which it is used.

?MODULE_STRING

This expands to a string consisting of the name of the module in which it is used.

?FILE

This expands to the name of the file in which it is used.

?LINE

This expands to the line number of the position at which it is used.

?MACHINE

This expands to the name of the VM that is being used; currently, the only possible value for this is BEAM.
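The predefined macros are typically combined in a logging helper. The following is a sketch only: the LOG macro and the module name are hypothetical. Because ?LINE is expanded where LOG is used, the printed line number refers to the call site, not to the macro definition.

```erlang
-module(predefined_macros).
-export([where/0, log/1]).

%% Hypothetical helper macro: prints the module and line of the call site.
-define(LOG(Msg), io:format("~p:~p: ~s~n", [?MODULE, ?LINE, Msg])).

%% ?MODULE is an atom, ?MODULE_STRING a string, ?MACHINE the atom 'BEAM'.
where() ->
    {?MODULE, ?MODULE_STRING, ?MACHINE}.

%% Msg is expected to be a string; io:format/2 returns ok.
log(Msg) ->
    ?LOG(Msg).
```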

Finally, it is possible to define conditional macros, which will be expanded in different ways according to different flags passed to the compiler. Conditional macros are a more elegant and effective way to get the same effect as the earlier ?DBG example, where given two macros, the user comments one out. The following directives make this possible:

-undef(Flag).

This will unset the Flag.

-ifdef(Flag).

If Flag is set, the forms that follow are compiled.

-ifndef(Flag).

If Flag is not set, the forms that follow are compiled.

-else.

This provides an alternative catch-all case: if this case is reached, the forms that follow are compiled.

-endif.

This terminates the conditional construct.

Here is an example of their use:

-ifdef(debug).
     -define(DBG(Str, Args), io:format(Str, Args)).
-else.
     -define(DBG(Str, Args), ok).
-endif.

In the code this is used as follows:

?DBG("~p:call(~p) called~n",[?MODULE, Request])

To turn on system debugging, you need to set the debug flag. You can do this in the shell using the following command:

c(Module,[{d,debug}]).

Or, you can do it programmatically, using compile:file/2 with similar flags. You can unset the flag by using c(Module,[{u,debug}]).

Conditional macro definitions such as these need to be properly nested, and cannot occur within function definitions.
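Putting the pieces together, a complete module using a conditional macro might look like the sketch below. The module name, the MODE macro, and the call and mode functions are hypothetical; the DBG definitions are the ones shown above. All conditional directives sit at the top level, outside any function definition.

```erlang
-module(cond_macros).
-export([call/1, mode/0]).

-ifdef(debug).
-define(DBG(Str, Args), io:format(Str, Args)).
-define(MODE, debug).
-else.
-define(DBG(Str, Args), ok).
-define(MODE, production).
-endif.

%% With debug set, the request is printed; otherwise ?DBG expands to ok.
call(Request) ->
    ?DBG("~p:call(~p) called~n", [?MODULE, Request]),
    {reply, Request}.

%% Reports which branch of the conditional was compiled in.
mode() ->
    ?MODE.
```

Compiled with c(cond_macros, [{d,debug}]), the function mode/0 should return debug; compiled without the flag, it should return production.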

To debug macro definitions, it is possible to get the compiler to dump a file of the results of applying epp to a file. You do this in a shell with c(Module,['P']) and in a program with compile:file/2; these commands dump the result in the file Module.P. The 'P' flag differs from the 'E' flag in that code transformations necessary for record operations are not done by 'P'.

Include Files

It is customary to put record and macro definitions into an include file so that they can be shared across multiple modules throughout a project, and not simply in a single module. To make the definitions available to more than one module, you place them in a separate file and include them in a module using the include directive, usually placed after the module and export directives:

-include("File.hrl").

In the preceding directive, the quotes "..." around the filename are mandatory. Include files customarily have the suffix .hrl, but this is not enforced.

The compiler has a list of paths to search for include files, the first of which is the current directory followed by the directory containing the source code being compiled. You can include other paths in the path list by compiling your code using the i option: c(Module, [{i, Dir}]). Several directories can be specified, where the directory specified last is searched first.

Exercises

Exercise 7-1: Extending Records

Extend the person record type to include a field for the address of the person. Which of the existing functions over person need to be modified, and which can be left unchanged?

Exercise 7-2: Record Guards

Using the record BIF is_record(P, person), it is possible to check whether the variable P contains a person record. Explain how you would use this to modify the function foobar, defined as follows:

foobar(P) when P#person.name == "Joe" -> ...

so that it will not fail if applied to a nonrecord.

Exercise 7-3: The db.erl Exercise Revisited

Revisit the database example db.erl that you wrote in Exercise 3-4 in Chapter 3. Rewrite it using records instead of tuples. As a record, you could use the following definition:

-record(data, {key, data}).

You should remember to place this definition in an include file. Test your results using the database server developed in Exercise 5-1 in Chapter 5.

Exercise 7-4: Records and Shapes

Define a record type to represent circles; define another to represent rectangles. You should assume the following:

  • A circle has a radius.

  • A rectangle has a length and a width.

Give functions that work over these types to give the perimeter and area of these geometric figures. Once this is completed, add the code for triangles to your type definitions and functions, where you can assume that the triangle is described by the lengths of its three sides.

Exercise 7-5: Binary Tree Records

Define a record type to represent binary trees with numerical values held at internal nodes and at the leaves. Figure 7-1 shows an example.

Define functions over the record type to do the following:

  • Sum the values contained in the tree.

  • Find the maximum value contained in the tree (if any).

A tree is ordered if, for all nodes, the values in the left subtree below the node are smaller than or equal to the value held at the node, and this value is less than or equal to all the values in the right subtree below the node. Figure 7-2 shows an example.

  • Define a function to check whether a binary tree is ordered.

  • Define a function to insert a value in an ordered tree so that the order is preserved.

Figure 7-1. An example of a binary tree
Figure 7-2. An ordered binary tree

Exercise 7-6: Parameterized Macros

Define a parameterized macro SHOW_EVAL that will simply return the result of an expression when the show mode is switched off, but which will also print the expression and its value when the show flag is on. You should ensure that the expression is evaluated only once whichever case holds.

Exercise 7-7: Counting Calls

How can you use the Erlang macro facility to count the number of calls to a particular function in a particular module?

Exercise 7-8: Enumerated Types

An enumerated type consists of a finite number of elements, such as the days of the week or months of the year. How can you use macros to help the implementation of enumerated types in Erlang?

Exercise 7-9: Debugging the db.erl Exercise

Extend the database example in Exercise 7-3 so that it includes optional debugging code reporting on the actions requested of the database as they are executed.
