
10. Pycca and Other Platforms


In this chapter, we discuss the design and implementation of the pycca program itself. Pycca is designed as a language processor that reads DSL statements to populate a platform model and generates code by using a template system that queries the populated platform model. It is implemented in the Tcl language.

We also present some size and execution speed measurements of the ALS system on a representative microcontroller platform to show that the resulting memory usage and execution speed are appropriate for our targeted platform.

To conclude the chapter, we present a brief overview of a target platform using Berkeley DB, a key/value data store engine, to store domain data rather than keeping all the data in memory.

Design of the Pycca Program

There remains one area in our translation approach that we have not discussed. We have shown input to pycca and output from it, but we have not discussed pycca itself as a program. Space does not allow a complete description of the pycca implementation. The source code and documentation for pycca are freely available from this book’s website. In this section, we discuss several aspects of the design and implementation of the pycca program itself.

Pycca has three major design elements:

  • Platform model

  • Domain-specific language processing

  • Template-driven code generation

Platform Model

It should come as no surprise that there is a model underlying pycca operations. In the following discussion, we use the same names for things that are at two levels of abstraction. Let’s define some terms to keep things clear. When we speak of the executable model, we are referring to the model of logic that is to be translated. When we speak of the platform model, we are referring to the model of the implementation technology platform onto which we are translating.

The platform model for pycca is specific to the targeted implementation technology. This is not a general model of modeling itself (a metamodel). The pycca platform model does not set the rules for how to create an executable model. Rather, it gives the rules for how our C implementation will be formed. The platform model reflects the technology choices we have made for the specific implementation that pycca generates. Targeting a different platform or making different choices about the implementation details on our platform would be reflected in platform model differences.

Figure 10-1 shows a fragment of the platform model for pycca.

Figure 10-1. Fragment of pycca platform model

Notice that there are classes named Domain and Class. These classes model the implementation counterparts of a Domain or Class in the executable model.

The platform model states that Classes contain Data Elements and that a class may contain no Data Elements at all (R3). Data Elements are part of exactly one class, and so there is no sharing of Data Elements among Classes. All Data Elements are one of three types (R4): Attributes, ClassRefs, or SubtypeRefs. Further, class references are one of four types (R21).

For our target, we have decided that all class instances will be held in memory and stored in arrays of C structures. The members of a class structure are either attributes, references (of some form) to classes, or references to subclasses in a generalization. The classes in the platform model are used to model the implementation structure members that are generated by pycca. For pycca, the Class class in the platform model has a direct correspondence to the C structure that is generated for the implementation.

For example, consider the Duty_Station class definition from our Air Traffic Control model:

class Duty_Station
    attribute (Station_Number Number)
    attribute (Name_t Location)
    attribute (Aircraft_Maximum Capacity)
    reference R3 -> On_Duty_Controller
end

The R3 reference is a singular reference to an instance of the On_Duty_Controller class. An instance of SingClassRef (and all its related superclass instances) would be created to correspond to the reference statement in the Duty_Station class definition. When pycca generates code, it uses this information in two ways. First, when the Duty_Station structure is declared, it contains a member, struct On_Duty_Controller *R3. This is shown in the following pycca-generated structure definition:

/*
 * Duty_Station structure definition
 */
struct Duty_Station {
    struct mechinstance common_ ; // must be first !
    Station_Number Number ;
    Name_T Location ;
    Aircraft_Maximum Capacity ;
    struct On_Duty_Controller *R3 ;
} ;

When creating the initializers for the initial instance population, the R3 member is set to the address of an On_Duty_Controller array member or NULL, depending on the population requested. This is shown for the initial instance population of our example:

/*
 * Initial Instance Storage for, "Duty_Station"
 */
static struct Duty_Station Duty_Station_storage[3] = {
    {.common_ = {1, 0, &Duty_Station_class}, "S1", "Front", 20, .R3 = NULL},
    {.common_ = {2, 0, &Duty_Station_class}, "S2", "Center", 30, .R3 = NULL},
    {.common_ = {3, 0, &Duty_Station_class}, "S3", "Front", 45, .R3 = NULL}
} ;

Notice that many of the classes in the platform model have File and Line attributes. These attributes record the name of the input file and line number within the file where the entity was defined. This information is helpful for producing error messages as well as for inserting #line directives in the generated code file.
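
For example, a generated code file might carry a directive such as the following so that compiler diagnostics refer back to the pycca input file rather than to the generated C file. The file name, line number, and function shown here are hypothetical:

/*
 * Hypothetical excerpt of a pycca-generated code file.
 */
#line 37 "atc.pycca"    /* report errors against line 37 of the DSL input */
static void
Duty_Station_assignController(struct Duty_Station *self)
{
    /* ... state activity code copied from the DSL input ... */
}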

The pycca platform model has 35 classes. They cover the types of information that are specified in the DSL language statements, such as classes, attributes, state transitions, and initial instances. The pycca platform model forms the fundamental basis for how the rest of the program is organized and how the processing required for code generation works.

Domain-Specific Language Processing

Generating a parser for a computer language is a well-understood problem. Pycca follows the usual pattern of defining a grammar and using a parser generator to create a pycca DSL interpreter. As the input is scanned and recognized, code is executed when grammar elements are reduced. The code executed when DSL statements are recognized creates instances of the platform-model classes. Data from the language statements correspond to attributes in the platform-model class instances. You can think of the pycca DSL as a textual and more convenient representation of a platform-model population. The same effect could be created by populating the platform model directly, but using a DSL provides an opportunity to organize the platform-model population in a more human-friendly way.

As with other computer languages, it is possible to write pycca DSL statements with correct syntax that are meaningless. So, pycca performs a set of semantic checks after the input files have been parsed and the platform model populated. Some of the semantics are enforced by the constraints on the platform model, such as those implied by platform-model relationships. Others require executing code.

For example, if a state in a state model is marked as a final state, no outbound transitions are allowed from that state. Because an instance that transitions to a final state is deleted after its state activity is executed, it is not possible for it to respond to other events. Pycca also disallows isolated states (states that have neither outbound nor inbound transitions). There are 20 such semantic checks.

Template-Driven Code Generation

The code generator for pycca is designed using template expansion. This is a common idiom for generating everything from accounting reports to web pages. Pycca uses it much like any template system. Text is passed from the template to the output. Commands are embedded in the template, and when the template expander recognizes a command, it is executed, and the result returned from the embedded command is placed in the output. The template pycca uses contains embedded commands which query the platform model to find the information needed at that point in the code generation and then format the information as valid C language statements. Conceptually, there is little difference between what pycca does and what a banking program might do to consult a database and print an account statement. The goal of pycca is to produce output for a C compiler rather than a bank customer.

The use of a template permits us to order the generated output to suit the C compiler. There are separate templates for the generated header and code files. The templates are designed and ordered so that the embedded expansion commands place their output at the appropriate location in the output. The C language requires a great deal of type annotation and insists that symbol names be declared before (or sometimes at the same time as) they are defined. For example, forward declarations of state activity functions are placed before the definitions of the data structures that use the function names, and these in turn are placed before the definitions of the state activity function code.
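
The following sketch illustrates the kind of ordering a generated code file follows. The names and signatures are illustrative only and are not the exact output of pycca:

/*
 * 1. Forward declarations of state activity functions.
 */
static void Injector_Idle(void *self, void *params) ;
static void Injector_Injecting(void *self, void *params) ;

/*
 * 2. Data structures that refer to the functions by name.
 */
static void (*const Injector_activities[])(void *, void *) = {
    Injector_Idle,
    Injector_Injecting,
} ;

/*
 * 3. Definitions of the state activity functions themselves.
 */
static void
Injector_Idle(void *self, void *params)
{
    /* activity code ... */
}

static void
Injector_Injecting(void *self, void *params)
{
    /* activity code ... */
}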

Pycca Implementation

Pycca is implemented in the Tcl language. Tcl is a dynamic language often used as a scripting language. The choice of a scripting language for implementing pycca may seem unusual, but Tcl has many desirable characteristics:

  • Tcl is a mature, stable language and has been under active development for more than 20 years.

  • Tcl is platform independent and runs on Linux (and other UNIX derivatives), macOS, and Windows.

  • Tcl programs can be distributed as a single-file executable without external dependencies or complicated installation requirements.

  • Tcl is extensible and supports a large standard library of procedures.

In practice, the choice of language used to implement a program such as pycca should be made on matters of availability and convenience. It should be a language that is familiar to the implementer and that conveniently supports the implementation of the design ideas from the previous section. Here we discuss how Tcl is used to implement the pycca design.

The platform model for pycca is a normalized relational schema, as are all of our example models. Pycca uses the TclRAL (Relational Algebra Library) package to implement the operations on the platform model. The TclRAL package implements relational algebra, in which the operators are Tcl commands and the values and variables are directly integrated into the Tcl language. You can think of the TclRAL implementation of the platform model as an in-memory database that uses ordinary Tcl language commands to query and manipulate the platform-model data. It was designed expressly to support the integrity constraints of our modeling approach. TclRAL is completely integrated into the Tcl value system. There is no “impedance mismatch” in TclRAL in the sense that one does not use a different language to query and manipulate data and then have to transfer query results back into the implementation language for further processing. Contrast this approach with SQL, which requires you to deal with the inevitable boundary between the implementation language and the query language.

Following in the tradition of the venerable lex and yacc programs, Tcl has fickle and taccle to perform the same functions of generating a lexical analyzer and a parser. In this case, the generated analyzer and parser are delivered as Tcl code rather than C code.

The standard Tcl library contains procedures to perform template expansion, and these are used to generate the header and code files. Templates contain ordinary text passed directly to the output and embedded Tcl commands. The embedded commands are executed, and their output is written to the generated file. The embedded commands are implemented as TclRAL queries on the platform model with appropriate formatting of the results into C language statements.

The pycca program consists of approximately 5,000 lines of Tcl code, grammar specification, and lexical analyzer specification, not counting comments or blank lines. We are not fond of using lines of code as a software metric and present this count only as a gross, relative indication of the size of the pycca program. Pycca is not a large program, and much of that can be attributed to Tcl being an expressive language that accomplishes a great deal in few lines of code. If it had been written in C, pycca would probably be several times larger.

Pycca Performance

Not only is it necessary to achieve a translation of a model into running code, but it is also essential that the quality of the resulting implementation meet the needs of the project. It does little good to produce insightful models and a faithful translation to code if the performance requirements are not met.

Our target platform is microcontroller-based systems. The major challenge for these types of systems is the limited computing resources available. Both memory capacity and processor speed are usually quite limited. In this section, we show some performance numbers for the translation of the ALS system.

Target Hardware Platform

Many commercially available platforms would satisfy our needs. We have chosen the Giant Gecko from Silicon Labs. This microcontroller is available in a starter kit called EFM32GG-STK3700.

The microcontroller on the starter kit is the EFM32GG990F1024. This computer is an ARM Cortex-M3 based SOC with 1 MiByte of flash and 128 KiByte of RAM. The SOC is capable of ultra-low-power sleep modes and consumes approximately 219 μA/MHz when executing code from flash memory. Although capable of running at a 48 MHz clock frequency, the microcontroller was clocked at 7 MHz in these measurements. These specifications place the Giant Gecko at the more capable end of the class of microcontrollers we target.

Target Software Platform

The code for this example was built using the Silicon Labs Simplicity Studio development environment. We have included code from the vendor-supplied hardware access library as well as startup code and other small code pieces required to build a complete application.

The application was built using the GNU compiler suite:

arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.4.1 20160919 (release)
[ARM/embedded-5-branch revision 240496]

The application was compiled with preprocessor symbols NDEBUG, MECH_NINCL_STDIO, and RELEASE defined to remove all the assertions and uses of C standard I/O functions (for example, printf). Optimization was configured for minimum size using the -Os setting. Unused data and functions in object files were discarded in the final executable. This combination of build settings yields the smallest executable. When built for debugging and instrumentation, executables are often twice as large as those built for release.

ALS Code Size

The following measurements are for the ALS application example shown in Chapters 6 through 8. This example consists of two domains and the bridge code between them. The Lubrication domain contains external entity references to a UI domain and an Alarms domain, and these references have been stubbed out. The SIO domain likewise contains external entity references for access to device hardware, and these too have been stubbed out.

Table 10-1 shows the overall memory usage, in bytes, of the integrated lube/sio application.

Table 10-1. ALS Application Code Size (Bytes)

File           Code and Constants   Initialized Data   Uninitialized Data   Total
lube_sio.axf   12,064               1,784              756                  14,606

The total memory usage for the application easily fits our target hardware and would fit many other target hardware platforms of this class as well.

We can break down the memory usage by examining the sizes of the domains and the bridge code, as shown in Table 10-2.

Table 10-2. ALS Domain Code Size (Bytes)

File                Code and Constants   Initialized Data   Uninitialized Data   Total
lube.o              2,663                476                0                    3,139
sio.o               2,070                776                0                    2,846
lube_sio_bridge.o   471                  0                  0                    471
Total               5,204                1,252              0                    6,456

Of the total memory usage, less than half is devoted to the domain and bridge code. We should note that the initial instance population for this application was quite small, and this is reflected in the size of the initialized data. Increasing the size of the initial instance population would only increase the usage of initialized data. The code and constants memory usage would remain the same, regardless of the size of the initial instance population, because that memory usage already includes all the code and pycca-generated information for the domains.

Note that the initialized data also results in the same amount of space being allocated to RAM, which does not show up in these totals. At reset time, the initialized data is copied from flash memory to RAM by compiler-supplied startup code. Because flash memory technology does not allow direct updating in the same way as RAM, the RAM copy allows the values of class instances to be updated during the running of the program. This need to allocate twice the space for initialized data (in flash and an equal amount in RAM) is a consequence of the usual split between flash memory and RAM characteristic of microcontroller SOC designs.

It is worthwhile examining the contribution of the ST/MX domain to overall memory usage. This is important because the Model Execution domain must be present to run pycca-generated applications and represents a fixed cost that must be amortized across the entire application. See Table 10-3.

Table 10-3. Model Execution Domain Code Size (Bytes)

File             Code and Constants   Initialized Data   Uninitialized Data   Total
mechs.o          2,032                12                 720                  2,764
pycca_portal.o   990                  0                  0                    990
platform.o       471                  1                  0                    472
Total            3,521                13                 720                  4,226

The mechs.o file contains all the target-independent code of the MX domain. This includes the event queues and event signal/dispatch code, and so forth. The pycca_portal.o file consists of the portal functions used in bridging. Finally, the platform.o file consists of the platform-specific code required by the MX domain. This includes control of the timing resource used for delayed events, code to deal with low-power mode sleep and wake-up, and code to interface to the synchronization queue. The uninitialized data is allocated to event queues, event control blocks, and other internal resources of the ST/MX implementation. These resources can be expanded or contracted as needed, and this total represents the default sizing of 10 event control blocks with 16 bytes of parameter data space and 10 sync queue slots.

The memory usage by the ST/MX domain compares favorably with that of many RTOS implementations that target microcontrollers. However, that comparison is not direct. An RTOS will include facilities for multitasking, inter-task synchronization, and mutual exclusion not present in ST/MX. ST/MX is single threaded, strictly event driven with run-to-completion execution, and does not need tasking services nor the synchronization operations and mutual exclusion mechanisms required to support them. Conversely, ST/MX can signal and dispatch events to state machines, a feature not available in RTOSs.

The remaining memory usage, approximately 25 percent of the total, comes from startup code, standard C libraries, compiler libraries, hardware access libraries, external entity stubs, and other “glue” code necessary to obtain a running application. While this is a substantial part of the total memory usage, it represents a fixed cost for the application.

Execution Speed

We present only one execution speed measurement, shown in Table 10-4. Signaling and dispatching an event is a common operation in the ST/MX domain. Here we have measured the number of CPU cycles to signal and dispatch a self-directed event. This includes the time required to allocate the ECB, fill it in, add it to the event queue, return to the main loop to decide whether there is another event to dispatch, remove the ECB from the event queue, compute the transition, and enter the state activity. Conveniently, the ARM Cortex-M3 includes a cycle counter for these purposes.

Table 10-4. Timing to Signal and Dispatch an Event

Cycles   Time @ 7 MHz   Operations / s @ 7 MHz
4,696    671 μs         1,491

It takes 4,696 cycles to complete the signal/dispatch operation. At the 7 MHz clock frequency, this means it takes approximately 671 μs or, alternatively, that we may perform approximately 1,491 such operations per second. These numbers may appear meager when compared to the capabilities of modern desktop and server class computers running at 2–3 GHz frequencies and having large instruction and data caches, but we must be careful not to extrapolate between such vastly different computing technologies.
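
For reference, measurements of this kind can be taken with the DWT cycle counter that the Cortex-M3 provides. The following is a minimal sketch using the standard CMSIS register names; the operation being timed is left as a placeholder, and the device header name follows the Silicon Labs convention:

#include <stdint.h>
#include "em_device.h"      /* vendor device header; pulls in the CMSIS core definitions */

static uint32_t
measure_signal_dispatch(void)
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk ;    /* enable the DWT unit */
    DWT->CYCCNT = 0 ;                                   /* reset the cycle counter */
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk ;               /* start counting */

    uint32_t start = DWT->CYCCNT ;
    /* ... signal and dispatch the self-directed event here ... */
    uint32_t end = DWT->CYCCNT ;

    return end - start ;                                /* elapsed cycles */
}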

Performance Discussion

Performance comparisons in this realm are difficult to make. We have no benchmarks to test MX domain implementations. Direct comparisons are seldom possible, since providing the same functionality for an application in both a modeled and non-modeled implementation is so expensive.

The measurements presented here demonstrate that the performance of pycca-translated domains matches the computational facilities provided by many microcontroller-based systems. We have remained well within our memory targets and achieved a reasonable cycle count for the implementation of event signaling. Experience over many systems has shown that carefully tailoring the translation scheme and model execution domain to the target platform allows translated models to meet a project’s performance goals.

Supplying Implementation-Specific Code

When we model application logic, the scale of the problem is one of the implementation aspects that we do not consider. Whether a class has only a few instances or millions of instances does not change the fundamentals of the application logic. If a particular activity is required to find a class instance based on the value of an attribute, whether there are only a few instances or a great many instances does not affect the fact that we must find the required instance.

But when we consider the implementation of such searching, scale matters a great deal. When the number of instances of a class becomes large enough and the frequency at which we must search the instances to find a particular one increases, the simple sequential search provided by pycca macros can become a performance problem.

Consider our Air Traffic Controller class from Chapter 4:

class Air_Traffic_Controller
    attribute (Employee_ID ID)
    attribute (Name_T Name)
    attribute (Experience_Level Rating)
    # ...
    # other parts of the Air Traffic Controller Class
end

We could use a pycca macro to find an instance of Air Traffic Controller that matches a given ID:

ClassRefVar(Air_Traffic_Controller, atc) ;
PYCCA_selectOneInstWhere(atc, Air_Traffic_Controller, strcmp(atc->ID, "ATC-137") == 0) ;
if (atc >= EndStorage(Air_Traffic_Controller)) {
    // not found
} else {
    // found
}

This code performs a simple, sequential iteration across all the instances of Air Traffic Controller and compares the value of the ID attribute looking for a match. This approach is simple, already provided, and for a small number of instances, works well. As we scale up the number of instances of Air Traffic Controller, we will need something better.

We could, for example, use a binary search to reduce the number of comparisons. A binary search requires a particular ordering of the searched items. Pycca, however, organizes the initial instance population in memory in the order of definition. To convert our sequential search to a binary search, we need to order the initial instances of Air Traffic Controller by ascending order of the ID attribute. Let’s suppose our initial instance population is as follows:

table
Air_Traffic_Controller (Employee_ID ID) (Name_T Name) (Experience_Level Rating) R1
@atc51  {"ATC-51"} {"Ianto"}    {"C"}       -> On_Duty_Controller.atc51
@atc53  {"ATC-53"} {"Toshiko"}  {"A"}       -> On_Duty_Controller.atc53
@atc67  {"ATC-67"} {"Gwen"}     {"B"}       -> On_Duty_Controller.atc67
@atc77  {"ATC-77"} {"John"}     {"B"}       -> Off_Duty_Controller.atc77
@atc87  {"ATC-87"} {"Fred"}     {"B"}       -> Off_Duty_Controller.atc87
# ...
# many other controllers in ascending order of the ID attribute
# ...
end

By defining a class operation, we can add our own code to apply a binary search to find a matching Air Traffic Controller. The standard library bsearch() function requires the following:

  • A pointer to the search key

  • A pointer to the beginning of an array of items to be searched

  • The number of items in the array

  • The size of each array item

  • A function returning an integer that compares two items

We would define the class operation as part of the Air Traffic Controller class:

class Air_Traffic_Controller
    # ... other parts of the Air Traffic Controller Class
    class operation findByEmployeeID(char const *eid) :
            (struct Air_Traffic_Controller *) {
        struct Air_Traffic_Controller key = {
            .ID = eid
        } ;
        return (struct Air_Traffic_Controller *)bsearch(&key,
                BeginStorage(Air_Traffic_Controller),       // ❶
                ATCTRL_AIR_TRAFFIC_CONTROLLER_INST_COUNT,   // ❷
                sizeof(struct Air_Traffic_Controller),
                atc_compare_ids) ;                          // ❸
    }
end
implementation prolog {                                     // ❹
    #include <stdlib.h>
    #include <string.h>
    static int
    atc_compare_ids(void const *m1, void const *m2)
    {
        struct Air_Traffic_Controller const *atc1 = m1 ;
        struct Air_Traffic_Controller const *atc2 = m2 ;
        return strcmp(atc1->ID, atc2->ID) ;
    }
}
  • ❶ The BeginStorage macro resolves to the address of the storage array for the given class.

  • ❷ Pycca emits a macro definition for the number of instances of a class.

  • ❸ We must supply a comparison function for bsearch.

  • ❹ Placing the comparison function in the implementation prolog ensures that its definition appears before it is used in the class operation.

Now we can locate a reference to Air Traffic Controller, ATC-137, by using the following code:

ClassRefVar(Air_Traffic_Controller, atc) ;
atc = ClassOp(Air_Traffic_Controller, findByEmployeeID)("ATC-137") ;
if (atc == NULL) {
    // not found
} else {
    // found
}

There are, of course, other ways to implement this search. Note, however, that bsearch is suitable only for static instance populations. For a dynamic population of Air Traffic Controller instances, we might choose to keep a hash table. Then activity code that creates or deletes an Air Traffic Controller instance would also add or remove the instance reference from the hash table, which would be keyed by the ID attribute value. You would probably code the instance creation and hash table addition operations together into a class operation, and similarly with the instance deletion and hash table removal operations.
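
The following is a minimal sketch of such a hash table. The bucket count, hash function, and helper names are our own and are not part of pycca; entries are supplied by the caller so that no dynamic memory allocation is needed, and the complementary removal operation is omitted for brevity:

#include <string.h>

#define ATC_HASH_BUCKETS 64

struct atc_hash_entry {
    struct Air_Traffic_Controller *inst ;
    struct atc_hash_entry *next ;
} ;

static struct atc_hash_entry *atc_hash[ATC_HASH_BUCKETS] ;

static unsigned
atc_hash_id(char const *id)
{
    unsigned hash = 5381 ;
    while (*id != '\0') {
        hash = hash * 33u + (unsigned char)*id++ ;
    }
    return hash % ATC_HASH_BUCKETS ;
}

/*
 * Called from the class operation that creates an instance.
 * The caller supplies the entry storage.
 */
static void
atc_hash_insert(struct atc_hash_entry *entry, struct Air_Traffic_Controller *inst)
{
    unsigned bucket = atc_hash_id(inst->ID) ;
    entry->inst = inst ;
    entry->next = atc_hash[bucket] ;
    atc_hash[bucket] = entry ;
}

static struct Air_Traffic_Controller *
atc_hash_find(char const *id)
{
    for (struct atc_hash_entry *e = atc_hash[atc_hash_id(id)] ; e != NULL ; e = e->next) {
        if (strcmp(e->inst->ID, id) == 0) {
            return e->inst ;
        }
    }
    return NULL ;
}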

Our point here is that the implementation can be tailored to match the scale of the problem. Most important, the specific implementation mechanisms do not affect the model logic. If the model logic requires a particular Air Traffic Controller to be found, then the translation must choose the appropriate implementation for the search, and that choice is based, at least in part, on the number of instances that need to be searched. Searching is a well-researched problem in computer science whose results we can draw upon here.

Considering Other Platforms

In this book, we have remained focused on one particular target platform. This focus on a single platform has helped demonstrate concepts in translation without the burden of showing how the same effect is achieved using another implementation mechanism. In this section, we broaden our discussion to consider a different translation target. Space does not allow an extended discussion of the many possibilities of implementation technology that we might wish to use to meet the requirements of a system. To bound our discussion, we change only the platform requirement for how data is managed. We keep the programming language as C and the execution single threaded. Many of the concepts for our microcontroller target carry forward, and so we focus on how data management might be changed.

How data is held and managed is one of the key factors in determining the characteristics of an MX domain. In the microcontroller target platform, we decided that all domain data would be held in primary memory, directly addressable by the processor. Here, we change that requirement and insist that our alternative MX domain hold data in secondary storage. Many classes of application either have too much data to be held in primary memory or have other requirements to persist domain data into nonvolatile secondary storage.

In this section, we outline what an MX domain that holds domain data in secondary storage might look like. We do not present a complete MX domain in this discussion. Rather, we present a series of examples of how model-level data concepts might be implemented using a persistent data storage mechanism. Again, we must be precise about what the target platform supports:

  • The implementation language is C.

  • Domain data is managed using Berkeley DB.

  • Execution is single threaded.

  • The target hardware platform is a desktop or server class of processor.

  • We assume a POSIX operating system environment with GiBytes of primary memory and secondary disk storage at least 10 times the size of primary memory.

The main differences between the microcontroller target that we have been discussing and this new platform are the use of Berkeley DB to manage the domain data and the assumption of a much more capable computer running a fully featured operating system. We have purposely kept the implementation language and the single-threaded nature of the execution the same as our microcontroller target to avoid introducing other elements.

Berkeley DB is a general-purpose embedded database engine. The central concept in Berkeley DB is that of a persistent key/value data store in which keys and values are arbitrary byte arrays of data. The library is mature and well supported, and it provides features well beyond our uses in this example. Complete information on Berkeley DB can be found at http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html.

Mapping Domain Data to Berkeley DB

Our first task is to map model execution data concepts to Berkeley DB implementation mechanisms. Figure 10-2 shows how model data management concepts are mapped onto Berkeley DB facilities and how Berkeley DB uses the file system for persistent storage.

Figure 10-2. Mapping domain data to Berkeley DB

For this example, we use parts of the Lubrication domain from Chapter 6. We discuss each of these concepts and also show small code sequences to demonstrate how Berkeley DB functions might appear in the MX domain and how the implementation of the mapping of model data management onto Berkeley DB is realized in C code. We don’t expect you to be a Berkeley DB expert and recognize every library call. Rather, you can get a general feel for how the data management would be coded, and documentation of the database library calls is readily available for those who wish to delve deeper. You can also get a good sense of how different data management is in ST/MX, where everything is held in memory, compared to using a key/value pair storage mechanism.

This example deals with just two classes from the Lubrication domain: Injector and Machinery. To refresh your memory, Figure 10-3 is a fragment of the class diagram from the Lubrication domain.

Figure 10-3. Lubrication domain class diagram fragment

Following the mapping in Figure 10-2, we start by enclosing a domain’s data in a Berkeley DB environment. This construct suits our need to manage data for a single domain. An environment provides a grouping for databases and transaction capability, and its necessary files are stored in a single directory of the file system:

DB_ENV *lube_env = NULL ;
int dbres = 0 ;

dbres = db_env_create(&lube_env, 0) ;                          // ❶
if (dbres != 0) {
    handle_error(dbres, "Error creating environment handle") ; // ❷
}


dbres = lube_env->open(lube_env, "./lube_domain",              // ❸
        DB_CREATE | DB_INIT_MPOOL, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Environment open failed") ;
}
  • ❶ Most entities in the library are created first, before any other operations are performed.

  • ❷ For brevity, we assume some error-handling function.

  • ❸ To use the environment, we must open it.

This code creates the environment if needed (it may already exist), and all the files will be placed in the lube_domain directory.

A domain class is stored as a database. In Berkeley DB, a database is roughly the same as a table, which matches a view of class data consisting of a set of instances that form rows in a table. After an environment is open, we can create and open the databases that correspond to the classes. We show only code for the Injector class, but all classes for a domain would have their own database for instance storage:

DB *injdb ;

dbres = db_create(&injdb, lube_env, 0) ;                          // ❶
if (dbres != 0) {
    handle_error(dbres, "Failed to create injector database") ;
}


dbres = injdb->open(injdb, NULL,  "injector.db", NULL, DB_BTREE,  // ❷
        DB_CREATE, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Injector database open failed") ;
}
  • ❶ The database is created in the context of the environment for the Lubrication domain.

  • ❷ The database is stored in a file named injector.db.

Berkeley DB provides several choices for the details of how data will be stored. Here we have chosen a Btree for the underlying storage organization. This is common usage in Berkeley DB. Other choices might be better, depending on the details of the application demands for storage and access to the storage.

We represent an instance of a class as a Berkeley DB cursor. A cursor specifies a location in a database, can be used to access instance attribute values, and can iterate across instances. We discuss cursors later in this example. For now, it is sufficient to know that they can be considered (roughly speaking, again) as a reference to one or more instances.

As with the ST/MX domain for our microcontroller platform, we convert each class description into a C structure. The C structure for a class provides a convenient way to transfer values back and forth to Berkeley DB and still have direct access to the attributes. Berkeley DB treats key and data values as byte arrays, and variables of a C structure type can be used as a staging area in the transfer to and from the database. The C structure variables also provide convenient access to individual members when database values are held temporarily in memory. For our two classes in this example, the C structures would appear as follows:

typedef uint32_t uniqueID ;
typedef char InjModelName[16] ;
typedef char Name[32] ;


struct Injector {
    uniqueID ID ;                 // {I}
    unsigned Pressure ;
    bool Dissipation_error ;
    bool Injecting ;
    Name Default_schedule ;       // {R1}
    uniqueID Machinery ;          // {R5}
    uniqueID Reservoir ;          // {R3}
    InjModelName Model ;          // {R4}
} ;


struct Machinery {
    uniqueID ID ;
    bool Locked_out ;
} ;

Previously, we omitted identifying attributes from the C structure. Because ST/MX used its own identifier for an instance (that is, the pointer address of the instance in memory), we discarded identifying attributes if they were not otherwise used. Using Berkeley DB, we need a key to uniquely identify an instance, so we retain the identifiers in the model and use them as the key to the database storage. This is shown in Figure 10-4.

Figure 10-4. Mapping class storage to Berkeley DB
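
As a concrete sketch, storing an Injector instance keyed by its identifier might look like the following. We reuse the injdb handle and handle_error function from the earlier snippets; the attribute values are illustrative:

struct Injector injinst = {
    .ID = 101,
    .Pressure = 0,
    .Machinery = 300,
} ;

DBT key ;
DBT value ;
memset(&key, 0, sizeof(key)) ;
memset(&value, 0, sizeof(value)) ;
key.data = &injinst.ID ;                   /* the identifier is the key */
key.size = sizeof(injinst.ID) ;
value.data = &injinst ;                    /* the whole structure is the value */
value.size = sizeof(injinst) ;

dbres = injdb->put(injdb, NULL, &key, &value, 0) ;    /* insert or overwrite */
if (dbres != 0) {
    handle_error(dbres, "Failed to store injector instance") ;
}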

It is also possible for a class to have multiple identifiers. Consider using both a system-supplied identifier and a Customer’s e-mail address as identifiers. We map each additional identifier to a Berkeley DB secondary index. A secondary index is associated with a primary database. This is shown in Figure 10-5.

Figure 10-5. Using a secondary index for an alternate identifier

The secondary index is stored like any other database. The key portion of the index record is the value of the alternate identifier. The value portion of the index record is the value of the primary identifier. Berkeley DB arranges it so that each time a record is inserted into the Customer Database, a corresponding record is inserted into the Customer Email Index. Looking up a Customer by e-mail address involves obtaining a record from the Customer Email Index that matches an e-mail address and then using the value portion of that record as the key to look up a record in the Customer Database.
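
A sketch of how such an alternate identifier might be implemented follows. The Customer structure, the custdb and custemaildb handles, and the getCustomerEmail callback are all hypothetical, and we assume both databases have been created and opened in the manner shown earlier:

struct Customer {
    uniqueID ID ;           /* primary identifier */
    char Email[64] ;        /* alternate identifier */
} ;

/*
 * Key extractor for the secondary index: the key is the e-mail address.
 */
static int
getCustomerEmail(DB *sdb, DBT const *pkey, DBT const *pdata, DBT *skey)
{
    struct Customer const *cust = pdata->data ;
    memset(skey, 0, sizeof(*skey)) ;
    skey->data = (void *)cust->Email ;
    skey->size = strlen(cust->Email) + 1 ;
    return 0 ;
}

/*
 * Associate the e-mail index with the primary Customer database.
 */
dbres = custdb->associate(custdb, NULL, custemaildb, getCustomerEmail, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to associate customer email index") ;
}

/*
 * Look up a Customer by e-mail address through the secondary index.
 * Reading from a secondary returns the associated primary record.
 */
struct Customer custinst ;
char email_key[] = "pat@example.com" ;
DBT key ;
DBT value ;
memset(&key, 0, sizeof(key)) ;
memset(&value, 0, sizeof(value)) ;
key.data = email_key ;
key.size = sizeof(email_key) ;             /* includes the terminating NUL, matching the index key */
value.data = &custinst ;
value.ulen = sizeof(custinst) ;
value.flags = DB_DBT_USERMEM ;

dbres = custemaildb->get(custemaildb, NULL, &key, &value, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Customer email lookup failed") ;
}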

How information is stored to support relationship navigation is another important aspect of domain data management. Returning to the Lubrication domain class diagram fragment, consider navigating association R5 from an instance of Injector to an instance of Machinery. The multiplicity of the association establishes that we expect to obtain exactly one instance of Machinery. Figure 10-6 shows how an attribute from an Injector instance is used as a key into the Machinery database.

Figure 10-6. Navigating to a machinery instance

The code for navigating R5 in this direction might appear as follows:

struct Injector injinst ;
DBT key ;
DBT value ;


memset(&key, 0, sizeof(key)) ;
memset(&value, 0, sizeof(value)) ;
value.data =  &injinst ;                                        // ❶
value.ulen = sizeof(injinst) ;
value.flags = DB_DBT_USERMEM ;


// Assume "injcursor" is positioned in the Injector database.
dbres = injcursor->get(injcursor, &key, &value, DB_CURRENT) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to dereference injector cursor") ;
}


DBC *machcursor = NULL ;                                        // ❷
dbres = machdb->cursor(machdb, NULL, &machcursor, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to create machinery cursor") ;
}


memset(&key, 0, sizeof(key)) ;                                  // ❸
key.data = &injinst.Machinery ;
key.size = sizeof(injinst.Machinery) ;
memset(&value, 0, sizeof(value)) ;


// Position the Machinery cursor to the record matching the
// value of the Machinery attribute of the Injector instance.
dbres = machcursor->get(machcursor, &key, &value, DB_SET) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to set machinery cursor") ;
}
  • ❶ The attribute values of the Injector instance are retrieved into a local variable of type struct Injector.

  • ❷ Create a cursor into the Machinery database.

  • ❸ The key for positioning the cursor is the value of the Injector.Machinery attribute.

We assume that we have a cursor into the Injector Database that locates the starting instance for the navigation across R5. Using the cursor, we fetch the value of the database record; in our case, the value is all the data of an Injector structure. Contained within that structure is the Machinery member, whose value is used as a key into the Machinery Database. In this example, if the retrieved Injector structure has a Machinery member containing the value M685, we can position a cursor into the Machinery Database at the record whose key is M685. After we have a cursor located at the related instance of Machinery, we can use it to read or update the value portion of the record as needed.
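
For example, a minimal sketch of an in-place update through that cursor might look like the following; the change to the Locked_out attribute is purely illustrative:

struct Machinery machinst ;
DBT key ;
DBT value ;

memset(&key, 0, sizeof(key)) ;
memset(&value, 0, sizeof(value)) ;
value.data = &machinst ;
value.ulen = sizeof(machinst) ;
value.flags = DB_DBT_USERMEM ;

dbres = machcursor->get(machcursor, &key, &value, DB_CURRENT) ;    /* read the current record */
if (dbres != 0) {
    handle_error(dbres, "Failed to dereference machinery cursor") ;
}

machinst.Locked_out = true ;                                       /* modify an attribute value */

value.size = sizeof(machinst) ;
dbres = machcursor->put(machcursor, &key, &value, DB_CURRENT) ;    /* write it back in place */
if (dbres != 0) {
    handle_error(dbres, "Failed to update machinery instance") ;
}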

To navigate R5 starting at an instance of Machinery, we must determine how we are going to handle multiple instances of Injector related to a given instance of Machinery. The R5 association is “1..*” on the Injector side, so there can be many records in the Injector database with the same value of the Machinery attribute. The brute-force approach would be to scan the entire Injector Database, reading each record and looking for those records in which the Injector.Machinery value matched the value of Machinery.ID of our starting instance.
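
A sketch of this brute-force scan, assuming the injdb handle opened earlier and a local variable machinst holding the attribute values of the starting Machinery instance, might look like this:

DBC *scancursor = NULL ;
dbres = injdb->cursor(injdb, NULL, &scancursor, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to create injector scan cursor") ;
}

struct Injector injinst ;
DBT key ;
DBT value ;
memset(&key, 0, sizeof(key)) ;
memset(&value, 0, sizeof(value)) ;
value.data = &injinst ;
value.ulen = sizeof(injinst) ;
value.flags = DB_DBT_USERMEM ;

/*
 * Read every Injector record and compare its Machinery attribute.
 */
while ((dbres = scancursor->get(scancursor, &key, &value, DB_NEXT)) == 0) {
    if (injinst.Machinery == machinst.ID) {
        /* found a related Injector instance */
    }
}
if (dbres != DB_NOTFOUND) {
    handle_error(dbres, "Injector scan failed") ;
}
scancursor->close(scancursor) ;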

Fortunately, we can do better. Berkeley DB supports two concepts that can be used to navigate a relationship so that the result will yield more than one instance. The idea is to create a secondary index for the referential attributes that formalize a relationship and then to join across that secondary index. The secondary index is configured to allow duplicate keys, and the join operation will create a cursor that can access the multiple matching instances. This is shown in Figure 10-7.

Figure 10-7. Navigating R5 from Machinery to Injector

Here the R5 Index uses Injector.Machinery as the key. So we must allow duplicate keys in the index. The value portion of an R5 Index record holds the value of an Injector ID. In the example, the R5 Index shows that the Machinery instance, M300, is lubricated by Injector 101 and Injector 102.

First we must create the R5 Index:

DB *R5db ;

dbres = db_create(&R5db, lube_env, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to create R5 index") ;
}
dbres = R5db->set_flags(R5db, DB_DUP | DB_DUPSORT) ;                // ❶
if (dbres != 0) {
    handle_error(dbres, "Failed flag setting on R5 index") ;
}
dbres = R5db->open(R5db, NULL, "R5.db", NULL, DB_BTREE, DB_CREATE, 0) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to open R5 index") ;
}
  • ❶ We must allow for duplicates. The 1..* multiplicity of R5 on the Injector side means, in general, many instances of Injector will share the same value of Machinery.

Now we can associate the R5 Index to the Injector Database as a secondary index:

dbres = injdb->associate(injdb, NULL, R5db, getMachineryID, 0) ;      // ❶
if (dbres != 0) {
    handle_error(dbres, "Failed to associate R5 index to injector") ;
}
  • ❶ The getMachineryID function constructs the key for the secondary index.

As R5 Index records are added, we must supply a key for the record. The value portion of the record is already known: it is the key portion of the associated primary database. The key for the R5 Index is just the value of the Machinery attribute of the associated Injector instance:

static int
getMachineryID( DB *R5db,
    DBT const *pkey,
    DBT const *pdata,
    DBT *skey)
{
    memset(skey, 0, sizeof(*skey)) ;
    skey->data = &((struct Injector *)pdata->data)->Machinery ;        // ❶
    skey->size = sizeof(uniqueID) ;
    return 0 ;
}
  • ❶ Here we set the key value for the R5 Index to be the same as the value of the Machinery attribute of Injector.

To navigate R5 from Machinery to Injector, we start with a cursor into the Machinery Database. Retrieving the value at the cursor, we can use Machinery.ID as a key to position a cursor into the R5 Index. In our example for Machinery, M300, there are two records in the R5 Index the cursor would access. The R5 Index cursor is then joined to the Injector Database. The join operation establishes a cursor into the Injector Database that accesses all the values of Injector.ID for the records in the R5 Index where the key is the same value as that referenced by the R5 Index cursor. In our example, the join yields a cursor that can be used to access Injector records that have keys of 101 and 102.

// Set up data areas to get the value of the Machinery instance attributes.
DBT key ;
memset(&key, 0, sizeof(key)) ;


struct Machinery machinst ;
DBT value ;
memset(&value, 0, sizeof(value)) ;
value.data = &machinst ;
value.ulen = sizeof(machinst) ;
value.flags = DB_DBT_USERMEM ;


dbres = machcursor->get(machcursor, &key, &value, DB_CURRENT) ;  // ❶
if (dbres != 0) {
    handle_error(dbres, "Failed to dereference machinery cursor") ;
}


DBC *R5cursor = NULL ;
dbres = R5db->cursor(R5db, NULL, &R5cursor, 0) ;                 // ❷
if (dbres != 0) {
    handle_error(dbres, "Failed to create R5 cursor") ;
}


memset(&key, 0, sizeof(key)) ;

key.data = &machinst.ID ;
key.size = sizeof(machinst.ID) ;
memset(&value, 0, sizeof(value)) ;


dbres = R5cursor->get(R5cursor, &key, &value, DB_SET) ;           // ❸
if (dbres != 0) {
    handle_error(dbres, "Failed to set machinery cursor") ;
}


DBC *joincursors[2] = {
    R5cursor,
    NULL
} ;


DBC *navcursor = NULL ;
dbres = injdb->join(injdb, joincursors, &navcursor, 0) ;           // ❹
if (dbres != 0) {
    handle_error(dbres, "Failed to join across R5") ;
}


memset(&key, 0, sizeof(key)) ;

struct Injector injinst ;                                          // ❺
memset(&value, 0, sizeof(value)) ;
value.data = &injinst ;
value.ulen = sizeof(injinst) ;
value.flags = DB_DBT_USERMEM ;


while ((dbres = navcursor->get(navcursor, &key, &value, 0)) == 0) { // ❻
    printf("Injector ID = %u Machinery = %u ", injinst.ID, injinst.Machinery) ;
}
  • ❶ Assume that machcursor has been set to reference a Machinery instance. This function gets the current Machinery instance values. In other words, we dereference the cursor.

  • ❷ Create a cursor into the R5 secondary index.

  • ❸ Set the cursor to the beginning of the entries in the R5 index that match the Machinery instance ID.

  • ❹ Join across the R5 cursor instances. This creates a new cursor to access the related Injector instances.

  • ❺ In the iteration at ❻, the Injector attributes are placed in a local variable for convenient access to the attributes.

  • ❻ Iterate over the join cursor to access the related instances of Injector. The get function returns nonzero when all the joined records have been fetched.

Ensuring referential integrity is the final concept we consider. In the ST/MX domain for our microcontroller target, no provisions were made to check the referential integrity between instances at runtime. The MX domain assumes that the model gets that right, and the translation provides no additional assurances. This is the customary trade-off made for these types of targets. The code and data required to enforce referential integrity are large enough, and the amount of dynamic activity in the applications deployed on such targets is small enough, that the trade-off is to verify referential integrity by scrupulous model review, simulation, and testing rather than at runtime.

However, we can do better in this particular MX domain. Berkeley DB supports the concept of a foreign-key index. In this arrangement, referential attributes can be used as keys in a secondary index, which is then used to restrict adding records unless the key for the record is present in an associated database. This enables us to enforce a limited form of referential integrity checking as a domain executes.

Referring back to our Injector/Machinery example, we would like to make sure that any record added to the Injector Database has a value for Injector.Machinery that matches a value of Machinery.ID from the Machinery Database. We have already constructed the R5 Index, and so we can enlist it to play another role as a foreign-key index. This is shown in Figure 10-8.

Figure 10-8. Foreign-key index for referential integrity

Each time a record is added to the Injector Database, Berkeley DB will add a record to the R5 Index, just as it would any secondary index. Because the R5 Index is also associated with the Machinery Database as a foreign-key index, the record is added only if its key value matches one of the existing key values in the Machinery Database. If there is no match when adding the record to the R5 Index, the insert fails and the record is not added to the Injector Database either. This behavior ensures that all Injector records refer to Machinery records that exist. Complementary actions are taken with records that are deleted.

We can now associate the R5 Index as a foreign-key index to the Machinery Database:

dbres = machdb->associate_foreign(machdb, R5db, NULL, DB_FOREIGN_ABORT) ;
if (dbres != 0) {
    handle_error(dbres, "Failed to associate R5 index as foreign key") ;
}

In this section, we have presented only the barest sketch of how Berkeley DB could be employed as a data-management component in an MX domain. Clearly, much more would have to be done to fully develop an MX domain based on these ideas. The code examples shown were specific to the Injector/Machinery example. In an actual MX domain, the code operations would be generalized to work on all classes of a domain. To get a sense of the data required to generalize the data management, we consider a platform-model fragment that deals with the ideas presented here.

Platform-Model Differences

The manner in which we manage class data by using Berkeley DB varies considerably from keeping all the data in primary memory. In the case of ST/MX, we discard identifiers and use pointers to store the necessary information for relationship navigation. The platform model for ST/MX reflects these choices by having classes directly related to references to class instances. This was shown in the platform-model fragment in Chapter 9.

In the Berkeley DB example, we decided to use the identifiers and referential attributes to create key/value databases, secondary indices, and foreign-key indices. This usage mapped conveniently onto facilities provided by Berkeley DB. We don’t consider this to be a lucky coincidence. The ideas of identifiers and referential attributes are fundamental to the relational model of data, which rests on a substantial body of research and mathematics and has proven valuable in many contexts. Berkeley DB is just another example, made more interesting because the ideas are applied in the context of a key/value storage mechanism. The platform models for these two cases differ considerably. Figure 10-9 shows a fragment of a platform model that could be used in conjunction with an MX domain managing data with Berkeley DB.

Figure 10-9. Platform model for class identifiers and references

We reiterate that, despite the class names, the classes in this model correspond to platform-specific entities. So, the Class class in this model is a platform-specific counterpart to a model-level Class.

The R4 association requires that each Class have at least one Identifier. Identifiers consist of one or more Attributes (R5). An Attribute may be part of more than one Identifier (or even no Identifier). This is not an unusual circumstance, although it does not appear in this example. Taking the Injector/Machinery example, Tables 10-5, 10-6 and 10-7 show a population of these platform-model classes that establishes the classes, identifiers and attributes, respectively. Note that we are showing only the population for the fragment of the example. For the entire domain, there would be many other table rows for the other classes.

Table 10-5. Class Population

Domain        Name
Lubrication   Injector
Lubrication   Machinery

Table 10-6. Identifier Population

Domain        Name        Number
Lubrication   Injector    1
Lubrication   Machinery   1

Table 10-7. Attribute Population

Domain        Class       Name                Type
Lubrication   Injector    ID                  ID
Lubrication   Injector    Pressure            MPa
Lubrication   Injector    Dissipation_error   bool
Lubrication   Injector    Injecting           bool
Lubrication   Injector    Default_schedule    ID
Lubrication   Injector    Machinery           ID
Lubrication   Injector    Reservoir           ID
Lubrication   Injector    Model               Name
Lubrication   Machinery   ID                  ID
Lubrication   Machinery   Locked_out          Boolean

The population of the Attribute table gives us enough information to define a C structure for each class. So for each Class, we can define a structure in which the members are named the same as the Name attribute value of the Attribute class, and the corresponding data type for the member is given by the Type attribute value.

Each Attribute used in an Identifier results in an instance of R5 and consequently an instance of Identifying Attribute, as shown in Table 10-8.

Table 10-8. Identifying Attribute Population

Domain        Name        Attribute   Number
Lubrication   Injector    ID          1
Lubrication   Machinery   ID          1

We can use the Identifying Attribute data to determine the set of attributes used as the key portion of a Berkeley DB database that would store the class instances. In this example, there is only a single attribute, but the technique can be extended to account for multiple attributes in an identifier. This information would also be used to create any secondary indices required for additional identifiers. In the example, each class has only a single identifier, and it is used as the key for the database storing the class instances. Additional identifiers would show up as Number attribute values other than 1.
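
For instance, a compound key for a hypothetical class whose identifier has two attributes might be assembled as follows; a real generator would also have to account for structure padding so that keys compare consistently:

/*
 * Hypothetical class with a two-attribute identifier.
 */
struct Duty_Crew_key {
    uniqueID Station ;      /* identifying attribute 1 */
    uniqueID Shift ;        /* identifying attribute 2 */
} ;

struct Duty_Crew_key crewkey = {
    .Station = 3,
    .Shift = 2,
} ;

DBT key ;
memset(&key, 0, sizeof(key)) ;
key.data = &crewkey ;
key.size = sizeof(crewkey) ;
/* The compound key is then used with put/get calls exactly as a single-attribute key. */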

In the example model fragment, the Injector class is associated with the Machinery class. In this association, the Injector class serves the role of Referring Class, and the Machinery Class serves the role of Referenced Class. We can distinguish those roles because it is the Injector class that contains the referential attribute referring to an identifier in the Machinery class. Our platform model would be populated as shown in Table 10-9 and Table 10-10.

Table 10-9. Referring Class Population

Domain        Class      Relationship   Role
Lubrication   Injector   R5             Referring

Table 10-10. Referenced Class Population

Domain        Class       Relationship   Role
Lubrication   Machinery   R5             Referenced

An Identifying Attribute may be referenced when it is an identifier for a class serving the role of a Referenced Class in a relationship. Each time that happens, an instance of Referenced ID Attribute is created, as shown by R11. This class represents an identifier being referenced (as opposed to just serving as an identifier for its class). If you have an identifier that is referenced, there is also a referential attribute performing the reference, as shown by R12. The corresponding instance of Attribute Reference represents the referential attribute having the same value as its referenced identifying attribute.

For our example, the Attribute Reference and Referenced ID Attribute populations are shown in Table 10-11 and Table 10-12.

Table 10-11. Attribute Reference Population

Domain        Referring Class   Referring Attribute   Referring Role   Referenced Class
Lubrication   Injector          Machinery             Referring        Machinery

Referenced Attribute   Referenced Role   Referenced ID Number   Relationship
ID                     Referenced        1                      R5

Table 10-12. Referenced ID Attribute Population

Domain        Class       Relationship   Role         Attribute   Number
Lubrication   Machinery   R5             Referenced   ID          1

These tables give us the information we would need to create Berkeley DB secondary indices and foreign-key indices. Using the Attribute Reference information, we must create a secondary index for each Relationship. The primary database would be the one corresponding to the value for the Referring Class attribute (Injector, in this case). The key to the secondary index would be the value of the Referring Attribute attribute (Machinery, in this case). With our example values, we would create a secondary index on the Injector database by using the Machinery attribute of Injector as the key for the secondary index. This was the code sequence we showed previously. There is also sufficient information here to generate the code for the callback function supplying the key for the secondary database. In the preceding example, the callback was named getMachineryID. Careful examination of this function shows that the name of the class, the name of the referring attribute, and the type of the referring attribute can be parameterized for code generation.
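
One way this parameterization might be realized is with a macro that is expanded once per relationship; the macro and the generated function name below are ours and not part of pycca. Expanding it for R5 yields a function equivalent to the getMachineryID callback shown earlier:

/*
 * Hypothetical generator of a secondary-index key extractor, parameterized
 * by class name, referring attribute name, and referring attribute type.
 */
#define DEFINE_REF_KEY_EXTRACTOR(class_, attr_, type_)                  \
    static int                                                          \
    get##class_##attr_(DB *sdb, DBT const *pkey, DBT const *pdata,      \
            DBT *skey)                                                  \
    {                                                                   \
        memset(skey, 0, sizeof(*skey)) ;                                \
        skey->data = &((struct class_ *)pdata->data)->attr_ ;           \
        skey->size = sizeof(type_) ;                                    \
        return 0 ;                                                      \
    }

DEFINE_REF_KEY_EXTRACTOR(Injector, Machinery, uniqueID)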

Using the Referenced ID Attribute information, we would associate the secondary index created for each Relationship as a foreign-key index of the database corresponding to the value of the Class attribute (Machinery, in this case). Again, with our example values, the secondary index created for the R5 relationship would be associated as a foreign-key index to the Machinery class, because it is that class that is referenced by the keys of the secondary index.

Alternate MX Design Discussion

We have briefly and incompletely shown how an MX domain might use a key/value store to manage domain data and provide the basis for supporting other platforms. Expanding on this example, we can enumerate the general points of our approach to building translation technology:

  1. Choose the appropriate implementation technology for the class of applications to be deployed. The choice of implementation technology is rarely made in a vacuum. Most project teams have produced systems similar to the one they are undertaking. They have a good idea, even if it is not written down, of the scale and appropriate computing technology that their application will require. One is likely to fail deploying an online web store on a microcontroller. One is just as likely to fail deploying a pacemaker on a laptop.

    Implementation choices often are made at the beginning of a project, even before basic requirements are well understood. We suspect that such early, non-requirements-driven choices confuse activity with the decisions that produce real progress toward completing the project.

  2. Map out how model-level actions will be implemented in the MX domain. Some model-level operations may be directly supported in the implementation language. Others will require designing mechanisms in the implementation language or incorporating prebuilt components. The model execution rules are the same in all cases. What differs is the manner in which they are implemented and the resulting computational capabilities provided for the domains translated onto the MX domain.

  3. Develop a platform model that captures the essential characteristics of the platform and supplies the MX domain with the required data. It is important that the platform model be accessible in a way that supports ad hoc queries, and so some relational-based implementation is easiest.

  4. Design and write a DSL to populate the platform model. There are many ways to create language-based solutions. The venerable LALR(1) parser generators are typified by yacc. There are many other parser generators, such as antlr. There are also parser construction techniques based on parsing expression grammars (PEG). The implementation language of the DSL should be chosen for convenience and does not have to be the same as the target language of the MX domain.

  5. Write a code generator to produce the code and data that supports the model execution rules and drives the MX domain actions. As we implied in Chapter 9, code generation can be accomplished by template expansion, and most modern languages have libraries to support generating output based on templates. The template expansion queries the populated platform model to find the information needed. The code generator can be a separate part of the translation or, as with pycca, be invoked immediately after the platform model is populated.

  6. Write the runtime code of the MX domain itself. This code will use the data produced by the code generator to manage the model data and execution sequencing.

We don’t pretend this is a trivial process, but it is one that can be accomplished by an individual or small team managed as its own project. Furthermore, a robust MX domain and the ability to translate models onto it represents a valuable resource with potential for reuse. Writing code that writes code is thus a highly leveraged, if somewhat abstract, undertaking. Rosea, also available through the book site, is another example of this style of translation program in which the target language is Tcl.

Remember that the xUML model execution rules are the same, regardless of the means and mechanisms used to implement them. But you are always free to choose a different way to implement these rules, as long as you can guarantee that those rules work correctly. The goal is the production of an MX domain and its platform-model-based code generator that satisfies the scale and performance requirements of the class of applications you expect to deploy and has the implementation characteristics required in the deployed system.

Our approach emphasizes first establishing the required, often nonfunctional, characteristics of the implementation. Then a platform that supports the model execution rules can be constructed. Finally, we can then translate the logic of the models onto the platform with assurance that the system will have both the means to execute the logic of the models and acceptable performance characteristics when deployed to the field. Ideally, we would prefer that the entire workflow be integrated, automated, modular, and facile to use. We do not see that currently and cannot bank on the future. We must develop software systems in the present, given what is available, and cope with the engineering trade-offs as we encounter them. The demands for increasingly complex software systems will continue to accelerate, and the variety of implementation technologies, both hardware and software, will continue to grow at staggering rates. Our approach represents one attempt to draw upon this astounding advance of technology while maintaining a strict partitioning between logic and technology and with a keen focus on producing quality, working software.

Summary

We discussed the design and implementation of the pycca program itself. Pycca is designed as a language processor that reads DSL statements to populate a platform model and generates code using a template system that queries the populated platform model. It is implemented in the Tcl language.

We presented some size and execution speed measurements of the ALS system on a representative microcontroller platform to show that the resulting memory usage and execution speed are appropriate for our targeted platform.

Finally, we provided a quick overview of a target platform using Berkeley DB. Model execution domain mechanisms for how class instances are held and accessed and how relationships are navigated were mapped onto Berkeley DB facilities. Support for a limited form of referential integrity checking using Berkeley DB facilities was shown. We also presented a fragment of a platform model, demonstrating the information needed to support code generation for the Berkeley DB approach. This demonstrated how different platform implementation approaches would be reflected in distinct platform models.
