© Leon Starr, Andrew Mangogna and Stephen Mellor 2017

Leon Starr, Andrew Mangogna and Stephen Mellor, Models to Code, 10.1007/978-1-4842-2217-1_11

11. The Translation Landscape

Leon Starr(1), Andrew Mangogna(2) and Stephen Mellor(1)

(1)San Francisco, California, USA

(2)Nipomo, California, USA

In this book, we have shown, by a series of examples, how an executable model can be nondestructively translated into a running program. Our translation technique is one way to obtain code from models. We do not claim it to be the only way to translate models. Nor do we claim it to be the best way to translate. The techniques we have presented, like all software engineering processes, have benefits as well as drawbacks. But we have met our goal of producing running code that satisfies the constraints of our target platform by translation of an executable model. We consider that important because, to excerpt from the Agile Manifesto1:

We have come to value:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

In our experience, tools that purport to be comprehensive aren’t and do not substitute for individual engineering skill. And regardless of how insightfully an executable model captures the logic of a domain, modeling is a necessary, but not sufficient, step in the development of software. We do value tools, development processes, and well-documented models but, in the spirit of the Agile Manifesto values, we strive foremost for high-quality, working software produced by skilled individuals.

In this chapter, we take a broader view of model translation and examine how this might be accomplished in the general case. We start by presenting a reference workflow for translation. We use the reference workflow as an opportunity to discuss some of the difficulties encountered along the translation path. We show how pycca compares to the reference workflow. Finally, we discuss how our approach to translation might be applied to other target platforms.

A Reference Workflow for xUML Translation

Figure 11-1 gives a broad view of the tasks and work products necessary to translate xUML models into code. This diagram is intended to illustrate all of the key elements that must be present, in some form or other, in any xUML translation system. The pycca approach that we have described is a step in the direction of this idealized workflow. Later in this chapter we will show how the pycca approach fits into this broader context.

Figure 11-1. Reference translation workflow

The top of the diagram is divided into platform-independent and platform-specific areas. Moving left to right within the platform-independent section, we have a sequence of modeling tasks completed in the indicated order for a given iteration of the system. We do not want to imply that there is only one iteration. We advocate an agile approach to system development, but also carefully distinguish between the technical process and the project management process. Each iteration through the platform-independent section delivers an internally consistent, fully executable, modeled version of the system. The frequency and content of each iteration, how project team members are allocated to tasks, and how the results of each iteration are used to solicit feedback and additional requirements are all part of the project management process. These are important aspects of any real-world project, but ones each project must plan for itself. The discussion here is strictly about the technical aspects of the workflow.

Jumping across to the far right of the platform-specific section is the code that must be written by hand, acquired from a third party, previously existing, or generated from non-xUML models. To the extent that requirements are known, the non-xUML code can be developed in parallel with the xUML modeling track.

The remaining Mark task must be performed when a consistent fragment of the domain model has been completed. Here, those modeled components that merit special consideration for performance reasons during code generation are annotated. For example, you might indicate which classes have large vs. small populations so that appropriate code can be generated, assuming the target platform model provides facilities to discriminate between these cases.
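To make the idea concrete, here is a minimal sketch of what a set of marks might look like when kept apart from the models themselves. The mark names ("population", "capacity") and the particular values are our own hypothetical examples, not the vocabulary of any real translation system:

```python
# Marks annotate model elements without modifying them: a simple mapping
# from (domain, class) to code-generation hints. All names are illustrative.
marks = {
    ("Lubrication", "Injector"):  {"population": "small", "capacity": 8},
    ("Lubrication", "Reservoir"): {"population": "small", "capacity": 2},
    ("SIO", "Continuous Input Point"): {"population": "large"},
}

def mark_for(domain, klass, name, default=None):
    """Look up a mark, falling back to a default when a class is unmarked."""
    return marks.get((domain, klass), {}).get(name, default)
```

A code generator consulting `mark_for(...)` can then emit, say, a fixed-size array for a small population and a dynamic structure for a large one.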

That covers the top of the diagram. Now let’s move downward. As the models are created and edited, they are stored in a model repository. This repository is an implementation of the xUML metamodel. As such, the domain models and all populated instances themselves constitute the instance population of the metamodel.

Model marks may also be stored in the repository. This does not compromise the platform independence of the models, as the marks are stored as annotations cross-referenced against the models. So the model repository will permit multiple sets of marks, possibly for different platforms, to be associated with the same domain models.

The platform model defines the way xUML model elements are packaged and reorganized to perform efficiently on the target platform. It is populated by a program that transforms the xUML model elements into corresponding elements of the platform model. The platform model, its populator/transformer program, and the compiled MX runtime constitute the core models-to-code solution for a given class of platform.
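The shape of such a populator/transformer can be sketched in a few lines. The data structures below are hypothetical stand-ins for real metamodel and platform-model elements; a real transformer would handle many more element kinds:

```python
# A sketch of the platform-model populator: it walks xUML model elements
# (here, just classes with attributes) and creates corresponding platform
# elements. The dataclass names are illustrative, not from any real tool.
from dataclasses import dataclass, field

@dataclass
class XumlClass:                 # element extracted from the model repository
    name: str
    attributes: list

@dataclass
class PlatformStruct:            # corresponding element of the platform model
    c_name: str
    members: list = field(default_factory=list)

def populate(xuml_classes):
    """Transform xUML classes into platform-model structs."""
    structs = []
    for cls in xuml_classes:
        s = PlatformStruct(c_name=cls.name.replace(" ", "_"))
        s.members = [a.replace(" ", "_") for a in cls.attributes]
        structs.append(s)
    return structs
```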

So now the platform model has been populated from the marked xUML models extracted from the model repository. From here, code can be generated in the target programming language. This code is then compiled, along with the non-xUML code, to yield a set of object files. These object files are linked together along with the MX runtime to yield a complete system executable.

The MX runtime, you may recall, is the chunk of code that knows how to dispatch events, make state transitions, navigate relationships, and otherwise execute xUML models.

To summarize, the components that must be supplied to enable code generation are as follows:

  • A platform model

  • A program that populates the platform model from the xUML models

  • A program that generates code in a target programming language from the population of the platform model

  • A model execution runtime module
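The third component in the list, the code generator, is in essence a walk over the platform-model population that emits target-language text. A toy sketch, with a deliberately simplified population shape and an illustrative output format of our own invention:

```python
# A sketch of the code-generation component: it walks the populated
# platform model (here, a list of (struct name, member names) pairs) and
# emits C source text. A real generator would also map attribute types,
# emit relationship storage, state tables, and so forth.
def generate_structs(platform_population):
    lines = []
    for c_name, members in platform_population:
        lines.append(f"struct {c_name} {{")
        for m in members:
            lines.append(f"    int {m};")   # type mapping omitted for brevity
        lines.append("};")
    return "\n".join(lines)
```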

Key Challenges

The reference workflow presents an idealistic picture of the principal elements you need to build models and generate code from them. It only partially represents the reality of how model-based systems are built today. Unfortunately, many organizations get swept up in the ideals of model-based software engineering (MBSE) and commit to an expensive and potentially constraining model development workflow. Without prior experience, project teams proceed to acquire model-drawing tools and start making pictures because model diagrams are a key artifact of the workflow. However, modeling is about solving problems in logic, whereas diagrams are simply a means of capturing the problem solution in a form that directly contributes to producing a software solution. Later in the project, theory and reality often collide in an expensive cloud of disappointment. The result is that the gap between the diagrams and the code is filled by using one of the approaches discussed in Chapter 1 (namely, gradually or abruptly), neither of which achieves much benefit from modeling.

Inexperienced project teams should seek help in training the team to undertake such a qualitatively different way of building software. Yet, it is hard to convince project teams that have successfully built software using conventional techniques that modeling and translation require skills and thought processes they may not possess and that manipulating diagrams in a modeling tool will not provide those skills. In this section, we flag some of these ugly realities for you. Considering each stage of the translation workflow, we point out the common difficulties and challenges that we have seen in our many years of model-based development.

We do not mean to imply that modeling and translating are an immature or unworkable development approach just because difficulties exist in the state of the practice. Many project teams produce high-quality systems by using modeling and translation. We could easily enumerate an even longer list of problems with conventional software development techniques. We have purposely not done so because Internet sites document the horrors discovered in real-world programs more comprehensively than we could here and because highlighting problems in other approaches does nothing to solve our immediate concerns. Software development approaches are subject to passing technical fads, just like any other complex human undertaking. We won’t indulge in Pollyannaism aimed at convincing you that everything is smooth and easy. We don’t believe there is any silver bullet that will slay the software werewolf (see the papers “No Silver Bullet—Essence and Accident in Software Engineering” and “No Silver Bullet Refired” by Frederick P. Brooks, Jr.). Modeling and translation represent real, practical progress in grappling with the beast. But forewarned is forearmed, and we hope our concerns and frustrations with current practices in modeling and translation will save some project teams from their own disappointments.

Identify Domains

The first step in developing the software system is to identify all of the required domains. For translation purposes, we are interested in which domains are modeled and which ones are provided as code. Those provided as code could be hand coded, acquired from a third party, available as legacy code or existing libraries, or generated from models in a non-xUML language. Ultimately, a diagram called a domain chart is produced that inventories all domains and their dependencies.

The key challenge in this step is factoring the domains correctly. This is not a problem fixable with tools. You need experience and skill and the right approach to thinking about the problem.

When it comes to dividing up a large system, it is easy to lean on the crutch of familiar platform technology such as library, task, and hardware boundaries. These boundaries are just so tangible! But experienced developers know that technology changes all the time, and tangibility and frailty go hand in hand. “Don’t worry, the hardware design is frozen. . . ” Right. More to the point, a platform-specific partitioning yields platform-specific models, which defeats much of the purpose of modeling.

Even if platform boundaries are successfully ignored, it is also common to split up a system into functional rather than subject-matter categories. This mistake makes it difficult to develop solid class models, which are the foundation of each domain. And let’s not even get started on pointy-headed non-engineering boundaries such as managerial or political.

Here are a few guidelines that may help. For any prospective domain chart, consider these two questions:

  1. Would the domain chart change in any way if the platform technology changed? It should not. For example, what if you have two CPUs instead of one? What if you are using tasks and threads? What if you are running the system on a distributed platform?

  2. Is there any class that must simultaneously exist in more than one domain? There should not be any. A subject matter is defined by a vocabulary of classes and relationships. Such a class-based, rather than function-based, partitioning then demands that each class live in only one domain. In the ALS, the Injector pressure “exists” in both SIO and Lubrication; however, it means something entirely different in each domain. In SIO, there is nothing special about pressure; it is just data coming from an input point that needs to be scaled and converted. But in the Lubrication domain, pressure really is pressure, and there, the technology used to gather and convert it is of no consequence. Similarly, the Injector class lives only in the Lubrication domain. For example, if we had divided the Lubrication domain into Injecting Normally and Diagnostic Injection functions, each would need to share the Injector class.

Consider the ALS domain chart from Chapter 6. It is entirely platform independent. You could put all those domains in a single task, as we have done with pycca, or spread them across multiple processors or threads. The hardware onto which the software is deployed has no impact on the domain chart itself. Note that the domains are defined by what they know rather than what they do. The Lubrication domain knows about the lubrication equipment and how lubrication works. It knows nothing about systematic handling of alarms, it knows nothing about signal processing, and it knows nothing about user-interface technology. All the functionality in the Lubrication domain follows from what it knows about the way equipment must be lubricated. The SIO domain knows about signals and actuators, but assumes no specific meaning for any of the data in the signals or controls over the external world.

Ultimately, the best way to test a domain chart is to build a certain percentage of the class models inside each of the modeled domains as a validation exercise. During this process, it is common to rethink the partitioning and end up refactoring the domain chart. Had you not identified Signal I/O as a domain on the first pass, for example, you may have found that the state models for lubrication got excessively complex, constantly polling for new data. Each physical device with sensor-driven attributes would need to replicate the same polling or event-response patterns. When you see the same cookie-cutter patterns replicating across multiple state machines in a domain, it is usually a sign that a service domain is missing or some sort of domain refactoring needs to be done.

It is often the case that the coded or non-xUML domains are not complete as domains and require some kind of wrapper that may or may not be modeled. So additional modeling may be required in these domains, which must be interfaced with the supplied code.

Build and Document the Models

For each domain modeled in xUML, one or more analysts work together with subject-matter experts to model whatever subject matter is relevant to that domain. Figure 11-2 illustrates this concept.

Figure 11-2. The right talent is essential to successful modeling.

The sheer number of challenges in this phase requires an entire book—Executable UML: How to Build Class Models, by Leon Starr (Prentice-Hall, 2001)—but here are a few of the most significant:

  • Get the right talent working together.

  • Enter and edit the models productively.

  • Usefully document the analysis and models.

Use the Right Modeling Talent

Three distinct talents are necessary to build models: analysis, modeling, and subject-matter expertise. Rarely do all three reside in the same individual to an adequate degree, so it is generally necessary to get teams working together. The first mistake most projects make is to just grab a bunch of programmers, because those are the people who happen to be around, and assign them to modeling duty. This is a mistake because most programmers lack the proper skills, knowledge, or inclination to build models.

Analysis is the ability to ask questions about a subject matter, and to write, draw, and otherwise describe the subject matter. This requires good communication skills. The analyst is always trying to break down problems and find the interesting cases that almost never happen but must be handled correctly when they do. The analyst must have solid communication and presentation skills to elicit expert feedback. Most important, the analyst must discard preconceptions about a subject matter and routinely expose his or her ignorance of a topic in order to elicit the fine details from subject-matter experts. These are not skills typically cultivated by programmers. Programmers like to show up to the party with patterns and libraries in hand, ready to write code.

Modeling is the ability to take the analysis and formalize it in an objective, testable way. This requires skill at putting the platform-independent building blocks of classes, relationships, states, and so forth together. This sounds a bit like programming, but there is one key difference. Whereas a programmer is constantly coming up with clever ways to package things so that they work efficiently, the modeler is typically unpacking ideas so as to expose the critical and subtle differences. The modeler is concerned about efficiency, in the sense that an idea should be expressed as simply as possible and as clearly as possible, while simultaneously handling all the subtle cases uncovered in the analysis. Whereas the programmer often takes pride in having as few elements as possible in a program, packaging and hiding along the way, the analyst takes pride in unpacking and exposing complexity. That extra class, attribute, or relationship that reveals an overlooked case is a source of pride for the analyst.

Finally, subject-matter experts need not know anything about modeling. They just need to know their subject matter well and have the inclination and time available to explain it to the analyst/modelers. And, of course, it doesn’t hurt if they can read the models to verify that they are being understood.

On real projects, these three skills come in overlapping combinations with various individuals. The necessity is to get enough of all three skill sets together to get the models done correctly. The best results occur when you pair up a good analyst with a good modeler and give them access to at least one subject-matter expert.

Enter and Edit the Models Productively

As models are developed, they are ideally entered into a graphical editor. The editor stores the non-graphical model information in a repository. To make the model information available in a useful manner to downstream tools, such as those for code generation, the repository design should be based on the xUML metamodel.

Okay, so how hard can it be to draw boxes and arrows? Based on the current crop of model-editing tools, it turns out, surprisingly hard and painful. To see what we mean, compare the task of entering and editing graphical models to that of writing code in your favorite editor. Today’s programmer expects a lot from a code editor. Features such as autocomplete, refactoring, expand/collapse, search/replace, documentation lookup, syntax checking, keyword highlighting, and visual diff allow a programmer to move nimbly through a large, complex code base. The productivity a programmer experiences is not even in the same league as that of their modeler counterpart using a graphical editor.

Compare textual vs. graphical layout, for example. While a programmer effortlessly indents, closes braces, expands and collapses a function block, a modeler limps along pushing rectangles and arrows, trying to click connection points and shifting text labels a few pixels this way and that. You get carpal tunnel just watching a large model being rearranged.

To be fair, some graphical editors take some of the pixel-shifting drudgery away and perform varying degrees of automatic layout. Unfortunately, the geometric algorithms are still rather simplistic, yielding clunky layouts. This may sound like nitpicking, but geometric layout to an experienced modeler is every bit as important as text layout is to a programmer. Imagine the outrage of a programmer working with a bizarre indenting and line-wrapping scheme, with open and closed parentheses being scattered to the wind. Experienced programmers are quite careful about how they organize their code. Experienced modelers are no different.

More pertinent to translation is the storage of non-graphical model data. Many tools store this information either in a file using some sort of interchange format, in a database, or both. Our concern is the way that data is organized. Ideally, the organization of the database, or definition of the file format, should be derived directly from a model of the executable language (a metamodel). The language we are using is xUML. Most model editors use the UML standard, which brings with it a lot of stuff we don’t need. There is a standard for model interchange called XMI. But, again, it is intended to support the full UML standard. So it is filled with lots of content that we don’t need. Our preference would be to have model-level data in a commonly used implementation form on which we could directly operate, such as a population of a relational database. What we really need is a tool that uses an xUML metamodel as the basis for model storage.
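What we mean by "a population of a relational database" can be shown with a tiny, hypothetical slice of an xUML metamodel: each metamodel class becomes a table, and each model element becomes a row that downstream tools can query directly. The schema here is a drastic simplification for illustration only:

```python
# A minimal sketch of a metamodel-based repository: metamodel classes as
# tables, model elements as rows. Real metamodels have dozens of classes.
import sqlite3

repo = sqlite3.connect(":memory:")
repo.executescript("""
    CREATE TABLE class     (name TEXT, domain TEXT, PRIMARY KEY (name, domain));
    CREATE TABLE attribute (name TEXT, class TEXT, domain TEXT);
""")
repo.executemany("INSERT INTO class VALUES (?, ?)",
                 [("Injector", "Lubrication"), ("Reservoir", "Lubrication")])
repo.executemany("INSERT INTO attribute VALUES (?, ?, ?)",
                 [("ID", "Injector", "Lubrication"),
                  ("Lube level", "Reservoir", "Lubrication")])

# A downstream tool (a code generator, say) operates on the models by query.
classes = [row[0] for row in repo.execute(
    "SELECT name FROM class WHERE domain = 'Lubrication' ORDER BY name")]
```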

In the meantime, what is the modeler to do? It really depends on whether you are taking a pycca-like approach or using a comprehensive draw tool/translation environment. In the pycca case, you are going to encode your translation decisions by hand, anyway. Here you want a model editor that will let you specify xUML elements nicely, such as verb phrases on associations, identifiers, and referential attribute tags. You want something that isn’t trying to be too smart and preventing you from drawing a model the way you want it to look. Umlet, OmniGraffle, Visio, and DrawExpress are some choices. None of these tools stores any model semantics, but if you are proceeding to text, by hand, it doesn’t matter. On the other hand, if you are using a tool such as BridgePoint, you don’t have any choice. You have to live with the model editor provided, warts and all.

Usefully Document the Models

Models are not self-documenting (and neither is code, despite some programmer assertions to the contrary). They do expose application logic in a formalism that excludes implementation aspects. But what is the reasoning behind that logic? Why is this class or attribute necessary? Why is relationship R3 unconditional on the many side? The why’s need to be answered clearly.

For the model to be useful, it must be adequately documented. In fact, we would go even further and say that good model descriptions are more important than the model graphic. Only in the descriptions can the basis of the abstraction for the model be explained. The model graphics present the model in an information-dense form, but do so by using mnemonics such as class names, attribute names, and relationship phrases. Names in a model use common natural language words that require additional explanation to give them the meanings needed to build a precise vocabulary for the subject matter of the model. The precise meanings of the mnemonics are contained only in the descriptions, which need both text and informal diagrams. Without a precise meaning for the model terms, model readers will simply apply their own notions to the model based on their own understanding of the natural language words used in the model diagram. Because natural language relies so much on context for its precise meaning, confusion is the usual result.

Useful documentation goes way beyond short text descriptions. It includes the numerous analysis notes, copies of whiteboard discussions, and informal (non-model) sketches, mathematical formulas, and other supporting technical notes that are developed in the process of building the models. Without real-world context, models quickly lose their value. Without the inclusion, integration, and maintenance of adequate documentation, there isn’t much point in bothering to model in the first place.

Unfortunately, graphical model editors tend to attach slots to various model elements that are filled in like forms. This sounds reasonable at first. But in practice, it results in lower-quality model documentation. Modelers tend to fill in the slots in a robotic fashion without giving thought to the big picture. This tendency is frequently reinforced by project development rules that insist every slot be filled in like a Bingo card, resulting in little regard for the content. The descriptions tend to restate what is already obvious on the diagram. It is our experience that informal non-model diagrams are essential elements of good model descriptions. But most tools facilitate or accept only text. Finally, it is all too easy to look at a diagram and not see the underlying documentation or to delete model elements, inadvertently trashing pages of descriptions. True, good model documentation is hard work, but it is the foundation of a long-lived knowledge base; it is valuable intellectual property, essential for training project team members and the basis for future maintenance.

It is helpful to consider two distinct approaches in practice that have arisen to deal with code documentation. In one approach, documentation is placed in stylized comments directly in the source code file, and a software tool is used to extract and format the documentation. Literate programming, a concept introduced by Donald Knuth, takes the opposite view. In a literate program, the source code is placed into the document, and a software tool is used to extract and reorganize the code from the document. Both techniques can be used to produce good documentation. We prefer the literate program approach because it allows more flexibility for the order of presentation and for including supporting materials. The worked-out examples available as supplementary material for the book are literate programs.
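The extraction step of a literate program (often called "tangling") can be sketched in a few lines. The chunk syntax below is a toy invented for illustration; real literate-programming tools define their own markup:

```python
# A toy "tangle" step: named code chunks are embedded in a document, in
# whatever order suits the exposition, and a tool extracts and reorders
# them into program order.
import re

doc = """
The main routine comes last in the document but first in this explanation.

<<main>>=
print(greeting())
@

We explain the helper at leisure, then define it.

<<helper>>=
def greeting():
    return "hello"
@
"""

def tangle(document, order):
    """Extract named chunks and emit them in program order."""
    chunks = dict(re.findall(r"<<(\w+)>>=\n(.*?)@", document, re.S))
    return "".join(chunks[name] for name in order)

program = tangle(doc, ["helper", "main"])
```

Note that the document order (main first) and the program order (helper first) differ; the tool, not the author, pays that reordering cost.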

We develop our models using tools primarily intended to produce high-quality documents and import the model graphics as simply another diagram in that description. We do not want to imply that we think the model graphic is not useful. It is extraordinarily useful, but only when combined with the background and abstractions detailed in the model descriptions.

Specify Domain Mapping

A dependency between two domains on the domain chart is referred to as a bridge in xUML. The domain at the tail of an arrow plays the role of client to the domain at the head of the arrow, which plays the role of service. The direction of the arrow simply means that one domain exists for the express purpose of satisfying the needs of another. In the ALS domain chart in Chapter 6, for example, the Lubrication domain needs some way of interacting with the physical world, a need satisfied by the SIO domain.

The domain chart says nothing about what data and interactions are exchanged on a bridge. Furthermore, the models of one domain are designed to be unaware of any particular model structure in any other domain. This notion of domain-level encapsulation is a key organizational principle in xUML. That said, certain model elements in one domain will have corresponding elements across each connected bridge. These must be mapped together somehow, without muddying the interior of any one domain with knowledge of another domain’s model content.

The bridging technique we demonstrated is based on explicitly invoking actions on an external entity. Referring back to the Lubrication domain of Chapter 6, when an Injector began to apply lubricant, it made an explicit call to the Inject operation of the SIO external entity. The activity of the Start injection state makes clear its delegation of the physical injection process at a precise point in the model execution.

For some situations, sprinkling the model action language with external entity invocations detracts from the intent of the model and inhibits its potential for reuse. Consider the Reservoir state model from Chapter 6. Each of the state activities makes explicit calls to an ALARM external entity to reflect the status of the Reservoir. The intent of the actions is to mark the state of the Reservoir and signal other instances in the domain at key boundary conditions (that is, when things are Normal or when the Reservoir is Empty). The ALS system is required to post alarms at these critical junctures in the Reservoir life cycle, but the ALARM entity invocations are intrusive and have nothing to do with the essentials of the Reservoir class. Further, what is considered worthy of an alarm is subject to considerable requirements churn. Projects often start with the notion of alarming everything, only to find that overwhelming amounts of information cannot be digested into practical actions.

We would prefer to specify how the ALARM entity is notified apart from the action language of state activities. We want to say that after the LOW state activity of the Reservoir class executes, the Set lube level low operation is to be invoked on the ALARM entity. The action language for the LOW activity would no longer include the explicit invocation of Set lube level low, and the translation mechanism would arrange for the ALARM invocation based on our specification.

This type of bridging arrangement is called implicit bridging. The idea has much in common with the concepts of aspect-oriented programming (AOP). AOP has an entire vocabulary to describe the approach, which we do not discuss here.

Implicit bridging is intended to reduce the coupling between domains by removing at least some of the explicit external entity interactions. By providing a level of indirection in specifying where the external entity operations are invoked, the potential for reuse of the Lubrication domain is enhanced, because the dependency on an ALARM domain is no longer explicitly encoded in the domain actions. The invocation of ALARM operations could also be applied to other domains that had not been built with any alarm concepts in mind.

We see the process of defining explicit or implicit bridges between domains as the same. It is still necessary to mark, map, and populate the bridges. The difference is that implicit bridging uses a means apart from the action language of a state activity to specify when the bridge operation is invoked. The implicit means of specifying the bridge operations is also done relative to generic (metamodel) entities. In this example, we specified the ALARM interactions relative to state transitions. Other model-level elements are also candidates, such as creating or deleting class instances or signaling events.
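One possible shape for such an implicit bridge specification is a table keyed by metamodel-level elements, consulted by a generic dispatch hook. The names below follow the Reservoir example, but the specification format and hook are our own hypothetical sketch, not a feature of any existing tool:

```python
# A sketch of implicit bridging: ALARM invocations are declared apart from
# the state activities, keyed by (class, state), and a generic hook fires
# them after the modeled activity runs.

def set_lube_level_low(instance):          # stand-in for an ALARM operation
    instance.alarms.append("lube level low")

# The bridge specification: after that state activity, invoke this operation.
bridge_spec = {
    ("Reservoir", "LOW"): [set_lube_level_low],
}

def run_state_activity(instance, klass, state, activity):
    """Run the modeled activity, then any bridged operations for this state."""
    activity(instance)                                 # pure domain logic
    for operation in bridge_spec.get((klass, state), []):
        operation(instance)                            # bridged ALARM calls
```

The LOW activity itself stays free of ALARM references, so the Lubrication domain could be reused on a system with no alarm concepts at all simply by supplying an empty specification.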

Sadly, we see no substantial tool support for mapping and populating bridges. Explicit bridge operations are present, but we are not aware of any implicit bridging support. It is a significant complication to the model translation process that has not been overcome. We demonstrated in Chapter 8 that, despite the lack of a complete theory, it is nonetheless possible to tackle bridging methodically using nothing more than a spreadsheet application.

Populate the Models

It is common to document scenarios with real instances as you develop your models. The diagram of air traffic controllers introduced in Chapter 2, for example, or the lubrication configurations shown in Chapter 6 are quite typical.

Once an iteration of the models is complete, it is helpful to construct initial populations for it. In fact, you usually create multiple populations for various scenarios. In the ALS example, you could create one or two populations for testing and then another two or three for anticipated real-world configurations of the ALS. By careful selection of the attribute data values that influence the execution paths through the activities, test populations can be constructed to force a larger trace of execution than might happen with a population delivered for deployment.

Tools tend to focus on the development of the models, with the instance populations as an afterthought. Few tools provide quality facilities for specifying initial instance populations. Again, our advice is to put them in spreadsheets. The examples we have presented had small initial instance populations. When the number of initial instances is small, almost any strategy to handle them will work. However, as the number grows, it becomes more difficult to manage the populations.

We believe that initial instance populations should be specified entirely by the values of the attributes of the model classes. In our examples, all the model-level descriptions of initial instance populations were accomplished by treating the class as a table and specifying instances as rows of values. By contrast, we dislike the approach of having to compose action language to explicitly create class instances and assign the attribute values of the initial instance population. This approach is tedious and error prone, and does not scale well. Worse yet, because many attributes are given the same value, the motivation to use action language variables to hold repeated values is hard to resist. The values of these variables change during runtime and, unlike explicit attribute values, are difficult to verify by simple inspection.
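The class-as-table idea can be made concrete with a small sketch. The attribute and class names echo the ALS examples; the table encoding and loader are hypothetical, intended only to show that a population can be pure data, verifiable by inspection:

```python
# A sketch of an initial instance population specified purely as attribute
# values: each class is a table, each instance a row. No action language,
# no variables, no runtime computation.
population = {
    "Reservoir": (
        ("ID", "Capacity", "Lube level"),   # header: attribute names
        [
            (1, 5000, 4200),                # one row per initial instance
            (2, 5000, 3800),
        ],
    ),
}

def load(population):
    """Turn each class's table of rows into per-instance dictionaries."""
    instances = {}
    for klass, (header, rows) in population.items():
        instances[klass] = [dict(zip(header, row)) for row in rows]
    return instances
```

Because every value is written out literally, a reviewer can check the population against the equipment specification without tracing any execution.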

Another problem with the action language approach is that it excludes non-modelers from providing the initialization data. Consider an electric utility control system that requires a large amount of data to describe the transmission lines, substations, transformers, and other equipment required to distribute electric power. The control system will use the connectivity implied by the transmission specification data to route and otherwise manage the distribution of electric power. The utility staff that updates and manages the topology of the distribution grid is also the primary resource for obtaining correct instance population values. A database management system is critical at this scale, so translation tooling should either take initial instance population data directly from the database or at least provide an interface to which query results from a database can be included as a domain’s initial instance population.
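As a sketch of the database route, sqlite3 stands in here for whatever DBMS the utility actually uses, and the table, column, and attribute names are invented for illustration. Each query result row becomes one initial instance of a model class.

```python
import sqlite3

# Sketch: draw a domain's initial instance population from a database query
# rather than hand-written action language. Table and column names are
# hypothetical; sqlite3 stands in for the utility's real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transmission_line (line_id TEXT PRIMARY KEY, capacity_mw REAL)")
conn.executemany(
    "INSERT INTO transmission_line VALUES (?, ?)",
    [("L1", 400.0), ("L2", 250.0)],
)

# Each result row becomes one initial instance of the model class.
initial_population = [
    {"Line ID": line_id, "Capacity": capacity}
    for line_id, capacity in conn.execute(
        "SELECT line_id, capacity_mw FROM transmission_line ORDER BY line_id")
]
```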

For instance populations of an intermediate size, fashioning a domain-specific language (DSL) can help solve the instance management problem. In the SIO domain example, the initial instance population consisted of 50 instances of the various classes that specified the properties of 11 I/O Points. Directly specifying the attribute values was not an onerous task because the population was small.

But imagine a case with 100 I/O Points (not an excessive number even for a microcontroller-based system). Specifying values for all the attributes of the approximately 500 class instances now becomes a significant task. To reduce the amount of detailed knowledge required to directly populate the class instances, we could construct a DSL such as the following:

population SIO test_population_1 {
    Continuous_Input_Point inj1_pressure {
        group pressure_group
        thresholds {
            max_pressure
            inh4_pressure
        }
        scale ihn4_scaling
    }


    Conversion_Group  pressure_group {
        period 500 ms
        converter main_converter
    }


    Point_Threshold max_pressure {
        direction rising
        limit 20 MPa
        successive_over 2
        successive_under 3
    }


    Point_Scale ihn4_scaling {
        multiply 100
        divide 27
        intercept 100
    }


    # ... and other similarly styled declarations
}

Such languages should be declarative in nature and minimize the knowledge of the class model required to specify the attribute values. In this example, the fact that there are several generalization relationships in the model is hidden by focusing on the attributes of the leaf subclasses. In contrast to an action language, there is no order of execution. Variables are not used, and the output of the language processing is direct input to the translation mechanism. Such population languages are usually better suited to service domains, in which the potential for reuse is higher, than to a domain whose subject matter is tied closely to a particular application. We recommend the rule of three: when a domain is reused for the third time, stop and invest some effort in making that reuse more productive.

Although constructing DSLs is not a large undertaking, project teams must determine whether the investment offers a good return. Many tools are available for building small language processors such as this. The DSL designer can define a syntax and then use well-established lexical-analyzer and parser-generator techniques to form the core of the processing. Alternatively, some dynamic scripting languages, such as Tcl or Python, are well suited to hosting a DSL (sometimes called an internal DSL); the parser of the scripting language itself then parses the DSL. Language-based solutions to application configuration problems have a long tradition in software, and existing techniques can be employed for specifying an initial instance population.

Populate the Domain Mappings

The population of the mappings between domains that form the basis of how a bridge is realized may be known at translation time (when the code is generated from the model), determined at runtime, or a hybrid of both.

If the model elements involved in the mapping do not vary during the running of the system, the population of those mappings is specified before translation. When the initial instance population is determined, the mapping tables can be filled in. This was the case of the bridge mapping between the Lubrication and SIO domains in Chapter 8. For example, the mapping for the Inject and Stop injecting external entity operations was onto the Write Point domain operation of SIO for I/O Points that were defined as part of the initial instance population and did not vary as the system ran.

If the model elements of the domain mapping are created or deleted during the running of the system, it is not possible to populate the domain mappings before translation. In this case, the bridge code will populate the mapping at runtime. The bridge will include operations that map the creating and deleting of model elements in the client domain to the corresponding elements in the service domain and record the mapping for later use. Although the mapping table heading still describes the information that needs to be collected and maintained, it is not possible to specify the values of the tables before translation. The bridge code implementation is usually more complicated, as it must know when the creation and deletion of model elements occur, and uses more-sophisticated data structures to handle the dynamic nature of the half tables themselves.

Hybrid mappings can also occur: some of the mapping population is known at translation time, and the remainder is determined at runtime. The bridge code is then patterned after the dynamic case but may start with a non-empty half-table population that is augmented as the system runs.
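A runtime half table might be maintained roughly as follows. The `Bridge` class and its operation names are hypothetical; the point is that the bridge intercepts creation and deletion in the client domain and keeps the client-to-service mapping current.

```python
# Sketch of a bridge maintaining a client-to-service mapping at runtime.
# When the client domain creates an element, the bridge creates the
# counterpart element in the service domain and records the pair; deletion
# removes it. All names here are hypothetical.

class Bridge:
    def __init__(self, service_create, service_delete):
        self._map = {}                      # the dynamic "half table"
        self._service_create = service_create
        self._service_delete = service_delete

    def client_created(self, client_id):
        # Record the mapping for later use by bridge operations.
        self._map[client_id] = self._service_create(client_id)

    def client_deleted(self, client_id):
        self._service_delete(self._map.pop(client_id))

    def counterpart(self, client_id):
        return self._map[client_id]
```

For the hybrid case, `_map` would simply start out pre-populated with the translation-time portion of the mapping.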

Again, we find no substantial support for this activity in available tools, so we recommend spreadsheets as an easy and available way to enter and display the required mapping data. The example mapping tables from Chapter 8 can serve as a guide. A specialized database application based on a spreadsheet metaphor could also work well for larger mappings.

When populating the domain mappings, three data sets are being managed:

  • Model element populations from the client domain

  • Model element populations from the service domain

  • Mapping between the two populations of model elements

With three data sets in play, the challenge is that they might not quite match up. Filling in the data values of the mapping tables is the ultimate check on whether you have the populations and domain mappings right. Don’t be surprised if you need to revise things. For example, you may find an instance in the client domain that has no corresponding instance in the service domain and have to adjust your initial instance population for the service domain. You might also find that the way an instance is identified in a service domain can’t be determined from the elements of the client domain mapping, and you might have to adjust the domain mappings themselves. To avoid too many surprises, it is advisable for the modelers of a client and service domain to communicate routinely. This ongoing communication is essential to ensure that the assumptions and dependencies placed on the service domain by the client domain are accounted for. As the models progress, the conversation can be extended to ensure that proper domain mappings exist to realize all of the dependencies.

Marking

A complete iteration of the platform-independent models is marked with platform-specific features prior to translation. For illustration purposes, it is helpful to imagine marking as involving a transparent sheet laid over the top of the models and a box full of markers of different colors. The specific colors and number of markers available depend on the features provided in the platform model. For example, one platform model may prefer that you distinguish instance populations based on whether they max out at 1, 10, or 1 million for any given class so that the appropriate storage and access mechanism may be selected. Using the max population marker provided by that platform model, you mark classes accordingly. Note, however, that you aren’t marking the classes directly, thus the transparent sheet laid on top. That way, the models remain platform independent. It is only the marked-up sheet that has the marks. Because marks themselves can be abstracted as annotations, they can be stored in the metamodel. In fact, the same set of models can be marked differently for each potential target platform. Just swap the marking sheets.

Marking has two key challenges. There must be adequate types of marks available to tune the models for translation. There must also be a way to specify them without permanently embedding them into the models themselves.

In reality, we don’t use markers or transparent sheets, though these might still make a good user-interface design metaphor in some model-editing environment. Instead, there may simply be a text file that provides a keyword for each mark type and a list of affected model elements. In the pycca approach, the marking features are mixed into the DSL used to specify the design. While the markings on a particular model are interpreted to create a platform-specific implementation, they are just a particular kind of annotation that can be stored in the metamodel.
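Such a marking file and its reader might look like the following sketch. The mark keywords and class names are invented; the essential property is that the marks reference model elements without being embedded in the models.

```python
# Sketch of a marking "sheet" kept apart from the models: a plain text file
# with one mark per line -- a mark keyword, then the affected model elements.
# The mark names and class names are hypothetical.

def parse_marks(text):
    """Return {mark_keyword: [model elements]} from a marking file."""
    marks = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                        # skip blanks and comments
        keyword, *elements = line.split()
        marks.setdefault(keyword, []).extend(elements)
    return marks

sheet = """
# max-population marks: class may hold at most this many instances
max_pop_1        Duty_Station
max_pop_million  Flight_Track
"""
marks = parse_marks(sheet)
```

Swapping target platforms then means swapping sheets: the models themselves never change.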

Care is taken to ensure that marks reference model elements but are never permanently mixed into the models themselves. This is in stark contrast to the elaboration style of producing code from a model, in which implementation artifacts are indiscriminately blended into the models. This style unnecessarily destroys platform independence in the process of delivering a system.

The xUML Metamodel

Now we move downward underneath the platform-independent section to consider the xUML metamodel. A metamodel is just an ordinary model whose subject matter happens to be the modeling language itself. Whereas our air traffic control domain model captured classes such as Duty Station, Air Traffic Controller, and Control Zone, a model of xUML would have classes such as Class, Association, Attribute, State, and so forth.

A metamodel serves as a formal definition of the language it is modeling. Thus, an xUML metamodel would serve as the ultimate definition of the xUML language. Executable UML: A Foundation for Model-Driven Architecture does a fine job of informally describing xUML, and it should serve as the key input to an xUML metamodel. As with any subject matter, any ambiguity, inconsistency, or incompleteness in the informal description of the modeling language should be resolved by the completed metamodel.

Just for fun, let’s assume that such an xUML model exists. Here’s how you could use it.

Imagine that you build a database schema based on such a metamodel. Assume that the database perfectly imposes all constraints defined in the metamodel. Now suppose you have built an application model, say the Air Traffic Control application. You should be able to populate the metamodel database with your application model. For example, you would create instances of Class: Air Traffic Controller, Control Zone, Duty Station, and so forth. You would proceed to create instances of State, Transition, Attribute, and so forth until your entire ATC domain was instantiated in the metamodel.

If you found that you were unable to populate the metamodel database without triggering errors, you would know that your models were incorrect. (We’re assuming a perfect metamodel database implementation here.) For example, xUML requires that each transition exiting a state have a different event specification. So if you have already inserted a transition out of state X on event A, and you attempt to define another transition out of X with the same event A, you should get an error and the edit operation should fail.

This means that if a model has been successfully entered into the xUML metamodel database, it is linguistically correct. The model may have runtime problems or may be incomplete, but at least it doesn’t break any of the modeling language rules. So, with respect to a set of application xUML models, a metamodel would serve the same role as a programming language grammar, and the metamodel schema populator would play the same role as a parser.
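The parser-like role of the metamodel database can be sketched with a one-table fragment. The schema is hypothetical, with sqlite standing in for a full metamodel implementation; the UNIQUE constraint encodes the rule that each transition exiting a state must carry a different event.

```python
import sqlite3

# Sketch: a fragment of a (hypothetical) xUML metamodel schema in which a
# UNIQUE constraint enforces the rule that each transition exiting a state
# must have a different event specification.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE transition (
        from_state TEXT NOT NULL,
        event      TEXT NOT NULL,
        to_state   TEXT NOT NULL,
        UNIQUE (from_state, event)   -- one exit per state/event pair
    )""")

db.execute("INSERT INTO transition VALUES ('X', 'A', 'Y')")
try:
    # A second transition out of X on the same event A breaks the rule...
    db.execute("INSERT INTO transition VALUES ('X', 'A', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True     # ...so the metamodel database rejects the edit.
```

A model that loads without any such rejection is, in this limited sense, linguistically correct.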

Because the metamodel should also capture instance and instance value information, you should be able to enter not just your model, but your model’s population as well. In the case of the ATC model, you could enter a population (or in fact, multiple populations) for the same domain model.

You can also use the metamodel to design a DSL for storing a model and its population in text files. Each metamodel element might correspond to a DSL statement, for example.

So the metamodel serves as a formal definition of the modeling language and as a reference for the design of any model repository or model file format. It also means that any downstream translation or model-processing tools should be built to process any structure that can be inserted into the metamodel database.

Now for the bad news. There is no complete xUML metamodel currently in existence (as far as we know). On the bright side, there are many partial metamodels, and we anticipate that a complete one should be available in the near future. The lack of a complete, accepted xUML metamodel hasn’t stopped us from having translation tools. But it would certainly be nice to sort this out.

In the past, a number of tool-specific metamodels have been built. One open source xUML tool, BridgePoint, for example, is built around an xUML metamodel. Unfortunately, it has been modified away from the xUML as described in Executable UML to support a variety of tool-specific features. It goes under the name xtUML. Another tool-based metamodel called iUML also varies considerably from the xUML definition.

Our effort is titled miUML. The intention of this metamodel is to be open source and as tool independent as possible, with a focus on both Executable UML and the Shlaer-Mellor methodology from which it originates. It also strives to rest on firm foundations of relational theory, with an orthogonal type system for attributes. It has been partially implemented as a relational database schema with editing functions in PostgreSQL. Additionally, the models are thoroughly documented for your reading pleasure. You can download it from www.executableuml.org. As of this writing, this model is incomplete with respect to polymorphism and data types, but it has strong constraints for identifiers and referential attributes.

The OMG’s UML standard publishes a UML metamodel. It is of little use in the context of xUML, for several reasons: xUML uses a subset of the UML notation; xUML is built on relational foundations and has special rules for the use of identifiers and referential attributes; and xUML has built-in executable semantics that the greater UML lacks. UML does define a framework, Foundational UML (fUML), for defining executable UMLs. At this point, no attempt has been made to develop an fUML definition of xUML. It could probably be done, but it is not clear that it would be worth the effort.

In addition to having an agreed-upon metamodel, it is also important to have agreed-upon implementation representations of the metamodel. To be tool independent and support a more modular approach, there needs to be specific and easily accessible implementation infrastructure for populating and querying the metamodel. One such approach, as we have mentioned, is to use a relational database management system. These are typically queried using SQL and so allow for the ad hoc queries that are necessary to make effective use of the metamodel structure. Unfortunately, SQL exists as many vendor-specific dialects, and we are faced with having to support several representations, depending on the underlying database system. Other interchange representations may also be used, such as XML or JSON. The important point is that the representation must completely cover all aspects of the metamodel and provide a convenient starting point where an implementation of modeling tools can access the metamodel population.

The xUML Language

While we are on the topic of challenges, we need to discuss a few revolving around the modeling language itself.

There are several reasons for our choice of xUML. The primary reason is that it was designed to support the translation of models onto the widest variety of platforms: everything from distributed cloud systems to tightly constrained embedded systems. This means that models built in xUML can be widely reused. The data and execution rules of the language are based on relational data theory, finite state automata, and data-flow execution rather than object-oriented foundations. There is no presumption that the target programming language be object-oriented, though there is certainly nothing prohibiting or hampering such an implementation. The language is designed to be as lean as possible. Rather than having lots of complex rules and model elements, there are only a small number of building blocks that can be assembled strategically to tackle considerable real-world complexity. This again is largely a consequence of its mathematical foundations. The benefit of this property is twofold. From an analysis perspective, language simplicity means the modeling artifact fades into the background, putting the emphasis on the subject matter being modeled. It is much more difficult to spot a subtle flaw in application logic if the modeling notation itself is intruding with its own complexity. From an execution and translation perspective, it is easier to run the models and generate code, because there are so few elements and rules to implement. Furthermore, it is easier to devise model execution platforms that guarantee the model execution rules work on diverse and challenging platforms.

But if you are coming from an object-oriented programming perspective, as most of the greater UML is practiced, you will find some aspects of xUML to be a bit alien. There are no hidden identifiers and object links, for example. The data is simply organized in a way to enforce the connectivity of the instances.

And instead of relying on a separate language to express constraints, such as the Object Management Group’s (OMG) Object Constraint Language (OCL), constraints are built directly into the class and relationship data. By declaring an identifier of a Die on a semiconductor Wafer to be Grid Location {I} + Wafer {I, R}, we have effectively declared that you cannot have two Die at the same grid location on the same Wafer. Identifiers and referential attributes can be combined in various ways to form a set of declarative model constraints without requiring extra “check the constraint” code to be generated. These built-in mechanisms cover most constraint circumstances, with the exceptions handled by light annotation. In fact, we see OCL as an artifact of the lack of inherent constraints in object-oriented programming languages, filling that gap by constructing syntax trappings on predicate logic. Sadly, many published examples of OCL use are based on poor models and only highlight the need for better modeling to capture the problem logic, rather than for explicit constraints.

Unfortunately, there is no official, widely accepted standard defining xUML. The best we have is an informal standard consisting of the Executable UML book and various white papers describing the Shlaer-Mellor method, the predecessor to xUML. This serves as a fine guide for the analyst/modeler but leaves a bit open to interpretation when it comes to building model execution platforms and translation tools. This, of course, is where the previously mentioned metamodel fills the gap.

Fortunately, the modeling language is simple enough that there is general agreement within the Shlaer-Mellor/xUML community on most of its class and state modeling features. There is, however, some variance with regard to how activities are modeled.

Action Language

The manner in which algorithmic computations are specified is one of the more challenging areas in the translation workflow. The syntax and semantics of an action language involve many trade-offs. Because writing action language appears, superficially, to be like writing program code and because, as programmers, we have definite opinions about how best to write program code, action language syntax is subject to the extremes of a programmer’s personal taste.

But we do not consider writing action language to be coding in the usual sense. Coding is directed at making a computing machine operate in a specific manner to achieve a desired result. By contrast, action language is directed at specifying the algorithmic processing of a domain model void of implementation technology considerations. A large fraction of what model activities do is directly related to model-level concepts, such as signaling events, navigating relationships, and updating attributes. So we do not consider it to be a more abstract version of program code but rather a detailed specification of operations supplied by the formalism of the model execution rules. Clearly, action languages must be transformable into program code. But we consider that transformation to be a discontinuous operation directed by mapping functions and not a process of gradually elaborating the action language statements to some lower form of abstraction.

In other words, the notion of starting out with fuzzy actions written in natural language and then inserting more and more code-like fragments to tighten it down as a means to get closer to implementation is the antithesis of our approach. As sure as plaque will rot your teeth, elaboration erodes away the platform independence of a domain model until it is neither a good model nor a good implementation. Instead, we aim to map model-level operations to whatever constructs and idioms are appropriate to the target programming language, be it object-oriented, functional, scripting, or otherwise.

Several action languages consistent with xUML have been defined, but because of the many ways that algorithms may be stated, we see little convergence in their syntax. For example, BridgePoint (xtUML) Object Action Language (OAL) and iUML’s Action Specification Language (ASL) have semantics that match our approach closely and have working implementations. Other action languages have been proposed, such as Shlaer-Mellor Action Language (Small) and Starr’s Concise Relational Action Language (Scrall), but we have not seen any production-ready translators for them. See www.executableuml.org for links to xUML-compatible action languages.

On the wider front, Alf (Action Language for Foundational UML) has been established as a general-purpose UML standard. We don’t find Alf appropriate to our approach because it covers conventional UML semantics and carries the burden of object-oriented programming language constructs upon which UML was based. We see, for example, the Alf constructs for namespaces, public/private declarations, collection data types, inheritance, and so forth as implementation concepts. In our platform-independent context, these constructs offer more confusion than clarity for specifying the model-level processing of application logic.

Desirable Characteristics of an Action Language

To be truly platform independent, a model should not specify a particular sequence of computation unless that sequence must be enforced on every potential target platform. Here is an example of an arbitrary computation sequence:

y = scale(x)                  // action 1
z = filter(i1, i2, ... in)    // action 2
result = y + z                // action 3

Depending on the implementation, actions 1 and 2 could be reversed or executed concurrently. Action 3, on the other hand, must wait for both actions 1 and 2 to be complete. If the action sequence as written is intended to indicate a required sequence of computation, the model is unnecessarily limiting implementation choices. This breaks the principle that the model must specify only what is required on all potential platforms. By breaking that principle, the model loses a bit of credibility. “What else in the model might I ignore?” the implementor now begins to think!

Consider Figure 11-3.

Figure 11-3. No arbitrary sequencing in data-flow representation

Each action is represented as a circle, and we interpret each action to be runnable when all of its inputs are available. Action 3 must therefore wait until both actions 1 and 2 have produced output. This data-flow view of computation eliminates the statement of arbitrary sequencing. This is why the data flow is our fundamental view of algorithmic processing in xUML.

Unfortunately, text representations are faster and easier to edit than graphical representations of data flows. In Chapter 2, we showed a data-flow diagram of the Logging In activity from the Air Traffic Control model. The difficulties of dealing with data-flow graphics and specification of the data-flow processes have meant that all translation schemes of which we are aware use a text-based language to specify activities.

The use of a text-based action language does not preclude having data flow semantics in the language. Early attempts at action language, such as Small, used a UNIX pipe style of syntax to indicate data flows. The simple actions in Small would appear as follows:

x | scale > ~y
Input(all).i | filter > ~z
(~y, ~z) | sum

The text can then be processed in such a way as to yield a data-flow representation as input to the code-generation process.

And, in fact, the original step-by-step text formulation with steps 1–3 is even okay if it is understood that a data-flow analysis will determine the implementation sequencing. The important thing is that adequate information is present to construct an intermediate data-flow representation before proceeding with translation.
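Deriving an implementation order from the data-flow representation can be sketched as follows. The wave-at-a-time scheduler is an assumption about how a translator might proceed; it shows that actions 1 and 2 may run in either order (or concurrently) while action 3 must wait for both.

```python
# Sketch of data-flow analysis: an action is runnable when all of its inputs
# are available, so no arbitrary sequencing is imposed by the text order.

def data_flow_order(actions):
    """actions: {name: set of names whose outputs it consumes}.
    Yield successive "waves" of actions that may run concurrently."""
    available = set()
    remaining = dict(actions)
    while remaining:
        wave = {name for name, inputs in remaining.items()
                if inputs <= available}
        if not wave:
            raise ValueError("cyclic or unsatisfiable data dependencies")
        yield wave
        available |= wave
        for name in wave:
            del remaining[name]

# Action 3 consumes the outputs of actions 1 and 2; 1 and 2 are independent.
waves = list(data_flow_order({"1": set(), "2": set(), "3": {"1", "2"}}))
```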

Unfortunately, constructing an MX domain that actually takes advantage of this fine-grained concurrency is nontrivial. Writing code that deals with concurrency and maps it onto the available processors to achieve parallel execution, without mucking everything up, is a difficult problem. As programming language support for this type of fine-grained parallelism becomes available (consider Go), we hope the inherent concurrency of model execution can be better exploited as actual parallel execution.

Object-oriented and other common programming idioms impose more implementation biases in action language syntax. Most action languages are patterned after programming languages and never fully escape their roots. Because statically typed, usually object-oriented, languages are the most common translation targets, most action languages have telltale signs of these programming languages baked into their design. For example, the use of an object’s address in memory as an implementation-generated identifier is common. An instance reference serves as a thinly disguised pointer, which, sadly, encourages modelers to think in those terms.

There is an unnecessary distinction between sets of instance references and a single instance reference, as if sets somehow cannot contain a single member. We suspect the distinction is more related to the ease with which most programming languages can hold a single pointer value in a simple variable of a language-supplied type, compared to a collection of pointers, which requires a more costly implementation construct. The translation knows, from the nature of the instance selection, when the outcome can be more than one instance. The translation mechanism should then be responsible for choosing the optimal mechanism to hold the result and not press that decision back onto the modeler. On some occasions, the result of an instance selection must be limited to less than the full selected set: for example, selecting an arbitrary instance from a set of otherwise identical instances or limiting the selection to a given number based on sorting criteria. But those operations merely limit the cardinality of the selected set, the result of which is still a set. The limiting operation does not dictate a particular way to store the result.

The remnants of imperative programming language constructs are evident in the excessive explicit iteration over class instances that action languages require. In xUML, the instances of a class form a set, so set-at-a-time operations would be a desirable replacement for explicit iteration. For example, to give a percentage price discount on items in a store, we would prefer to say something like

Item().Price *= percentDiscount // (). selects all instances

rather than what is more common:

items := select all instances of Item
foreach i in items {
    i.Price := i.Price * percentDiscount
}

or worse yet:

items := select all instances of Item
for count ranging 1..items.count() {          // assuming indices start at 1
    item[count].Price = item[count].Price * percentDiscount
}
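A translation of the set-level statement can push the iteration down into the mechanism, roughly as in this sketch; the table-of-dicts representation of the class and the `update_all` helper are assumptions, not part of any particular tool.

```python
# Sketch of translating the set-level statement
#     Item().Price *= percentDiscount
# into a single operation over the class's instance table. The iteration
# lives in the translation mechanism, not in the model.

def update_all(table, attribute, fn):
    """Apply fn to the named attribute of every instance in the class table."""
    for instance in table:
        instance[attribute] = fn(instance[attribute])

items = [{"Name": "widget", "Price": 10.0},
         {"Name": "gadget", "Price": 40.0}]
percent_discount = 0.9
update_all(items, "Price", lambda p: p * percent_discount)
```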

Model-level actions can be applied to sets. Consider signaling a torpedo recall for a certain model of torpedo. Again, we would like to say something like

// Send a recall event to each Torpedo having a given design specification
Recall -> Torpedo( Spec:specToRecall )

compared to this:

torpsToRecall := select many from Torpedo where (Spec == specToRecall)
foreach torp in torpsToRecall {
    signal Recall to torp
}

Allowing an instance to be created without setting a value for each of its attributes is another example of implementation seeping into an action language. Allowing attributes not to have a value assigned at creation time follows from the assumption that an instance is a block of memory allocated out of a pool or heap. It is an untrustworthy arrangement because different execution paths through the activities could potentially leave one or more attributes uninitialized, containing whatever random bit pattern might already be stored in the instance memory. An action language should not make assumptions about instance data being stored in memory. An MX domain may choose any method of data management that meets the needs of the targeted class of applications, such as a relational database management system (RDBMS), a key/value pair database, a flat file, or even an EEPROM.
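A creation operation that enforces full initialization might be sketched like this; the class and attribute names are hypothetical. The point is that creation fails loudly rather than leaving any attribute without a value.

```python
# Sketch: instance creation that refuses to leave attributes uninitialized.
# A value must be supplied for every attribute declared for the class.
# Class and attribute names are hypothetical.

CLASS_ATTRIBUTES = {"Torpedo": {"ID", "Spec", "Depth"}}

def create_instance(class_name, **attribute_values):
    required = CLASS_ATTRIBUTES[class_name]
    missing = required - attribute_values.keys()
    if missing:
        raise ValueError(f"uninitialized attributes: {sorted(missing)}")
    extra = attribute_values.keys() - required
    if extra:
        raise ValueError(f"unknown attributes: {sorted(extra)}")
    return dict(attribute_values)

torp = create_instance("Torpedo", ID=7, Spec="Mk-48", Depth=100)
```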

The only way to keep an action language executable, yet free of any implementation assumptions or biases, is to build it on mathematical foundations. Relational algebra is a branch of mathematics, extended from set theory, functions, and predicate logic, that can serve as a basis for action language operations.

We do not suggest that actions should be written as pure relational algebraic or predicate logic expressions. Rather, action language operations on class instances should be derivable from relational algebra and provide operations that clearly express model-level execution concepts. From that basis, the translation mechanism can then transform the operations into the programming constructs that integrate with the data management services provided by the MX domain. A lean and orthogonal basis for action language operations makes the transformations to a wide variety of other data management techniques easier to accomplish.

Application models need a way to express and process real-world data such as pressure, temperature, video images, geophysical coordinates, stock prices, and so forth. But most action languages do not acknowledge value types of anything other than basic, supported system types or a type matching that of a class structure. This eliminates values that might be composed of the join of two classes. For example, if we wished to have a report of the launch type of our torpedoes, most action languages force us into awkward constructs such as

torps := select all instances of Torpedo
foreach thisTorp in torps {
    itsSpec := select Torpedo Spec related to thisTorp across R5
    UI.REPORT(torpedo: thisTorp.Torpedo ID, type: itsSpec.Launch Type)
}

where we must compute the join by navigating associations, rather than this:

UI.REPORT( Torpedo().(ID, /R5/Torpedo Spec.Launch Type) )
// Join is implied

or this:

UI.REPORT( Torpedo join R5 Torpedo Spec.(ID, Launch Type) )

In fact, relational theory permits a value to have a type of arbitrary complexity. For any given data structure, the modeler must decide whether exposing that structure contributes to the domain analysis or distracts from it.

Consider, for example, a domain that tracks geographical boundaries. Certainly, there will be a need for some kind of two-dimensional Point type. You will need to store Points as attributes of classes. A Point is a representation of a two-dimensional vector. The algebra of 2D vectors is well understood, and the domain activities will need the ability to add points, multiply points by a scalar value, and determine the distance between two points. An action language should support having a Point data type along with the algebraic operations on it. The algebra of the user-defined type should be fully integrated into the action language syntax.

Encapsulation of any internal components of a user data type is critical. Even if we need to obtain the value of the components of a Point, we should not know how the components are internally represented. We may choose to hold the 2D point in Cartesian coordinates. However, if there is significant circular symmetry in the application, using polar coordinates would be a better choice for the implementation of Point operations. The two representations are equivalent, and each can be converted to the other.

We need to be able to define, presumably outside the action language itself, how user types would be implemented. We do not want to be forced to define Point type attributes as two scalar numeric attributes and have to expose to the model the details of the operations on the two attributes. This would lend nothing to the analysis of a domain dealing with geographic boundaries as its subject matter. Neither do we think data-type operations should be defined as external entity operations. Sprinkling external entity invocations into the domain activities solely to accomplish abstract data-type operations compounds the lack of user-defined data type support with action language clutter.

Furthermore, we probably don’t want to implement the algebraic operations on the Point type at all. We would rather call an existing library that is better coded and tested than what we would write in action language. The lack of good support for user-defined data types becomes more intractable when considering larger algebraic structures such as matrix algebra, which would be required of a domain dealing with three-dimensional graphics.

Translation Considerations

Referring back to the translation workflow, we work our way downward from the metamodel and produce code. The reference translation workflow shows the generation of model code as a two-step process. First, the populated metamodel is transformed into a population of the platform model. Second, the populated platform model is transformed into the model code. The generation of model code from a populated metamodel could be accomplished in a single step, and many translation schemes operate in that manner. We prefer a two-step transformation for its added flexibility. Exposing a distinct platform model also allows a close examination of how well the model fits the platform needs of an application:

  • The platform model is most dependent on the class of applications and the demands those applications place on computing technology to run them. We would like to see the development of platform models that account for different mechanisms to manage domain data, execution concurrency, and other key aspects of how model execution rules are realized by implementation technology appropriate to a particular class of applications.

  • The code generation is most dependent on the chosen implementation language and the interface details of the MX domain runtime functions.

Our conjecture is that the separation of model code generation into two steps would facilitate a modular approach as well as allow easier support for both differing implementation mechanisms of the model execution rules and a larger variety of implementation languages. We have no direct evidence to support that conjecture. However, for conventional computer language compilers, automatic generation of a code generator from a machine description is an established technique. By analogy to language compilers, we think such ideas may be applicable to automatically generating the transformation from a platform-model population to implementation language code based on a description of the platform-model characteristics. We think separating the transformation of the metamodel population into a platform-model population as a distinct phase from generating model code from the platform-model population would contribute the flexibility to try such an approach. This is clearly a subject for additional research.

Adding steps to the overall workflow unfortunately increases the difficulty of tracing between models and executing code. When something goes wrong in the execution of a modeled and translated program, it can be arduous to work back up through the layers of transformation. In general, the transformation from code to models is not reversible without additional information. Conventional language compilers accomplish the feat by recording debugging information about how the generated code is related to the source files and symbols of the program. Such backward traceability is desirable, but difficult to achieve both in terms of recording the information during the translation processing and in having a program to interpret it.

Even well-established techniques sometimes don’t work. For example, C supports adding #line directives to source code to indicate that the code originated from a file other than the one being compiled. Pycca supports adding these directives so that error messages and source debugging can reference the ultimate pycca source file (call it sio.pycca), rather than the generated C file. But sometimes source-level debuggers, particularly those used for microcontrollers, do not accept that C source code might reside in a file that does not end in a .c suffix.
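The directive’s effect is easy to demonstrate. In this sketch (using the sio.pycca name purely as an example), the compiler reports subsequent line numbers and the file name as belonging to the pycca source rather than the generated .c file:

```c
#include <string.h>

/* The #line directive makes the compiler report subsequent lines as
   coming from the named file, which is how generated C code can point
   diagnostics back at its pycca source. */
int remapped_line(void) {
#line 500 "sio.pycca"
    return __LINE__;   /* reports 500, not the physical line number */
}

int remapped_file(void) {
#line 510 "sio.pycca"
    return strcmp(__FILE__, "sio.pycca") == 0;   /* reports the pycca file */
}
```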

Execution tracing embedded into the generated code can help in some of these situations. It is particularly useful to trace state machine dispatch information. Tracing other aspects, such as method invocations, is sometimes available. Unfortunately, emitting trace information at runtime is intrusive on the execution speed of the program as well as its size. If the tracing is removed for deployment, as often it must be, then we are still faced with how to handle tracing execution faults that occur after deployment. Although this problem might be solved for a specific tool chain, we do not see a more general solution.
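One common way to keep such tracing removable is to route it through a macro that compiles away unless a build-time flag is set. The macro and flag names below are illustrative, not pycca’s actual tracing interface:

```c
#include <stdio.h>

/* Illustrative state-dispatch tracing (names are ours, not pycca's).
   With MX_TRACE defined, each dispatch logs to stderr; without it,
   the macro vanishes and costs nothing at runtime. */
#ifdef MX_TRACE
#define TRACE_DISPATCH(cls, inst, ev, st) \
    fprintf(stderr, "dispatch: %s[%d] event %d in state %d\n", (cls), (inst), (ev), (st))
#else
#define TRACE_DISPATCH(cls, inst, ev, st) ((void)0)
#endif

/* Minimal dispatch function using the macro; the transition rule is a
   stand-in for a real state transition matrix lookup. */
int dispatch(int inst_id, int event, int current_state) {
    TRACE_DISPATCH("Torpedo", inst_id, event, current_state);
    return current_state + event;   /* placeholder next-state computation */
}
```

The trade-off is visible here: with MX_TRACE defined, the program grows and slows; without it, faults that occur after deployment leave no trace record.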

The Pycca Workflow

The reference workflow just described represents the ideal situation. By contrast, the pycca translation workflow, shown in Figure 11-4, is less ideal, but it is a working solution that yields real code for a certain class of platform and applications. The software is open source, and you can download and use it today. Realistically, we expect you’ll spend a bit of time reading through the online documentation, configuring your environment, studying the online examples, and experimenting a bit. So maybe you’ll use it tomorrow. But everything you need is, in fact, available now.

A421575_1_En_11_Fig4_HTML.jpg
Figure 11-4. Translation workflow using pycca

A pycca translation starts with one or more completed domain models with fully specified domain bridges. This includes, at a minimum, the three facets of the model along with their supporting information:

  • A class model of the domain data including full descriptions of the classes, attributes, relationships, and data types.

  • An initial instance population.

  • The state models of all active classes expressed as both diagrams and tables.

  • The action language for all the domain activities, including the state activities, any class methods, and any domain operations. You may use pseudo-code, but it must cover all model-level actions and fully specify all algorithmic computations.

  • If the domain has any external entities defined for it, bridge markings and mappings must also be supplied.

For the reference workflow, we assume that the transformations are carried out programmatically. The transforming programs would extract the necessary information from a metamodel population. As an example, consider the decisions that must be made regarding how association navigation is implemented. The multiplicity of the association is determined directly from its definition. By parsing the action language of the activities and walking the abstract syntax tree (AST), we can examine the operations performed on an association. This will tell us if, during runtime, any instances of the association are created or destroyed. Similarly, analyzing the AST of the domain activities can determine whether an association is navigated in both directions. This information can be recorded and used to make programmatic decisions as to how the class data is organized to support navigating the association.
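As a sketch of the kind of decision this analysis enables, suppose the AST walk shows that R5 is navigated only from Torpedo to Torpedo Spec and that no R5 links are created or destroyed at runtime. The association can then be stored as a single one-way pointer (the field layout here is ours, for illustration only):

```c
/* Sketch: R5 is navigated only Torpedo -> Torpedo Spec and never
   relinked at runtime, so a single const pointer implements the
   association; no back references or link structures are generated. */
struct torpedo_spec {
    int launch_type;                      /* stand-in for Launch Type */
};

struct torpedo {
    int id;
    const struct torpedo_spec *r5_spec;   /* one-way R5 reference */
};

/* Navigating R5 then reduces to a pointer dereference. */
int torpedo_launch_type(const struct torpedo *t) {
    return t->r5_spec->launch_type;
}
```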

Human Roles in Translation

For the pycca workflow, human intervention is required. The model artifacts may exist in any format or media because they will all be processed manually. For example, the state tables and instance population could be entered into a spreadsheet. The class models may be drawn in any tool you like.

The human translator must perform three tasks:

  • Analyze the model to decide how model-level constructs are mapped onto the implementation constructs provided by pycca. The decisions involved in mapping model-level constructs were discussed in Chapter 3. Pycca provides a set of choices for how model-level constructs can be implemented. For example, association linkage is configured based on the multiplicity, dynamic aspects of the association, and whether the association is navigated in a particular direction. References to support relationship navigation need to be included only for those paths navigated by the action language. Frequently, both directions are not needed. Some attributes will be elided. For example, attributes used strictly for identification purposes and not otherwise referenced in the action language need not be included in the implementation class because the model execution domain for pycca uses a platform-specific identifier (the address of the class instance in memory). All of this information can be found by reading the action language and annotating the model graphic for each attribute that is read or updated and noting the direction that each relationship is navigated.

  • Transcribe the model structure into the pycca DSL. In Chapter 4, we showed the translation of the Air Traffic Control model. Each of the three facets of the model has a correspondence in the pycca DSL. State models have the most direct representation in pycca and correspond closely to the information contained in the state transition matrix. Pycca class definitions will vary from those on the class model by virtue of the translation decisions made in the first step. Here you do get to think in terms of what C structure implements the model-level intent of a modeled class. Of the three facets, the state activities require the most transformation effort. It is necessary to formulate the semantics of the action language into C code by using the provided pycca macros to perform model-level actions. The transformation of action language to C usually falls into two categories: model-level operations and flow control or computation. If you are transforming a model-level action such as signal event, or select instance x related to instance y, use a provided macro. Expression evaluation and flow of control are directly represented in the C code.

  • Write the bridge code. In Chapter 8, we demonstrated how bridging is accomplished. After the domain mapping is done, data structures and code must be written to implement the intent of the domain mapping. In simple cases, this can be done with an array to map identifiers from one domain to another, and a small piece of code to pass control into the target domain. Pycca provides a portal into domains that can be used to execute simple model-level actions from outside of a domain. Frequently, the domains involved in the bridge will provide domain operations to assist in the bridging.
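A minimal sketch of such a bridge in C (all names are illustrative, and this is not pycca’s actual bridge mechanism): a static array translates an identifier from one domain into its counterpart in the other, and a small function passes control across the domain boundary:

```c
/* Illustrative bridge: translate a tube identifier from the
   application domain into the actuator channel identifier used by a
   hardware domain, then pass control to that domain's operation. */

#define NUM_TUBES 4

static const int tube_to_channel[NUM_TUBES] = { 2, 5, 6, 7 };

/* Stand-in for an operation provided by the target domain. */
int last_channel_opened = -1;
void hw_open_channel(int channel) { last_channel_opened = channel; }

/* Bridge operation: map the identifier, then cross the domain boundary. */
int bridge_open_tube(int tube) {
    if (tube < 0 || tube >= NUM_TUBES)
        return -1;   /* no counterpart instance in the other domain */
    hw_open_channel(tube_to_channel[tube]);
    return 0;
}
```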

Note that the marking task from the reference workflow is missing. It is, in fact, folded into the pycca script-writing task. There is no distinct marking going on. While writing the script, the human translator makes implementation choices based on the action language.

Our experience shows that translations benefit greatly when done by someone other than the person doing the domain analysis. We find this true even in more automated translation schemes. This allows project teams to specialize between those people more interested and skilled in application analysis and those more interested and skilled in implementation technology. As previously stated, great proficiency in both skills is rarely found in the same person.

Even if the same person serves multiple roles, making a sharp distinction between roles sets the boundaries for the thought processes involved with each. The discipline of “thinking inside the box” is, in this case, a highly valued skill. The process of translating a domain model is an ideal way to become familiar with the logic of the domain in a way that merely reading the analysis materials does not accomplish. Project teams that practice pair programming will find pairing a domain analyst with a domain translator yields a higher-quality output and a deeper understanding of model and implementation for both team members.

Running pycca on the resulting source performs the two transformations shown in the workflow, populating a platform model and generating code from the populated platform model. We discussed the pycca platform model in the preceding chapter and gave a brief overview of the processing performed by pycca. Processing by pycca yields C code files, which can then be integrated with bridge code and the ST/MX domain code to produce a running program.

Pycca vs. the Big Tool Approach

The pycca approach is oriented to the implementation side of the complete development workflow. The most important reason for that orientation is to be able to obtain the required characteristics of the implementation. As beneficial as we consider modeling and translating to be, modeled and translated programs that do not achieve the expected implementation characteristics do not see the light of day as real products. Project teams that cannot see a clear path to an implementation that meets all their nonfunctional requirements are not inclined to undertake a model/translate development approach.

A vast array of implementation technologies is available in the wider computing world. From operating systems to database management systems to programming languages to web frameworks, programmers have written many useful implementation components. As programmers, we delight in finding common and reusable processing that can be implemented as a single body of code and still satisfy a larger scope of need. We want to be in a position to make better use of available implementation components when translating models if we are to achieve the required implementation characteristics and obtain the true benefits of modeling.

It is unrealistic to expect tool vendors to supply translation mechanisms that can use this variety of implementation technology, especially when the technology choices for a project must be very specific to meet broader corporate or team needs. Commercial concerns alone dictate that tool vendors attempt to make their products appeal to the broadest possible set of customers. The drive is to the lowest common denominator, as that yields the largest potential market. Tooling usually trails the leading edge of implementation concepts and technology substantially due to the rapid pace of platform technology growth and the limited market each technology presents.

Project teams have many of their own constraints to address. Target programming language support is one example. It is common for tool vendors to support C, C++ (and to say C/C++ in one breath as if they were the same language) and perhaps Java or Ada. A dizzying number of programming languages are available, and more are always being developed. There may be compelling reasons for a project to use a particular implementation language. Those reasons can be based on legacy system integration, target platform restrictions, or knowledge and convenience of the development staff. For example, Python is a popular language and has been used for implementing some rather large systems, but we are not aware of any tool vendors targeting it.

Another example is the use of a database management system. There can be compelling reasons to use them, because a central data store can be a powerful integration point for other programs of a larger system or for ad hoc queries that may be needed to satisfy future requirements. But we are not aware of substantial tool vendor support in this area either.

We believe that the separation between modeling and translation, which we have described as the separation between logic and implementation technology, is a fundamental concept that should allow us to choose the implementation technology most appropriate to our immediate engineering needs and still obtain all the benefits that modeling gives.

The primary focus of our approach is to create platform models and MX domains specifically tailored to the needs of the implementation requirements of the class of applications that a project team intends to produce. The approach attempts to solve the overall workflow by starting at the implementation side and working backward to integrate the analysis and modeling tools. We gain the benefit of all the structure provided by the model execution rules, and we are sure that the executable models have a translation. However, integration to front-end analysis capture tools is lacking. The trade-off is to place a human in that role.

It is conceivable that integration with front-end tools could be achieved when using a translation scheme such as pycca. Many tool vendors supply configurable code generators that could potentially be modified to produce pycca source rather than programming language source.

A Role for Humans in Code Generation

We don’t consider using a human in a particular role of the software development process to be an unusual circumstance. After all, project teams embark on developing large systems of many tens, if not hundreds, of thousands of lines of code with little more than a text editor, a compiler, and a few auxiliary tools, perhaps within an integrated development environment. We are mystified by the divergent attitudes: it is acceptable to undertake large software projects using conventional ad hoc design techniques with little if any tooling, but somehow the use of models and translation necessitates extensive and complex software tools. We suspect that project teams do not generally have a clear understanding of precisely how the translation is achieved and so have little choice but to assume that the gaps are filled in by software tooling.

We have demonstrated in this book that large, complex tools are not required to achieve the benefits of a model/translate development approach. We are not antagonistic toward tool vendors. They supply a vital role and service. The target platforms supplied by tool vendors are entirely appropriate for many projects. However, many organizations can afford neither the capital costs of complex tools, nor the learning costs associated with them, nor the administrative costs required to keep them operational. We are also disappointed by the monolithic nature of the available tools and would prefer a more modular solution. We are disconcerted by modeling tools that segment the application class by performance characteristics. For example, some modeling tools purport to be useful for producing “real-time” models. But it’s not the models themselves that are real-time. It’s not clear how such tools add much support for nonfunctional requirements such as fixed time limits in which a system must respond or deterministic response timing. It is also disheartening that current usage of the term real-time has devolved in many contexts to mean “fast enough so that a human is not annoyed.” The problems of real-time response must be solved by the implementation strategies of the MX domain, and we see no model-level impact.

Summary

The reference workflow describes the essential tasks and work products necessary, in some form or other, to build and translate xUML models. Any particular approach to translation, including pycca, can be contextualized with respect to this workflow.

The reference workflow describes the steps of building and storing xUML models, populating them, marking them, integrating the modeled and non-modeled domains, translating the models to a platform-specific model, generating code compatible with a model execution runtime component as well as passing through hand code, and then compiling and linking everything into a complete system.

The challenges and tool support relevant to each of these tasks were discussed.

The xUML modeling language can be formally described with a metamodel. This is a model of the language, itself defined in xUML. The metamodel also serves as a design for a database schema to store xUML models. At present, most existing metamodels tend to be tool-specific representations. Open source efforts are underway to define a tool-independent metamodel of xUML. (Though, as the pycca approach demonstrates, you can get by without one.)

The pycca workflow was presented in the context of the reference workflow. Notable differences are the use of simple drawing tools and spreadsheets to capture the models instead of a monolithic tool. Design specifics, including marking, are determined in the process of creating a pycca script. This script is then used to populate a distinct platform model prior to code generation. Although human intervention is required to a greater degree than with a monolithic tool, there are more opportunities to customize the code. Nonetheless, the source models are always carried forward and never altered in the process of translation, just as in the reference workflow.

Footnotes

1 One of the authors, Stephen Mellor, was an original signatory to the Agile Manifesto.
