This chapter explains the economic advantages and the investments that come with Model-Driven Software Development (MDSD) in general and architecture-centric MDSD in particular. It also attempts to answer both typical and critical questions.
You can find an overview of the motivation for, and basic principles of, MDSD in Chapter 2, Sections 2.1 to 2.3. We recommend you read those sections before this chapter.
MDSD combines the scalable aspects of agile approaches with techniques that prevent quality and maintainability problems in large systems, and with techniques that automate recurring development steps.
The potential of MDSD is based on a few basic principles:
Based on these basic principles, a considerable number of features of the software systems developed in this manner can be derived and mapped to economic benefits as the following table demonstrates.
We now take a detailed look at the positive effects that MDSD can have on software projects.
In a classical OO development process [JBR99], a design model is created incrementally and iteratively: a part of the model, established by projecting onto a few use cases, is refined step by step until it can be transformed more or less directly into an implementation.
Such a model can also be extracted automatically from an implementation by reverse engineering. We call this abstraction level an implementation model, because it contains all signature details of the implementation (but usually no more than this).
The diagram in Figure 18.1 shows the idealized development process of a development increment1 from its analysis to its implementation. Over time, both the level of understanding and the level of detail of the increment increase. At the start, more understanding is gained; towards the end, more details are worked out. A GUI design in the form of a sketch or a slide contains almost the same information as a completed implementation of the GUI using a JSP page or something similar. This means that implementing the increment is primarily tedious work that increases the increment’s level of detail, yet hardly improves the level of understanding. Nevertheless, the work is necessary in order to convert the essential information into a computer-readable form.
Whether this happens iteratively or not is of little relevance in this context. The overall effort basically consists of the progress in both previously-mentioned dimensions – that is, it corresponds to the area underneath the curve.
The disadvantages of an implementation model can be summarized as follows:
Figure 18.1 can be transferred to agile processes if a sufficiently abstract viewpoint is taken. The implementation model exists typically only in a virtual form – that is, in the form of source code. This avoids some disadvantages, of course, partly because specific artifacts are intentionally omitted. However, as in a heavyweight process, the necessary level of understanding must be gained during development. The same is true for the functional requirements (analysis results) that are contained in the software or an increment. In other words: depending on the process, the milestones in the diagram may have other names or may not exist explicitly, yet the shape of the curve is basically the same.
For the sake of this discussion, we reduce the overall effort of software development to the factors of information gain and level of detail, and consciously ignore setbacks in the level of understanding that are brought about by changing requirements or new insights. We don’t do this because these effects are irrelevant, but because they can be examined separately and independently of the issues described here.
We now bring the essential potentials of MDSD, automation and reuse, into this discussion.
Here we clearly see the effects of abstraction, the modeling language’s shift towards the problem space (see Figure 18.2):
In an extreme case, the model specifies the complete semantics of the (partial) application, so that no manual coding is necessary at all (see Figure 18.3). The circumstances in which this is possible or useful are discussed later, in Section 18.6.2.
A generator cannot help in increasing the level of understanding. All information must therefore already be present in the model, expressed using the domain-specific modeling language (DSL) – see Sections 2.3 and 4.1.1. Referring to Figure 18.2, the curve rises more steeply close to point A’ than to point A. This means you have to put more work into understanding the domain – which has the positive side effect that it forces you to actually think about the domain.
Here, too, we are confronted with an extreme case: if the modeling language is very close to the problem space, the analysis result can be cast directly into a formal model – that is, point A’ moves closer to the analysis result while containing the same amount of information, or it assumes the same position, thus proportionally increasing the automation potential. On the other hand, this also means that the analysis will have to be more formal (see Figure 18.3).
Taking an automotive engineering metaphor as an example: at your local car dealership, you fill in an order form that lists the vehicle type and any special features you want. The characteristics on the order form constitute the domain-specific language. You don’t have to worry about implementation details, such as engine construction, when you order your car. The factory produces the desired product (your car) based on the domain model (your order form), using prefabricated components. Obviously, the engineering achievement lies in the effort of building the production line and the mapping of your functional order onto this production line. The production of the item itself is automated.
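To make the metaphor concrete, here is a minimal sketch in Python (all names and components invented for illustration): the order form plays the role of the DSL, and the 'factory' maps it onto prefabricated components, just as an MDSD generator maps a model onto a domain-specific platform.

```python
# Hypothetical sketch of the order-form metaphor: the order form is
# the declarative DSL; the factory maps it onto prefabricated parts.

PREFAB_COMPONENTS = {           # the "production line" (the platform)
    "sedan": ["chassis-S", "engine-1.8L"],
    "wagon": ["chassis-W", "engine-2.0L"],
    "sunroof": ["roof-module-SR"],
    "tow_hitch": ["hitch-module-T"],
}

def build_car(order: dict) -> list[str]:
    """'Factory': maps the declarative order form to concrete parts."""
    parts = list(PREFAB_COMPONENTS[order["vehicle_type"]])
    for feature in order.get("features", []):
        parts += PREFAB_COMPONENTS[feature]
    return parts

order_form = {"vehicle_type": "wagon", "features": ["sunroof"]}
print(build_car(order_form))
# → ['chassis-W', 'engine-2.0L', 'roof-module-SR']
```

The customer never mentions engines or chassis variants: the engineering effort went into the production line once, and each order is then processed automatically.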
The second important aspect of increased efficiency is reusability in the form of a domain-specific platform consisting of components, frameworks, and so on. Here the effort is simply shifted from application development to the creation of the platform (see Figures 18.2 and 18.3), which is reusable. Such reusable artifacts can be used beneficially in almost all other development processes, yet their creation is usually not explicitly included in the methodology: in MDSD, the platform complements both the generator and the DSL.
Now we can clearly see the potential for savings that is gained by automation in combination with a domain-specific platform. Figure 18.3 shows the extreme case. From the application development perspective it obviously offers maximum efficiency, but on the other hand a powerful domain-specific software production infrastructure must be established. However, to be able to leverage the advantages of MDSD to the full, some investments are necessary that must be weighed against the advantages. We discuss this topic later.
In general the following advantages are gained compared to a standard development process:
The implementation effort can be reduced considerably by using a generator that ‘understands’ the domain-specific modeling language. In the case of an architecture-centric approach (Section 2.5), a considerable amount of ‘classical’ programming remains, yet developers can focus on coding the functional requirements. The architectural infrastructure code is generated (Chapter 3) and also serves as an implementation guideline. A formalized, generative software architecture that includes a well-defined programming model emerges.
Until now we have focused entirely on how to increase efficiency in software development, but quality plays an equally important role, particularly because of its long-term effect. This section examines some of the factors that affect software quality and their connection with MDSD.
Code generation can only be used sensibly if the mapping of specific concepts in the model to the implementation code is defined in a systematic manner – that is, based on clearly-defined rules. It is mandatory to define these rules first. To keep the demands on code generation low, we recommend working with a set of rules that is as small and as well-defined as possible. A small set of well-defined concepts and implementation idioms is the hallmark of a good architecture. We conclude that – in the context of MDSD – both the platform architecture and the realization of the application functionality on the platform must be well planned. MDSD demands a concise, well-defined architecture. This results in permanent consistency between model and implementation in spite of the model’s high abstraction level. Suitable generation techniques guarantee this consistency even in strongly iterative development (see Chapter 3).
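The idea of a small, well-defined rule set can be sketched in a few lines. The following is a hypothetical illustration (stereotypes and emitted code invented here, loosely in the style of the book's J2EE examples): each model concept maps to exactly one generation rule, and unknown concepts are rejected rather than guessed at.

```python
# Hypothetical sketch: a deliberately small set of well-defined
# mapping rules from model concepts (stereotypes) to implementation
# code. One concept, one rule - nothing is generated ad hoc.

RULES = {
    "Entity": lambda e: f"public class {e['name']} extends PersistentObject {{ }}",
    "Service": lambda e: f"public interface {e['name']}Service {{ }}",
}

def generate(model: list[dict]) -> list[str]:
    # A KeyError on an unknown stereotype is intentional: the rule
    # set defines the architecture, so unknown concepts must fail.
    return [RULES[e["stereotype"]](e) for e in model]

model = [{"stereotype": "Entity", "name": "Customer"},
         {"stereotype": "Service", "name": "Billing"}]
for line in generate(model):
    print(line)
```

Because every `Entity` in every model goes through the same rule, model and implementation cannot drift apart for the generated portion.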
Modeling language, generator, and platform constitute a reusable software production infrastructure for a specific domain. Today’s technical platforms, such as J2EE, .NET, or CORBA, offer a large number of basic services for their respective application fields – J2EE, for example, for large enterprise applications. Yet using these services – and hence the platform itself – efficiently is not easy. One must adhere to specific patterns or idioms to exploit the full potential of the platform: the platform must be used ‘correctly’. All these patterns and idioms are documented and well-known in principle, of course, yet in larger projects it is a big problem to ensure that team members, whose qualifications typically vary considerably, consistently apply the right patterns. MDSD can help here, because defined domain concepts are automatically implemented on the platform, always in the same manner. The transformations are thus design knowledge that has been rendered machine-processable. They constitute an inherent value, as they preserve the knowledge of how to use the platform effectively in the context of the respective domain.
In today’s practice it is not reasonable to generate 100% of the application code. As a rule, a skeleton is generated into which the developer adds manually written application code. The platform architecture as well as the generated application skeleton determine where and how manually-created code has to be integrated and the manner in which this code is structured. This equips developers with a safeguard that makes it unlikely that they will create ‘big ball of mud’ systems.
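One common way to structure this integration is the generation gap pattern: the generator owns a base class that it may overwrite at any time, while handwritten logic lives only in a clearly-marked subclass. A minimal sketch in Python terms (all names invented for illustration):

```python
# Hypothetical sketch of the "generation gap": the generated base
# class provides the skeleton and the technical code; the developer
# fills in exactly the slots the architecture leaves open.

class CustomerBase:                      # GENERATED - never edited by hand
    def save(self) -> str:
        # generated, schematic persistence code
        return f"INSERT INTO customer VALUES ('{self.name}')"

    def check_credit_limit(self) -> bool:
        # the designated slot for manual business logic
        raise NotImplementedError

class Customer(CustomerBase):            # handwritten application code
    def __init__(self, name: str, limit: int):
        self.name, self.limit = name, limit

    def check_credit_limit(self) -> bool:
        return self.limit <= 10_000      # application-specific rule

c = Customer("Smith", 5_000)
print(c.check_credit_limit())  # → True
print(c.save())                # → INSERT INTO customer VALUES ('Smith')
```

The skeleton dictates where manual code goes and what shape it takes, which is exactly the safeguard against 'big ball of mud' systems described above.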
In the context of ‘classical’ software development, the application logic, mixed with technical code, is typically programmed against a specific platform in a 3GL. This is also done, although to a lesser extent, when dealing with technologies such as EJB, where the application server takes over specific technical services such as transactions or security. Since many concepts are not visible at this abstraction level, the software must be documented by other means. In time-critical projects it is often impossible to keep this documentation synchronized with the code, and it is therefore usually the first victim if time runs short.
In the context of MDSD, the situation is altogether different: the various artifacts constitute a formal documentation of specific aspects of the system:
These artifacts are always synchronized with the actual application, because they serve as the source for the application’s generation.
Of course MDSD also has a need for informal documentation. The DSL and its semantics must be documented to allow their efficient use in projects. However, this kind of documentation is only required once for each DSL, not for each application (see Section 18.4).
In general, generated code has a rather dubious reputation: barely readable, not documented, and with poor performance. However, this is an unfounded prejudice in the context of MDSD, because the quality of the generated code depends directly on the transformations (for example the templates). The latter are typically derived from a reference implementation (see Section 13.2.2). This means that the properties mentioned above, such as readability and performance, are propagated from the reference implementation into the generated applications.
This is also the reason why the reference implementation specifically should be developed and maintained with care. As long as transformations are created with sufficient care, the generated code’s quality will be no worse than that of manually-created code. On the contrary, generated code is usually more systematic and consistent than manually-created code, because repetitive aspects are always generated by the same generator, so they always look and work the same way.
For example, with today’s generators it is no problem to indent the code sensibly. Comments can also be generated easily. One should pay attention to these things – the developers will be grateful.
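The propagation of quality from the reference implementation into the generated code can be illustrated with a toy example (the reference code and the `${entity}` placeholder syntax are invented here): a template is derived by abstracting the concrete identifiers out of the hand-polished reference code, so its comments and indentation carry over into every generated artifact.

```python
# Hypothetical sketch of deriving a template from a reference
# implementation: concrete names are replaced by placeholders,
# so readability, comments, and formatting are inherited by all
# generated code.

REFERENCE = """\
/** Data access object for Customer. */
public class CustomerDao {
    public Customer findById(long id) { ... }
}
"""

# Derive the template by abstracting the entity name.
TEMPLATE = REFERENCE.replace("Customer", "${entity}")

def instantiate(template: str, entity: str) -> str:
    """Expand the placeholder for one concrete model element."""
    return template.replace("${entity}", entity)

print(instantiate(TEMPLATE, "Invoice"))
```

A sloppy reference implementation would be propagated just as faithfully, which is why it deserves particular care.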
When MDSD is used a number of aspects have positive effects on the error rate of the software produced. This can be seen as early as your first MDSD project, but you will feel the effects mostly in subsequent projects and during maintenance:
Besides the potential for automation and quality improvement, a domain architecture also has a very high potential for reuse. Keep in mind that it is a software production infrastructure consisting of a domain-specific modeling language3 (the DSL), a domain-specific platform, and a set of transformations that enable the step from model to runnable software, either completely or partially automated (see Section 4.1.4).
Such a production infrastructure is then reusable in similar software systems (applications) with the same properties. The definition of ‘similarity’ and its related potential for reusability is derived from these decisions:
This makes MDSD interesting primarily in the world of software system families. In this context, a software system family is a set of applications based on the same domain architecture. This means that:
As an example, our first case study (see Chapter 3) covers the domain of architecture for business software with the MDSD platform J2EE/Struts. The modeling language described there is reusable for all software systems that use the features provided by this language. The generative software architecture of the case study maps the modeling language to the platform. The transformations are therefore reusable for all software systems that have the same architectural features and will be implemented on the same platform.
Due to its level of abstraction, the reusability of DSLs is typically even greater than that of transformations and platforms. In the context of a business, it can, for example, be sensible to describe the business architecture with appropriate high-quality DSLs and map them to the relevant platforms using transformations (generators).
In the case of a clearly-defined functional/professional domain such as software for insurance applications, a functional/professional DSL can be even more effective, for example to support the model-driven, software-technical realization of insurance products or tariffs.
A combination – or more precisely, a cascade – of functional/professional and technical MDSD domains is particularly effective: in most cases, the platform of a functional domain architecture can be very well realized with the help of an architecture-centric domain architecture (see Section 7.6). In this way the advantages of MDSD can be leveraged in both functional and technical dimensions.
Either way, the more applications or parts of applications are created in such software production lines, the faster one will profit from their creation, and the greater will be the benefit.
Fan-out is an important benefit of MDSD. This term describes the fact that a number of less abstract artifacts can be generated from a single model. The fan-out has two dimensions:
An MDSD approach generally allows fast modification of the developed application(s). Since many artifacts are automatically created from one specification, changes to the specification will of course affect the whole system. In this way, the agility of project management increases.
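The fan-out idea can be sketched in a few lines; the following hypothetical generator (all names and output formats invented for illustration) derives several less abstract artifacts from a single model element:

```python
# Hypothetical sketch of fan-out: one model element yields several
# artifacts (a class skeleton, a DDL statement, a documentation
# stub) from a single source of truth. A change to the element
# propagates to all of them on the next generator run.

def fan_out(entity: dict) -> dict[str, str]:
    name, cols = entity["name"], entity["attributes"]
    return {
        "java": f"public class {name} {{ /* {', '.join(cols)} */ }}",
        "ddl":  f"CREATE TABLE {name.lower()} "
                f"({', '.join(c + ' VARCHAR' for c in cols)});",
        "doc":  f"{name}: persistent entity with attributes {', '.join(cols)}",
    }

artifacts = fan_out({"name": "Customer", "attributes": ["name", "city"]})
print(artifacts["ddl"])
# → CREATE TABLE customer (name VARCHAR, city VARCHAR);
```

Renaming an attribute in the model updates the class, the schema, and the documentation consistently, which is precisely the agility benefit described above.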
We have seen the potential and usefulness of MDSD. Unfortunately, even in IT, there is no such thing as a free lunch: to be able to enjoy the advantages, one must invest in training and infrastructure – of a technical as well as an organizational nature, depending on the desired degree of sophistication. This section introduces some experiences and conveys condensed information from real-life projects to give you a clearer idea of the costs, the benefits, and the break-even points of MDSD.
We recommend you first approach MDSD via architecture-centric MDSD, since this requires the smallest investment, while the effort of its introduction can pay off even in the course of a six-month project. Architecture-centric MDSD does not presuppose a functional/professional domain-specific platform, and is basically limited to the generation of repetitive code that is typically needed for use with commercial and Open Source frameworks or infrastructures.
The investment consists primarily of the training needed for handling an MDSD generator and its respective template language, as well as for the definition of a suitable DSL for modeling.
To conduct a quantitative analysis, we collected sample data from a real-life MDSD project at the time of its acceptance. This project was a strategic, Web-based back-office application for a big financial service provider. Table 18-2 shows in rough numbers the effects of architecture-centric MDSD on the source code volume.
| Amount of source code [kB] | Traditional development | MDSD |
| --- | --- | --- |
| Source code reference implementation | 1,000 | 1,000 |
| Handwritten code | 18,800 | 2,200 |
| Models | – | 3,400 |
| Transformations | – | 200 |
| Total | 19,800 | 6,800 |
To represent the influence of models on code volume, we selected the number of kilobytes required to store all the required UML files. We also assumed that a manually-created reference implementation would be developed to validate the application architecture.
Our concrete example of architecture-centric MDSD shows that the volume of code that needs to be maintained manually is reduced to 34% of the code volume that would have to be maintained in a non-model-driven development scenario – including the transformation sources! This number may appear to be very low, but in our experience it has proven to be representative. Tool manufacturers quite often publish data reflecting the percentage of generated code, yet these numbers are meaningless if it is unclear whether the necessary model and transformation sources, as shown in Table 18-2, have been included in the calculation. In our example, model and transformation sources make up more than 50% of the source code that needs to be maintained. Figure 18.4 provides a graphical view of the data from our example.
Of course, in contrast to a manual implementation, the volume of sources to be compiled remains unchanged when MDSD is used, as shown in Figure 18.5. The difference between traditional development and MDSD is that the generated code doesn’t constitute a source, but an effort-neutral intermediate result.
If we ignore model and transformation sources, as well as the reference implementation, we get a ratio of 88% generated code and 12% manually-created code. In our view, however, such figures are ‘window dressing’. In MDSD, the models have the same value as normal source code, and the reference implementation should always be maintained in MDSD, because it is the basis of all refinements and extensions of the architecture.
Viewed from this perspective, the ratio of generated to manually-created code is 72% to 28%. These figures are still impressive enough to make it clear that architecture-centric MDSD can pay off even if only a single application is developed with a given infrastructure – particularly if one takes medium-term maintenance into account. To substantiate this statement, we must consider the entire effort of the project’s realization, not just the programming effort.
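The percentages quoted here can be checked against the volumes in Table 18-2. The short script below reconstructs the generated-code volume under the assumption (per Figure 18.5) that the compiled volume stays constant; note that this reconstruction yields roughly 71% rather than the quoted 72%, a difference attributable to rounding of the underlying figures.

```python
# Arithmetic behind the figures derived from Table 18-2 (kB).
ref_impl, handwritten_mdsd = 1_000, 2_200
models, transformations = 3_400, 200
traditional_total = 19_800

# Compiled volume stays the same, so the generated share is what
# used to be handwritten minus what still is handwritten.
generated = (traditional_total - ref_impl) - handwritten_mdsd  # 16,600

maintained_mdsd = ref_impl + handwritten_mdsd + models + transformations
print(round(100 * maintained_mdsd / traditional_total))  # → 34 (%)

# Ignoring models, transformations, and the reference implementation:
print(round(100 * generated / (generated + handwritten_mdsd)))  # → 88 (%)

# Counting all maintained sources as 'manual' (the honest view):
print(round(100 * generated / (generated + maintained_mdsd)))   # → 71 (%)
```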
Besides implementation, the project time and effort consist of:
In addition, business process analysis preceding the project, required project documentation (user manuals, help texts), as well as production costs must be considered. MDSD has a positive influence on some of these activities, yet it is difficult to document this influence in general numbers. We therefore only examine the respects in which MDSD affects the implementation effort, and how this in turn affects the total effort for the core activities.
The figures given in the previous section are not useful for an assessment of the total work effort, because they do not reflect how much time – and thus money – is spent on the creation of models and transformations, as well as for the manual programming of reference implementations and application-specific source code. Likewise, the brainwork involved in creating the DSL is not considered here.
In a little practical experiment [Bet02] we examined how much time the creation and maintenance of models takes, based on data entry and mouse clicks. We then converted this information into the equivalent of source code lines using an approximate formula:
These figures clearly demonstrate that MDSD is not the same as UML round-trip engineering. They also show why many software developers are skeptical regarding the use of UML tools.
In architecture-centric MDSD, the platform almost exclusively consists of external commercial and Open Source frameworks. The domain architecture that needs to be created consists primarily of a reference implementation and transformations derived from it (code generation templates).
The figures from Table 18-2’s example (reference implementation circa 1,000 kB source code, transformation source code circa 200 kB) support our thesis that the derivation of transformations from a reference implementation takes only between 20% and 25% of the effort required for creating the actual reference implementation – especially if you keep in mind that most of the mental work has already been done for the reference implementation.
The effort required to create a reference implementation is not influenced by MDSD. For further discussion, we assume that the programming effort for the reference implementation constitutes 15% of a project’s total programming effort. In Table 18-2’s example, the size of the reference implementation was only 5% of the entire application (measured in kB), so even if one takes into account that the reference implementation is more difficult to program, 15% is a fairly conservative estimate.
Table 18-3 shows that architecture-centric MDSD can lower the programming effort by 40%, given the conditions described above. This figure confirms our practical experience as well, and is a good reference for calculating the costs of introducing MDSD. It is best, however, to use metrics from your own team to calculate the ratio of programming effort to total project time and effort, and thus to estimate the potential of MDSD.
| Development effort [%] | Traditional development | MDSD |
| --- | --- | --- |
| Transformations | – | 4 (0.15 × 25) |
| Reference implementation | 15 | 15 |
| Application models and code | 85 | 41 (0.48 × 85) |
| Total | 100 | 60 |
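The MDSD column of Table 18-3 follows from its own formulas: transformations cost roughly 25% of the reference implementation's 15% share, and application modeling plus manual code is assumed to take about 48% of the traditional 85%. A quick check of the arithmetic:

```python
# Reconstructing the MDSD column of Table 18-3 (effort in %).
reference_impl = 15                             # unchanged by MDSD
transformations = round(0.25 * reference_impl)  # 25% of ref-impl effort → 4
application = round(0.48 * 85)                  # 48% of traditional 85% → 41

print(reference_impl + transformations + application)  # → 60
```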
In real life, of course, the team’s learning effort, as well as the hiring of an MDSD expert as a coach for the first project, must also be considered.
We can make an empirical assessment: if an MDSD pilot project runs for six months or longer and the team consists of more than five people, the introduction of MDSD can pay off as early as during this first project, even if the whole production infrastructure must be built from scratch. Chapter 20 describes the adaptation strategies and prerequisites that should be observed.
As you may already have inferred from the previous section, architecture-centric MDSD offers a simple and low-risk adoption path for MDSD. As the figures in our example show, roughly half of the sources to be maintained consists of models and transformations; the other half consists of ‘traditional’ source code for the reference implementation and handwritten application logic.
If a mature implementation of architecture-centric MDSD is available, the remaining traditional source code cannot be reduced any further. Efficiency can then only be increased by applying the MDSD paradigm not just to the architectural/technical domain, but also to functional/professional domains. This allows us to uncover functional domain-specific commonalities and variabilities of applications via domain analysis or product-line engineering (Section 13.5) and to raise the abstraction level of the application models even further: the models then describe domain-related problems and configurations instead of architectural aspects, effecting a further reduction of the modeling effort and particularly of the effort for manual coding.
To exemplify this, it helps to look at the distribution of code volume during the development of a number of applications in the same domain. Figure 18.6 sketches the reduction of code volume through the introduction of a functional domain architecture during the development of three applications, assuming that the application models and the application-specific, handwritten source code can be reduced by 50%. Note that Figure 18.6 is of a purely illustrative nature: the real-life figures depend very much on the domain and its complexity. The more clearly the domain’s boundaries can be defined, the more functionality can be integrated into the functional MDSD platform in a reusable form.
In practice, the development of a functional domain architecture is an incremental process based on architecture-centric MDSD.
The development effort for functional domain-specific languages (DSLs) and the corresponding frameworks should not be underestimated. The successful development of such languages requires significant experience in the respective domain and should not be attempted in your first MDSD project.
There are also scenarios in which functional frameworks already exist that can be used without the help of model-driven generators, of course. In this case, MDSD can build on them as in the architecture-centric case – that is, the existing frameworks are regarded as the MDSD platform that defines the target architecture (see Chapter 7). A DSL is then derived from the configuration options. As in the architecture-centric case, productivity can be improved remarkably with only a moderate investment.
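As a sketch of this idea, consider a Struts-like Web framework whose repetitive XML configuration is treated as the generation target. The model below and the emitted configuration format are simplified and hypothetical (real Struts action mappings differ); the point is that the framework's configuration options induce the DSL.

```python
# Hypothetical sketch: an existing framework is taken as the MDSD
# platform, and a small declarative model replaces its hand-written,
# repetitive configuration file.

MODEL = [
    {"screen": "CustomerSearch", "action": "search", "next": "ResultList"},
    {"screen": "ResultList", "action": "select", "next": "CustomerDetail"},
]

def to_config(model: list[dict]) -> str:
    # Emit the repetitive framework configuration from the model.
    entries = [
        f'<action path="/{m["screen"]}" name="{m["action"]}" forward="{m["next"]}"/>'
        for m in model
    ]
    return "<action-mappings>\n  " + "\n  ".join(entries) + "\n</action-mappings>"

print(to_config(MODEL))
```

The framework itself is untouched; only the tedious, schematic part of using it is automated, which is why the required investment stays moderate.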
We have addressed some of the prejudices against MDSD in the preceding sections, as well as in Chapter 5. Some of these prejudices date back to the 1980s and 90s and stem from negative experiences with CASE tools; they are now projected onto model-driven approaches in general. Serious and important questions emerge, and the answers to such questions are neither trivial nor easily found. Below we discuss answers to some of these questions:
MDSD enables productivity and quality gains as early as your first model-driven project. Specifically, the implementation effort can be reduced by half compared to traditional manual programming. Considering the necessary training effort for the introduction of MDSD, real savings are to be expected from the second project onwards, once the team is familiar with the MDSD paradigm, a concrete set of tools, and the methodology. The use of MDSD with functional/professional MDSD platforms should be the second step (cascaded MDSD, see Section 8.2.8).
Unfortunately, the amount of publicly-accessible data on the efficiency of Model-Driven Software Development and product-line development is rather small. There are several reasons for this: for one, no one will execute a real-life project twice in parallel just to collect comparative metrics – and even this would be of only limited value, due to the dissimilar boundary conditions. Furthermore, positive effects on time and quality are obvious to everyone involved in the project, while interesting metrics relate more to the incremental refinement of the approaches than to a comparison with completely manual approaches. Last but not least, not all companies mention the option of using generative techniques and the resulting savings if the customer is only interested in the delivered, traditional source code or the finished application.
The sources [PLP], [Bet02] and [Bet04c] contain further data and economically-relevant statements regarding the MDSD development process.
1 In the special case of a waterfall model, only one global ‘increment’ exists.
2 We don’t take the effort required for the development of a generator into account here.
3 Compare for example the architecture-centric UML profile in Chapter 3’s case study.
4 It is difficult to express this effect in figures.