13.7. Software Considerations for MPSOCs

Although this book is largely about hardware design of SOCs, any book that advocates the broad use of processors for implementing on-chip tasks must address the software issues. The rest of this chapter discusses this topic.

At a superficial level, MPSOCs look like large multiprocessor systems that present many problems to software developers. The electronics industry has yet to develop effective, automated methods for partitioning large programs and distributing the load across large numbers of processors. However, the heterogeneous collections of task-specific processing nodes described throughout this book are not at all like large homogeneous processing arrays and they are not programmed in the same manner. Task specificity is the key to dividing and conquering large problems.

Figure 13.15 again presents the floor plan of a multimedia SOC discussed earlier in this chapter. In addition to the SOC’s main CPU, the figure shows four main processing blocks: audio, video, network, and mass storage. Each of these processing blocks has a clearly defined job on the SOC and there’s very little question about the duties of each block at this high level. Consequently, the code that needs to run on each block is also clearly evident. In fact, the code for this SOC will not have been written in one monolithic block. The audio and video decoders will likely run audio and video codecs written as standalone C programs. Similarly, the network block will run software network stacks developed as a separate networking package. Each of these programs will likely communicate with the others using messaging protocols.
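The message-based coupling between blocks can be sketched in miniature. The following is a toy model, not any particular SOC’s actual interface: each processing block runs as an independent program and exchanges work with the main CPU only through messages, represented here by thread-safe queues. The names (`audio_block`, the `"frame-0"` work item) are illustrative assumptions.

```python
import queue
import threading

def audio_block(inbox, outbox):
    """Toy stand-in for a standalone audio-codec program: it sees only
    its own message queues, never the internals of the other blocks."""
    while True:
        msg = inbox.get()
        if msg is None:          # shutdown sentinel from the main CPU
            break
        outbox.put(("audio-decoded", msg))

inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=audio_block, args=(inbox, outbox))
worker.start()

inbox.put("frame-0")             # main CPU sends a work item
print(outbox.get())              # -> ('audio-decoded', 'frame-0')

inbox.put(None)                  # tell the block to shut down
worker.join()
```

Because each block touches only its own queues, the audio codec, video codec, and network stack can be written, tested, and maintained as separate programs, which is precisely what makes the divide-and-conquer development model possible.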

Figure 13.15. A multimedia SOC.


This “divide-and-conquer” approach to programming large MPSOCs is essential to handling the exploding complexity of 21st-century systems. There is one very practical reason why the code for these systems cannot be written as a monolithic block and partitioned later: if the code for a large system is written as a single, large program, development will almost certainly miss the project schedule, probably by a wide margin. Figure 13.16, taken from Jack Ganssle’s article “Subtract software costs by adding CPUs” that’s noted in the chapter references, tells the entire, dismal story.

Figure 13.16. Programmer productivity plummets as a function of increasing program size, as predicted by the COCOMO model.


Figure 13.16 plots programmer productivity versus program size, where program size is measured in thousands of lines of code (KLOC). The plot is based on Barry Boehm’s COCOMO (Constructive Cost Model), which considers 15 cost drivers related to software development. As software complexity rises, programming productivity plummets. Larger programs require more programmers (if they are to be completed in the same amount of time), and every added programmer multiplies the number of communication channels within the team. More channels create more opportunities for miscommunication, and therefore more opportunities to introduce bugs that further degrade productivity. Figure 13.17 shows the effect of applying a “divide-and-conquer” software-development strategy. By slicing large programs into 20-KLOC chunks, the programming schedule grows roughly linearly with the total number of LOC instead of superlinearly.
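The communication-channel argument is easy to quantify: a team of n programmers has n(n − 1)/2 pairwise channels, so channels grow quadratically while headcount grows only linearly. A minimal sketch:

```python
def channels(n):
    """Pairwise communication channels in a team of n programmers."""
    return n * (n - 1) // 2

for team in (2, 5, 10, 20):
    print(f"{team:2d} programmers -> {channels(team):3d} channels")
```

Doubling a team from 10 to 20 programmers more than quadruples the channels (45 to 190), which is why adding staff to a late, monolithic project tends to make productivity worse, not better.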

Figure 13.17. Programmer productivity stays linear with increasing program size if the programming task is sliced into 20-KLOC chunks.


There are at least two ways to run the partitioned software. One way is to run all of the smaller programs on one large, fast processor. This is the PC’s software model, and it requires multi-GHz processors that dissipate many tens of watts. That is not an effective model for SOC development. As industry observer and commentator Jack Ganssle writes:

A single CPU manages a disparate array of sensors, switches, communications links, PWMs, and more. Dozens of tasks handle many sorts of mostly unrelated activities. A hundred thousand lines of code all linked into a single executable enslaves dozens of programmers all making changes throughout a Byzantine structure no one completely comprehends. Of course development slows to a crawl.

The other way to run partitioned software is to distribute task-specific code across several heterogeneous processing blocks. This is the approach advocated by this book and by the ITRS road map. It’s also the approach advocated by Jack Ganssle, who gets the last words in this chapter:

Suppose the monolithic, single-CPU version of the product requires 100 K lines of code. The COCOMO calculation gives a 1,403 man-month development schedule.

Segment the same project into four processors, assuming one has 50 KLOC and the others 20 KLOC each. Communication overhead requires a bit more code so we’ve added 10% to the 100-KLOC base figure. The schedule collapses to 909 man-months, or 65% of that required by the monolithic version.

Maybe the problem is quite orthogonal and divides neatly into many small chunks, none being particularly large. Five processors running 22 KLOC each will take 1,030 man-months, or 73% of the original, not-so-clever design.

Transistors are cheap—so why not get crazy and toss in lots of processors? One processor runs 20 KLOC and the other nine each run 10-KLOC programs. The resulting 724 man-month schedule is just half of the single-CPU approach. The product reaches consumers’ hands twice as fast and development costs tumble. You’re promoted and get one of those hot foreign company cars plus a slew of appreciating stock options. Being an engineer was never so good.
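Ganssle’s arithmetic can be sketched with the basic COCOMO effort equation, effort = a × KLOC^b. The coefficients below are the published nominal values for COCOMO 81 “embedded mode” (a = 3.6, b = 1.20); Ganssle’s man-month figures additionally apply cost-driver multipliers, so the absolute numbers here differ from his, but the superlinear shape of the curve, and therefore the benefit of partitioning, is the same.

```python
def cocomo_effort(kloc, a=3.6, b=1.20):
    """Basic COCOMO effort estimate in man-months.

    Defaults are the nominal "embedded mode" coefficients from
    COCOMO 81; the text's figures include further cost drivers,
    so only the direction of the effect is reproduced here."""
    return a * kloc ** b

# Monolithic 100-KLOC program on a single CPU.
monolithic = cocomo_effort(100)

# Same project split across four processors (50 + 3 x 20 KLOC),
# with 10% extra code for inter-processor communication.
partitioned = cocomo_effort(50 * 1.1) + 3 * cocomo_effort(20 * 1.1)

print(f"monolithic:  {monolithic:7.1f} man-months")
print(f"partitioned: {partitioned:7.1f} man-months "
      f"({partitioned / monolithic:.0%} of monolithic)")
```

Because b > 1, several small programs always cost less total effort than one large program of the same combined size, even after paying a communication-code penalty, which is the quantitative core of the argument for distributing task-specific code across heterogeneous processors.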
