Parallel Data Processing

I remember sitting in front of my ZX Spectrum with 64 KB of memory (16 KB ROM + 48 KB RAM), with an old tape recorder plugged in and a newly bought cassette inserted. Among the relatively large number of programs on the cassette, there was one that specifically drew my attention. Not that it was able to do anything special; after all, it simply computed personal biorhythm graphs based on the date of birth (in fact, I had to enter the current date too) and plotted them on the screen. There wasn't even any sophistication in the algorithm (however sophisticated an algorithm can be when it is all about calculating the sine of some value). What seemed interesting was the "Wait while results are being processed" message, which had some kind of progress bar and stayed on the screen for almost half a minute (yes, I was naive enough to think that some calculations were really taking place "behind" the message), and the three graphs being plotted simultaneously. Well, it looked as if they were being plotted simultaneously.

The program was written in BASIC, so reverse engineering it was a fairly easy task. Easy but disappointing. Of course, there was no parallel processing when plotting the graphs, just the same function called sequentially for each graph at each point.

Obviously, the ZX Spectrum was not the right platform to look for parallel processing capabilities. Intel architecture, on the other hand, provides us with such a mechanism. In this chapter, we will examine a few capabilities provided by the Streaming SIMD Extensions (SSE), which allow simultaneous computations on so-called packed integers and on packed single precision or packed double precision floating point numbers contained in 128-bit registers.

We will begin the chapter with a brief introduction to the SSE technology, reviewing the available registers and their access modes. Later, we will proceed to the implementation of the algorithm itself, which involves parallel operations on single precision floating point values related to all three biorhythms.
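To give a feel for what "parallel operations on single precision values" means, here is a minimal sketch in C using SSE intrinsics rather than the assembly developed in this chapter. The function name cycle_fractions is hypothetical, and the periods of 23, 28, and 33 days are the conventional biorhythm cycle lengths, assumed here for illustration. One 128-bit register holds four single precision lanes, so the three cycles fit with one lane to spare.

```c
#include <xmmintrin.h>  /* SSE intrinsics; requires an x86 or x86-64 target */

/* Hypothetical helper: how many cycles have elapsed after `days` days.
   out[0..2] = days / period for the three cycles; out[3] is a spare lane. */
static void cycle_fractions(float days, float out[4])
{
    /* Assumed periods: 23, 28 and 33 days. _mm_set_ps lists lanes from
       highest to lowest, so 23.0f ends up in lane 0. The spare lane is
       set to 1.0f to keep the division well defined. */
    __m128 periods = _mm_set_ps(1.0f, 33.0f, 28.0f, 23.0f);

    /* Broadcast the day count to all four lanes. */
    __m128 elapsed = _mm_set1_ps(days);

    /* A single packed division handles all lanes at once. */
    __m128 ratio = _mm_div_ps(elapsed, periods);

    /* Store the four results; no alignment requirement on `out`. */
    _mm_storeu_ps(out, ratio);
}
```

Calling cycle_fractions(46.0f, out) would leave 2.0 in out[0], since 46 days is exactly two 23-day cycles. Multiplying each fraction by 2π and taking the sine would then give the three graph values, which is the part that needs a hand-written sine routine.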

Some steps that are essential for biorhythm graph calculation and are trivial in high-level languages, such as the calculation of sine, exponentiation, and factorial, will be covered in more detail, as we do not have access (at this moment) to any math library; hence, we have no ready-to-use implementation of the procedures involved in these calculations. We will make our own implementation for each step.
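One common way to combine these three pieces, shown here only as a sketch in C and not necessarily the exact approach taken later in the chapter, is the Taylor series sin(x) = x - x^3/3! + x^5/5! - ..., which needs nothing beyond exponentiation, factorial, and basic arithmetic. The function names and the number of terms (TERMS) are assumptions made for illustration.

```c
/* Sketch: sine from a truncated Taylor series, with no math library. */

#define TERMS 6  /* assumed term count: goes up to x^11 / 11! */

/* n! as a float; n stays small here (at most 11). */
static float factorial(unsigned n)
{
    float result = 1.0f;
    for (unsigned i = 2; i <= n; ++i)
        result *= (float)i;
    return result;
}

/* x raised to a non-negative integer power by repeated multiplication. */
static float power(float x, unsigned n)
{
    float result = 1.0f;
    while (n--)
        result *= x;
    return result;
}

/* sin(x) ~ x - x^3/3! + x^5/5! - ... using TERMS terms.
   Intended for x in [-pi, pi]; the caller is expected to reduce
   larger angles into that range first. */
static float sine(float x)
{
    float sum = 0.0f;
    float sign = 1.0f;
    for (unsigned k = 0; k < TERMS; ++k) {
        unsigned n = 2 * k + 1;           /* odd powers: 1, 3, 5, ... */
        sum += sign * power(x, n) / factorial(n);
        sign = -sign;                     /* alternate + and - */
    }
    return sum;
}
```

With six terms the approximation is very close near zero and degrades toward the ends of the range, so more terms or a tighter angle reduction would be needed for higher accuracy. Recomputing each power and factorial from scratch is wasteful; it is written this way only to keep the three building blocks visible.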
