1.2. Challenges of Distributed Computing

Despite their benefits, distributed applications can be notoriously difficult to design, build, and debug. The distributed environment introduces many complexities that aren't concerns when writing standalone applications. Perhaps the most obvious complexity is the variety of machine architectures and software platforms over which a distributed application must commonly execute. In the past, this heterogeneity problem has thwarted the development and proliferation of distributed applications: developing an application entailed porting it to every platform it would run on, as well as managing the distribution of platform-specific code to each machine. More recently, the Java virtual machine has eased this burden by providing automatic loading of class files across a network, along with a common virtual machine that runs on most platforms and allows applications to achieve “Write once, Run anywhere™” status.

The realities of a networked environment present many challenges beyond heterogeneity. By their very nature, distributed applications are built from multiple (potentially faulty) components that communicate over (potentially slow and unreliable) network links. These characteristics force us to address issues such as latency, synchronization, and partial failure that simply don't occur in standalone applications. These issues have a significant impact on distributed application design and development. Let's take a closer look at each one:

Latency: In order to collaborate, processes in a distributed application need to communicate. Unfortunately, over networks, communication can take a long time relative to the speed of processors. This time lag, called latency, is typically several orders of magnitude greater than communication time between local processes on the same machine. As much as we'd like to sweep this disparity under the rug, ignoring it is likely to lead to poor application performance. As a designer, you must account for latency in order to write efficient applications.
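
To get a feel for why latency shapes design, consider a back-of-the-envelope cost model. The sketch below is only an illustration: the class name, the method, and every number in it (a 500-microsecond round trip, 2 microseconds of transfer time per item) are assumptions chosen for the arithmetic, not measurements of any real network. It compares sending 1,000 items one request at a time against sending them in batches of 100.

```java
// Hypothetical cost model; all names and figures here are illustrative assumptions.
final class LatencyEstimate {

    // Total time to move `items` items when each request carries up to
    // `itemsPerRequest` items. Every request pays one fixed round-trip
    // latency; every item pays a small transfer cost.
    static long totalMicros(int items, int itemsPerRequest,
                            long roundTripMicros, long perItemMicros) {
        int requests = (items + itemsPerRequest - 1) / itemsPerRequest; // ceiling division
        return requests * roundTripMicros + (long) items * perItemMicros;
    }

    public static void main(String[] args) {
        long oneByOne = totalMicros(1000, 1, 500, 2);   // 1,000 round trips
        long batched  = totalMicros(1000, 100, 500, 2); // 10 round trips
        System.out.printf("one at a time: %d us, batched: %d us%n", oneByOne, batched);
    }
}
```

Under these assumed figures the round trips dominate the total: the one-at-a-time version spends 502,000 microseconds and the batched version 7,000, even though both move exactly the same data. The particular numbers don't matter; the point is that the number of network round trips, not the amount of computation, often determines how a distributed application performs.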

Synchronization: To cooperate with each other, processes in a distributed application need not only to communicate, but also to synchronize their actions. For example, a distributed algorithm might require processes to work in lock step: all need to complete one phase of an algorithm before proceeding to the next phase. Processes also need to synchronize (essentially, wait their turn) in accessing and updating shared data. Synchronizing distributed processes is challenging, since the processes are truly asynchronous: they run independently at their own pace and communicate without any centralized controller. Synchronization is an important consideration in distributed application design.
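
The lock-step pattern is easy to see in miniature. The following sketch uses threads inside a single virtual machine as stand-ins for distributed processes, and the standard java.util.concurrent.CyclicBarrier as the rendezvous point. That is a deliberate simplification and an assumption of this example: real distributed processes share no memory and have no such built-in barrier, which is exactly what makes the problem hard. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

// Hypothetical demo: threads stand in for distributed processes.
final class LockStepDemo {

    // Runs `workers` threads through `phases` phases in lock step and
    // returns a log of events in the order they occurred.
    static List<String> run(int workers, int phases) {
        List<String> log = new CopyOnWriteArrayList<>();
        int[] phase = {0}; // touched only inside the barrier action
        // The barrier action runs once per phase, after every worker has arrived
        // and before any worker is released into the next phase.
        CyclicBarrier barrier = new CyclicBarrier(workers,
                () -> log.add("phase " + (phase[0]++) + " complete"));

        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                try {
                    for (int p = 0; p < phases; p++) {
                        log.add("worker " + id + " finished phase " + p);
                        barrier.await(); // wait here until all workers finish this phase
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[w].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return log;
    }

    public static void main(String[] args) {
        run(3, 2).forEach(System.out::println);
    }
}
```

The order in which workers report within a phase varies from run to run, but no worker ever starts phase 1 before every worker has finished phase 0. Achieving the same guarantee across machines, with no shared barrier object and with messages that may be delayed or lost, is the distributed version of the problem.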

Partial failure: Perhaps the greatest challenge you will face when developing distributed systems is partial failure. The longer an application runs and the more processes it includes, the more likely it is that one or more components will fail or become disconnected from the execution (due to machine crashes or network problems). From the perspective of other participants in a distributed computation, a failed process is simply “missing in action,” and the reasons for failure can't be determined. Of course, in the case of a standalone application, partial failure is not an issue: if a single component fails, then the entire computation fails, and we either restart the application or reboot the machine. A distributed system, on the other hand, must be able to adapt gracefully in the face of partial failure, and it is your job as the designer to ensure that an application maintains a consistent global state (a tricky business).
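
The “missing in action” problem can be illustrated with a timeout. In the sketch below, a Callable stands in for a request to a remote process (an assumption of this example; no real networking is involved), and the caller waits only a bounded time for an answer using the standard Future.get method with a timeout. The class and method names are hypothetical. Notice what the caller learns when the wait runs out: only that no answer arrived. It cannot tell whether the other side crashed, is merely slow, or was cut off by the network.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo: a Callable stands in for a call to a remote process.
final class PartialFailureDemo {

    // Returns the reply if one arrives within the timeout, or an empty
    // Optional if the call fails or no reply arrives in time. (A null
    // reply also yields an empty Optional in this simplified sketch.)
    static <T> Optional<T> callWithTimeout(Callable<T> remoteCall, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> reply = executor.submit(remoteCall);
        try {
            return Optional.ofNullable(reply.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException | ExecutionException e) {
            // Silence or an error: the caller cannot know why.
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        } finally {
            executor.shutdownNow(); // interrupt the call if it is still running
        }
    }

    public static void main(String[] args) {
        Optional<String> healthy = callWithTimeout(() -> "pong", 1000);
        Optional<String> silent = callWithTimeout(() -> {
            Thread.sleep(5000); // simulates a peer that does not answer in time
            return "late";
        }, 100);
        System.out.println("healthy peer replied: " + healthy.isPresent());
        System.out.println("silent peer replied: " + silent.isPresent());
    }
}
```

A timeout like this only detects that something may be wrong; it does nothing to keep the global state consistent. If the silent peer actually completed the request before its reply was lost, the caller and the peer now disagree about what happened, and resolving that disagreement is the tricky business mentioned above.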

These challenges are often difficult to overcome and can consume a significant amount of time in any distributed programming project. These difficulties extend beyond design and initial development; they can plague a project with bugs that are difficult to diagnose. We'll spend a fair amount of time in this book discussing features and techniques the JavaSpaces technology gives us for approaching these challenges, but first we need to lay a bit of groundwork.
