Storm internode communication

Initially, Storm used ZeroMQ as the channel for internode communication. In version 0.9, Netty was introduced as an experimental replacement, and as of version 0.9.2, Netty has been fully adopted in place of ZeroMQ. In this section we will touch upon both ZeroMQ and Netty, because it's important to understand what makes them different and why the implementers of Storm chose them over their peers.

Figure: Storm internode communication

The preceding figure describes how the different components communicate with each other using the LMAX Disruptor, ZeroMQ, or Netty, depending on whether or not they are executing within the same worker on the same node.

ZeroMQ

ZeroMQ is not a fully fledged messaging system such as AMQP or RabbitMQ. Rather, it's an extensible, asynchronous messaging library that can be used to build truly performance-oriented messaging systems for lean, mean situations. It is highly performant because it does not carry the extra baggage of layers that full frameworks bring along. The library has all the necessary components to arrive quickly at an efficient messaging solution, and it integrates with a large number of programming components through a variety of adapters. In short, it is a lightweight, asynchronous, and ultrafast messaging toolkit that developers can use to craft high-performing solutions.

Because ZeroMQ is implemented in C++, it doesn't run into the performance and GC issues that JVM-based applications generally do.

Here are some of the key aspects that make this library a preferred choice for communication between workers, whether on the same node or on different nodes (a small usage sketch follows the list):

  • It's a lightweight, asynchronous socket library that can be crafted to behave as a high-performing concurrency framework
  • For clustered setups it is marketed as being faster than TCP, which makes it well suited for internode communication
  • It's not a networking protocol itself, but supports a wide variety of transports such as IPC, TCP, multicast, and so on
  • Its asynchronous nature helps in building scalable I/O models for multicore, message-transferring applications
  • It has a variety of inbuilt messaging patterns such as fan-out, pub-sub, request-reply, pipeline, and so on
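To make these patterns a bit more concrete, the following is a minimal pipeline (push/pull) sketch. It assumes the JZMQ/JeroMQ Java bindings for ZeroMQ are on the classpath; the class name, port, and payload are purely illustrative and not part of Storm itself.

import org.zeromq.ZMQ;

public class ZmqPipelineSketch {
    public static void main(String[] args) {
        // One context per process; the argument is the number of I/O threads.
        ZMQ.Context ctx = ZMQ.context(1);

        // PUSH socket: distributes messages to connected PULL sockets.
        ZMQ.Socket push = ctx.socket(ZMQ.PUSH);
        push.bind("tcp://*:5555");

        // PULL socket: receives messages from the PUSH side.
        ZMQ.Socket pull = ctx.socket(ZMQ.PULL);
        pull.connect("tcp://localhost:5555");

        push.send("tuple-payload".getBytes(), 0);
        byte[] received = pull.recv(0);   // blocks until a message arrives
        System.out.println(new String(received));

        pull.close();
        push.close();
        ctx.term();
    }
}

Swapping ZMQ.PUSH and ZMQ.PULL for ZMQ.PUB and ZMQ.SUB (plus a subscription filter on the subscriber) gives the pub-sub pattern from the list above.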

Storm ZeroMQ configurations

Well, there are some very simple configuration entries for this in storm.yaml:

# Whether local mode should use ZeroMQ as the messaging system; if set to false,
# the in-process Java messaging system is used instead. The default is false.
storm.local.mode.zmq: false
# The number of threads each worker process uses for ZeroMQ communication
zmq.threads: 1

The first parameter, storm.local.mode.zmq, controls whether ZeroMQ is used for messaging in local mode. In the latest versions its default is false, so the in-process Java messaging system is used locally, while inter-worker and inter-node communication is handled by Netty.

The zmq.threads parameter controls the number of ZeroMQ threads spawned for every worker. Here, more is not always merrier: we should never spawn more threads than our cores can handle, or many cycles will be lost to contention. The default value is one per worker.
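If you prefer to set these values per topology in code rather than in storm.yaml, backtype.storm.Config (a plain map keyed by the same property names) can be used. The following is a minimal sketch assuming Storm 0.9.x; the class name and the desired thread count are illustrative assumptions, and the Math.min guard simply encodes the advice above about not exceeding the core count.

import backtype.storm.Config;

public class ZmqThreadConfigSketch {
    public static Config buildConfig() {
        Config conf = new Config();
        // Keep the in-process Java messaging system in local mode (the default).
        conf.put("storm.local.mode.zmq", false);
        // Illustrative desired thread count, capped at the number of available cores.
        int desiredZmqThreads = 2;
        int cores = Runtime.getRuntime().availableProcessors();
        conf.put("zmq.threads", Math.min(desiredZmqThreads, cores));
        return conf;
    }
}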

Netty

The preceding section on ZeroMQ makes clear why it seemed a perfect choice for Storm. Over the course of time, however, Storm adopters realized the issues that ZeroMQ posed as a transport layer: being a native library, it ran into platform-specific problems. In the early days, installing Storm with ZeroMQ was not simple, because the library had to be downloaded and built for every installation. Another significant issue was that Storm was tightly coupled to a relatively old version of ZeroMQ, 2.1.7.

With that need understood, let's get introduced to Netty. Netty is an NIO-based client-server framework for building messaging solutions. It enables quick development, is easy to use thanks to its relatively simple design, and is high-performing and scalable, as well as extensible and flexible; a minimal server sketch follows the feature list below.

The aspects that make Netty better than its peers are as follows:

  • Low-latency, non-blocking asynchronous operations
  • Higher throughput
  • Low resource consumption
  • Minimized memory copying
  • Secure SSL and TLS support
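To give a feel for the programming model behind these features, here is a minimal, self-contained echo server. It is written against the Netty 4 API (io.netty packages); the Netty line bundled with Storm 0.9.x was the older 3.x API, so treat this strictly as an illustration of Netty's non-blocking, pipeline-based design rather than as Storm's own transport code. The port is arbitrary.

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class NettyEchoServerSketch {
    public static void main(String[] args) throws Exception {
        EventLoopGroup boss = new NioEventLoopGroup(1);   // accepts connections
        EventLoopGroup workers = new NioEventLoopGroup(); // handles I/O
        try {
            ServerBootstrap bootstrap = new ServerBootstrap();
            bootstrap.group(boss, workers)
                     .channel(NioServerSocketChannel.class)
                     .childHandler(new ChannelInitializer<SocketChannel>() {
                         @Override
                         protected void initChannel(SocketChannel ch) {
                             ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                                 @Override
                                 public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                     // Echo the received bytes straight back to the client.
                                     ctx.writeAndFlush(msg);
                                 }
                             });
                         }
                     });
            ChannelFuture future = bootstrap.bind(9000).sync();
            future.channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}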

Storm's transport layer is pluggable: developers can choose between the two options, ZeroMQ and Netty, just by means of a few settings in storm.yaml:

# Use Netty as the transport layer
storm.messaging.transport: "backtype.storm.messaging.netty.Context"
# Number of Netty worker threads on the server (receiving) and client (sending) sides
storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
# Buffer size, in bytes, used for batching messages (here 5 MB)
storm.messaging.netty.buffer_size: 5242880
# Maximum number of reconnection attempts, and the upper and lower bounds
# (in milliseconds) of the wait time between attempts
storm.messaging.netty.max_retries: 100
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100

The configuration starts by telling Storm that the transport layer is Netty: we have to keep storm.local.mode.zmq set to false and storm.messaging.transport set to backtype.storm.messaging.netty.Context.
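The same choice can also be made per topology in code. The sketch below assumes Storm 0.9.x, where backtype.storm.Config is a map keyed by the same storm.yaml property names shown above; the class name is illustrative, and the resulting Config would typically be passed to StormSubmitter.submitTopology() along with the topology.

import backtype.storm.Config;

public class NettyTransportConfigSketch {
    public static Config buildConfig() {
        Config conf = new Config();
        conf.put("storm.local.mode.zmq", false);
        conf.put("storm.messaging.transport", "backtype.storm.messaging.netty.Context");
        conf.put("storm.messaging.netty.server_worker_threads", 1);
        conf.put("storm.messaging.netty.client_worker_threads", 1);
        conf.put("storm.messaging.netty.buffer_size", 5242880);  // 5 MB
        conf.put("storm.messaging.netty.max_retries", 100);
        conf.put("storm.messaging.netty.max_wait_ms", 1000);
        conf.put("storm.messaging.netty.min_wait_ms", 100);
        return conf;
    }
}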
