Chapter 1. Introducing Storm
Figure 1.1. A batch processor and how data flows into it
Figure 1.2. A stream processor and how data flows into it
Figure 1.3. Example of how Storm may be used within a system
Figure 1.4. How Storm can be used with other technologies
Chapter 2. Core Storm concepts
Figure 2.1. Mock-up of dashboard for a running count of changes made to a repository
Figure 2.4. Design mapped to the definition of a Storm topology
Figure 2.5. Format for displaying tuples in figures throughout the book
Figure 2.7. Identifying the two streams in our topology
Figure 2.8. Topology with four streams
Figure 2.9. A spout reads from the feed of commit messages.
Figure 2.12. Individual instances of a bolt can emit to any number of instances of another bolt.
Figure 2.13. Each stream in the topology will have its own stream grouping.
Figure 2.14. Using a shuffle grouping between our spout and first bolt
Figure 2.16. Storm’s class hierarchy for the spout
Chapter 3. Topology design
Figure 3.1. Using check-ins to build a heat map of bars
Figure 3.2. Transforming input tuples to end tuples via a series of operations
Figure 3.3. Heat map design mapped to Storm concepts
Figure 3.9. Focusing our parallelization changes on the Checkins spout
Figure 3.11. Four Checkins instances emitting tuples to eight GeocodeLookup instances
Figure 3.13. Executors (threads) and tasks (instances of spouts/bolts) run on a JVM.
Figure 3.14. Updated topology with the TimeIntervalExtractor bolt
Figure 3.15. Updated topology stream groupings
Figure 3.16. Parallelizing all the components in our topology
Figure 3.18. The heat map topology design as a series of functional components
Figure 3.19. The HeatMap topology design as points of repartition
Chapter 4. Creating robust topologies
Figure 4.1. Conceptual solution of the e-commerce credit card authorization flow
Figure 4.2. E-commerce credit card authorization mapped to Storm concepts
Figure 4.5. Initial state of the tuple tree
Figure 4.6. Tuple tree after the AuthorizeCreditCard bolt emits a tuple
Figure 4.7. Tuple tree after the AuthorizeCreditCard bolt acks its input tuple
Figure 4.8. Tuple tree after the ProcessedOrderNotification bolt acks its input tuple
Chapter 5. Moving from local to remote topologies
Figure 5.1. Nimbus and Supervisors and their responsibilities inside a Storm cluster
Figure 5.2. The Zookeeper Cluster and its role within a Storm cluster
Figure 5.5. Extracted contents of a Storm release zip
Figure 5.6. Extracted contents of a Storm release zip
Figure 5.7. The command for deploying your topology to a Storm cluster
Figure 5.8. The Cluster Summary screen shows details for the entire Storm cluster.
Figure 5.9. The Topology summary screen shows details for a specific topology.
Figure 5.10. The Cluster Summary screen of the Storm UI
Figure 5.11. The Cluster Summary section on the Cluster Summary screen of the Storm UI
Figure 5.12. The Topology summary section on the Cluster Summary screen of the Storm UI
Figure 5.13. The Supervisor summary section on the Cluster Summary screen of the Storm UI
Figure 5.14. The Nimbus Configuration section on the Cluster Summary screen of the Storm UI
Figure 5.15. The Topology summary screen of the Storm UI
Figure 5.16. The Topology summary section on the Topology summary screen of the Storm UI
Figure 5.17. The Topology actions section on the Topology summary screen of the Storm UI
Figure 5.18. The Topology stats section on the Topology summary screen of the Storm UI
Figure 5.19. The Spouts section on the Topology summary screen of the Storm UI
Figure 5.20. The Bolts section on the Topology summary screen of the Storm UI, up to Capacity column
Figure 5.21. The Bolts section on the Topology summary screen of the Storm UI, remaining columns
Figure 5.22. The Topology Configuration section on the Topology summary screen of the Storm UI
Figure 5.23. The bolt summary screen in the Storm UI
Figure 5.24. The Component Summary section for a bolt in the Storm UI
Figure 5.25. The Bolt stats section in the Storm UI
Figure 5.26. The Input stats section for a bolt in the Storm UI
Figure 5.27. The Output stats section for a bolt in the Storm UI
Figure 5.28. The Executors section for a bolt in the Storm UI, through Capacity column
Figure 5.29. The Executors section for a bolt in the Storm UI, remaining columns
Chapter 6. Tuning in Storm
Figure 6.1. The Find My Sale! topology: its components and data points
Figure 6.2. The Find My Sale! design mapped to Storm concepts
Figure 6.3. The spout emits a tuple for each customer ID that it receives.
Figure 6.7. Topology summary screen of the Storm UI
Figure 6.8. Identifying the important parts of the Storm UI for our tuning lesson
Figure 6.11. Storm UI showing improved capacity for all three bolts in our topology
Figure 6.13. LatencySimulator constructor arguments explained
Figure 6.14. Storm UI showing the last error for each of our bolts
Chapter 7. Resource contention
Figure 7.2. Many worker processes running on a worker node
Figure 7.3. Diagnosing issues for a particular topology in the Storm UI
Figure 7.6. Storm UI: Zero free slots could mean your topologies are suffering from slot contention.
Figure 7.9. GC log output showing details of a minor garbage collection of young generation memory
Figure 7.11. GC log output showing entire heap values and complete GC time
Figure 7.13. sar command breakdown
Figure 7.14. Output of sar -S 1 3 for reporting swap space utilization
Figure 7.15. Output of sar -u 1 3 for reporting CPU utilization
Figure 7.18. Output of sar -u 1 3 for reporting CPU utilization and, in particular, I/O wait times
Chapter 8. Storm internals
Figure 8.1. Commit count topology design and flow of data between the spout and bolts
Figure 8.4. Focusing on data flowing into the spout
Figure 8.6. Focusing on passing tuples within the same JVM
Figure 8.8. Focusing on a bolt that emits a tuple
Figure 8.9. The executor for our bolt, with two threads and two queues
Figure 8.10. Focusing on sending tuples between JVMs
Figure 8.12. Focusing on a bolt that does not emit a tuple
Figure 8.15. An executor with multiple tasks
Figure 8.16. Mapping out the steps taken when determining the destination task for an emitted tuple
Figure 8.17. Snapshot of a debug log output for a bolt instance
Figure 8.18. Breaking down the debug log output lines for the send/receive queue metrics
Chapter 9. Trident
Figure 9.2. Distribution of a Kafka topic as a group of partitions on many Kafka brokers
Figure 9.4. Trident topology for internet radio application
Figure 9.5. The Trident Kafka spout will be used for handling incoming play logs.
Figure 9.9. Breaking down the stateQuery operation
Figure 9.10. Our Trident topology broken down into spouts and bolts in the Storm UI
Figure 9.11. Our Trident topology displayed on the Storm UI after naming each of the operations
Figure 9.12. How our Trident operations are mapped down into bolts
Figure 9.13. The Storm UI with named operations for both the Trident topology and DRPC stream
Figure 9.14. How the Trident and DRPC streams and operations are being mapped down into bolts
Figure 9.16. Partitioned stream with a series of batches in between two operations
Figure 9.17. Kafka topic partitions and how they relate to the partitions within a Trident stream