Chapter 8. Periodic Job

The Periodic Job pattern extends the Batch Job pattern by adding a time dimension and allowing the execution of a unit of work to be triggered by a temporal event.

Problem

In the world of distributed systems and microservices, there is a clear tendency toward real-time and event-driven application interactions using HTTP and lightweight messaging. However, regardless of the latest trends in software development, job scheduling has a long history, and it is still relevant. Periodic Jobs are commonly used for automating system maintenance or administrative tasks. They are also relevant to business applications requiring specific tasks to be performed periodically. Typical examples here are business-to-business integration through file transfer, application integration through database polling, sending newsletter emails, and cleaning up and archiving old files.

The traditional way of handling Periodic Jobs for system maintenance purposes has been the use of specialized scheduling software or Cron. However, specialized software can be expensive for simple use cases, and Cron jobs running on a single server are difficult to maintain and represent a single point of failure. That is why, very often, developers tend to implement solutions that can handle both the scheduling aspect and the business logic that needs to be performed. For example, in the Java world, libraries such as Quartz, Spring Batch, and custom implementations with the ScheduledThreadPoolExecutor class can run temporal tasks. But similarly to Cron, the main difficulty with this approach is making the scheduling capability resilient and highly available, which leads to high resource consumption. Also, with this approach, the time-based job scheduler is part of the application, and to make the scheduler highly available, the whole application must be highly available. Typically, that involves running multiple instances of the application, and at the same time, ensuring that only a single instance is active and schedules jobs—which involves leader election and other distributed systems challenges.

In the end, a simple service that has to copy a few files once a day may end up requiring multiple nodes, a distributed leader election mechanism, and more. The Kubernetes CronJob implementation solves all that by allowing Job resources to be scheduled using the well-known Cron format, letting developers focus only on implementing the work to be performed rather than the temporal scheduling aspect.

Solution

In Chapter 7, Batch Job, we saw the use cases and the capabilities of Kubernetes Jobs. All of that applies to this chapter as well since the CronJob primitive builds on top of a Job. A CronJob instance is similar to one line of a Unix crontab (cron table) and manages the temporal aspects of a Job. It allows the execution of a Job periodically at a specified point in time. See Example 8-1 for a sample definition.

Example 8-1. A CronJob resource
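# batch/v1 is the stable CronJob API since Kubernetes 1.21;
# batch/v1beta1 was removed in Kubernetes 1.25
apiVersion: batch/v1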
kind: CronJob
metadata:
  name: random-generator
spec:
  # Every three minutes
  schedule: "*/3 * * * *"       # (1)
  jobTemplate:
    spec:
      template:                 # (2)
        spec:
          containers:
          - image: k8spatterns/random-generator:1.0
            name: random-generator
            command: [ "java", "-cp", "/", "RandomRunner", "/numbers.txt", "10000" ]
          restartPolicy: OnFailure

(1) Cron specification for running every three minutes

(2) Job template that uses the same specification as a regular Job

Apart from the Job spec, a CronJob has additional fields to define its temporal aspects:

.spec.schedule

Crontab entry for specifying the Job’s schedule (e.g., 0 * * * * for running every hour).

.spec.startingDeadlineSeconds

Deadline (in seconds) for starting the Job if it misses its scheduled time. In some use cases, a task is valid only if it is executed within a certain timeframe and is useless when executed late. For example, if a Job is not executed at the desired time because of a lack of compute resources or other missing dependencies, it might be better to skip the execution because the data it is supposed to process is already obsolete.

.spec.concurrencyPolicy

Specifies how to manage concurrent executions of Jobs created by the same CronJob. The default behavior, Allow, creates new Job instances even if the previous Jobs have not completed yet. If that is not the desired behavior, it is possible to skip the next run if the current one has not completed yet with Forbid, or to cancel the currently running Job and start a new one with Replace.

.spec.suspend

Field suspending all subsequent executions without affecting already started executions.

.spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit

Fields specifying how many completed and failed Jobs should be kept for auditing purposes.
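
The fields above can be combined in a single resource. The following is a minimal sketch, not a recommended configuration: it reuses the random-generator container from the sample definition, and the concrete values (an hourly schedule, a 300-second deadline, Forbid) are assumptions chosen only for illustration. It uses batch/v1, the stable CronJob API version; older clusters may need batch/v1beta1 instead.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: random-generator
spec:
  # Crontab entry: minute 0 of every hour
  schedule: "0 * * * *"
  # Illustrative value: skip a run that cannot start within
  # five minutes of its scheduled time
  startingDeadlineSeconds: 300
  # Skip the next run while the previous Job is still running
  # (the default is Allow; Replace cancels the running Job instead)
  concurrencyPolicy: Forbid
  # Set to true to pause subsequent executions (default is false)
  suspend: false
  # How many finished Jobs to keep for auditing
  # (Kubernetes defaults are 3 successful and 1 failed)
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - image: k8spatterns/random-generator:1.0
            name: random-generator
            command: [ "java", "-cp", "/", "RandomRunner", "/numbers.txt", "10000" ]
          restartPolicy: OnFailure
```

Whether Forbid or Replace is the better choice depends on the task: Forbid suits work that must run to completion, while Replace suits work where only the most recent run matters.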

CronJob is a very specialized primitive, and it applies only when a unit of work has a temporal dimension. Even if CronJob is not a general-purpose primitive, it is an excellent example of how Kubernetes capabilities build on top of each other and support non-cloud-native use cases as well.

Discussion

As you can see, a CronJob is a pretty simple primitive that adds clustered, Cron-like behavior to the existing Job definition. But when it is combined with other primitives such as Pods, container resource isolation, and other Kubernetes features such as those described in Chapter 6, Automated Placement, or Chapter 4, Health Probe, it ends up being a very powerful Job scheduling system. This enables developers to focus solely on the problem domain and implement a containerized application that is responsible only for the business logic to be performed. The scheduling is performed outside the application, as part of the platform with all of its added benefits such as high availability, resiliency, capacity, and policy-driven Pod placement. Of course, similar to the Job implementation, when implementing a CronJob container, your application has to consider all corner and failure cases of duplicate runs, no runs, parallel runs, or cancellations.
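
As a rough sketch of how a CronJob might be combined with the other primitives mentioned here, the following definition adds container resource settings, a node selector for placement, and Job-level limits that bound retries and run time. All concrete values are assumptions for illustration, and the workload-type node label is hypothetical; your cluster would need nodes carrying whatever label you select on.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: random-generator
spec:
  schedule: "*/3 * * * *"
  # Reduce the chance of parallel runs of the same task
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      # Illustrative value: number of retries before the Job
      # is marked as failed
      backoffLimit: 2
      # Illustrative value: terminate a run that is still active
      # after two minutes
      activeDeadlineSeconds: 120
      template:
        spec:
          # Hypothetical node label used for policy-driven placement
          nodeSelector:
            workload-type: batch
          containers:
          - image: k8spatterns/random-generator:1.0
            name: random-generator
            command: [ "java", "-cp", "/", "RandomRunner", "/numbers.txt", "10000" ]
            # Illustrative resource requests and limits
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                memory: 256Mi
          restartPolicy: OnFailure
```

Settings like these narrow the corner cases but do not remove them. The container itself should still tolerate duplicate, missed, and interrupted runs, for example by making its work idempotent.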
