This article on Kubernetes Jobs will take you through what Kubernetes Jobs actually are and the concepts behind them. We give you detailed information on configuring the job definition file, the different types of k8s jobs, and k8s pods. Also, check out the Kubernetes pod failure limit and job use cases in this article. If you are looking to get certified on Kubernetes, then Kubernetes certification training is the course for you.
Getting Started with Kubernetes Job
Before diving into jobs, it is better to understand a bit about Kubernetes pods.
Pods are the building blocks of Kubernetes and its smallest execution unit. A pod can contain one or more containers. So, when we deploy an application to Kubernetes, we create a pod along with its replicas (similar pods) and publish the containerized image into the pod and those replicas.
Usually, pods are not meant to run forever. When a pod is deleted or terminated, there is no way to bring it back. Every pod has a phase that represents its current state. The phases are listed below:
- Pending: The pod has been created and accepted by the cluster, but one or more of its containers are not yet running.
- Running: The pod is bound to a node, its containers have been created, and at least one container is running.
- Succeeded: All containers within the pod have terminated successfully. Terminated pods do not restart.
- Failed: All containers in the pod have terminated, and at least one of them terminated in failure.
- Unknown: The pod's state cannot be determined.
Back to jobs: a job tracks the successful completions of the pods it creates. When the defined number of successful completions is reached, the job is considered complete.
Deleting a Job will clean up the Pods it created. Suspending a Job will delete its active Pods until the Job is resumed.
Kubernetes Job Definition File
Kubernetes resources are mostly defined with YAML files, and jobs are no exception. We will use VS Code for this session. Go ahead and create a new file called sample-job.yml. If you have the Kubernetes extension installed in VS Code, just type job and it will offer you a template to select.
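Below is a minimal sketch of what sample-job.yml might contain; the busybox image and the echo command are illustrative placeholders, not values from the original article:

apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job
spec:
  template:
    spec:
      containers:
      - name: sample-job
        image: busybox:1.36        # placeholder image
        command: ["echo", "Hello from sample-job"]   # placeholder command
      restartPolicy: Never         # a job's pods must not restart automatically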
A Kubernetes job example is provided above. Let us talk a bit about the definition file. Here we have the apiVersion, which is defined as batch/v1; this is the stable API version for Job resources in Kubernetes. Then we have the metadata of the job. Metadata is nothing but details about the job, such as its name.
Then we have the specification of the job: the container details, the image name, and the command to execute when the job runs.
Once the job is executed (for example, with kubectl apply -f sample-job.yml), just run the following command to see the completed jobs.
kubectl get jobs
Multiple Parallel Jobs (Work Queue)
You can use queue services like RabbitMQ or Azure Queue Storage to run multiple jobs at a time in a parallel manner. This is a time-saving approach since the jobs run in parallel. There are a couple of variations of this approach, such as fine parallel processing and coarse parallel processing, both covered later in this article. Whichever queue service you pick, the degree of parallelism is set in the job spec, as sketched below.
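As a minimal sketch, parallelism is controlled with the spec.parallelism and spec.completions fields of the job; the values below are illustrative:

spec:
  parallelism: 3    # run up to three pods at the same time
  completions: 9    # the job is complete after nine pods succeed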
Kubernetes Job Failure and Concurrency Considerations
A Kubernetes job ultimately runs containers, and as with any container, the final exit code determines whether the run was successful. So, if the job's container exits with a code other than 0, the job has failed. There can be many reasons for a k8s job failure, such as running out of disk space.
When a k8s job fails, you can easily investigate the logs and see what has happened. To do that, you can use the kubectl command:
kubectl logs job.batch/sample-job
The Pod Failure Limit
A pod can exceed its memory request if the Node has memory available. But a pod is not allowed to use more than its memory limit. If a pod allocates more memory than its limit, the pod becomes a candidate for termination. If the pod continues to consume memory beyond its limit, the pod is terminated.
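As a short sketch of where these values live, memory requests and limits are set per container in the pod template; the numbers below are illustrative:

spec:
  containers:
  - name: worker
    image: busybox:1.36
    resources:
      requests:
        memory: "64Mi"     # the scheduler reserves this much for the pod
      limits:
        memory: "128Mi"    # exceeding this makes the pod a candidate for termination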
Limiting Kubernetes Job Execution Time
In other words, limiting the execution time is like giving the job a fixed deadline to finish.
The best way to limit job execution time is to set activeDeadlineSeconds within the spec section of the job definition file.
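Here is a minimal sketch of how that could look in sample-job.yml; the sleep command is a placeholder chosen to outlive the deadline:

apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job
spec:
  activeDeadlineSeconds: 20    # terminate the job 20 seconds after it starts
  template:
    spec:
      containers:
      - name: sample-job
        image: busybox:1.36
        command: ["sleep", "60"]   # placeholder work that runs past the deadline
      restartPolicy: Never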
As you can see here, we have set activeDeadlineSeconds to 20, which means this job may run for only 20 seconds. After that, all running pods are terminated, and the job status becomes type: Failed with reason: DeadlineExceeded.
Kubernetes Job Deletion and Cleanup
A k8s job is considered complete when it has finished creating pods and all of those pods have run to successful completion. At that point, no more pod creation or execution is happening. While the pods exist, you can use kubectl commands to check the logs and look for errors or warnings. It is up to the user to delete finished jobs. The best way to delete a job is with a kubectl command like the one below. Let us assume that you have a job definition file named sample-job.yml.
kubectl delete -f sample-job.yml
This will delete the job and all pods that were created from the sample-job.yml file.
By default, a job will execute uninterrupted unless a pod fails. One way to make the job fail deliberately is by defining spec.backoffLimit in the job definition file. It is a retry count, not a duration: once the number of failed retries reaches the backoffLimit (the default is 6), the job is marked as failed, and any running pods are terminated.
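As a minimal sketch, with an illustrative value:

spec:
  backoffLimit: 4    # mark the job as failed after four failed retries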
Another way of terminating the job is by setting an active deadline. Beyond simply failing the job, we can use that property to limit its execution time, but the final result is the same. We do this by setting spec.activeDeadlineSeconds. Once the deadline is reached, all running pods created by the job are terminated, and the status of the job is set to type: Failed with reason: DeadlineExceeded.
Note that activeDeadlineSeconds takes precedence over backoffLimit. Even if the backoffLimit has not been reached, the job will be terminated once activeDeadlineSeconds is reached.
Furthermore, if you wish to learn more about Docker and Kubernetes, the Docker and Kubernetes training is a good course for you. Beyond the training itself, it will lead you to a certification as well.
Kubernetes Jobs Use Cases
Running Automated Tasks with a CronJob
Let us look at what a CronJob is. A CronJob creates k8s jobs on a repeating schedule. A CronJob object is like one line of a crontab file, written in cron format: it runs a job periodically on a given schedule. So simply, CronJobs are Kubernetes scheduled jobs.
Mainly, CronJobs are used to perform regular scheduled actions such as report generation, backups, and so on. These tasks should be configured to recur indefinitely, for example once a day, week, or month.
Let us see what a CronJob definition file looks like.
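Below is a minimal sketch of such a file; the name, image, and date command are illustrative placeholders:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: sample-cronjob
spec:
  schedule: "* * * * *"    # run every minute
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: sample-cronjob
            image: busybox:1.36
            command: ["date"]    # placeholder task
          restartPolicy: OnFailure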
As you can see here, the file is somewhat similar to the k8s job definition file. One noticeable change is the kind: here it is defined as "CronJob", while in a k8s job the kind is just "Job".
Another noticeable thing is the schedule property. Let us talk about the value "* * * * *":
- The first * is for the minute. The value can be from 0 – 59.
- The second * is for the hour. The value can be from 0 – 23.
- The third * is for the day of the month. The value can be from 1 – 31.
- The fourth * is for the month. The value can be from 1 – 12.
- The fifth * is for the day of the week (e.g., Sunday, Monday, and so on). The value can be from 0 – 6, where 0 is Sunday.
Let us take 0 0 13 * 5 as an example. This means the task starts at midnight every Friday, as well as at midnight on the 13th of each month. (When both the day of the month and the day of the week are restricted, cron runs the task when either field matches.)
Coarse Parallel Processing Using a Work Queue
As discussed, we use a queue service like RabbitMQ or Azure Queue Storage to store tasks. When we execute the job with parallel processing, each pod that is created picks up one unit of work from the queue, completes it, deletes it from the queue, and finally exits.
As the first step, we create a queue and fill it with the tasks that need to be executed. After that, the job starts several pods. Each pod picks a single task from the queue and processes it. This is repeated until no more tasks are left in the queue. A sketch of such a job follows.
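As a heavily hedged sketch, such a job might look like the following; the worker image and the broker address are hypothetical placeholders, and the real worker program would need logic to consume one message from the queue and exit:

apiVersion: batch/v1
kind: Job
metadata:
  name: queue-worker
spec:
  completions: 8    # one successful completion per task placed on the queue
  parallelism: 2    # two pods consume from the queue at a time
  template:
    spec:
      containers:
      - name: worker
        image: example/queue-worker:latest    # hypothetical worker image
        env:
        - name: BROKER_URL
          value: amqp://guest:guest@rabbitmq-service:5672    # hypothetical broker address
      restartPolicy: OnFailure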
Talking about the advantages of coarse parallel processing: you do not have to modify the worker program, a.k.a. the code running in the pods, to make it aware of the queue service.
But this method does require you to run a message queue service like RabbitMQ. If you don't want to run a queue service, it is better to use another pattern, such as a queue with a pod per work item or a queue with a variable pod count.
Fine Parallel Processing Using a Work Queue
This is another parallel processing method associated with a queue service like RabbitMQ or Azure Queue Storage. Here, when a pod is created, it repeatedly picks up a task from the queue and processes it until there are no more tasks left. Unlike the coarse pattern, a single pod can therefore process many tasks before exiting.
Indexed Job for Parallel Processing with Static Work Assignment
As per the name, an index is used here. Each worker is a container in its own pod, and each pod has an index. The index allows each pod to identify and work on its part of the overall task.
Kubernetes publishes the index through the batch.kubernetes.io/job-completion-index annotation on each pod and, via the downward API mechanism, exposes it to the container as the JOB_COMPLETION_INDEX environment variable.
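Here is a minimal sketch of an Indexed job; the image and the echo command are placeholders:

apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-job
spec:
  completions: 5
  parallelism: 3
  completionMode: Indexed    # each pod receives a distinct completion index
  template:
    spec:
      containers:
      - name: worker
        image: busybox:1.36
        # JOB_COMPLETION_INDEX tells each pod which slice of the work is its own
        command: ["sh", "-c", "echo processing slice $JOB_COMPLETION_INDEX"]
      restartPolicy: Never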
Parallel Processing Using Expansions
What this simply means is that you can use a common job template to run multiple jobs. Of course, you must change some values when generating each job file, but the template itself stays the same. Let us see an example.
Here you can see a Kubernetes job template named job-template.yml.
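Below is a minimal sketch of what job-template.yml could look like; everything except the $ITEM placeholder is illustrative:

apiVersion: batch/v1
kind: Job
metadata:
  name: process-item-$ITEM
spec:
  template:
    spec:
      containers:
      - name: worker
        image: busybox:1.36
        command: ["sh", "-c", "echo processing item $ITEM"]
      restartPolicy: Never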
This is not a valid manifest file yet; you need to replace $ITEM with a value. We can create two new files from this template:
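One way to generate the two files is with sed, assuming two example items named apple and banana:

mkdir -p ./expansion
for item in apple banana; do
  sed "s/\$ITEM/$item/g" job-template.yml > ./expansion/job-$item.yml
done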
Now we have two valid manifest files created from the template. To execute them, you just need the kubectl command, but this time with the path to the folder that holds the files. We have the two files inside a folder called expansion, so the command can be executed as below:
kubectl apply -f ./expansion
See? Easy.
Conclusion
Now that we have explained everything in detail, let us sum up what we learned. We started with Kubernetes pods, then learned what a Kubernetes job is along with its definition file. Then we looked at parallel jobs with work queues, as well as job and pod failures and the pod failure limit.
Furthermore, we covered limiting job execution time, job deletion and cleanup, and some solid use cases of Kubernetes jobs, including CronJobs and parallel processing with queues. If you are planning to get a certification for your preferred DevOps path, you can go for KnowledgeHut's best DevOps courses online. They cover more areas that can guide you to the certificates.