Job Definitions, Task Types, and Tasks#
Overview#
Omniverse Farm is a scheduling system for executing configurable tasks on Windows or Linux systems. Farm organizes and processes workloads using three key components:
A task represents a unit of work submitted to Farm for execution.
Each task specifies a task type, which determines how it should be executed.
The task type corresponds to a job definition, which provides execution details such as the command to run, required arguments, and environment settings.
The mapping between a task’s task type and a job definition’s name allows Farm to execute tasks correctly. When a task is submitted, a Farm agent looks up the job definition by its name, merges the task-specific information with the predefined job execution details, and then runs the task accordingly.
This guide will explore job definitions, differentiate between base and kit-service job definitions, and explain how tasks are structured in relation to job definitions.
Job Definitions#
A job definition serves as a blueprint for executing tasks. It defines how Farm should process a task by specifying the execution details, including:
The command to run
The arguments required for execution
The environment settings
The name, which serves as the identifier that tasks reference via their task type
Each job definition contains a name, which uniquely identifies it within a Farm instance. When submitting a task, the user specifies the task type, which must match the name of an existing job definition; otherwise, the task will remain in the submitted state until a matching job definition is found.
Job Definition Schema#
A job definition consists of the following properties:
Property | Type | Description
---|---|---
name | string | A unique identifier for the job definition. Tasks use this as their task_type.
job_type | string | The type of job, either base or kit-service.
command | string | The application or script to run.
task_function | string | Module to execute when specifying a kit-service job.
working_directory | string | The directory in which the command should be executed.
success_return_codes | list | List of return codes that indicate a successful execution.
args | list | Static arguments that apply to all tasks using this job definition.
allowed_args | dict | Arguments that may change per task, defined with default values.
env | dict | Environment variables required for execution.
extension_paths | list | Paths to any additional Kit extensions required.
log_to_stdout | bool | Whether to capture stdout and stderr for the job's logs.
headless | bool | Indicates if the job should run without a GUI.
active | bool | Specifies if the job definition is enabled.
container | string | The Docker image location for containerized execution.
capacity_requirements | dict | Defines resource requirements such as CPU and memory. Kubernetes only.
capacity_requirements schema reference (Kubernetes only)#
The following lists the capacity_requirements properties available when Farm is deployed within a Kubernetes environment.
These properties are specific to the container-v1-core and podspec-v1-core specifications from Kubernetes version 1.24.
Two special properties, container_spec_field_overrides and pod_spec_field_overrides, are provided for specifying fields that may be added in future Kubernetes specs.
Container core properties
container_spec_field_overrides: Special property that does not apply to any particular Kubernetes field; instead, it can be used to inject fields that may be added in future Kubernetes releases.
[job.sample-job.capacity_requirements.container_spec_field_overrides]
futureKubernetesContainerCoreField = "foobar"

env: List of environment variables to set in the job's container pod.
[[job.sample-job.capacity_requirements.env]]
name = "foo"
value = "bar"

envFrom: List of sources from which to populate environment variables in the job's container pod.
[[job.sample-job.capacity_requirements.envFrom]]
[job.sample-job.capacity_requirements.envFrom.configMapRef]
name = "sample-config"

image_pull_policy: The image pull policy for the job's container image.
[job.sample-job.capacity_requirements]
image_pull_policy = "Always"

lifecycle: Specify the job's container lifecycle handlers.
[job.sample-job.capacity_requirements.lifecycle.postStart.exec]
command = [
"/bin/sh",
"-c",
"echo Hello from the postStart handler > /usr/share/message"
]
[job.sample-job.capacity_requirements.lifecycle.preStop.exec]
command = [
"/bin/sh",
"-c",
"sleep 1"
]

liveness_probe: Specify the job's container pod liveness probe.
[job.sample-job.capacity_requirements.liveness_probe]
[job.sample-job.capacity_requirements.liveness_probe.httpGet]
path = "/status"
port = "http"

ports: Specify the job's container pod container ports.
[[job.sample-job.capacity_requirements.ports]]
name = "http"
containerPort = 80
protocol = "TCP"

resource_limits: Specify the job's container pod resource limits. Refer to the Kubernetes resource units documentation for acceptable units.
[job.sample-job.capacity_requirements.resource_limits]
cpu = 1
memory = "4096Mi"
"nvidia.com/gpu" = 1

readiness_probe: Specify the job's container pod readiness probe.
[job.sample-job.capacity_requirements.readiness_probe]
[job.sample-job.capacity_requirements.readiness_probe.httpGet]
path = "/status"
port = "http"

security_context: Specify the security context of the job's container.
[job.sample-job.capacity_requirements.security_context]
runAsUser = 2000
allowPrivilegeEscalation = false

startup_probe: Specify the job's container pod startup probe.
[job.sample-job.capacity_requirements.startup_probe]
[job.sample-job.capacity_requirements.startup_probe.httpGet]
path = "/status"
port = "http"

stdin: Control whether the job's container should allocate a buffer for stdin in the container runtime.
[job.sample-job.capacity_requirements]
stdin = true

stdin_once: Control whether the container runtime should close the stdin channel after it has been opened by a single attach.
[job.sample-job.capacity_requirements]
stdin_once = false

termination_message_path: Path at which the file containing the container's termination message is mounted into the container's filesystem.
[job.sample-job.capacity_requirements]
termination_message_path = "/dev/termination-log"

termination_message_policy: Indicate how the termination message should be populated.
[job.sample-job.capacity_requirements]
termination_message_policy = "File"

tty: Control whether the job's container should allocate a TTY for itself; also requires stdin to be true.
[job.sample-job.capacity_requirements]
tty = true

volume_devices: Specify the job's container pod volume devices.
[[job.sample-job.capacity_requirements.volume_devices]]
devicePath = "/myrawblockdevice"
name = "blockDevicePvc"

volume_mounts: Specify the job's container pod volume mounts.
[[job.sample-job.capacity_requirements.volume_mounts]]
mountPath = "/root/.provider/"
name = "creds"
Pod Spec Properties
pod_spec_field_overrides: Special property that does not apply to any particular Kubernetes field; instead, it can be used to inject fields that may be added in future Kubernetes releases.
[job.sample-job.capacity_requirements.pod_spec_field_overrides]
futureKubernetesPodSpecField = "foobar"

active_deadline_seconds: Duration in seconds the pod may be active on the node, relative to StartTime, before the system will actively try to mark it failed and kill associated containers.
[job.sample-job.capacity_requirements]
active_deadline_seconds = 30

affinity: Specify the job's container pod affinity.
[[job.sample-job.capacity_requirements.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms]]
[[job.sample-job.capacity_requirements.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.matchExpressions]]
key = "name"
operator = "In"
values = [ "worker-node" ]
[[job.sample-job.capacity_requirements.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution]]
weight = 1
[[job.sample-job.capacity_requirements.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution.preference.matchExpressions]]
key = "type"
operator = "In"
values = [ "01" ]

automount_service_account_token: Indicate whether a service account token should be automatically mounted.
[job.sample-job.capacity_requirements]
automount_service_account_token = true

dnsConfig: Specifies the DNS parameters of the job's container pod.
[job.sample-job.capacity_requirements.dnsConfig]
nameservers = [ "1.2.3.4" ]
searches = [ "ns1.svc.cluster-domain.example", "my.dns.search.suffix" ]
[[job.sample-job.capacity_requirements.dnsConfig.options]]
name = "ndots"
value = "2"
[[job.sample-job.capacity_requirements.dnsConfig.options]]
name = "edns0"

dns_policy: Set the DNS policy for the job's container pod.
[job.sample-job.capacity_requirements]
dns_policy = "ClusterFirst"

enable_service_links: Indicates whether information about services should be injected into the pod's environment variables.
[job.sample-job.capacity_requirements]
enable_service_links = true

List of ephemeral containers run in the job's container pod (ephemeralContainers).

List of hosts and IPs that will be injected into the pod's hosts file if specified; this is only valid for non-hostNetwork pods (hostAliases).

Use the host's IPC namespace (hostIPC).

Host networking requested for the job's container pod (hostNetwork).

Use the host's PID namespace (hostPID).

Specifies the hostname of the Pod (hostname).

imagePullSecrets: List of references to secrets in the same namespace to use for pulling any of the images.
[[job.sample-job.capacity_requirements.imagePullSecrets]]
name = "registry-secret"

List of initialization containers (initContainers).

A request to schedule this pod onto a specific node (nodeName).

node_selector: Selector which must be true for the pod to fit on a node.
[job.sample-job.capacity_requirements.node_selector]
"beta.kubernetes.io/instance-type" = "worker"
"beta.kubernetes.io/os" = "linux"

Specifies the OS of the containers in the pod (os).

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass (overhead).

Policy for preempting pods with lower priority (preemptionPolicy).

The pod's priority value (priority).

Indicate the pod's priority class (priorityClassName).

The pod's readiness gates (readinessGates).

Set the pod's runtime class name (runtimeClassName).

The specific scheduler used to dispatch the pod (schedulerName).

pod_security_context: Specify the security context of the job's container pod.
[job.sample-job.capacity_requirements.pod_security_context]
runAsUser = 1000

Set the pod's service account (serviceAccount).

Name of the service account to use to run this pod (serviceAccountName).

The pod's hostname will be configured as the pod's FQDN (setHostnameAsFQDN).

Share a single process namespace between all of the containers in a pod (shareProcessNamespace).

Specify the pod's subdomain (subdomain).

Duration in seconds the pod needs to terminate gracefully (terminationGracePeriodSeconds).

tolerations: Specify the job's container pod tolerations.
[[job.sample-job.capacity_requirements.tolerations]]
key = "key1"
operator = "Equal"
value = "value1"
effect = "NoSchedule"

Topology domain constraints for the pod (topologySpreadConstraints).

volumes: Specify the job's container pod volumes. Refer to the Kubernetes volumes documentation for more examples and valid fields. The following is an example of mounting a config map.
[[job.sample-job.capacity_requirements.volumes]]
name = "creds"
[job.sample-job.capacity_requirements.volumes.configMap]
name = "credentials-cm"
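As a sketch of how these properties fit into a job definition, the snippets from the tables above could be combined as follows (the job name and values are illustrative):
[job.sample-job]
job_type = "base"
name = "sample-job"
command = "/usr/bin/converter"
log_to_stdout = true
# Kubernetes-only capacity requirements for the job's container pod:
[job.sample-job.capacity_requirements]
image_pull_policy = "Always"
[job.sample-job.capacity_requirements.resource_limits]
cpu = 1
memory = "4096Mi"
"nvidia.com/gpu" = 1
[job.sample-job.capacity_requirements.node_selector]
"beta.kubernetes.io/os" = "linux"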
Sample Job Definition#
# Schema for the job definition of a system command or executable:
[job.hello-world]
# Type of the job. Using "base" makes it possible to run executable files:
job_type = "base"
# User-friendly name for the job:
name = "hello-world"
# The command or application that will be executed by the job:
# on some systems this may need to be python3
command = "python"
# Arguments to supply to the command specified above:
args = ["-c",'print("Hello_World!")']
# Capture information from `stdout` and `stderr` for the job's logs:
log_to_stdout = true
The command needs to be valid for the Farm Agents that will be running this type of job.
base vs kit-service job definitions#
Omniverse Farm supports two primary types of job definitions:
base job definitions#
A base job definition is a standalone execution of the specified command. It follows a typical batch processing model where Farm executes the command separately for each submitted task.
base job definition for a conversion program#
[job.file_conversion]
job_type = "base"
name = "file_conversion"
command = '/usr/bin/converter'
success_return_codes = [0]
args = ["--verbose"]
log_to_stdout = true
headless = true
active = true
[job.file_conversion.allowed_args]
source = { arg = "--source", default = "/data/input.file" }
destination = { arg = "--destination", default = "/data/output.file" }
[job.file_conversion.env]
LOG_LEVEL = "info"
Note
It is important to differentiate between executable commands and shell builtins on both Windows and Linux:
Executables exist as discrete pieces of executable software.
Shell builtins are implemented within the shell and do not exist independently.
Farm uses asyncio.create_subprocess_exec to launch the command, which does not work with shell builtins. To run a builtin, specify the shell as the command and pass the builtin as an argument, as in the sketch below.
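As a minimal sketch (the job name and message are hypothetical), a base job definition could wrap the echo builtin by using the shell itself as the command:
# Schema for a job definition that wraps a shell builtin:
[job.echo-builtin]
job_type = "base"
name = "echo-builtin"
# The shell is the executable; the builtin runs through its -c argument:
command = "/bin/sh"
args = ["-c", "echo Hello from a shell builtin"]
log_to_stdout = true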
When a task is submitted, the task's task_args dictionary is validated against, and then merged with, the job definition's allowed_args; the merged arguments are then passed on to the command.
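As an illustrative sketch, a task targeting the file_conversion job definition above could override both of its allowed arguments (the file paths are hypothetical):
{
"user": "Username",
"task_type": "file_conversion",
"task_args": {
"source": "/data/scene_a.file",
"destination": "/data/scene_a_converted.file"
},
"status": "submitted"
}
With those values, Farm would invoke the converter with roughly the following command line: /usr/bin/converter --verbose --source /data/scene_a.file --destination /data/scene_a_converted.file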
kit-service job definitions#
A kit-service job definition launches the Kit application specified in command and then calls the service endpoint specified in task_function, typically implemented in a Kit extension.
This allows you to use the same code for persistent services as well as on-demand execution. It also provides flexibility with regard to how information is passed.
This is how the create-render job definition works, in conjunction with the omni.services.render Kit extension.
[dependencies]
"omni.services.render" = {}
"omni.services.farm.agent.runner" = {}
[job.create-render]
job_type = "kit-service"
name = "create-render"
# Set the path to your Kit application script or batch file.
# Use single-quotes to avoid having to escape characters in the path
command = 'path/to/composer-startup-script'
args = [
"--enable omni.services.render",
]
task_function = "render.run"
headless = true
env = {}
log_to_stdout = true
Similar to how the task's task_args dictionary is passed to the command, a task's task_function_args is used to pass arguments to the task_function during execution, as shown in the create-render sample task below. Because this happens outside of the initial command invocation, there is no equivalent to the job definition's allowed_args.
Specifying a Task#
A task is a unit of work submitted to a Farm instance for execution. Each task must include a task type, which is used to locate the corresponding job definition by matching the task type against the job definition's name.
Task schema#
A task consists of the following properties:
Property | Type | Description
---|---|---
user | string | The user who submitted the task (metadata only).
task_type | string | The name of the job definition that determines how the task is executed.
task_args | dict | A dictionary of arguments to pass to the command. This should match the allowed_args of the job definition.
task_function | string | The task function to call for kit-service jobs.
task_function_args | dict | A dictionary of arguments to pass to the task_function.
capacity_requirements | dict | A dictionary of capacity requirements for the task (Kubernetes only).
task_comment | string | A comment to associate with the task to provide context.
priority | integer | An integer to set the relative priority of submitted tasks. Lower numbers have higher priority.
metadata | dict | Extensible metadata for defining the task, such as retry values.
status | string | The task's status. This should be set to submitted when creating a task.
labels | list | Labels to associate with the task, which can be used for filtering which Farm Agents can process the task.
Sample task definitions#
Tasks are submitted to a Farm Queue's /queue/management/tasks/submit endpoint, with the task definition passed as a JSON dictionary.
{
"user": "Username",
"task_type": "hello-world",
"task_args": {},
"status": "submitted"
}
{
"userid": "Username",
"task_type": "create-render",
"task_args": {},
"task_function": "render.run",
"task_function_args": {
"usd_file": "file:/C:/Users/Username/my_scenes/my_scene.usd",
"render_settings": {
"camera": "/OmniverseKit_Persp",
"range_type": 0,
"capture_every_nth_frames": 1,
"fps": 24,
"start_frame": 0,
"end_frame": 9,
"start_time": 0,
"end_time": 2,
"res_width": 1920,
"res_height": 1080,
},
"render_start_delay": 3,
"bad_frame_size_threshold": 0,
"max_bad_frame_threshold": 3
},
"task_comment": "my_scene test render - changed camera",
"status": "submitted",
}
The render_settings dictionary in task_function_args has been shortened for simplicity; refer to one of your own create-render submissions for a full list of arguments.
Submitting tasks#
As an example, the hello-world task can be submitted using curl from a Linux shell:
curl -X 'POST' \
'http://localhost:8222/queue/management/tasks/submit' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"user": "Username",
"task_type": "hello-world",
"task_args": {},
"status": "submitted"
}'
Typically, Farm task submission is embedded in purpose-built UIs such as Movie Capture.