Streaming Session Readiness Probes#

Streaming Session Readiness Probes are a feature of the Application Streaming API that ensures a streaming session is fully prepared before connections are allowed. The feature is built on top of Kubernetes’ probe mechanism, which it uses to verify that a streaming session is ready to accept user connections.

By default, the Application Streaming API attempts to connect to a stream as soon as it’s deployed. However, various factors such as shader compilation, container image pulling, and network configuration can affect the time it takes for a streaming session to be fully operational. The Readiness Probes feature addresses this issue by implementing a “wait for ready” setup.

A key advantage of using startup probes is the ability for users and developers to fully customize what constitutes a ‘ready’ state for their specific application. This flexibility allows for precise control over when a streaming session is considered operational, ensuring optimal performance and user experience.

Enabling Session Readiness Checks#

To use the feature, you must opt in by enabling it in the configuration. This is done through the values file of the kit-appstreaming-manager Helm chart.

To enable the feature, set the following configuration option to true:

streaming:
  serviceConfig:
    session_check_ready: true

Note

The session_check_ready option is set to false by default. Enabling this feature will cause the Application Streaming API to wait for the streaming session to be fully ready before allowing connections.

When enabled, the Application Streaming API uses the Readiness Probes to ensure that the streaming session is completely prepared before attempting to establish a connection. It waits for the pod to reach the Running state and for all containers within the pod to be ready. If no probes are defined, the Application Streaming API will use Kubernetes’ default startup probe to check for the running state of the pod.

Simply enabling the session_check_ready option will enable the default readiness probes for the streaming session. This covers cases such as:

Container image pulling
Error state of the pod
Error state of the containers within the pod
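
The default check described above can be pictured roughly as follows. This is a minimal sketch, not the Application Streaming API's actual implementation. It assumes the pod is available as a plain dictionary shaped like a Kubernetes Pod object (`status.phase`, `status.containerStatuses`). The function name `classify_pod` and the choice of which waiting reasons count as failures are illustrative assumptions.

```python
# Hypothetical helper, not part of the Application Streaming API.
# Assumption: these container waiting reasons are treated as unrecoverable.
FAILED_WAITING_REASONS = {"ErrImagePull", "ImagePullBackOff", "CrashLoopBackOff"}


def classify_pod(pod: dict) -> str:
    """Return "ready", "not ready", or "failed" for a Pod-shaped dict."""
    status = pod.get("status", {})
    if status.get("phase") == "Failed":
        return "failed"

    containers = status.get("containerStatuses", [])
    for container in containers:
        # A container stuck waiting with a known error reason marks the pod failed.
        waiting = container.get("state", {}).get("waiting") or {}
        if waiting.get("reason") in FAILED_WAITING_REASONS:
            return "failed"

    # Ready only when the pod is Running and every container reports ready.
    all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
    if status.get("phase") == "Running" and all_ready:
        return "ready"
    return "not ready"
```

A pod that is still pulling its image typically reports a waiting reason such as `ContainerCreating`, which this sketch classifies as "not ready" rather than "failed".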

Configuring Probes in Profiles#

While the default readiness checks cover basic scenarios, the Application Streaming API allows for more fine-grained control over session readiness through the use of probes in Application Profiles. These probes can be configured under the streamingKit section of the profile.

Here’s an example of how to configure a startup probe:

streamingKit:
  startupProbe:
    httpGet:
      path: /v1/streaming/ready
      port: 8011
    initialDelaySeconds: 20
    periodSeconds: 10
    failureThreshold: 90

The Application Streaming API supports all options available for Kubernetes probes. For a comprehensive list of available fields and their descriptions, refer to the Kubernetes documentation on startup probes.

Field Explanations#

path: The endpoint path to check for readiness (specific to httpGet).
port: The port on which the readiness check should be performed.
initialDelaySeconds: Number of seconds to wait before performing the first probe.
periodSeconds: How often (in seconds) to perform the probe.
failureThreshold: Number of times the probe can fail before giving up.
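
To see how these fields interact, it can help to estimate how long a session is given to start before the probe gives up. The sketch below uses the usual Kubernetes approximation (initial delay plus one period per allowed failure) and ignores per-probe timeouts. The function name `startup_budget_seconds` is made up for illustration.

```python
def startup_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate upper bound on startup time allowed by a startup probe."""
    return initial_delay + period * failure_threshold


# Values taken from the example profile above.
budget = startup_budget_seconds(initial_delay=20, period=10, failure_threshold=90)
print(budget)  # 920
```

With the example values, the session has roughly 920 seconds (a little over 15 minutes) to report ready, which leaves room for slow steps such as shader compilation.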

Startup Probes vs. Readiness Probes#

While both startup and readiness probes are supported in the Application Streaming API, they serve different purposes:

Startup Probes: Used to know when a container application has started. They are particularly useful for slow-starting containers, preventing them from being killed before they are up and running.
Readiness Probes: Indicate when a container is ready to start accepting traffic. A pod is considered ready when all of its containers are ready.

In the context of the Application Streaming API, startup probes are generally preferred as they provide a more accurate representation of when a streaming session is fully operational, including aspects like shader compilation and network configuration.

Note

If both startup and readiness probes are defined, the readiness probe will not start until the startup probe has succeeded.

By configuring these probes appropriately, you can ensure that the Application Streaming API only attempts to establish connections to fully prepared streaming sessions, enhancing the overall user experience.

Pending State Timeout#

The Application Streaming API now includes an option to check if a pod remains in a pending state for an extended period. This feature helps identify potential issues with pod scheduling or resource allocation. The timeout is configurable either as a general setting or within a specific profile.

Global Configuration#

To set a global default for the maximum pending time, the kit-appstreaming-manager can be configured with the following setting:

backend_csp_args:
  default_max_pending_time: 60
  <additional_args>

The default_max_pending_time value is specified in seconds. Setting this to 0 (the default) means the Application Streaming API will not flag pods for being in a pending state, regardless of duration.

Profile-Specific Configuration#

For more granular control, the maximum pending time can be set for specific profiles using a pod annotation:

streamingKit:
  podAnnotations:
    nvidia.omniverse.ovas.pod.maxPendingTime: "30"

This annotation sets the maximum pending time to 30 seconds for pods created with this profile.

Note

The profile-specific setting takes precedence over the global configuration.

When a pod exceeds the specified maximum pending time, it will be flagged as potentially problematic.
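
The precedence rule between the two settings can be summarized in a few lines. This is a sketch under stated assumptions rather than the service's real logic: the helper names are hypothetical, and treating an annotation value of "0" as "disabled" mirrors the documented global behavior but is not confirmed for the annotation.

```python
from typing import Mapping

ANNOTATION_KEY = "nvidia.omniverse.ovas.pod.maxPendingTime"


def effective_max_pending_time(annotations: Mapping[str, str], global_default: int = 0) -> int:
    """Return the profile annotation if present, otherwise the global default."""
    raw = annotations.get(ANNOTATION_KEY)
    if raw is None:
        return global_default
    return int(raw)  # annotation values are strings, e.g. "30"


def is_pending_too_long(pending_seconds: float, max_pending: int) -> bool:
    """A limit of 0 disables the check (assumed to hold for both settings)."""
    return max_pending > 0 and pending_seconds > max_pending
```

For example, with a global default of 60 and the annotation set to "30", a pod that has been pending for 45 seconds would be flagged, whereas without the annotation it would not.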

Session Status Codes#

When the Readiness Probes feature is enabled, it affects how the status of streaming sessions is reported:

Querying a Specific Session:
- 202 (Accepted): Returned when the session is not yet ready but is still in a valid state. This indicates that the request has been accepted for processing, but the session is not fully prepared.
- 424 (Failed Dependency): Returned if an error occurs that causes the session to fail completely (such as image pull errors or startup errors). This indicates that the session has failed and will not return to a valid state.
Listing Sessions: When listing all sessions, the status of individual sessions will be included in the response:
- Sessions still in the process of becoming ready will be marked as “not ready”.
- Sessions that have failed completely will be marked as “failed”.
- Fully operational sessions will be marked as “ready”.

These status codes and flags provide clear feedback about the state of streaming sessions, allowing for appropriate actions to be taken based on their current status.

Note

A 202 status code suggests that the session may become available soon and retrying after a short delay might be appropriate.
A 424 status code indicates a critical error, and the session will need to be recreated or the underlying issue addressed before it can be used. Resources might have been allocated for the session, and it might be necessary to clean up and recreate the session to resolve the issue.
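
A client might act on these codes with a simple polling loop. The sketch below is an illustration only: `get_status` is a hypothetical callable that queries the session and returns the HTTP status code (the actual endpoint is not shown here), and the assumption that a ready session returns 200 is not stated in this section.

```python
import time
from typing import Callable


def wait_for_session(
    get_status: Callable[[], int],
    retry_delay: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the session is ready.

    Returns True on 200 (assumed "ready"), False on 424 or when attempts run out.
    """
    for attempt in range(max_attempts):
        code = get_status()
        if code == 200:
            return True
        if code == 424:
            # Session failed for good; the caller should clean up and recreate it.
            return False
        if code != 202:
            raise RuntimeError(f"unexpected status code {code}")
        if attempt < max_attempts - 1:
            sleep(retry_delay)  # 202: still starting, retry after a short delay
    return False
```

Passing the status lookup and the sleep function as parameters keeps the loop independent of any particular HTTP client and makes it easy to exercise without a live deployment.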

Custom Readiness Checks with Omniverse Services#

Omniverse applications can leverage the omni.services framework to create highly customized endpoints for readiness probes. This capability allows developers to define precise criteria for what constitutes a ‘ready’ state based on internal application knowledge.

By using omni.services, developers can:

Create custom endpoints that reflect the application’s specific readiness criteria.
Implement complex logic that considers various internal states of the application.
Provide detailed readiness information that goes beyond simple “up/down” status.

For example, a custom endpoint could check if all necessary assets are loaded, shaders are compiled, and any initialization processes are complete before reporting the application as ready.

Alternatively, sidecar containers can also be used to implement custom readiness checks. These containers can run alongside the main application container and perform specialized health checks or readiness probes.

When configuring startup probes for applications with custom endpoints, ensure that the probe settings in the Application Profile correctly target these endpoints and interpret their responses appropriately.
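
The idea of combining several internal checks into one probe target can be sketched without any Omniverse dependency, for instance as something a sidecar could run. The example below uses only the Python standard library and does not use the omni.services API. The check names, `readiness_response`, and `serve` are hypothetical, while the path and port match the startup probe example earlier on this page. It relies on the fact that a Kubernetes httpGet probe treats any status from 200 to 399 as success.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Hypothetical checks; a real application would query its own internal state.
CHECKS: Dict[str, Callable[[], bool]] = {
    "assets_loaded": lambda: True,
    "shaders_compiled": lambda: False,
}


def readiness_response(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every check; return 200 if all pass, otherwise 503, plus details."""
    results = {name: bool(check()) for name, check in checks.items()}
    ready = all(results.values())
    return (200 if ready else 503), {"ready": ready, "checks": results}


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/v1/streaming/ready":
            self.send_error(404)
            return
        status, body = readiness_response(CHECKS)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8011) -> None:
    """Block forever, answering probe requests on the given port."""
    HTTPServer(("0.0.0.0", port), ReadinessHandler).serve_forever()
```

Returning the per-check results in the body gives operators more than an up/down signal, while the probe itself only needs the status code.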

omni.services is part of the Kit application templates.

Sample Extension: omni.services.streaming.readiness#

This sample extension, omni.services.streaming.readiness, demonstrates how to create a custom readiness check for Omniverse applications in a streaming context. It provides a service that monitors the application’s startup process and RTX initialization, ensuring that the application is fully prepared before allowing streaming connections.

The extension sets up an HTTP endpoint /v1/streaming/ready that reports the readiness status of the application. It checks two main components:

Application readiness: Ensures that the Omniverse application has fully started up.
RTX readiness: Verifies that the RTX rendering system is initialized and ready.

To use this extension in an Omniverse application:

Add the extension to a new directory in the custom Kit application’s source tree, generally underneath source/exts/.
Add the extension to the application’s dependencies in the .kit file.
Configure the streaming setup to check the /v1/streaming/ready endpoint before attempting to establish a connection.
The endpoint will return a 200 status code when the application is fully ready, and a 503 status code if it’s not ready yet.

This extension serves as a template that can be customized to include additional readiness checks specific to an application’s needs.

Please find the folder structure and source code below: