Tackling the Imbalance in Video Analytics Pipelines with Hierarchical Embodied Intelligence

Zhou, Wenhui; Xie, Lei; Ning, Jingyi; Cao, Shuyu; Wu, Hao; Peng, Qinghua; Fan, Long

doi:10.1109/INFOCOM59046.2026.11571610

IEEE INFOCOM 2026


Wenhui Zhou, Lei Xie^*, Jingyi Ning, Shuyu Cao, Hao Wu, Qinghua Peng, Long Fan

State Key Laboratory for Novel Software Technology, Nanjing University, China
^*Corresponding author


Figure: Hier-EI overview from the paper. The macro scheduler perceives runtime context and makes coarse decisions, while micro negative feedback produces fine-grained executable knob values.

3.6x latency compliance improvement

67.4% P95 latency reduction

~1.7 ms scheduling overhead

Spatiotemporal imbalance

Models spatial cloud-edge bottlenecks and temporal pipeline stragglers as a unified scheduling challenge.

Coarse-to-fine decisions

Reduces the multi-knob search space from million-level combinations to tractable hierarchical actions.

Runtime embodied feedback

Continuously perceives resource/task context and updates policies through measured QoE feedback.

Abstract

With the proliferation of heterogeneous software-hardware infrastructures in camera deployment, video analytics pipelines (VAPs) are increasingly burdened by spatiotemporal workload imbalance, where uneven task distributions lead to latency constraint violations and degraded quality of experience (QoE). Eliminating this imbalance is challenging due to the inherent complexity of adjusting large-scale parameters and the dynamic nature of VAP runtime environments.

We propose Hier-EI, a scheduling framework that combines a two-phase hierarchical design with embodied intelligence. It decomposes million-level combinatorial knob decisions into a linear coarse-to-fine workflow and uses closed-loop feedback to adapt to runtime dynamics. On a KubeEdge-based prototype, Hier-EI achieves a 3.6x improvement in latency compliance and a 67.4% reduction in P95 latency compared with state-of-the-art scheduling methods.

Motivation

Cloud-edge VAPs suffer from two coupled sources of imbalance. Spatial imbalance arises when wireless bandwidth and heterogeneous edge/cloud hardware create transmission bottlenecks. Temporal imbalance arises when adjacent pipeline stages process tasks at different rates, causing stragglers and queue buildup.

Existing profiling, end-to-end learning, and single-knob feedback methods struggle when the system must jointly adjust frame resolution, frame rate, buffer size, pipeline partitioning, and region allocation under continuously changing resource and task contexts.
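To give a feel for why joint adjustment is hard, the sketch below counts configurations for the five knobs named above. The per-knob option counts are hypothetical and chosen only for illustration, and the assumption that each knob receives its own three-way coarse direction is ours, not a detail confirmed by the page.

```python
from math import prod

# Hypothetical number of settings per knob (illustrative only, not from the paper).
KNOB_OPTIONS = {
    "resolution": 8,
    "frame_rate": 30,
    "buffer_size": 16,
    "pipeline_partition": 8,
    "region_allocation": 36,
}

# Coarse directions as described for the macro scheduler.
DIRECTIONS = ("decrease", "keep", "increase")


def joint_space_size(options):
    """Number of configurations if all knobs are chosen jointly."""
    return prod(options.values())


def coarse_output_count(options, directions=DIRECTIONS):
    """Number of coarse outputs if each knob gets one direction (assumed design)."""
    return len(options) * len(directions)


if __name__ == "__main__":
    print("joint configurations:", joint_space_size(KNOB_OPTIONS))
    print("coarse outputs:", coarse_output_count(KNOB_OPTIONS))
```

Under these assumed counts the joint space grows multiplicatively with every added knob, while the coarse decision grows only additively.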

Figure: Spatiotemporal imbalance in VAPs. Network bottlenecks accumulate tasks in transmission queues, while pipeline stragglers accumulate tasks in processing queues.

Method

Macro Scheduling

A Soft Actor-Critic agent perceives bandwidth, task features, historical delay, and previous decisions, then emits coarse directions such as decrease, keep, or increase.
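One plausible way to obtain such coarse directions from a continuous policy output is to threshold each action component. The sketch below assumes the action lies in [-1, 1] per knob; the knob list, the `dead_zone` threshold, and the function name are illustrative assumptions rather than the Hier-EI implementation.

```python
from typing import Dict, Sequence

# Knobs assumed to receive a coarse direction (hypothetical ordering).
KNOBS = ("resolution", "frame_rate", "buffer_size", "pipeline_partition")


def to_directions(action: Sequence[float], dead_zone: float = 1 / 3) -> Dict[str, int]:
    """Map each continuous component to -1 (decrease), 0 (keep), or +1 (increase)."""
    if len(action) != len(KNOBS):
        raise ValueError("action must have one component per knob")
    directions = {}
    for knob, value in zip(KNOBS, action):
        if value > dead_zone:
            directions[knob] = 1
        elif value < -dead_zone:
            directions[knob] = -1
        else:
            # Small magnitudes fall in the dead zone and leave the knob unchanged.
            directions[knob] = 0
    return directions


if __name__ == "__main__":
    print(to_directions([0.9, -0.8, 0.0, 0.2]))
```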

Micro Scheduling

Negative-feedback controllers transform macro directions into fine-grained knob values for resolution, frame rate, buffer size, and pipeline partitioning.
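As a rough illustration of this step, the sketch below moves one knob along a discrete ladder of values. The macro direction fixes the sign of the move and the relative latency error sets its size, so larger violations trigger larger corrections. The ladder, the gain, and the function name are hypothetical; the actual controller in the paper may differ.

```python
from typing import Sequence

# Hypothetical ladder of frame-rate settings.
FPS_LEVELS = (1, 5, 10, 15, 20, 25, 30)


def micro_step(levels: Sequence[int], index: int, direction: int,
               latency_ms: float, target_ms: float, gain: float = 2.0) -> int:
    """Return the new ladder index after one proportional feedback step."""
    if direction not in (-1, 0, 1):
        raise ValueError("direction must be -1, 0, or 1")
    if direction == 0:
        return index  # macro scheduler asked to keep this knob
    relative_error = abs(latency_ms - target_ms) / target_ms
    # Move at least one level, more when the measured latency is far from target.
    step = max(1, round(gain * relative_error))
    # Clamp to the valid range of the ladder.
    return min(len(levels) - 1, max(0, index + direction * step))


if __name__ == "__main__":
    new_index = micro_step(FPS_LEVELS, index=6, direction=-1,
                           latency_ms=600, target_ms=300)
    print("new frame rate:", FPS_LEVELS[new_index])
```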

Embodied Feedback

Hier-EI treats the scheduler and VAP as a closed-loop MDP: decisions change system state, measured runtime context updates rewards, and the next policy adapts.

Asynchronous Collaboration

Macro scheduling runs at a longer interval for global coordination, while micro scheduling reacts faster to sudden imbalance without waiting for every DRL update.
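The two-rate idea can be sketched with a simplified single-threaded loop in which the macro policy is consulted only every `macro_every` ticks and the micro controller runs on every tick with the most recent direction. This is a toy simulation under our own assumptions; the function names and the tick-based timing are illustrative, and a real deployment would likely run the two loops concurrently.

```python
from typing import Callable, List, Tuple


def run_two_rate_loop(ticks: int, macro_every: int,
                      macro_policy: Callable[[int], int],
                      micro_controller: Callable[[int, int], int]
                      ) -> List[Tuple[int, int, int]]:
    """Run micro control every tick and refresh the macro direction less often."""
    direction = 0
    trace = []
    for tick in range(ticks):
        if tick % macro_every == 0:
            # Low-frequency global guidance.
            direction = macro_policy(tick)
        # High-frequency reaction that reuses the latest macro direction.
        value = micro_controller(tick, direction)
        trace.append((tick, direction, value))
    return trace


# Toy policies used only to exercise the loop.
macro_calls: List[int] = []


def toy_macro(tick: int) -> int:
    macro_calls.append(tick)
    return -1 if tick < 5 else 1


def toy_micro(tick: int, direction: int) -> int:
    return tick * direction


trace = run_two_rate_loop(ticks=10, macro_every=5,
                          macro_policy=toy_macro, micro_controller=toy_micro)
```

In this toy run the macro policy is invoked twice while the micro controller produces ten decisions.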

Figure: Macro-micro collaboration and embodied feedback mechanism in Hier-EI. Hier-EI couples hierarchical collaboration with embodied feedback: macro decisions provide low-frequency global guidance, while micro decisions respond at a finer interval using the latest feedback.

Results

Applications. Road surveillance over UA-DETRAC videos and pedestrian monitoring over YouTube videos.

Scenarios. Four settings combine stable/unstable networks and sparse/dense object workloads.

Baselines. Chameleon, FC, CASVA-L, CASVA-D, and CEVAS.

Extreme case. In S4, Hier-EI improves latency compliance by 358.42% and reduces P95 latency by 67.43%.
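For readers who want to reproduce these two metrics on their own traces, the sketch below shows one common way to compute them. It assumes latency compliance means the fraction of tasks finishing within a latency limit and uses the nearest-rank definition of the 95th percentile; the paper's exact definitions may differ, and the sample values are made up.

```python
from typing import Sequence


def latency_compliance(latencies_ms: Sequence[float], limit_ms: float) -> float:
    """Fraction of tasks whose latency is within the limit (assumed definition)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return sum(1 for x in latencies_ms if x <= limit_ms) / len(latencies_ms)


def p95_latency(latencies_ms: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer arithmetic for ceil(0.95 * n) avoids floating-point rounding issues.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Made-up sample: 19 fast tasks and one slow outlier.
sample = [100] * 19 + [500]
```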

Figures: latency compliance; P95 tail latency; ablation study, including an ablation without asynchronous collaboration.

Implementation in Dayu

Hier-EI is implemented as a scheduler policy inside Dayu. The code path mirrors the paper: a policy YAML selects the HEI agent, the agent runs macro DRL and micro negative feedback, and the scheduler exports updated video-configuration and offloading decisions.

1. Policy YAML: selects SCH_AGENT_NAME=hei

2. HEIAgent: coordinates macro and micro loops

3. Runtime feedback: updates state, reward, and schedule plan

Scheduler Policy: template/scheduler/hei.yaml

Declares the HEI scheduler configuration, model directory, mode, and mounted runtime files.

Agent Orchestration: schedule_agent/hei_agent.py

Runs SAC macro scheduling and negative-feedback micro scheduling with asynchronous intervals.

Macro DRL: schedule_agent/hei/drl/

Implements the Soft Actor-Critic model, replay buffer, adapters, and neural networks.

Micro Feedback: schedule_agent/hei/nf/negative_feedback.py

Maps coarse directions to fine-grained resolution, FPS, buffer-size, and partition decisions.

BibTeX


@inproceedings{zhou2026hier-ei,
  title = {Tackling the Imbalance in Video Analytics Pipelines with Hierarchical Embodied Intelligence},
  author = {Zhou, Wenhui and Xie, Lei and Ning, Jingyi and Cao, Shuyu and Wu, Hao and Peng, Qinghua and Fan, Long},
  booktitle = {IEEE INFOCOM 2026 - IEEE Conference on Computer Communications},
  year = {2026},
  pages={1--10},
  publisher = {IEEE},
  doi={10.1109/INFOCOM59046.2026.11571610}
}