IEEE INFOCOM 2026

Tackling the Imbalance in Video Analytics Pipelines with Hierarchical Embodied Intelligence

Wenhui Zhou, Lei Xie*,
Jingyi Ning, Shuyu Cao, Hao Wu,
Qinghua Peng, Long Fan
State Key Laboratory
for Novel Software Technology,
Nanjing University, China

*Corresponding author
Paper overview figure of Hier-EI macro and micro scheduling
Hier-EI overview from the paper: the macro scheduler perceives runtime context and makes coarse decisions, while micro negative feedback produces fine-grained executable knob values.
3.6x latency compliance improvement
67.4% P95 latency reduction
~1.7 ms scheduling overhead
01

Spatiotemporal imbalance

Models spatial cloud-edge bottlenecks and temporal pipeline stragglers as a unified scheduling challenge.

02

Coarse-to-fine decisions

Reduces the multi-knob search space from millions of joint combinations to a tractable sequence of hierarchical actions.

03

Runtime embodied feedback

Continuously perceives resource/task context and updates policies through measured QoE feedback.

Abstract

With the proliferation of heterogeneous software-hardware infrastructures in camera deployment, video analytics pipelines (VAPs) are increasingly burdened by spatiotemporal workload imbalance, where uneven task distributions lead to latency constraint violations and degraded quality of experience (QoE). Eliminating this imbalance is challenging due to the inherent complexity of adjusting large-scale parameters and the dynamic nature of VAP runtime environments.

We propose Hier-EI, a scheduling framework that combines a two-phase hierarchical design with embodied intelligence. It decomposes million-level combinatorial knob decisions into a linear coarse-to-fine workflow and uses closed-loop feedback to adapt to runtime dynamics. On a KubeEdge-based prototype, Hier-EI achieves a 3.6x improvement in latency compliance and a 67.4% reduction in P95 latency compared with state-of-the-art scheduling methods.

Motivation

Cloud-edge VAPs suffer from two coupled sources of imbalance. Spatial imbalance arises when wireless bandwidth and heterogeneous edge/cloud hardware create transmission bottlenecks. Temporal imbalance arises when adjacent pipeline stages process tasks at different rates, causing stragglers and queue buildup.

Existing profiling, end-to-end learning, and single-knob feedback methods struggle when the system must jointly adjust frame resolution, frame rate, buffer size, pipeline partitioning, and region allocation under continuously changing resource and task contexts.
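
To give a feel for why joint adjustment is hard, the sketch below counts the joint configurations for the five knobs named above and compares that with one coarse direction (decrease, keep, increase) per knob. The option counts are hypothetical values chosen only for illustration, and counting three directions per knob is just one plausible reading of the "linear coarse-to-fine workflow" mentioned in the abstract.

```python
from math import prod

# Hypothetical option counts per knob (illustrative only; not from the paper).
knob_options = {
    "resolution": 8,
    "frame_rate": 30,
    "buffer_size": 16,
    "pipeline_partition": 20,
    "region_allocation": 16,
}

# Joint search: every combination of knob values is a distinct candidate.
joint_space = prod(knob_options.values())

# Coarse-to-fine (assumed reading): one of three coarse directions per knob,
# decided knob by knob, so the count grows linearly with the number of knobs.
directions_per_knob = 3
coarse_decisions = directions_per_knob * len(knob_options)

print(joint_space)       # 1228800
print(coarse_decisions)  # 15
```

With these made-up counts the joint space already exceeds one million candidates, while the coarse stage only has to choose among a handful of directions.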

Spatial and temporal imbalance in video analytics pipelines
Spatiotemporal imbalance in VAPs: network bottlenecks accumulate tasks in transmission queues, while pipeline stragglers accumulate tasks in processing queues.

Method

Macro Scheduling

A Soft Actor-Critic agent perceives bandwidth, task features, historical delay, and previous decisions, then emits coarse directions such as decrease, keep, or increase.
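
A minimal sketch of the macro interface may help here. It does not implement Soft Actor-Critic; it only shows how an observation could be flattened and how a continuous per-knob action in [-1, 1] could be mapped to a coarse direction. The names `MacroObservation`, `to_directions`, and the `dead_zone` threshold are assumptions made for illustration, not identifiers from the paper or from Dayu.

```python
from dataclasses import dataclass
from typing import Dict, List

# Knobs taken from the micro-scheduling description on this page.
KNOBS = ["resolution", "frame_rate", "buffer_size", "pipeline_partition"]


@dataclass
class MacroObservation:
    """Hypothetical container for what the macro agent perceives."""
    bandwidth_mbps: float
    task_features: List[float]      # e.g. object count (assumed feature)
    recent_delays_s: List[float]    # historical delay
    previous_directions: List[int]  # last coarse decision per knob

    def to_vector(self) -> List[float]:
        # Flatten everything into one state vector for the policy network.
        return [
            self.bandwidth_mbps,
            *self.task_features,
            *self.recent_delays_s,
            *(float(d) for d in self.previous_directions),
        ]


def to_directions(raw_action: List[float], dead_zone: float = 1 / 3) -> Dict[str, int]:
    """Map one continuous value per knob to -1 (decrease), 0 (keep), +1 (increase)."""
    if len(raw_action) != len(KNOBS):
        raise ValueError("expected one action value per knob")
    directions = {}
    for knob, value in zip(KNOBS, raw_action):
        if value > dead_zone:
            directions[knob] = 1
        elif value < -dead_zone:
            directions[knob] = -1
        else:
            directions[knob] = 0
    return directions


obs = MacroObservation(12.5, [7.0], [0.42, 0.55], [0, -1, 0, 0])
print(len(obs.to_vector()))                   # 8
print(to_directions([0.9, -0.8, 0.1, -0.2]))
```

Thresholding a continuous output is only one way to obtain discrete directions; a policy with a discrete action head would serve the same purpose.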

Micro Scheduling

Negative-feedback controllers transform macro directions into fine-grained knob values for resolution, frame rate, buffer size, and pipeline partitioning.
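
The following sketch illustrates one way a negative-feedback controller could refine a macro direction into a concrete knob value. It assumes a simple proportional rule on the normalised delay error and a knob whose increase raises delay (such as frame rate); the `Knob` class, the `gain` parameter, and the rule itself are illustrative assumptions rather than the controllers used in Hier-EI.

```python
from dataclasses import dataclass


@dataclass
class Knob:
    value: float
    low: float
    high: float
    gain: float  # knob units per unit of normalised delay error (hypothetical)


def micro_step(knob: Knob, direction: int, measured_delay: float, target_delay: float) -> float:
    """Move the knob within the macro direction, proportionally to the delay error."""
    if direction == 0:
        return knob.value  # macro says "keep"
    # Positive error means headroom; negative error means the target is exceeded.
    error = (target_delay - measured_delay) / target_delay
    correction = knob.gain * error
    # Only act when the feedback agrees with the macro direction; otherwise hold.
    if correction * direction <= 0:
        return knob.value
    knob.value = min(knob.high, max(knob.low, knob.value + correction))
    return knob.value


frame_rate = Knob(value=30.0, low=1.0, high=30.0, gain=10.0)
print(micro_step(frame_rate, -1, measured_delay=0.75, target_delay=0.5))  # 25.0
print(micro_step(frame_rate, -1, measured_delay=0.5, target_delay=0.5))   # 25.0
```

Because the correction is proportional to the error, the adjustment shrinks to zero as the measured delay approaches the target, which is the stabilising property that negative feedback provides.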

Embodied Feedback

Hier-EI treats the scheduler and VAP as a closed-loop MDP: decisions change system state, measured runtime context updates rewards, and the next policy adapts.
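
The closed loop can be sketched with a toy environment. Everything below is a stand-in: the delay model, the reward shape, and the rule-based `placeholder_policy` are hypothetical and replace the real VAP and the learned DRL policy. The point is only the cycle of decide, act, measure, and reward.

```python
from typing import List, Tuple


def measure_delay(frame_rate: float, bandwidth_mbps: float) -> float:
    """Toy plant: delay grows with frame rate and shrinks with bandwidth."""
    return frame_rate / (5.0 * bandwidth_mbps)


def reward(delay: float, constraint: float, frame_rate: float, max_rate: float = 30.0) -> float:
    """Penalise constraint violations; otherwise reward a higher frame rate."""
    if delay > constraint:
        return -(delay - constraint) / constraint
    return frame_rate / max_rate


def placeholder_policy(delay: float, constraint: float) -> int:
    """Rule-based stand-in for the learned policy: -1, 0, or +1."""
    if delay > constraint:
        return -1
    if delay < 0.5 * constraint:
        return 1
    return 0


def run_episode(steps: int = 5) -> List[Tuple[float, float, float]]:
    frame_rate, bandwidth, constraint = 30.0, 10.0, 0.45
    delay = measure_delay(frame_rate, bandwidth)   # initial perception
    history = []
    for _ in range(steps):
        direction = placeholder_policy(delay, constraint)               # decide from state
        frame_rate = min(30.0, max(5.0, frame_rate + 5.0 * direction))  # decision changes the system
        delay = measure_delay(frame_rate, bandwidth)                    # measured runtime context
        r = reward(delay, constraint, frame_rate)                       # a DRL agent would learn from this
        history.append((frame_rate, delay, r))
    return history


for step in run_episode():
    print(step)
```

In this toy run the frame rate drops from 30 to 20 over two steps and then holds once the delay falls under the constraint; a learning agent would use the recorded rewards to update its policy instead of following a fixed rule.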

Asynchronous Collaboration

Macro scheduling runs at a longer interval for global coordination, while micro scheduling reacts faster to sudden imbalance without waiting for every DRL update.
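
A single-threaded sketch of the two time scales is shown below. The tick-based loop, the interval values, and the callback names are assumptions for illustration; a real deployment would more likely run the two schedulers as separate threads or services. The key idea is that the micro step reuses the latest macro direction until the macro scheduler refreshes it.

```python
from typing import Callable, List, Tuple


def run_ticks(
    total_ticks: int,
    macro_interval: int,
    micro_interval: int,
    macro_decide: Callable[[int], int],
    micro_apply: Callable[[int, int], None],
) -> List[Tuple[int, str]]:
    """Simulate macro and micro schedulers firing at different intervals."""
    log = []
    latest_direction = 0  # micro keeps using this between macro updates
    for tick in range(total_ticks):
        if tick % macro_interval == 0:
            latest_direction = macro_decide(tick)
            log.append((tick, "macro"))
        if tick % micro_interval == 0:
            micro_apply(tick, latest_direction)
            log.append((tick, "micro"))
    return log


applied = []
log = run_ticks(
    total_ticks=8,
    macro_interval=4,
    micro_interval=1,
    macro_decide=lambda tick: -1 if tick == 0 else 1,   # dummy decisions
    micro_apply=lambda tick, d: applied.append((tick, d)),
)
print([t for t, kind in log if kind == "macro"])  # [0, 4]
print(applied)
```

Here the macro scheduler fires twice in eight ticks while the micro scheduler fires on every tick, applying direction -1 for ticks 0 to 3 and +1 for ticks 4 to 7.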

Macro-micro collaboration and embodied feedback mechanism in Hier-EI
Hier-EI couples hierarchical collaboration with embodied feedback: macro decisions provide low-frequency global guidance, while micro decisions respond at a finer interval using the latest feedback.

System Demo

Runtime workflow. Video streams are segmented into tasks, routed across cloud-edge processors, and returned to the scheduler as feedback.

Scheduler loop. Hier-EI updates video configuration and offloading decisions from resource, task, and decision contexts.

Dayu cloud-edge system architecture
Prototype architecture: KubeEdge-based cloud-edge orchestration with collaborative scheduling, monitoring, and pipeline services.
Real-world cloud-edge testbed and data examples
Real-world testbed and application examples used for evaluation.

Results

Applications. Road surveillance over UA-DETRAC videos and pedestrian monitoring over YouTube videos.

Scenarios. Four settings combine stable/unstable networks and sparse/dense object workloads.

Baselines. Chameleon, FC, CASVA-L, CASVA-D, and CEVAS.

Extreme case. In S4, Hier-EI improves latency compliance by 358.42% and reduces P95 latency by 67.43%.
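
For readers who want to reproduce this kind of summary from their own traces, the sketch below computes latency compliance, P95 latency, and relative change. The definitions are assumptions (compliance as the fraction of tasks finishing within the constraint, P95 by the nearest-rank method), and the sample latencies are synthetic rather than measurements from the paper.

```python
from typing import Sequence


def latency_compliance(latencies_ms: Sequence[int], constraint_ms: int) -> float:
    """Fraction of tasks whose latency is within the constraint (assumed definition)."""
    return sum(1 for x in latencies_ms if x <= constraint_ms) / len(latencies_ms)


def p95(latencies_ms: Sequence[int]) -> int:
    """95th percentile using the nearest-rank method with integer arithmetic."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) without float rounding
    return ordered[rank - 1]


def relative_change_pct(baseline: float, new: float) -> float:
    """Signed percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100


# Synthetic latencies: 100 ms, 200 ms, ..., 2000 ms.
sample = list(range(100, 2100, 100))
print(latency_compliance(sample, constraint_ms=500))  # 0.25
print(p95(sample))                                    # 1900
print(relative_change_pct(200, 100))                  # -50.0
```

Under these definitions, a reported percentage improvement in compliance or reduction in P95 latency corresponds to `relative_change_pct` applied to the baseline and Hier-EI values of the respective metric.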

Implementation in Dayu

Hier-EI is implemented as a scheduler policy inside Dayu. The code path mirrors the paper: a policy YAML selects the HEI agent, the agent runs macro DRL and micro negative feedback, and the scheduler exports updated video-configuration and offloading decisions.

BibTeX

@inproceedings{zhou2026hier-ei,
  title={Tackling the Imbalance in Video Analytics Pipelines with Hierarchical Embodied Intelligence},
  author={Zhou, Wenhui and Xie, Lei and Ning, Jingyi and Cao, Shuyu and Wu, Hao and Peng, Qinghua and Fan, Long},
  booktitle={IEEE INFOCOM 2026 - IEEE Conference on Computer Communications},
  year={2026}
}